Original content by Linda L. Briggs
Source: http://www.tdwi.org/News/display.aspx?id=9608
The appeal of the cloud is catching on in the business intelligence and data
warehousing world, where cloud-based data warehouses and software-as-a-service
(SaaS) BI can offer clear benefits. Those
benefits include lower initial capital expenditures -- especially attractive in
a weak economy -- a shorter time to deployment, and options for collaboration
between divisions and even across company borders.
As a managing director investing in information technology for
Palo Alto-based venture capital firm Trident Capital, Evangelos Simoudis is
admittedly bullish about SaaS BI and cloud-based data warehousing. In this
interview, he explains why, as well as where he sees SaaS BI headed despite the
recent demise of one of the first SaaS BI companies. He also discusses how
companies can mitigate the risk of moving data warehousing to the cloud.
Simoudis, who has more than 25 years of experience in information
technology, holds a Ph.D. in computer science from Brandeis University
and a B.S. in electrical engineering from Caltech. He joined Trident in 2005
and invests in Internet and software businesses. He is a director on the boards
of two BI companies, Pivotlink and Host Analytics.
BI This Week: As a venture capitalist watching technology markets, what do
you see as the immediate and long-term future for BI offered as an off-premise
service?
Evangelos Simoudis: Clearly, there's a renaissance of sorts with BI and
analytics in general. Businesses, regardless of industry sector or size, are
becoming analytics driven. Online businesses in particular have led the way in
demonstrating the effective use of data to make meaningful decisions.
Seeing how online businesses (for example, Amazon's use of data about
customer activity to make its product recommendation engine more effective) are
analyzing their data with good results, other corporations now want a quick
path to analytic information. Unfortunately, though progress has been made in
the 25 years that BI solutions have been available, the reach of BI within
corporations is still limited. Even within the Global 2000, estimates are that
on average, only 20 percent of corporate users who might benefit from BI and
analytics actually use any type of formal BI solution.
We can attribute this low adoption rate to either high cost or the perceived
complexity of most solutions. Smaller companies, in particular, often lack the
financial resources to purchase a costly solution or the IT resources to
properly deploy and support it.
Even with the slower adoption rate for formal BI solutions, corporations
usually have dozens or even hundreds of small ad hoc data marts in house, built
using inexpensive databases such as MySQL and Microsoft Access, often accessed
through Microsoft Excel. These applications are typically maintained by their
business owners rather than by IT personnel, because while their content is
valuable, they aren't viewed as strategic enough to warrant IT resources.
SaaS BI solutions can effectively address all of these less-than-ideal
situations. Such solutions are simpler to deploy, require few corporate
resources to manage and maintain, and have a lower total cost of ownership
(TCO) than on-premise BI solutions. As a result, SaaS BI solutions are emerging
as a strong alternative for companies of any size that aim at analytics-driven
decision-making.
Here are just a few of the unique advantages that SaaS BI solutions offer
over their on-premise counterparts:
The wisdom of the crowds: By monitoring the use of a SaaS BI solution, the
vendor can (a) dynamically optimize it to improve performance and (b) determine
which features are being used effectively, which features present difficulty to
the user although users find them important (thus enabling the vendor to offer
training), and which features aren't being used at all, enabling the vendor to
remove them from the application and avoid maintaining them over time.
Customer metrics: This information allows the vendor to make immediate,
incremental improvements to the product. That is difficult to do when the
vendor's solution is on-premise with each customer. The bigger point here is
that vendors of on-premise solutions don't know how their software is being
used. SaaS vendors, in contrast, can closely monitor usage and even provide
user metrics to customers in the way Nielsen and comScore do.
Leveraging collective knowledge across divisions and companies: Companies
whose data is in cloud-based warehouses can easily choose to share or combine
certain data sets with other companies using the same vendor, thus enabling
collaborative problem-solving. Employees within a company but separated
geographically or by division can more easily collaborate with colleagues and
partners by looking together at a single view of internal and external data,
all housed in the cloud.
Decisions that are most effective in today's business world are often based
on data blended from inside and outside the company. I'm already seeing early
examples of this type of collaboration, and I expect collaborative BI will
accelerate in the next five years; SaaS solutions will be the catalyst.
Specific analysis results and associated data can also be tagged (using systems
like the ones found today in consumer Internet products such as Yahoo's
Delicious). Tags can be searched and shared, enabling yet another level of
information use and collaboration among users.
Given those advantages, the prospects for SaaS BI seem very good.
However, what does the recent demise of SaaS BI company LucidEra indicate?
I continue to consider the near- and long-term prospects of SaaS BI
solutions to be extremely bright. LucidEra's recent failure is no different
from what we investors see in every other early market opportunity. Even
though BI has been around for many years, SaaS BI is relatively new, and in new
markets, some companies succeed and some fail.
I'm sure LucidEra's failure is causing companies that are already using SaaS
BI solutions, or are planning to use them, to re-examine the soundness
of their decision. However, my prediction is that this will only serve to
improve how companies interact with their SaaS application vendors, including
the ones that provide BI solutions. That is, companies should become more
proactive about asking for detailed specifications on the vendor's data policies,
disaster recovery policies and plans, and protection of customer data should
the vendor fail.
How should organizations think about data warehouses in the cloud?
Cloud-based data warehouses offer smaller companies their first true
opportunity to take full advantage of data warehousing in a cost-effective
manner. By the same token, large companies are interested in the potential of
cloud-based data warehousing solutions to speed up the creation of data marts
from enterprise data warehouses. They can do this while reducing development
time and maintenance costs, thus improving the "time to value" of
data by making data available to analysts faster.
Finally, cloud-based data warehouses can provide companies with elastic
capacity to help address the usage spikes that are common with mission-critical
data warehouses and marts.
What sorts of concerns do you hear from IT executives about cloud-based
data warehouses?
I'm hearing a number of legitimate concerns, which vendors will need to
address in order to move cloud-based data warehouses forward.
Security: As prominent corporations create cloud-based data warehouses and
draw visibility to the technology, CIOs fear that the cloud will become a prime
target for hackers who would want to exploit the existence of large databases
with prized corporate data, all in a single place.
Data Integrity: IT must guarantee that the data stored in a company's
operational databases is synchronized with reasonable frequency with the data
stored in the cloud-based data warehouse to ensure the cloud-based DW has the
correct data to drive analyses.
Data Ingestion Throughput: This is not an issue in the short term, but will
become more important as ever-larger data warehouses and marts are implemented.
The size of the data sets and the time it takes to ingest them will become
issues around which IT organizations will need to obtain guarantees. This issue
will also need to be considered if a cloud-based warehouse is used to provide
elastic capacity to address usage spikes.
Service-level agreements (SLAs): The cloud data warehouse vendor must be
able to commit to SLAs covering uptime, query response time, and disaster recovery
plans. For backup and recovery in particular, any cloud-based software vendor,
including those that provide data warehousing services, should have automatic
failover capability to ensure the customer won't experience any service
interruptions. In addition, and depending on the critical nature of the
analytic applications supported by the cloud-based DW, the customer may want to
insist that the vendor use an alternate cloud vendor for backup and disaster
recovery services.
The Ability to Take Back the Warehoused Data: IT must ensure that the
corporate data will not become hostage to the cloud-based DW vendor. To this
end, the vendor needs to make it easy for the customer to take back its entire
data set, either because of a switch in vendors or because the vendor is going
out of business. The cloud-based DW vendor also must make it easy for the
customer to obtain subsets of stored data to use with other analytic
applications offered by other on-premise or SaaS vendors.
Warehouse Auditing: IT must be able to audit the warehouse in the cloud in
the same way that it audits any on-premise data warehouse, for issues
including data integrity, query efficiency, and information delivery
reliability.
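Two of the concerns above, ingestion throughput and SLA uptime, lend themselves to quick back-of-the-envelope arithmetic before a conversation with a vendor. The Python sketch below is illustrative only. The function names, the 500 GB data set, the 100 Mbit/s link, the 0.7 efficiency factor, and the 99.9 percent uptime figure are all assumptions chosen for the example, not numbers from the interview or from any vendor.

```python
def ingest_hours(dataset_gb: float, link_mbit_per_s: float,
                 efficiency: float = 0.7) -> float:
    """Estimate the hours needed to load a data set over a network link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually achieved (protocol overhead, contention); 0.7 is a guess.
    """
    if link_mbit_per_s <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link speed must be positive and efficiency in (0, 1]")
    dataset_mbit = dataset_gb * 8 * 1000  # decimal units: 1 GB = 8,000 Mbit
    seconds = dataset_mbit / (link_mbit_per_s * efficiency)
    return seconds / 3600


def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Convert an uptime percentage into permitted downtime over a period."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return days * 24 * 60 * (1 - uptime_percent / 100)


if __name__ == "__main__":
    # Hypothetical 500 GB initial load over a 100 Mbit/s link.
    print(f"{ingest_hours(500, 100):.1f} hours")                # 15.9 hours
    # Hypothetical 99.9 percent uptime SLA over a 30-day month.
    print(f"{allowed_downtime_minutes(99.9):.1f} minutes")      # 43.2 minutes
```

Even rough figures like these help frame the guarantees to request: a load that takes most of a day cannot back an elastic-capacity plan meant to absorb a sudden usage spike.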
How should companies integrate data from cloud-based transactional
applications into their SaaS BI systems?
First-generation cloud-based data warehouses are actually data marts
that provide reporting either on data stored in a single application (for
example, Salesforce and NetSuite already offer reporting functionality
with their applications; SaaS BI vendors such as Pivotlink also provide
solutions that analyze the data stored in SaaS applications) or on data that is
extracted from existing on-premise data warehouses. The next wave of
cloud-based data warehouses will integrate data from several SaaS applications.
SaaS application vendors are just now starting to develop APIs that allow
the type of bulk data extraction necessary for the creation and updating of
data warehouses. We are also starting to see the emergence of SaaS data
extraction and integration vendors (some with proprietary software and others
with open source software) whose tools are used in conjunction with these
cloud-based data warehouses.
Today I can think of three types of risk related to data integration into
cloud-based DW:
- Immature APIs: The
APIs provided by SaaS applications are not yet mature, since they were
designed first for transactional operations rather than analytical/warehousing
operations. Since the operational databases are under the control of the
vendor rather than the customer, it's up to each SaaS vendor to give
warehousing operations the right priority and develop APIs for them.
- Immature ETL Tools:
ETL tools for SaaS are improving in their ability to deal with data from a
single data source, but are just now being tested on multiple data sources
distributed across several SaaS application vendors.
- Distributed Data:
Source data is becoming distributed across SaaS application vendors with
different capabilities and priorities. That means that coordinating,
synchronizing, and moving the source data across the cloud and into the
cloud-based data warehouse needs to be taken into account when planning
such a cloud warehouse, because the warehouse's owner has less control
over the entire operation than with an on-premise equivalent.
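The coordination problem described in the list above can be sketched as a simple freshness check: before analysts rely on a multi-source warehouse, confirm that every source was synchronized recently enough. This is a minimal Python sketch under stated assumptions. The source names, the timestamps, and the 24-hour threshold are hypothetical, and a real deployment would read last-sync times from whatever extraction tooling is in use.

```python
from datetime import datetime, timedelta, timezone


def stale_sources(last_sync: dict[str, datetime], now: datetime,
                  max_age: timedelta) -> list[str]:
    """Return the names of sources last synced more than max_age ago, sorted."""
    return sorted(
        name for name, synced_at in last_sync.items()
        if now - synced_at > max_age
    )


if __name__ == "__main__":
    # Arbitrary reference time; all source names below are hypothetical.
    now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    last_sync = {
        "crm_saas": now - timedelta(hours=2),
        "erp_saas": now - timedelta(hours=30),
        "onprem_dw": now - timedelta(hours=23),
    }
    print(stale_sources(last_sync, now, timedelta(hours=24)))  # ['erp_saas']
```

A check like this does not fix synchronization across vendors, but it makes the reduced control visible: the warehouse owner at least knows which sources are lagging before an analysis is run.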
Does that mean that having federated source data across multiple SaaS
application vendors makes it harder to create a cloud-based data warehouse?
As I mentioned, we are still in the very early stages of thinking
about cloud-based data warehouses. Companies are most commonly at the stage
of warehousing data from a single source (a single SaaS application, a single
on-premise application, or another on-premise warehouse).
Companies are not yet warehousing data from multiple SaaS applications --
that still represents state-of-the-art practice rather than mainstream practice.
For the progression to be successful, IT organizations in collaboration with
business users will need to exhibit new and different thinking. Will
organizations be willing to devote resources to create this type of
multi-source cloud-based DW? That will depend on their overall
experiences with first-generation cloud-based data warehouses. In any event,
the next five years will be an active and exciting period for SaaS BI and
cloud-based data warehousing.