Business Intelligence has been with us for quite a while now in one form or another, from the days of decision support systems of the 1970s through the coining of the term data warehousing in the early 1990s and the addition of OLAP (Online Analytical Processing) to our lexicon through to the current world of business intelligence.
By and large although technologies have come and gone during that time or undergone many changes, the processes that allow end users access to the data on which decisions are made have remained fairly constant.
In a nutshell, this process involves the extraction of data from operational systems and transforming the data into a format more suited for analysis.
The result is that organizations invest in a traditional BI stack or cherry pick best of breed solutions to suit their needs, and bypass those that they do not need.
The Traditional BI Stack
The term BI Stack refers to the multiple architectural layers inherent in the solution. These are described below at a high level.
ETL
This is the integration layer that enables the
extraction of data from a whole raft of
applications, databases and flat files. The key
players in this space are Informatica, Ab
Initio, Business Objects, Microsoft and IBM
through their acquisition of Ascential.
Staging
There are not many BI projects in this day and
age that do not require some form of staging
area separate to the data warehouse. This
staging area acts as
- a landing area for incoming data from multiple source systems.
- for the application of business rules upon the incoming data
- cleansing and consolidation of data
- transforming the data ready to load into the data warehouse.
Whether you advocate the normalized approach of Inmon or the dimensional approach of Kimball, this stage is the key piece in the architectural jigsaw of the typical BI project. A data warehouse is a central repository for all, or a significant part of the data that can be collected by an enterprises various business systems. This is an area that all of the major database vendors are now onboard with and they enable their software to be configured specifically for decision support.
The key players are Oracle, Microsoft, Teradata and IBM.
Data Marts/OLAP Cubes
Typically, there is a layer of abstraction
between the end user and the data warehouse, and
this is where there is the most diversification
in terms of approach. The two main routes that
can be taken are either relational or
multi-dimensional.
Relational OLAP (ROLAP) uses a relational database to store data in either a star schema or snowflake data marts containing data aggregated to a higher level that in the atomic level data warehouse. The advantage of this is that the data is accessible by SQL meaning that you do not require a business intelligence tool to access the data. ROLAP databases tend to be very scalable and allow for the storage of vast amounts of data within their structures.
Multi-dimensional OLAP (MOLAP) The arrangement of data into cubes avoids a limitation of relational databases which are not best suited for near instantaneous analysis of large amounts of data. OLAP cubes conceptually are extensions to the two-dimensional array of a spreadsheet. For example an organisation might wish to analyse some financial data (measures) by product, by time-period, by city, by type of revenue and cost, and by comparing actual data with a budget. These additional methods of analysing the data are known as dimensions. Recently many OLAP vendors have embraced a standardised query language for cubes called Multidimensional Expressions (MDX).
Presentation Layer
This layer consists of how the user interacts
with the data, from applications as simple as an
Excel spreadsheet to complex data mining
applications. At a high level, these business
intelligence tools fall into one of two camps.
There are query, reporting and analysis tools
and advanced analytics.
Query, Reporting & Analysis tools
These tools
include ad-hoc query and multidimensional
analysis tools as well as dashboards. These
tools offer rows and column reporting, the
ability to show data in a chart , and the
ability to be able to slice and dice data to
easily change the layout of a report and drill
through dimension hierarchies in order to show
summary or detail figures.
Advanced Analytics
These tools include data
mining and statistical analysis applications
which use technologies such as neural networks,
clustering and Bayesian Classifiers among others
to establish relationships in the data that may
be too complex to be extracted using
query/analysis tools. This enables predictions
to be made based on past behaviour.
Often solutions comprise of software sourced from multiple vendors to cover each layer of the stack and it is the job of IT and/or outside help to make them all fit together. While this gives you the ability to take best of breed solutions specifically designed from the ground up for that purpose, this can prove to be a difficult task, owing to different technologies and platforms and it is no surprise that recent changes has seen some vendors moving to a deep vertical often integrated product offering across the whole BI stack.
Among these it is the more traditional technology stack players such as Microsoft and Oracle who are moving more and more into this space offering a one-stop-shop for all things BI related. Oracle has Oracle 10g, Warehouse Builder at the back-end and following their purchase of Hyperion, they have a strong offering at the front-end too. Microsoft has the current release of SQL Server (2008) and its suite of integrated BI applications (Integration Services, Analysis Services and Reporting Services).
