Data-Marts are the initial stages of the evolution cycle of an enterprise business intelligence platform. Please refer to Data Marts v/s Data Warehouse for a back-ground. Data-Marts start as a function level or sub-function level initiative.
In a real-life scenario, each of Data-Marts, may have their own unique:
-
Set of dimensional structure: For example customer dimension being different across customer value analysis data-mart vs customer service analysis data-mart).
-
Unique ETL scripts- Due to unique dimensional model, there will be typically unique set of Extraction, Transformation and Loading scripts.
-
Silo IT support structure: (by IT or by outsourced vendor directly hired by business)
-
Own super-users: Different super users, having different level of skills to design the views.
-
Different labels and formulas: Different formulas having same labels and different labels having same formulae
This results in:
-
Source System overload- Due to multiplicity of ETL requests for the same information, the source systems, get into extended end of the day extraction jobs. For example 5 different data-marts related to customer analysis (Customer value analysis, customer service analysis, sales revenue analysis, sales leads analysis..), could be asking for same customer data 5 times, due to their individual ETL routines running.
-
Questionable and conflicting information- Due to non-standards terminologies and formulae, different reports from different data-marts will show conflicting figures. This leads to board-room fights.
-
Large number of super-users involved in the Data-Mart management- Given separate super users for each data-mart (and even separate super-users for each different element of the data-mart), significantly increases the number of headcounts and the cost associated.
-
IT support overload- The entire coding and development effort linked to creating & maintaining ETL scripts, Calc scripts (for OLAP servers) etc. will lead to high IT cost per data-mart.
A commonsensical solution
A high-level recommendation will be to do consolidation of data-marts to what-ever extent possible. However it is not an easy task and an ideal consolidation is impossible in the real world, unless you are doing a massive restructuring of your BI environment. Data-mart consolidation for cost-cutting & streamlining can be an outcome of a comprehensive BI program, but not the reason for it. This is because re-structuring of your BI environment is a longer term commitment for strategic reasons and not only for tactical efficiencies.
Different Data-Mart Consolidation steps
Each of these steps has a different level of effort and change management associated with it. You can adopt a combination of the below-said, depending upon your own scenarios:
-
Creating common-pool of super-users- This will save on HR cost and also promote cross data-mart visibility.
- Create virtual cubes using common dimension across the cube- let's assume that one data-mart is leading to a single single-cube. If two data-marts share a common dimension, they can be linked with each other through virtual cubes (creating joins across the cube- different tool vendors, adopt different techniques. Virtual cubes is a method used by tool like MS OLAP). For example if sales revenue and sales channel data-marts have the channel and location as shared dimension, they can be linked together. This will allow you:
-
To have reports, which use data across these two cubes (information enrichment)
- To identify any discrepancy across the shared dimensions (location channel defined in a different way in two different data-marts), as the shared dimensions have to be standardized, if they have to be used for joining two cubes.
- To identify any redundant measures across the data-marts. Once you have a join, in most cases you will be able to identify on the source for a measure. Let's say that you have sales revenue as a measure in both the data-marts. You can decide on which data-mart to use as the source of sales revenue (information source standardization)
- Standardizing the labels, formulae, business rules- by creating a council of super-users and IT resources, you can identify the non-standard elements (same formulae with different labels, same labels with different formulae). This will mean, the some design and development work will be needed in your calc scripts and ETL routines. This however is still not a true consolidation, where you re-structure multiple data-marts with a lesser number.
- Consolidate the ETL scripts and routines- You can create staging data-base, which is populated once from the source systems, and thereafter, the data-marts are picking-up data from the staging database, instead of overloading the source systems.
- Consolidate the data-marts- This means a true consolidation, whereby similar data-marts can be merged together and converted into a large data-mart(s).
|