Building Making It Happen
Building Making It Happen
  Sign-in         Register
    
Principles and Rules Listing Page
You cannot have a super-flexible Data warehouse
While storing data at the most granular level , makes a data-warehouse flexible to unforeseen queries. However, one does need to keep the key purpose behind the creation of a schema or a cube in OLAP in mind.
 
This page of 'Principles and Rules' is linked to:  Data Warehousing, Data Analysis/OLAP, BI platform Tools Evaluation,


We have shared the tips on how to make your dimensional model scalable and extensible, a data warehouse cannot be super flexible, in terms of handling any kind of query.

When you present a business-case on BI ( or in particular Data Warehouse and OLAP - the largest piece in terns of efforts), avoid over-selling and positioning a DW as a black-box, which can provide you the answers to all the questions. Having gone through the brochures of many platforms, the over-selling is much more acute in BI platforms.

The reasons that you cannot have a super-flexible DW are:

No platform can realistically handle beyond certain number of dimensions

One can think that if you pick all possible data at detailed level and place it in distinct dimensional model schemas, you can process any query. This is not the case, because a super flexible data warehouse will have to have a single schema and large number of dimensions. While many BI platforms claim to be able to handle hundreds of dimensions, in reality going beyond 30 odd dimensions can lead to huge data explosion.

Summary and aggregates

You need to have summary or aggregate schemas, where you have data at summary level (like agent level sales figures, instead of individual invoice level data), to increase the response time to the queries. If you only have detailed data, multiple queries running on a schema can de-mobilize the system. These summary schemas are drive by the specific business requirements. You can have million different options to summarize the detailed data. You have to be selective on which aggregate schemas to be made for which level of summarization.

Derived attributes and facts

To increase the response time, one also creates derived facts. Here also you need to be selective.

Processing and population time

Even if you have unlimited storage and processing infrastructure, a large dimension set, and overly large number of aggregations/derived facts will expand the end of day extraction, transformation and loading process, which will reduce the windows available to the users.

Conclusion

In other words, one can have a super flexible data warehouse, if one has infinite storage and processing power and one is ready to spend half a day to populate the DW and ready to wait for many days to run a complex query. As you will agree, none of these conditions are feasible.

   Access more details on this page   

Quick Feedback- Was this information helpful ?
Relevant Links to this page
Principles & Rules → Add extra buffer for ETL phase → Principles & Rules → Homework before interviews is must (Business Requirements Phase in Data Warehouse) → Principles & Rules → Excel is the competition, which should be challenged → Principles & Rules → Avoid Pure MOLAP → Practice Techniques → Field Tips Series- Streamlining & Cost-Reduction in Business Intelligence- Consolidate Data-Marts → Practice Techniques → Field Tips Series- Streamlining & Cost-Reduction in Business Intelligence- Licensing & Maintenance Contracts → Practice Techniques → Field Tips Series- Streamlining & Cost-Reduction in Business Intelligence- Governance & Standards → Practice Techniques → Field Tips Series- Streamlining & reducing cost of Business Intelligence- Evaluate Open Source → Principles & Rules → Master Data Management- Making a Right Start → Practice Techniques → How to integrate stand-alone BI environments- Gradual Approach → Principles & Rules → Business owned applications are a reality- Manage it → Principles & Rules → New Data Standards- What about existing data and applications? → Principles & Rules → Handle Each Time-stamp in the Fact Table as a separate dimension → Principles & Rules → Keep Aggregates and Details data in different Fact tables → Principles & Rules → Some considerations for Infrastructure in Data Warehouse → Principles & Rules → For Core BI platform go for a single, established and robust player → Principles & Rules → Don't be guided only by the business requirements for your Business Intelligence → Practice Techniques → Using Synonyms and Views → 
 
Back
Featured Pages
MDM Data Quality Control
Knowledge Discovery in Databases Methodology
Data Warehouse Infrastructure Considerations
Data Warehouse Design and Architecture Overview

Make 'Executable' Strategy
Maximize Results
Maximize People
Manage Execution

Featured Pages
Business Intelligence organization roles
Customer Data Quality Impacts
DMA Data flow Analysis
Knowledge Discovery in Databases Process