Sales Management Customer Relationship Human Resources Business Performance BI & Data Quality IT Tools & Vendors

Sign-in   Register
Establishing 'Making it Happen' as a 'Formal & Predictable' Discipline
   Data Warehouse Modeling and Analyze Phase OLAP + Data Warehouse Design Phase  

ENCYCLOPEDIA→   Enterprise Intelligence  →   -  Data-Warehouse/Mart  →   -  Data Warehouse Project Plan- Work Break-Down Structure  → 

Data Warehouse Design Phase

Data Warehouse design phase is hinged around the extraction, transformation and loading phase. It is advisable to do a test run on a simple schema, before getting down to a full design of your enterprise data warehouse.

Detailed Assessment of your existing IT landscape

This is a detailed inventorization of your IT landscape, which could have implication in selection of your BI tool vendor as well as for the Business Intelligence design. This inventorization will include-

Integration platform

  • Any application integration platform, currently in use (like Tibco or MQ Series...).
  • What are the various application integration techniques are used? Most of the organizations, though having a core EAI platform, still use different type of integration techniques ranging from state of the art data exchange bus to point-to-point data file exchange.
  • Any current issues in the integration space.

The source systems (as identified in the analyze stage) inventory-

  • The servers and their location
  • The operating systems
  • The Database management system
  • How these source systems are currently exchanging data

Data Quality and Stability issues

  • Are there any weak spots related to the data quality?
  • What have been major production issues, related to the source systems?
  • Have there been any audit comments related to the source systems?
  • Are users happy with the data accuracy in the reports produced out of the source systems?

Organization's platform direction

  • Is the organization Oracle or IBM or MS centric in terms of its DBMS
  • Which ERP track the organization is going towards Oracle or SAP?

Future IT road-map

  • Are we planning any major IT initiative, which could have an impact on BI platform?
  • Is any of the source system planned to be re-written or significantly changed?

Evaluate and identify the core BI platform

Before you start designing the data warehouse, you should have identified your core BI platform including the ETL, Data Warehouse server, metadata management and OLAP server.

Along with this you would have identified the application server and web server, which will enable the integration of the core BI platform with other applications and with the end-users. In reality, most of the times, you select your core BI platform components (recommended to be taken from a single vendor apart from data Warehouse server...) before you initiate the project. This is because your vendor brings in huge knowledge-base to support you through analyze and design phases. The evaluation criteria are provided in the Business Intelligence Tool Evaluation Kit. The whole process of evaluating and identifying vendors and tools is covered in vendor-tool domain section

Do source system Mapping

This is well covered in the Source systems mapping and Source systems mapping matrix. In-brief, you create the list of source systems, and get the details on the level of stability, quality of data, and which data they can provide. At this stage you will also decide on which system will be the master reference for which data.

Do ETL for first set of schemas

The details of ETL are covered in Extraction Design, Transformation design and Loading Design. If you are working on an enterprise data warehouse and not data-mart (refer Data Warehouse vs Data mart), you will be picking up first 1-2 simpler schemas and do an end to end design of extraction, transformation and loading. This will provide you a significant level of learning on the issues you face, your learning in terms of the configuration and usage of ETL tools (and check if they really can do what they claim to do...)

NOTE- Some part of this effort may be throw-away. This is because, when you create your ETL strategy for all schemas in the data-warehouse, your ETL design will be different vis-à-vis doing it for 1-2 schemas. Many of the Extraction and transformation routines could serve multiple schemas. (As a help note- schema is like a single cube in OLAP...)

Develop and Test ETL for first set of schemas for sample production data

Do the development and testing for the first set of identified schema and check on how it works. This will provide you the last leg of initial learnings you need to have before you go for full-fledged design.  Development and testing should not be that much of an issue, if you have a smart ETL tool.

Do the ETL design of the entire set of schemas

Now is the time to go ahead with the detailed design for the entire enterprise data-warehouse. This means that you will do a holistic design of all the schemas involved in the current scope of data-warehouse initiative. As you do the holistic design, across the schemas, you may use the principles of synergies so that one can-

  • Avoid duplicate extractions- Though three schemas are looking for the customer data, the extraction routine for getting customer master should be run once.
  • Avoid duplicate transformation routines- It’s the same principle as above. If there are 3 schemas looking for customer data, you may run the transformation routine of de-duping customer master only once.
  • Make entity based transformation routines- This is aligned to the principle of conformed and foundation dimensions. A healthy data warehouse design will use standard dimension table design for a given dimension across various schemas. This will mean that even if a given schema is not using specific dimension attribute (like not using the state attribute in the location dimension), it will still be maintained in the dimension table and populated (if the data exists...).You may refer use foundation dimensions for cross-drilling.

Taking this principle into account, an ideal transformation will be the one which will do the end to end transformation of a dimension in one go. For example, it will do all transformation related to customer master, location master, product master, so that the whole process is clean and auditable.

Implement the physical database design

The whole ETL design (which includes data base structures as well as the ETL routines), will be in the logical form to start with. The same has to be translated into the physical design. A complete chapter on Data Warehouse physical design will be released soon.

Configure Metadata for the design elements

Once your design is complete, you will start configuring the data warehouse metadata in the metadata management tool. The metadata tool configuration will be an ongoing process, which primarily starts with design phase.

Sometimes the question comes on 'Metadata also includes dimensional models as you create them in the analyze phase. Why should you not configure those dimensional models in Metadata tools in the analyze phase?’ The answer to this question is that firstly the dimensional models are signed off towards the end of the analyze phase and beginning of the design phase. Secondly, as you do the ETL design, they undergo some level of fine-tuning in the dimensional model. Therefore our suggestion would be to start configuring your metadata tool in the design phase, after you have done the holistic ETL design.

Design and Configure Data Warehouse Job flow and Scheduling

The ETL design includes the sequence of tasks. In these steps, you will translate this sequence into a job flow and control language. This is part of data base management and data centre operations task. ETL scheduler tool is the key in this set-up. All good ETL tools come with the ETL schedulers. You may also decide to use a third party schedule as you may want a single schedule for all your job scheduling activities.

Design and Configure Data Warehouse Access Browsing and Query Servicing

Data Warehouse querying and browsing services are typically part of the standard Data Warehouse server. The only difference here is that you would need to configure them for Data Warehouse kind of design. For example, making query services aggregate aware. This means that if an aggregate/summary level schema/cube exists, query service will not be using the detailed level schema for running a summary level query.

Design and Configure Staging Servers and Loaded Areas;

This is the standard step of configuring the servers with all the attendant components, just like you will do it for any OLTP application.

 

   Data Warehouse Modeling and Analyze Phase OLAP + Data Warehouse Design Phase  
 
 

Was this page helpful?
 
 
More on Data Warehouse Project Plan- WBS
Data Warehouse Project Definition
Data Warehouse Project Initiation Phase
DW Business Requirements Gathering Phase
Data Warehouse Modeling and Analyze Phase
OLAP + Data Warehouse Design Phase
Physical Database Design and Implementation
BUY BI & Data Management Vendors & Tools Evaluation Kit
Read more...
BUY largest on-line Data-Quality Management Kit
Read more...
Additional Channels
Principles & Rules
Free Templates
Glossary
Key Performance Indicators



Most Popular Zones with list of pages crossing 25000 hits  →→→ 
Maximising Sales Performance
Sales Leads Management System
Sales Channel Retention, Support and Engagement
Sales force Training and Development
Sales Leads Classification and prioritization
Sales geographic expansion
Read more...
  Customer Relationship Management
Customer Value and Profitability Tips and Actions
Customer Knowledge and Organizational Knowledge
Drivers for Customer Satisfaction & Retention
Customer Segmentation Actions
Customer Value and Profitability- BI
Read more...
  Human Resources & Leadership
Act with Decisiveness
Give feedback closer to the observation
Lead diverse and collaborative teams
Customer Focus
Business and Financial Acumen
Read more...
 
 
Business Performance & Planning
Strategy Map Objectives Measures and Initiatives
SWOT Assessment Report
For important KPIs- Install first & Fix later
Individual goal Sheet
Internal Info Assessment Report
Read more...
  Business Intelligence & Data Quality
Customer Satisfaction and Retention- BI
Business Ownership of Data Quality
DMA Data flow Analysis
System Quality Assessment Tool
A smart manager does not follow-up
Read more...
  IT Vendors & Tools Management
Vendor Evaluation Matrix
Load, Log and Cache Management for Reports
Metadata Tool Architecture Features
Vendor Credentials and Track-Record Evaluation
Metadata Tool administration Security
Read more...