|
Data Warehouse Source Systems – Concepts
NOTE- While this page is part of Data Warehouse section, the concept and process can be equally applied in the areas of Master Data Management and Operational Data Store. Both MDM and ODS involve the use of Extraction, Transformation and Loading. The extent and intensity of transformation varies depending upon the purpose.
Sources systems are all the data feeding pipes to the staging area. TYPICALLY any Transformation on the data is done after the data in its raw OR unchanged form is picked from the source systems.
Core Systems
These systems include the core systems mostly having well-organized database, set schedules, the data being updated on online basis. Typical systems are core product manufacturing systems, accounting systems, Money Management systems, commission/sales compensation systems and tightly coupled job/workflow systems.
Typical traits of these systems are:
- There is an emphasis on the quality of Data related to external customers and stakeholder. (Try sending a wrong card statement OR bank account statement).
- Any data apart from the above typically does not get that level of emphasis as its inadequate quality does not raise that many whistles OR risk to the bonuses.
- Large applications running for a long time do not have enough documentation OR skills to enable fully understand the data layout, across thousands of tables, masters, flags, dates and codes.
- The well managed job scheduling etc. allows a reliable extract files ones the agreement has been reached with the technology owner.
- The production Batch-schedules for these systems take maximum priority and sometimes the Data Warehouse has to struggle to find right place under the sun.
- Due to constant system evolution and companies preferring to build additional layers on their legacy/older systems, it leads to multiple and varying platforms. (The Policy issuance legacy system and a state of the art commission/compensation calculation system).
Field OR Front-end Systems
These are the systems, which are primarily used by the customer acquisition and retention staff. These include customer service systems, sales automation systems, lead management systems and campaign management systems. Features of these systems are:
- These systems 'typically' do not enforce the data quality as rigorously as the core systems.
- Many a times, these systems are developed as functional level initiatives (it is possible to have different functions having different customer service OR campaign management systems). Therefore their data standards, data models etc. may not be in synch.
Modeling and Analysis systems This family of systems includes budgeting, planning, forecasting, pricing and valuation type of systems. Features of these systems are: - These are more standalone and farther from the mainland core systems compared to field systems.
- There is no rule OR even a remote generalization possible for these system, as the may range from excel based models to highly sophisticated systems like Data Mining.
External Data - Demographic market segment data.
- Target Customer Lists
- Customer Business segments
- Industry benchmarks
- Credit bureau information
- Industry standards like mortality rate tables for life insurance companies.
Industry benchmark and standards data is available in a standard format. The marketing and customer related information is available in varying forms and they are ridden with data quality issues.
Non-Data Base/Desk top Sources
Wealth of information and critical operational data resides in the Excel sheets and MS access tables in the desktops OR local servers of organization. If you want to surprise yourself, just make a study of the number of excel based applications, which have become critical part of operational delivery and reporting. The numbers could go in hundreds, if not in thousands.
Some of these carry the derived data from the main systems, but many of them carry the original data, which does not exist in the production databases. Typical examples are the excel based trackers, calculation tables, MIS packs. Though some technical die-hards will keep on raising the risks of these kinds of systems, this evolution is as inevitable as the month-end skew in Sales community.
Apart from Excels, MS Access, there is large amount of data available in interface and flat files produced by the systems. Typically core systems and more often legacy systems produce the output files through extensive batch runs, and these files serve the purpose of reducing the Transformation load. Secondly these output files some time contain the historical information, which is either achieved OR over written in the core system.
Typically you will use the existing services, infrastructure and data to create the first phase of your Data-warehouse.
Data Warehouse Source Systems Mapping Design- Process
This is the first step of the technical architecture development: Source System Mapping:
This is a very important step. This will contain the listing of 'all' sources of data in the business environment, which can supply data to the given scope of DW. The output is the Source system mapping tables. It will contain:
- The source system name
- The source system 'Business owner' and 'technology owner'
- Brief functionality and purpose
- Referring to the dimension and Facts tables, list out the facts and dimensions, which are available. Against each dimension OR fact list out:
- The frequency of change. - Is it 'Master Reference'? - Is it equal to 'master reference' ? - The tables in which the data is residing. - Is the historical information overwritten OR appended?.
PLEASE REFER Execution-MiHPractice Tool Source system mapping matrix for how to implement the source system mapping. |