This is a relatively interesting and thoughtful tip. The typical
use of this ETL and DW combo is to provide the information needs (enterprise
reporting, Query and analysis and Performance reporting) for a given set
of dimensions, data elements. It also provides you information to maximum granularity. However mostly the information on the 'source' or 'source system' of the information
is lost once the data is taken through the ETL. That is fine as long as you
have the above conventional needs. However, it becomes a problem, when reporting
and querying is linked to a particular source system and not only for a particular
information theme. For example:
- Number of sales calls made to active customers v/s how many sales calls
were made to active customer through the "Jupiter" (say a name of customer
contact system)?
- How much revenue was booked through the 'Saber' system?
- Mismatch between the financial information reported through the data warehouse
and the transaction systems. If this needs to be investigated, one needs to
have an audit trail on how the information from source system got transformed
to the loading stage.
Therefore it is advisable to have the key master fields (source system customer
ID, location ID) to be retained as attributed in the dimension tables.
The other key benefit for the source system trail to be maintained is to change manage the dimensional model. There can be two types of changes in the DW environment from design perspective:
- Source system undergoing a change and you want to do an impact analysis on the downstream dimensional model.
- You have additional information requirement. Due to this, you are adding new attributes and facts to the existing dimensional model. In this case you will need to change your ETL programs.
In both the above scenarios, you will have to do an impact analysis and drive those changes. If you have a source system trail, you will be much better equipped on the impact analysis and doing the change management. For example, let us say that you are extracting travel booking information from three different travel systems. If one travel system undergoes a change in the 'travel_agent_ID' field code (which means that agent code is now 8 characters instead of 6 characters) and you have been tracking it in your dimensional model in the data-warehouse, you can make the change much easily. |