Given the volume of data and the complexity of the scenarios, a Data Warehouse is best tested in stages, gradually scaling up both the data volume and the scenario coverage.
Limited Data Warehouse Test Data
This involves feeding a limited number of transactions into the source systems (typically fewer than a few thousand for each data mart). Ideally, these transactions should cover the key scenarios for the different transformation logic. For example, if a transformation performs de-duplication, include a couple of duplicate cases. For a customer dimension, include customers of different ages, income groups and so on.
After the source systems have processed these transactions, the entire warehouse processing is run and the results are checked at each interim stage and in the end-user tools. The skill lies in covering the maximum number of scenarios with the minimum number of transactions. However, never try to include every scenario. The guideline is to include scenarios that are complex or that involve complex programming logic (such as 'de-dup' or 'standardize').
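As an illustration of a small, scenario-driven test set, the sketch below builds a handful of customer records that include one deliberate duplicate and one look-alike that must not be merged. The `dedupe_customers` function, the field names and the matching rule (normalized name plus date of birth) are all assumptions made for this example; a real warehouse would use its own transformation logic.

```python
def normalize_name(name):
    """Collapse case and extra whitespace so near-identical names compare equal."""
    return " ".join(name.lower().split())


def dedupe_customers(rows):
    """Keep the first row seen for each (normalized name, date of birth) key.

    Hypothetical stand-in for the warehouse's real de-dup transformation.
    """
    seen = set()
    unique = []
    for row in rows:
        key = (normalize_name(row["name"]), row["dob"])
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# A tiny test set: different ages and income groups, plus targeted edge cases.
TEST_CUSTOMERS = [
    {"id": 1, "name": "Asha Rao", "dob": "1985-02-11", "income_group": "mid"},
    {"id": 2, "name": "asha  rao", "dob": "1985-02-11", "income_group": "mid"},   # duplicate of id 1
    {"id": 3, "name": "Ben Ortiz", "dob": "1962-07-30", "income_group": "high"},
    {"id": 4, "name": "Chen Li", "dob": "2001-12-05", "income_group": "low"},
    {"id": 5, "name": "Ben Ortiz", "dob": "1990-01-15", "income_group": "low"},   # same name, different person
]

result = dedupe_customers(TEST_CUSTOMERS)
result_ids = [row["id"] for row in result]
print(result_ids)
```

Five rows are enough here to exercise both the positive case (a true duplicate is removed) and the negative case (two distinct customers who share a name are kept).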
Limited Production Data for Data Warehouse Testing
The next step is to expand both the scenarios and the data. This is achieved by fine-tuning the extract scripts so that they pick up a limited amount of production data from the source system. The filtering is typically kept simple.
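A simple filter of this kind might look like the sketch below, which restricts an extract to one region and a cut-off date. The `transactions` table, its columns and the `extract_limited` function are hypothetical, and an in-memory SQLite database stands in for the production source system so that the example is self-contained.

```python
import sqlite3


def extract_limited(conn, region, since):
    """Pull only rows for one region on or after a cut-off date.

    Hypothetical extract step; the filter is kept deliberately simple.
    """
    query = (
        "SELECT id, region, txn_date, amount FROM transactions "
        "WHERE region = ? AND txn_date >= ? ORDER BY id"
    )
    return conn.execute(query, (region, since)).fetchall()


# In-memory database standing in for the source system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions ("
    "id INTEGER PRIMARY KEY, region TEXT, txn_date TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        (1, "north", "2024-01-05", 120.0),
        (2, "south", "2024-01-06", 75.5),
        (3, "north", "2023-12-30", 40.0),
        (4, "north", "2024-02-01", 310.0),
    ],
)

subset = extract_limited(conn, "north", "2024-01-01")
subset_ids = [row[0] for row in subset]
print(subset_ids)
```

Storing dates as ISO-8601 text lets the plain string comparison in the `WHERE` clause act as a date filter, which keeps the extract logic easy to review.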
Full Production Data
This is the final, must-do test for a data warehouse. Users will generally not accept a system until they have reconciled it with their reports from the source system. (For example, the Data Warehouse should show as many telecom customers as the core production systems show.)
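One way to picture such a reconciliation is a count comparison between the source system and the warehouse, broken down by a grouping key. The sketch below is only an illustration: the `reconcile_counts` function, the `plan` field and the sample rows are assumptions, and a real reconciliation would read from the actual source reports and warehouse tables.

```python
from collections import Counter


def reconcile_counts(source_rows, warehouse_rows, key):
    """Return {group: (source_count, warehouse_count)} for groups whose counts differ."""
    src = Counter(row[key] for row in source_rows)
    dwh = Counter(row[key] for row in warehouse_rows)
    mismatches = {}
    for group in sorted(set(src) | set(dwh)):
        # Counter returns 0 for a missing group, so one-sided groups are caught too.
        if src[group] != dwh[group]:
            mismatches[group] = (src[group], dwh[group])
    return mismatches


# Hypothetical telecom customers in the source system and in the warehouse.
source_customers = [
    {"id": 1, "plan": "prepaid"},
    {"id": 2, "plan": "prepaid"},
    {"id": 3, "plan": "prepaid"},
    {"id": 4, "plan": "postpaid"},
    {"id": 5, "plan": "postpaid"},
]
warehouse_customers = [
    {"id": 1, "plan": "prepaid"},
    {"id": 2, "plan": "prepaid"},
    {"id": 3, "plan": "prepaid"},
    {"id": 4, "plan": "postpaid"},
]

mismatches = reconcile_counts(source_customers, warehouse_customers, "plan")
print(mismatches)
```

An empty result means the counts agree for every group; any entry points to the segment where the warehouse and the source system have diverged.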