Data is extracted from the source system by various methods (collectively called Extraction) and placed in normalized form in the ‘Staging Area’. Once in the Staging Area, the data is cleansed, standardized and re-formatted to make it ready for Loading into the Data Warehouse Loaded area. We cover only the broad details here. For the details of staging, refer to Data Extraction and Transformation Design in Data Warehouse.
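The flow described above (extract into staging, then cleanse, standardize and re-format before loading) can be sketched roughly as follows. This is a minimal illustration, not a reference design: the sample rows, the field names, the country mapping and the `extract`, `transform` and `load` helpers are all hypothetical, and plain Python lists stand in for the staging and loaded areas.

```python
from datetime import datetime

# Hypothetical rows as they might arrive from a source system (illustrative only).
SOURCE_ROWS = [
    {"cust_id": " 101 ", "name": "alice SMITH", "country": "us", "signup": "2023-01-15"},
    {"cust_id": "102", "name": "  Bob Jones ", "country": "USA", "signup": "15/02/2023"},
    {"cust_id": "", "name": "no id", "country": "US", "signup": "2023-03-01"},
]

# Assumed standardization rules; a real system would hold these in reference tables.
COUNTRY_CODES = {"us": "US", "usa": "US"}
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def extract(source):
    """Extraction: copy the raw rows into the staging area unchanged."""
    return [dict(row) for row in source]


def parse_date(text):
    """Re-format: accept any of the known source date formats, return a date."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {text!r}")


def transform(staged):
    """Cleanse, standardize and re-format the staged rows."""
    clean = []
    for row in staged:
        cust_id = row["cust_id"].strip()
        if not cust_id:
            continue  # cleanse: a row without a key cannot be loaded
        country = row["country"].strip()
        clean.append({
            "cust_id": int(cust_id),
            "name": " ".join(row["name"].split()).title(),  # standardize spacing and case
            "country": COUNTRY_CODES.get(country.lower(), country.upper()),
            "signup": parse_date(row["signup"].strip()),
        })
    return clean


def load(rows, warehouse):
    """Loading: append the prepared rows to the loaded area."""
    warehouse.extend(rows)
    return warehouse


staging_area = extract(SOURCE_ROWS)
warehouse = load(transform(staging_area), [])
for row in warehouse:
    print(row)
```

Keeping the raw extract in `staging_area` untouched and producing the cleaned rows separately means a failed transformation can be re-run without going back to the source system.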
The Staging Area is important not only for Data Warehousing but for a host of other applications as well, so it has to be seen from a wider perspective. Staging is an area where sanitized, integrated and detailed data exists in normalized form. The concept of staging is hardly new. It is common sense to keep an offline database for reporting, so IS managers have used staging in one way or another for a long time. It became a recognized label only with the advent and popularity of the Data Warehouse. However, there is much more to it than a mere change of labels.
With the advent of the Data Warehouse, the concept of Transformation has gained ground, bringing a high degree of quality and uniformity to the data. Conventional (pre-data warehouse) Staging Areas used to be plain dumps of the production data. A Staging Area with both Extraction and Transformation is therefore the best of both worlds for generating quality transaction-level information.
A staging area is sometimes used for scheduled/Production Reports. As staging is mostly normalized, the queries run on it have to be predictable in terms of volume and timing (unlike queries on a data warehouse). One may ask why the production reports cannot be generated from the data warehouse. There are two answers. First, using the Staging Area distributes the query load across two separate areas. Second, it allows the reports to be produced earlier, as one does not have to wait for the Loading process to finish. That said, reporting from the staging area is not a desirable option. We recommend that all information be taken from the data warehouse platform.
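To make "predictable in terms of volume and timing" concrete, the sketch below runs one fixed, parameterized report against a small normalized staging schema using Python's built-in `sqlite3` module. The table names (`stg_customer`, `stg_order`), the columns and the sample data are assumptions made for illustration. The sketch shows the scenario under discussion only and does not change the recommendation above to report from the data warehouse.

```python
import sqlite3

# In-memory database standing in for a normalized staging area (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stg_customer (
    cust_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE stg_order (
    order_id   INTEGER PRIMARY KEY,
    cust_id    INTEGER NOT NULL REFERENCES stg_customer(cust_id),
    order_date TEXT NOT NULL,
    amount     REAL NOT NULL
);
INSERT INTO stg_customer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO stg_order VALUES
    (10, 1, '2024-03-01', 120.0),
    (11, 1, '2024-03-01', 30.0),
    (12, 2, '2024-03-01', 75.5),
    (13, 2, '2024-03-02', 10.0);
""")

# A scheduled report is one fixed query: the joins are known in advance and the
# only thing that varies between runs is the date parameter.
DAILY_SALES_REPORT = """
SELECT c.name, COUNT(*) AS orders, SUM(o.amount) AS total
FROM stg_order AS o
JOIN stg_customer AS c ON c.cust_id = o.cust_id
WHERE o.order_date = ?
GROUP BY c.cust_id, c.name
ORDER BY c.name
"""


def daily_sales_report(connection, report_date):
    """Run the fixed daily report for one date and return its rows."""
    return connection.execute(DAILY_SALES_REPORT, (report_date,)).fetchall()


report = daily_sales_report(conn, "2024-03-01")
for name, orders, total in report:
    print(name, orders, total)
```

Because the query shape never changes, its cost on the normalized tables can be measured once and scheduled around the extraction window, which is not possible for ad hoc analysis.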