This page presents a high level listing of the components linked to Data Warehouse. PLEASE REFER TO Business Intelligence Architecture for a complete big picture on the relevance and positioning of Data Warehouse.
Source systems and Databases
Source Systems are all those 'transaction/Production' raw data providers, from where the details are pulled out for making it suitable for Data Warehousing. The sources can be quite diverse:
· Production Databases like Oracle, Sybase, SQL.
· Excel Sheets.
· Database of small time applications like in MS Access.
· ASCII/Data flat files.
Data Staging 'Area'
The data staging area is the place where all 'grooming' is done on data after it is pulled from the Source Systems. The end point of grooming is for the Data to be loaded into the 'Analysis OR Presentation Server'. Data staging covers most of the 'back-bone' activities of a Data-Warehouse, which typically are also the biggest analytical and technical challenge of a project. These activities are 'Extraction' and 'Transformation'
Data Extraction is an activity, which pulls the data from various data sources. Most of these sources are production systems OR are used for transaction level work.
If Data Extraction is mining the iron ore, Transformation is to create the steel billets. The Transformation makes sure that the transaction level raw data is transformed into a form (while still being detailed) so that it can be loaded into the 'presentation/Loaded' area.
This is the repository where the data is finally loaded after going through all the works of Extraction and Transformation. This becomes the ultimate source for information for various reasons ranging from queries to advanced data modeling.
The presentation area has data model, which is different from that of production system. This is called Dimensional Model. It is the way data is organized in data-warehouse. This concept has been dealt with fair degree of detail as this is the engine of Data Warehouse.
Meta Data subject is covered in a separate section. It contains all the business and technical designs, rules and locations etc. of all the data starting from the Extraction to final data usage.
End User Tools and Applications.
Data is cooked for consumption. There is a long list of applications to which the data can be put to and the tools, which can make it happen. This includes the reporting, publishing, analysis, modeling and mining tools.
Data-Warehouse Administration and Tools
Data warehouse is a large platform, which has large number of users, data sources and data targets. Just like production systems, it has to be administered in terms of performance, timelines and availability. This also includes activity logging, data security, backing-up and archiving.
The entire section of Data Warehouse is equally applicable to a Data-Mart. A Data-Mart is a Data repository with a more restricted and short-term perspective. Please refer to De-Normalized Data Warehouse/Data Mart for similarities and differences between a Data Warehouse and a Data Mart.
OLAP Servers & Data Marts
While Data Warehouse can be accessed for any end-user tools application, it also feeds to the downstream OLAP Layer. For example, HR wants to have its own data mart in their separate servers due to confidential reasons. Similarly people who are traveling may need to have their own offline data Mart.