This page is an extract from BIDS Metadata Management Solution, authored by Kamlesh Mhashilkar, Head, Execution-MiH Services, Tata Consultancy Services.
Metadata supports enterprise-wide data integrity and enables smooth data flow across the enterprise. A well-defined metadata management strategy enables organizations to integrate the various BI system components and deploy a best-of-breed architecture. Metadata is usually hidden in data structures, ETL processes and presentation layers, and can be integrated so that it is shared across applications. Metadata interoperability allows the most appropriate technology to be used for sharing metadata. A metadata-centric reporting / presentation layer lets users know what is stored in otherwise opaque systems and supports the entire business chain, creating a competitive advantage for the organization. Metadata is also useful for impact analysis: how a change will affect the system, its components, and other related applications can be tracked with metadata-based impact analysis.
Metadata integration is the base of enterprise-wide data integrity. Metadata management ensures that the components that inter-operate with the enterprise metadata repositories will all be using the same data, and will ascribe consistent meanings to it. It also enables an organization to establish consistency in its enterprise-wide data.
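The metadata-based impact analysis described above can be pictured as a walk over lineage metadata. The following is a minimal sketch, assuming the lineage is available as a simple mapping from each object to the objects it feeds. The `LINEAGE` dictionary, its object names and the `impacted_objects` function are hypothetical and serve only as an illustration.

```python
from collections import deque

# Hypothetical lineage metadata: each key feeds the objects listed as its value.
LINEAGE = {
    "src.orders": ["stg.orders"],
    "stg.orders": ["dw.fact_sales"],
    "src.customers": ["stg.customers"],
    "stg.customers": ["dw.dim_customer"],
    "dw.dim_customer": ["report.sales_by_region"],
    "dw.fact_sales": ["report.sales_by_region", "report.daily_revenue"],
}


def impacted_objects(changed, lineage):
    """Return every downstream object reachable from `changed`, breadth-first."""
    seen = set()
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


if __name__ == "__main__":
    # A change to the staging table affects the fact table and both reports.
    print(impacted_objects("stg.orders", LINEAGE))
```

A breadth-first walk lists the nearest dependents first, which is usually the order in which a change would be reviewed.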
BI Metadata Management Program – Goals and Objectives
A well-defined metadata management program supports high-quality BI information and provides sufficient flexibility to extend the scope of the BI system to new information requirements and sources. Metadata management primarily aims at standardized and centralized metadata, yielding a flexible and robust metadata architecture.
Goals
The overall goals for an enterprise-wide BI metadata management program can be defined as follows:
- Standardization in metadata as well as data handling
- Centralization of metadata management
- Elimination of duplication of metadata information
- Transition-adaptive metadata architecture
Objectives
The specific objectives can be stated as follows:
- To develop metadata and data standards.
- To centralize the administration as well as the usage of the BI system.
- To improve data integrity and accuracy through non-redundant / non-duplicated metadata information.
- To reduce the effort in development, enhancement, implementation and maintenance of the BI system components.
- To establish a flexible metadata architecture to incorporate alterations in the BI architecture.
Business Intelligence Metadata Management Requirements
The high-level requirements for the generation and management of metadata can be grouped as follows:
Metadata Standardization
- Unique terminology and standardized communication within the enterprise: The availability of metadata as a single source for users should bring several benefits. It should ensure a consistent vocabulary for users to communicate, understand, and interpret business information. It should eliminate ambiguity, guarantee consistency of information within the enterprise, and enable sharing of knowledge and experience.
- Seamless system integration: ETL processes, especially integration, rely on the metadata of the various data sources and of the BI system. The standardized metadata should aid in integrating data from various systems and give a unique meaning to the data elements loaded into the BI system. Furthermore, the integration of different applications and tools is only possible if their metadata is shared through a standard mechanism.
- Data quality improvement: Standard quality assurance rules should be defined. These form an integral part of the ETL metadata. Data quality covers the aspects abbreviated as 'LUCAS':
  - Latency: Whether the data is received on schedule from all the given sources,
  - Uniqueness: Whether the data is free of duplication,
  - Consistency: Whether the representation of the data is uniform, with no ambiguous or confusing definitions arising over time,
  - Accuracy: Whether the stored value and the source value reconcile (from the perspective of precision and confidence in the data),
  - Sufficiency: Whether the data or data elements are sufficient, with nothing missing.
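The LUCAS aspects listed above can be expressed as rules evaluated against ETL metadata. The sketch below is only an illustration under stated assumptions: the `feed_meta` dictionary, its field names, the sample `rows` and the `quality_report` function are all hypothetical, and each check is a deliberately simple stand-in for a real quality rule.

```python
from datetime import datetime

# Hypothetical ETL metadata for one feed; all names and values are illustrative.
feed_meta = {
    "expected_by": datetime(2024, 1, 1, 6, 0),   # scheduled arrival
    "received_at": datetime(2024, 1, 1, 5, 45),  # actual arrival
    "key": "order_id",
    "types": {"order_id": int, "amount": float},
    "required": ["order_id", "amount"],
    "source_total": 350.0,  # control total reported by the source system
}

rows = [
    {"order_id": 1, "amount": 100.0},
    {"order_id": 2, "amount": 250.0},
    {"order_id": 2, "amount": 250.0},  # duplicate key
    {"order_id": 3, "amount": None},   # missing value
]


def quality_report(rows, meta):
    """Evaluate one simple check per LUCAS aspect and return the results."""
    keys = [r[meta["key"]] for r in rows]
    loaded_total = sum(r["amount"] or 0.0 for r in rows)
    return {
        # Latency: did the feed arrive by its scheduled time?
        "latency_ok": meta["received_at"] <= meta["expected_by"],
        # Uniqueness: is every key value present only once?
        "uniqueness_ok": len(keys) == len(set(keys)),
        # Consistency: does every non-missing value have the declared type?
        "consistency_ok": all(
            r[c] is None or isinstance(r[c], t)
            for r in rows
            for c, t in meta["types"].items()
        ),
        # Accuracy: does the loaded total reconcile with the source total?
        "accuracy_ok": abs(loaded_total - meta["source_total"]) < 0.01,
        # Sufficiency: are all required data elements populated?
        "sufficiency_ok": all(
            r.get(c) is not None for r in rows for c in meta["required"]
        ),
    }


if __name__ == "__main__":
    for check, passed in quality_report(rows, feed_meta).items():
        print(check, passed)
```

With the sample data, the duplicate key fails the uniqueness check, the inflated total fails the accuracy check, and the missing amount fails the sufficiency check, while latency and consistency pass.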
Metadata Centralization
- Improvement of analytics as well as user interaction with the BI system: Analytics covers a wide range of techniques, starting with simple query-based reporting, continuing through OLAP analysis and extending to complex data mining. User interaction with these techniques is largely guided by the metadata layer. All the different kinds of analytics should be metadata driven. The metadata model should provide the user with centralized information about the meaning of the data, the terminology and business concepts used within the enterprise, and their relationship to the data. Metadata should therefore allow users to pose precise, well-directed queries and should reduce the cost of accessing, evaluating and using appropriate information.
- Data integrity and accuracy: Centralized metadata should assure non-redundant / non-duplicated metadata information. In addition, high Business Intelligence data quality requires data traceability and reconciliation. ETL procedures should manage traceability by capturing the data heritage (e.g. source, schedule information, receipt timestamps) and should manage reconciliation through methods such as checksums. Centralizing all this information aids in the speedy resolution of data integrity issues and in managing the accuracy of the captured information.
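The traceability and checksum reconciliation described above could be captured in a single lineage record per load. The following is a minimal sketch, assuming rows are available as tuples on both the source and the target side. The function names `row_checksum` and `lineage_record` and the record fields are hypothetical. SHA-256 from Python's standard `hashlib` module is used as one possible checksum, not as the method prescribed by the source.

```python
import hashlib
from datetime import datetime, timezone


def row_checksum(rows):
    """Order-independent SHA-256 digest over a canonical text form of the rows."""
    digest = hashlib.sha256()
    # Sorting the serialized rows makes the digest independent of load order.
    for line in sorted("|".join(str(v) for v in row) for row in rows):
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()


def lineage_record(source, schedule, source_rows, loaded_rows):
    """Capture the data heritage of one load and whether it reconciles."""
    src = row_checksum(source_rows)
    tgt = row_checksum(loaded_rows)
    return {
        "source": source,
        "schedule": schedule,
        "received_at": datetime.now(timezone.utc).isoformat(),
        "source_checksum": src,
        "target_checksum": tgt,
        "reconciled": src == tgt,
    }


if __name__ == "__main__":
    source_rows = [(1, "A", 100.0), (2, "B", 250.0)]
    loaded_rows = [(2, "B", 250.0), (1, "A", 100.0)]  # same rows, other order
    record = lineage_record("orders_feed", "daily 06:00", source_rows, loaded_rows)
    print(record["reconciled"])  # True
```

Storing such records centrally gives an administrator one place to look when a load does not reconcile.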
Reduced efforts in BI system management
- Support for the development of new applications: Metadata provides information about the meaning of the data, its structure and its origin. This aids the requirements gathering and design phases, yielding control over and reliability of the application development process. Furthermore, metadata regarding design decisions adopted for existing applications may be reused.
- Automated administration processes: Metadata should drive the execution of the diverse DW processes (such as ETL and batch reporting). Information about process execution (logs, DW data load status etc.) should also be stored in the repository for easy access by the administrator. These metadata-driven processes should automate BI administration, reducing manual intervention and hence the effort of maintaining the BI system.
- Sophisticated security mechanisms: ACLs and user profiles should be well managed in the metadata layer in order to provide a sophisticated security mechanism. The different granularities of information, with department-wise / geography-wise restrictions, need to be maintained appropriately against the user roles. Security breaches need to be detected through a robust audit trail process.
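The metadata-driven administration described in the list above can be sketched as a small runner that reads process definitions from metadata and records each execution in a log. This is an illustration under assumptions, not the implementation of any particular tool: `PROCESS_METADATA`, the action functions and `run_all` are hypothetical names, and the actions only return strings instead of doing real work.

```python
from datetime import datetime, timezone

# Hypothetical process metadata: the steps and their parameters live in data,
# not in code, so adding a step means adding an entry here.
PROCESS_METADATA = [
    {"name": "extract_orders", "action": "extract", "params": {"source": "orders"}},
    {"name": "load_orders", "action": "load", "params": {"target": "fact_sales"}},
    {"name": "daily_report", "action": "report", "params": {"report": "daily_revenue"}},
]


def extract(source):
    return f"extracted {source}"


def load(target):
    return f"loaded {target}"


def report(report):
    return f"rendered {report}"


ACTIONS = {"extract": extract, "load": load, "report": report}


def run_all(processes, actions):
    """Run each process described in metadata and return the execution log."""
    run_log = []
    for proc in processes:
        entry = {
            "name": proc["name"],
            "started_at": datetime.now(timezone.utc).isoformat(),
        }
        try:
            entry["result"] = actions[proc["action"]](**proc["params"])
            entry["status"] = "success"
        except Exception as exc:  # record the failure instead of stopping the batch
            entry["status"] = "failed"
            entry["error"] = repr(exc)
        run_log.append(entry)
    return run_log


if __name__ == "__main__":
    for entry in run_all(PROCESS_METADATA, ACTIONS):
        print(entry["name"], entry["status"])
```

The returned log plays the role of the execution information that the text says should be stored in the repository for the administrator.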
Flexible metadata architecture
- Extendibility and adaptability of metadata: Metadata should be extendable and adaptable to change. For example, semantic aspects that are likely to change frequently can be stored explicitly as metadata outside the application programs. This yields flexibility in extending the system and makes it easy to include new metadata objects. Generic metadata models can also enable the reuse of code fragments.
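Storing semantic aspects outside the application program can be illustrated with a business rule kept as data. In this minimal sketch, a customer segmentation rule is held as JSON metadata and the program only interprets it. The rule set, the segment labels, the thresholds and the `classify` function are all hypothetical. The JSON is shown inline for brevity, whereas in practice it would be read from a metadata repository.

```python
import json

# Hypothetical semantic metadata; labels and thresholds are illustrative only.
SEGMENT_RULES_JSON = """
[
  {"label": "platinum", "min_revenue": 100000},
  {"label": "gold", "min_revenue": 50000},
  {"label": "standard", "min_revenue": 0}
]
"""


def classify(revenue, rules):
    """Return the label of the highest threshold that `revenue` reaches."""
    for rule in sorted(rules, key=lambda r: r["min_revenue"], reverse=True):
        if revenue >= rule["min_revenue"]:
            return rule["label"]
    return None


if __name__ == "__main__":
    rules = json.loads(SEGMENT_RULES_JSON)
    print(classify(72000, rules))  # gold

    # A new segment is introduced by extending the metadata, not the code.
    rules.append({"label": "silver", "min_revenue": 10000})
    print(classify(20000, rules))  # silver
```

Because `classify` is generic over the rule list, the same code fragment can be reused for any threshold-based classification described in metadata.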
Note: BIDS Solutions encompass the proprietary solutions from TCS covering the Business Intelligence and Data Warehousing landscape.