Overall Usage Guide
Purpose of this work-tool
This work-tool provides the structure, guidelines and text examples for creating the DQ program initiation completion report. In the whole DQ program WBS, this is the most critical deliverable. It presents the overall results and analysis of the current state of data quality and DQ assurance for the systems and processes within the scope of the DQ program. It also presents the proposed solutions, recommendations and game plan, along with the resource and funding needs. This deliverable is the final input the stakeholders need to decide how many resources to deploy for the DQ program.
Who uses this work-tool?
Ideally, a data steward should be the owner of this document. The document can be filled in jointly by the business system analyst, IT specialist, business process specialist and other members of the DQ initiation phase core team.
When is this work-tool used?
This work-tool is used after all the analysis work in the initiation phase is complete, and one is ready to document the findings and recommendations for the stakeholders. It can also be filled in parallel, while the final touches to your analysis are being done.
Linked Work-Tools:
- Data Quality Program WBS: This tool provides the breakdown of activities and milestones of a DQ program. You can use its content to create your project plan for the DQ program.
- Data Quality Program Agreement: After the DQ initiation phase completion report is accepted, the formal DQ program proposal is submitted, including all the components of a typical program proposal (project plans, timelines, communication framework, risk management plan...)
-------------------------------
Help Guide
EXECUTIVE SUMMARY
- Background: Provide the background on what triggered the Data Quality Program Initiation Phase. The trigger could be a data warehouse initiative, a major data quality program, a data migration initiative, a business process re-engineering initiative, or a need to address an audit comment or a regulatory compliance issue.
- The Scope of the Data Quality Program Initiation Phase: Provide the high-level scope in terms of coverage (systems, processes, data entities, etc.) and activities (Data Mapping and Assessment, Data-Group Quality Assessment, Data Quality Mechanisms Assessment, Data Governance and Management Assessment).
- Key Gaps Identified: State the key gaps identified in terms of data quality, data model issues, data flow chain issues, lack of Data Quality Mechanisms and Controls, and Data Management and Governance Controls.
- Root Cause of High Priority Gaps: This mainly lists the root causes which have a broad-based impact.
- Proposed Data Quality Program Scope: This part takes its inputs from Data Quality Gap-Approach-Planning and Tracking. It provides the high-level scope of the data quality program and the funding required.
- Proposed Business Benefits and Funding: This takes its inputs from Data Quality Gap-Approach-Planning and Tracking.
- Proposed Program Resources and Timelines: This provides the high-level figures for the resource requirements and the key milestones.
NOTE- The DQ program is not a single initiative. It is a combination of many small to medium initiatives. The funding, scope and timelines for the DQ program will be an aggregate of all these initiatives. We will generally not include change controls as part of the initiatives, unless they are large.
- Assumptions: Key assumptions made behind the scoping, timelines and funding.
SCOPE OF Data Quality Program Initiation Phase
This part covers everything that was included in the Data Quality Program Initiation Phase. This is the same as in the Data Quality Program Initiation Proposal. However, there may be some changes in the way
- Functions: The functions and sub-functions which have a stake in the DMA (Data Mapping and Assessment), and the functions which will be deploying their resources to conduct the DMA.
- Business Processes: Business processes run across functions, and across the manual and automated worlds. You may not have an end-to-end business process under your DMA scope. Therefore, you need to mention which sub-processes within a business process will be involved.
- Data-Groups Targeted: The DMA, if triggered by a data quality issue, is generally targeted at resolving data quality issues with certain data groups, for example, a customer data quality issue, or reconciliation issues with Travel business accounting transactions. As you list the Data-Groups, do mention the specific Data-Groups which are targeted to be fixed via this DMA. As a side note, the DMA will not fix the data issues. It will primarily identify the current state of the data and contribute to the Root-Cause Analysis.
- Geographies/Locations Involved: Within a system and process, you may want to focus on select geographies. This can be because you want to proceed in phases, or because the issues or initiatives which triggered the DMA are planned for a limited set of geographies. In that case, focus on those geographies first.
- Systems and Databases Involved: This is fairly straightforward. Depending upon the trigger of the DMA, you can identify the systems and databases which you will include in the DMA. For example, if the DMA is being done for designing and scoping the ETL of a Data Mart, you will include all the ETL source systems as part of your DMA. However, within these source systems, you may want to focus on the customer and related databases, as the Data Mart is for customer value and profitability analysis.
DQ Program Initiation Phase Activities in Scope
This may include the following:
- Data Mapping and Assessment
- Data Governance and Management Assessment
- State of Implementation of Data Quality Assurance Mechanisms
- Data-Group based Data Quality Assessment
- Data Gaps Impact Assessment
- Data Quality Gaps prioritization and solution alternative assessment
- Data Quality
Methods Deployed
This is a free-form section, where the process of conducting the DMA is described. This includes some initial information gathering methods, like
- Interviews with managers, processors and technology staff
- Segregating the data as per logical groupings.
- Production issues review
- Customer complaints
After developing a broad understanding, these are the methods you would use for the detailed DMA
- Running ad-hoc queries on the database.
- Application of Data Profiling tools
- Reviewing the database design in the system
- Reviewing the programming logic
- Reviewing the front-end screens
- Reviewing systems documentation
- Reviewing business process documents
- Reviewing enterprise reports, etc.
The idea here is to state ‘what was the entity’ (System, Business Process, Data-Groups etc.) and ‘what method(s)’ you applied to arrive at ‘what information’. For example, you can say that you reviewed system documentation and applied data profiling tools to arrive at the value distribution of the customer data.
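As an illustration of the ‘entity, method, information’ reporting described above, the following is a minimal sketch (not part of the work-tool) of how a value distribution could be produced with a simple profiling script. The sample records, the customer_type field and the value_distribution helper are all hypothetical and chosen only for illustration; a real assessment would more likely use a dedicated data profiling tool or queries against the actual database.

```python
from collections import Counter

# Hypothetical sample of customer records; in practice these would be
# extracted from the system under assessment.
customers = [
    {"customer_id": 1, "customer_type": "Retail"},
    {"customer_id": 2, "customer_type": "Corporate"},
    {"customer_id": 3, "customer_type": "Retail"},
    {"customer_id": 4, "customer_type": None},       # missing value
    {"customer_id": 5, "customer_type": "retail"},   # inconsistent casing
]


def value_distribution(records, field):
    """Return {value: (count, percentage)} for one field, most frequent first."""
    total = len(records)
    if total == 0:
        return {}
    counts = Counter(record.get(field) for record in records)
    return {
        value: (count, round(100.0 * count / total, 1))
        for value, count in counts.most_common()
    }


distribution = value_distribution(customers, "customer_type")
for value, (count, pct) in distribution.items():
    print(f"{value!r}: {count} records ({pct}%)")
```

In the report, the corresponding statement would name the entity (customer data), the method (profiling) and the information obtained (the distribution, including the missing and inconsistently cased values).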
Observations and GAPS statement
The in-feed to this part will come from:
This is a free-form section with no specific format. It depends purely on the data and the type of assessment undertaken. One important addition is the proposed priority and criticality rating of the gaps.
This section will not contain the details of your findings, as they may run to scores of pages. The detailed observations are placed in an Appendix. This part provides the list of all key observations linked to the objectives under which you started the Data Quality Program Initiation.
This section will also not detail the areas where your findings are aligned with your expectations and/or standards.
One needs to define the gaps in terms of:
- The Data Quality Gaps at the level of Data-Groups and Data Entities, and sometimes at the level of table data in an IT system.
- Data Management and Governance Gaps
As a principle, one may want to define the top 20-30 observations and Gaps. The detailed observations, Gaps and Root-Cause Analysis can be kept for the appendix.
One may ask why we are not stating root causes along with the Gaps, and why there is a separate section for root-cause analysis. The reason is that not all the observations and Gaps are in this section; the details are in the appendix. Secondly, there is no one-to-one mapping between Gaps and Root-Causes.
Example of Gaps:
Customer Data Group Quality:
- 30% of the customer records do not have a complete address.
- 20% of the customer records do not have the correct PIN Code.
- 10% of the customer records are not in sync across the customer master of the core ERP system and the field servicing system.
NOTE- Most of the feed for this part will come from the Data Mapping Assessment Report and Data Mapping and Assessment Management.
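Gap statements like the customer data examples above are usually backed by simple counts over the assessed records. The following is a minimal sketch, under stated assumptions, of how such percentages could be derived. The two extracts, the field names and the six-digit PIN code pattern are hypothetical; in particular, the pattern only checks the format of a PIN code, whereas the ‘correct PIN Code’ gap above may also require checking the code against the address.

```python
import re

# Hypothetical extracts keyed by customer code: one from the core ERP
# customer master, one from the field servicing system.
erp_customers = {
    "C001": {"address": "12 Park Street, Kolkata", "pin_code": "700016"},
    "C002": {"address": "", "pin_code": "4000"},
    "C003": {"address": "5 MG Road, Bengaluru", "pin_code": "560001"},
    "C004": {"address": "8 Anna Salai, Chennai", "pin_code": "600002"},
}
field_service_customers = {
    "C001": {"address": "12 Park Street, Kolkata", "pin_code": "700016"},
    "C002": {"address": "", "pin_code": "4000"},
    "C003": {"address": "5 M.G. Road, Bangalore", "pin_code": "560001"},
    "C004": {"address": "8 Anna Salai, Chennai", "pin_code": "600002"},
}

# Assumption: a PIN code is six digits and does not start with zero.
PIN_PATTERN = re.compile(r"[1-9][0-9]{5}")


def pct(part, whole):
    """Percentage of part in whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1) if whole else 0.0


total = len(erp_customers)
missing_address = sum(
    1 for rec in erp_customers.values() if not rec["address"].strip()
)
invalid_pin = sum(
    1 for rec in erp_customers.values()
    if not PIN_PATTERN.fullmatch(rec["pin_code"])
)
# A record is out of sync if it is absent from, or different in, the
# field servicing system.
out_of_sync = sum(
    1 for code, rec in erp_customers.items()
    if field_service_customers.get(code) != rec
)

gaps = {
    "incomplete_address_pct": pct(missing_address, total),
    "invalid_pin_code_pct": pct(invalid_pin, total),
    "out_of_sync_pct": pct(out_of_sync, total),
}
for name, value in gaps.items():
    print(f"{name}: {value}%")
```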
Data Governance and Management
- No universal domain definition for locations.
- No universal data standard defined for customer code.
NOTE- Many of the gaps listed here are the root causes behind the bad data in the system. For example, bad customer data could be the result of inadequate input controls.
Key Root Causes behind high priority Gaps
This section lists the root causes that are either fundamental in nature or are contributing to select high-priority gaps. For the detailed root causes, one needs to look into the DQ Gaps RCA, Approach, Planning and Tracking Tool. For example:
- A wrongly designed account opening form is a fundamental root cause.
- A system downtime 2 months back, resulting in 100 bad records, is a transaction-specific root cause, which may not be included in this section. However, if there have been many such incidents, it may move into the ‘fundamental’ category.
- A fault in data migration resulted in the corruption of 40000 invoice history records.
The Appendix will carry the details of all the root causes you have identified.
Note- A gap can be the root cause of another gap. One should not be too rigid about what is a gap vs. what is a root cause, and should apply one’s own judgment. Generally, a gap has more to do with the final manifestation of a data quality issue.
The reason for listing only high-level, fundamental root causes is that business owners are the main audience of this section. To avoid information overload, we need to be selective and focus on the areas which most need their support.
The root causes need not be linked to any specific Gap or Observation (refer to the previous section). You may want to mention the key gaps a root cause is linked to. However, the key objective is to give the list of the top 20-30 issues which are contributing to the bad data quality.
For example:
- A badly designed account opening form and a lack of input controls are leading to bad data in more than one-third of all customer master information.
- The lack of universal data models around key entities is leading to every system applying its own cardinality rules, which results in data that is inconsistent with our business rules and processes.
- Due to the lack of a universal interface between the feeding systems and the general ledger system, every feeding system follows its own mapping between business accounts and GL accounts, which is leading to huge reconciliation issues.
Recommended Initiatives
One needs to refer to the DQ Gaps RCA, Approach, Planning and Tracking sheet to understand the complete flow of how DQ gaps are taken through RCA, prioritization, alternative solutions review and assignment to specific initiatives.
The outcome of this flow is the set of initiatives which you propose to be part of this data quality program.
NOTE- A data quality program is unique in that the initiatives under it may not be executed separately. Many of the initiatives could end up being part of larger initiatives (which is a smart way to fund the DQ efforts).
TIP- You should ideally not have a ‘large’ data quality initiative, unless things are really in a mess. ExecutionMiH.com recommends achieving DQ through small, less noisy and less glitzy projects.
In this section, you need to pick up the top 10-15 initiatives and highlight their road map, benefits, funding required and effort. Simply pick up the relevant fields from the 'Initiative Tracking Sheet' within DQ Gaps RCA, Approach, Planning and Tracking.
Timelines, Deliverables and Funding
The next step after recommending the initiatives is to package the data quality initiatives into multiple data quality programs. In these programs, some initiatives will be part of larger initiatives and some will be executed stand-alone.
The earliest set of initiatives will form the ‘Phase 1’ Data Quality Program. For each data quality program, one needs to highlight the program, the timelines, the initiatives and the funding required.
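Since each program's funding and timeline is an aggregate of its initiatives, the packaging step described above can be sketched as a simple grouping. The following is a minimal illustration only: the initiative names, funding figures, dates and field names are hypothetical and do not come from the Initiative Tracking Sheet.

```python
from collections import defaultdict
from datetime import date

# Hypothetical initiatives; all names, amounts and dates are illustrative.
initiatives = [
    {"name": "Redesign account opening form", "program": "Phase 1",
     "funding": 40000, "start": date(2025, 1, 6), "end": date(2025, 3, 28),
     "delivery": "stand-alone"},
    {"name": "Add input controls to customer master", "program": "Phase 1",
     "funding": 25000, "start": date(2025, 2, 3), "end": date(2025, 5, 30),
     "delivery": "part of larger initiative"},
    {"name": "Universal GL account mapping interface", "program": "Phase 2",
     "funding": 90000, "start": date(2025, 6, 2), "end": date(2025, 11, 28),
     "delivery": "stand-alone"},
]


def summarize_programs(items):
    """Group initiatives by program; total the funding and span the dates."""
    programs = defaultdict(
        lambda: {"initiatives": [], "funding": 0, "start": None, "end": None}
    )
    for item in items:
        program = programs[item["program"]]
        program["initiatives"].append(item["name"])
        program["funding"] += item["funding"]
        # The program starts with its earliest initiative and ends with
        # its latest one.
        if program["start"] is None or item["start"] < program["start"]:
            program["start"] = item["start"]
        if program["end"] is None or item["end"] > program["end"]:
            program["end"] = item["end"]
    return dict(programs)


summary = summarize_programs(initiatives)
for name, program in summary.items():
    print(f"{name}: {len(program['initiatives'])} initiatives, "
          f"funding {program['funding']}, "
          f"{program['start']} to {program['end']}")
```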
Next Steps
This section provides the list of next steps and help items. The following are examples of next steps:
- Formation of Data Quality Council.
- Implementing the Data Quality Organization – Data Steward, Data Custodians and Data Quality Council. (Refer Data Management Stake-holding and Responsibility matrix)
- Launching the ‘Phase 1’ Data Quality program.
APPENDIX
The DQ program initiation completion report can have a long list of appendices, depending on its scope:
List of Data Groups analyzed
List of the people Interviewed
List of the systems analyzed
List of Data Entities analyzed
List of Business processes, Sub-processes and process points analyzed.
Observations, Gaps and Root Cause- (System Wise)
Single Column
Multiple Column Analysis
Data Model Analysis
DMA Management
System Stability and DQ Assessment
DQ Gaps Approach, Planning and Tracking sheet.