USAGE GUIDE
Purpose and Objectives
This work-tool enables you to capture the system DQ assessment results, gap analysis, gap prioritization, solution listing, and the assignment of each Gap to the centralized DQ Gap Management tool. Use it to manage a given scope of systems that you want to assess for data quality.
When to use this Tool?
You carry out the Data Quality Assessment on the following counts:
- The current state of data quality.
- The current state of data quality assurance.
The System DQ Assessment Management tool enables you to capture the results and analysis of your assessment on both sets of parameters. The intensity of the assessment may vary with your level of urgency and the funding available. This tool is used throughout your assessment exercise.
Who uses this tool?
The System DQ Assessment Management tool is used by the business analysts and IT specialists conducting the assessment to fill in the results and analysis. It is used by the business owner and the functional Data Steward to review the results and analysis. It is used by the functional or enterprise Data Steward to prepare the DQ program initiation phase completion report.
Linked Work-tools
- Data Quality Program Initiation Phase Completion Report: The output of this tool is fed into the DQ program initiation phase completion report.
- DMA Report: The results of this report are fed into the DMA assessment Management tool. Therefore, DMA feeds into System DQ Assessment, which feeds into the DQ program initiation phase completion report.
------------------------
Help Guide
CONTEXT SETTING- The System DQ Assessment Management Tool has two parts:
Part I- Current State of Data Results and Analysis
This part takes its inputs from a DMA Management tool or from a DMA report. However, it does not carry the results to the level of depth that you do in DMA. The idea is to record the results in a way that they can be assigned to Gaps.
Part II- State of DQ Assurance methods
System DQ assessment is not complete until you understand how robust the DQ controls are. This understanding will provide inputs to the root cause of the current state of data (Part I of this work-tool). The list of DQ controls and the style of capturing and analyzing results will ensure that you are able to do an insightful root-cause analysis.
For example, if the customer data-group has issues with the customer-location data (as captured in Part I), you may be able to link them to your observations on the lack of input form controls on the customer data capture screen, or the lack of a discrete value set in the location domain (as captured in Part II). For each issue concerning missing DQ assurance controls, you can also state why the DQ controls are missing (for example, a lack of standard input form guidelines). You are therefore going to the second level (the 'why' of the 'why').
Help-Guide- Data Group Quality
This sheet captures the current state of quality for given data groups.
- Data-Group+ Data Entity List:
You may need to refer to the Data Group Master to understand the concept of a Data-Group. In brief, a data-group is logically grouped data that represents a business entity. A Data-Group does not represent a model, but the aggregation of all the data belonging to that business entity, irrespective of the system and its location.
Assessing the current state in the context of the data-group is the best way to manage your results, and the business can also relate to it. A data-group is essentially a combination of Data-Entities, for example Customer Address, Customer Contact Details, Customer Professional Details, Customer Demographic Details, etc. You should ideally capture the results for each entity within that Data-Group.
- Brief Description: Provide the brief description for each of the data-group and data-entities.
- Physical Location: Provide the names of the systems and tables that hold the data related to this data-group and data-entity.
- Code & Reference: Provide the reference where the details for the given data-group reside. For example, you can give the Data-Group code as it exists in the Data-Group Master, and you can provide the Data-Model documentation to which a data-entity belongs.
- Data Quality Rating: Provide the overall 'judgmental' rating of the state of data quality. You will provide the rating at the Data-Entity level as well as at the Data-Group level.
- Comments: This field will be used to share:
- Level of confidence we have in our assessment.
- Any specific comments on exceptions to the results. For example, Customer Professional Details are garbled for 100 records because of a one-time production issue.
- Standard: This defines the business expectations for the level of quality. Refer to 'Data quality is relative' to understand the subject matter. If you have already done the DMA, you will have those details.
- Gap Definition: State the Gap by taking into account the current state and the business expectations. Define the gap for each entity.
- Data-Group Gap definition: This will be aggregating the entity level Gap into the Data-Group Level Gap.
- Gap Rating: This will be capturing the Gap Rating for the data quality Gap for each entity.
- Root causes: This column provides the root cause, or the combination of root causes, contributing to the Gap. At this stage (unlike in DMA), one will do an in-depth analysis of the root cause.
- Possible Solution Set: At this stage, one identifies possible solutions only. This is because the final solution set will be identified when you look at all the gaps, prioritize them, and group them into initiatives and change controls (refer to the DQ Gap Management Tool). Data Quality Gaps and Solutions have a many-to-many relationship. This means that a gap can be addressed with a combination of many solutions, and a solution can address many DQ gaps.
- Gap Assignment: This provides the DQ gap code to which this Gap belongs. The DQ gap code is the identifier of a DQ gap in the Gap Management tool.
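The many-to-many relationship between gaps and solutions can be pictured with a small Python sketch. The gap codes and solution names below are hypothetical and purely illustrative; the actual codes come from the DQ Gap Management tool.

```python
from collections import defaultdict

# Hypothetical (gap code, possible solution) pairs; illustrative values only.
links = [
    ("GAP-001", "Add input form validation"),
    ("GAP-001", "Introduce location reference data"),
    ("GAP-002", "Introduce location reference data"),
]

def index_links(pairs):
    """Build lookups in both directions for a many-to-many relationship."""
    solutions_by_gap = defaultdict(set)
    gaps_by_solution = defaultdict(set)
    for gap, solution in pairs:
        solutions_by_gap[gap].add(solution)
        gaps_by_solution[solution].add(gap)
    return solutions_by_gap, gaps_by_solution

solutions_by_gap, gaps_by_solution = index_links(links)
```

Looking up a gap code lists every candidate solution for it, and looking up a solution lists every gap it would help close, which is useful when gaps are later grouped into initiatives.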
-------------------
Help Guide- STATE OF DQ ASSURANCE METHODS
This is Part II of your work-tool. It has multiple sheets enabling you to capture each and every aspect of DQ assurance. All the sheets have the following common fields:
- Object+Controls list: The list of objects, and the controls within each object, which need to be assessed. There is one sheet for each of the following objects:
- Data Exchange Interfaces
- Input forms
- Key Data Elements (for domain value and data standards)
- Data Entities
- Business Rules
- Batch-Jobs
- Business Partner Interfaces
- Description: A brief description of the object
- Code & Reference: The code and reference to the object within the systems documentation or within the object level inventory.
- Change & Issue history: A high level history of the changes done to the object and past data quality issues.
- DQ Comments: The comments on the robustness of the DQ controls for that object. It will also cover 'why' aspect of the state of DQ.
- Rating: One can provide the 'judgmental' rating on each control+object combination, or an overall rating for the object.
- Impact Statement: Any impact due to low rating of the object. This part will feed into the root-cause for current state of data.
--------------------------------------------------
Examples for each of the sheets within DQ Assurance Assessment
Interface Quality:
- Interface 1: Customer Master interface between field servicing and Order fulfillment System.
- Description: This interface is used to synchronize the customer master data across the Field Servicing system (used to support field maintenance and post-sales support) and the order fulfillment system. This is a one-way interface to create new customers, whereby the order fulfillment system is the 'system of record' for the customer master.
- Code & Reference: Interface code= INTCS0023, Document Ref- DDFS0324, Page#212
- Change & Issue history: This interface underwent a change (CC#- FDS2345) in its structure as order fulfillment changed its customer master structure. There have been 2 issues with this interface in the past. It has missed an upload on some occasions (production issue- POD 2345TG).
- DQ Comments: The interface has two controls missing: Duplicate File Check and Duplicate Records Check.
- Rating on individual controls: Apart from Duplicate File Check and Duplicate records check, the other controls will be rated high.
- Impact Statement: Though the duplicate file check and the duplicate record check are missing, it is not a high impact because:
- The field servicing system is not changing any data of the uploaded customer file.
- There is a DB-to-DB synch check done after the file upload.
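The two missing controls named above can be sketched in Python as follows. This is a minimal illustration, not the interface's actual logic: it assumes a duplicate file is one whose content is byte-for-byte identical to an earlier upload, and that records carry a key field (here a hypothetical customer_id).

```python
import hashlib

def is_duplicate_file(content: bytes, seen_fingerprints: set) -> bool:
    """Duplicate file check: compare a SHA-256 fingerprint with earlier uploads."""
    fingerprint = hashlib.sha256(content).hexdigest()
    if fingerprint in seen_fingerprints:
        return True
    seen_fingerprints.add(fingerprint)
    return False

def find_duplicate_records(records, key_field):
    """Duplicate record check: return keys that occur more than once in a file."""
    seen = set()
    duplicates = []
    for record in records:
        key = record[key_field]
        if key in seen:
            duplicates.append(key)
        else:
            seen.add(key)
    return duplicates

# Hypothetical sample data.
seen_files = set()
first_upload = is_duplicate_file(b"customer file v1", seen_files)
second_upload = is_duplicate_file(b"customer file v1", seen_files)
dupes = find_duplicate_records(
    [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 1}], "customer_id"
)
```

In practice the set of fingerprints would be persisted between runs; it is kept in memory here only to keep the sketch short.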
-----------------------------------------------------------------
Example of Domain Value and Data Standard
You will typically not include thousands of data elements here. You may want to include the top 200-300 data elements per system, those which will make the biggest impact on your data quality.
- Data Element 1: Invoice shipping rate
- Table Name and Field Name: Shipping_rate in the shipping_rate_table in the order fulfillment system.
- Description: This data element is the shipping rate applied based upon the location, product, weight and any given campaign.
- Change & Issue history: NIL
- DQ Comments: This data element has the following controls missing or not applicable:
- This data element can be optional: This means that a record may have a Null value. As per the domain standard, this value should be mandatory, with a default of 0%.
- Impact Statement: Because the shipping rate is optional, there is a risk of it not being entered, which results in a NIL shipping cost for the applicable invoices, leading to a financial loss.
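A mandatory-value control of this kind can be sketched in Python as below. The row layout and the invoice key are hypothetical; only the shipping_rate field name comes from the example above, and the default of zero follows the domain standard stated there.

```python
from decimal import Decimal

DEFAULT_SHIPPING_RATE = Decimal("0")  # default per the domain standard above

def find_missing_shipping_rates(rows):
    """Return the positions of rows whose shipping_rate is absent or NULL."""
    return [i for i, row in enumerate(rows) if row.get("shipping_rate") is None]

def apply_default_rate(rows):
    """Return copies of the rows with missing rates replaced by the default."""
    fixed = []
    for row in rows:
        rate = row.get("shipping_rate")
        fixed.append(
            {**row, "shipping_rate": DEFAULT_SHIPPING_RATE if rate is None else rate}
        )
    return fixed

# Hypothetical sample rows.
rows = [
    {"invoice": "INV-1", "shipping_rate": Decimal("2.5")},
    {"invoice": "INV-2", "shipping_rate": None},
    {"invoice": "INV-3"},
]
missing = find_missing_shipping_rates(rows)
```

Reporting the missing positions first, before applying any default, keeps the assessment finding (how many records are affected) separate from the remediation.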
----------------------------------------
Example of Input Form
- Input form 1: Shipment Details Entry Screen
- Description: This input form is used to enter the shipping details of the shipment, before the goods are shipped.
- Change & Issue history: NIL
- DQ Comments: The field validation for the shipping location is incomplete. There is no domain value check for PIN code vs. city, although the format check exists. Even if the PIN code is valid, it may not be a valid PIN code for that city.
- Impact Statement: We have a risk of a wrong PIN code being entered, which could mean shipping issues and delayed delivery.
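The difference between the existing format check and the missing domain value check can be sketched in Python. The six-digit PIN format and the small PIN-to-city lookup are assumptions made for illustration; a real form would validate against the organization's own location reference data.

```python
import re

# Assumed format: six digits, not starting with zero.
PIN_FORMAT = re.compile(r"[1-9][0-9]{5}")

# Illustrative reference data; a real lookup would come from a location master.
PIN_TO_CITY = {"110001": "New Delhi", "400001": "Mumbai"}

def validate_pin(pin: str, city: str) -> str:
    """Run the format check first, then the PIN code vs. city domain check."""
    if not PIN_FORMAT.fullmatch(pin):
        return "invalid format"
    expected_city = PIN_TO_CITY.get(pin)
    if expected_city is None:
        return "unknown PIN code"
    if expected_city.casefold() != city.strip().casefold():
        return "PIN code does not match city"
    return "ok"
```

A PIN code such as 110001 entered with the city Mumbai passes the format check but fails the domain check, which is exactly the case described in the DQ Comments.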
---------------------------------------------------------------------------------
Example of Data-Model Controls
- Data Entity 1: Customer Professional Details
- Description: This is a sub-entity of the customer entity. It contains the details of the professional experience of the customer. It has a many-to-one relationship with the customer master.
- Change & Issue history: The entity was expanded to include the Industry experience of the customer.
- DQ Comments: The insert rules do not check the industry against the Industry master or the function against the Function master.
- Rating: 6
- Impact Statement: The lack of these checks is leading to a lot of inconsistent industry and function values.
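An insert rule that checks values against the masters could look like the following Python sketch. The master contents and the record field names are hypothetical; in a database this control would more likely be a foreign key or lookup constraint.

```python
# Hypothetical master lists, for illustration only.
INDUSTRY_MASTER = {"Banking", "Retail", "Telecom"}
FUNCTION_MASTER = {"Finance", "Sales", "Operations"}

def validate_professional_details(record):
    """Return the list of master-data violations for one record to be inserted."""
    errors = []
    if record.get("industry") not in INDUSTRY_MASTER:
        errors.append(f"industry not in Industry master: {record.get('industry')!r}")
    if record.get("function") not in FUNCTION_MASTER:
        errors.append(f"function not in Function master: {record.get('function')!r}")
    return errors

# A free-text spelling variant is rejected instead of creating a new value.
problems = validate_professional_details({"industry": "banking", "function": "Sales"})
```

Rejecting the lowercase variant is deliberate in this sketch: it is this kind of free-text variation that produces the inconsistent industry and function values noted in the Impact Statement.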
--------------------------------------------------------------------------------
Example of Business Rules for Data Entities
- Data Entity 1: Invoice-header
- Description: This Data Entity contains the header information for an invoice.
- Change & Issue history: The invoice header was expanded to include the PO date, over and above the PO number. This was done to help the customer retrieve the relevant purchase order.
- DQ Comments: There is no consistency rule for shipping date vs. PO date vs. invoice date. For example, the shipping date can be before the purchase date, and the invoice date can be before the shipping date. There was a production issue related to the invoice header, whereby the wrong invoice number was being generated. The problem was fixed by building auto-increment logic.
- Impact Statement: A wrong set of dates can spoil our reputation, impact our order fulfillment and customer service processes, and expose us legally.
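The missing consistency rule can be sketched in Python. It assumes the expected order is PO date, then shipping date, then invoice date, with equal dates allowed; that ordering is inferred from the two violations named in the DQ Comments and should be confirmed with the business.

```python
from datetime import date

def check_invoice_dates(po_date: date, shipping_date: date, invoice_date: date):
    """Return the violations of the assumed order: PO <= shipping <= invoice."""
    violations = []
    if shipping_date < po_date:
        violations.append("shipping date is before PO date")
    if invoice_date < shipping_date:
        violations.append("invoice date is before shipping date")
    return violations

# Hypothetical invoice header with both problems present.
found = check_invoice_dates(
    po_date=date(2024, 3, 10),
    shipping_date=date(2024, 3, 8),
    invoice_date=date(2024, 3, 5),
)
```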
------------------------------------------------------------------
Example of Batch-Process Controls
- Batch-Process 1: Bank Statement File Generation Process.
- Description: This batch process generates the account statement file for the bank statement printing system.
- Change & Issue history: This batch process had issues with wrong text when converting numbers into 'the balance in words'. It was a program logic issue. We also had issues where some transactions were missed out in the bank statement, whereas the balance was correct.
- DQ Comments: The batch does not have post-completion checks to confirm that it has covered all the customers and all the transactions. It does not reconcile the totals in the bank transaction table against the statement files generated by this system. The batch should be doing a 3-way check between the customer account master, the customer account transactions and the account statement file.
- Impact Statement: While there are extensive checks on the correctness of the transaction and master tables for the bank accounts, the lack of control on the batch generating the bank statement files can significantly impact our reputation.
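The 3-way post-completion check could be sketched in Python as below. The data shapes are assumptions for illustration: the account master is a list of account ids, transactions are (transaction id, account id, amount) tuples, and the statement file is represented as a mapping from account id to its (transaction id, amount) lines.

```python
from decimal import Decimal

def reconcile(account_master, transactions, statements):
    """Compare account master, transaction table and statement file; list issues."""
    issues = []
    # 1. Every account in the master should have a statement.
    for account in sorted(set(account_master) - set(statements)):
        issues.append(f"no statement for account {account}")
    # 2. Every transaction should appear on some statement.
    stated_ids = {txn_id for lines in statements.values() for txn_id, _ in lines}
    for txn_id, account, _ in transactions:
        if txn_id not in stated_ids:
            issues.append(f"transaction {txn_id} missing for account {account}")
    # 3. The figures should add up to the same total on both sides.
    table_total = sum((amount for _, _, amount in transactions), Decimal("0"))
    file_total = sum(
        (amount for lines in statements.values() for _, amount in lines), Decimal("0")
    )
    if table_total != file_total:
        issues.append(f"totals differ: table {table_total}, statements {file_total}")
    return issues

# Hypothetical sample data with one account and two transactions left out.
issues = reconcile(
    ["A1", "A2"],
    [("T1", "A1", Decimal("100.00")), ("T2", "A1", Decimal("-40.00")),
     ("T3", "A2", Decimal("10.00"))],
    {"A1": [("T1", Decimal("100.00"))]},
)
```

An empty result means the three sources agree; any entry in the list is a reason to hold the statement file before it reaches the printing system.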