
Data Mapping and Assessment

Let's take a more detailed look at what Data Mapping and Assessment (DMA) includes. The key step in assuring good data, or in fixing bad data, is to understand the current state of the data.

Data Mapping and Assessment is the foundation of, and the precursor to, most data quality initiatives. It is equivalent to the 'current state assessment' part of the 'analyze' phase of any quality initiative. Data Mapping and Assessment supports not only data quality but also a host of other data management initiatives, such as data integration, master data management, metadata management and data warehousing. DMA answers the following questions:

Data Mapping and Assessment for data structure/Data model

Not all systems have a documented data model and database design. Even where the documentation exists, it may not have been kept up to date. The first and foremost target is therefore to establish the actual data model that the data follows. This actual model is then compared with the documented or expected model, and the comparison throws up the gaps. Some of the gaps that come out (both valid and invalid) are:

  • New tables depicting a new entity. (Valid!! The company started providing services over and above products, and added a services master and a service transaction table.)
  • New temporary tables. (Valid!! They were created to produce a report or to provide quick-fix functionality.)
  • Existing tables having new fields. (Valid!! A mobile number field was added to the customer table.)
  • New normalized tables. (Valid!! We started taking multiple cheques against a policy and therefore made a separate table carrying the cheque details.)
  • Fields allowing null values. (Invalid!! Against the business rules.)
  • Primary key is customer ID and date of birth. (Invalid!! We need to have a unique customer ID.)
  • Optional foreign key relationships. (Valid!! A patient might be in the ‘registered patient’ table but not have a record in the ‘treatment’ table, as the company has launched a ‘register now for saving your health bills later’ campaign.)

Essentially, the data model changes over time, and this exercise helps build the real map. Only after fully analyzing it, and understanding what is wanted and what is not, can one actually set up the quality benchmark.
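The comparison of the actual model with the documented one can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses an in-memory SQLite database, and the table names, the `documented` dictionary and the helper functions `actual_model` and `model_gaps` are hypothetical. Other databases expose their catalog differently.

```python
import sqlite3

def actual_model(conn):
    """Read the real table/column layout from a SQLite database."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        cols = conn.execute(f"PRAGMA table_info('{table}')").fetchall()
        model[table] = {
            c[1]: {"nullable": c[3] == 0, "pk": c[5] > 0} for c in cols
        }
    return model

def model_gaps(documented, actual):
    """List the differences between the documented and the actual model."""
    gaps = []
    for table, cols in actual.items():
        if table not in documented:
            gaps.append(("new table", table))
            continue
        for col, props in cols.items():
            if col not in documented[table]:
                gaps.append(("new field", f"{table}.{col}"))
            elif props["nullable"] and not documented[table][col]["nullable"]:
                gaps.append(("unexpected nullable", f"{table}.{col}"))
    return gaps

# Hypothetical documented model: one table, no nullable fields.
documented = {
    "customer": {
        "customer_id": {"nullable": False, "pk": True},
        "name": {"nullable": False, "pk": False},
    }
}

# Hypothetical actual database that has drifted from the documentation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY NOT NULL, name TEXT, mobile TEXT)"
)
conn.execute(
    "CREATE TABLE service_master (service_id INTEGER PRIMARY KEY, title TEXT)"
)

gaps = model_gaps(documented, actual_model(conn))
for kind, where in gaps:
    print(kind, where)
```

The script only lists the gaps. Whether each gap is valid or invalid remains a business judgment, as in the examples above.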

What is the information flow chain (a.k.a. Data Flow Diagram)?

This is a comprehensive (and often missed) exercise in a data quality program. Most data quality programs stop at ‘data quality assessment’ or, at most, ‘data profiling’. True data mapping also assesses the information flow chain.

As the name suggests, the information flow chain tells us how data flows through an enterprise. It tells us:

  • How data enters the enterprise.
  • How data gets processed.
  • What different shapes and forms it takes.
  • How and where it is stored.
  • How it is sent out of the enterprise.
  • Which people, roles and functions handle the flow.

Just like all other business components of business metadata, a Data Flow Diagram is not restricted to automated systems; it can represent both automated and non-automated domains. This is key to understanding the flow of data across an enterprise, as a lot of data gets processed outside the automated systems and is stored in hard copies, Excel sheets and MS Access databases. As you may realize, many data quality problems reside in data sources other than our dear RDBMSs.
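One simple way to record such a chain, so that the non-automated parts stay visible, is a list of steps with a flag for automation. This is a minimal sketch, not a prescribed format: the `Step` class, the `manual_steps` helper and every step name in `flow` are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    name: str        # where the data is at this point in the chain
    handled_by: str  # person, role or function handling it
    automated: bool  # False for paper, spreadsheets, desktop databases

# Hypothetical flow for a customer application; all names are illustrative.
flow = [
    Step("Paper application form", "Branch sales agent", automated=False),
    Step("Excel tracking sheet", "Branch operations", automated=False),
    Step("Policy administration system", "Data entry team", automated=True),
    Step("Data warehouse", "BI team", automated=True),
]

def manual_steps(chain):
    """Return the steps that live outside the automated systems."""
    return [s for s in chain if not s.automated]

for step in manual_steps(flow):
    print(f"{step.name} (handled by {step.handled_by})")
```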

Mapping and Assessing Data for data value distributions

This is a typical output of a data profiling exercise (which is a subset of data mapping). The purpose of this exercise is to give a view of the boundaries of the data and of the variations within the data model. For example:

  • 95% of the values in the credit limit field are below USD 10,000. (Data quality issue!! Our gold card portfolio is over 25%.)
  • 50% of the records have a null mobile number field. (The sales channel does not seem to be doing a good enough job of getting these details.)
  • 35% of the mobile numbers are shorter than 10 digits. (The mobile number data is not accurate.)
  • 70% of the middle names are null. (That’s fine.)
  • 60% of the clients are above 50 years of age. (Data quality issue!! Our whole sales apparatus and products are geared towards young or middle-aged customers.)
  • 99% of the foreign keys match between the ‘registered patient’ table and the ‘treatment’ table. (Good!! It means that most of the registered customers are using our services.)

There are endless examples like the ones above. Essentially, this exercise yields a great deal of information that is of use to both the business and the IT domain.
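Figures like the ones above can be produced with very little code. The sketch below is an illustration in plain Python; the helper names `profile_column` and `pct_matching` and the sample values are made up, and a real exercise would read the column from the database or use a profiling tool.

```python
from collections import Counter

def pct(part, whole):
    """Percentage of part in whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1) if whole else 0.0

def profile_column(values):
    """Summarize nulls, distinct values and the string-length distribution."""
    total = len(values)
    present = [v for v in values if v is not None and str(v).strip() != ""]
    lengths = Counter(len(str(v)) for v in present)
    return {
        "rows": total,
        "null_pct": pct(total - len(present), total),
        "distinct": len(set(present)),
        "length_counts": dict(lengths),
    }

def pct_matching(values, predicate):
    """Share of non-null values for which predicate(value) is true."""
    present = [v for v in values if v is not None]
    return pct(sum(1 for v in present if predicate(v)), len(present))

# Made-up sample data, only to show the shape of the output.
mobiles = ["9876543210", None, "98765", "9123456780", None, "12345678"]
credit_limits = [5000, 8000, 12000, 9500, 3000]

report = profile_column(mobiles)
short_pct = pct_matching(mobiles, lambda m: len(m) < 10)
below_10k = pct_matching(credit_limits, lambda c: c < 10000)

print("null mobile numbers (%):", report["null_pct"])
print("mobile numbers shorter than 10 digits (%):", short_pct)
print("credit limits below 10000 (%):", below_10k)
```

The numbers alone do not say whether something is a data quality issue. As in the examples above, that depends on what the business expects to see.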

Data Mapping and Assessment for inaccuracies and inconsistencies

Apart from finding data quality issues through data model and value distribution checks, an overall check of data quality is a must. For example:

  • In 80% of the records, the ‘city’ field does not match the standard set of cities. (A problem with data collection and input form checks.)
  • 15% of inactive agents have ‘active’ proposals under process. (Against the business rule; the system needs fixing.)
  • 25% of the addresses are not correct (judged by returned mail, wrong states and cities, etc.).
  • The equation 'Opening balance of WIP Packets + Delivered + New Packets + New suspense items = Closing Balance' does not balance.
  • The date format is not correct.
  • The ‘Status’ field is null in 50% of the records, and in 25% of cases it holds a numeric or date notation.
  • 0.56% of the records in the billing table have a zero billing amount.

What has been the rate of data build-up or update?

This provides information on how fast, and how much, data is being added or updated. This information drives both the level of importance and the urgency of implementing quality. For example:

  • The customer data table gets, on average, 20 new records (for a job manufacturing company), 200 new records (for a banking company) or 10,000 new records (for a global mobile company) every day.
  • A bank has 100,000 transactions added to its transaction table; 30% of them come from the web, 20% through IVR (interactive voice response) and 50% over the counter.
  • The customer table gets updated 1,000 times a day on ‘Product scheme code’ (mobile telephony customers keep changing their schemes).
  • The insurance policy issuance table does not change beyond one month after a policy has been issued.
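If the records carry a creation date and a source channel, these rates can be derived directly from the data. The sketch below assumes hypothetical field names `created` and `channel` and made-up rows. Note that it averages only over days that have at least one record.

```python
from collections import Counter
from datetime import date

def daily_counts(rows, date_field):
    """Number of rows per calendar day, keyed by the given date field."""
    return Counter(r[date_field] for r in rows)

def average_per_day(counts):
    """Average rows per day, over the days that have at least one row."""
    return sum(counts.values()) / len(counts) if counts else 0.0

def channel_share(rows):
    """Percentage of rows per source channel."""
    counts = Counter(r["channel"] for r in rows)
    total = sum(counts.values())
    return {ch: 100.0 * n / total for ch, n in counts.items()}

# Made-up transaction rows.
transactions = [
    {"created": date(2024, 1, 1), "channel": "web"},
    {"created": date(2024, 1, 1), "channel": "counter"},
    {"created": date(2024, 1, 1), "channel": "ivr"},
    {"created": date(2024, 1, 2), "channel": "counter"},
]

per_day = daily_counts(transactions, "created")
avg = average_per_day(per_day)
shares = channel_share(transactions)

print("average new transactions per day:", avg)
print("share by channel (%):", shares)
```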

There is no hard-and-fast rule on how to undertake a DMA exercise.

Typically:

  • It is not done for an entire enterprise.
  • It is done where the problem has manifested itself as a crisis.
  • It is done when we want to start a conversion project.
  • It is done proactively for an area where the need for data accuracy is very high.

Even within a select domain of an enterprise, you may want to optimize the level of assessment you do. No two assessments have the same scope or apply the same methods. One can combine many methods to build a customized solution. The data quality analysis can be done relatively quickly and intuitively. It is a combination of heuristics, problem assessments, domain understanding and people who are experts in database and Excel querying. With a good manual process, the analysis can be versioned and reused, making it possible to see how effective the corrective actions have been.

TIP: DMA needs funding. Unless it is driven by a data quality crisis or a major initiative, people find it a challenge to build its business case. Arguments for proactive DMA get lost among the many business initiatives fighting for funding. We would recommend separate funding for proactive data management initiatives, so that they do not have to compete with those initiatives. The business case can be that the business initiatives will increase their productivity by building on the work done by the DMA.

 
