Building Making It Happen
Establishing Making-it-Happen as ‘Formal & Measurable’ Business Discipline
  Sign-in         Register
    
   Data Mapping and Assessment Data Model Entity Relationship Analysis  

Execution-MiH Encyclopedia  →   Enterprise Intelligence  →  SECTION -  Data Quality  →  CHAPTER -  Data Mapping & Assessment  → 

Data Mapping and Column Analysis

Column analysis is typically the first step in DMA. Within column analysis you first do individual column analysis followed by multi-column analysis.

Data Mapping Individual Values - Single Column analysis

The most basic, yet powerful, data quality analysis consists of querying each and every individual column in a database. Each column is examined in order to find and sort all the individual values, together with their frequency.

Single column analysis, primarily detects data format errors, valid set errors, valid range errors, null/non null errors and so on and above all primary key validation. (refer domain value controls)

For example- PIN codes (many a times used for location based analysis and also used for mailing lists). ZIP codes are something that usually has a specific format for a given country, for example- five digits. In that case, they should not contain four digits and certainly not letters. A comprehensive single column analysis can validate the ZIP+4+2 format

ZIP

+4

+2

%age values

Valid

Valid

Valid

5%

Valid

Valid

Not Null, Invalid

10%

Valid

Valid

Null

5%

Valid

Not Null, Invalid

NA

10%

Valid

Null

NA

25%

Not Null, Invalid

NA

NA

40%

Null

NA

NA

5%

Table shows an example of PIN codes, that have been sorted, and their frequency. Non-existing, suspicious OR the outright faulty values can be found.

The verification of address fields in a database is an example- of a data format analysis. Customer code verification is the format as well as the coding logic validation. Transaction value field verification can be valid range validation (one cannot have home water purifier selling for USD 1 million) etc. Null/Non null validation is generally common across all verifications. All master table primary keys go through ‘being unique’ verification.

Multiple column Analysis/Coupled Values analysis

Once the basic single column analysis has been done, it is time to analyze relations between different columns containing related information. This analysis, also called the multiple/double column analysis, verifies the data model linkages like foreign key match, multiple column primary key verification. Please also refer and data model controls. Specific examples of multiple column analysis are:

  • The foreign key referential integrity check (A cheque code in the collection table should have an equivalent record in the ‘Financial instrument detail table’. Wondering why we need to normalize?? This will be in a case, when a single cheque is collected against a multiple sales).
  • Multiple column primary key uniqueness validation- (‘Flight code+ sequence number+date’ needs to be unique in the ‘flight details’ master of an airline operations management system)
    Business rules validation (If the manufactured item status is ‘WIP’, no record should be found in ‘billing’ table).
  • Completeness verification (at least one contact detail telephone 1, telephone 2, mobile no.. is available).
  • Transaction Consistency verification (If a card status is ‘billed’ in core card processing system, it should also be ‘billed’ in card statement system).
  • Master table to master table consistency verification (Customer description in the CRM system is congruent to the customer description in core system customer master AND the all ‘Customer master’ tables should be in synch for the common information fields).
  • Master table to transaction table consistency verification (similar to foreign key check. The sales should not be selling the products which are not existing in product master. Billing should not be sent to the customers, which are not existing OR are inactive in customer master)

TIP- This assessment can be done in two stages- One can first use the data definition language and the program specs to understand the business rules associated with the data. In the second stage, one can then run the discovery process on the data itself, using these business rules as a reference. This is different from a single stage pure discovery (reverse engineering) process.

PLEASE REFER Execution-MiHPractice Tool Data Mapping & Assessment Report

 

   Data Mapping and Assessment Data Model Entity Relationship Analysis  
 
All Topics in: "Data Mapping & Assessment" Chapter
 Data Mapping and Assessment →  Data Mapping and Column Analysis →  Data Model Entity Relationship Analysis →  DMA Data flow Analysis → 
 
Relevant Links to this page
TOPIC - Data Quality Monitoring → Data Quality Program DMA → Practice Tools → Data Mapping and Assessment Report → 

Was this page helpful?
If you like it ? share it !
Digg
Digg
Reddit
Reddit
Del.icio.us
Delicious
Google
Google
Live
Live
Facebook
Facebook
Slashdot
Slashdot
Netscape
Netscape
Technorati
Technorati
Stumbleupon
Stumbleupon
Spurl
Spurl
Furl
Furl
Blogmarks
Blogmarks
Yahoo
Yahoo
Plugim
Plugim
Squidoo
Squidoo
BlinkBits
BlinkBits
 
Content Additions
 
CONTENT ZONE
Data Quality

Featured Pages
BI Organizational Impact success assessment
Data Quality Approach Finalization
Analyze well, but be decisive
OLAP and Data Warehouse Dimensional Hierarchy

Make 'Executable' Strategy
Maximize Results
Maximize People
Manage Execution

Featured Pages
Dimensional Attributes+ Facts + Source Matrix
Data Monitoring Checklist
Avoid Pure MOLAP
OLAP in Business Intelligence- What is OLAP?