As you implement MDM-CDI solution, one need to have a method so that:
- The Customer records stored in MDM-CDI hub are unique and are linked to the 'existing' customer records in the source system. This is enabled by assigning unique identifier keys. This enables the updation of the MDM-CDI hub as soon as a customer record is updated in one of the source systems.
- The customer records stored in MDM-Hub have unique identifier fields, so that one can search for the existing customers in MDM- Hub, as you add 'new'customer records in the source systems. In case of new customer records, the unique ID concept does not work. One has to search the MDM-CDI hub using identification attributes
Concept of Unique Identifier within MDM-CDI HUB to link 'Existing Customer Records'
In any of the architecture scenario, an MDM should be able to connect with the source systems on the on-going basis. This requires a mapping between the Customer Master Data lying in the MDM-CDI Hub and the Customer Master tables in the source systems. This is applicable for 'existing' customer records in the source systems, which have gone through processing and have been placed in the MDM-Hub.
As a process, in this kind of architecture, a unique MDM-HUB identifier is allocated to all the master-customer records within the HUB. This MDM-CDI Customer_ID will be different from the customer IDs used in the source systems. The reason is that the MDM-CDI HUB will be having customer data from multiple source systems. Each source system would have assigned customer_ID within its own systems as per some logic of (say location+ type+ running serial number). As these customer records get processed and stored in the MDM-HUB, they have to be given a new identifier, because the identification has a different context. The customer data picked from source systems undergoes following contextual change:
- They loose their one-on-one relationship with source system. For example multiple customer records in source system could be linked to a single 'merged' record in the MDM-CDI Hub.
- They are in large universe of customer records. For example MDM-Hub can contain million customer records, picked from ten different source systems, with each having on an average one hundred thousand records. Therefore, the running serial number concept will work with much larger population
- The identifier has to deal with a larger set of identifier attributes: A source system could be just having the location+ running serial number as customer ID. This could be because as per the data model of that source system, a customer cannot belong to more than one location. However, in MDM-CDI hub data model, a customer could belong to more than one location.
An example of how the MDM platform is maintaining an on-going link to Source Systems using unique identifier
Persistent Style
Persistent style is that when HUB is the single point reference and has all the Master Data stored physically in its database. This data has gone through data correction** and transformation**. This authoritative data can have following scenarios:
Single record MDM-CDI hub is related to Single record in a Single Source system
This is simplest of all. For example, a customer_ID ACD12345 in sales order system is linked to customer_ID AH2345789 in MDM_CDI Hub. Every time ACD12345 changes (say marital status undergoes a change), by sheer ID mapping, it will update the change in the MDM-CDI hub.
One-to-one mapping like this, is possible when there is not merging or splitting of the source system. The source system customer master table will have an additional field called 'Hub_Customer_ID' which will carry the customer_ID as assigned in the Hub for that record. This will be the foreign or reference key to connect to the customer record in the MDM-CDI hub.
Single record in MDM-CDI hub is related to Multiple records in a Single Source system:
This happens when the source system records have been merged (or de-duped) from a single source system. In that scenario, one you can have following sub-scenarios:
There is only one true 'genuine' customer record
It is possible that one record among the multiple customer records related to the same customer in the source system, is assigned as 'true' customer record. The other records could be in-active, dead or redundant records. At the time of original creation of the merged customer record in the MDM-CDI hub, this decision would have been taken and any useful data on these dead records would have been picked up. It might be possible that all the records might be functioning initially, but as the MDM analysis happened the true record is identified in the source system for any future processing.
However, the dead or redundant records are still maintained for the sake of reporting, historical analysis, and regulatory compliance. In this case, one uses a combination of MDM-CDI Hub Customer_ID and a flag on which is the true reference record in the source system.
For example
MDM-CDI Customer Table |
|
|
|
|
Customer_ID |
Customer_FNAME |
Customer_Lname |
|
|
AJH12345678 |
Joseph |
Holland |
|
|
Source System |
|
|
|
|
MDM_Customer_ID |
MDM_Ref_tag |
Customer_ID |
Customer_Fname |
Customer_Lname |
AJH12345678 |
Y |
AG12345 |
Joseph |
Holland |
AJH12345678 |
N |
AK45367 |
Joe |
Holland |
AJH12345678 |
N |
PK21456 |
J |
Holland |
AJH12345678 |
N |
SD34189 |
Joseph F. |
Holland |
In the above example, the customer_ID AG12345 is the true reference record. One could ask a question on why to bother mapping the dead records (the other three) to the MDM-CDI hub. The reasons are:
- It provides an audit trail on the input records for creating the master record in the Hub.
- It also allows a check to see if supposedly dead records have any activity on them. If there is an activity, it can raise alerts for the source system owners.
- There is more than one 'genuine' record in the same source system, related to a single record in the MDM-CDI hub
In this case, there will be certain business rules, which will be used to merge the information from multiple source records into one record in MDM hub. There can be dozens of different merging rules. The merge rules can have one among the following approaches:
Table driven rule, where you can specify different rules for different records
Let us say that we have four customer records in the source system, which are merged into one record in the MDM-CDI hub. You can assign the fields which will be considered as reference fields for non-financial information. Extending on previous example, you can say that
- Name and address field will be referred from customer_id AG12345
- The customer activity status will be taken as 'active', if any one of the customer records in the source systems are showing active.
- The relationship value will be the sum of relationship value across all the four records.
These rules can be configured by having rule tables like
MDM_Customer_ID |
Customer_ID |
Customer_Fname |
Customer_Lname |
Relationship Value |
Status |
AJH12345678 |
AG12345 |
Refer |
Refer |
SUM |
Active |
AJH12345678 |
AK45367 |
N |
N |
SUM |
Active |
AJH12345678 |
PK21456 |
N |
N |
SUM |
Active |
AJH12345678 |
SD34189 |
N |
N |
SUM |
Active |
MDM program driven rules
These kinds of rules will be applicable across multiple records in the MDM-CDI hub (unlike table driven rules, which can be made specific to each customer record in MDM-CDI hub). Using program driven rules, one can assign rules for more complex conditions.
Single record in MDM-CDI hub is related to multiple records across different source system.
This is an extension of the previous point. The only difference will be that in your business rules, you will also have to define the source system from which the records are being picked up. examples are:
MDM_Customer_ID |
Customer_ID |
Source_ID |
Cust_Fname |
Cust_Lname |
Relationship_Value |
Status |
AJH12345678 |
AG12345 |
CRM01 |
Refer |
Refer |
SUM |
Active |
AJH12345678 |
AKI453 |
SFM01 |
N |
N |
SUM |
Active |
AJH12345678 |
PK21456 |
CRM01 |
N |
N |
SUM |
Active |
AJH12345678 |
SDH8B9 |
SBF02 |
N |
N |
SUM |
Active |
Multiple records MDM-CDI hub are related to single record in a single system.
This happens when you have split a customer record in the source system to multiple records in the MDM-Hub. In this scenario, the best way is to split the record in the source system as well. The split rules for on-going updation of MDM-CDI are complex and confusing to follow.
Registry or Virtual Style
Registry style architecture, one will have only the unique identifier fields in the MDM-CDI hub and the descriptive fields in the source systems only. The registry style is followed when you don't need to do complex merging of the records or any transformation apart from the identifier fields. For example
Let us say that you have customer first name, last name, PIN-Code and telephone number as unique identifier. You may undergo the transformation of the PIN-Code, and Telephone number as per the standard format. However, you will not transform the descriptive fields (like address and Customer Description...) fields, which are lying in the source system.
The 'true' ongoing merge does not happen in registry style. At the time of Customer Data Integration, you would do one-time merge or de-dupe the multiple records across the systems or within a single system. However once you have done the merging, or de-duping, you will assign a single record in the source system(s) as true reference record and assign the other records as redundant. This single 'true reference' record will be linked to the MDM-CDI hub customer record on the ongoing basis. |