|
WIZARDS
This is a fairly universal feature for competitive tools. A tool will significantly improve the user friendliness and productivity by providing wizards for as many activities as possible. For most of the initiatives, wizards may not be the one-stop solution for all your needs. However, they will let you cover a distance, before you need to get into the programming or building more complex components.
Connecting and Accessing the source system
- Identifying the sources once you give the names of the server or database environment
- Automatically detect or let you provide the details on the kind of platforms of your source systems and the type of data structures
- Providing the list of data tables, files, out of which you can select
- Should be able to get the source system details from the metadata repository** of the source systems or by scanning the actual tables and files
- Provide you the data structure of the source system (like the field names of the source table in a source database)
Creating the destination structures
This wizard helps you to create the data structures of the destination. For example, if you are migrating the data from one form to another, it will help you to create the data definition for the destination tables. Typically it will start with replication of the source system structures, and then allowing you to make changes in that structure. If you are converting multiple tables to multiple tables, it will give you a list of all fields and tables and you should be able to pick and choose out of them to create your destination structure
Creating source access structures
You may not be picking up all the tables in a database or all the fields in a table from the source system. The wizard should help you to define the structure of the data which you will extract from the source systems.
Creating data access routines
This wizard should help you to do the following:
- Define the filters which you would apply to extract the data.
- Define the data quality validation for extraction
- Define the scheduling (manual, automated, fixed time, conditional...)
- Sequence of access routines
- Validations to be done pre-and post a specific access routine
Importing data
Import is a simpler version of the access routine. You should be able to import data from a source system, by defining the source and the destination. You should be able to define the import destination data type as well.
Building Transformation process and flows
Transformation can be a complex activity. You may refer ETL transformation design** to get more understanding around this topic. This part of wizard is most important aspect one should validate. The features of the wizard will include:
- Allowing you to pick-up and select the pre-defined transformation libraries and functions
- Defining the work-flow of transformation. For example, a customer data may go through four stages of transformation in terms of standardizing the standardizing names, standardizing pin-codes, standardizing telephone numbers and de-duplication. A wizard should be able to define the flow of these transformations.
- Defining the scheduling of transformation in terms of sequence, pre and post checks
- Define the filters and business rules linked to the transformations.
Building loading process and flows
You may refer ETL loading design** in Data-Warehouse section** to get some more understanding on the. The loading wizard should be able to define the creation of data sets for loading, the sequence for creation of loading data sets, the sequence of loading the datasets into the destination area and finally the validations to be performed post the loading.
Identifying and repairing the issues related to extracted data and transformed data
As you do the validation of the extracted data as well as transformed data, some issues may be thrown-up. A wizard should allows you to interactively fix these issue or/and take decisions around them. For example, if there are customer records, which have garbled data around the age of the customers, you may decide to apply the extrapolation method, or place a default value or place a median value etc...
Creating re-usable business processes
We have not seen this feature in too many tools. By using this wizard, you should be able to define integration processes like- standard business process to handle financial values, pin-codes, customer names (placing first name first followed by last name followed by middle name)..
Modeling and Design Support
- Ability to provide impact analysis in case of changes in the source systems: This applies to any kind of DI task. If your source system data structure changes, it may have an impact on your data-federation, ETL and synchronization designs.
- Ability to track the data lineage (reverse impact analysis): This is reverse of the previous point. In case you need to have different data in your loaded area, you may specify it and the DI tool should be able to identify the changes needed in the extraction and transformation routines. For example, if you want to add the customer age in your operational data store**, BI tool should be able to identify the data source from where this information can be gathered and also the extraction and transformation routines which need to be changes.
- Ability to provide the impact analysis of changes to the data integration tasks: This will provide inputs on if you want to stop or change an ETL, synchronization or data federation tasks.
|