Knowledge Discovery in Databases Program

A KDD (data mining) program has two streams of requirements. Business (functional) requirements center on growth in revenue and profitability and on business process optimization. Non-functional requirements such as fast response time, accuracy, visualization, metadata management and data quality help sustain sponsorship of a KDD program.

This page is an extract from BIDS KDD Methodology authored by Kamlesh Mhashilkar-Head, Execution-MiH Services of Tata Consultancy Services

The prime objective of KDD is to find patterns and rules automatically, based on the data. Data mining brings together techniques from machine learning (artificial intelligence), pattern recognition, statistics, databases and visualization. The emergence of new algorithms suited to large volumes of data, and of facilities to integrate these algorithms into the database engine, has added to the attractiveness of data mining. Data mining techniques are used for unearthing associations, segmenting or classifying large data sets, discovering rules, modeling and identifying functional dependencies.

Goals

Typical goals set for a KDD program relate to business growth (bottom line as well as top line) and process optimization. The overall goals for a data mining program can be defined as follows.

  • Improve sales revenue using KDD results, e.g. by improving customer acquisition, retention and share of wallet, or by increasing sales through customized product packaging.
  • Improve profit using KDD analysis, e.g. by controlling operational costs on campaigns, identifying and reducing risk and fraud, and configuring profitable products and channels.
  • Optimize or improve business processes with minimal human intervention using data mining techniques, e.g. workflow or network bottleneck detection, drug formulation.

Objectives

In an organization, the objectives of the KDD program are specific to the business domain and can be very specific to individual cases within a line of business. The generic objectives, however, can be stated as follows.

  • Unearth the valuable information hidden in the data. This can be achieved through prediction of unknown or future values of selected metrics.
  • Improve business opportunities based on the patterns that emerge from the data mining results
  • Reduce the cost by identifying the business process gaps
  • Optimize and enhance the business processes

Functional Requirements

The high-level business requirements of the organization can be grouped as follows.

Growth in Revenue

  • Divide and Rule: Understand customer behavior through demographic (e.g. geographic, socio-economic, professional, educational, personal) and psychographic (e.g. hobbies, preferences, privacy norms, event-based response) segmentation / classification, and grow the loyal customer base through customer acquisition and retention for a higher share of wallet.
  • Lateral Growth: Understand market needs in order to provide better products and services, including channels. Find cross-domain (different line of business) opportunities to grow laterally.
  • Cross Sell / Up Sell: Understand the customer and the market, and enable cross-selling and up-selling through customized product packaging. Use channels and partners as additional revenue sources.
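
As a rough illustration of the demographic segmentation described above, the following Python sketch groups customers by region and age band and reports the size and average spend of each segment. It is a minimal sketch under stated assumptions, not part of the BIDS methodology: the field names (`region`, `age`, `annual_spend`), the age bands and the function names are all hypothetical.

```python
from collections import defaultdict


def age_band(age):
    """Map an age to a coarse band (band boundaries are an assumption)."""
    if age < 30:
        return "under-30"
    if age < 50:
        return "30-49"
    return "50-plus"


def segment_customers(customers):
    """Group customers by (region, age band).

    Returns a dict mapping each segment key to its customer count
    and average annual spend.
    """
    totals = defaultdict(lambda: [0, 0.0])  # key -> [count, total spend]
    for customer in customers:
        key = (customer["region"], age_band(customer["age"]))
        totals[key][0] += 1
        totals[key][1] += customer["annual_spend"]
    return {
        key: {"customers": count, "avg_spend": spend / count}
        for key, (count, spend) in totals.items()
    }
```

A segment with few customers but a high average spend might be a candidate for retention effort, while a large, low-spend segment might suit cross-sell campaigns. A real program would typically use many more attributes and a clustering or classification algorithm rather than fixed rules.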

Growth in Profit

  • Control Operational Costs: Controlling operational cost can be achieved through various mechanisms based on the business, but a few generic examples are as follows.
    • Targeted Campaign
    • Optimized Communication
    • Identifying and treating high cost components
  • Profitable Product and Services configurations: Configuring the products and services to reduce cost and gain profit.
  • Managing Risk
    • Credit Risk: Identifying and scoring credit risk for customers in order to maximize profit. This includes fraud identification (e.g. credit card frauds such as skimming and bust-out).
    • Market Risk: Analyzing market risk (interest rate based, foreign exchange based, equity based) and using that knowledge to limit or avoid its impact and increase profit.
    • Operational Risk: Minimizing operational risk through better and timely prediction of faults and bottlenecks. Predicting “just enough” capital allocation for such risks, resulting in better asset management.

Business Process Optimization / Enhancement

  • Bottleneck Detection: Identification of bottlenecks / flaws in business processes to augment productivity as well as stakeholder satisfaction. This plays a critical role in managing corporate performance.
  • Process Automation: Identification of processes that can be automated, resulting in lower operational expenditure.

Non-Functional Requirements

Apart from the business requirements, there can be a few non-functional requirements related to the data mining system. The following is an indicative list of these non-functional requirements.

  • Data Integration and Cleansing: Data mining has a critical dependency on integrated, clean and well-maintained data. Performing data mining on data from a data warehouse is the ideal situation. In the absence of a data warehouse, some amount of data pre-processing is needed before deploying data mining.
  • Metadata Management: To ensure that clean and appropriate data is provided to the data mining process, it is necessary to manage and integrate the enterprise-wide metadata. This aids in improving data quality through integrated metadata definitions (measures, business rules), which need to be used for data integration and cleansing.
  • Response Time: When data mining is performed on large volumes of data and / or with complex algorithms, response time becomes a concern. In such scenarios, capable hardware and software techniques need to be in place to ensure timely responses, which are critical for business decisions.
  • Visualization: Sometimes the output of data mining is extremely complex to understand because of the number of entities and steps involved in the computation, e.g. the output of credit card skimming detection. The results need to be presented as interpretable patterns, for which visualization techniques may need to be developed based on the business case.
  • Accuracy / Hit Ratio: For the results of data mining to be usable, they need to achieve a sufficiently high hit ratio; otherwise the program incurs operational cost that cannot be recovered. A high hit ratio or accuracy depends on various factors, e.g. sample data, algorithm, model parameterization, test data and the difference in time (or situation) between model generation and application.
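
As a hedged illustration of the Accuracy / Hit Ratio requirement, the Python sketch below scores a model's predictions against known outcomes on held-out test data. The source does not define "hit ratio" precisely; this sketch assumes it means the share of records flagged by the model that turn out to be true positives. The names `HitStats` and `score_predictions` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class HitStats:
    """Counts of prediction outcomes on a labeled test set."""
    true_pos: int = 0
    false_pos: int = 0
    true_neg: int = 0
    false_neg: int = 0

    @property
    def accuracy(self) -> float:
        """Share of all records the model classified correctly."""
        total = self.true_pos + self.false_pos + self.true_neg + self.false_neg
        return (self.true_pos + self.true_neg) / total if total else 0.0

    @property
    def hit_ratio(self) -> float:
        """Share of flagged records that were real hits (assumed definition)."""
        flagged = self.true_pos + self.false_pos
        return self.true_pos / flagged if flagged else 0.0


def score_predictions(predicted, actual) -> HitStats:
    """Compare predicted flags with actual outcomes, record by record."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    stats = HitStats()
    for p, a in zip(predicted, actual):
        if p and a:
            stats.true_pos += 1
        elif p and not a:
            stats.false_pos += 1
        elif not p and a:
            stats.false_neg += 1
        else:
            stats.true_neg += 1
    return stats
```

Re-scoring the same model periodically on fresh labeled data is one simple way to notice the drift that arises from the time difference between model generation and application.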

Note: BIDS Solutions encompass the proprietary solutions from TCS covering the Business Intelligence and Data Warehousing landscape.

 
