
What is KDD- Data Mining?

The term data mining is often used interchangeably with KDD. Strictly speaking, it is only one of the steps in the overall process of knowledge discovery in databases. Data mining needs a well-defined business case and diligent data preparation, and it has to be followed by a detailed evaluation of the discovery results.

Overview

Information is at the heart of business operations, and decision-makers can use stored data to gain valuable insight into the business. Database management systems give access to the stored data, but that access is only a small part of what can be gained from it. Traditional OLTP systems are good at putting data into databases quickly, safely and efficiently, but they are not good at delivering meaningful analysis in return. Analyzing data can go beyond what is explicitly stored and derive further knowledge about the business. This is where Knowledge Discovery in Databases (KDD) benefits an enterprise. It involves processes such as Business Case Definition, Data Preparation, Data Mining and Evaluation.
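The four processes named above can be sketched end to end in a few lines. This is only an illustrative sketch, not a prescribed KDD implementation: the churn question, the records, the function names and the single-threshold "model" are all hypothetical and chosen to keep each stage visible.

```python
# Hypothetical raw records: (monthly_visits, churned); None marks a missing value.
raw = [(1, True), (2, True), (None, False), (8, False), (9, False),
       (3, True), (7, False), (2, True), (10, False), (4, False)]

# 1. Business case definition: the question the mining step must answer.
BUSINESS_QUESTION = "Which customers are likely to churn?"


def prepare(records):
    """2. Data preparation: drop records with missing values."""
    return [(visits, churned) for visits, churned in records if visits is not None]


def rule_accuracy(threshold, records):
    """Share of records where the rule 'visits <= threshold means churn' is right."""
    hits = sum((visits <= threshold) == churned for visits, churned in records)
    return hits / len(records)


def mine(train):
    """3. Data mining: search for the visit threshold that best separates churners."""
    candidates = sorted({visits for visits, _ in train})
    return max(candidates, key=lambda th: rule_accuracy(th, train))


def evaluate(threshold, test):
    """4. Evaluation: check the discovered rule on records it has not seen."""
    return rule_accuracy(threshold, test)


clean = prepare(raw)
train, test = clean[:6], clean[6:]  # simple hold-out split, for illustration only
threshold = mine(train)
accuracy = evaluate(threshold, test)
print(BUSINESS_QUESTION)
print(f"Rule: churn if visits <= {threshold}; hold-out accuracy = {accuracy:.2f}")
```

The point of the sketch is the ordering: the mining step is only meaningful once the question is fixed and the data is cleaned, and its result is only trusted after it has been checked on held-out records.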

The term data mining has been stretched to cover almost any form of data analysis and is used interchangeably with KDD. Strictly speaking, however, data mining is just one step in the KDD process, the one that focuses on analyzing data with minimal user intervention. Some of the many definitions of data mining are:

  • “Data mining is the search for relationships and global patterns that exist in large databases but are 'hidden' among the vast amount of data, such as a relationship between patient data and their medical diagnosis. These relationships represent valuable knowledge about the database and the objects in the database and, if the database is a faithful mirror, of the real world registered by the database.” Marcel Holsheimer and Arno Siebes (1994).
  • The analogy with the mining process is described as: “Data mining refers to 'using a variety of techniques to identify nuggets of information or decision-making knowledge in bodies of data, and extracting these in such a way that they can be put to use in the areas such as decision support, prediction, forecasting and estimation. The data is often voluminous, but as it stands of low value as no direct use can be made of it; it is the hidden information in the data that is useful'.” Clementine User Guide, a data mining toolkit from SPSS.
  • “Data Mining is the nontrivial extraction of implicit, previously unknown, and potentially useful information from data. This encompasses a number of different technical approaches, such as clustering, data summarization, learning classification rules, finding dependency networks, analyzing changes, and detecting anomalies.” William J Frawley, Gregory Piatetsky-Shapiro and Christopher J Matheus.

In essence, data mining is concerned with analyzing data and with using tools and techniques to find patterns and regularities in data sets. It is the computer system, not the user, that is responsible for finding the patterns by identifying the underlying rules and features in the data. The idea is that it is possible to strike gold in unexpected places as the system mines deep into the data and extracts patterns that were either not previously discernible or so obvious that no one had noticed them. Data mining is not a matter of running simple queries to validate facts. The objective is to find patterns and rules automatically, with minimal user input.
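The difference between a fact-validating query and automatic pattern discovery can be illustrated with a small sketch. The market-basket data and both function names below are hypothetical, and counting frequent item pairs is only one simple stand-in for the many techniques the definitions above mention.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket data: each set is one customer's purchase.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"milk", "jam"},
    {"bread", "milk"},
]


def count_containing(items):
    """A simple query: it validates a hypothesis the user already has."""
    return sum(1 for basket in transactions if items <= basket)


def frequent_pairs(min_support):
    """Pattern discovery: scan every item pair and keep those that occur
    in at least `min_support` (a fraction) of all transactions."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    total = len(transactions)
    return {pair: n / total for pair, n in counts.items() if n / total >= min_support}


# The query needs the user to name the items in advance.
print(count_containing({"bread", "butter"}))
# The mining step is given only a support threshold and reports what it finds.
print(frequent_pairs(0.5))
```

In the query the user supplies the hypothesis (bread and butter) and the system merely confirms it; in the mining step the user supplies only a threshold and the system proposes the pairs itself.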

In the evolution from business data to business information to business knowledge, each new step has built upon the previous one. For example, dynamic data access is critical for drill-through in data navigation applications, and the ability to store large databases is critical to data mining. From the user’s point of view, the four steps, listed in the table below, were revolutionary because they allowed new business questions to be answered accurately and quickly.

Evolutionary Step | Business Question | Enabling Technologies | Product Providers | Characteristics
--- | --- | --- | --- | ---
Data Collection (1960s) | "What was my total revenue in the last five years?" | Computers, tapes, disks | IBM, CDC | Retrospective, static data delivery
Data Access (1980s) | "What were unit sales in New England last March?" | RDBMS, SQL, ODBC | Oracle, IBM, Microsoft | Retrospective, dynamic data delivery at record level
Data Warehousing (1990s) | "What were unit sales in New England last March? Drill down to Boston." | Relational Data Warehouse, OLAP, MDDB | NCR, Business Objects, COGNOS, Hyperion | Retrospective, dynamic data delivery at multiple levels
Data Mining (2000s) | "What’s likely to happen to Boston unit sales next month? Why?" | Advanced algorithms, Very Large Databases | SAS, SPSS, IBM, Oracle, NCR | Prospective, proactive information as well as knowledge delivery
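The shift from retrospective to prospective questions in the table can be made concrete with a short sketch. The sales figures and function names are hypothetical, and the least-squares trend line is just one simple way to produce a forward-looking estimate; it is not presented as what any of the listed products do.

```python
# Hypothetical monthly unit sales for one city, oldest month first.
boston_units = [100, 110, 120, 130, 140, 150]


def total_units(series):
    """Retrospective question: summarize what has already happened."""
    return sum(series)


def forecast_next(series):
    """Prospective question: fit a least-squares line through
    (month index, units) and extrapolate one month ahead.
    Requires at least two data points."""
    n = len(series)
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(series) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, series))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * n


print("Units sold so far:", total_units(boston_units))
print("Estimated units next month:", round(forecast_next(boston_units), 1))
```

The first function answers the kind of question in the upper rows of the table; the second answers the kind in the last row, and its fitted slope gives a first, rough hint at the "why" (a steady upward trend in this made-up series).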
