Published by Philip Perkins. Modified over 9 years ago.
1
Kansas State University
Department of Computing and Information Sciences
CIS 830: Advanced Topics in Artificial Intelligence

From Data Mining To Knowledge Discovery: An Overview
Course: CIS 864
Class: 6
Presenter: MANMOHAN K. UTTARWAR
On: Monday, January 29, 2001
2
Contents
- Reasoning
- Advances
- KDD: Definition
- Data Mining & KDD
- The KDD Process
- Stages of KDD
- Primary Tasks of Data Mining
- Components of Data Mining Algorithm
- Popular Data Mining Methods
- Application Issues of Data Mining
- Guidelines for Selecting Potential KDD Applications
- Research & Application Challenges for KDD
3
An Overview
- Explosive growth of business, government, and scientific databases
- Limited ability to interpret and digest this data
- Inappropriate tools and techniques for database analysis
4
Advances in Storage Technology
- Wal-Mart: 20 million transactions/day
- Health care: multi-gigabyte databases
- Mobil Oil: 100 TB of oil exploration data
- NASA EOS: generates 50 GB/hr of remotely sensed image data
5
KDD: Definition
The non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data.
- non-trivial process: a multi-step process
- valid: justified patterns/models
- novel: previously unknown
- useful: can be used
- understandable: by human and machine
6
Data Mining & KDD
Data Mining
- A step in the KDD process
- Consists of particular data mining algorithms
- Under specified computational efficiency limitations, produces a specific enumeration of patterns
KDD Process
- The process of using data mining methods (algorithms) to extract what is deemed knowledge according to the specifications of measures and thresholds, using the database along with any required preprocessing, subsampling, and transformations of that database.
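To make "measures and thresholds" concrete, here is a minimal sketch of a data mining step that enumerates patterns and keeps only those that pass a threshold. It assumes a hypothetical market-basket database, uses support (the fraction of transactions containing a pair of items) as the measure, and uses min_support as the threshold; the function name and the data are illustrative and do not come from the slides.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(transactions, min_support):
    """Enumerate item pairs whose support meets the threshold.

    Support (the measure) is the fraction of transactions that contain
    both items; min_support is the threshold a pattern must reach.
    """
    counts = Counter()
    for items in transactions:
        # Sort and deduplicate so each pair is counted once per transaction.
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items() if c / n >= min_support}


# Hypothetical toy database of transactions (an assumption for illustration).
transactions = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "milk"],
]
patterns = frequent_pairs(transactions, min_support=0.5)
print(patterns)
```

Raising min_support shrinks the enumeration; lowering it admits more, and less reliable, patterns.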
7
The KDD Process
Preliminaries
- Developing an understanding of the application domain
- Creating a target data set
- Choosing the data mining tasks
Preprocessing
- Data cleaning and pre-processing
- Data reduction and projection
- Choosing the data mining algorithms
Data Mining
- Application
- Interpretation of mined patterns
- Consolidating discovered knowledge
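The steps of the process can be sketched as a chain of small functions. This is only an illustration under stated assumptions: the record layout, the field names, and the choice of a mean as the "mined pattern" are all hypothetical, and a real KDD project would iterate over these steps rather than run them once.

```python
from statistics import mean


def select_target(records, fields):
    """Create the target data set: keep only the fields of interest."""
    return [{f: r.get(f) for f in fields} for r in records]


def clean(records):
    """Data cleaning: drop records that have a missing value."""
    return [r for r in records if all(v is not None for v in r.values())]


def project(records, field):
    """Data reduction and projection: reduce each record to one feature."""
    return [r[field] for r in records]


def mine(values):
    """Data mining step: here a trivial summarization of the feature."""
    return {"mean": mean(values), "count": len(values)}


def interpret(pattern, min_count=2):
    """Interpretation: accept the pattern only if enough records support it."""
    return pattern if pattern["count"] >= min_count else None


# Hypothetical raw database (an assumption for illustration).
raw = [
    {"id": 1, "region": "east", "amount": 10.0},
    {"id": 2, "region": "west", "amount": None},
    {"id": 3, "region": "east", "amount": 30.0},
]
target = select_target(raw, ["region", "amount"])
knowledge = interpret(mine(project(clean(target), "amount")))
print(knowledge)
```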
8
Stages of KDD
9
Tasks in KDD Process
10
Primary Tasks of Data Mining
- Classification: finding the description of several predefined classes and classifying a data item into one of them.
- Regression: maps a data item to a real-valued prediction variable.
- Clustering: identifying a finite set of categories or clusters to describe the data.
- Summarization: finding a compact description for a subset of data.
- Dependency Modeling: finding a model which describes significant dependencies between variables.
- Deviation and change detection: discovering the most significant changes in the data.
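Two of these tasks are easy to show side by side. The sketch below assumes a hypothetical list of sensor readings: summarization reduces the readings to a mean and standard deviation, and deviation detection flags readings that lie far from that summary. The z-score rule and the threshold of 2.0 are illustrative choices, not something the slides prescribe.

```python
from statistics import mean, pstdev


def summarize(values):
    """Summarization: a compact description of a subset of data."""
    return {"mean": mean(values), "std": pstdev(values)}


def deviations(values, z_threshold=2.0):
    """Deviation detection: values whose z-score exceeds the threshold."""
    s = summarize(values)
    if s["std"] == 0:
        return []  # all values identical, nothing deviates
    return [v for v in values if abs(v - s["mean"]) / s["std"] > z_threshold]


# Hypothetical readings with one obvious anomaly (an assumption).
readings = [10, 11, 9, 10, 12, 10, 50]
print(summarize(readings)["mean"])
print(deviations(readings))
```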
11
Primary Tasks of Data Mining
- Classification
- Regression
- Clustering
- Dependency modeling
- Change & Deviation Detection
- Summarization
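As one worked instance of the regression task, the sketch below fits a straight line by ordinary least squares and uses it to predict a real-valued output. The data points are hypothetical and chosen to lie exactly on a line; ordinary least squares is one common technique picked for illustration, not a method the slides single out.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Map a data item x to a real-valued prediction."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical data lying on y = 2x + 1 (an assumption for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
model = fit_line(xs, ys)
print(model, predict(model, 5.0))
```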
12
Primary Tasks of Data Mining
13
Components of Data Mining Algorithms
- Model Representation: the language L used to describe discoverable patterns
- Model Evaluation: how well a pattern meets the criteria of the KDD process
- Search Methods
  - Parameter Search
  - Model Search
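The three components can be seen in even a very small learner. In the sketch below, the model representation is a one-attribute threshold rule (a decision stump), model evaluation is plain accuracy, and the search loops over attributes (model search) and over candidate thresholds (parameter search). The representation, the evaluation criterion, and the data are all assumptions made for illustration.

```python
def predict(model, row):
    """Model representation: a rule 'row[attr] > threshold' (a decision stump)."""
    attr, threshold = model
    return 1 if row[attr] > threshold else 0


def accuracy(model, rows, labels):
    """Model evaluation: fraction of rows the model labels correctly."""
    hits = sum(predict(model, r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def search(rows, labels):
    """Search: try every attribute and every observed threshold."""
    best_model, best_score = None, -1.0
    for attr in range(len(rows[0])):  # model search: which attribute
        for threshold in sorted({r[attr] for r in rows}):  # parameter search
            score = accuracy((attr, threshold), rows, labels)
            if score > best_score:  # keep the first best-scoring model
                best_model, best_score = (attr, threshold), score
    return best_model, best_score


# Hypothetical data: attribute 0 separates the classes, attribute 1 does not.
rows = [(1.0, 7.0), (2.0, 3.0), (3.0, 8.0), (4.0, 2.0)]
labels = [0, 0, 1, 1]
model, score = search(rows, labels)
print(model, score)
```

Swapping the accuracy function for another criterion, or the stump for a richer rule language, changes one component without touching the other two.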
14
Popular Data Mining Methods
- Decision Trees & Rules
- Non-Linear Regression & Classification Methods
- Example-Based Methods
- Probabilistic Graphical Dependency Model
- Relational Learning Model
15
Non-Linear & Nearest Neighbor Classifier
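A nearest-neighbor classifier is the simplest example-based method: it stores the training examples and labels a new item with the class of the closest stored example. The sketch below is a 1-nearest-neighbor version using Euclidean distance; the training points and class names are hypothetical.

```python
import math


def nearest_neighbor(train, query):
    """Return the label of the training point closest to query (1-NN)."""
    _, label = min(train, key=lambda pair: math.dist(pair[0], query))
    return label


# Hypothetical labeled examples in two dimensions (an assumption).
train = [
    ((1.0, 1.0), "low"),
    ((1.5, 2.0), "low"),
    ((5.0, 5.0), "high"),
    ((6.0, 4.5), "high"),
]
print(nearest_neighbor(train, (2.0, 1.5)))
print(nearest_neighbor(train, (5.5, 5.0)))
```

Because the decision boundary follows the stored examples rather than a fixed functional form, the resulting classifier can be non-linear.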
16
SEMMA Process (Simple DM Algo.)
17
Application Issues of KDD
- Database Marketing
- Analysis and Selection of Stocks
- Scientific applications such as
  -- Astronomy
  -- Molecular Biology
  -- Global Climate Change Modeling
- Fraud Detection & Prevention
18
Guidelines for Selecting a Potential KDD Application
Practical Criteria
-- Potential for significant impact of an application
-- No good alternatives exist
-- Organizational support
-- Potential for privacy / legal issues
Technical Criteria
-- Availability of significant data
-- Relevance of attributes
-- Low noise levels
-- Confidence intervals
-- Prior knowledge
19
Challenges for KDD
- Larger Databases
- High Dimensionality
- Over-fitting
- Assessing Statistical Significance
- Changing Data & Knowledge
- Missing & Noisy Data
- Complex Relationships between Fields
- Understandability of Patterns
- User Interaction & Prior Knowledge
- Integration with other Systems
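Over-fitting is the easiest of these challenges to demonstrate. The sketch below compares a model that memorizes its training data, including one noisy label, with a simple threshold rule, and scores both on held-out data. The data, the noise, and both models are hypothetical; the point is only that perfect training accuracy says little about accuracy on unseen items.

```python
def accuracy(model, xs, ys):
    """Fraction of items in xs that the model labels as in ys."""
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(xs)


def fit_memorizer(xs, ys):
    """Over-fitted model: exact lookup table, majority class for unseen x."""
    table = dict(zip(xs, ys))
    default = max(set(ys), key=ys.count)
    return lambda x: table.get(x, default)


def threshold_rule(x):
    """Simple model: class 1 when x is at least 5 (the assumed true rule)."""
    return 1 if x >= 5 else 0


# Hypothetical data; the training label at x = 3 is deliberate noise.
train_x = [1, 2, 3, 6, 7, 8, 9]
train_y = [0, 0, 1, 1, 1, 1, 1]
test_x = [0, 4, 5, 10]
test_y = [0, 0, 1, 1]

memorizer = fit_memorizer(train_x, train_y)
memo_train = accuracy(memorizer, train_x, train_y)  # fits the noise too
memo_test = accuracy(memorizer, test_x, test_y)
rule_train = accuracy(threshold_rule, train_x, train_y)
rule_test = accuracy(threshold_rule, test_x, test_y)
print(memo_train, memo_test, rule_train, rule_test)
```

The memorizer scores higher on the training set but lower on the held-out set, which is the signature of over-fitting.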
20
Conclusions
- KDD: a desirable end product
- Many approaches exist, each with advantages and problems
- Barriers to obtaining quality data