Download presentation
Presentation is loading. Please wait.
Published byArthur Doyle Modified over 9 years ago
1
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Chapter 12 Discovering New Knowledge – Data Mining
2
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Chapter Objectives Introduce the student to the concept of Data Mining. How it is different from knowledge elicitation from experts How it is different from extracting existing knowledge from databases. The objectives of data mining Explanation of past events (descriptive DM) Prediction of future events (predictive DM) (continued)
3
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Chapter Objectives (cont.) Introduce the student to the different classes of methods available for DM Symbolic (induction) Connectionist (neural networks) Statistical Introduce the student to the details of some of the methods described in the chapter.
4
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Section 12.1 - Objectives Introduction of chapter contents
5
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Section 12.2 - Objectives Defines the concept of data mining and the reasons for performing DM studies Defines the objectives of data mining Descriptive DM Predictive DM Introduces the three basic approaches to DM Symbolic (induction) Connectionist (neural networks) Statistical (curve-fitting, others)
6
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Section 12.3 - Objectives Present a detailed description of the symbolic approach to data mining - rule induction Present the main algorithm for rule induction - C5.0 and its ancestors, ID3 and CLS Present several example applications of rule induction Present other alternate algorithms for rule induction CART CHAID
7
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Section 12.4 - Objectives Provide a detailed description of the connectionist approach to data mining - neural networks Present the basic neural network architecture - the multi-layer feed forward neural network Present the main supervised learning algorithm - backpropagation Present the main unsupervised neural network architecture - the Kohonen network
8
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Section 12.5 - Objectives Provide a detailed description of the most important statistical methods for data mining Curve fitting with least squares method Multi-variate correlation K-Means clustering Market Basket analysis Discriminant analysis Logistic regression
9
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Section 12.6 - Objectives Provide useful guidelines for determining what technique to use for specific problems
10
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Section 12.7 - Objectives Discuss the importance of errors in data mining studies Define the types of errors possible in data mining studies
11
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Section 12.8 - Objectives Summarize the chapter Provide Key terms Provide Review Questions Provide Review Exercises
12
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.1 Is the stock’s price/earning s ratio > 5? Has the company’s quarterly profit increased over the last year by 10% or more? Don’ t buy No Yes Root node Is the company’s management stable? Leaf node Yes Don’ t buy No Buy Don’ t Buy Yes No
13
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.2 DS3 - Not Enjoyable DS2 - Not Enjoyable = Sunny = Cloudy = Rainy Rain Outlook {DS1, DS2, DS3, DS4} DS1 – Enjoyable DS4 - Not Enjoyable
14
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.3 None DS3 - Not Enjoyable DS2 - Not Enjoyable = Sunny = Cloudy= Rain Rain Outlook Temperature DS1 – Enjoyable DS4 – Not Enjoyable = Cold = Hot = Mild DS1 - Enjoyable DS4 – Not Enjoyable {DS1, DS2, DS3, DS4}
15
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.4 MilliExpert ThoughtGen OffSite Language Genie XS SilverWorks XS Lis p C++ Java
16
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.5 Language MilliExpert ThoughtGen OffSite Backward Forward Backward C++ Java MilliExpert ThoughtGen OffSite Genie XS SilverWorks XS Lis p OffSiteXSGinie XS SilverWorks Forward Backwards
17
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.6 MilliExpert ThoughtGen OffSite Backward Forward Backward C++ Java MilliExpert ThoughtGen OffSite Language Genie XS SilverWorks XS Lis p OffSite XSGenie XSSilverWorks Forward Backward MilliExpert ThoughtGen OffSite SpreadsheetXL ASCII Devices dBase
18
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.7 = Dry= Humid DS2 – Not Enjoyable DS3 – Not Enjoyable DS4 – Not Enjoyable {DS1, DS2, DS3, DS4} Humidity DS1 - Enjoyable
19
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.8 x2x2 x1x1 xkxk Activation function f() Inputs y W1W1 W2W2 WnWn
20
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.9 Threshold function Piece-wise Linear function Sigmoid function 1.0
21
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.10 Inputs Outputs
22
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.11
23
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.12 Variable A Variable B Cluster #1 Cluster #2
24
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.13 Inputs WiWi
25
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.14 x y
26
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Figure 12.15 x y Best fitting equation x
27
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Table 12.1
28
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Table 12.2
29
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Table 12.3
30
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Table 12.3 (cont.)
31
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Table 12.4
32
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Table 12.5
33
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Table 12.5 (cont.)
34
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Table 12.6
35
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Table 12.7
36
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Conclusions The student should be able to use: The C5.0 algorithm to capture rules from examples. Basic feedforward neural networks with supervised learning. Unsupervised learning, clustering techniques and the Kohonen networks. Curve-fitting algorithms. Statistical methods for clustering. Other statistical techniques.
37
Becerra-Fernandez, et al. -- Knowledge Management 1/e -- © 2004 Prentice Hall Chapter 12 Discovering New Knowledge – Data Mining
Similar presentations
© 2024 SlidePlayer.com. Inc.
All rights reserved.