Figure 1.1 Rules for the contact lens data. Figure 1.2 Decision tree for the contact lens data.

Presentation transcript:

Figure 1.1 Rules for the contact lens data.

Figure 1.2 Decision tree for the contact lens data.

Figure 1.3 Decision trees for the labor negotiations data. (a)(b)

Figure 2.1 A family tree and two ways of expressing the sister-of relation.

Figure 2.2 ARFF file for the weather data.

Figure 3.1 Decision tree for a simple disjunction.

Figure 3.2 The exclusive-or problem.
If x=1 and y=0 then class = a
If x=0 and y=1 then class = a
If x=0 and y=0 then class = b
If x=1 and y=1 then class = b
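As a quick check, these four rules assign class a exactly when x and y differ. A throwaway sketch in Python (the function name and binary 0/1 attributes are assumptions for illustration):

def xor_class(x, y):
    # Class from Figure 3.2: 'a' when exactly one of x, y is 1.
    return "a" if x != y else "b"

# Reproduces the four rules above.
assert [xor_class(x, y) for x, y in [(1, 0), (0, 1), (0, 0), (1, 1)]] == ["a", "a", "b", "b"]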

Figure 3.3 Decision tree with a replicated subtree.
If x=1 and y=1 then class = a
If z=1 and w=1 then class = a
Otherwise class = b
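These rules are a disjunction of two conjunctions; expressed directly as code (a sketch, assuming binary 0/1 attributes and a hypothetical classify function):

def classify(x, y, z, w):
    # Rules from Figure 3.3: class 'a' if (x and y) or (z and w), else 'b'.
    if x == 1 and y == 1:
        return "a"
    if z == 1 and w == 1:
        return "a"
    return "b"

Because a decision tree tests one attribute per node, it cannot express this disjunction without repeating the z/w tests under several branches, which is the replicated subtree the figure illustrates.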

Figure 3.4 Rules for the Iris data: a rule set with exceptions. The default class is Iris-setosa, with nested exceptions and alternatives assigning Iris-versicolor and Iris-virginica according to thresholds on petal-length, petal-width, sepal-length, and sepal-width.

Figure 3.5 The shapes problem. Shaded: standing Unshaded: lying

Figure 3.6(a) Models for the CPU performance data: linear regression. PRP is expressed as a weighted sum of MYCT, MMIN, MMAX, CACH, CHMIN, and CHMAX.
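Written generically, such a linear model has the form below, where the weights $w_0,\dots,w_6$ stand for the coefficients fitted to the CPU data (their numeric values appear in the figure itself):

$$\mathrm{PRP} = w_0 + w_1\,\mathrm{MYCT} + w_2\,\mathrm{MMIN} + w_3\,\mathrm{MMAX} + w_4\,\mathrm{CACH} + w_5\,\mathrm{CHMIN} + w_6\,\mathrm{CHMAX}$$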

Figure 3.6(b) Models for the CPU performance data: regression tree.

Figure 3.6(c) Models for the CPU performance data: model tree.

Figure 3.7 Different ways of partitioning the instance space. (a)(b)(c)(d)

Figure 3.8 Different ways of representing clusters of the elements a, b, c, d, e, f, g, h, … (a)(b)(c)(d)

Figure 4.1 Pseudo-code for 1R.

Figure 4.2 Tree stumps for the weather data. (a)(b) (c)(d)

Figure 4.3 Expanded tree stumps for the weather data. (a)(b) (c)

Figure 4.4 Decision tree for the weather data.

Figure 4.5 Tree stump for the ID code attribute.

Figure 4.6 (a) Operation of a covering algorithm; (b) decision tree for the same problem. (a) (b)

Figure 4.7 The instance space during operation of a covering algorithm.

Figure 4.8 Pseudo-code for a basic rule learner.

Figure 5.1 A hypothetical lift chart.

Figure 5.2 A sample ROC curve.

Figure 5.3 ROC curves for two learning schemes.

Figure 6.1 Example of subtree raising, where node C is “raised” to subsume node B. (a)(b)

Figure 6.2 Pruning the labor negotiations decision tree.

Figure 6.3 Generating rules using a probability measure.

Figure 6.4 Definitions for deriving the probability measure. p = number of instances of that class that the rule selects; t = total number of instances that the rule selects; P = total number of instances of that class in the dataset; T = total number of instances in the dataset.
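One way to turn these four counts into a measure (a sketch of the idea, not necessarily the exact formula used in Figure 6.3) is the probability that a rule selecting t instances completely at random would pick at least p instances of the class, given by the hypergeometric tail:

$$m(R) = \sum_{i=p}^{\min(t,P)} \frac{\binom{P}{i}\binom{T-P}{t-i}}{\binom{T}{t}}$$

Smaller values indicate rules whose performance is less likely to have arisen by chance.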

Figure 6.5 Algorithm for forming rules by incremental reduced error pruning.

Figure 6.6 Algorithm for expanding examples into a partial tree.

Figure 6.7 Example of building a partial tree. (a) (b) (c)

Figure 6.7 (continued) Example of building a partial tree. (d)(e)

Figure 6.8 Rules with exceptions, for the Iris data. Exceptions are represented as dotted paths, alternatives as solid ones.

Figure 6.9 A maximum margin hyperplane.

Figure 6.10 A boundary between two rectangular classes.

Figure 6.11 Pseudo-code for model tree induction.

Figure 6.12 Model tree for a dataset with nominal attributes.

Figure 6.13 Clustering the weather data. (a) (c) (b)

Figure 6.13 (continued) Clustering the weather data. (d) (e)

Figure 6.13 (continued) Clustering the weather data. (f)

Figure 6.14 Hierarchical clusterings of the Iris data. (a)

Figure 6.14 (continued) Hierarchical clusterings of the Iris data. (b)

Figure 6.15 A two-class mixture model.
Data (class, value): A 51, A 43, B 62, B 64, A 45, A 42, A 46, A 45, A 45, B 62, A 47, A 52, B 64, A 51, B 65, A 48, A 49, A 46, B 64, A 51, A 52, B 62, A 49, A 48, B 62, A 43, A 40, A 48, B 64, A 51, B 63, A 43, B 65, B 66, B 65, A 46, A 39, B 62, B 64, A 52, B 63, B 64, A 48, B 64, A 48, A 51, A 48, B 64, A 42, A 48, A 41
Model: μ_A = 50, σ_A = 5, p_A = 0.6; μ_B = 65, σ_B = 2, p_B = 0.4
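Spelled out, and assuming each class is modeled by a normal distribution (as the μ/σ parameters suggest), the mixture density for a value $x$ is

$$\Pr[x] = p_A\,\mathcal{N}(x;\mu_A,\sigma_A) + p_B\,\mathcal{N}(x;\mu_B,\sigma_B) = 0.6\,\mathcal{N}(x;50,5) + 0.4\,\mathcal{N}(x;65,2),$$

where $\mathcal{N}(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$.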

Figure 7.1 Attribute space for the weather dataset.

Figure 7.2 Discretizing temperature using the entropy method.

Figure 7.3 The result of discretizing temperature: class sequence no yes yes no yes yes yes no no yes yes no yes yes, with candidate boundary points labeled A–F.

Figure 7.4 Class distribution for a two-class, two-attribute problem.

Figure 7.5 Number of international phone calls from Belgium, 1950–1973.

Figure 7.6 Algorithm for bagging.
model generation
  Let n be the number of instances in the training data.
  For each of t iterations:
    Sample n instances with replacement from the training data.
    Apply the learning algorithm to the sample.
    Store the resulting model.
classification
  For each of the t models:
    Predict class of instance using model.
  Return class that has been predicted most often.
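A minimal sketch of this procedure in Python (the base_learner callable, which trains on a list of instances and returns a model that can be called on a new instance, is an assumption for illustration):

import random
from collections import Counter

def bagging_train(train_data, base_learner, t):
    # Model generation: build t models, each on a bootstrap sample.
    n = len(train_data)
    models = []
    for _ in range(t):
        # Sample n instances with replacement from the training data.
        sample = [random.choice(train_data) for _ in range(n)]
        # Apply the learning algorithm to the sample and store the resulting model.
        models.append(base_learner(sample))
    return models

def bagging_classify(models, instance):
    # Classification: return the class that the models predict most often.
    votes = Counter(model(instance) for model in models)
    return votes.most_common(1)[0][0]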

Figure 7.7 Algorithm for boosting.
model generation
  Assign equal weight to each training instance.
  For each of t iterations:
    Apply learning algorithm to weighted dataset and store resulting model.
    Compute error e of model on weighted dataset and store error.
    If e equals zero, or e is greater than or equal to 0.5:
      Terminate model generation.
    For each instance in dataset:
      If instance classified correctly by model:
        Multiply weight of instance by e / (1 - e).
    Normalize weights of all instances.
classification
  Assign weight of zero to all classes.
  For each of the t (or fewer) models:
    Add -log(e / (1 - e)) to weight of class predicted by model.
  Return class with highest weight.
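A sketch of the same weighting scheme in Python (the weighted_learner callable, which must accept per-instance weights and return a model callable on a single instance, is an assumption for illustration):

import math
from collections import defaultdict

def boosting_train(instances, labels, weighted_learner, t):
    # Model generation: start with equal weights, reweight after each round.
    n = len(instances)
    weights = [1.0 / n] * n
    models = []
    for _ in range(t):
        model = weighted_learner(instances, labels, weights)
        # Weighted error of the model on the training data.
        e = sum(w for x, y, w in zip(instances, labels, weights) if model(x) != y)
        if e == 0 or e >= 0.5:
            # Terminate; discarding the degenerate model keeps the vote
            # weight -log(e / (1 - e)) finite (a slight simplification).
            break
        models.append((model, e))
        # Down-weight correctly classified instances by e / (1 - e), then normalize.
        weights = [w * (e / (1 - e)) if model(x) == y else w
                   for x, y, w in zip(instances, labels, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return models

def boosting_classify(models, instance):
    # Classification: weighted vote, each model contributing -log(e / (1 - e)).
    class_weights = defaultdict(float)
    for model, e in models:
        class_weights[model(instance)] += -math.log(e / (1 - e))
    return max(class_weights, key=class_weights.get)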

Figure 8.1 Weather data: (a) in spreadsheet; (b) comma-separated. (a)(b)

Figure 8.1 Weather data: (c) in ARFF format. (c)

Figure 8.2 Output from the J4.8 decision tree learner.

Figure 8.3 Using Javadoc: (a) the front page; (b) the weka.core package. (a) (b)

Figure 8.4 A class of the weka.classifiers package.

Figure 8.5 Output from the M5´ program for numeric prediction.

Figure 8.6 Output from J4.8 with cost-sensitive classification.

Figure 8.7 Effect of AttributeFilter on the weather dataset.

Figure 8.8 Output from the APRIORI association rule learner.

Figure 8.9 Output from the EM clustering scheme.

Figure 8.10 Source code for the message classifier.

Figure 8.10 (continued)

Figure 8.11 Source code for the ID3 decision tree learner.

Figure 8.11 (continued)

Figure 8.11 (continued)

Figure 8.12 Source code for a filter that replaces the missing values in a dataset.

Figure 8.12 (continued)

Figure 9.1 Representation of Iris data: (a) one dimension.

Figure 9.1 Representation of Iris data: (b) two dimensions.

Figure 9.2 Visualization of classification tree for grasses data.