CSCI 347, Data Mining Chapter 4 – Functions, Rules, Trees, and Instance-Based Learning

Audio With PowerPoints Audio can be included in PowerPoint slides when the slides are viewed as a slideshow. While watching the slideshow, click the audio icon on a slide to hear a description of that slide.

Functions/Linear Models

Example
Let x be the value of a house in Butte, MT:
x = c + b * num_bedrooms + d * num_bathrooms + p * price_houses_in_neigh
where c, b, d, and p are coefficients "learned" by the data mining algorithm
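As a concrete illustration (not from the slides), here is a tiny Python sketch that evaluates such a formula for one house; the coefficient values used for c, b, d, and p are invented purely for the example.

def house_value(num_bedrooms, num_bathrooms, price_houses_in_neigh,
                c=25000.0, b=10000.0, d=7500.0, p=0.4):
    # x = c + b * num_bedrooms + d * num_bathrooms + p * price_houses_in_neigh
    # (illustrative coefficients only; in practice they are learned from data)
    return c + b * num_bedrooms + d * num_bathrooms + p * price_houses_in_neigh

# 25000 + 10000*3 + 7500*2 + 0.4*250000 = 170000
print(house_value(3, 2, 250000))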

Simple Linear Regression Equation for the CPU Performance Data
PRP = w0 + w1 * CACH
(a single-attribute model; the numeric weights are fitted from the CPU performance data)

More Precise Linear Regression Equation for the CPU Data
PRP = w0 + w1 * MYCT + w2 * MMIN + w3 * MMAX + w4 * CACH + w5 * CHMIN + w6 * CHMAX
(a model using all of the numeric attributes, with weights fitted from the data)

Linear Models
• Work most naturally with numeric attributes
• The outcome is a linear combination of the attribute values a1, a2, …, ak and weights w0, w1, …, wk:
x = w0 + w1*a1 + w2*a2 + … + wk*ak
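As a sketch of how such weights could be learned (an illustration, not a method prescribed by the slides), the snippet below fits w0, w1, w2 to a small made-up dataset with ordinary least squares using numpy:

import numpy as np

# Rows are instances, columns are the attributes a1 and a2.
A = np.array([[1.0, 2.0],
              [2.0, 0.0],
              [3.0, 1.0],
              [4.0, 3.0]])
# Targets generated as x = 1 + 2*a1 + 1*a2, so lstsq should recover w ~ [1, 2, 1].
x = np.array([5.0, 5.0, 8.0, 12.0])

A1 = np.column_stack([np.ones(len(A)), A])   # prepend a column of 1s for w0
w, *_ = np.linalg.lstsq(A1, x, rcond=None)   # w = [w0, w1, w2]

predictions = A1 @ w                         # x_hat = w0 + w1*a1 + w2*a2
print(w, predictions)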

Rules as Covering Algorithms

Covering Algorithms
• Rather than looking at what attribute to split on, start with a particular class
• Class by class, develop rules that "cover" the class

Example: Generating a Rule
If x > 1.2 then class = a
If x > 1.2 and y > 2.6 then class = a
If ??? then class = a

Example: Generating a Rule
Possible rule set for class "b":
If x > 1.2 and y > 2.6 then class = a
If x ≤ 1.2 then class = b
If x > 1.2 and y ≤ 2.6 then class = b
Could add more rules to get a "perfect" rule set
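Written as code, the rule set above is just three if-tests applied in order; the sketch below assumes two numeric attributes named x and y, as in the example:

def classify(x, y):
    if x > 1.2 and y > 2.6:
        return "a"
    if x <= 1.2:
        return "b"
    if x > 1.2 and y <= 2.6:
        return "b"

print(classify(1.5, 3.0))  # a
print(classify(1.0, 0.5))  # b
print(classify(2.0, 1.0))  # b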

Decision Trees

• Divide-and-conquer strategy
• Can be expressed recursively

Decision Tree Algorithm
Constructing a decision tree can be expressed recursively:
• Select an attribute to place at the root node
• Make one branch for each possible value, splitting the example set into subsets, one for every value of the attribute
• Repeat the process for each branch (recursion)
• Base case: stop if all instances have the same class, there are no more attributes to split on, or a pre-defined tree depth has been reached
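A rough Python sketch of this recursion is given below. It is only an illustration: the attribute to split on is picked naively (which attribute to select is the question taken up on a later slide), and the list-of-dictionaries data format is an assumption, not taken from the slides.

from collections import Counter

def majority_class(rows):
    return Counter(r["class"] for r in rows).most_common(1)[0][0]

def build_tree(rows, attributes, depth=0, max_depth=5):
    classes = {r["class"] for r in rows}
    # Base cases: pure node, no attributes left to split on, or depth limit reached.
    if len(classes) == 1 or not attributes or depth >= max_depth:
        return majority_class(rows)
    attr = attributes[0]                       # naive attribute selection, for brevity
    tree = {attr: {}}
    for value in {r[attr] for r in rows}:      # one branch per observed value
        subset = [r for r in rows if r[attr] == value]
        remaining = [a for a in attributes if a != attr]
        tree[attr][value] = build_tree(subset, remaining, depth + 1, max_depth)
    return tree

data = [
    {"outlook": "sunny", "windy": "false", "class": "no"},
    {"outlook": "sunny", "windy": "true",  "class": "no"},
    {"outlook": "rainy", "windy": "false", "class": "yes"},
    {"outlook": "rainy", "windy": "true",  "class": "no"},
]
print(build_tree(data, ["outlook", "windy"]))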

Recursion
Recursion is the process of defining something in terms of itself, with base cases that end the repetition.
Example: definition of a person's ancestors:
• One's parents are one's ancestors (base case)
• The ancestors of one's ancestors are also one's ancestors (recursion step)
Example: definition of the Fibonacci sequence:
• Fib(0) = 0 and Fib(1) = 1 (base cases)
• For all integers n > 1, Fib(n) = Fib(n-1) + Fib(n-2) (recursion step)
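The Fibonacci definition above translates directly into a recursive function; a minimal sketch:

def fib(n):
    if n == 0:          # base case
        return 0
    if n == 1:          # base case
        return 1
    return fib(n - 1) + fib(n - 2)   # recursion step

print([fib(n) for n in range(8)])    # [0, 1, 1, 2, 3, 5, 8, 13]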

Recursion

Which Attribute to Select?

Rules vs. Trees
The corresponding decision tree produces exactly the same predictions as the rule set.
A covering algorithm concentrates on one class value at a time, whereas a decision tree learner takes all class values into account.

Instance-Based Learning

• No structure is learned
• Given an instance to predict, simply predict the class of its nearest neighbor
• Alternatively, predict the class which appears most frequently among the k nearest neighbors
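A minimal sketch of k-nearest-neighbor prediction (assuming numeric attribute vectors and a caller-supplied distance function, such as the Euclidean distance defined on later slides):

from collections import Counter

def knn_predict(train, query, distance, k=1):
    """train is a list of (attribute_vector, class_label) pairs; k=1 gives plain nearest neighbor."""
    neighbors = sorted(train, key=lambda item: distance(item[0], query))[:k]
    return Counter(label for _, label in neighbors).most_common(1)[0][0]

train = [((1.0, 2.0), "a"), ((4.0, 5.0), "b"), ((1.5, 1.8), "a")]
euclid = lambda p, q: sum((pi - qi) ** 2 for pi, qi in zip(p, q)) ** 0.5
print(knn_predict(train, (1.2, 2.1), euclid, k=3))   # "a" (two of the three nearest are "a")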

Example
Predict the class value of the following instance:

Outlook   Temp   Humidity   Windy   Play
rainy     hot    normal     false   ?

Example
Predict the class value of the following instance (Temp and Humidity are not given):

Outlook   Temp   Humidity   Windy   Play
rainy                       false   ?

Manhattan Distance
In two dimensions, if p = (p1, p2) and q = (q1, q2):
d(p, q) = |p1 - q1| + |p2 - q2|

Euclidean Distance
The ordinary distance one would measure with a ruler; it follows from the Pythagorean Theorem.
In two dimensions, if p = (p1, p2) and q = (q1, q2):
d(p, q) = √((p1 - q1)² + (p2 - q2)²)

Euclidean Distance
In n dimensions:
d(p, q) = √((p1 - q1)² + (p2 - q2)² + … + (pn - qn)²)
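Both distance measures translate directly into short Python functions over points with n numeric coordinates; a minimal sketch:

import math

def manhattan(p, q):
    return sum(abs(pi - qi) for pi, qi in zip(p, q))

def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

print(manhattan((1, 2), (4, 6)))   # |1-4| + |2-6| = 7
print(euclidean((1, 2), (4, 6)))   # sqrt(9 + 16) = 5.0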