L6. Learning Systems in Java

Necessity of Learning
– An agent cannot be given prior knowledge about every situation it will face.
– Learning lets it adapt to changes in the environment.
– Learning lets it get better at a task through experience.

Some Forms of Learning
Rote learning
– Copy examples and exactly reproduce the behavior.
Parameter or weight adjustment
– Adjust weight factors over time.
Induction
– A process of learning by example: extract the important characteristics of the problem in order to generalize to novel situations or inputs.
– The key is that the examples are processed and automatically transformed into a knowledge representation.
– Used for classification or regression (prediction) problems.
Clustering
– Grouping examples and generalizing to new situations.
– Used for data mining.

Learning Paradigms
Supervised learning – programming by example
– The learning agent is trained by showing it examples of the problem state or attributes along with the desired output or action.
– The learning agent makes a prediction based on the inputs; if the output differs from the desired output, the agent is adjusted or adapted to produce the correct output (a minimal sketch follows this slide).
– E.g. the back-propagation neural network, a decision tree.
Unsupervised learning
– The learning agent needs to recognize similarities between inputs or to identify features in the input data.
– It partitions the data into groups.
– E.g. a Kohonen map.
Reinforcement learning
– A type of supervised learning in which the error information is less specific.
– Used when exact prior information about the desired output is not available.
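To make the "adjust when the output is wrong" idea concrete, here is a minimal, self-contained sketch (not part of the lecture code; all names are illustrative): a single perceptron-style unit whose weights are nudged whenever its prediction differs from the desired output, eventually learning the logical AND function.

// Minimal illustration of supervised "programming by example": a single
// threshold unit whose weights are adjusted whenever its output differs
// from the desired output. (Illustrative sketch, not the lecture's code.)
public class PerceptronDemo {
    public static void main(String[] args) {
        double[][] inputs = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
        double[] desired = { 0, 0, 0, 1 };           // learn logical AND
        double[] w = { 0.0, 0.0 };
        double bias = 0.0, rate = 0.1;

        for (int epoch = 0; epoch < 20; epoch++) {
            for (int i = 0; i < inputs.length; i++) {
                double sum = bias + w[0] * inputs[i][0] + w[1] * inputs[i][1];
                double out = (sum > 0) ? 1 : 0;      // prediction
                double error = desired[i] - out;     // feedback from the "teacher"
                bias += rate * error;                // adapt only when wrong
                w[0] += rate * error * inputs[i][0];
                w[1] += rate * error * inputs[i][1];
            }
        }
        System.out.println("w = " + w[0] + ", " + w[1] + ", bias = " + bias);
    }
}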

Classifier Systems
Classifier systems
– Introduced by John Holland as a way to add learning to rule-based systems.
– The learning mechanism is based on a technique known as genetic algorithms.
– The rule base is modified by applying genetic algorithms.
Genetic Algorithms
– Rules are represented as binary strings.
– The rules are modified by genetic operators (a small sketch of crossover and mutation follows this slide).
– The evaluation (fitness) function is the key of a genetic algorithm.
– The whole process is based on Darwin's evolutionary principle.
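As a small illustration of the genetic operators mentioned above (not part of the lecture code; class and method names are illustrative), the sketch below applies one-point crossover and bit mutation to rules encoded as binary strings. Fitness evaluation and selection are omitted.

// Illustrative sketch of genetic operators on binary rule strings.
import java.util.Random;

public class GeneticOperatorsDemo {
    static final Random rnd = new Random();

    // one-point crossover of two equal-length rule strings
    static String[] crossover(String a, String b) {
        int point = 1 + rnd.nextInt(a.length() - 1);
        return new String[] {
            a.substring(0, point) + b.substring(point),
            b.substring(0, point) + a.substring(point)
        };
    }

    // flip each bit with a small probability
    static String mutate(String s, double pMutate) {
        StringBuilder sb = new StringBuilder(s);
        for (int i = 0; i < sb.length(); i++) {
            if (rnd.nextDouble() < pMutate) {
                sb.setCharAt(i, sb.charAt(i) == '0' ? '1' : '0');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] children = crossover("110010", "001101");
        System.out.println(children[0] + " " + children[1]);
        System.out.println(mutate(children[0], 0.1));
    }
}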

Decision Trees
– Example data sets, classifiers and prediction models.
– Apply information theory, due to Shannon and Weaver (1949).
– The unit of information is a bit; the amount of information carried by a single binary answer is -log2 P(v), where P(v) is the probability of event v occurring.
Information needed for a correct answer:
  I(p/(p+n), n/(p+n)) = - (p/(p+n)) log2 (p/(p+n)) - (n/(p+n)) log2 (n/(p+n))
For example, with an even split of positive and negative examples, I(1/2, 1/2) = 1 bit (compare the "Info = 1.0" lines in the demo trace later).
Information contained in the remaining sub-trees after testing attribute A (summing over the values i of A):
  Remainder(A) = Σ_i ((p_i + n_i)/(p+n)) I(p_i/(p_i + n_i), n_i/(p_i + n_i))
Gain from testing attribute A:
  Gain(A) = I(p/(p+n), n/(p+n)) - Remainder(A)

Information Gain (an example)
Suppose that there are a total of 1000 customers, men renew 90 percent of the time, women renew 70 percent, and the customer set is made up half of men and half of women. Suppose also that we had grouped the customers' usage habits into 3 groups: under 4 hours a month, from 4 to 10 hours, and over 10. The customers are evenly split among the three groups. The first group renews at 50 percent, the second at 90 percent, and the third at 100 percent.
Information gain from testing whether a customer is male or female?
  Gain(Sex) = 1 - [(500/1000) I(450/500, 50/500) + (500/1000) I(350/500, 150/500)]
            = 1 - (0.5) I(0.9, 0.1) - (0.5) I(0.7, 0.3)
            ≈ 1 - 0.5 x 0.469 - 0.5 x 0.881 ≈ 0.325
Information gain from testing on the attribute usage?
  Gain(Usage) = 1 - [(1/3) I(1/2, 1/2) + (1/3) I(9/10, 1/10) + (1/3) I(1, 0)]
              ≈ 1 - (1/3) x 1.0 - (1/3) x 0.469 - (1/3) x 0.0 ≈ 0.510
Conclusion: Since Gain(Usage) > Gain(Sex), in building a decision tree it is better to first split the data on how much connect-time the customers used, and then on whether the customer is male or female.
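A short check of the arithmetic above, assuming the 1-bit baseline stated on the slide; the helper info is simply the two-class entropy I(p, q) (class and method names are illustrative, not from the lecture code):

// Verify the renewal example: entropy I(p, q) and the two information gains.
public class GainExample {
    static double info(double p, double q) {          // I(p, q) in bits
        double sum = 0.0;
        if (p > 0) sum -= p * Math.log(p) / Math.log(2);
        if (q > 0) sum -= q * Math.log(q) / Math.log(2);
        return sum;
    }

    public static void main(String[] args) {
        double gainSex   = 1.0 - 0.5 * info(0.9, 0.1) - 0.5 * info(0.7, 0.3);
        double gainUsage = 1.0 - (info(0.5, 0.5) + info(0.9, 0.1) + info(1.0, 0.0)) / 3.0;
        System.out.println("Gain(Sex)   = " + gainSex);    // about 0.325
        System.out.println("Gain(Usage) = " + gainUsage);  // about 0.510
    }
}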

Implementation of a Decision Tree (DecisionTree.txt)

// compute information content,
// given # of pos and neg examples
double computeInfo(int p, int n) {
    double total = p + n;
    double pos = p / total;
    double neg = n / total;
    double temp;
    if ((p == 0) || (n == 0)) {
        temp = 0.0;
    } else {
        temp = (-1.0 * (pos * Math.log(pos) / Math.log(2)))
               - (neg * Math.log(neg) / Math.log(2));
    }
    return temp;
}

// compute the information remaining after splitting on the given variable
double computeRemainder(Variable variable, Vector examples) {
    int positive[] = new int[variable.labels.size()];
    int negative[] = new int[variable.labels.size()];
    int index = variable.column;
    int classIndex = classVar.column;
    double sum = 0;
    double numValues = variable.labels.size();
    double numRecs = examples.size();
    for (int i = 0; i < numValues; i++) {
        String value = variable.getLabel(i);
        Enumeration e = examples.elements();   // "enum" is a reserved word in Java 5+
        while (e.hasMoreElements()) {
            String record[] = (String[]) e.nextElement();  // get next record
            if (record[index].equals(value)) {
                if (record[classIndex].equals("yes")) {
                    positive[i]++;
                } else {
                    negative[i]++;
                }
            }
        } /* endwhile */
        double weight = (positive[i] + negative[i]) / numRecs;
        double myrem = weight * computeInfo(positive[i], negative[i]);
        sum = sum + myrem;
    } /* endfor */
    return sum;
}

Implementation of a Decision Tree

// return the variable with most gain
Variable chooseVariable(Hashtable variables, Vector examples) {
    Enumeration e = variables.elements();      // "enum" is a reserved word in Java 5+
    double gain = 0.0, bestGain = 0.0;
    Variable best = null;
    int counts[];
    counts = getCounts(examples);
    int pos = counts[0];
    int neg = counts[1];
    double info = computeInfo(pos, neg);       // information before the split
    while (e.hasMoreElements()) {
        Variable tempVar = (Variable) e.nextElement();
        gain = info - computeRemainder(tempVar, examples);  // information gain
        if (gain > bestGain) {
            bestGain = gain;
            best = tempVar;
        }
    } /* endwhile */
    return best;
}
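For context, here is a rough outline of how chooseVariable could drive recursive, ID3-style tree building of the kind shown in the demo trace below. This is an illustrative sketch only, not the original DecisionTree class: Node, majorityLabel, subsetWithValue and withoutVariable are assumed helper types and methods that do not appear in the code above.

// Illustrative outline only -- not the lecture's DecisionTree class.
Node buildTree(Hashtable variables, Vector examples) {
    int counts[] = getCounts(examples);           // {positive, negative}
    if (counts[1] == 0) return new Node("yes");   // all examples positive -> leaf
    if (counts[0] == 0) return new Node("no");    // all examples negative -> leaf
    if (variables.isEmpty()) {
        return new Node(majorityLabel(counts));   // no attributes left -> majority leaf
    }
    Variable best = chooseVariable(variables, examples);  // highest information gain
    Node node = new Node(best);                   // interior node tests 'best'
    for (int i = 0; i < best.labels.size(); i++) {
        String value = best.getLabel(i);
        Vector subset = subsetWithValue(examples, best, value);  // records where best == value
        node.addBranch(value, buildTree(withoutVariable(variables, best), subset));
    }
    return node;
}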

Demo
A decision tree: C:\huang\DAI\L5_2004\learning\learn\appletTest.jpr
Example data: resttree.dat.txt
Variables: alternate, bar, FriSat, hungry, patrons, price, raining, reservation, rtype, waitEstimate, ClassField

Starting DecisionTree
Info = 1.0
waitEstimate gain =
raining gain = 0.0
hungry gain =
price gain =
FriSat gain =
bar gain = 0.0
patrons gain =
alternate gain = 0.0
rtype gain = E-16
reservation gain =
Choosing best variable: patrons
Subset - there are 4 records with patrons = some
Subset - there are 6 records with patrons = full
Info =
waitEstimate gain =
raining gain =
hungry gain =
price gain =
FriSat gain =
bar gain = 0.0
patrons gain = 0.0
alternate gain =
rtype gain =
reservation gain =
Choosing best variable: waitEstimate
Subset - there are 0 records with waitEstimate = 0-10
Subset - there are 2 records with waitEstimate = 30-60
Info = 1.0
waitEstimate gain = 0.0
raining gain = 0.0
hungry gain = 0.0
price gain = 0.0
FriSat gain = 1.0
bar gain = 1.0
patrons gain = 0.0
alternate gain = 0.0
rtype gain = 1.0
reservation gain = 0.0
Choosing best variable: FriSat
Subset - there are 1 records with FriSat = no
Subset - there are 1 records with FriSat = yes
Subset - there are 2 records with waitEstimate = 10-30

Info = 1.0
waitEstimate gain = 0.0
raining gain = 0.0
hungry gain = 0.0
price gain = 1.0
FriSat gain = 0.0
bar gain = 1.0
patrons gain = 0.0
alternate gain = 0.0
rtype gain = 1.0
reservation gain = 1.0
Choosing best variable: price
Subset - there are 1 records with price = $$$
Subset - there are 1 records with price = $
Subset - there are 0 records with price = $$
Subset - there are 2 records with waitEstimate = >60
Subset - there are 2 records with patrons = none

DecisionTree -- classVar = ClassField
Interior node - patrons
  Link - patrons=some
    Leaf node - yes
  Link - patrons=full
    Interior node - waitEstimate
      Link - waitEstimate=0-10
        Leaf node - yes
      Link - waitEstimate=30-60
        Interior node - FriSat
          Link - FriSat=no
            Leaf node - no
          Link - FriSat=yes
            Leaf node - yes
      Link - waitEstimate=10-30
        Interior node - price
          Link - price=$$$
            Leaf node - no
          Link - price=$
            Leaf node - yes
          Link - price=$$
            Leaf node - yes
      Link - waitEstimate=>60
        Leaf node - no
  Link - patrons=none
    Leaf node - no
Stopping DecisionTree - success!

Draw a decision tree!

Another Demo
C:\huang\DAI\L5_2004\learning\DecisionTreeApplet_3.20\source\DecisionTreeApplet.html
load: basketball
Algorithm -> set splitting function: gain

References
– miniconference/papers/swere.pdf
Suggestion: Make a presentation – Decision tree and a rule base.
(Optional) Apply decision tree learning to your rule-based system.