
1 L6. Learning Systems in Java

2 Necessity of Learning
– No prior knowledge about all of the situations.
– Being able to adapt to changes in the environment.
– Getting better at a task through experience.

3 Some Forms of Learning
– Rote learning: copy examples and exactly reproduce the behavior.
– Parameter or weight adjustment: adjust weight factors over time.
– Induction: a process of learning by example, extracting the important characteristics of the problem in order to generalize to novel situations or inputs. The key is that the examples are processed and automatically transformed into a knowledge representation. Used for classification or regression (prediction) problems.
– Clustering: grouping examples and generalizing to new situations. Used for data mining.

4 Learning Paradigms
– Supervised learning (programming by example): the learning agent is trained by showing it examples of the problem state or attributes along with the desired output or action. The agent makes a prediction based on the inputs, and if the output differs from the desired output, the agent is adjusted or adapted to produce the correct output (see the sketch after this list). Examples: the back-propagation neural network, a decision tree.
– Unsupervised learning: the learning agent needs to recognize similarities between inputs or to identify features in the input data, partitioning the data into groups. Example: a Kohonen map.
– Reinforcement learning: a type of supervised learning in which the error information is less specific; exact prior information about the desired output is not available.
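To make the adjust-on-error idea concrete, here is a minimal, self-contained Java sketch of supervised weight adjustment using a single perceptron-style unit. The AND training data and the learning rate are illustrative assumptions, not values from the slides.

// Supervised learning sketch: a perceptron-style unit whose weights are
// nudged whenever its prediction differs from the desired output.
public class PerceptronDemo {
    public static void main(String[] args) {
        double[][] inputs = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
        int[] targets = { 0, 0, 0, 1 };          // desired outputs: logical AND
        double[] w = new double[2];
        double bias = 0.0, rate = 0.1;

        for (int epoch = 0; epoch < 20; epoch++) {
            for (int i = 0; i < inputs.length; i++) {
                double sum = bias + w[0] * inputs[i][0] + w[1] * inputs[i][1];
                int predicted = (sum > 0) ? 1 : 0;
                int error = targets[i] - predicted;   // nonzero only on a miss
                // Adapt: move the weights toward producing the correct output.
                w[0] += rate * error * inputs[i][0];
                w[1] += rate * error * inputs[i][1];
                bias += rate * error;
            }
        }
        System.out.printf("w = [%.2f, %.2f], bias = %.2f%n", w[0], w[1], bias);
    }
}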

5 Classifier Systems
– Classifier systems were introduced by John Holland as a way to introduce learning to rule-based systems.
– The mechanism is based on a technique known as genetic algorithms: the rule base is modified by applying genetic operators.
– Genetic Algorithms:
  – The rules are represented as binary strings.
  – The rules are modified by genetic operators (see the sketch after this list).
  – The evaluation (fitness) function is the key to a genetic algorithm.
  – The whole process is based on Darwin's evolutionary principle.
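As a concrete illustration of the operators mentioned above, here is a minimal Java sketch of one-point crossover and bit-flip mutation on rules encoded as binary strings. The string length and mutation probability are illustrative assumptions, not values from the slides.

import java.util.Random;

// Sketch of the two core genetic operators on fixed-length bit-string rules.
public class GeneticOps {
    static final Random RNG = new Random();

    // One-point crossover: children swap tails after a random cut point.
    static String[] crossover(String a, String b) {
        int cut = 1 + RNG.nextInt(a.length() - 1);
        return new String[] {
            a.substring(0, cut) + b.substring(cut),
            b.substring(0, cut) + a.substring(cut)
        };
    }

    // Mutation: flip each bit independently with a small probability.
    static String mutate(String s, double pFlip) {
        StringBuilder out = new StringBuilder(s);
        for (int i = 0; i < out.length(); i++) {
            if (RNG.nextDouble() < pFlip) {
                out.setCharAt(i, out.charAt(i) == '0' ? '1' : '0');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String[] kids = crossover("11110000", "00001111");
        System.out.println(kids[0] + " " + kids[1]);
        System.out.println(mutate("11110000", 0.1));
    }
}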

6 Decision Trees
– Built from example data sets; serve as classifiers and prediction models.
– Apply information theory (Shannon and Weaver, 1949).
– The unit of information is a bit, and the amount of information in a single binary answer is -log2 P(v), where P(v) is the probability of event v occurring.
– Information needed for a correct answer, given p positive and n negative examples:
  I(p/(p+n), n/(p+n)) = -(p/(p+n)) log2(p/(p+n)) - (n/(p+n)) log2(n/(p+n))
– Information contained in the remaining subtrees after splitting on attribute A:
  Remainder(A) = Σ_i ((p_i + n_i)/(p+n)) · I(p_i/(p_i + n_i), n_i/(p_i + n_i))
– Information gained by testing attribute A:
  Gain(A) = I(p/(p+n), n/(p+n)) - Remainder(A)

7 Information Gain (an example)
Suppose there are a total of 1000 customers; men renew 90 percent of the time, women renew 70 percent, and the customer set is made up half of men and half of women.

What is the information gain from testing whether a customer is male or female?
Gain(Sex) = 1 - [(500/1000) I(450/500, 50/500) + (500/1000) I(350/500, 150/500)]
          = 1 - (0.5) I(0.9, 0.1) - (0.5) I(0.7, 0.3)
          = 1 - 0.5 x 0.468996 - 0.5 x 0.881291
          = 0.324857

Now suppose we had grouped the customers' usage habits into 3 groups: under 4 hours a month, from 4 to 10 hours, and over 10. The customers are evenly split among the three groups; the first group renews at 50 percent, the second at 90 percent, and the third at 100 percent.

What is the information gain from testing on the attribute Usage?
Gain(Usage) = 1 - [(1/3) I(1/2, 1/2) + (1/3) I(9/10, 1/10) + (1/3) I(1, 0)]
            = 1 - (1/3) x 1.0 - (1/3) x 0.468996 - (1/3) x 0
            = 0.510335

(Note that I(1, 0) = 0: a group that always renews carries no uncertainty, matching the computeInfo code on the next slide. The constant 1 stands in for the information content of the parent set, strictly I(0.8, 0.2) = 0.721928, but a constant baseline does not change which attribute wins.)

Conclusion: Usage yields the larger gain, so in building this decision tree it is better to first split the data on how much connect-time the customers used, and then on whether the customer is male or female.
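These figures can be checked mechanically. Below is a small sketch that recomputes both gains with the same logic as the computeInfo method on the next slide; the class name GainCheck and the helper info are illustrative, and only the counts come from the example above.

// Check the information-gain arithmetic for the renewal example.
public class GainCheck {
    // Information content of a p-positive / n-negative split, in bits.
    static double info(int p, int n) {
        if (p == 0 || n == 0) return 0.0;
        double pos = (double) p / (p + n), neg = (double) n / (p + n);
        return -pos * Math.log(pos) / Math.log(2) - neg * Math.log(neg) / Math.log(2);
    }

    public static void main(String[] args) {
        // Sex: 500 men (450 renew), 500 women (350 renew).
        double remainderSex = 0.5 * info(450, 50) + 0.5 * info(350, 150);
        // Usage: three equal groups renewing at 50%, 90%, and 100%.
        double remainderUsage = (info(1, 1) + info(9, 1) + info(1, 0)) / 3.0;
        System.out.printf("Gain(Sex)   = %.6f%n", 1.0 - remainderSex);   // 0.324857
        System.out.printf("Gain(Usage) = %.6f%n", 1.0 - remainderUsage); // 0.510335
    }
}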

8 Implementation of a Decision Tree (DecisionTree.txt)

// Compute the information content, in bits,
// given the number of positive and negative examples.
double computeInfo(int p, int n) {
    double total = p + n;
    double pos = p / total;
    double neg = n / total;
    double temp;
    if ((p == 0) || (n == 0)) {
        temp = 0.0;
    } else {
        temp = (-1.0 * (pos * Math.log(pos) / Math.log(2)))
               - (neg * Math.log(neg) / Math.log(2));
    }
    return temp;
}

// Compute the expected information remaining after
// splitting the examples on the given variable.
double computeRemainder(Variable variable, Vector examples) {
    int positive[] = new int[variable.labels.size()];
    int negative[] = new int[variable.labels.size()];
    int index = variable.column;
    int classIndex = classVar.column;
    double sum = 0;
    double numValues = variable.labels.size();
    double numRecs = examples.size();
    for (int i = 0; i < numValues; i++) {
        String value = variable.getLabel(i);
        Enumeration records = examples.elements(); // "enum" is a reserved word in Java 5+
        while (records.hasMoreElements()) {
            String record[] = (String[]) records.nextElement(); // get next record
            if (record[index].equals(value)) {
                if (record[classIndex].equals("yes")) {
                    positive[i]++;
                } else {
                    negative[i]++;
                }
            }
        } /* endwhile */
        double weight = (positive[i] + negative[i]) / numRecs;
        double myrem = weight * computeInfo(positive[i], negative[i]);
        sum = sum + myrem;
    } /* endfor */
    return sum;
}

9 Implementation of a Decision Tree

// Return the variable with the most information gain.
Variable chooseVariable(Hashtable variables, Vector examples) {
    Enumeration vars = variables.elements(); // "enum" is a reserved word in Java 5+
    double gain = 0.0, bestGain = 0.0;
    Variable best = null;
    int counts[];
    counts = getCounts(examples);
    int pos = counts[0];
    int neg = counts[1];
    double info = computeInfo(pos, neg);
    while (vars.hasMoreElements()) {
        Variable tempVar = (Variable) vars.nextElement();
        gain = info - computeRemainder(tempVar, examples);
        if (gain > bestGain) {
            bestGain = gain;
            best = tempVar;
        }
    } /* endwhile */
    return best;
}
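For context, chooseVariable is the heart of an ID3-style recursive construction. The following is a hedged sketch of how such a buildTree loop might use it; Node, majorityClass, and subset are assumed helpers for illustration, not the book's actual API.

// Sketch of the recursive ID3 loop that chooseVariable supports.
// Node, majorityClass, and subset are assumed helpers.
Node buildTree(Hashtable variables, Vector examples) {
    int[] counts = getCounts(examples);          // [positive, negative]
    if (counts[1] == 0) return Node.leaf("yes"); // all examples positive
    if (counts[0] == 0) return Node.leaf("no");  // all examples negative
    if (variables.isEmpty()) return Node.leaf(majorityClass(counts));

    Variable best = chooseVariable(variables, examples); // max info gain
    Node node = Node.interior(best);
    Hashtable remaining = (Hashtable) variables.clone();
    remaining.remove(best.name);

    // One branch per value of the chosen variable, built recursively
    // from the matching subset of the examples.
    for (int i = 0; i < best.labels.size(); i++) {
        String value = best.getLabel(i);
        Vector matching = subset(examples, best.column, value);
        node.addLink(value, matching.isEmpty()
                ? Node.leaf(majorityClass(counts))
                : buildTree(remaining, matching));
    }
    return node;
}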

10 Demo
A decision tree: C:\huang\DAI\L5_2004\learning\learn\appletTest.jpr
Example data: resttree.dat.txt
Variables: alternate, bar, FriSat, hungry, patrons, price, raining, reservation, rtype, waitEstimate, ClassField

11 Starting DecisionTree
Info = 1.0
waitEstimate gain = 0.20751874963942196
raining gain = 0.0
hungry gain = 0.19570962879973086
price gain = 0.19570962879973075
FriSat gain = 0.020720839623907805
bar gain = 0.0
patrons gain = 0.5408520829727552
alternate gain = 0.0
rtype gain = 1.1102230246251565E-16
reservation gain = 0.020720839623907805
Choosing best variable: patrons
Subset - there are 4 records with patrons = some
Subset - there are 6 records with patrons = full
Info = 0.9182958340544896
waitEstimate gain = 0.2516291673878229
raining gain = 0.10917033867559889
hungry gain = 0.2516291673878229
price gain = 0.2516291673878229
FriSat gain = 0.10917033867559889
bar gain = 0.0
patrons gain = 0.0
alternate gain = 0.10917033867559889
rtype gain = 0.2516291673878229
reservation gain = 0.2516291673878229
Choosing best variable: waitEstimate
Subset - there are 0 records with waitEstimate = 0-10
Subset - there are 2 records with waitEstimate = 30-60
Info = 1.0
waitEstimate gain = 0.0
raining gain = 0.0
hungry gain = 0.0
price gain = 0.0
FriSat gain = 1.0
bar gain = 1.0
patrons gain = 0.0
alternate gain = 0.0
rtype gain = 1.0
reservation gain = 0.0
Choosing best variable: FriSat
Subset - there are 1 records with FriSat = no
Subset - there are 1 records with FriSat = yes
Subset - there are 2 records with waitEstimate = 10-30

12 Info = 1.0
waitEstimate gain = 0.0
raining gain = 0.0
hungry gain = 0.0
price gain = 1.0
FriSat gain = 0.0
bar gain = 1.0
patrons gain = 0.0
alternate gain = 0.0
rtype gain = 1.0
reservation gain = 1.0
Choosing best variable: price
Subset - there are 1 records with price = $$$
Subset - there are 1 records with price = $
Subset - there are 0 records with price = $$
Subset - there are 2 records with waitEstimate = >60
Subset - there are 2 records with patrons = none

DecisionTree -- classVar = ClassField
Interior node - patrons
  Link - patrons=some
    Leaf node - yes
  Link - patrons=full
    Interior node - waitEstimate
      Link - waitEstimate=0-10
        Leaf node - yes
      Link - waitEstimate=30-60
        Interior node - FriSat
          Link - FriSat=no
            Leaf node - no
          Link - FriSat=yes
            Leaf node - yes
      Link - waitEstimate=10-30
        Interior node - price
          Link - price=$$$
            Leaf node - no
          Link - price=$
            Leaf node - yes
          Link - price=$$
            Leaf node - yes
      Link - waitEstimate=>60
        Leaf node - no
  Link - patrons=none
    Leaf node - no
Stopping DecisionTree - success!
Draw a decision tree!

13 Another Demo
C:\huang\DAI\L5_2004\learning\DecisionTreeApplet_3.20\source\DecisionTreeApplet.html
Load the data set: basketball
Algorithm -> set splitting function: gain

14 References
1. http://www.cs.ualberta.ca/~aixplore/learning/DecisionTrees/
2. http://www.cse.unsw.edu.au/~billw/cs9414/notes/ml/06prop/id3/id3.html
3. http://nand.net/~paras/genetic_decision_trees/
4. http://www.mindtools.com/dectree.html
5. http://www.cs.ubc.ca/labs/lci/CIspace/download.html
6. http://www.cis.temple.edu/~ingargio/cis587/readings/id3-c45.html
7. http://www.lboro.ac.uk/departments/el/research/esc-miniconference/papers/swere.pdf
8. http://www.pitt.edu/~jduffy/econ1200/LectNotesWk8.pdf
9. http://www.aaai.org/AITopics/html/trees.html
10. http://www2.cs.uregina.ca/~hamilton/courses/831/notes/ml/dtrees/4_dtrees3.html
11. http://www2.cs.uregina.ca/~dbd/cs831/notes/ml/dtrees/c4.5/tutorial.html
12. http://www.netnam.vn/unescocourse/knowlegde/3-3.htm

Suggestion: Make a presentation on decision trees and a rule base. (Optional) Apply decision tree learning to your rule-based system.

