Decision Tree Learning Brought to you by Chris Creswell

Why learn about decision trees?
A practical way to get AI to adapt to the player – a simple form of user modeling
–Enhances replayability
–The player’s bot allies can be more effective
–Opponent bots can learn the player’s tactics, so the player can’t repeat the same strategy over and over

What we’ll learn
What is a decision tree
How do we build a decision tree
What has been done with decision trees in games
–What else can we do with them

What is a decision tree
Decision Tree Learning (DTL) is a form of inductive learning: it uses a training set of examples to create a hypothesis that draws general conclusions

What is a decision tree – terms/concepts
Attribute: a variable that we take into account in making a decision
Target attribute: the attribute whose value we want to predict – the decision is made based on it

What is a decision tree – an example

Example   Hour   Weather   Accident   Stall   Commute (target)
D1        8AM    Sunny     No         No      Long
D2        8AM    Cloudy    No         Yes     Long
D3        10AM   Sunny     No         No      Short
D4        9AM    Rainy     Yes        No      Long
D5        9AM    Sunny     Yes        Yes     Long
D6        10AM   Sunny     No         No      Short
D7        10AM   Cloudy    No         No      Short
D8        9AM    Rainy     No         No      Medium
D9        9AM    Sunny     Yes        No      Long
D10       10AM   Cloudy    Yes        Yes     Long
D11       10AM   Rainy     No         No      Short
D12       8AM    Cloudy    Yes        No      Long
D13       9AM    Sunny     No         No      Medium

What is a decision tree – an example

Hour?
  8AM  -> Long
  9AM  -> Accident?
            No  -> Medium
            Yes -> Long
  10AM -> Stall?
            No  -> Short
            Yes -> Long

What is a decision tree – how to use it
Given a set of circumstances (values of attributes), use them to traverse the tree from root to leaf
The leaf node is a decision
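
To make the traversal concrete, here is a minimal Python sketch (the nested tuple/dict encoding and function names are illustrative assumptions, not code from the original slides). It encodes the commute-time tree above and walks it from root to leaf:

commute_tree = ("Hour", {
    "8AM": "Long",
    "9AM": ("Accident", {"No": "Medium", "Yes": "Long"}),
    "10AM": ("Stall", {"No": "Short", "Yes": "Long"}),
})

def classify(tree, circumstances):
    # Internal nodes are (attribute, {branch value: subtree}); leaves are strings.
    while not isinstance(tree, str):
        attribute, branches = tree
        tree = branches[circumstances[attribute]]
    return tree  # the leaf node is the decision

# A 9AM commute with an accident but no stall is predicted to be Long:
print(classify(commute_tree, {"Hour": "9AM", "Accident": "Yes", "Stall": "No"}))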

Why is this useful
The hypothesis formed from the training set can be used to draw conclusions about sets of circumstances not present in the training set – it will generalize

How do we construct a decision tree?
Guiding principle of inductive learning:
–Occam’s razor – choose the simplest possible hypothesis that is consistent with the provided examples
General idea: recursively classify the examples based on one of the attributes until all examples have been used
Here’s the algorithm:

node LearnTree(examples, targetAttribute, attributes)
-- examples is the training set
-- targetAttribute is what to learn
-- attributes is the set of available attributes
-- returns a tree node
begin
  if all the examples have the same targetAttribute value
    return a leaf with that value
  else if the set of attributes is empty
    return a leaf with the most common targetAttribute value among examples
  else begin
    A = the “best” attribute among attributes, having a range of values v1, v2, …, vk
    Partition examples according to their value for A into sets S1, S2, …, Sk
    Create a decision node N with attribute A
    for i = 1 to k begin
      Attach a branch B to node N with test vi
      if Si has elements (is non-empty)
        attach B to LearnTree(Si, targetAttribute, attributes – {A})
      else
        attach B to a leaf node with the most common targetAttribute value
    end
    return decision node N
  end
end
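
As a runnable companion to the pseudo-code, here is a minimal Python sketch (an illustration under assumptions, not the original implementation: examples are dicts mapping attribute names to values, and the “best”-attribute choice is passed in as a function, since that is exactly the step the pseudo-code abstracts away). One simplification: branches are created only for attribute values that occur in the examples, so the empty-partition case never arises:

from collections import Counter

def most_common_value(examples, target):
    return Counter(ex[target] for ex in examples).most_common(1)[0][0]

def learn_tree(examples, target, attributes, choose_best):
    # Returns a leaf value, or (attribute, {value: subtree}) for a decision node.
    values = {ex[target] for ex in examples}
    if len(values) == 1:                 # all examples agree: return a leaf
        return values.pop()
    if not attributes:                   # inconsistent data: guess the most common value
        return most_common_value(examples, target)
    a = choose_best(examples, target, attributes)
    remaining = [x for x in attributes if x != a]
    branches = {}
    for v in {ex[a] for ex in examples}:     # partition the examples on A
        subset = [ex for ex in examples if ex[a] == v]
        branches[v] = learn_tree(subset, target, remaining, choose_best)
    return (a, branches)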

This is how we construct a decision tree
This very simple pseudo-code implements the construction of a decision tree, except for one key step that is abstracted away: choosing the “best” attribute to classify on
One algorithm for doing this is ID3 (used in Black & White)
–We’ll get to the algorithm in a bit

This is how we construct a decision tree – pseudo-code walkthrough
First, LearnTree is called with all examples, the targetAttribute, and all attributes to classify on
It chooses the “best” (we’ll get to that) attribute to split on, creates a decision node for it, then recursively calls LearnTree for each partition of the examples

This is how we construct a decision tree – pseudo-code walkthrough
Recursion stops when:
–All examples have the same target value
–There are no more attributes
–There are no more examples
The first two need some explanation; the third is trivial – all examples have been classified

This is how we construct a decision tree – pseudo-code walkthrough
Recursion stops when all examples have the same target value – when does this happen?
–When the ancestor attributes and corresponding branch values, as well as the target attribute’s value, are the same across the remaining examples

This is how we construct a decision tree – pseudo-code walkthrough
Recursion stops when there are no more attributes
–This happens when the training set is inconsistent, e.g. 2 or more examples have identical values for every attribute but different values for the target attribute
–The way our pseudo-code is written, it guesses when this happens: it picks the most popular target attribute value
–This is a decision left up to the implementer
–This is a weakness of the algorithm: it doesn’t handle “noise” in its training set well

This is how we construct a decision tree – pseudo-code walkthrough
Let’s watch the algorithm in action: …ng/DecisionTrees/InterArticle/2-DecisionTree.html

ID3 algorithm
Picks the best attribute to classify on in a call of LearnTree; it does so by quantifying how useful each attribute would be with respect to the remaining examples
How? Using Shannon’s information theory: pick the attribute that yields the best reduction in entropy

ID3 algorithm – Shannon’s Information Theory
Choose the attribute that yields the best reduction in entropy
Entropy quantifies the variation in a set of examples with respect to the target attribute values
A set of examples with mostly the same targetAttribute value has very low entropy (that’s good)
A set of examples with many varying targetAttribute values will have high entropy (bad)
Ready? Here come some equations …

ID3: Shannon’s Information Theory
In the following, S is the set of examples and S_i is the subset of S with value V_i under the target attribute:

Entropy(S) = \sum_{i=1}^{n} -\frac{|S_i|}{|S|} \log_2 \frac{|S_i|}{|S|}

ID3: Shannon’s Information Theory
The expected entropy of a candidate attribute A is the weighted sum of the entropies of the subsets into which A partitions the examples
In the following, k is the size of the range of attribute A, and here S_i is the subset of examples for which A takes its i-th value:

ExpectedEntropy(S, A) = \sum_{i=1}^{k} \frac{|S_i|}{|S|} \, Entropy(S_i)

ID3: Shannon’s Information Theory
What we really want is to maximize information gain, defined:

Gain(S, A) = Entropy(S) - ExpectedEntropy(S, A)

ID3: Shannon’s Information Theory
Entropy of the commute time example:

Entropy(S) = -\frac{4}{13}\log_2\frac{4}{13} - \frac{2}{13}\log_2\frac{2}{13} - \frac{7}{13}\log_2\frac{7}{13} \approx 1.42

The thirteens are there because there are thirteen examples. The four, two, and seven count how many Short, Medium, and Long commutes there are, respectively.

ID3: Shannon’s Information Theory

Attribute   Expected Entropy   Info Gain
Hour        0.65               0.77
Weather     1.29               0.13
Accident    0.92               0.50
Stall       1.17               0.25

Hour has the highest information gain, so it becomes the root of the tree
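
These numbers can be reproduced with a short Python sketch (illustrative, not from the slides), applying the entropy and gain definitions to the thirteen commute examples:

from math import log2
from collections import Counter

def entropy(examples, target):
    # -sum over target values of p_i * log2(p_i)
    counts = Counter(ex[target] for ex in examples)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

def expected_entropy(examples, target, attribute):
    # Weighted sum of the entropies of the subsets induced by the attribute
    total = len(examples)
    result = 0.0
    for v in {ex[attribute] for ex in examples}:
        subset = [ex for ex in examples if ex[attribute] == v]
        result += len(subset) / total * entropy(subset, target)
    return result

def info_gain(examples, target, attribute):
    return entropy(examples, target) - expected_entropy(examples, target, attribute)

COLS = ("Hour", "Weather", "Accident", "Stall", "Commute")
ROWS = [("8AM", "Sunny", "No", "No", "Long"), ("8AM", "Cloudy", "No", "Yes", "Long"),
        ("10AM", "Sunny", "No", "No", "Short"), ("9AM", "Rainy", "Yes", "No", "Long"),
        ("9AM", "Sunny", "Yes", "Yes", "Long"), ("10AM", "Sunny", "No", "No", "Short"),
        ("10AM", "Cloudy", "No", "No", "Short"), ("9AM", "Rainy", "No", "No", "Medium"),
        ("9AM", "Sunny", "Yes", "No", "Long"), ("10AM", "Cloudy", "Yes", "Yes", "Long"),
        ("10AM", "Rainy", "No", "No", "Short"), ("8AM", "Cloudy", "Yes", "No", "Long"),
        ("9AM", "Sunny", "No", "No", "Medium")]
examples = [dict(zip(COLS, row)) for row in ROWS]

print(round(entropy(examples, "Commute"), 2))               # about 1.42
for a in ("Hour", "Weather", "Accident", "Stall"):
    print(a, round(info_gain(examples, "Commute", a), 2))   # Hour wins, ~0.77
# Plugged into the earlier learn_tree sketch as choose_best,
# max(attributes, key=lambda a: info_gain(examples, target, a)) gives ID3.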

ID3: Drawbacks
Does not guarantee the smallest possible decision tree
–It selects the classifying attribute based on best expected information gain, which is not always the right choice
Not very good with continuous values; works best with symbolic data
–When given lots of distinct continuous values, ID3 will create very “bushy” trees – only 1 or 2 levels deep, with lots and lots of leaves
–We can make this less serious, but it’s still a drawback

Decision Trees in games
The first successful use of a decision tree in a game was in “Black & White” (Lionhead Studios, 2001)
“In Black & White you can be the god you want to be. Will you rule with a fair hand, making life better for your people? Or will you be evil and scare them into prayer and submission? No one can tell you which way to be. You, as a god, can play the game any way you choose.”

Decision Trees in games
“And as a god, you get to own a Creature. Chosen by you from magical, special animals, your Creature will copy you, you will teach him and he will learn by himself. He will grow, ultimately to 30 metres, and can do anything you can do in the game. Your Creature can help the people or can kill and eat them. He can cast Miracles to bring rain to their crops or he can drown them in the sea. Your Creature is your physical manifestation in the world of Eden, He is whatever you want him to be.... And the game also boasts a new level of artificial intelligence. Your Creature is almost a living, breathing thing. He learns, remembers and makes connections. His huge range of abilities and decisions is born of a ground-breakingly powerful and complex AI system.”

Decision Trees in games
So you teach your creature by giving it feedback – it learns to perform actions that get it the highest feedback
Problem: feedback is a continuous variable
We have to make it discrete
We do so using K-means clustering

Decision Trees in games
In K-means clustering, we decide how many clusters we want to create, then use an algorithm to successively associate or dissociate instances with clusters until the associations stabilize around k clusters (sketched below)
The author’s reference for this is a computer vision textbook
–I wasn’t about to go buy it
It’s not important to know the clustering algorithm in detail
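
Here is a minimal one-dimensional k-means sketch in Python (a generic, assumed version of the technique – not Lionhead’s code), clustering scalar feedback values around k centers:

import random

def kmeans_1d(values, k, iterations=100):
    centers = random.sample(values, k)       # seed the k centers from the data
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:                     # associate each value with its nearest center
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            clusters[nearest].append(v)
        new_centers = [sum(c) / len(c) if c else centers[i]
                       for i, c in enumerate(clusters)]
        if new_centers == centers:           # associations have stabilized
            break
        centers = new_centers
    return centers, clusters

# Feedback values from the B&W example below; with k = 4 this tends to find
# centers near -1, 0.4, 0.1 and -0.3 (random seeding: rerun if it gets stuck).
print(kmeans_1d([-1.0, 0.4, -1.0, -0.2, -1.0, 0.2, -0.4, 0.0, -1.0], 4))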

Decision Trees in games
Example from B&W: should your creature attack a town?
Examples:

Example   Allegiance   Defense   Tribe    Feedback
D1        Friendly     Weak      Celtic   -1.0
D2        Enemy        Weak      Celtic    0.4
D3        Friendly     Strong    Norse    -1.0
D4        Enemy        Strong    Norse    -0.2
D5        Friendly     Weak      Greek    -1.0
D6        Enemy        Medium    Greek     0.2
D7        Enemy        Strong    Greek    -0.4
D8        Enemy        Medium    Aztec     0.0
D9        Friendly     Weak      Aztec    -1.0

Decision Trees in games
If we ask for 4 clusters, K-means clustering will create clusters around -1, 0.4, 0.1, and -0.3. The memberships in these clusters will be {D1, D3, D5, D9}, {D2}, {D6, D8}, {D4, D7} respectively.
The tree ID3 will create using these examples and clusters:

Decision Trees in games

Allegiance?
  Friendly -> -1
  Enemy    -> Defense?
                Weak   -> 0.4
                Medium -> 0.1
                Strong -> -0.3

Decision Trees in games
So in this case, the tree the creature learned can be reduced to a nice compact logical expression describing when to attack:
((Allegiance = Enemy) AND (Defense = Weak)) OR ((Allegiance = Enemy) AND (Defense = Medium))
This happens sometimes
It makes the tree easier and more efficient to apply
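
The reduction can be automated too. A small illustrative sketch (reusing the tuple/dict tree encoding from the earlier sketches, and reading “positive feedback” as “attack”) collects the root-to-leaf paths whose leaf value is positive:

def positive_rules(tree, path=()):
    if not isinstance(tree, tuple):          # leaf: a feedback value
        if tree > 0:
            return ["(" + " AND ".join(f"({a} = {v})" for a, v in path) + ")"]
        return []
    attribute, branches = tree
    rules = []
    for value, subtree in branches.items():
        rules += positive_rules(subtree, path + ((attribute, value),))
    return rules

attack_tree = ("Allegiance", {
    "Friendly": -1.0,
    "Enemy": ("Defense", {"Weak": 0.4, "Medium": 0.1, "Strong": -0.3}),
})
print(" OR ".join(positive_rules(attack_tree)))
# ((Allegiance = Enemy) AND (Defense = Weak)) OR ((Allegiance = Enemy) AND (Defense = Medium))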

An Extension to ID3 to better handle continuous values
Seems simple – just use an inequality, right?
Not that simple – we need to pick cut points
Cut points are the boundaries we create for our inequalities – where do they go?
Key insight: optimal cut points must always reside at boundary points
Okay, so what are boundary points?

An Extension to ID3 to better handle continuous values
If we sort the list of examples according to their values of the candidate attribute, a boundary point is a value in this list between 2 adjacent instances that have different values of the target attribute
In the worst case, the number of boundary points is about equal to the number of instances
–This happens if the target attribute oscillates back and forth between good and bad
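
A minimal Python sketch of the boundary-point idea (illustrative; the data and names are made up): sort by the continuous candidate attribute and record a midpoint wherever the target value changes between adjacent instances:

def boundary_points(examples, attribute, target):
    # A cut point is the midpoint between two adjacent examples (sorted by
    # the continuous attribute) whose target values differ.
    ordered = sorted(examples, key=lambda ex: ex[attribute])
    cuts = []
    for prev, cur in zip(ordered, ordered[1:]):
        if prev[target] != cur[target] and prev[attribute] != cur[attribute]:
            cuts.append((prev[attribute] + cur[attribute]) / 2)
    return cuts

# Hypothetical data: commute distance in km vs. commute length
data = [{"km": 3, "len": "Short"}, {"km": 5, "len": "Short"},
        {"km": 9, "len": "Medium"}, {"km": 14, "len": "Long"}]
print(boundary_points(data, "km", "len"))    # [7.0, 11.5]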

Example software on CD
Show an example made using the software on the CD

Conclusions
Decision Trees are an elegant way of learning – it is easy to expose their logic and understand what they have learned
Decision Trees are not always the best way to learn – they have some weaknesses
But they also have their own set of strengths

Conclusions
Decision Trees work best for symbolic, discrete values
They can be extended to work with continuous values
B&W had to do some clustering of feedback values to use decision trees

Conclusions
Up to now, the only use of Decision Trees in games has been in B&W
What are they good for?
–User modeling – teaching the computer how to react to the player; enhances replayability
–Can be used to make bots that are the player’s allies more effective, as in B&W
–Could also make enemies more intelligent – the player would be forced to come up with new strategies
How else can they be used?
–This is relatively unexplored territory, people – if you think you have a great idea, go for it