International Graduate School of Dynamic Intelligent Systems Machine Learning RG Knowledge Based Systems Hans Kleine Büning 15 July 2015.



Hans Kleine Büning, 9 January, RG Knowledge Based Systems, University of Paderborn, International Graduate School of Dynamic Intelligent Systems

Outline
- Learning by Example
  - Motivation
  - Decision Trees
  - ID3
  - Overfitting
  - Pruning
  - Exercise
- Reinforcement Learning
  - Motivation
  - Markov Decision Processes
  - Q-Learning
  - Exercise


Motivation
- Partly inspired by human learning
- Objectives:
  - Classify entities according to some given examples
  - Find structures in big databases
  - Gain new knowledge from the samples
- Input: learning examples with assigned attributes and assigned classes
- Output: a general classifier for the given task

Classifying Training Examples
- Training example for EnjoySport
- General training examples

Attributes & Classes
- Attribute: A_i
- Number of different values for A_i: |A_i|
- Class: C_i
- Number of different classes: |C|
- Premises:
  - n > 2
  - Consistent examples (no two objects with the same attributes and different classes)

Possible Solutions
- Decision Trees (ID3, C4.5, CART)
- Rule Based Systems
- Clustering
- Neural Networks (Backpropagation, Neuroevolution)

Decision Trees
- Idea: classify entities using if-then rules
- Example: classifying mushrooms
  - Attributes: Colour, Size, Points
  - Classes: eatable, poisonous
- Resulting rules:
  - if (Colour = red) and (Size = small) then poisonous
  - if (Colour = green) then eatable
  - …
- Training table (flattened in the transcript): Colour: red, brown, green, red; Size: small, big, small, big; Points: yes, no, yes, no; Class: poisonous, eatable, … (remaining class entries lost)

Decision Trees (2)
- Different decision trees exist for the same task.
- On average, the left tree decides earlier.

How to Measure Tree Quality?
- Number of leaves? (= number of generated rules)
- Tree height? (= maximum rule length)
- External path length? (= sum of the lengths of all paths from root to leaf; amount of memory needed for all rules)
- Weighted external path length: like external path length, but paths are weighted by the number of objects they represent
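The four measures above can be computed in one traversal. A minimal sketch, using a hypothetical tree encoding (leaves carry the number of objects they represent; the tree shape below is made up, not one of the trees from the slides):

```python
# Sketch: the four tree-quality measures on a toy decision tree.
# Leaf: ([], n_objects); inner node: ([child, ...], 0).

def metrics(node, depth=0):
    """Return (leaves, height, external path length, weighted EPL)."""
    children, count = node
    if not children:                          # leaf node
        return 1, depth, depth, depth * count
    totals = [metrics(c, depth + 1) for c in children]
    return (sum(t[0] for t in totals),        # number of leaves
            max(t[1] for t in totals),        # tree height
            sum(t[2] for t in totals),        # external path length
            sum(t[3] for t in totals))        # weighted external path length

# One leaf (2 objects) at depth 1, two leaves (1 object each) at depth 2
tree = ([([], 2), ([([], 1), ([], 1)], 0)], 0)
print(metrics(tree))  # (3, 2, 5, 6)
```

The weighted measure differs from the plain external path length exactly when the leaves hold unequal numbers of objects, which is why the two criteria can rank trees differently.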

Back to the Example

Criterion                     | Left Tree | Right Tree
number of leaves              | 4         | 5
height                        | 2         | 2
external path length          | 6         | 5
weighted external path length | 7         | 8

Weighted External Path Length
- Idea from information theory:
  - Given: a text to be compressed and probabilities of character occurrence
  - Result: a coding tree
- Example: eeab
  - p(e) = 0.5, p(a) = 0.25, p(b) = 0.25
- Encoding: build the tree according to the information content.

Entropy
- Entropy = measure of mean information content
- In general: the mean number of bits needed to encode each element under an optimal encoding (= mean height of the theoretically optimal encoding tree)
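For the eeab example above, the entropy works out to 1.5 bits per character; a minimal numeric check:

```python
# Entropy H = -sum(p * log2(p)) of a probability distribution.
from math import log2

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

# p(e)=0.5, p(a)=0.25, p(b)=0.25  ->  0.5*1 + 0.25*2 + 0.25*2 = 1.5 bits
print(entropy([0.5, 0.25, 0.25]))  # 1.5
```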

Information Gain
- Information gain = expected reduction of entropy due to sorting
- Conditional entropy: (formula lost in transcription)
- Information gain: (formula lost in transcription)
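The two formulas, which did not survive transcription, are presumably the standard definitions, consistent with the quantities used on the following slides:

```latex
H(C \mid A) = \sum_{i} P(A = a_i)\, H(C \mid A = a_i)
            = -\sum_{i} P(A = a_i) \sum_{j} P(C_j \mid a_i) \log_2 P(C_j \mid a_i)

\mathit{Gain}(C, A) = H(C) - H(C \mid A)
```

Since H(C) is fixed for a given node, maximising the gain is the same as minimising the conditional entropy.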

Entropy & Decision Trees
- Use conditional entropy and information gain to select split attributes.
- Chosen split attribute: A_k, with possible values a_i (i = 1, …, |A_k|)
- x_i: number of objects with value a_i for A_k
- x_i,j: number of objects with value a_i for A_k and class C_j
- x_i / n: probability that one of the objects has attribute value a_i
- x_i,j / x_i: probability that an object with attribute value a_i has class C_j

Decision Tree Construction
- Choose the split attribute A_k that gives the highest information gain, i.e. the smallest conditional entropy H(C|A_k)
- Example: colour
- Training table (flattened in the transcript): Colour: red, brown, green, red; Size: small, big, small, big; Points: yes, no, yes, no; Class: poisonous, eatable, … (remaining class entries lost)

Decision Tree Construction (2)
- Analogously:
  - H(C|A_colour) = 0.4
  - H(C|A_size) ≈ (value lost in transcription)
  - H(C|A_points) = 0.4
- Choose colour or points as the first split criterion
- Recursively repeat this procedure
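The split selection can be sketched as follows. The class labels below are illustrative only (the original table's class column did not survive transcription intact), so the numbers differ from the slide's 0.4; the mechanism is the same: compute H(C|A) per attribute and split on the smallest.

```python
# Sketch: choose the split attribute with minimal conditional entropy H(C|A).
from collections import Counter
from math import log2

def cond_entropy(values, classes):
    """H(C|A) for one attribute: sum over values of p(a) * H(C | A=a)."""
    n = len(values)
    h = 0.0
    for v in set(values):
        cls = [c for a, c in zip(values, classes) if a == v]
        counts = Counter(cls)
        h_v = -sum((k / len(cls)) * log2(k / len(cls)) for k in counts.values())
        h += len(cls) / n * h_v
    return h

# Hypothetical class labels for the four mushrooms:
classes = ["poisonous", "eatable", "eatable", "poisonous"]
colour  = ["red", "brown", "green", "red"]
size    = ["small", "big", "small", "big"]
print(cond_entropy(colour, classes))  # 0.0 -> colour separates the classes perfectly
print(cond_entropy(size, classes))    # 1.0 -> size gives no information here
```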

Decision Tree Construction (3)
- The right side is trivial.
- On the left side, both attributes have the same information gain.

Generalisation
- The classifier should also be able to handle unknown data.
- The classifying model is often called a hypothesis.
- Testing generality:
  - Divide the samples into a training set and a validation (test) set
  - Learn on the training set
  - Test generality on the validation set
- Error computation:
  - Test set X, hypothesis h
  - error(X, h): a function that is monotonically increasing in the number of examples in X wrongly classified by h

Overfitting
- A learnt hypothesis performs well on the training set but badly on the validation set.
- Formally: h is overfitted if there exists a hypothesis h′ with error(D, h) < error(D, h′) and error(X, h) > error(X, h′), where D is the training set and X the validation set.

Avoiding Overfitting
- Stopping: don't split further if some criterion holds. Examples:
  - Size of node n: don't split if n contains fewer than some threshold number of examples.
  - Purity of node n: don't split if the purity gain is not big enough.
- Pruning: reduce the decision tree after training. Examples:
  - Reduced Error Pruning
  - Minimal Cost-Complexity Pruning
  - Rule-Post Pruning

Pruning
- Pruning syntax: if T0 was produced by (repeated) pruning on T, we write … (notation lost in transcription)

Maximum Tree Creation
- Before pruning we need a maximum tree T_max.
- What is a maximum tree? One in which:
  - all leaf nodes are smaller than some threshold, or
  - all leaf nodes represent only one class, or
  - all leaf nodes contain only objects with the same attribute values.
- T_max is then pruned starting from the leaves.

Reduced Error Pruning
1. Consider a branch T_n of T.
2. Replace T_n by a leaf with the class most frequently associated with T_n.
3. If error(X, h(T)) < error(X, h(T/T_n)), take back the decision.
4. Go back to 1 until all non-leaf nodes have been considered.
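The four steps can be sketched as code. The Node layout and the validation data below are hypothetical; the procedure is the one above: tentatively replace each branch by its majority-class leaf and keep the replacement only if the validation error does not increase.

```python
# Sketch of reduced error pruning on a toy tree.
from collections import Counter

class Node:
    def __init__(self, attr=None, branches=None, label=None):
        self.attr, self.branches, self.label = attr, branches or {}, label

def classify(node, x):
    while node.label is None:
        node = node.branches[x[node.attr]]
    return node.label

def error(tree, val_set):
    """Number of validation examples the tree misclassifies."""
    return sum(classify(tree, x) != y for x, y in val_set)

def leaf_labels(node):
    if node.label is not None:
        return [node.label]
    return [l for b in node.branches.values() for l in leaf_labels(b)]

def prune(root, val_set):
    def visit(node):
        for b in node.branches.values():          # prune bottom-up
            if b.label is None:
                visit(b)
        majority = Counter(leaf_labels(node)).most_common(1)[0][0]
        saved = node.attr, node.branches, node.label
        before = error(root, val_set)
        node.attr, node.branches, node.label = None, {}, majority
        if error(root, val_set) > before:         # take back the decision
            node.attr, node.branches, node.label = saved
    if root.label is None:
        visit(root)
    return root

# Demo on a hypothetical validation set: the size-subtree is redundant
# (both branches say "eatable") and collapses; the root survives.
val = [({"colour": "red", "size": "small"}, "poisonous"),
       ({"colour": "green", "size": "big"}, "eatable")]
sub = Node(attr="size", branches={"small": Node(label="eatable"),
                                  "big": Node(label="eatable")})
tree = Node(attr="colour", branches={"red": Node(label="poisonous"),
                                     "green": sub})
prune(tree, val)
print(sub.label, tree.label)  # eatable None
```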

Exercise

Fred wants to buy a VW Beetle and classifies all offerings into the classes interesting and uninteresting. Help Fred by creating a decision tree using the ID3 algorithm.

Training table (flattened in the transcript; the numeric year and mileage values were lost): Colour: red, blue, green, red, green, blue, yellow; Year of Construction: (values lost); Mileage: > … km, > … km, … km, … km, < … km, …; Class: interesting, uninteresting, interesting, interesting, uninteresting, uninteresting, interesting

Outline
- Learning by Example
  - Motivation
  - Decision Trees
  - ID3
  - Overfitting
  - Pruning
  - Exercise
- Reinforcement Learning
  - Motivation
  - Markov Decision Processes
  - Q-Learning
  - Exercise


Reinforcement Learning: The Idea
- A way of programming agents by reward and punishment without specifying how the task is to be achieved.

Learning to Balance on a Bicycle
- States:
  - Angle of handle bars
  - Angular velocity of handle bars
  - Angle of bicycle to vertical
  - Angular velocity of bicycle to vertical
  - Acceleration of angle of bicycle to vertical

Learning to Balance on a Bicycle (2)
- Actions:
  - Torque to be applied to the handle bars
  - Displacement of the center of mass from the bicycle's plane (in cm)

Reward function: if the angle of the bicycle to the vertical is greater than 12°, reward = −1; otherwise, reward = 0.

Reinforcement Learning: Applications
- Board games: TD-Gammon, based on reinforcement learning, has become a world-class backgammon player
- Controlling a mobile robot:
  - Learning to drive a bicycle
  - Navigation
  - Pole-balancing
  - Acrobot
  - Robot soccer
- Learning to control sequential processes: elevator dispatching

Deterministic Markov Decision Process

Value of Policy and Agent's Task

Nondeterministic Markov Decision Process

Methods
- Model known (reward function and transition probabilities):
  - discrete states: Dynamic Programming
  - continuous states: Value Function Approximation + Dynamic Programming
- Model unknown (reward function or transition probabilities):
  - discrete states: Reinforcement Learning
  - continuous states: Value Function Approximation + Reinforcement Learning

Q-learning Algorithm
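The algorithm itself did not survive transcription; the standard tabular update it refers to is Q(s,a) ← Q(s,a) + α·(r + γ·max_a′ Q(s′,a′) − Q(s,a)). A minimal sketch on a made-up deterministic chain world (states 0–3, actions left/right, reward 1 on reaching the goal state 3); the world and all parameter values are illustrative, not from the slides:

```python
# Tabular Q-learning with epsilon-greedy exploration on a tiny chain world.
import random

N_STATES, GOAL = 4, 3
ACTIONS = (-1, +1)                      # left, right
alpha, gamma, epsilon = 0.5, 0.9, 0.3
Q = [[0.0, 0.0] for _ in range(N_STATES)]

random.seed(0)
for episode in range(500):
    s = 0
    while s != GOAL:
        # epsilon-greedy action selection
        if random.random() < epsilon:
            a = random.randrange(2)
        else:
            a = max((0, 1), key=lambda i: Q[s][i])
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)   # deterministic move
        r = 1.0 if s2 == GOAL else 0.0
        # the Q-learning update rule
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# Greedy policy after learning: move right (action index 1) in every state
print([max((0, 1), key=lambda i: Q[s][i]) for s in range(GOAL)])  # [1, 1, 1]
```

Note that Q(2, right) converges to exactly 1.0 here, since the goal state is terminal and its Q-values stay at zero.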

Example

Example: Q-table Initialization

Example: Episode 1

Example: Q-table

Example: Episode 2

Example: Q-table after Convergence

Example: Value Function after Convergence

Example: Optimal Policy

Q-learning

Convergence of Q-learning

Blackjack
- Standard rules of blackjack hold
- State space:
  - element[0]: current value of the player's hand (4–21)
  - element[1]: value of the dealer's face-up card (2–11)
  - element[2]: player does not have a usable ace (0/1)
- Starting states: the player has any 2 cards (uniformly distributed), the dealer has any 1 card (uniformly distributed)
- Actions: HIT, STICK
- Rewards: −1 for a loss, 0 for a draw, 1 for a win

Blackjack: Optimal Policy

Exercise

Problems
- Multiagent systems:
  - Cooperative agents
  - Competitive agents
- Continuous domains
- Partially observable MDPs (POMDPs)