Learning Distinctions and Rules in a Continuous World through Active Exploration. Paper by Jonathan Mugan & Benjamin Kuipers. Presented by Daniel Hough.

The Challenge To build a robot that learns about its environment the way children do. Piaget [1952] theorised that children construct this knowledge in stages. Cohen [2002] proposed that children have a domain-general information-processing system for bootstrapping knowledge.

Foundations The focus of the work: how a developing agent can learn temporal contingencies in the form of predictive rules over events. Watson [2001] proposed a model of contingencies based on his observations of infant behaviour:  Prospective temporal contingency: Event B tends to follow Event A with a likelihood greater than chance.  Retrospective temporal contingency: Event A tends to come before Event B more often than chance. Distinctions must be found to determine when an event has occurred.

Foundations Drescher [1991] proposed a model inspired by Piaget in which contingencies (called schemas) are found using marginal attribution: results that follow actions are identified in a manner similar to Watson's contingencies. For each schema (an action + result pair), the algorithm searches for a context (situation) that makes the result more likely to follow that action.

The Method Introduction Here, prospective contingencies, as well as contingencies in which events occur simultaneously, are represented using predictive rules. These rules are learned using a method inspired by marginal attribution. The difference from Drescher is that the variables are continuous. This raises the issue of determining when events occur, so distinctions must be found.

The Method Introduction The motor babbling method from last week could learn distinctions and contingencies, but it was undirected and does not allow learning for larger problems: too much effort is wasted on uninteresting portions of the state space.

The Method Introduction In this algorithm, the agent receives as input the values of time-varying continuous variables, but it can only represent, reason about, and construct knowledge using discrete values. Continuous values are discretised using distinctions in the form of landmarks:  A discrete value v(t) is defined for each continuous variable v'(t).  If, for landmarks v1 and v2, v1 < v'(t) < v2, then v(t) has the open interval between v1 and v2 as its value: v(t) = (v1, v2).  This association means the agent can focus on changes of v, i.e. events. The agent greedily learns rules that use one event to predict another.

The Method How it’s evaluated The method is evaluated using a simulated robot based on the situation of a baby sitting in a high chair. Fig. 1: Adorable Fig. 2: Less adorable

The Method Knowledge Representation & Learning The goal is for the agent to learn to identify landmark values from its own experience. The importance of a qualitative distinction is estimated from the reliability of the rules that can be learned given that distinction. The qualitative representation is based on QSIM [Kuipers, 1994].

The Method Knowledge Representation & Learning A continuous variable x'(t) ranges over some subset of the real number line (-∞, +∞) and is represented by a discrete variable x(t) for its magnitude and x''(t) for its direction of change. In QSIM, the magnitude is abstracted to a discrete variable x(t) that ranges over a quantity space Q(x) of qualitative values: Q(x) = L(x) ∪ I(x), where L(x) = {x1, ..., xn} is the set of landmark values and I(x) = {(-∞, x1), (x1, x2), ..., (xn, +∞)} is the set of mutually disjoint open intervals between them.

The Method Knowledge Representation & Learning A quantity space with two landmarks, L(x) = {x1, x2}, yields five distinct qualitative values: Q(x) = {(-∞, x1), x1, (x1, x2), x2, (x2, +∞)}. The discrete variable x''(t) for the direction of change of x'(t) has a single intrinsic landmark at 0, so its initial quantity space is Q(x'') = {(-∞, 0), 0, (0, +∞)}.
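A minimal sketch of how a continuous value can be mapped to its qualitative value in such a quantity space (the function name and sample landmark values are hypothetical, not from the paper):

```python
from bisect import bisect_left

def qualitative_value(x, landmarks):
    """Map a continuous value x to its qualitative value, given a sorted
    list of landmarks.  Returns the landmark itself on an exact match,
    otherwise the open interval (lo, hi) containing x, where lo/hi may
    be float('-inf') / float('inf')."""
    if x in landmarks:
        return x
    i = bisect_left(landmarks, x)
    lo = landmarks[i - 1] if i > 0 else float('-inf')
    hi = landmarks[i] if i < len(landmarks) else float('inf')
    return (lo, hi)

# With two landmarks the quantity space has five qualitative values:
landmarks = [0.3, 0.7]
print(qualitative_value(0.1, landmarks))   # (-inf, 0.3)
print(qualitative_value(0.3, landmarks))   # 0.3
print(qualitative_value(0.5, landmarks))   # (0.3, 0.7)
print(qualitative_value(0.9, landmarks))   # (0.7, inf)
```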

The Method Knowledge Representation & Learning: Events If a is a qualitative value of a discrete variable A, meaning a ∈ Q(A), then the event A_t → a is defined by A(t − 1) ≠ a and A(t) = a. That is, an event takes place when the discrete variable A changes to value a at time t from some other value.
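The event definition above can be sketched directly over a trace of qualitative values (names and the toy trace are hypothetical illustrations):

```python
def detect_events(discrete_trace):
    """Given a sequence of qualitative values A(0), A(1), ... for one
    discrete variable, return the events (t, a): the times t at which
    the variable changed to value a, i.e. A(t-1) != a and A(t) == a."""
    return [(t, a)
            for t, a in enumerate(discrete_trace)
            if t > 0 and discrete_trace[t - 1] != a]

# Example: the variable sits in one interval, then crosses a landmark.
trace = ['(0,1)', '(0,1)', '1', '(1,inf)', '(1,inf)']
print(detect_events(trace))   # [(2, '1'), (3, '(1,inf)')]
```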

The Method Knowledge Representation & Learning: Predictive Rules This is how temporal contingencies are described. There are two types of predictive rules:  Causal: one event occurs after another, later in time.  Functional: the events are linked by a function, so they happen at the same time.

The Method Learning a predictive rule The agent wants to learn a rule that predicts a certain event h. It looks at other events and, if it finds one, u, after which h follows more often than chance, it creates a rule with that event as the antecedent.  It does so by starting with an initial rule with no context.
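A simplified sketch of this antecedent search, in the spirit of marginal attribution (the function, the log format, and the event names are hypothetical; the paper's statistics are richer than a single ratio):

```python
from collections import Counter

def best_antecedent(event_log, h, window=1):
    """Pick the antecedent event u that makes the target event h most
    likely to follow within `window` timesteps.  event_log maps each
    timestep t to the set of events observed at t.  Returns a pair
    (u, reliability), where reliability estimates P(h follows | u)."""
    follows = Counter()   # times h followed u
    occurs = Counter()    # times u occurred at all
    for t in sorted(event_log):
        for u in event_log[t]:
            if u == h:
                continue
            occurs[u] += 1
            # did h occur within the window after t?
            if any(h in event_log.get(t + d, set()) for d in range(1, window + 1)):
                follows[u] += 1
    if not occurs:
        return None, 0.0
    u = max(occurs, key=lambda e: follows[e] / occurs[e])
    return u, follows[u] / occurs[u]

# Toy log: 'hand_moves' reliably precedes 'block_moves'; 'light_on' does not.
log = {0: {'hand_moves'}, 1: {'block_moves'},
       2: {'light_on'}, 3: set(),
       4: {'hand_moves'}, 5: {'block_moves'}}
print(best_antecedent(log, 'block_moves'))   # ('hand_moves', 1.0)
```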

The Method Landmarks When a new landmark x* is inserted into Q(x), one interval is replaced with two intervals and the dividing landmark: (xi, xi+1) becomes (xi, x*), x*, (x*, xi+1). Whenever a new landmark is inserted, statistics about the previous state space are thrown out and new ones are built up, so the reliability of each rule must be checked again.
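The interval-splitting effect of inserting a landmark can be seen by enumerating the quantity space before and after (a hypothetical sketch; function names and values are illustrative only):

```python
def insert_landmark(landmarks, new):
    """Insert a new landmark into a sorted landmark list.  The interval
    containing it is implicitly split into two intervals plus the
    dividing landmark, so Q(x) grows by two qualitative values."""
    return sorted(landmarks + [new])

def quantity_space(landmarks):
    """Enumerate Q(x) as alternating open intervals and landmarks."""
    bounds = [float('-inf')] + list(landmarks) + [float('inf')]
    q = []
    for lo, hi in zip(bounds, bounds[1:]):
        q.append((lo, hi))           # open interval
        if hi != float('inf'):
            q.append(hi)             # landmark value
    return q

lm = insert_landmark([0.3, 0.7], 0.5)
print(lm)                  # [0.3, 0.5, 0.7]
print(quantity_space(lm))
# [(-inf, 0.3), 0.3, (0.3, 0.5), 0.5, (0.5, 0.7), 0.7, (0.7, inf)]
```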

The Method The Learning Process 1. Do 7 times: a) Actively explore the world for 1,000 timesteps, with the set of candidate goals coming from the discrete variables in M. b) Learn new causal and functional rules. c) Learn new landmarks by examining the statistics stored in rules and events. 2. Gather 3,000 more timesteps of experience to solidify the learned rules. 3. Update the strata. 4. Go to 1.
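The outer loop above can be sketched as a skeleton, with the paper's procedures supplied as callables (every name here is a hypothetical stand-in, not the authors' code):

```python
def learning_process(explore, learn_rules, learn_landmarks, solidify,
                     update_strata, outer_iterations=1):
    """Skeleton of the learning loop: 7 rounds of active exploration
    plus rule and landmark learning, then passive experience to
    solidify the rules, then a strata update."""
    for _ in range(outer_iterations):
        for _ in range(7):
            explore(1000)        # active, goal-directed exploration
            learn_rules()        # new causal and functional rules
            learn_landmarks()    # new distinctions from stored statistics
        solidify(3000)           # extra experience to firm up the rules
        update_strata()

# Trace the call order with simple logging stubs:
calls = []
learning_process(lambda n: calls.append(('explore', n)),
                 lambda: calls.append('rules'),
                 lambda: calls.append('landmarks'),
                 lambda n: calls.append(('solidify', n)),
                 lambda: calls.append('strata'))
print(len(calls))   # 7*3 + 2 = 23
```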

Evaluation Experimental Setup The robot has two motor variables, one for each of its degrees of freedom. A perceptual system creates variables for each of the two tracked objects in the environment: the hand and the block. There are too many variables to reasonably explain here; each has various constraints. During learning, if the block is knocked off the tray or is not moved for 300 timesteps, it is put back on the tray in a random position within reach of the agent.

Evaluation Experimental Results The algorithm was evaluated using the simple task of moving the block in a specified direction. It was run five times using passive learning and five times using active learning; each run lasted 120,000 timesteps. Each active run of the algorithm resulted in an average of 62 predictive rules. The agent gains proficiency as it learns until levelling off at approximately 70,000 timesteps for both.

Evaluation Experimental Results Active exploration does better: at 40,000 timesteps, active learning reaches the level that passive learning only achieves at 60,000 timesteps.

The Complexity of Space and Time The storage required to learn new rules is O(e²) in the number of events e, as is the number of candidate rules, but only a small number of rules are actually learned by the agent. With marginal attribution, each rule requires O(e) storage, although for simplicity statistics over all pairs of events are stored.

Conclusion At first, the agent could only determine the direction of movement of an object. By actively exploring its environment, using rules to learn distinctions and then using those distinctions to learn more rules, the agent progressed from a very simple representation towards one aligned with the natural "joints" of its environment.