Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm.

Presentation transcript:

Leveraging Human Knowledge for Machine Learning Curriculum Design Matthew E. Taylor teamcore.usc.edu/taylorm

Overview Want agents to learn difficult problems – Lots of data needed (time) – Picking a correct bias (no free lunch, NFL) Taxi driving example: use a human to design a sequence of tasks 1. Basic car control 2. Parking lot navigation 3. Small town 4. Los Angeles Why not have agents select tasks?
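A minimal sketch of the idea on this slide, assuming the human-designed curriculum is just an ordered list of tasks and that whatever the agent learned on one task is handed to the next. The names `train_on_curriculum`, `toy_learn`, and `CURRICULUM` are hypothetical, and `toy_learn` is a placeholder that only records task names; it is not a learning algorithm from the talk.

```python
def train_on_curriculum(tasks, learn, knowledge=None):
    """Train on each task in order, carrying learned knowledge forward.

    tasks:     ordered iterable of tasks (the human-designed curriculum)
    learn:     callable (task, knowledge) -> updated knowledge (assumed interface)
    knowledge: optional starting knowledge; None means learning from scratch
    """
    for task in tasks:
        knowledge = learn(task, knowledge)
    return knowledge


# The taxi-driving sequence from the slide, easiest task first.
CURRICULUM = [
    "basic car control",
    "parking lot navigation",
    "small town",
    "Los Angeles",
]


def toy_learn(task, knowledge):
    # Placeholder learner: accumulates the names of the tasks seen so far.
    return (knowledge or []) + [task]


final_knowledge = train_on_curriculum(CURRICULUM, toy_learn)
```

In a real system `learn` would wrap a reinforcement learning algorithm, and `knowledge` might be a value function, a policy, or a set of learned skills.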

Problem Statement Humans can select a training sequence Results in faster training / better performance

Task Transfer 1. Reduce total training time by picking source task(s) 2. Learn a sequence of source tasks, then learn a (previously unknown) target task Source S, A Target S’, A’
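One common way to move knowledge from a source task with states and actions (S, A) to a target task with (S’, A’) is to initialise the target's value table through a hand-coded mapping between the two. The sketch below assumes tabular Q-values stored in a dictionary; `transfer_q_values`, `state_map`, and `action_map` are hypothetical names chosen for illustration, and the slide itself does not say which transfer method is used.

```python
def transfer_q_values(source_q, state_map, action_map,
                      target_states, target_actions):
    """Initialise a target-task Q-table from a learned source-task Q-table.

    source_q:   dict mapping (source_state, source_action) -> value
    state_map:  dict mapping target state  -> source state  (assumed, hand-coded)
    action_map: dict mapping target action -> source action (assumed, hand-coded)
    Target pairs with no counterpart in the source task start at 0.0.
    """
    target_q = {}
    for s in target_states:
        for a in target_actions:
            source_pair = (state_map.get(s), action_map.get(a))
            target_q[(s, a)] = source_q.get(source_pair, 0.0)
    return target_q


# Toy example: a parking-lot task (source) seeding a small-town task (target).
source_q = {("near_goal", "forward"): 1.0, ("far_from_goal", "forward"): 0.5}
state_map = {"near_destination": "near_goal", "far_from_destination": "far_from_goal"}
action_map = {"drive": "forward"}  # "signal" has no source-task counterpart

target_q = transfer_q_values(
    source_q, state_map, action_map,
    target_states=["near_destination", "far_from_destination"],
    target_actions=["drive", "signal"],
)
```

The target learner then continues training from `target_q` instead of from an all-zero table.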

Problem Statement Humans can select a training sequence Results in faster training / better performance Meta-planning problem for agent learning MDP ?

Type of Shaping Assume agents could learn on their own Think of Skinner (1953) Not “RL Shaping” [Colombetti and Dorigo (1993) or Ng (1999)] DANGER: Negative Transfer
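The negative-transfer danger can be made concrete by comparing training cost with and without the source task. The sketch below is an illustrative metric, not one taken from the talk: `transfer_savings` is a hypothetical helper, and the costs are assumed to be measured in one common unit, such as episodes needed to reach a fixed performance threshold.

```python
def transfer_savings(scratch_cost, source_cost, target_cost_after_transfer):
    """Return (target_only_saving, total_saving) for one source -> target pair.

    scratch_cost:               cost to learn the target task with no transfer
    source_cost:                cost spent learning the source task
    target_cost_after_transfer: cost to learn the target task after transfer

    target_only_saving < 0 indicates negative transfer: the source task made
    the target task harder to learn. total_saving also charges for the time
    spent on the source task, matching the goal of reducing total training time.
    """
    target_only_saving = scratch_cost - target_cost_after_transfer
    total_saving = scratch_cost - (source_cost + target_cost_after_transfer)
    return target_only_saving, total_saving


helpful = transfer_savings(1000, 200, 500)    # source task helps
harmful = transfer_savings(1000, 200, 1200)   # negative transfer
```

A source task can speed up target learning yet still lose on total time if it is itself expensive to learn, which is why both numbers are reported.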

Not On-line or Interactive Help Advice / Demonstration / Imitation – Human unable or unwilling Picking sequence of tasks – How to best learn important skills / ideas

Types of Useful Information Common Sense – Soccer balls roll after being kicked – Friction reduces an object’s speed Domain Knowledge – It is easier to complete short passes than long passes Algorithmic Knowledge – State space size can impact learning speed

Useful? Training time critical Agent needs robust understanding of domain – (rare affordances) Consumer Level – Low bar for background knowledge – Save consumer time

Possible Domains? Nero RoboCup Coach

Path of Study Determine what makes a good sequence – Increasing Difficulty – Basic skills (options) – Basic concepts / learn useful abstractions – Retrospective analysis Education literature? On-line sequence adaptation? (social scaffolding)

Conclusion Leveraging human knowledge Both experts and non-experts Where is constructing a task sequence superior? – Easy – Effective How can we construct such sequences well? – Transfer Learning / Lifelong Learning Analysis – Empirical studies

Possible Domains? Nero ESP, Peekaboom RoboCup Coach