Trey Smith, Robotics Institute, Carnegie Mellon University

Slides:



Advertisements
Similar presentations
G5BAIM Artificial Intelligence Methods
Advertisements

Probabilistic Planning (goal-oriented) Action Probabilistic Outcome Time 1 Time 2 Goal State 1 Action State Maximize Goal Achievement Dead End A1A2 I A1.
Partially Observable Markov Decision Process (POMDP)
U NIVERSITY OF M ASSACHUSETTS, A MHERST Department of Computer Science Solving POMDPs Using Quadratically Constrained Linear Programs Christopher Amato.
SA-1 Probabilistic Robotics Planning and Control: Partially Observable Markov Decision Processes.
SARSOP Successive Approximations of the Reachable Space under Optimal Policies Devin Grady 4 April 2013.
Mobile Robotics Laboratory Institute of Systems and Robotics ISR – Coimbra Aim –Take advantage from intelligent cooperation between mobile robots, so as.
Fast approximate POMDP planning: Overcoming the curse of history! Joelle Pineau, Geoff Gordon and Sebastian Thrun, CMU Point-based value iteration: an.
1 Black Box and Generalized Algorithms for Planning in Uncertain Domains Thesis Proposal, Dept. of Computer Science, Carnegie Mellon University H. Brendan.
Pradeep Varakantham Singapore Management University Joint work with J.Y.Kwak, M.Taylor, J. Marecki, P. Scerri, M.Tambe.
A Hybridized Planner for Stochastic Domains Mausam and Daniel S. Weld University of Washington, Seattle Piergiorgio Bertoli ITC-IRST, Trento.
DESIGN OF A GENERIC PATH PATH PLANNING SYSTEM AILAB Path Planning Workgroup.
Partially Observable Markov Decision Process By Nezih Ergin Özkucur.
1 Sensor Relocation in Mobile Sensor Networks Guiling Wang, Guohong Cao, Tom La Porta, and Wensheng Zhang Department of Computer Science & Engineering.
Planning under Uncertainty
Susanne Biundo, Karen Myers, Kanna Rajan How is Planning & Scheduling Changing the World?
Probabilistic Similarity Search for Uncertain Time Series Presented by CAO Chen 21 st Feb, 2011.
1998/5/21by Chang I-Ning1 ImageRover: A Content-Based Image Browser for the World Wide Web Introduction Approach Image Collection Subsystem Image Query.
Concurrent Probabilistic Temporal Planning (CPTP) Mausam Joint work with Daniel S. Weld University of Washington Seattle.
Integrating POMDP and RL for a Two Layer Simulated Robot Architecture Presented by Alp Sardağ.
1RADAR – Scheduling Task © 2003 Carnegie Mellon University RADAR – Scheduling Task May 20, 2003 Manuela Veloso, Stephen Smith, Jaime Carbonell, Brett Browning,
Probabilistic Temporal Planning with Uncertain Durations Mausam Joint work with Daniel S. Weld University of Washington Seattle.
Reinforcement Learning Yishay Mansour Tel-Aviv University.
Exploration in Reinforcement Learning Jeremy Wyatt Intelligent Robotics Lab School of Computer Science University of Birmingham, UK
Decentralised Coordination of Mobile Sensors School of Electronics and Computer Science University of Southampton Ruben Stranders,
Carnegie Mellon Life in the Atacama, Design Review, December 19, 2003 Science Planner Science Observer Life in the Atacama Design Review December 19, 2003.
Belief space planning assuming maximum likelihood observations Robert Platt Russ Tedrake, Leslie Kaelbling, Tomas Lozano-Perez Computer Science and Artificial.
Towards Cognitive Robotics Biointelligence Laboratory School of Computer Science and Engineering Seoul National University Christian.
Science Objectives & Investigation Methodology Life in the Atacama 2005 Science & Technology Workshop January 6-7, 2005 Nathalie A. Cabrol NASA Ames.
Life in the Atacama, Design Review, December 19, 2003 Carnegie Mellon SCIENCE OPS [contributions from Peter, Trey, Dom, Kristen, Kristina and Mike] Life.
Science Investigation Life in the Atacama 2004 Science & Technology Workshop Nathalie A. Cabrol NASA Ames.
Probabilistic Reasoning for Robust Plan Execution Steve Schaffer, Brad Clement, Steve Chien Artificial Intelligence.
MURI: Integrated Fusion, Performance Prediction, and Sensor Management for Automatic Target Exploitation 1 Dynamic Sensor Resource Management for ATE MURI.
OPERETTA: An Optimal Energy Efficient Bandwidth Aggregation System Karim Habak†, Khaled A. Harras‡, and Moustafa Youssef† †Egypt-Japan University of Sc.
1 Distributed and Optimal Motion Planning for Multiple Mobile Robots Yi Guo and Lynne Parker Center for Engineering Science Advanced Research Computer.
University of Windsor School of Computer Science Topics in Artificial Intelligence Fall 2008 Sept 11, 2008.
Reinforcement Learning Yishay Mansour Tel-Aviv University.
A Tutorial on the Partially Observable Markov Decision Process and Its Applications Lawrence Carin June 7,2006.
Conformant Probabilistic Planning via CSPs ICAPS-2003 Nathanael Hyafil & Fahiem Bacchus University of Toronto.
Life in the Atacama Life in the Atacama 2004 Science & Technology Workshop July 14-16, 2004 Carnegie Mellon University.
Towards Proactive Replanning for Multi-Robot Teams Brennan Sellner and Reid Simmons 5th International Workshop on Planning and Scheduling for Space October.
Learning to Navigate Through Crowded Environments Peter Henry 1, Christian Vollmer 2, Brian Ferris 1, Dieter Fox 1 Tuesday, May 4, University of.
Introduction to Artificial Intelligence CS 438 Spring 2008 Today –AIMA, Ch. 25 –Robotics Thursday –Robotics continued Home Work due next Tuesday –Ch. 13:
Ground-Truth Strategy Life in the Atacama 2004 Science & Technology Workshop Edmond A. Grin NASA Ames.
Thrust IIB: Dynamic Task Allocation in Remote Multi-robot HRI Jon How (lead) Nick Roy MURI 8 Kickoff Meeting 2007.
Rover and Instrument Capabilities Life in the Atacama 2004 Science & Technology Workshop Michael Wagner, James Teza, Stuart Heys Robotics Institute, Carnegie.
Generalized Point Based Value Iteration for Interactive POMDPs Prashant Doshi Dept. of Computer Science and AI Institute University of Georgia
Science Methods & Approach Life in the Atacama 2004 Science & Technology Workshop Nathalie A. Cabrol NASA Ames.
Vision-Guided Humanoid Footstep Planning for Dynamic Environments P. Michel, J. Chestnutt, J. Kuffner, T. Kanade Carnegie Mellon University – Robotics.
C.V. Education – Ben-Gurion University (Israel). –B.Sc. – –M.Sc. – –PhD. – Work: –After B.Sc. – Software Engineer at Microsoft,
Autonomy: Executive and Instruments Life in the Atacama 2004 Science & Technology Workshop Nicola Muscettola NASA Ames Reid Simmons Carnegie Mellon.
Mission Planning Life in the Atacama 2004 Science & Technology Workshop Paul Tompkins Carnegie Mellon.
Introduction to Machine Learning, its potential usage in network area,
Vision-Guided Humanoid Footstep Planning for Dynamic Environments
SOAR Observatory Strategic Planning Initial Concept Presentation
NASA Ames Research Center
POMDPs Logistics Outline No class Wed
Project and Workshop Objectives
WHO.
Update on Advancing Development of the ROLO Lunar Calibration System
Multi-Agent Exploration
Robust Belief-based Execution of Manipulation Programs
Autonomous Operations in Space
Deep Neural Networks for Onboard Intelligence
Approximate POMDP planning: Overcoming the curse of history!
Market-based Dynamic Task Allocation in Mobile Surveillance Systems
Heuristic Search Value Iteration
CS 416 Artificial Intelligence
Reinforcement Learning Dealing with Partial Observability
Intelligent Systems (AI-2) Computer Science cpsc422, Lecture 7
Presentation transcript:

Trey Smith, Robotics Institute, Carnegie Mellon University Rover Science Autonomy Probabilistic Planning for Science-Aware Exploration Thesis Proposal Trey Smith, Robotics Institute, Carnegie Mellon University 14 June 2004

Robotic Planetary Science

Mars Rover Operations Today

Mars Rover Operations Today

Mars Rover Operations Today

Over-the-Horizon Science 60 m

Over-the-Horizon Science 60 m 6 km

Science Autonomy The opportunity: Long-distance mobility gives the rover access to new sites The challenge: Do useful science in over-the- horizon situations The method: Enable the rover to reason about science goals and the science data it collects

New Operational Modes !

Sensing Uncertainty type 4 type 7 novel

Probabilistic Sensor Planning contact sensing spectrometer keep moving keep moving

Probabilistic Sensor Planning contact sensing spectrometer keep moving keep moving

POMDP Model Principled method for reasoning about information gain and future beliefs

POMDP Model Principled method for reasoning about information gain and future beliefs Horribly, horribly intractable

POMDP Model Principled method for reasoning about information gain and future beliefs Horribly, horribly intractable Develop approximation techniques that take advantage of structure in the SA domain to speed up planning

Thesis Statement Enabling a robot to reason about science goals and the science data it collects will significantly improve its exploration efficiency,

Thesis Statement Enabling a robot to reason about science goals and the science data it collects will significantly improve its exploration efficiency, and POMDP methods can significantly improve the quality of science autonomy plans.

Outline Motivation Background and related work Technical approach Preliminary work Proposed research

Science Autonomy Related Work Image and spectral analysis [Gulick et al., 2001] [Gazis and Roush, 2001] Balancing exploration priorities [Moorehead, 2001] Sensor planning to detect meteorites [Wagner et al., 2001] Orbiter reacting to science events [Chien et al., 2003] OASIS project: closed-loop SA system [Estlin et al., 2003]

POMDP Introduction Recall: Great for sensor planning Tend to be massively intractable

Example Problem: RockSample

Example Problem: RockSample

Example Problem: RockSample exit

Example Problem: RockSample exit check

Example Problem: RockSample exit check good bad (+20 or +0) (+10) = 0.9

History and Belief initially 50/50 check good check good 0.5 0.7 0.9 bad good Pr(good) = 0.5 Pr(good) = 0.7 Pr(good) = 0.9 check good check good

Policy exit check sample sample check bad good good 0.38 0.64 0.5 bad 0.7 bad

Value Function: Horizon 1 exit sample 20 10 check bad good

Value Function: Horizon 1 exit sample 20 10 bad good

Value Function: Horizon 2 exit sample 20 check bad good 10 exit sample bad good

Computing the Policy exit check sample 20 10 exit sample check bad good 10 exit sample exit check sample

optimal value function V* Value Iteration Horizon 2 optimal value function V* Horizon 1 Horizon 0

POMDP Related Work Policy iteration [Hansen, 1998] [Poupart and Boutilier, 2003] Gradient ascent policy search [Baxter and Bartlett, 2000] Value iteration [Sondik, 1971]

Value Iteration Variants Belief Space

Value Iteration Variants Exact value iteration [Littman, 1994] [Cassandra et al., 1997]

Value Iteration Variants Exact value iteration [Littman, 1994] [Cassandra et al., 1997]

Value Iteration Variants

Value Iteration Variants Grid-based [Brafman, 1997] [Hauskrecht, 2000]

Value Iteration Variants What next?

Value Iteration Variants Focused value iteration

Value Iteration Variants Initial Belief

Value Iteration Variants Reachable Beliefs

Value Iteration Variants Relevant Beliefs

Value Iteration Variants

Value Iteration Variants Point-based (PBVI) [Pineau et al., 2003]

Value Iteration Variants

Value Iteration Variants Heuristic Search (HSVI) [Smith and Simmons, 2004]

Outline Motivation Background and related work Technical approach Preliminary work Proposed research

Intelligent Site Survey

Intelligent Site Survey

Intelligent Site Survey

Intelligent Site Survey !

Intelligent Site Survey

System Architecture planner executive functional layer environment

System Architecture planner executive functional layer science observer environment

System Architecture planner domain planners executive functional layer science observer environment

System Architecture off-board onboard planner domain planners science interface executive off-board planner functional layer science observer environment

Pushing POMDP Technology Heuristic search Combine heuristic search with efficient value function representations Factored state Do better when the state is mostly observable Data structures that leverage independence Continuous planning Fast replanning

Outline Motivation Background and related work Technical approach Preliminary work Proposed research

Heuristic Search Value Iteration Solves POMDPs Fast anytime algorithm with provable bounds on plan quality On some benchmark problems, >100x speedup V * V

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Heuristic Search Value Iteration Focus on reachable beliefs Avoid considering foolish actions and unlikely observations

Bounds Update at b b

HSVI Performance Tag (870s 5a 30o)

Multi-Algorithm Comparison

Comparison with PBVI [Pineau et al., 2003] optimal value HSVI PBVI solution quality wallclock time

Comparison with PBVI [Pineau et al., 2003] optimal value vc HSVI PBVI reported times HSVI PBVI

Comparison with PBVI [Pineau et al., 2003]

HSVI Related Work

Outline Motivation Background and related work Technical approach Preliminary work Proposed research

Research Goals Develop and field test an SA system, includes developing Problem definition and performance criteria Overall architecture Planning module Extend POMDP state-of-art to meet the needs of the SA problem

Performance Criteria: SA Overall performance: did the science team characterize the site accurately? Subgoals with quantitative metrics Coverage, key signatures, anomalies, representative sampling Compare to baseline operational modes

Baseline Operational Modes

Baseline Operational Modes directed sampling

Baseline Operational Modes

Baseline Operational Modes periodic sampling

Performance Criteria: POMDP Well-established metrics for POMDP planning performance Primarily test in simulation Compare to baseline planning techniques Ignoring informational value (MDP) Greedy or heuristic techniques (e.g. Q- MDP)

Life in the Atacama Robotic astrobiology: work under Mars-relevant operational constraints and do meaningful science on Earth.

Science Mission The biodiversity and distribution of habitats in the Atacama is not yet measured. Where does life survive and where does it not? Coastal Range Interior Desert

Planner Implementations Atacama 2004 Make it work: deterministic world model, heuristic search, quick development Technology research A series of quickly developed planners that focus on specific POMDP techniques, tested in simulation Atacama 2005 Integrate lessons from earlier versions: optimize for stability, extensibility POMDP technology may not be ready

Pushing POMDP Technology Heuristic search Combine heuristic search with efficient value function representations Factored state Do better when the state is mostly observable Data structures that leverage independence Continuous planning Fast replanning

Contributions Lessons learned from developing and field testing one of the first science autonomy systems Distinguished by “keep moving” operational strategy and novel intelligent site survey operational mode Extensions to the POMDP state-of-art that apply both to SA and other domains

Schedule POMDP technology research (ongoing) Initial SA development (Summer 04) LITA Expedition 04 (Fall 04) POMDP SA model development (Spring 05) LITA Expedition 05 (Fall 05) Thesis writing (Spring-Summer 06)

Questions?

NASA Mars Agenda 2005 Mars Reconnaissance Orbiter 2007 Mars Scout 1: Phoenix Lander 2009 Mobile Lab 2011 Mars Scout 2 2015 Mars Sample Return ... ~2025 Human Landing

OASIS Project Broad agreement on goals and approach For instance, ideas about preferences (key signature, anomaly, etc.) Distinctions of the proposed work Operational strategy: keep moving Kilometer-scale traverses, multiple sites per day Novel intelligent site survey concept

LITA Science Autonomy 2004 September 2004 Expedition Early versions of all modules present Test science observer/planner closed loop in very simple situations Teleoperate rover to simulate SA system Gain experience with architecture and domain modeling Get feedback from science team

LITA Science Autonomy 2005 September 2005 Expedition Test full-up SA operational modes Learn about integration into overall operations strategy Characterize performance

Sensing Uncertainty type 4 type 7 novel

Sensing Uncertainty type 4 type 7 novel type 4 type 7 novel

Sensing Uncertainty type 4 type 7 novel type 4 type 7 novel type 4

Sensing Uncertainty type 4 type 7 novel type 4 type 7 novel type 4

Science on the Fly

Science on the Fly

Science on the Fly

Science on the Fly

Science on the Fly !

Science on the Fly

Science on the Fly carbonate!

Science on the Fly

Science on the Fly

Science on the Fly

Science on the Fly

Science on the Fly