Course Logistics
CS533: Intelligent Agents and Decision Making
- M, W, F: 1:00-1:50 (KEC 1003)
- Instructor: Alan Fern (KEC 2071); office hours by appointment
- Course web site (sign up): http://classes.engr.oregonstate.edu/eecs/spring2016/cs533/
  - Will post lecture schedule, notes, readings, and assignments
- Grading:
  - 75% instructor-assigned projects (mostly implementation and evaluation; work alone)
  - 25% student-selected final project (work in teams of 2-3)
- Assigned projects: generally implementing and evaluating one or more algorithms
- Final project: last month of class; you select a project related to course content
Automated Planning Under Uncertainty: Optimizing Fire & Rescue Response Policies
Automated Planning Under Uncertainty: Conservation Planning
- Recovery of the Red-cockaded Woodpecker (from http://www.fws.gov/rcwrecovery/rcw.html)
Automated Planning Under Uncertainty: Klondike Solitaire; Real-Time Strategy Games
AI for General Atari 2600 Games
Robotics Control: Legged Robot Control; Helicopter Control; Knot Tying; Laundry
Intelligent Simulator Agents: Immersive real-time training
Smart Grids
Some AI Planning Problems
- Health care: personalized treatment planning; hospital logistics/scheduling
- Transportation: autonomous vehicles; supply-chain logistics; air traffic control
- Assistive technologies: dialog management; automated assistants for the elderly/disabled; household robots; personal planners
Common Elements
- We have a controllable system that can change state over time (in some predictable way)
- The state describes the essential information about the system (e.g., the visible card information in Solitaire)
- We can (partially) control the system's state transitions by taking actions
- We have an objective that specifies which states, or state sequences, are more/less preferred
- Problem: at each moment we must select an action to optimize the overall objective, i.e., to produce the most preferred state sequences
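The elements above can be made concrete with a minimal sketch: a toy controllable system with a state, noisy (only partially controllable) transitions, a reward-based objective, and a simple policy that selects an action at each step. Everything in this example (the class, the numbers, the policy) is an illustrative assumption, not part of the course material.

```python
# A toy version of the common elements: a controllable system whose
# state changes over time, partly under the agent's control, plus an
# objective over states. All names and numbers are illustrative.
import random

random.seed(0)  # reproducibility

class ControllableSystem:
    """State is an integer; actions nudge it, but transitions are noisy."""
    def __init__(self):
        self.state = 0

    def step(self, action):
        # Partially controllable transition: the action usually works,
        # but sometimes the system changes state on its own.
        if random.random() < 0.8:
            self.state += action
        else:
            self.state -= action
        # Objective: states near +5 are preferred (higher reward).
        reward = -abs(self.state - 5)
        return self.state, reward

# Problem: at each moment, select an action to optimize the objective.
system = ControllableSystem()
total_reward = 0
for _ in range(20):
    action = 1 if system.state < 5 else -1  # a simple greedy policy
    _, reward = system.step(action)
    total_reward += reward
```

The per-step reward is never positive here (it is a penalty for distance from the preferred state), so a good policy is one that keeps the accumulated penalty small.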
Some Dimensions of AI Planning
- World: sole source of change vs. other sources of change
- Observations: fully observable vs. partially observable
- Actions: deterministic vs. stochastic; instantaneous vs. durative
- Goal: e.g., achieve a goal condition vs. maximize expected reward
Classical Planning Assumptions (the primary focus of AI planning until the early 90's)
- World: sole source of change
- Observations: fully observable
- Actions: deterministic, instantaneous
- Goal: achieve a goal condition
These assumptions greatly limit applicability.
Stochastic/Probabilistic Planning: the Markov Decision Process (MDP) Model
- World: sole source of change
- Observations: fully observable
- Actions: stochastic, instantaneous
- Goal: maximize expected reward over the agent's lifetime
We will primarily focus on MDPs.
Stochastic/Probabilistic Planning: Markov Decision Process (MDP) Model
- The agent observes the world state (fully observable)
- The agent selects an action from a finite set
- Probabilistic state transition (depends on the action)
- Goal: maximize expected reward over lifetime
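As a hedged sketch (not the course's own notation), a small MDP can be written down explicitly as a finite state set, a finite action set, transition probabilities P(s'|s,a), and rewards R(s,a). The two-state MDP below is invented purely for illustration.

```python
# A hedged sketch of an explicit MDP: finite states and actions,
# stochastic transitions P(s'|s,a), and rewards R(s,a). This two-state
# MDP is invented for illustration, not from the course.
states = ["s0", "s1"]
actions = ["stay", "go"]

# P[(s, a)] maps each possible next state to its probability.
P = {
    ("s0", "stay"): {"s0": 0.9, "s1": 0.1},
    ("s0", "go"):   {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "go"):   {"s0": 0.5, "s1": 0.5},
}

# R[(s, a)]: immediate reward; the objective is to maximize expected
# reward accumulated over the agent's lifetime.
R = {("s0", "stay"): 0.0, ("s0", "go"): -1.0,
     ("s1", "stay"): 1.0, ("s1", "go"): 0.0}

# Sanity check: every transition distribution sums to 1.
for dist in P.values():
    assert abs(sum(dist.values()) - 1.0) < 1e-9
```

Note the Markov property in this representation: the next-state distribution depends only on the current state and action, not on the history.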
Example MDP: Solitaire
- State: all visible information about the cards
- Actions: the different legal card movements
- Goal: win the game, or play the maximum number of cards
Course Outline
The course is structured around algorithms for solving MDPs, under:
- different assumptions about knowledge of the MDP model
- different assumptions about prior knowledge of the solution
- different assumptions about how the MDP is represented

1) Markov Decision Process (MDP) Basics
- Basic definitions and solution techniques
- Assumes an exact MDP model is known
- Exact solutions for small/moderate-size MDPs

2) Monte-Carlo Planning
- Assumes an MDP simulator is available
- Approximate solutions for large MDPs
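A minimal sketch of the "exact solution with a known model" idea from part 1 is value iteration, shown below on an invented two-state MDP. The MDP tables, the discount factor gamma = 0.9, and the iteration count are all illustrative assumptions, not course material.

```python
# Value iteration: a classic exact solution technique for a small MDP
# with a known model. The two-state MDP and gamma below are illustrative.
states = ["s0", "s1"]
actions = ["stay", "go"]
gamma = 0.9  # discount factor

# Transition probabilities P[(s, a)][s'] and rewards R[(s, a)].
P = {
    ("s0", "stay"): {"s0": 0.9, "s1": 0.1},
    ("s0", "go"):   {"s0": 0.2, "s1": 0.8},
    ("s1", "stay"): {"s1": 1.0},
    ("s1", "go"):   {"s0": 0.5, "s1": 0.5},
}
R = {("s0", "stay"): 0.0, ("s0", "go"): -1.0,
     ("s1", "stay"): 1.0, ("s1", "go"): 0.0}

def q_value(s, a, V):
    """One-step lookahead: immediate reward plus discounted next-state value."""
    return R[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)].items())

# Repeated Bellman backups converge to the optimal value function V*.
V = {s: 0.0 for s in states}
for _ in range(200):
    V = {s: max(q_value(s, a, V) for a in actions) for s in states}

# The optimal policy acts greedily with respect to V*.
policy = {s: max(actions, key=lambda a: q_value(s, a, V)) for s in states}
```

In this toy MDP, staying in s1 earns +1 forever, so V(s1) converges to 1/(1 - gamma) = 10, and the greedy policy pays the -1 cost of "go" in s0 to reach s1 quickly.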
Course Outline (continued)
3) Reinforcement Learning
- The MDP model is not known to the agent
- Exact solutions for small/moderate MDPs
- Approximate solutions for large MDPs

4) As time allows, one of:
a) Planning with symbolic representations of huge MDPs
- Symbolic dynamic programming
- Classical planning for deterministic problems
b) Imitation Learning
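To illustrate the reinforcement-learning setting, where the MDP model is not known to the agent, here is a tabular Q-learning sketch on an invented two-state environment. The agent only sees sampled transitions and rewards, never the transition tables; the dynamics, learning rate, and exploration rate are all illustrative assumptions.

```python
# Tabular Q-learning: reinforcement learning where the MDP model is NOT
# known to the agent. The hidden two-state dynamics below are invented
# for illustration; the learner only sees (s, a, r, s') samples.
import random

random.seed(0)  # reproducibility

def env_step(s, a):
    """Hidden environment dynamics -- never consulted by the learner."""
    if s == "s0":
        if a == "go":   # usually reaches s1, at an immediate cost of -1
            return ("s1" if random.random() < 0.8 else "s0"), -1.0
        return ("s1" if random.random() < 0.1 else "s0"), 0.0
    if a == "stay":     # staying in s1 pays +1
        return "s1", 1.0
    return ("s0" if random.random() < 0.5 else "s1"), 0.0

actions = ["stay", "go"]
Q = {(s, a): 0.0 for s in ["s0", "s1"] for a in actions}
alpha, gamma, eps = 0.1, 0.9, 0.1  # step size, discount, exploration rate

s = "s0"
for _ in range(20000):
    # Epsilon-greedy action selection from the current Q estimates.
    if random.random() < eps:
        a = random.choice(actions)
    else:
        a = max(actions, key=lambda a2: Q[(s, a2)])
    s2, r = env_step(s, a)
    # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a').
    Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, a2)] for a2 in actions) - Q[(s, a)])
    s = s2
```

After enough interaction, acting greedily with respect to the learned Q values recovers a good policy even though the agent never had access to the model.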