1
CS 188: Artificial Intelligence Fall 2009 Advanced Applications: Robotics / Vision / Language Dan Klein – UC Berkeley Many slides from Sebastian Thrun, Pieter Abbeel, Jitendra Malik 1
2
Announcements Grades in glookup: W1-2, P1-3, Midterm (and all regrades) Let us know if there are any issues Contest: qualifiers! Congrats to current qualifiers Qualification closes on 11/30 2
3
So Far: Foundational Methods 3
4
Now: Advanced Applications 4
5
Autonomous Vehicles Autonomous vehicle slides adapted from Sebastian Thrun
6
Grand Challenge: Barstow, CA, to Primm, NV. 150-mile off-road robot race across the Mojave desert. Natural and manmade hazards. No driver, no remote control. No dynamic passing. [DEMO: GC Bad, Good]
7
An Autonomous Car: 5 lasers, camera, radar, E-stop, GPS, GPS compass, 6 computers, IMU, steering motor, control screen
8
Autonomous Architecture Central database Continuously updated In-Car Control Loop Autonomy Monitor Detects failure modes, Invokes exceptions Interact w/ human driver
9
Actions: Steering Control. Inputs: reference trajectory, error and velocity with respect to it. Output: steering angle (with respect to the trajectory).
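A minimal sketch of what a cross-track-error steering law of this flavor might look like; the gain k, the saturation limit, and the arctangent form are assumptions for illustration, not the controller actually used on the car.

```python
import math

def steering_angle(heading_error, cross_track_error, velocity, k=1.0, max_angle=0.5):
    """Steer back toward the reference trajectory.

    heading_error: angle between car heading and trajectory tangent (radians)
    cross_track_error: signed distance from the trajectory (meters)
    velocity: current speed (m/s); higher speed means gentler correction
    k, max_angle: assumed tuning constants, not values from the slides
    """
    # Combine heading correction with a velocity-scaled cross-track term
    delta = heading_error + math.atan2(k * cross_track_error, max(velocity, 1e-3))
    # Saturate to the physical steering limits
    return max(-max_angle, min(max_angle, delta))
```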
10
Sensors: Laser Readings [DEMO: ]
11
Readings: No Obstacles (laser beams 1, 2, 3)
12
Readings: Obstacles (large ΔZ between nearby beams)
13
Obstacle Detection. Trigger if |Z_i − Z_j| > 15 cm for nearby readings z_i, z_j. Raw measurements: 12.6% false positives.
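The 15 cm trigger translates directly into code; the list-of-heights scan representation and the notion of "nearby" (a small index window) are assumptions for this sketch.

```python
def detect_obstacles(z, threshold=0.15, window=2):
    """Flag an obstacle wherever two nearby height readings differ by more than
    threshold (15 cm).

    z: list of height measurements along a scan line (meters)
    window: how many neighboring readings count as "nearby" (assumed value)
    """
    triggers = []
    for i in range(len(z)):
        for j in range(i + 1, min(i + 1 + window, len(z))):
            if abs(z[i] - z[j]) > threshold:
                triggers.append((i, j))
    return triggers
```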
14
Probabilistic Error Model: an HMM over vehicle states x_t, x_{t+1}, x_{t+2}, each with GPS and IMU observations z_t, z_{t+1}, z_{t+2}.
15
HMMs for Detection. Raw measurements: 12.6% false positives; HMM inference: 0.02% false positives.
17
Environmental Tracking [DEMO: PEOPLE]
18
Sensors: Camera
19
Object Recognition Query Template Vision slides adapted from Jitendra Malik
20
Shape Context Count the number of points inside each bin, e.g.: Count = 4 Count = 10... Compact representation of distribution of points relative to each point 20
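A rough sketch of computing one shape-context histogram, counting how many other points fall into each bin around a reference point; the log-polar bin counts and radii are assumed defaults, not the exact parameters of the original descriptor.

```python
import numpy as np

def shape_context(points, center, n_r=5, n_theta=12, r_max=1.0):
    """Histogram of where the other points fall relative to `center`.

    Bins are log-polar (n_r radial x n_theta angular); n_r, n_theta, r_max
    are assumed defaults for illustration.
    """
    pts = np.asarray(points, dtype=float)
    d = pts - np.asarray(center, dtype=float)
    r = np.hypot(d[:, 0], d[:, 1])
    theta = np.arctan2(d[:, 1], d[:, 0])          # in [-pi, pi)
    keep = (r > 1e-9) & (r <= r_max)              # drop the center itself and far points
    # log-spaced radial bin edges, uniform angular bins
    r_edges = np.logspace(np.log10(r_max / 32), np.log10(r_max), n_r + 1)
    r_bin = np.clip(np.digitize(r[keep], r_edges) - 1, 0, n_r - 1)
    t_bin = ((theta[keep] + np.pi) / (2 * np.pi) * n_theta).astype(int) % n_theta
    hist = np.zeros((n_r, n_theta), dtype=int)
    np.add.at(hist, (r_bin, t_bin), 1)            # count points per bin
    return hist
```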
21
Shape Context 21
22
Similar Regions Color indicates similarity using local descriptors 22
23
Match for Image Similarity 23
24
Vision for a Car [DEMO: LIDAR 1]
25
Self-Supervised Vision [DEMO: LIDAR 2]
26
Complex Robot Control [demo – quad initial]
27
Robotic Control Tasks Perception / Tracking Where exactly am I? What’s around me? Low-Level Control How to move from position A to position B Safety vs efficiency High-Level Control What are my goals? What are the optimal high-level actions?
28
Low-Level Planning Low-level: move from configuration A to configuration B
29
A Simple Robot Arm. Configuration space: what are the natural coordinates for specifying the robot's configuration? These are the configuration-space coordinates; we can't necessarily control all degrees of freedom directly. Work space: what are the natural coordinates for specifying the effector tip's position? These are the work-space coordinates.
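To make the distinction concrete, here is a small forward-kinematics sketch for a planar two-link arm: the joint angles are configuration-space coordinates, and the effector tip's (x, y) is a work-space coordinate. The link lengths are assumed values for illustration.

```python
import math

def forward_kinematics(theta1, theta2, l1=1.0, l2=1.0):
    """Map configuration-space coordinates (joint angles) to the work-space
    position of the effector tip for a planar two-link arm.

    l1, l2: link lengths (assumed, just for illustration).
    """
    x = l1 * math.cos(theta1) + l2 * math.cos(theta1 + theta2)
    y = l1 * math.sin(theta1) + l2 * math.sin(theta1 + theta2)
    return x, y
```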
30
Coordinate Systems Workspace: The world’s (x, y) system Obstacles specified here Configuration space The robot’s state Planning happens here Obstacles can be projected to here
31
Obstacles in C-Space What / where are the obstacles? Remaining space is free space
32
Example: A Less Simple Arm [DEMO]
33
Motion as Search Motion planning as path-finding problem Problem: configuration space is continuous Problem: under-constrained motion Problem: configuration space can be complex Why are there two paths from 1 to 2?
34
Decomposition Methods Break c-space into discrete regions Solve as a discrete problem
35
Probabilistic Roadmaps Idea: sample random points as nodes in a visibility graph This gives probabilistic roadmaps Very successful in practice Lets you add points where you need them If insufficient points, incomplete or weird paths
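A sketch of the roadmap-construction idea; sample_free and collision_free_path are hypothetical helpers, and the k-nearest connection rule is one common choice rather than the only one.

```python
import math

def build_prm(n_samples, k, sample_free, collision_free_path):
    """Sketch of a probabilistic roadmap.

    sample_free(): returns a random collision-free configuration (tuple)
    collision_free_path(a, b): True if the straight segment a-b avoids obstacles
    Both are assumed helpers for this illustration.
    """
    nodes = [sample_free() for _ in range(n_samples)]
    edges = {i: [] for i in range(n_samples)}
    for i, a in enumerate(nodes):
        # connect each sample to its k nearest visible neighbors
        nearest = sorted(range(n_samples), key=lambda j: math.dist(a, nodes[j]))
        for j in nearest[1:k + 1]:
            if collision_free_path(a, nodes[j]):
                edges[i].append(j)
                edges[j].append(i)
    return nodes, edges   # then search this graph (e.g., A*) from start to goal
```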
36
Demonstrate path across the “training terrain” Run apprenticeship learning to find a set of weights w Receive “testing terrain” (a height map) Find a policy for crossing the testing terrain. High-Level Control
37
High DOF Robots [DEMOS] Videos from Pieter Abbeel, Jean-Claude Latombe
38
38
39
39
40
Hierarchical Control. Demonstrate a path across the "training terrain". Run our apprenticeship learning algorithm to find a set of learned reward weights w. Receive a "testing terrain" (a height map). Find a policy for crossing the testing terrain.
41
Motivating example How do we specify a task like this?
42
Pacman Apprenticeship! Examples are states s. Candidates are pairs (s, a). "Correct" actions: those taken by the expert (the "correct" action a*). Features defined over (s, a) pairs: f(s, a). Score of a q-state (s, a) given by w · f(s, a). How is this VERY different from reinforcement learning?
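One simple way to learn such weights from expert examples is a perceptron-style update: whenever the current weights prefer a different action than the expert's, move the weights toward the expert's features. The dict-based feature representation below is an assumption, and this is an illustration of the idea rather than the exact project algorithm.

```python
def apprenticeship_update(w, s, a_star, actions, f, lr=1.0):
    """One perceptron-style update from an expert example (s, a*).

    Score of a q-state is w . f(s, a); if the highest-scoring action disagrees
    with the expert's action a*, nudge w toward the expert's features.
    f(s, a) returns a dict of feature values (assumed representation).
    """
    predicted = max(actions, key=lambda a: sum(w.get(k, 0.0) * v
                                               for k, v in f(s, a).items()))
    if predicted != a_star:
        for k, v in f(s, a_star).items():
            w[k] = w.get(k, 0.0) + lr * v       # move toward the expert action
        for k, v in f(s, predicted).items():
            w[k] = w.get(k, 0.0) - lr * v       # move away from the wrong guess
    return w
```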
43
Helicopter Control inputs: i_lon: main rotor longitudinal cyclic pitch control (affects pitch rate). i_lat: main rotor lateral cyclic pitch control (affects roll rate). i_coll: main rotor collective pitch (affects main rotor thrust). i_rud: tail rotor collective pitch (affects tail rotor thrust).
44
Autonomous helicopter setup: on-board Inertial Measurement Unit (IMU) data and position data feed (1) a Kalman filter and (2) a control policy, which sends controls out to the helicopter.
45
Helicopter dynamics. State s_t: position, orientation, velocity, and angular velocity. Control inputs a_t: i_lon (main rotor longitudinal cyclic pitch, affects pitch rate), i_lat (main rotor lateral cyclic pitch, affects roll rate), i_coll (main rotor collective pitch, affects main rotor thrust), i_rud (tail rotor collective pitch, affects tail rotor thrust). Dynamics: s_{t+1} = f(s_t, a_t) + w_t, where f encodes the helicopter dynamics and w_t is noise.
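Given a learned dynamics model f, simulating the system is just iterating s_{t+1} = f(s_t, a_t) + w_t; the Gaussian noise model and the function signatures below are assumptions for this sketch.

```python
import numpy as np

def rollout(f, policy, s0, T, noise_std=0.01, rng=None):
    """Simulate s_{t+1} = f(s_t, a_t) + w_t for T steps.

    f: learned dynamics model (state, controls) -> next state
    policy: maps state -> control inputs
    noise_std: std of the additive disturbance w_t (assumed Gaussian here)
    """
    rng = rng or np.random.default_rng()
    s = np.asarray(s0, dtype=float)
    states = [s]
    for _ in range(T):
        a = policy(s)
        s = f(s, a) + rng.normal(0.0, noise_std, size=s.shape)
        states.append(s)
    return states
```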
46
Graphical model. The intended trajectory satisfies the dynamics. Each expert trajectory is a noisy observation of one of the hidden states, but we don't know exactly which one. [Nodes: intended trajectory, expert demonstrations, time indices]
47
Learning algorithm. Similar models appear in speech processing and genetic sequence alignment; see, e.g., Listgarten et al., 2005. Maximize the likelihood of the demonstration data over: intended trajectory states, time index values, variance parameters for noise terms, time index distribution parameters.
48
Learning algorithm. Make an initial guess for τ. Alternate between: fix τ and run EM on the resulting HMM; choose a new τ using dynamic programming. If τ is unknown, inference is hard; if τ is known, we have a standard HMM.
49
49
50
50
51
51
52
Probabilistic Deformation 52
53
53
54
KNN + Vision (8 slides + ??? demo) 54
55
Clustering + News/Vision (10 slides) 55
56
Syntax + MT + Search (8 slides) 56
57
Historical change (examples) 57
58
Video Games (POMDPs, Warcraft demos, scape demo?, bugman) 58
59
59
60
60
61
What is NLP? Fundamental goal: analyze and process human language, broadly, robustly, accurately… End systems that we want to build: Ambitious: speech recognition, machine translation, information extraction, dialog interfaces, question answering… Modest: spelling correction, text categorization… 61
62
Speech Systems. Automatic Speech Recognition (ASR): audio in, text out. SOTA: 0.3% error for digit strings, 5% for dictation, 50%+ for TV. Text to Speech (TTS): text in, audio out. SOTA: totally intelligible (if sometimes unnatural). "Speech Lab" 62
63
Information Retrieval General problem: Given information needs, produce information Includes, e.g. web search, question answering, and classic IR Common case: web search 63 q = “Apple Computers”
64
Feature-Based Ranking q = “Apple Computers”
65
Learning to Rank. Setup; optimize, e.g.: … lots of variants are possible! 65
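As one of the many possible variants, here is a sketch of a pairwise ranking update: whenever a less relevant document scores at least as high as a more relevant one, shift the weights toward the better document's features. The dict-based feature representation is an assumption.

```python
def pairwise_rank_update(w, f_better, f_worse, lr=0.1):
    """One pairwise learning-to-rank update (many variants are possible).

    f_better, f_worse: feature dicts for the more and less relevant documents.
    If the better document does not outscore the worse one under w . f,
    move w toward the better document and away from the worse one.
    """
    score = lambda feats: sum(w.get(k, 0.0) * v for k, v in feats.items())
    if score(f_better) <= score(f_worse):
        for k, v in f_better.items():
            w[k] = w.get(k, 0.0) + lr * v
        for k, v in f_worse.items():
            w[k] = w.get(k, 0.0) - lr * v
    return w
```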
66
Information Extraction. Unstructured text to database entries. SOTA: perhaps 70% accuracy for multi-sentence templates, 90%+ for single easy fields. Example: "New York Times Co. named Russell T. Lewis, 45, president and general manager of its flagship New York Times newspaper, responsible for all business-side activities. He was executive vice president and deputy general manager. He succeeds Lance R. Primis, who in September was named president and chief operating officer of the parent." Extracted records (State | Post | Company | Person): start | president and CEO | New York Times Co. | Lance R. Primis; end | executive vice president | New York Times newspaper | Russell T. Lewis; start | president and general manager | New York Times newspaper | Russell T. Lewis. 66
67
HMMs for Information Extraction 67
68
Document Understanding? Question Answering: More than search Ask general comprehension questions of a document collection Can be really easy: “What’s the capital of Wyoming?” Can be harder: “How many US states’ capitals are also their largest cities?” Can be open ended: “What are the main issues in the global warming debate?” SOTA: Can do factoids, even when text isn’t a perfect match 68
69
Problem: Ambiguities Headlines: Teacher Strikes Idle Kids Hospitals Are Sued by 7 Foot Doctors Ban on Nude Dancing on Governor’s Desk Iraqi Head Seeks Arms Local HS Dropouts Cut in Half Juvenile Court to Try Shooting Defendant Stolen Painting Found by Tree Kids Make Nutritious Snacks Why are these funny?
70
Syntactic Analysis Hurricane Emily howled toward Mexico 's Caribbean coast on Sunday packing 135 mph winds and torrential rain and causing panic in Cancun, where frightened tourists squeezed into musty shelters. 70 [demo]
71
PCFGs. Natural language grammars are very ambiguous! PCFGs are a formal probabilistic model of trees. Each "rule" has a conditional probability (like an HMM). A tree's probability is the product of all rules used. Parsing: given a sentence, find the best tree, a search problem! Example rules with counts: ROOT → S (375/420), S → NP VP . (320/392), NP → PRP (127/539), VP → VBD ADJP (32/401), … 71
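Since a tree's probability is just the product of the probabilities of the rules it uses, scoring a candidate parse is a small recursion. The tree and rule-table representations below are assumptions for illustration; the example counts in the comment come from the slide.

```python
def tree_probability(tree, rule_probs):
    """Probability of a parse tree under a PCFG: the product of the
    conditional probabilities of every rule used in the tree.

    tree: (label, [children]) where each child is a subtree or a word/tag string
    rule_probs: dict mapping (parent, (child labels...)) -> probability
    Both representations are assumptions for illustration.
    """
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    p = rule_probs[(label, rhs)]
    for c in children:
        if not isinstance(c, str):
            p *= tree_probability(c, rule_probs)
    return p

# Using the counts from the slide (hypothetical example):
# rule_probs = {("ROOT", ("S",)): 375/420, ("S", ("NP", "VP", ".")): 320/392,
#               ("NP", ("PRP",)): 127/539, ("VP", ("VBD", "ADJP")): 32/401}
# tree_probability(("ROOT", [("S", [("NP", ["PRP"]),
#                                   ("VP", ["VBD", "ADJP"]), "."])]), rule_probs)
```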
72
Reference Resolution. The Weir Group, whose headquarters is in the U.S., is a large, specialized corporation. This power plant, which will be situated in Jiangsu, has a large generation capacity. 72
73
Summarization Condensing documents Single or multiple Extractive or synthetic Aggregative or representative Even just shortening sentences Very context-dependent! An example of analysis with generation
74
Machine Translation. SOTA: much better than nothing, but more an understanding aid than a replacement for human translators. New, better methods. (Original text → translated text) [demo] 74
75
Corpus-Based MT. Modeling correspondences between languages. Sentence-aligned parallel corpus: "Yo lo haré mañana" / "I will do it tomorrow"; "Hasta pronto" / "See you soon"; "Hasta pronto" / "See you around". Machine translation system (model of translation): novel sentence "Yo lo haré pronto" → candidate translations "I will do it soon", "I will do it around", "See you tomorrow".
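A toy illustration of extracting word-level correspondence statistics from a sentence-aligned corpus; real systems use alignment models such as IBM Model 1 and phrase extraction, so treat this purely as a sketch of the "model correspondences from parallel text" idea.

```python
from collections import defaultdict

def cooccurrence_probs(parallel_corpus):
    """Crude word-level correspondence statistics from a sentence-aligned
    parallel corpus (a stand-in for a real alignment model).

    parallel_corpus: list of (source_sentence, target_sentence) string pairs
    """
    counts = defaultdict(lambda: defaultdict(int))
    for src, tgt in parallel_corpus:
        for s in src.split():
            for t in tgt.split():
                counts[s][t] += 1          # count every co-occurring word pair
    # normalize into p(target word | source word)
    return {s: {t: c / sum(ts.values()) for t, c in ts.items()}
            for s, ts in counts.items()}

# e.g. cooccurrence_probs([("Yo lo haré mañana", "I will do it tomorrow"),
#                          ("Hasta pronto", "See you soon")])
```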
76
Levels of Transfer
77
MT Overview 77
78
A Phrase-Based Model: segmentation, translation, distortion. 78
79
A Phrase-Based Decoder Probabilities at each step include LM and TM 79
80
Search for MT
81
Etc: Historical Change Change in form over time, reconstruct ancient forms, phylogenies … just an example of the many other kinds of models we can build
82
Want to Know More? Check out the Berkeley NLP Group: nlp.cs.berkeley.edu 82
83
Learning MT Models: Phrase-Level Model vs. Syntax-Level Model (VP/PP tree fragments). 83
84
MT from Monotext Source Text Target Text Translation without parallel text? 84
85
Output 85
86
Translation: Codebreaking? “Also knowing nothing official about, but having guessed and inferred considerable about, the powerful new mechanized methods in cryptography—methods which I believe succeed even when one does not know what language has been coded—one naturally wonders if the problem of translation could conceivably be treated as a problem in cryptography. When I look at an article in Russian, I say: ‘This is really written in English, but it has been coded in some strange symbols. I will now proceed to decode.’ ” Warren Weaver (1955:18, quoting a letter he wrote in 1947)
87
Machine Translation
88
Computational Linguistics
89
Recap: Classification Classification systems: Supervised learning Make a rational prediction given evidence We’ve seen several methods for this Useful when you have labeled data (or can get it) 89
90
Clustering Clustering systems: Unsupervised learning Detect patterns in unlabeled data E.g. group emails or search results E.g. find categories of customers E.g. detect anomalous program executions Useful when you don't know what you're looking for Requires data, but no labels Often get gibberish 90
91
Clustering Basic idea: group together similar instances Example: 2D point patterns What could “similar” mean? One option: small (squared) Euclidean distance 91
92
K-Means An iterative clustering algorithm Pick K random points as cluster centers (means) Alternate: Assign data instances to closest mean Assign each mean to the average of its assigned points Stop when no points’ assignments change 92
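The algorithm above translates almost line for line into code; this sketch uses squared Euclidean distance and stops when the assignments stop changing.

```python
import numpy as np

def kmeans(X, k, iters=100, rng=None):
    """Plain k-means: pick k random points as means, then alternate
    (1) assign each point to its closest mean, (2) move each mean to the
    average of its assigned points, until assignments stop changing."""
    rng = rng or np.random.default_rng()
    X = np.asarray(X, dtype=float)
    means = X[rng.choice(len(X), size=k, replace=False)].copy()
    prev = None
    for _ in range(iters):
        # distances from every point to every mean
        d = np.linalg.norm(X[:, None, :] - means[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        if prev is not None and np.array_equal(assign, prev):
            break
        prev = assign
        for j in range(k):
            if np.any(assign == j):
                means[j] = X[assign == j].mean(axis=0)
    return means, assign
```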
93
K-Means Example 93
94
Example: K-Means [web demos] http://www.cs.tu-bs.de/rob/lehre/bv/Kmeans/Kmeans.html http://www.cs.washington.edu/research/imagedatabase/demo/kmcluster/ 94
95
K-Means as Optimization. Consider the total distance of the points to their assigned means: φ(a, c) = Σ_i ||x_i − c_{a_i}||², where the x_i are the points, a gives the assignments, and c gives the means. Each iteration reduces φ. Two stages each iteration: update assignments (fix means c, change assignments a) and update means (fix assignments a, change means c). 95
96
Phase I: Update Assignments. For each point, re-assign it to the closest mean: a_i ← argmin_j ||x_i − c_j||². This can only decrease the total distance φ! 96
97
Phase II: Update Means. Move each mean to the average of its assigned points: c_j ← mean of {x_i : a_i = j}. This also can only decrease the total distance… (Why?) Fun fact: the point y with minimum squared Euclidean distance to a set of points {x} is their mean. 97
98
Initialization K-means is non-deterministic Requires initial means It does matter what you pick! What can go wrong? Various schemes for preventing this kind of thing: variance-based split / merge, initialization heuristics 98
99
K-Means Getting Stuck A local optimum: Why doesn’t this work out like the earlier example, with the purple taking over half the blue? 99
100
K-Means Questions Will K-means converge? To a global optimum? Will it always find the true patterns in the data? If the patterns are very very clear? Will it find something interesting? Do people ever use it? How many clusters to pick? 100
101
Clustering for Segmentation Quick taste of a simple vision algorithm Idea: break images into manageable regions for visual processing (object recognition, activity detection, etc.) http://www.cs.washington.edu/research/imagedatabase/demo/kmcluster/ 101
102
Representing Pixels. Basic representation of pixels: 3-dimensional color vector. Ranges: r, g, b in [0, 1]. What will happen if we cluster the pixels in an image using this representation? Improved representation for segmentation: 5-dimensional vector (r, g, b, x, y). Ranges: x in [0, M], y in [0, N]. Bigger M, N makes position more important. How does this change the similarities? Note: real vision systems use more sophisticated encodings which can capture intensity, texture, shape, and so on. 102
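A sketch of building the 5-dimensional per-pixel vectors; the pos_scale knob, standing in for the "bigger M, N" point above, is an assumption about how one might weight position against color.

```python
import numpy as np

def pixel_features(image, pos_scale=1.0):
    """Build the 5-dimensional per-pixel representation (r, g, b, x, y).

    image: H x W x 3 array with r, g, b in [0, 1]
    pos_scale: how much weight position gets relative to color; larger values
    make nearby pixels cluster together more (an assumed knob).
    """
    h, w, _ = image.shape
    ys, xs = np.mgrid[0:h, 0:w]
    feats = np.concatenate(
        [image.reshape(-1, 3),
         pos_scale * xs.reshape(-1, 1),
         pos_scale * ys.reshape(-1, 1)],
        axis=1)
    return feats   # feed these rows to k-means to get a segmentation
```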
103
K-Means Segmentation Results depend on initialization! Why? Note: best systems use graph segmentation algorithms 103
104
Clustering Application 104 Top-level categories: supervised classification Story groupings: unsupervised clustering
105
Motion as Search Motion planning as path-finding problem Problem: configuration space is continuous Problem: under-constrained motion Problem: configuration space can be complex Why are there two paths from 1 to 2? 105 [demo]
106
Probabilistic Roadmaps Idea: just pick random points as nodes in a visibility graph This gives probabilistic roadmaps Very successful in practice Lets you add points where you need them If insufficient points, incomplete or weird paths 106
107
Policy Search 107
108
Policy Search Problem: often the feature-based policies that work well aren’t the ones that approximate V / Q best E.g. your value functions from project 2 were probably horrible estimates of future rewards, but they still produced good decisions We’ll see this distinction between modeling and prediction again later in the course Solution: learn the policy that maximizes rewards rather than the value that predicts rewards This is the idea behind policy search, such as what controlled the upside-down helicopter 108 [demo]
109
Policy Search* Advanced policy search: Write a stochastic (soft) policy, e.g. one whose action probabilities increase smoothly with the action's feature score. It turns out you can efficiently approximate the derivative of the returns with respect to the parameters w (details in the book, but you don't have to know them). Take uphill steps, recalculate derivatives, etc. 109
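A sketch of what "soft policy plus uphill steps" can look like, using a softmax-over-features policy and a REINFORCE-style gradient estimate; this is one standard instantiation, not necessarily the exact derivation in the book.

```python
import numpy as np

def softmax_policy(w, s, actions, f):
    """Stochastic (soft) policy: P(a|s) proportional to exp(w . f(s, a)).
    f(s, a) returns a feature vector (np.ndarray); an assumed representation."""
    scores = np.array([w @ f(s, a) for a in actions])
    p = np.exp(scores - scores.max())
    return p / p.sum()

def policy_gradient_step(w, episodes, actions, f, lr=0.01):
    """One uphill step on expected return (REINFORCE-style estimate).

    episodes: list of trajectories, each a list of (state, action, return) tuples.
    """
    grad = np.zeros_like(w)
    for ep in episodes:
        for s, a, G in ep:
            p = softmax_policy(w, s, actions, f)
            # grad log pi(a|s) = f(s,a) - E_p[f(s,.)] for a softmax policy
            expected_f = sum(pi * f(s, b) for pi, b in zip(p, actions))
            grad += G * (f(s, a) - expected_f)
    return w + lr * grad / max(len(episodes), 1)
```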
110
Object Recognition Query Template 110
111
Comparing Local Regions 111
112
Shape Context Count the number of points inside each bin, e.g.: Count = 4 Count = 10... Compact representation of distribution of points relative to each point 112
113
Shape Context 113
114
Similar Regions Not Quite... Color indicates similarity using Geometric Blur Descriptor 114
115
Match for Image Similarity 115