ARTIFICIAL INTELLIGENCE: THE MAIN IDEAS Nils J. Nilsson OLLI COURSE SCI 102 Tuesdays, 11:00 a.m. – 12:30 p.m. Winter Quarter, 2013 Higher Education Center, Medford Room Course Web Page: For Information about parking near the HEC, go to: There are links on that page to parking rules and maps
AI in the News ?
Perception Action Selection Memory PART ONE (Continued) REACTIVE AGENTS
Summary: Neural Networks Have Many Applications
But Some Are Not Very User-Friendly Fair Isaac Experience
Models of the Cortex Using Deep, Hierarchical Neural Networks All connections are bi-directional
The Neocortex
Two Pioneers in Using Networks to Model the Cortex Geoffrey Hinton Jeff Hawkins Hierarchical Temporal Memory
More About Jeff Hawkins's Ideas overview/education/HTM_CorticalLearningAlgorithms.pdf
Dileep George's Hierarchical Temporal Memory (HTM) Model A Convolutional Network George is a founder of the startup Vicarious
A Mini-Column of the Neo-Cortex From: HIERARCHICAL TEMPORAL MEMORY
Figure 10. Columnar organization of the microcircuit. George, Dileep and Hawkins, Jeff: (2009) Towards a Mathematical Theory of Cortical Micro-circuits. PLoS Comput Biol 5(10): e doi: /journal.pcbi
Figure 9. A laminar biological instantiation of the Bayesian belief propagation equations used in the HTM nodes. George D, Hawkins J (2009) Towards a Mathematical Theory of Cortical Micro-circuits. PLoS Comput Biol 5(10): e doi: /journal.pcbi
Ray Kurzweil's New Book
Unsupervised Learning
Letting Networks Adapt to Their Inputs All connections are bi-directional Massive number of inputs Weight Values Become Those For Extracting Features of Inputs Honglak Lee, et al., Convolutional Deep Belief Networks for Scalable Unsupervised Learning of Hierarchical Representations, Proceedings of the 26th Annual International Conference on Machine Learning, 2009
Hubel & Wiesel's Detector Neurons Short bar of light projected onto a cat's retina Response of a single neuron in the cat's visual cortex (as detected by a micro-electrode in the anaesthetized cat) David Hubel, Torsten Wiesel
Use of Deep Networks With Unsupervised Learning First Layer Learns Building-Block Features Common to Many Images All connections are bi-directional
Second Layer Learns Features Common Just to Cars, Faces, Motorbikes and Airplanes cars, faces, motorbikes, airplanes
Third Layer Learns How to Combine the Features of the Second Layer Into a Representation of the Input cars, faces, motorbikes, airplanes
Output Layer Can be Used to Make a Decision CAR
The Net Can Make Predictions About Unseen Parts of the Input
Building High-level Features Using Large Scale Unsupervised Learning Quoc V. Le, et al. (Google and Stanford) 1,000 Google Computers, 1,000,000,000 Connections
Large Scale Unsupervised Learning (Continued) 10 million 200x200 pixel images downloaded from the Internet (stills from YouTube) Unsupervised learning for three days Recognizes 22,000 object categories a face neuron a cat neuron
One Result 81.7% accuracy in detecting faces out of 13,026 faces in a test set For more information about these experiments at Google/Stanford, see:
Using Models (i.e., Memory) Can Make Agents Even More Intelligent Perception Action Selection Model of World (e.g., a map)
Types of Models Maps Memory of Previous States List of State-Action Pairs
Models can be pre-installed or learned
where am I? where is everything else? Learning and Using Maps Neato Robot Vacuum
Neato Robotics Mapping System
NEATO ROBOTICS XV11
S-R Rules Using State of the Agent Perception Action Selection determines the state of the world Library of States and Actions (Memory) IF state 1, THEN action a IF state 2, THEN action b...
Ways to Represent States Lists of numbers, such as (1,7,3,4,6) Arrays, such as Statements, such as Color(Walls, LightBlue) Shape(Rectangular)...
Library of States & Actions (1,7,3,4,6) a (1,6,2,8,7) b (4,5,1,8,5) c... (7,4,8,9,2) k (1,5,2,8,6) Input (present state) Closest Match
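A minimal sketch of the closest-match lookup on this slide, in Python. It uses only the four library rows that are legible on the slide (the "..." rows are omitted), and it assumes straight-line (Euclidean) distance as the measure of closeness; the slide does not say which measure is used. The function name `closest_action` is hypothetical.

```python
import math

# The legible rows of the slide's library: (state, action) pairs.
LIBRARY = [
    ((1, 7, 3, 4, 6), "a"),
    ((1, 6, 2, 8, 7), "b"),
    ((4, 5, 1, 8, 5), "c"),
    ((7, 4, 8, 9, 2), "k"),
]


def closest_action(state, library=LIBRARY):
    """Return the action stored with the library state nearest to `state`.

    Assumption: "closest" means smallest Euclidean distance.
    """
    _, action = min(library, key=lambda pair: math.dist(pair[0], state))
    return action


# The input (present state) shown on the slide.
print(closest_action((1, 5, 2, 8, 6)))
```

Under that distance assumption the input (1,5,2,8,6) differs from (1,6,2,8,7) by only one unit in two positions, so the lookup selects action b.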
Example: Face Recognition Using a large database containing many, many images of faces, a small set of building-block faces is computed: The average of all faces:
Familiar Uses of Building Blocks A Musical Tone Consists of Harmonics
Library of Known Faces (Represented as composites of the building-block faces)
Sam: (0,0,1,0,0,-2,-2,0,-1,-2,-2,-1,2,-1,0)
Joe: (2,2,-2,0,0,1,2,2,-1,2,2,-1,0,2,0)
Sue: (-3,2,1,1,-2,1,-2,3,0,0,0,-4,-3,2,-2)
Mike: (4,1,3,-1,4,0,4,4,1,4,4,-4,4,-4,-4)
Plus Thousands More
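To make "composite of building-block faces" concrete, here is a toy sketch in Python. It is not the method used to produce the slides: the four-pixel "faces", the average face, and the two building blocks are invented for illustration, and the building blocks are assumed to be orthonormal so that each weight is a simple dot product. The names `project` and `reconstruct` are hypothetical.

```python
def project(face, mean, blocks):
    """Describe a face by one weight per building block.

    Assumption: the blocks are orthonormal, so each weight is the dot
    product of the mean-subtracted face with that block.
    """
    centered = [f - m for f, m in zip(face, mean)]
    return [sum(c * b for c, b in zip(centered, block)) for block in blocks]


def reconstruct(weights, mean, blocks):
    """Rebuild a face as the average face plus weighted building blocks."""
    face = list(mean)
    for weight, block in zip(weights, blocks):
        face = [p + weight * b for p, b in zip(face, block)]
    return face


# Invented toy data: four-pixel "images" and two building blocks.
MEAN = [5.0, 5.0, 5.0, 5.0]
BLOCKS = [
    [0.5, 0.5, -0.5, -0.5],
    [0.5, -0.5, 0.5, -0.5],
]

face = [7.0, 5.0, 5.0, 3.0]
weights = project(face, MEAN, BLOCKS)
print(weights)
print(reconstruct(weights, MEAN, BLOCKS))
```

The short list of weights plays the role of the 15-number vectors on the slide: it is all that needs to be stored for each known face.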
Face Recognition
Query Face, represented as a composite of the building-block faces (present state): (-2,2,1,1,-2,1,-2,3,1,0,0,-4,-3,2,-2)
Library of Known Faces
Sam: (0,0,1,0,0,-2,-2,0,-1,-2,-2,-1,2,-1,0)
Joe: (2,2,-2,0,0,1,2,2,-1,2,2,-1,0,2,0)
Sue: (-3,2,1,1,-2,1,-2,3,0,0,0,-4,-3,2,-2)
Mike: (4,1,3,-1,4,0,4,4,1,4,4,-4,4,-4,-4)
Sue is the Closest Match
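The slide's claim can be checked with a few lines of Python. This sketch makes three assumptions: the four vectors belong to Sam, Joe, Sue, and Mike in the order the names are listed; the doubled commas in Joe's vector are typing slips rather than missing numbers; and closeness is Euclidean distance. The function name `rank_matches` is hypothetical.

```python
import math

# Name-to-vector pairing assumed from the order on the slide.
KNOWN_FACES = {
    "Sam": (0, 0, 1, 0, 0, -2, -2, 0, -1, -2, -2, -1, 2, -1, 0),
    "Joe": (2, 2, -2, 0, 0, 1, 2, 2, -1, 2, 2, -1, 0, 2, 0),
    "Sue": (-3, 2, 1, 1, -2, 1, -2, 3, 0, 0, 0, -4, -3, 2, -2),
    "Mike": (4, 1, 3, -1, 4, 0, 4, 4, 1, 4, 4, -4, 4, -4, -4),
}

QUERY = (-2, 2, 1, 1, -2, 1, -2, 3, 1, 0, 0, -4, -3, 2, -2)


def rank_matches(query, known=KNOWN_FACES):
    """Return (name, distance) pairs, nearest first (Euclidean distance assumed)."""
    scored = [(name, math.dist(vector, query)) for name, vector in known.items()]
    return sorted(scored, key=lambda item: item[1])


for name, distance in rank_matches(QUERY):
    print(f"{name}: {distance:.2f}")
```

The query differs from Sue's vector by one unit in just two of the fifteen positions, which is why Sue comes out first in the ranking.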
Another Kind of Model A table of states and actions and values State and Action, Value: State 1, Action b State 1, Action c 13 7 State 2, Action g State 2, Action h State 2, Action j State 3, Action m State 3, Action n 2626
Why have values for multiple actions instead of just noting the best action? Because the values in the table can be changed (learned) depending on experience! REINFORCEMENT LEARNING (Another Point of Contact with Brains)
Pioneers in the Use of Reinforcement Learning in AI Andy Barto, Rich Sutton, Chris Watkins
An Example: Learning a Maze
But the Mouse Doesn't Have a Map of the Maze (Like We Do) Instead it remembers the states it visits and assigns their actions random initial values State and Action, Value: State 1, up: 3 State 2, left State 2, down State 2, right (add more when encountered)
It Can Change the Values in the Table The First Step (state 1, up) gets initial random value 3
There is only one action possible (up), and the mouse ends up in state 2. State 2 has 3 actions, each with an initial random value.
Now the mouse updates the value of (state 1, up) in its table 5 value propagates backward (possibly with some loss)
Sooner or later, the mouse stumbles into the goal and gets a reward
The reward value is propagated backward value propagates backward (with some loss) 99
And So On... With a Lot of Exploration, the Mouse Learns the Maze
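The maze walkthrough above can be sketched as a short Python program. The slides do not give an exact update rule, so this sketch swaps in the standard tabular Q-learning rule associated with Chris Watkins. Everything specific here is an assumption: the four-state maze (it is not the maze pictured), the reward of 100, the discount of 0.9 standing in for "some loss", the learning rate, and the exploration probability.

```python
import random

# Hypothetical four-state maze (not the one pictured on the slides):
# each state maps its available actions to the state that action leads to.
MAZE = {
    1: {"up": 2},
    2: {"left": 3, "down": 1, "right": 4},
    3: {"right": 2},  # dead end
    4: {"left": 2, "up": "goal"},
}
REWARD = 100.0  # assumed reward for stumbling into the goal
DISCOUNT = 0.9  # assumed "loss" when a value propagates backward
RATE = 0.5      # assumed learning rate
EXPLORE = 0.2   # assumed probability of trying a random action


def learn(episodes=200, seed=0):
    """Run the mouse through the maze repeatedly and return its value table."""
    rng = random.Random(seed)
    values = {}  # (state, action) -> value; rows are added when encountered

    def actions(state):
        # Give each newly encountered (state, action) a random initial value.
        for action in MAZE[state]:
            if (state, action) not in values:
                values[(state, action)] = rng.uniform(0.0, 5.0)
        return list(MAZE[state])

    for _ in range(episodes):
        state = 1
        for _step in range(10_000):  # safety cap on steps per episode
            choices = actions(state)
            if rng.random() < EXPLORE:
                action = rng.choice(choices)
            else:
                action = max(choices, key=lambda a: values[(state, a)])
            nxt = MAZE[state][action]
            if nxt == "goal":
                target = REWARD
            else:
                # The best value in the next state, reduced by the discount.
                target = DISCOUNT * max(values[(nxt, a)] for a in actions(nxt))
            # Move the stored value part of the way toward the target.
            values[(state, action)] += RATE * (target - values[(state, action)])
            if nxt == "goal":
                break
            state = nxt
    return values


def greedy_path(values, start=1):
    """Follow the highest-valued action from each state until the goal."""
    path, state = [], start
    while state != "goal" and len(path) < 20:
        action = max(MAZE[state], key=lambda a: values[(state, a)])
        path.append(action)
        state = MAZE[state][action]
    return path


table = learn()
print(greedy_path(table))
```

Keeping a value for every action, rather than only the current best one, is what lets the table keep changing: an action that looked poor at first can overtake the others once reward has propagated back to it.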
Reinforcement Learning in Animals Reward Centers Alter Dopamine Concentrations
The Brain's Reward Centers Associate Values With States When the Actual Value of a State is Better Than the Predicted Value, Dopamine is Released; When Worse Than Expected, Dopamine is Inhibited. A Neural Substrate of Prediction and Reward, Wolfram Schultz, Peter Dayan, P. Read Montague, Science, 275 : , 14 March
Learning to Flip Pancakes By Sylvain Calinon, Advanced Robotics Department Italian Institute of Technology
Learning to Fly a Model Helicopter
TD-Gammon Temporal Difference Learning and TD-Gammon, by Gerald Tesauro Inputs: backgammon boards resulting from a move from the current board Outputs: predicted probabilities of winning (the values of the input boards) Training: train the neural network so that the value of the current board is closer to the value of the best next board
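The training idea on this slide can be sketched in a few lines of Python. This is a simplification, not Tesauro's implementation: TD-Gammon adjusts the weights of a neural network, whereas here a plain dictionary of board values stands in for the network. The board labels, the starting values, the step size, and the function name `td_update` are all hypothetical.

```python
def td_update(value, current, successors, rate=0.1):
    """Nudge the current board's value toward its best successor's value.

    `value` maps board labels to predicted probabilities of winning.
    A dictionary stands in for TD-Gammon's neural network (assumption).
    """
    best_next = max(successors, key=lambda board: value[board])
    value[current] += rate * (value[best_next] - value[current])
    return best_next


# Hypothetical boards: the current one and two reachable by a legal move.
value = {"now": 0.50, "after_move_1": 0.40, "after_move_2": 0.80}
chosen = td_update(value, "now", ["after_move_1", "after_move_2"])
print(chosen, round(value["now"], 2))
```

Repeated over many games, updates of this kind pull each board's predicted chance of winning toward the outcomes that actually follow from it.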
Summary of TD-Gammon Results
Bottom Line: Reactive Agents Can Be Quite Effective! See: Daniel Kahneman, Thinking, Fast and Slow Perception Action Selection Model of World
An Interesting Novel About Neural Networks Galatea 2.2, Richard Powers