Autonomous Motion Learning for Near Optimal Control By Alan Jennings School of Engineering, University of Dayton Dayton, OH, August 2012 Dissertation defense.

Autonomous Motion Learning for Near Optimal Control By Alan Jennings School of Engineering, University of Dayton Dayton, OH, August 2012 Dissertation defense in partial fulfillment of the requirements for the degree of doctor of philosophy in electrical engineering

Motivation
Consider human learning:
– An intelligent system is able to solve new problems and become an expert
Consider computer accomplishments:
– Computers have beaten chess and Jeopardy! grandmasters
– But one system cannot be repurposed for the other
Consider general purpose learning:
– People can grow up to be presidents, design fashion, play croquet, identify liars, train animals, predict weather…
– This has not yet been accomplished
The foundation for general purpose learning is a developmental framework:
– Shaped by environment and experiences
– Complex value systems guide learning
– The infant stage restricts exploration until basic skills are established
IBM's Deep Blue beat Kasparov in their second match in 1997
IBM's Watson beat Jennings and Rutter in 2011
In 2011, Google gained Nevada licenses for self-driving cars
Google Prius image: Flickr user Steve Jurvetson
Alan Jennings, Dissertation Defense, July 2012

Context in Developmental Learning
Developmental learning seeks to mimic the progressive learning process
– Infant -> Toddler -> Child -> Young adult -> …
– The solution/knowledge should be unguided by the programmer
Learning basic tasks supports learning high-level tasks
– Proverbial walking before running
The robot then learns general tasks of increasing complexity at increasing proficiency
– Does not require reasoning/understanding/consciousness

My Contributions
Autonomous motion learning:
General purpose rigid body motion optimization
– Provides a novel high-level interface at the robot geometry level
– Allows novice roboticists or computers to design motions
– However, has high computation requirements
Optimal inverse functions from a global search
– Organizes motions into continuous, optimal inverse functions
– Provides a set of reflexive responses for use online
– Efficiently searches a high-dimensional space using agents and local gradients
Improving motions by unbounded resolution
– Nodes are added to an interpolation, approaching the optimal continuous function in the limit
– Efficiently collects and “understands” experiences
– Motions are not limited by the initial programming resolution or the initial training time

Motivating Example
Use of general purpose programs to solve control problems
– Use a CAD package to draw the robot
– Use a kinematic program for the equations of motion
– Use an optimal control program to solve
The optimal control problem is introduced
– Finding the input with the lowest cost among inputs satisfying the constraints

Motivating Example
Use of general purpose programs for solving control problems
The optimal control problem
– Finding the control input with the lowest cost among inputs satisfying the constraints
[Workflow diagram labels: Draft project; Dynamics (mass & joints, set up Simulink); Optimal Control (set up DIDO). Questions shown: What does it look like? What are the controls? What is trying to be done?]
Human creativity comes in at the design level, not the optimization.

Motivating Example
Use of general purpose programs for solving control problems
The optimal control problem
– Finding the control input with the lowest cost among inputs satisfying the constraints
Typically solved by discretizing over time
– Optimizes a set of variables, not the continuous function
– Local search method
– Applies to an isolated problem: changing the final value requires a new optimization
[Figure: General Optimal Control Problem, with labels $x(t), u(t) \to g(t)$, $\psi_o$, $\psi_f$, $\phi$, $J$, $x_o$, $x_f$, $X_o$, $X_f$]

Motivating Example
Motion primitive example problem:
– System: Pendulum actuated at base
– Cost: Torque squared, $J = \int u(\tau)^2 \, d\tau$
– Output: Initial disturbance, $y = \theta(t_0)$
– Constraints: Reach final value, $\theta(t_f) = 0$; Saturation, $-u_{max} \le u(t) \le u_{max}$
The way forward
– If the system dynamics and initial state are repeatable, then the problem is really only to find a control signal.
– Continuous signals can be approximated by parameterization, so motion primitives can be composed solely by a vector function of an output.
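
The parameterization idea on this slide can be sketched in Python. This is only an illustration under stated assumptions, not the dissertation's implementation: the piecewise-linear interpolation, the evenly spaced nodes, the unit time horizon, and the names `control_signal` and `torque_cost` are all hypothetical. It shows how a parameter vector `a` (at least two nodes) defines a saturated control signal u(t), and how the cost J = ∫ u(τ)² dτ could be evaluated numerically.

```python
def control_signal(a, t, t_f=1.0, u_max=1.0):
    """Piecewise-linear u(t) through evenly spaced node values a (assumed form).

    The result is clipped to [-u_max, u_max] to model actuator saturation.
    """
    n = len(a)
    # Map t in [0, t_f] onto the node index range [0, n - 1].
    s = min(max(t / t_f, 0.0), 1.0) * (n - 1)
    i = min(int(s), n - 2)          # left node of the active segment
    frac = s - i                    # position inside the segment, 0..1
    u = (1.0 - frac) * a[i] + frac * a[i + 1]
    return max(-u_max, min(u_max, u))


def torque_cost(a, t_f=1.0, u_max=1.0, steps=1000):
    """Midpoint-rule approximation of J = integral of u(t)^2 over [0, t_f]."""
    dt = t_f / steps
    total = 0.0
    for k in range(steps):
        u = control_signal(a, (k + 0.5) * dt, t_f, u_max)
        total += u * u * dt
    return total


if __name__ == "__main__":
    print(torque_cost([0.5, 0.5, 0.5]))   # constant torque of 0.5
    print(torque_cost([0.0, 1.0]))        # linear ramp from 0 to 1
```

A motion primitive in this picture would be a function returning the vector `a` for each desired output y; raising the number of nodes in `a` is what later slides call increasing the resolution.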

Diversity and Progression in Motion Primitives
Continuous, optimal inverse function
– Motion primitives should be continuous so that changes in the system behavior are not abrupt
Global search required for discovery
– A global search offers the possibility of finding alternative motion primitives
– Finding isolated optima requires testing candidates that local conditions indicate would give worse performance
Progression via increasing resolution
– After optimizing at a given resolution, the signal is limited because the optimal signal does not lie in the space of the parameterization, so the resolution must be increased to improve performance

Optimal Inverse Functions
High level concept
– The population covers a broad area and uses local gradients to improve.
– Converging agents are removed, so the number of agents quickly drops.
– Settled agents create a motion primitive and use the local gradient to expand to new outputs.
– The operator has a choice of inverse functions to select from, and can use softer criteria for preference.
– The inverse functions are continuous and easily calculated, making them suited for real-time use.
[Flowchart, Optimization: Initialize population -> Move agents (lower $J(x)$, maintain $f(x)$) -> Check for removal or settling conditions -> Form cluster -> Set of $h_k(y_d)$'s]
[Flowchart, Execution: Get $y_d$, evaluate $h_k$ -> Operator selects inverse function $h_k(y_d)$ -> Move to new $x^*$]

Optimal Inverse Functions
Mechanics of the method
Improving a given agent
1. Restrict motion to the null space of the Output gradient
2. Move opposite the Cost gradient
Saturation
– If gradients are large -> limits effect
– If the Cost gradient is small -> small step
– If the Output gradient is small -> ease the null space restriction
Boundary constraint reduces step length
Minimum step for settling
Remove particles that are too close
– Quickly reduces population size
[Figure labels: Output, Cost, Step]
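
The two-step agent update above can be sketched as a projected gradient step. This is a minimal sketch under assumptions: the function name `agent_step`, the fixed step size, the norm cap used for saturation, and the threshold `eps` for a "small" output gradient are all hypothetical choices, and the boundary constraint and settling test from the slide are left out.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def agent_step(x, grad_cost, grad_output, step=0.1, max_step=0.5, eps=1e-12):
    """One improvement step: lower the cost while holding the output (to first order)."""
    g_cost = grad_cost(x)
    g_out = grad_output(x)
    g_out_sq = dot(g_out, g_out)
    if g_out_sq > eps:
        # Remove the component of the cost gradient along the output gradient,
        # so the move stays in the null space of the output gradient.
        scale = dot(g_cost, g_out) / g_out_sq
        direction = [c - scale * o for c, o in zip(g_cost, g_out)]
    else:
        # Output gradient is negligible: ease the null-space restriction.
        direction = list(g_cost)
    # Move opposite the (projected) cost gradient.
    move = [-step * d for d in direction]
    # Saturation (assumed form): cap the step length when gradients are large.
    length = math.sqrt(dot(move, move))
    if length > max_step:
        move = [m * max_step / length for m in move]
    return [xi + mi for xi, mi in zip(x, move)]


if __name__ == "__main__":
    # Hypothetical test functions: quadratic cost, linear output.
    cost_grad = lambda x: [2.0 * x[0], 2.0 * x[1]]      # J = x0^2 + x1^2
    output_grad = lambda x: [1.0, 1.0]                   # f = x0 + x1
    print(agent_step([2.0, 0.0], cost_grad, output_grad))
```

With a linear output the projected step leaves the output unchanged exactly; for a nonlinear output it holds only to first order, so a correction along the output gradient would be needed as well.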

Optimal Inverse Functions
Mechanics of the method
Form a cluster of optimal points
1. Change the output by moving along the Output gradient
2. Repeat the optimizing steps
Test for continuity/optimality
– Output changes in the expected direction
– Not too far (discontinuity)
– Not too close (ill conditioned surface)
– Settled (optimality satisfied)
[Figure labels: Decreasing $y_d$, Increasing $y_d$]
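
The cluster-forming loop above can be sketched as follows. This is an illustrative sketch, not the author's algorithm: the names `settle` and `trace_inverse`, the fixed iteration count used in place of a settling test, and the `max_jump` threshold are assumptions, and of the four continuity/optimality tests on the slide only "not too far" is implemented.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def settle(x, y_d, cost_grad, output, output_grad, step=0.1, iters=200):
    """Hold the output at y_d while lowering the cost (assumed simple scheme)."""
    x = list(x)
    for _ in range(iters):
        g_out = output_grad(x)
        g_sq = dot(g_out, g_out)    # assumed nonzero along the traced branch
        # 1. Change the output by moving along the output gradient.
        err = y_d - output(x)
        x = [xi + err * g / g_sq for xi, g in zip(x, g_out)]
        # 2. Optimizing step: cost gradient with its output-gradient part removed.
        g_cost = cost_grad(x)
        g_out = output_grad(x)
        g_sq = dot(g_out, g_out)
        scale = dot(g_cost, g_out) / g_sq
        x = [xi - step * (c - scale * o) for xi, c, o in zip(x, g_cost, g_out)]
    return x


def trace_inverse(x_start, targets, cost_grad, output, output_grad, max_jump=1.0):
    """Build a cluster of (y_d, x*) pairs, stopping at an apparent discontinuity."""
    cluster = []
    x = list(x_start)
    for y_d in targets:
        x_new = settle(x, y_d, cost_grad, output, output_grad)
        jump = sum((a - b) ** 2 for a, b in zip(x_new, x)) ** 0.5
        if cluster and jump > max_jump:
            break                   # too far: treat as a discontinuity
        cluster.append((y_d, x_new))
        x = x_new
    return cluster


if __name__ == "__main__":
    # Hypothetical test functions: J = x0^2 + x1^2, f = x0 + x1.
    cg = lambda x: [2.0 * x[0], 2.0 * x[1]]
    f = lambda x: x[0] + x[1]
    fg = lambda x: [1.0, 1.0]
    for y_d, x_opt in trace_inverse([0.9, 0.1], [1.0, 1.5, 2.0], cg, f, fg):
        print(y_d, x_opt)
```

For this quadratic-cost, linear-output case the optimum for each desired output y_d is known in closed form, x* = (y_d/2, y_d/2), which makes the sketch easy to check by hand.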

Optimal Inverse Functions
Testing of the method
[Figure panels: Quadratic Cost; Linear/Quadratic Cost; Periodic Cost; Quadratic Cost]
Combination of functions
– Multiple extrema
– Saddle points
– 2-dim for verification
Expected result
– Clusters between output extrema

Optimal Inverse Functions
Testing of the method
[Figure: Quadratic cost, periodic-linear output]

Optimal Inverse Functions
Practical example
Robot control problem
– Precision is dependent on the pose
– Radial precision is optimized via joint angles for varying radial distance
[Images: Planar robot, Motoman HP-3; Complex robot, Motoman IA-20]

Optimal Inverse Functions
Practical example
– Each link has a different radius to the tip and therefore a different sensitivity
– In addition, the direction of sensitivity is different
– The problem effectively finds the joint locations that reduce sensitivity in the radial distance
[Figure: Links are shown by solid arrows. The effective length to the tip is shown by a dashed arrow. The arc showing the sensitivity for a joint is matched by color.]

Optimal Inverse Functions
Practical example
– Output is adjusted as desired (additional task of finding the angle of the plane and the in-plane angle)
– Operator selects an inverse function

Optimal Inverse Functions
The method searches a large space efficiently by:
– Having agents congregate to locally optimal solutions (increasing the effective search area of each), and
– Eliminating neighboring points (once the locations of optima are sketched out, fewer agents are needed).
Sets of continuous, optimal inverse functions
– Can be used in real time, and
– Reduce the burden on the operator without reducing optimality

Unbounded Resolution
High level concept
– To have continuous learning, the method must have unbounded resolution.
– Unbounded resolution leads to exponential growth in complexity.
– The method must make efficient use of experience.
[Block diagram labels: Operator or Higher Level Planner, Reflex Function, Cubic Interpolation, System, Memory Model, Optimization]

Unbounded Resolution
Mechanics of the method
[Block diagram labels: Operator or Higher Level Planner, Reflex Function, Cubic Interpolation, System, Memory Model, Optimization]
System assumptions
– $t$ and $a$ are bounded
– $y(a)$ and $J(a)$ are in $C^2$ and constant

Unbounded Resolution
Why cubic interpolation
– Adding a node to a cubic interpolation allows all experiences to be transferred.
– Power series parameters are ill conditioned as the effective area of the basis approaches the extremes.
– Fourier series parameters typically create a less smooth optimization surface.
– The radial basis function scaling parameter is either too small at low resolutions or too large at high resolutions, and automatically changing it means data cannot be mapped exactly.
– Sigmoid neural network parameters are large with respect to the input magnitude, resulting in poor optimization scaling.

Unbounded Resolution
Why locally weighted regression
Locally weighted regression (LWR) performs a least-squared-error regression where the error is scaled by the distance to the test point.
– Local weighting allows global nonlinear behavior
Quadratic regression to accurately model optima
Provides the gradient (and Hessian) for optimization
Directions with insufficient data are identified from eigenvalues
– Allows for autonomously determining which samples must be tested
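
A one-dimensional sketch of locally weighted quadratic regression is given below. It is an assumption-laden illustration: the Gaussian weighting kernel, the `bandwidth` value, and the names `solve` and `lwr_quadratic` are hypothetical, and the eigenvalue test for directions with insufficient data is omitted because it only becomes meaningful in more than one dimension. The fit is centered on the query point so that the coefficients directly give the local value, gradient, and Hessian.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def lwr_quadratic(xs, ys, q, bandwidth=0.5):
    """Weighted quadratic fit around q; returns (value, gradient, hessian) at q."""
    A = [[0.0] * 3 for _ in range(3)]
    b = [0.0] * 3
    for x, y in zip(xs, ys):
        d = x - q
        # Assumed Gaussian kernel: samples near q count more in the squared error.
        w = math.exp(-0.5 * (d / bandwidth) ** 2)
        basis = [1.0, d, d * d]
        for i in range(3):
            b[i] += w * basis[i] * y
            for j in range(3):
                A[i][j] += w * basis[i] * basis[j]
    c0, c1, c2 = solve(A, b)        # model: y ~ c0 + c1*d + c2*d^2
    return c0, c1, 2.0 * c2


if __name__ == "__main__":
    xs = [-1.0 + 0.25 * k for k in range(17)]
    ys = [(x - 1.0) ** 2 + 2.0 for x in xs]        # hypothetical cost samples
    print(lwr_quadratic(xs, ys, 0.5))
```

Because the samples in the usage example come from an exact quadratic, the local fit recovers the value, slope, and curvature at the query point regardless of the weights; with noisy measurements the weighting is what keeps distant samples from distorting the local gradient estimate.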

Unbounded Resolution
Testing of the method
Problem design:
– Cost: (Distance to sine wave) squared, $J_2 = \int \left( u(\tau) - \frac{\sin(2\pi\tau) + 2}{4} \right)^2 d\tau$
– Output: Average value, $y = \int u(\tau) \, d\tau$
– Saturation applied to $u(t)$
Results:
– Sinusoidal shape, saturating at the closer side
Motivation: Possibly internal resonance, distance traveled, material processed, …
Internal limitations
Flattens peaks in the absolute distance -> minimize RMS
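
The test problem's cost and output can be evaluated numerically as in the sketch below. The slide does not state the integration limits or the saturation bounds, so the unit interval [0, 1] and the limits `u_min` and `u_max` are assumptions, as are the names `reference` and `cost_and_output`.

```python
import math


def reference(tau):
    """Target waveform from the slide: (sin(2*pi*tau) + 2) / 4."""
    return (math.sin(2.0 * math.pi * tau) + 2.0) / 4.0


def cost_and_output(u, steps=1000, u_min=0.0, u_max=1.0):
    """Midpoint-rule estimates of the cost and output for a control function u.

    cost   = integral of (u - reference)^2 over [0, 1]   (assumed horizon)
    output = integral of u over [0, 1]
    Saturation limits u_min and u_max are assumed values.
    """
    dt = 1.0 / steps
    cost = 0.0
    out = 0.0
    for k in range(steps):
        tau = (k + 0.5) * dt
        u_sat = max(u_min, min(u_max, u(tau)))     # saturation applied to u(t)
        cost += (u_sat - reference(tau)) ** 2 * dt
        out += u_sat * dt
    return cost, out


if __name__ == "__main__":
    print(cost_and_output(reference))              # following the sine wave exactly
    print(cost_and_output(lambda tau: 0.5))        # constant input at the mean
```

Following the reference exactly gives zero cost but fixes the output at the reference's mean; requesting any other average value forces a trade-off, which is the regime the slide's results describe.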

Unbounded Resolution
Testing of the method
[Figure panels: Near optimal compared to direct optimization; Exponential learning rate; Waveform results]
– The results exploit saturation.
– Going from 4 to 9 nodes, the cost decreases, but the shape appears identical by sight.

Unbounded Resolution
Practical example
Objective:
– Control the motor voltage to spin the motor to a given speed at a set time with the minimum peak current.
Only modifications
– Adjusted parameters for the range of u, y & J
– Increased the measure of data required, to deal with process variation
– Ideal cost based on steady state
[Block diagram labels: Voltage out, Amplifier, Motor, Tachometer, Current Peak Detector; the system is marked "Unknown to Method". Note: sampled after the run, does not need to be sampled continuously.]

Unbounded Resolution
Practical example
Completely automated
Progressive improvement
Sizable variation
– Direct optimization on an average of 10 trials still did not converge
– However, LWR provided a sufficiently accurate estimate of the gradients to converge
Thirteen sets of data
– Multiple runs gave similar results

Unbounded Resolution
Practical example
7 dim in 17 hours
– About 40,000 samples
– Method parameters were not optimized
Results make sense
– Final voltage determines output
– Initial voltage very similar
– Initial slope flattens

My Contributions
Related publications and presentations:
Journal submissions
– “Unbounded Motion Optimization by Developmental Learning,” revision submitted to IEEE Systems, Man and Cybernetics Part B
– “Optimal Inverse Functions Created via Population Based Optimization,” submitted to IEEE Systems, Man and Cybernetics Part B
Conference presentations
– “Memory-Based Motion Optimization for Unbounded Resolution,” Computational Intelligence and Bioinformatics, IASTED, 753-31, Nov 2011
– “Population Based Optimization for Variable Operating Points,” Congress on Evolutionary Computation, IEEE, Jun 2011
– “Constrained Near-Optimal Control Using a Numerical Kinetic Solver,” Robotics and Applications, IASTED, 706-21, Nov 2010
– “Biomimetic Learning, Not Learning Biomimetics: A survey of developmental learning,” National Aerospace and Electronics Conference (NAECON), IEEE, July 2010
Posters
– “Memory Based Optimization for Unbounded Learning,” 2011 Great Midwest Regional Space Grant Consortia Meeting, also NASA Futures Forum, Feb 2012
– “Constrained Near-Optimal Control Using a Numerical Kinetic Solver,” 2009 Great Midwest Regional Space Grant Consortia, 3rd place

Commencement
Future applications:
Implementation for novel locomotion
– Implement on an inch worm
– The challenge is automating the tests, such as defining distance traveled
– Would be very interesting to reduce variation
Learn control law for regulation
– Develop a control law for the pendulum
– Question of what disturbance to use and what metric to use for cost or output (possibly response time, where the operator sets the urgency)
Address multidimensional outputs
– Robots are used to provide multiple outputs
– A manifold of the output may not be represented in the output space (think of a screw thread: despite moving continuously, there are multiple surfaces with the same horizontal coordinates)
