Machine Science Distilling Free-Form Natural Laws from Experimental Data Hod Lipson, Cornell University
Lipson & Pollack, Nature 406, 2000
Camera View
Adapting in simulation Simulator Evolve Controller In Simulation Try it in reality! Download Crossing The Reality Gap
Adapting in reality Evolve Controller In Reality Try it ! Too many Physical Trials
Simulation & Reality Evolve Controller “Simulator” Try it in reality! Build Collect Sensor DataEvolve Simulator Evolve Simulators Evolve Robots
Servo Actuators Tilt Sensors
Emergent Self-Model With Josh Bongard and Victor Zykov, Science 2006
Damage Recovery With Josh Bongard and Victor Zykov, Science 2006
Random Predicted Physical
System Identification ?
Perturbations
Photo: Floris van Breugel
Structural Damage Diagnosis With Wilkins Aquino
. com With Jeff Clune, Jason Yosinski
Symbolic Regression f(x)=e x sin(|x|) What function describes this data? John Koza, 1992
Encoding Equations * sin – x1x1 3 f(x)f(x) + x2x2 -7 Building Blocks: + - * / sin cos exp log … etc sin(x 2 ) x 1 *sin(x 2 ) (x 1 – 3)*sin(x 2 ) (x 1 – 3)*sin(-7 + x 2 ) John Koza, 1992
Michael D. Schmidt, Hod Lipson (2006) Models: Expression trees Subject to mutation and selection Experiments: Data-points Subject to mutation and selection {const,+,-,*,/,sin,cos,exp,log,abs}
Solution Accuracy Entire Dataset Coevolved Dataset
Solution Complexity Entire Dataset Coevolved Dataset
Semi-empirical mass formula Modeling the binding energy of an atomic nucleus R 2 = Weizsäcker’s Formula: R 2 = Inferred Formula:
Systems of Differential Equations Regress on derivative timex 1 x 2 … … … … … ………… State Variables dx 1 /dtx 2 /dt … … … … … ……… Derivatives
Original Equations Inferred Equations Inferring Biological Networks With Michael Schmidt, John Wikswo (Vanderbilt), Jerry Jenkins (CFDRC)
Charles Richter
Ingmar Zanger, John Amend
Wet Data, Unknown System With Michael Schmidt (Cornell) and Gurol Suel (UT Southwestern)
Wet Data, Unknown System With Michael Schmidt (Cornell) and Gurol Suel (UT Southwestern)
Cell #1 Cell #2 Cell #3-60 …
Blue Dots = data points, Green Line = regressed fit = =
Biologist’s Inferred Model: Gurol Suel, et. al., Science 2007 Symbolic Regression Inferred Time-Delay Model:
Withheld Test Set #1 Fit
Withheld Test Set #2 Fit
Withheld Test Set #3 Fit
Looking For Invariants
Data Mining
42 42+x-x 42+1/(1000+x 2 ) ?
From Data: x y… ……… δxδx δyδy δyδy δxδx,, … Calculate partial derivatives Numerically: δfδf δxδx δfδf δyδy From Equation: δx’δx’ δy’δy’ δy’δy’ δx’δx’,, … Calculate predicted partial derivatives Symbolically:
x y x 3 + x – y 2 – 1.5 = 0x 2 + y 2 – 16 = 0x 2 + y 2 + z 2 – 1 = 0 x y x z Homework CircleElliptic CurveSphere
Linear Oscillator Coefficients may have different scales and offsets each run
Pendulum H L
Double Linear Oscillator would be plus for Lagrangian
xyz z=f(x,y) ? dy/dt=f(x,y) ? f(x,y,z)=0 ?
Run…
xyz z=f(x,y) ? dy/dt=f(x,y) ? f(x,y,z)=0 ? I feel lucky
Von Thomas Hermanowski, Dr. Andreas Rick, Dr. Jochen Weber
Particle Accelerators angle Distortion Correction (GE X-Ray Detectors) 5x Resolution Improvement Jay Schuren
Concluding Remarks Correlation is enough. Faced with massive data, [the Scientific Method] is becoming obsolete. We can stop looking for models. “ ” Chris Anderson Wired The data deluge accelerates our ability to hypothesize, model, and test.
Theoretical physicists are not yet obsolete, but scientists have taken steps toward replacing themselves
I am worried that we have enjoyed a brief window in human history where we could actually understand things, but that period may be coming to an end. -- Steve Strogatz The end of insight