Heuristic Induction of Rate-Based Process Models
Pat Langley and Adam Arvay
Department of Computer Science, University of Auckland, Auckland, NZ
Thanks to W. Bridewell, R. Morin, S. To, L. Todorovski, and others for contributions to this project, which is funded by ONR Grant No. N

Inductive Process Modeling
Inductive process modeling constructs explanations of time series from background knowledge (Langley et al., 2002). Models are stated as sets of differential equations organized into higher-level processes.

The SC-IPM System
Previously, we reported SC-IPM (Bridewell & Langley, 2010), a system for inductive process modeling that:
1. Uses background knowledge to generate process instances;
2. Combines them to produce possible model structures, rejecting ones that violate known constraints;
3. For each candidate model structure:
   a. Carries out gradient descent search through parameter space to find good coefficients;
   b. Invokes random restarts to decrease chances of local optima;
4. Returns the parameterized model with lowest squared error or a ranked list of models.
We have reported encouraging results with SC-IPM on a variety of scientific data sets.

Some SC-IPM Successes
Aquatic ecosystems, protist dynamics, hydrology, and biochemical kinetics.

Critiques of SC-IPM
Despite these successes, the SC-IPM system suffers from four key drawbacks, in that it:
- Evaluates full model structures, so disallows heuristic search;
- Requires repeated simulation to estimate model parameters;
- Invokes random restarts to reduce chances of local optima;
- Despite these steps, it can still find poorly-fitting models.
As a result, SC-IPM does not scale well to complex modeling tasks and it is not reliable. In recent research, we have developed a new framework that avoids these problems.

A New Process Formalism
SC-IPM allowed processes with only algebraic equations, only differential equations, and mixtures of them. In our new modeling formalism, each process P must include:
- A rate that denotes P's speed / activation on a given time step;
- An algebraic equation that describes P's rate as a parameter-free function of known variables;
- One or more derivatives that are proportional to P's rate.
This notation has important mathematical properties that assist model induction. The revised formalism is also closer to Forbus' (1984) original Qualitative Process theory.

A Sample Process Model
Consider a process model for a simple predator-prey ecosystem:
exponential_growth[aurelia]
    rate r = aurelia
    parameters A = 0.75
    equations d[aurelia] = A * r
exponential_loss[nasutum]
    rate r = nasutum
    parameters B = −0.57
    equations d[nasutum] = B * r
holling_predation[nasutum, aurelia]
    rate r = nasutum * aurelia
    parameters C = …  D = …
    equations d[nasutum] = C * r
              d[aurelia] = D * r
Each derivative is proportional to the algebraic rate expression.

A Sample Process Model
Consider a process model for a simple predator-prey ecosystem:
exponential_growth[aurelia]
    rate r = aurelia
    parameters A = 0.75
    equations d[aurelia] = A * r
exponential_loss[nasutum]
    rate r = nasutum
    parameters B = −0.57
    equations d[nasutum] = B * r
holling_predation[nasutum, aurelia]
    rate r = nasutum * aurelia
    parameters C = …  D = …
    equations d[nasutum] = C * r
              d[aurelia] = D * r
This model compiles into a set of differential equations:
    d[aurelia] = 0.75 * aurelia − … * nasutum * aurelia
    d[nasutum] = … * nasutum * aurelia − 0.57 * nasutum
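To make the compilation step concrete, the sketch below expresses the three processes as a single ODE system and simulates it with SciPy. The values of A and B come from the slide; the Holling-predation coefficients C and D are placeholder values chosen only for illustration, since the originals are not legible in this transcript.

```python
# Minimal sketch: compile the predator-prey process model into one ODE system
# and simulate it. A and B come from the slide; C and D are placeholders.
from scipy.integrate import solve_ivp

A, B = 0.75, -0.57      # exponential growth of aurelia, exponential loss of nasutum
C, D = 0.0025, -0.011   # placeholder Holling-predation coefficients

def model(t, state):
    aurelia, nasutum = state
    # Each process contributes a term proportional to its (parameter-free) rate.
    r_growth = aurelia              # exponential_growth[aurelia]
    r_loss = nasutum                # exponential_loss[nasutum]
    r_pred = nasutum * aurelia      # holling_predation[nasutum, aurelia]
    d_aurelia = A * r_growth + D * r_pred
    d_nasutum = B * r_loss + C * r_pred
    return [d_aurelia, d_nasutum]

sol = solve_ivp(model, (0.0, 30.0), [100.0, 20.0])
print(sol.y[:, -1])   # populations at the final time step
```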

Some Generic Processes
Generic processes have a very similar but more abstract format:
exponential_growth(X [prey]) [growth]
    rate r = X
    parameters A = (> A 0.0)
    equations d[prey] = A * r
exponential_loss(X [predator]) [loss]
    rate r = X
    parameters B = (< B 0.0)
    equations d[predator] = B * r
holling_predation(X [predator], Y [prey]) [predation]
    rate r = X * Y
    parameters C = (> C 0.0)  D = (< D 0.0)
    equations d[predator] = C * r
              d[prey] = D * r
These form the building blocks from which to compose models.
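As an illustration of how such building blocks can be enumerated over a set of typed target variables, here is a minimal sketch; the data structures and function names are hypothetical, not RPM's actual implementation.

```python
# Minimal sketch (hypothetical representation): enumerate process instances by
# binding each generic process's typed arguments to compatible target variables.
from itertools import permutations

GENERIC_PROCESSES = {
    "exponential_growth": ("prey",),
    "exponential_loss": ("predator",),
    "holling_predation": ("predator", "prey"),
}
# Target variables and the roles they may play (cf. Organism1 [predator, prey]).
VARIABLES = {"aurelia": {"prey", "predator"}, "nasutum": {"prey", "predator"}}

def process_instances(generics, variables):
    for name, arg_types in generics.items():
        for combo in permutations(variables, len(arg_types)):
            if all(t in variables[v] for v, t in zip(combo, arg_types)):
                yield name, combo

for instance in process_instances(GENERIC_PROCESSES, VARIABLES):
    print(instance)   # e.g. ('holling_predation', ('nasutum', 'aurelia'))
```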

RPM: Regression-Guided Process Modeling
This suggests a new approach to inducing process models that our RPM system implements:
- Generate all process instances consistent with type constraints.
- For each process P, calculate the rate for P on each time step. (Rate expressions are parameter free.)
- For each dependent variable X:
  - Estimate dX/dt on each time step with center differencing. (Assumes all variables are observed.)
  - For each subset of processes with up to k elements:
    - Find a regression equation for dX/dt in terms of process rates.
    - If the equation's r² is high enough, retain it for consideration.
  - Add the equation with the highest r² to the process model.
This approach factors the model construction task into a number of tractable components. A sketch of the regression step appears below.
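The following is a minimal sketch of that inner loop, assuming evenly spaced observations; the function names are hypothetical and the snippet omits RPM's structural heuristics.

```python
# Minimal sketch of RPM's regression step (hypothetical names): estimate each
# derivative by center differencing, then for every subset of at most k process
# rates fit dX/dt as a linear combination of those rates and keep the best fit.
import numpy as np
from itertools import combinations

def center_diff(x, dt):
    # dX/dt at interior time steps: (x[t+1] - x[t-1]) / (2 * dt)
    return (x[2:] - x[:-2]) / (2.0 * dt)

def best_equation(dxdt, rates, k=2, r2_min=0.9):
    """rates: dict mapping a process instance to its rate at interior steps."""
    best = None
    for size in range(1, k + 1):
        for subset in combinations(rates, size):
            R = np.column_stack([rates[p] for p in subset])
            coeffs, *_ = np.linalg.lstsq(R, dxdt, rcond=None)   # no intercept
            r2 = 1.0 - np.var(dxdt - R @ coeffs) / np.var(dxdt)
            if r2 >= r2_min and (best is None or r2 > best[0]):
                best = (r2, subset, coeffs)
    return best

# Example: dX/dt for aurelia regressed on the candidate process rates.
# dxdt = center_diff(aurelia_series, dt)
# rates = {"growth[aurelia]": aurelia_series[1:-1],
#          "predation[nasutum,aurelia]": (nasutum_series * aurelia_series)[1:-1]}
# print(best_equation(dxdt, rates))
```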

Two-Level Heuristic Search in RPM

Heuristics for Model Induction
RPM uses four heuristics to guide its search through the space of process models:
- A model may include only one process instance of each type;
- Parameters must obey the numeric constraints in the generic processes;
- If an equation for one variable includes a process P, then P must appear in the equations for the other variables that P mentions;
- Variables that participate in more processes are incorporated earlier than less constrained ones.
These heuristics substantially reduce the amount of search that RPM carries out during model induction. The sketch below illustrates two of the structural checks.
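Here is a minimal sketch of the first and third heuristics written as predicates over a candidate model; the data structures are hypothetical, not RPM's own.

```python
# Minimal sketch (hypothetical structures): two of RPM's structural heuristics.
# A process instance is represented as a tuple (generic_type, variables_mentioned).

def one_instance_per_type(model):
    """A model may include at most one instance of each generic process type."""
    types = [ptype for ptype, _ in model]
    return len(types) == len(set(types))

def processes_consistent(equations):
    """equations maps each variable to the set of process instances in its
    equation; a process must appear in the equations of every variable it mentions."""
    for instances in equations.values():
        for instance in instances:
            _, mentioned = instance
            if any(instance not in equations.get(var, set()) for var in mentioned):
                return False
    return True

# Example use:
# growth = ("exponential_growth", ("aurelia",))
# holling = ("holling_predation", ("nasutum", "aurelia"))
# equations = {"aurelia": {growth, holling}, "nasutum": {holling}}
# print(one_instance_per_type([growth, holling]), processes_consistent(equations))
```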

Behavior on Natural Data
RPM matches the main trends for a simple predator-prey system:
    d[aurelia] = 0.75 * aurelia − 0.11 * nasutum * aurelia  [r² = 0.84]
    d[nasutum] = … * nasutum * aurelia − 0.57 * nasutum  [r² = 0.71]

Behavior on Complex Synthetic Data
RPM also finds an accurate model for a 20-organism food chain. This suggests the system scales well to difficult modeling tasks.

Handling Noise and Complexity
With smoothing, RPM can handle 10% noise on synthetic data. The system also scales well to increasing numbers of generic processes and variables in the target model.
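The talk does not name the smoothing method; as one plausible option (an assumption, not necessarily what RPM uses), a Savitzky-Golay filter can be applied to each noisy trajectory before center differencing:

```python
# One plausible pre-processing step (assumed, not confirmed by the talk):
# smooth each noisy trajectory with a Savitzky-Golay filter before differencing.
from scipy.signal import savgol_filter

def smooth_then_diff(x, dt, window_length=11, polyorder=3):
    smoothed = savgol_filter(x, window_length=window_length, polyorder=polyorder)
    return (smoothed[2:] - smoothed[:-2]) / (2.0 * dt)   # center differences
```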

RPM and SC-IPM
We compared RPM to SC-IPM, its predecessor, on synthetic data for a three-variable predator-prey ecosystem. SC-IPM finds more accurate models with more restarts, but also takes longer to find them.

RPM and SC-IPM
We compared RPM to SC-IPM, its predecessor, on synthetic data for a three-variable predator-prey ecosystem. RPM found accurate models far more reliably than SC-IPM and, at worst, ran 800,000 times faster than the earlier system.

Related and Future Research
Our approach builds on ideas from earlier research, including:
- Qualitative representations of scientific models (Forbus, 1984)
- Inducing differential equations (Todorovski, 1995; Bradley, 2001)
- Heuristic search and multiple linear regression
Our plans for extending the RPM system include:
- Replacing greedy search for models with beam search
- Adding heuristic search through the equation space
- Handling parametric rate expressions (e.g., using LMS)
- Dealing with unobserved variables (e.g., iterative optimization)
Together, these should extend RPM's coverage and usefulness.

Summary Remarks
In this talk, I presented a novel approach to inductive process modeling that:
- Incorporates a rate-based representation for processes
- Carries out heuristic search through the space of models
- Avoids the need for repeated simulation and random restarts
- Scales well to irrelevant variables and complex models
- Is more reliable and much more rapid than its predecessor
However, we can improve the framework's scalability further and reduce its reliance on simplifying assumptions.

Publications on Inductive Process Modeling
Todorovski, L., Bridewell, W., & Langley, P. (2012). Discovering constraints for inductive process modeling. Proceedings of the Twenty-Sixth AAAI Conference on Artificial Intelligence. Toronto: AAAI Press.
Park, C., Bridewell, W., & Langley, P. (2010). Integrated systems for inducing spatio-temporal process models. Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (pp ). Atlanta: AAAI Press.
Bridewell, W., & Todorovski, L. (2010). The induction and transfer of declarative bias. Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence (pp ). Atlanta: AAAI Press.
Bridewell, W., & Langley, P. (2010). Two kinds of knowledge in scientific discovery. Topics in Cognitive Science, 2.
Bridewell, W., Borrett, S. R., & Langley, P. (2009). Supporting innovative construction of explanatory scientific models. In A. B. Markman & K. L. Wood (Eds.), Tools for Innovation. Oxford: Oxford University Press.
Bridewell, W., Langley, P., Todorovski, L., & Dzeroski, S. (2008). Inductive process modeling. Machine Learning, 71.
Bridewell, W., Borrett, S., & Todorovski, L. (2007). Extracting constraints for process modeling. Proceedings of the Fourth International Conference on Knowledge Capture (pp ). Whistler, BC.
Bridewell, W., & Todorovski, L. (2007). Learning declarative bias. Proceedings of the Seventeenth International Conference on Inductive Logic Programming. Corvallis, OR.
Borrett, S. R., Bridewell, W., Langley, P., & Arrigo, K. R. (2007). A method for representing and developing process models. Ecological Complexity, 4.
Bridewell, W., Sanchez, J. N., Langley, P., & Billman, D. (2006). An interactive environment for the modeling and discovery of scientific knowledge. International Journal of Human-Computer Studies, 64.
Bridewell, W., Langley, P., Racunas, S., & Borrett, S. R. (2006). Learning process models with missing data. Proceedings of the Seventeenth European Conference on Machine Learning (pp ). Berlin: Springer.
Langley, P., Shiran, O., Shrager, J., Todorovski, L., & Pohorille, A. (2006). Constructing explanatory process models from biological data and knowledge. AI in Medicine, 37.
Asgharbeygi, N., Bay, S., Langley, P., & Arrigo, K. (2006). Inductive revision of quantitative process models. Ecological Modelling, 194.
Bridewell, W., Bani Asadi, N., Langley, P., & Todorovski, L. (2005). Reducing overfitting in process model induction. Proceedings of the Twenty-Second International Conference on Machine Learning (pp ). Bonn, Germany.
Todorovski, L., Bridewell, W., Shiran, O., & Langley, P. (2005). Inducing hierarchical process models in dynamic domains. Proceedings of the Twentieth National Conference on Artificial Intelligence (pp ). Pittsburgh, PA: AAAI Press.

Computational Scientific Discovery
Research on computational scientific discovery aims to find laws and models in established scientific formalisms like:
- Qualitative laws (Jones & Langley, 1986; Colton, 1999)
- Numeric equations (Langley, 1981; Zytkow et al., 1990)
- Structural models (Zytkow & Simon, 1986; Valdes-Perez, 1992)
- Process models (Valdes-Perez, 1993; Kocabas & Langley, 2000)
In this talk, I focus on the task of inductive process modeling, which combines equation and process discovery.

Theoretical Predictions
We can make four predictions about our system's behavior on the task of inductive process modeling:
- RPM should replicate earlier systems' ability to identify model structure and explain observed trajectories;
- Because RPM carries out heuristic, not exhaustive, search and avoids repeated simulation, it should be far more efficient;
- Because it uses multiple linear regression, RPM should avoid local optima and be far more robust;
- Heuristic search should let RPM scale well to complex models.
In summary, the system should be much more effective than its predecessors at process model induction.

Two-Level Heuristic Search in RPM
[Figure: inputs, search stages, and outputs of RPM's two-level search for inductive process modeling.]
Generic processes (input):
exponential_growth(X [prey]) [growth]
    rate R = X
    derivatives d[X,t] = a * R
    parameters a > 0
holling(X [predator], Y [prey]) [predation]
    rate R = X * Y
    derivatives d[X,t] = b * R, d[Y,t] = c * R
    parameters b > 0, c < 0
···
Target variables (input): Organism1 [predator, prey], Organism2 [predator, prey], ···
Time-series data (input)
Search for equations → Search for models
Process models (output):
exponential_growth(Organism1)
    rate R = Organism1
    derivatives d[Organism1,t] = a * R
    parameters a = 0.75
holling(Organism2, Organism1)
    rate R = Organism2 * Organism1
    derivatives d[Organism2,t] = b * R, d[Organism1,t] = c * R
    parameters b = …, c = −0.011