PROCESSES AND CONSTRAINTS IN SCIENTIFIC MODEL CONSTRUCTION. Will Bridewell† and Pat Langley†‡. †Cognitive Systems Laboratory, CSLI, Stanford University.

Presentation transcript:

PROCESSES AND CONSTRAINTS IN SCIENTIFIC MODEL CONSTRUCTION
Will Bridewell† and Pat Langley†‡
† Cognitive Systems Laboratory, CSLI, Stanford University
‡ CIRCAS, Arizona State University

Where Are We Going?
- Introduction to inductive process modeling
- Constraints in inductive process modeling
- Learning constraints

Inductive Process Modeling
[Diagram labels: Observations, Model, Predictions]
Model objectives: explanation and prediction
Langley et al. 2002, ICML; Bridewell et al. 2008, ML

Quantitative Process Models
Processes:
  process exponential_growth
    equations d[hare.density, t, 1] = 2.5 * hare.density
  process exponential_loss
    equations d[wolf.density, t, 1] = −1.2 * wolf.density
  process predation_holling_type_1
    equations d[hare.density, t, 1] = −0.1 * hare.density * wolf.density
              d[wolf.density, t, 1] = 0.3 * 0.1 * hare.density * wolf.density
Ordinary Differential Equations:
  dhare.density/dt = 2.5 * hare.density + −0.1 * hare.density * wolf.density
  dwolf.density/dt = −1.2 * wolf.density + 0.3 * 0.1 * hare.density * wolf.density
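The way the process equations add up to one differential equation per variable can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the dictionary layout, the `derivatives` helper, and the sample densities are assumptions made for the example, while the coefficients are the ones shown on the slide.

```python
from collections import defaultdict

# Each process maps a variable to its contribution to that variable's derivative.
# The representation is hypothetical; the coefficients come from the slide.
processes = {
    "exponential_growth": {"hare": lambda s: 2.5 * s["hare"]},
    "exponential_loss": {"wolf": lambda s: -1.2 * s["wolf"]},
    "predation_holling_type_1": {
        "hare": lambda s: -0.1 * s["hare"] * s["wolf"],
        "wolf": lambda s: 0.3 * 0.1 * s["hare"] * s["wolf"],
    },
}

def derivatives(state, active):
    """Sum the contributions of the active processes for each variable."""
    totals = defaultdict(float)
    for name in active:
        for variable, term in processes[name].items():
            totals[variable] += term(state)
    return dict(totals)

# Illustrative densities, not data from the talk.
state = {"hare": 10.0, "wolf": 2.0}
full_model = derivatives(state, processes)
no_predation = derivatives(state, ["exponential_growth", "exponential_loss"])
```

Dropping or swapping a process changes only the list of active names, which is the modularity the following slides illustrate.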

Advantages of Quantitative Process Models
Process models offer scientists a promising framework because:
- they embed quantitative relations within qualitative structure;
- they refer to notations and mechanisms familiar to experts;
- they provide dynamical predictions of changes over time;
- they offer causal and explanatory accounts of phenomena;
- they retain the modularity needed for induction/abduction.
Quantitative process models provide an important alternative to formalisms used currently in computational discovery.

Modularity in Quantitative Process Models
Processes:
  process exponential_growth
    equations d[hare.density, t, 1] = 2.5 * hare.density
  process exponential_loss
    equations d[wolf.density, t, 1] = −1.2 * wolf.density
  process predation_holling_type_1
    equations d[hare.density, t, 1] = −0.1 * hare.density * wolf.density
              d[wolf.density, t, 1] = 0.3 * 0.1 * hare.density * wolf.density
Ordinary Differential Equations:
  dhare.density/dt = 2.5 * hare.density + −0.1 * hare.density * wolf.density
  dwolf.density/dt = −1.2 * wolf.density + 0.3 * 0.1 * hare.density * wolf.density

Modularity in Quantitative Process Models
Processes:
  process exponential_growth
    equations d[hare.density, t, 1] = 2.5 * hare.density
  process exponential_loss
    equations d[wolf.density, t, 1] = −1.2 * wolf.density
Ordinary Differential Equations:
  dhare.density/dt = 2.5 * hare.density
  dwolf.density/dt = −1.2 * wolf.density

Modularity in Quantitative Process Models
Processes:
  process exponential_growth
    equations d[hare.density, t, 1] = 2.5 * hare.density
  process exponential_loss
    equations d[wolf.density, t, 1] = −1.2 * wolf.density
  process predation_holling_type_2
    equations d[hare.density, t, 1] = −0.1 * hare.density * wolf.density / ( * –0.1 * hare.density)
              d[wolf.density, t, 1] = 0.3 * 0.1 * hare.density * wolf.density / ( * –0.1 * hare.density)
Ordinary Differential Equations:
  dhare.density/dt = 2.5 * hare.density + −0.1 * hare.density * wolf.density / ( * –0.1 * hare.density)
  dwolf.density/dt = −1.2 * wolf.density + 0.3 * 0.1 * hare.density * wolf.density / ( * –0.1 * hare.density)

Generic Processes
  generic process predation_Holling_1
    entities P1{prey}, P2{predator}
    parameters r[0, infinity], e[0, infinity]
    equations d[P1.density, t, 1] = −1 * r * P1.density * P2.density
              d[P2.density, t, 1] = e * r * P1.density * P2.density

Generic Processes
  generic process predation_Holling_1
    entities P1{prey}, P2{predator}
    parameters r[0, infinity], e[0, infinity]
    equations d[P1.density, t, 1] = −1 * r * P1.density * P2.density
              d[P2.density, t, 1] = e * r * P1.density * P2.density
Instantiation:
  P1: hare, P2: wolf
  r: 0.1, e: 0.3

Generic Processes
  generic process predation_Holling_1
    entities P1{prey}, P2{predator}
    parameters r[0, infinity], e[0, infinity]
    equations d[P1.density, t, 1] = −1 * r * P1.density * P2.density
              d[P2.density, t, 1] = e * r * P1.density * P2.density
Instantiation:
  P1: hare, P2: wolf
  r: 0.1, e: 0.3
Resulting process:
  process wolves_eat_hares
    equations d[hare.density, t, 1] = −1 * 0.1 * hare.density * wolf.density
              d[wolf.density, t, 1] = 0.3 * 0.1 * hare.density * wolf.density
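Instantiation, as shown above, binds typed entities to the roles of a generic process and fixes its parameters. The following Python sketch illustrates that step under stated assumptions: the `GenericProcess` class, the `instantiate` function, and the sample densities are hypothetical names and values chosen for the example, and only the equations, types, and parameter values come from the slide.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass(frozen=True)
class GenericProcess:
    name: str
    roles: Dict[str, str]                   # role variable -> required entity type
    bounds: Dict[str, Tuple[float, float]]  # parameter -> (low, high)
    equations: Dict[str, Callable]          # role variable -> rate function

predation_holling_1 = GenericProcess(
    name="predation_Holling_1",
    roles={"P1": "prey", "P2": "predator"},
    bounds={"r": (0.0, float("inf")), "e": (0.0, float("inf"))},
    equations={
        "P1": lambda d, p: -1 * p["r"] * d["P1"] * d["P2"],
        "P2": lambda d, p: p["e"] * p["r"] * d["P1"] * d["P2"],
    },
)

def instantiate(generic, binding, entity_types, params):
    """Check entity types and parameter ranges, then return a ground rate function."""
    for role, required in generic.roles.items():
        if entity_types[binding[role]] != required:
            raise ValueError(f"{binding[role]} cannot fill role {role}")
    for name, (low, high) in generic.bounds.items():
        if not low <= params[name] <= high:
            raise ValueError(f"parameter {name} is out of range")

    def rates(density):
        # Map entity densities onto role variables, evaluate, and map back.
        local = {role: density[entity] for role, entity in binding.items()}
        return {binding[role]: eq(local, params)
                for role, eq in generic.equations.items()}

    return rates

entity_types = {"hare": "prey", "wolf": "predator"}
wolves_eat_hares = instantiate(
    predation_holling_1, {"P1": "hare", "P2": "wolf"},
    entity_types, {"r": 0.1, "e": 0.3},
)
rates = wolves_eat_hares({"hare": 10.0, "wolf": 2.0})  # illustrative densities
```

The type check is what keeps a wolf from being bound to the prey role; the same generic process could be grounded with any other prey and predator pair.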

The IPM System (a naive approach)
Given:
- A library of generic entities and processes
- Instantiated entities
- Data
Steps:
- Ground the generic processes with instantiated entities
- Generate all combinations of the ground processes
- Fit the numeric parameters of each structure
Output: The best models based on fit to the data
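The grounding and enumeration steps above can be sketched as follows. This is a minimal illustration rather than the IPM code: the small library, the helper names `ground` and `all_structures`, and the choice to skip the empty structure are assumptions, and parameter fitting is left out entirely.

```python
from itertools import chain, combinations, product

# Hypothetical library: generic process name -> entity types required by its roles.
generic_processes = {
    "exponential_growth": ["prey"],
    "exponential_loss": ["predator"],
    "predation_holling_1": ["prey", "predator"],
    "predation_holling_2": ["prey", "predator"],
}
entities = {"hare": "prey", "wolf": "predator"}

def ground(generic_processes, entities):
    """Bind every generic process to every type-compatible tuple of entities."""
    grounded = []
    for name, roles in generic_processes.items():
        candidates = [[e for e, t in entities.items() if t == role] for role in roles]
        for binding in product(*candidates):
            grounded.append((name, binding))
    return grounded

def all_structures(grounded):
    """Enumerate every non-empty combination of ground processes."""
    sizes = range(1, len(grounded) + 1)
    return chain.from_iterable(combinations(grounded, k) for k in sizes)

grounded = ground(generic_processes, entities)
structures = list(all_structures(grounded))
```

With n ground processes the enumeration yields 2^n − 1 candidate structures, including ones that combine both predation forms for the same pair of species, which is the kind of implausible structure that motivates constraints.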

Applications
Aquatic ecosystems; fjord dynamics; also biochemical kinetics, protist interactions, photosynthesis
See Bridewell et al. 2008, Machine Learning, 71, 1–32

Life After IPM
Early versions of inductive process modeling systems:
- help scientists formalize their modeling knowledge;
- let scientists consider several alternative models;
- reduce some of the drudgery of model construction;
- speed exploration and evaluation.
However, IPM produces several structurally implausible models, some of which account quite well for the data.

Model Constraints
Constraints on the structure of models:
- eliminate implausible models;
- reduce the size of the search space;
- make complex domains tractable;
- improve model accuracy during incomplete search.
HIPM, Todorovski et al. AAAI-05
Structural constraints differ from constraints on model behavior most importantly because they do not require simulation.

SC-IPM Constraints: Necessary
Specifies required processes
  Name: Nutrient-Replenishment
  Type: necessary
  Processes: nutrient_mixing(N), remineralization(N, _)
Legend: P = primary producer, G = grazer, N = nutrient

SC-IPM Constraints: Always-Together
All or none
  Name: Growth-Limitation
  Type: always-together
  Processes: limited(P), nutrient_limitation(P, N)
Legend: P = primary producer, G = grazer, N = nutrient

SC-IPM Constraints: Exactly-One
Mutual exclusion
  Name: Growth-Alternatives
  Type: exactly-one
  Processes: exponential(P), logistic(P), limited(P)
Legend: P = primary producer, G = grazer, N = nutrient

SC-IPM Constraints: At-Most-One
Enables optional processes
  Name: Optional-Grazing
  Type: at-most-one
  Processes: holling_1(P, G), holling_2(P, G), holling_3(P, G)
Legend: P = primary producer, G = grazer, N = nutrient

The SC-IPM System
1. Ground the generic processes with instantiated entities.
2. Treat ground processes as Boolean literals.
3. Conjoin the individual constraints.
4. Rewrite the constraints in conjunctive normal form.
5. Apply a SAT solver (e.g., DPLL, WalkSAT).
6. Instant model structure!
7. Fit parameters, etc.
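Steps 2 through 6 can be illustrated with a small Python sketch that treats each process as a Boolean literal and keeps the truth assignments that satisfy every constraint. Two simplifications are assumptions of this example: exhaustive enumeration stands in for the CNF rewriting and SAT solver that SC-IPM uses, and each generic process is treated as a single literal, as if there were one primary producer, one grazer, and one nutrient. The reading of `necessary` as "every listed process must be present" is also an assumption based on the slide's short description.

```python
from itertools import product

def satisfies(kind, present):
    """Evaluate one constraint given the truth values of its listed processes."""
    count = sum(present)
    if kind == "necessary":        # assumed reading: all listed processes required
        return count == len(present)
    if kind == "always-together":  # all or none
        return count in (0, len(present))
    if kind == "exactly-one":      # mutual exclusion with one mandatory choice
        return count == 1
    if kind == "at-most-one":      # optional, but never two at once
        return count <= 1
    raise ValueError(f"unknown constraint type: {kind}")

# The four example constraints from the preceding slides.
constraints = [
    ("Nutrient-Replenishment", "necessary", ["nutrient_mixing", "remineralization"]),
    ("Growth-Limitation", "always-together", ["limited", "nutrient_limitation"]),
    ("Growth-Alternatives", "exactly-one", ["exponential", "logistic", "limited"]),
    ("Optional-Grazing", "at-most-one", ["holling_1", "holling_2", "holling_3"]),
]
literals = sorted({p for _, _, procs in constraints for p in procs})

def valid_structures(constraints, literals):
    """Yield each set of processes that satisfies the conjunction of constraints."""
    for values in product([False, True], repeat=len(literals)):
        model = dict(zip(literals, values))
        if all(satisfies(kind, [model[p] for p in procs])
               for _, kind, procs in constraints):
            yield {p for p, on in model.items() if on}

structures = list(valid_structures(constraints, literals))
```

Under these assumptions, 12 of the 512 possible assignments over the nine literals survive. Because each constraint is checked independently, adding or removing one changes the search space without touching the others.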

Advantages of SC-IPM
SC-IPM adds several powerful features to IPM, such as:
- constraints that limit the consideration of implausible models;
- constraint modularity that eases control of the search space.
The constraints used by SC-IPM typically come from a scientist’s implicit knowledge, and we can both elicit them through examples and learn them computationally.

Learning Constraints
Goal: Identify implicit or unknown constraints to use in future modeling tasks
Plan:
- Analyze the space of model structures
- Use machine learning techniques to help
Key idea: Don’t throw away any models. Even the bad ones contain valuable information.
Bridewell & Todorovski 2007, ILP and KCAP

Learning Constraints
1. Build and parameterize process models
2. Store the models for analysis
3. Formally describe the structure of the models
4. Identify good and bad models
5. Use ILP to generate descriptions of accurate and inaccurate model structures
6. Convert the descriptions into SC-IPM constraints
We chose Aleph by Ashwin Srinivasan due to its ready availability and capabilities.

Good and Bad Models
[Figure: good and bad models, 1996–1997 Ross Sea]

Extracted Constraints
- A model that includes a second-order exponential mortality process for phytoplankton will be inaccurate. (positive: 560, negative: 0)
- A model that includes the Lotka–Volterra grazing process will be inaccurate. (positive: 80, negative: 0)
- A model that lacks both the first and second order Monod growth limitation process between iron and phytoplankton will be inaccurate. (positive: 448, negative: 0)

Apply Constraints to Other Problems
Ross Sea across years
Search spaces: 9x–16x smaller
Model distribution: more accurate

Apply Constraints to Other Domains
Ross Sea to Bled Lake
Bridewell & Todorovski AAAI-08 (Transfer Learning Workshop)

Related Work
- Other quantitative modelers
  - LAGRAMGE (Todorovski & Dzeroski)
  - PRET (Bradley & Stolle)
- Metalearning and others
  - Learning Constraint Networks via Version Spaces (Bessiere et al.)
  - Relational Clichés (Silverstein & Pazzani; Morin & Matwin)
  - Mode Declarations in ILP (McCreath & Sharma)
  - Rule Reliability from Prior Performance (Mark Reid)

Future Directions
We are currently working in several directions, which include:
- continuing the analysis of constraint transfer;
- closing the automated modeling + constraint learning loop;
- basing new analyses and methodologies on model ensembles;
- adapting the general strategies to other tasks;
- supporting other modeling paradigms.
Inductive process modeling is a fruitful paradigm for exploring knowledge representation, modeling, discovery, and creativity in scientific practice.

Modular Constraints
  Name:
  Type: {always-together, exactly-one, at-most-one, necessary}
  Processes:

Descriptive Rules
Run 1: Generate a theory for accurate models.
  accurate_model(A) :-
      does_not_include_process(A, death_exp2),
      includes_process_entity(A, monod_lim, iron).
Run 2: Generate a theory for inaccurate models.
  inaccurate_model(A) :-
      does_not_include_process_entity(A, monod_2nd, iron),
      does_not_include_process_entity(A, monod_lim, iron).
  inaccurate_model(A) :-
      includes_process(A, death_exp2),
      includes_process_entity(A, deangelis_beddington, phytoplankton).
We chose Aleph by Ashwin Srinivasan due to its ready availability and capabilities.

Cross Problem Transfer
Two Protist Ecosystem
[Figure: distribution of models in the search space; panels: Training, Test 1, Test 2]
The search spaces are 9x–16x smaller.

Ross-96 to Ross-97

Bled-02 to Ross-96

Convert Descriptions into Structural Constraints
- Rules for accurate models become sufficient conditions for retaining model structures.
- The negations of rules for inaccurate models become necessary conditions for retaining model structures.
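The second conversion can be sketched in Python using the two inaccurate-model rules from the Descriptive Rules slide. This is an illustration under assumptions, not the authors' conversion procedure: a model is represented here as a set of (process, entity) pairs, and the helper names `literal_holds`, `rule_fires`, and `retain` are hypothetical. A structure is retained only if no inaccurate-model rule fires, which is the negation of each rule applied as a necessary condition.

```python
# Each rule body is a list of (process, entity_or_None, must_be_present) literals,
# transcribed from the two inaccurate_model rules on the Descriptive Rules slide.
inaccurate_rules = [
    [("monod_2nd", "iron", False), ("monod_lim", "iron", False)],
    [("death_exp2", None, True), ("deangelis_beddington", "phytoplankton", True)],
]

def literal_holds(model, process, entity, must_be_present):
    """Check one literal against a model given as a set of (process, entity) pairs."""
    if entity is None:
        found = any(p == process for p, _ in model)
    else:
        found = (process, entity) in model
    return found == must_be_present

def rule_fires(model, rule):
    """A rule fires when every literal in its body holds."""
    return all(literal_holds(model, *literal) for literal in rule)

def retain(model, rules):
    """Keep a structure only if no inaccurate-model rule fires."""
    return not any(rule_fires(model, rule) for rule in rules)

# Hypothetical model structures for illustration.
with_iron_limit = {("monod_lim", "iron"), ("death_exp2", "phytoplankton")}
without_iron_limit = {("death_exp2", "phytoplankton")}
kept = retain(with_iron_limit, inaccurate_rules)
dropped = retain(without_iron_limit, inaccurate_rules)
```

Because this check needs only the structure of a model, it can prune candidates before any parameters are fit or simulations run.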