Knowledge, Data, and Search in Computational Discovery

Pat Langley
Computational Learning Laboratory
Center for the Study of Language and Information
Stanford University, Stanford, California

Thanks to Kevin Arrigo, Stuart Borrett, Will Bridewell, and Ljupco Todorovski for their contributions to this work, and to the National Science Foundation for funding.

Qualitative Laws of Intelligence

In their 1975 Turing Award lecture, Newell and Simon claimed that intelligence depends on two factors:
• the ability to store, retrieve, and manipulate list structures, since computers are general symbol manipulators;
• the ability to solve novel problems by heuristic search, with problem spaces defined by states and operators.

Moreover, one can constrain search with knowledge that is cast as symbolic list structures. These insights underlie the fields of artificial intelligence and cognitive science.

Two Basic Claims

Newell and Simon's insights suggest the two claims of this talk:
• knowledge structures are important results of machine learning and discovery;
• knowledge structures are important inputs to machine learning and discovery.

In other words, knowledge plays as crucial a role as data in the automation of discovery. I will illustrate these ideas using recent work on the induction of scientific process models.

The Mainstream View

[Diagram: Training Data → Learning/Discovery Process → Predictive Model]

Nearly all current research in machine learning and data mining takes this perspective.

An Alternative View

[Diagram: Training Data + Existing Knowledge → Learning/Discovery Process → Acquired Knowledge]

This perspective is now uncommon, but the ideas themselves are not new to machine learning and discovery.

Historical Landmarks in Machine Learning

• 1980 – Machine learning launched as an outgrowth of symbolic AI
• 1983 – Early emphasis on knowledge-guided approaches to learning
• 1986 – First issue of the journal Machine Learning published
• 1989 – Advent of the UCI repository and routine experimental evaluation
• 1989 – Introduction of statistical methods from pattern recognition
• 1993 – Workshop on fielded applications of machine learning
• 1995 – First conference on knowledge discovery and data mining
• 1997 – Explosion of the Web and associated research on text mining
• 2001 – Strong focus on predictive accuracy over understandability
• 2004 – Prevalence of statistical methods over symbolic approaches

Knowledge as Output of Discovery Systems

Discovery systems produce models that are useful for prediction, but they should also produce models that:
• are stated in some declarative format that can be communicated clearly and precisely;
• help people understand observations in terms they find plausible and familiar.

We typically refer to the content of such models as knowledge.

What Is Knowledge?

Knowledge can be cast in many different formalisms, such as:
• criteria tables (M-of-N rules) in diagnostic medicine;
• molecular structures and reaction pathways in chemistry;
• qualitative causal models in biology and geology;
• structural equations in economics and sociology;
• differential equations in physics and ecology.

Discovery systems should generate knowledge in a format that is familiar to domain users. Fortunately, computers can encode all such forms of knowledge.

Successes of Scientific Knowledge Discovery

Over the past decade, computational discovery systems have helped uncover new knowledge in many scientific fields:
• qualitative chemical factors in mutagenesis (King et al., 1996)
• quantitative laws of metallic behavior (Sleeman et al., 1997)
• qualitative conjectures in number theory (Colton et al., 2000)
• temporal laws of ecological behavior (Todorovski et al., 2000)
• reaction pathways in catalytic chemistry (Valdes-Perez, 1994)

Each has led to publications in the refereed scientific literature, the key measure of academic success. For a review of these scientific results, see Langley (IJHCS, 2000).

Description vs. Explanation

Traditional discovery systems have focused on descriptive models that summarize data and make accurate predictions. But many sciences are concerned with explanatory models that:
• move beyond superficial descriptive summaries;
• account for observations at a deeper theoretical level;
• appeal to unobserved concepts and mechanisms that are familiar and plausible to domain experts.

Explanations may or may not have quantitative aspects, but they invariably have qualitative structure not captured by statistics.

Two Accounts of the Ross Sea Ecosystem

d[phyto,t,1] = μ·phyto − γ·zoo − λ·phyto
d[zoo,t,1] = α·zoo − κ·zoo
d[detritus,t,1] = λ·phyto + κ·zoo + β·zoo − ψ·detritus
d[nitro,t,1] = −ν·phyto + ψ·detritus

As phytoplankton uptakes nitrogen, its concentration increases and nitrogen decreases. This continues until the nitrogen supply is exhausted, which leads to a phytoplankton die-off. This produces detritus, which gradually remineralizes to replenish the nitrogen. Zooplankton grazes on phytoplankton, which slows the latter's increase and also produces detritus.

Relating Equation Terms to Processes

d[phyto,t,1] = μ·phyto − γ·zoo − λ·phyto
d[zoo,t,1] = α·zoo − κ·zoo
d[detritus,t,1] = λ·phyto + κ·zoo + β·zoo − ψ·detritus
d[nitro,t,1] = −ν·phyto + ψ·detritus

As phytoplankton uptakes nitrogen, its concentration increases and nitrogen decreases. This continues until the nitrogen supply is exhausted, which leads to a phytoplankton die-off. This produces detritus, which gradually remineralizes to replenish the nitrogen. Zooplankton grazes on phytoplankton, which slows the latter's increase and also produces detritus.

A Process Model for the Ross Sea

model Ross_Sea_Ecosystem
  variables: phyto, zoo, nitro, detritus
  observables: phyto, nitro
  process phyto_loss
    equations: d[phyto,t,1] = −λ·phyto
               d[detritus,t,1] = λ·phyto
  process zoo_loss
    equations: d[zoo,t,1] = −κ·zoo
               d[detritus,t,1] = κ·zoo
  process zoo_phyto_grazing
    equations: d[zoo,t,1] = α·zoo
               d[detritus,t,1] = β·zoo
               d[phyto,t,1] = −γ·zoo
  process nitro_uptake
    equations: d[phyto,t,1] = μ·phyto
               d[nitro,t,1] = −ν·phyto
  process nitro_remineralization
    equations: d[nitro,t,1] = ψ·detritus
               d[detritus,t,1] = −ψ·detritus

This model is equivalent to a standard differential equation model, but it makes explicit assumptions about which processes are involved. For completeness, we must also make assumptions about how to combine influences from multiple processes.
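To make the combination of process influences concrete, the coupled equations of this kind of model can be simulated by summing the contributions of every process and stepping forward in time. The sketch below is not the authors' code: it uses a simple forward-Euler integrator, and every coefficient value is an arbitrary placeholder rather than a fitted parameter.

```python
# Sketch: forward-Euler simulation of a four-variable aquatic ecosystem
# model in the spirit of the Ross Sea example. All coefficient values
# are arbitrary placeholders, not fitted parameters.

def step(state, p, dt=0.1):
    phyto, zoo, nitro, detritus = state
    # Each derivative sums the influences of the individual processes.
    d_phyto = p["mu"] * phyto - p["gamma"] * zoo - p["lam"] * phyto
    d_zoo = p["alpha"] * zoo - p["kappa"] * zoo
    d_detritus = (p["lam"] * phyto + p["kappa"] * zoo
                  + p["beta"] * zoo - p["psi"] * detritus)
    d_nitro = -p["nu"] * phyto + p["psi"] * detritus
    return (phyto + dt * d_phyto, zoo + dt * d_zoo,
            nitro + dt * d_nitro, detritus + dt * d_detritus)

def simulate(state, p, steps=100, dt=0.1):
    trajectory = [state]
    for _ in range(steps):
        state = step(state, p, dt)
        trajectory.append(state)
    return trajectory

params = {"mu": 0.3, "gamma": 0.1, "lam": 0.05,
          "alpha": 0.05, "kappa": 0.02, "beta": 0.02,
          "nu": 0.2, "psi": 0.1}
traj = simulate((1.0, 0.5, 5.0, 0.0), params)
```

Because every process contributes additively to the derivatives, adding or removing a process amounts to adding or removing terms, which is the modularity that induction exploits.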

Advantages of Process Models

Process models are a promising representational scheme because:
• they embed quantitative relations within a qualitative structure;
• they refer to notations and mechanisms familiar to experts;
• they provide dynamical predictions of changes over time;
• they offer causal and explanatory accounts of phenomena;
• they retain the modularity that is needed for induction.

Quantitative process models provide an important alternative to the formalisms typically used in modeling and discovery.

The Task of Inductive Process Modeling

We can use these ideas to reformulate the modeling problem:
• Given: a set of variables of interest to the scientist;
• Given: observations of how these variables change over time;
• Given: background knowledge about plausible processes;
• Find: a process model that explains these variations and that generalizes well to future observations.

The resulting model encodes new knowledge about the domain.

Challenges of Inductive Process Modeling

We can use ideas from machine learning to induce process models, but the task differs from typical learning tasks in that:
• process models characterize the behavior of dynamical systems;
• variables are continuous but can exhibit discontinuous behavior;
• observations are not independently and identically distributed;
• models may contain unobservable processes and variables;
• multiple processes can interact to produce complex behavior.

Compensating factors include a focus on deterministic systems and ways to constrain the search for models.

Machine Learning as Heuristic Search

Heuristic search depends on ways to guide exploration of the space.
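One common way to guide that exploration is beam search, which HIPM (described later) uses over model structures. The sketch below is a generic illustration rather than any system's actual search routine; `successors` and `score` stand in for problem-specific functions, and the toy bit-string task is invented for the example.

```python
# Sketch: generic beam search. At each depth, expand every state in the
# beam, keep the `beam_width` best children, and track the best overall.
# Lower score = better; `successors` and `score` are hypothetical
# problem-specific functions.

def beam_search(initial, successors, score, beam_width=3, depth=5):
    beam = [initial]
    best = min(beam, key=score)
    for _ in range(depth):
        candidates = [child for state in beam for child in successors(state)]
        if not candidates:
            break
        beam = sorted(candidates, key=score)[:beam_width]
        if score(beam[0]) < score(best):
            best = beam[0]
    return best

# Toy usage: search bit-strings for one matching a target.
target = (1, 0, 1, 1)
succ = lambda s: [s[:i] + (1 - s[i],) + s[i + 1:] for i in range(len(s))]
hamming = lambda s: sum(a != b for a, b in zip(s, target))
result = beam_search((0, 0, 0, 0), succ, hamming)
```

The beam width trades search effort against the risk of pruning the path to the best solution, which is exactly where heuristic knowledge earns its keep.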

Knowledge as Input to Discovery Systems

One can also use knowledge to guide discovery mechanisms:
• by providing constraints on the space searched, as in work on declarative bias for induction;
• by providing operators used during search, as in ILP research on relational clichés;
• by providing a starting point for heuristic search, as in work on theory revision and refinement.

Using knowledge to influence discovery can not only reduce prediction error but also improve model understandability.

Background Knowledge as Constraints

We can use background knowledge about the domain to constrain the search for candidate models. Previous work has encoded such knowledge as:
• Horn clause programs (e.g., King et al., 1996);
• context-free grammars (e.g., Dzeroski & Todorovski, 1997);
• prior probability distributions (e.g., Friedman et al., 2000).

However, none of these notations is familiar to domain scientists, which suggests the need for another approach.

Generic Processes as Background Knowledge

We cast background knowledge as generic processes that specify:
• the variables involved in a process and their types;
• the parameters appearing in a process and their ranges;
• the forms of conditions on the process; and
• the forms of associated equations and their parameters.

Generic processes are building blocks from which one can compose specific process models.

Generic Processes for Aquatic Ecosystems

generic process exponential_loss
  variables: S{species}, D{detritus}
  parameters: λ ∈ [0, 1]
  equations: d[S,t,1] = −λ·S
             d[D,t,1] = λ·S

generic process remineralization
  variables: N{nutrient}, D{detritus}
  parameters: ψ ∈ [0, 1]
  equations: d[N,t,1] = ψ·D
             d[D,t,1] = −ψ·D

generic process grazing
  variables: S1{species}, S2{species}, D{detritus}
  parameters: ρ ∈ [0, 1], γ ∈ [0, 1]
  equations: d[S1,t,1] = γ·ρ·S1
             d[D,t,1] = (1 − γ)·ρ·S1
             d[S2,t,1] = −ρ·S1

generic process constant_inflow
  variables: N{nutrient}
  parameters: ν ∈ [0, 1]
  equations: d[N,t,1] = ν

generic process nutrient_uptake
  variables: S{species}, N{nutrient}
  parameters: τ ∈ [0, ∞], μ ∈ [0, 1], β ∈ [0, 1]
  conditions: N > τ
  equations: d[S,t,1] = μ·S
             d[N,t,1] = −β·μ·S

Our current library contains about 20 generic processes, including ones with alternative functional forms for the loss and grazing processes.

A Method for Process Model Construction

We have developed IPM, a system that constructs explanatory process models from generic components in four stages:
1. Find all ways to instantiate the known generic processes with specific variables, subject to type constraints;
2. Combine the instantiated processes into candidate generic models, subject to additional constraints (e.g., number of processes);
3. For each generic model, carry out a search through parameter space to find good coefficients;
4. Return the parameterized model with the best overall score.

Our typical evaluation metric is squared error, but we have also explored other measures of explanatory adequacy.
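Stages 1 and 2 can be pictured as a combinatorial enumeration. The sketch below is only an illustration of that idea, not the IPM implementation: the process definitions are reduced to type signatures, and the limit of three processes per structure is an invented stand-in for IPM's real constraints.

```python
# Sketch of stages 1-2: instantiate generic processes with
# type-compatible variables, then enumerate candidate structures.
from itertools import combinations, product

variables = {"phyto": "species", "zoo": "species",
             "nitro": "nutrient", "detritus": "detritus"}

# Simplified stand-ins: generic process name -> required variable types.
generic = {"exponential_loss": ["species", "detritus"],
           "grazing": ["species", "species", "detritus"],
           "nutrient_uptake": ["species", "nutrient"],
           "remineralization": ["nutrient", "detritus"]}

def instantiations(types):
    """All type-respecting bindings of distinct variables."""
    pools = [[v for v, t in variables.items() if t == ty] for ty in types]
    return [b for b in product(*pools) if len(set(b)) == len(b)]

instances = [(name, binding)
             for name, types in generic.items()
             for binding in instantiations(types)]

# Candidate structures: subsets of instantiated processes up to size 3.
structures = [s for k in range(1, 4) for s in combinations(instances, k)]
```

Even this toy library yields dozens of candidate structures, which is why type constraints, and later hierarchical constraints, matter so much for keeping the space manageable.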

Estimating Parameters in Process Models

To estimate the parameters of each generic model structure, the IPM algorithm:
1. selects random initial values that fall within the ranges specified in the generic processes;
2. improves these parameters using the Levenberg-Marquardt method until it reaches a local optimum;
3. generates new candidate values through random jumps along dimensions of the parameter vector and continues the search;
4. if no improvement occurs after N jumps, restarts the search from a new random initial point.

This multi-level method gives reasonable fits to time-series data from a number of domains, but it is computationally intensive.
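The multi-restart strategy can be sketched in a few lines. Note the substitution: Levenberg-Marquardt needs a numerical library, so this sketch uses a crude coordinate-wise improver for the local step; the ranges, restart count, and toy objective are all invented for the example.

```python
# Sketch of multi-restart parameter fitting: random initial values
# within declared ranges, local improvement, keep the best restart.
# A real system would use Levenberg-Marquardt for the local step;
# this substitutes simple coordinate-wise hill climbing.
import random

def local_improve(params, loss, step=0.1, iters=200):
    best, best_loss = list(params), loss(params)
    for _ in range(iters):
        i = random.randrange(len(best))      # random jump dimension
        for delta in (-step, step):
            cand = list(best)
            cand[i] += delta
            cand_loss = loss(cand)
            if cand_loss < best_loss:        # accept only improvements
                best, best_loss = cand, cand_loss
    return best, best_loss

def fit(loss, ranges, restarts=5, seed=0):
    random.seed(seed)
    best, best_loss = None, float("inf")
    for _ in range(restarts):
        start = [random.uniform(lo, hi) for lo, hi in ranges]
        params, value = local_improve(start, loss)
        if value < best_loss:
            best, best_loss = params, value
    return best, best_loss

# Toy objective with optimum at (0.3, 0.7):
sse = lambda p: (p[0] - 0.3) ** 2 + (p[1] - 0.7) ** 2
params, value = fit(sse, [(0.0, 1.0), (0.0, 1.0)])
```

The restarts guard against local optima, which is the same role the random jumps play in the full algorithm; the cost is that every structure pays for several local searches, hence the computational intensity noted above.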

Results on Training Data from the Ross Sea

We provided IPM with 188 samples of phytoplankton, nitrogen, light, and ice measurements from the Ross Sea. From 2,035 distinct model structures, it found accurate models in which the available nitrate and light limited phytoplankton growth. Some high-ranking models incorporated zooplankton, whereas others did not.

Results on a Protist Ecosystem

We also ran the system on protist data from Veilleux (1979), using 54 samples of two variables (P. aurelia and D. nasutum). In this run, IPM considered a space of 470 distinct model structures and reproduced the basic trends.

Results on Ringkøbing Fjord

Data from a Danish fjord included measurements of fjord height, sea level, water inflow, and wind direction and speed. We used 1,100 samples for training and 551 samples for testing over a space of 32 model structures.

Results on Battery Data from the Space Station

Data from the Space Station batteries included current, voltage, and temperature, with resistance and state of charge unobserved. We used 6,000 samples for training and 2,640 samples for testing over a space of 162 model structures.

Results on Biochemical Kinetics

We also ran IPM on 14 samples of six chemicals involved in glycolysis, taken from a pulse-response study. Here the system considered some 172 model structures. The best model fit the data but reproduced only part of the known pathway.

Hierarchical Induction of Process Models

Despite its success, we have observed IPM produce models that lack required components or include mutually exclusive ones. In response, we have developed an extended system, HIPM, that:
• organizes background knowledge into a hierarchy of processes;
• specifies required vs. optional components and mutual exclusion;
• associates variables with entities that occur in processes;
• carries out beam search through the resulting AND/OR space.

We hypothesized that this additional knowledge would reduce search effort and variance, thus improving generalization error. For more details about HIPM, see Todorovski et al. (AAAI-2005).

HIPM Results on Ross Sea Data BeamSystem # models Test SSE Test r 2 4IPM HIPM IPM HIPM IPM HIPM HIPM examines fewer models and has better predictive accuracy.

Research on Theory Revision

We can also use background knowledge to specify initial models from which to start the search. Research on theory revision has applied this idea to models cast as:
• Horn clause programs (e.g., Ourston & Mooney, 1990);
• diagnostic fault hierarchies (e.g., Langley et al., 1994);
• qualitative causal models (e.g., Bay et al., 2003);
• sets of quantitative equations (e.g., Todorovski et al., 2003).

This approach typically produces models that are more accurate and easier to comprehend than ones induced from scratch.

Inductive Revision of Process Models

[Diagram: a revision module takes an initial model, observations, and generic processes, and produces a revised model]

model RossSeaEcosystem
  variables: phyto, zoo, nitro, residue
  observables: phyto, nitro
  d[phyto,t,1] = μ·phyto − γ·zoo − λ·phyto
  d[zoo,t,1] = α·zoo − κ·zoo
  d[residue,t,1] = λ·phyto + κ·zoo + β·zoo − ψ·residue
  d[nitro,t,1] = −ν·phyto + ψ·residue

generic process exponential_growth
  variables: P{population}
  equations: d[P,t] = [0, 1]·P

generic process logistic_growth
  variables: P{population}
  equations: d[P,t] = [0, 1]·P·(1 − P / [0, 1])

generic process constant_inflow
  variables: I{inorganic_nutrient}
  equations: d[I,t] = [0, 1]

generic process consumption
  variables: P1{population}, P2{population}, nutrient_P2
  equations: d[P1,t] = [0, 1]·P1·nutrient_P2
             d[P2,t] = −[0, 1]·P1·nutrient_P2

generic process no_saturation
  variables: P{number}, nutrient_P{number}
  equations: nutrient_P = P

generic process saturation
  variables: P{number}, nutrient_P{number}
  equations: nutrient_P = P / (P + [0, 1])

Comprehensible Bagging of Process Models

We have seen HIPM produce models that fit the training data but generalize poorly, so we created another system, FUSE, that:
• creates multiple training sets by sampling the original data;
• uses HIPM to induce one process model from each training set;
• creates a new model structure that includes the common processes;
• estimates parameters for this structure from the original data.

We hypothesized that this method would reduce generalization error while keeping models understandable, unlike standard bagging. This shows one can combine ideas about knowledge and statistics. For more details about FUSE, see Bridewell et al. (ICML-2005).
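The "keep the processes common to the bootstrap models" idea can be sketched as follows. This is only an illustration, not FUSE itself: `induce` is a fabricated stand-in for a full HIPM run, and the bag count and inclusion threshold are invented.

```python
# Sketch of structure combination across bootstrap samples: induce one
# model per sample (here faked by `induce`), then keep the processes
# that appear in most of the induced models.
import random
from collections import Counter

def bagged_structure(data, induce, n_bags=10, threshold=0.6, seed=1):
    random.seed(seed)
    counts = Counter()
    for _ in range(n_bags):
        sample = [random.choice(data) for _ in data]  # bootstrap sample
        for process in induce(sample):
            counts[process] += 1
    # Keep processes present in at least `threshold` of the bags.
    return {p for p, c in counts.items() if c >= threshold * n_bags}

# Toy stand-in: 'grazing' appears only when the sample mean is high,
# so it is unstable across bootstrap samples and gets filtered out.
data = list(range(20))
def induce(sample):
    core = {"nutrient_uptake", "phyto_loss"}
    return core | ({"grazing"} if sum(sample) / len(sample) > 12 else set())

common = bagged_structure(data, induce)
```

Unlike ordinary bagging, the output is a single model structure rather than a vote over many models, which is what keeps the result comprehensible.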

FUSE Results on Ross Sea Data

[Chart: test SSE and r² by cross-validation fold]

Five-fold cross-validation on 188 measurements of two variables.

Process Modeling and Missing Data

Our initial algorithms assumed that the variables have no missing samples, so we have developed another extension to HIPM that:
1. replaces missing values with interpolated estimates;
2. uses HIPM to find the model that minimizes squared error;
3. replaces the estimated values with the ones the model predicts;
4. if some values have changed, returns to Step 2.

Experiments suggest that this expectation-maximization variant substantially reduces error on unseen data. This shows another way to combine knowledge with statistics. For more details, see Bridewell et al. (submitted to ICML-2006).
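The four steps above can be sketched as a fixed-point loop. This is an illustration only: `fit_predict` is a hypothetical stand-in for an HIPM run that fits a straight line by least squares, and the initialization and tolerance are invented.

```python
# Sketch of the EM-style loop for missing samples: fill in crude
# estimates, fit a model, re-predict the missing entries, and repeat
# until they stop changing. `fit_predict` stands in for a full HIPM run.

def fit_predict(xs, ys):
    """Least-squares line fit; returns predictions at every x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [my + slope * (x - mx) for x in xs]

def impute(xs, ys, max_iters=50, tol=1e-9):
    missing = [i for i, y in enumerate(ys) if y is None]
    filled = list(ys)
    for i in missing:                      # step 1: crude initial estimates
        filled[i] = 0.0
    for _ in range(max_iters):
        preds = fit_predict(xs, filled)    # step 2: fit the model
        change = 0.0
        for i in missing:                  # step 3: re-predict missing
            change = max(change, abs(filled[i] - preds[i]))
            filled[i] = preds[i]
        if change < tol:                   # step 4: stop when stable
            break
    return filled

ys = [0.0, 2.0, None, 6.0, None]           # observed points lie on y = 2x
filled = impute([0, 1, 2, 3, 4], ys)
```

On this toy input the missing entries converge toward the line through the observed points, mirroring how the model's own predictions gradually replace the initial interpolations.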

Contributions of the Research

In summary, our work on computational discovery has produced:
• a formalism that states scientific knowledge as process models;
• an encoding for background knowledge as generic processes;
• a computational method for inducing process models;
• a related technique for revising initial process models;
• extended methods that combine knowledge with statistics.

Inductive process modeling has great potential to help scientists construct explanatory models of dynamical systems.

Future Research on Process Modeling

Despite our progress to date, further work is needed to:
• produce additional results on other scientific data sets;
• develop more efficient methods for fitting model parameters;
• extend the framework to handle partial differential equations;
• explore evaluation metrics such as match to trajectory shape;
• introduce subsystems to support large-scale modeling.

Taken together, these steps will make inductive process modeling a more robust approach to scientific knowledge discovery.

Relevance for Feature Selection

Knowledge can also assist in the search for useful features by:
• placing constraints on acceptable combinations of features;
• providing an initial set of features from which to start the search;
• biasing selection to produce understandable models.

We can apply these ideas to any representation of discovered knowledge, since all such representations include features as components.

Feature Selection in Process Modeling

We hope to extend our methods for inducing process models to:
• construct initial models that include only a few variables;
• use generic processes, type constraints, and available terms to expand the best-scoring models, by adding new terms between ones in the current model or at its fringe;
• continue this forward-selection scheme to construct ever more inclusive process models.

This strategy mirrors the incremental way that scientists improve their models over time.
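The forward-selection scheme sketched above follows a standard greedy pattern: repeatedly add the term that most improves the model's score, and stop when nothing helps. The code below illustrates that pattern only; the `score` function and candidate terms are fabricated stand-ins for a real model-error measure and a real term library.

```python
# Sketch of greedy forward selection: start from an empty term set and
# add the candidate that most improves the score (lower = better),
# stopping when no addition helps. `score` is a hypothetical stand-in
# for a model-error measure.

def forward_select(candidates, score, max_terms=3):
    chosen = []
    current = score(chosen)
    while len(chosen) < max_terms:
        scored = [(score(chosen + [c]), c)
                  for c in candidates if c not in chosen]
        best_score, best_term = min(scored)
        if best_score >= current:
            break                          # no remaining term improves
        chosen.append(best_term)
        current = best_score
    return chosen, current

# Toy score: only 'uptake' and 'grazing' actually reduce the error.
useful = {"uptake": 0.5, "grazing": 0.3}
score = lambda terms: 1.0 - sum(useful.get(t, -0.01) for t in terms)
terms, err = forward_select(["uptake", "loss", "grazing", "inflow"], score)
```

The small penalty on useless terms makes the stopping rule bite, which parallels how a scientist declines to complicate a model that extra terms do not improve.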

Concluding Remarks

In summary, ideas from symbolic AI remain highly relevant to machine learning and discovery. These ideas revolve around using structural knowledge that can:
• serve as an understandable result of discovery systems;
• provide useful input to discovery systems that guides their search.

One can combine knowledge-based approaches with statistical techniques to gain the benefits of both paradigms. Taken together, they offer a balanced and productive approach to computational induction.

End of Presentation