Causal Models Lecture 12.

Causal Models: Introduction Causal models highlight interactions among variables, often without specifying mechanisms for those relationships. Although causal models may be deterministic, they often use probability theory to address uncertainty. This lecture discusses three approaches: structural equation models, Bayesian networks, and qualitative approaches based on model-checking. The first two formalisms have been used for decades and are supported by multiple informatics tools. Support for the scientific use of qualitative approaches is in its early stages.

Causal Models: Historical Use

Structural Equation Models Structural equation models express linear, causal relationships among variables and include observed, or manifest, variables that may be measured and that are generally associated with data; unobserved, or latent, variables that are typically not measurable in principle; causal links among the variables; and error terms that account for intrinsic randomness or unknown causal factors. As a methodology, structural equation models let scientists connect causal models to observational data.

Structural Equation Models A structural equation model is a system of linear equations with error terms. The equations are often shown as graphs to highlight the causal relationships among the variables.
x1 = 0.56x4 + 0.90x2 + N(0, 1.40)
x2 = N(0, 1.11)
x3 = 1.39x1 + N(0, 1.22)
x4 = -0.52x2 + N(0, 1.07)
(Screenshots from an interactive TETRAD session.)
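To make the four equations above concrete, here is a minimal Python sketch that draws samples from them. It assumes the second argument of N(0, ·) is a standard deviation; the slide does not say whether it is a standard deviation or a variance. The names `simulate_sem` and `ols_slope`, the seed, and the sample size are illustrative choices and are not part of TETRAD.

```python
import random

def simulate_sem(n_samples, seed=0):
    """Draw samples from the linear SEM on the slide.

    Assumption: N(0, s) is read as a normal with standard deviation s.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_samples):
        # Generate variables in causal order: x2 -> x4 -> x1 -> x3.
        x2 = rng.gauss(0.0, 1.11)
        x4 = -0.52 * x2 + rng.gauss(0.0, 1.07)
        x1 = 0.56 * x4 + 0.90 * x2 + rng.gauss(0.0, 1.40)
        x3 = 1.39 * x1 + rng.gauss(0.0, 1.22)
        rows.append({"x1": x1, "x2": x2, "x3": x3, "x4": x4})
    return rows

def ols_slope(rows, cause, effect):
    """Least-squares slope of `effect` regressed on `cause`."""
    n = len(rows)
    mean_c = sum(r[cause] for r in rows) / n
    mean_e = sum(r[effect] for r in rows) / n
    cov = sum((r[cause] - mean_c) * (r[effect] - mean_e) for r in rows)
    var = sum((r[cause] - mean_c) ** 2 for r in rows)
    return cov / var

samples = simulate_sem(20_000)
# x4 has a single parent (x2), so this slope should land near -0.52.
print(round(ols_slope(samples, "x2", "x4"), 2))
```

Because x4 and x3 each have a single parent, a simple regression on that parent should roughly recover the structural coefficient. For x1, which has two parents, a multiple regression on both x2 and x4 would be needed.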

TETRAD: Creating Models TETRAD is an informatics environment that supports structural equation modeling. In addition to creating the structure, researchers can either provide parameters or generate them probabilistically. (Figures: specifying the structure; specifying parameters.)

TETRAD: Interacting with Models TETRAD also lets researchers simulate their models, plot the results, and compare model structures to each other. (Figures: simulation results; histogram of a variable’s values.) TETRAD is a showcase for structural and parametric search and does not support data analysis or comparison. http://www.phil.cmu.edu/projects/tetrad/

Bayesian Networks Bayesian networks replace the equations of structural equation models with conditional probability tables. In the conditional probability table for the coma node, possible values for the node are presented in rows and possible states for the node’s parents appear in columns. (Screenshots from an interactive GeNIe session.)
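The row-and-column layout described above can be sketched in Python as follows. Since the screenshot is not reproduced in this transcript, the parent names (`tumor`, `calcium`) and every probability below are hypothetical placeholders, chosen only to show the shape of such a table.

```python
# Hypothetical parents and numbers; they are NOT taken from the slide's screenshot.
parent_configs = [
    ("tumor=present", "calcium=increased"),
    ("tumor=present", "calcium=normal"),
    ("tumor=absent", "calcium=increased"),
    ("tumor=absent", "calcium=normal"),
]
node_states = ["coma=present", "coma=absent"]

# cpt[row][col] = P(node state in that row | parent configuration in that column)
cpt = [
    [0.80, 0.80, 0.80, 0.05],  # row: coma=present
    [0.20, 0.20, 0.20, 0.95],  # row: coma=absent
]

def column_sums(table):
    """Each column is a distribution over the node's states, so it must sum to 1."""
    return [sum(row[col] for row in table) for col in range(len(table[0]))]

def lookup(table, state, config):
    """Return P(state | config) from the table."""
    return table[node_states.index(state)][parent_configs.index(config)]

print(column_sums(cpt))
print(lookup(cpt, "coma=present", ("tumor=absent", "calcium=normal")))
```

The table has one column per combination of parent states, so its width grows multiplicatively with the number of parents.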

GeNIe GeNIe is an informatics environment that supports building, running, and learning Bayesian networks.

GeNIe: Creating and Interacting with Models Creating models in GeNIe is similar to working in TETRAD, except users fill in conditional probability tables. For a given Bayesian network, GeNIe can calculate the probabilities of each node state. Researchers can also input evidence (set the value of one or more nodes) and see the effect on probabilities. (Figures: belief state before entering any evidence; belief state after asserting that a coma is present.)
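The effect of entering evidence can be illustrated with a small Python sketch. It uses brute-force enumeration over a hypothetical two-node network (tumor -> coma) with made-up probabilities. This is only an illustration of the idea of belief updating; it does not describe how GeNIe computes its results, and the function name `beliefs` is an assumption.

```python
from itertools import product

# Hypothetical two-node network: tumor -> coma. All numbers are illustrative.
p_tumor = {True: 0.2, False: 0.8}
p_coma_given_tumor = {          # indexed as [tumor][coma]
    True: {True: 0.8, False: 0.2},
    False: {True: 0.05, False: 0.95},
}

def joint(tumor, coma):
    """Joint probability from the chain rule: P(tumor) * P(coma | tumor)."""
    return p_tumor[tumor] * p_coma_given_tumor[tumor][coma]

def beliefs(evidence=None):
    """Return P(variable = True | evidence) for each node, by enumeration."""
    evidence = evidence or {}
    totals = {"tumor": 0.0, "coma": 0.0}
    norm = 0.0
    for tumor, coma in product([True, False], repeat=2):
        world = {"tumor": tumor, "coma": coma}
        # Skip assignments that contradict the asserted evidence.
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(tumor, coma)
        norm += p
        for name, value in world.items():
            if value:
                totals[name] += p
    return {name: total / norm for name, total in totals.items()}

before = beliefs()                 # belief state with no evidence
after = beliefs({"coma": True})    # belief state after asserting coma is present
print(before, after)
```

With these placeholder numbers, asserting that a coma is present raises the belief in a tumor above its prior, which is the kind of shift the two belief-state figures show.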

Uses of Structural Equation Models and Bayesian Networks Scientists in several disciplines use structural equation models and Bayesian networks to explain observations. Moreover, Bayesian networks provide the foundation for informatics tools that diagnose lymph node diseases in patients (e.g., Pathfinder: Heckerman, 1990) and that monitor and direct attention to details in the space shuttle propulsion system (e.g., Vista: Horvitz, 1992). Other uses include inferring missing values and classification.

Qualitative Causal Models Qualitative causal models represent relationships between variables as positive or negative influences. In some cases, these influences come from a richer relational ontology. Each environment must handle the distinction between the qualitative, abstract relationships and quantitative data.
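One simple way to picture positive and negative influences is a signed graph, sketched below in Python. The variable names and edges are hypothetical, and the rule of composing influences along a path by multiplying their signs is a common convention in qualitative reasoning rather than something this lecture specifies.

```python
# Hypothetical signed influence graph: +1 = positive influence, -1 = negative influence.
influences = {
    ("a", "b"): +1,
    ("b", "c"): -1,
    ("c", "d"): -1,
}

def path_sign(path, edges):
    """Multiply edge signs along a path given as a list of variable names."""
    sign = 1
    for source, target in zip(path, path[1:]):
        sign *= edges[(source, target)]
    return sign

# Two negative influences in a row compose to a positive one.
print(path_sign(["a", "b", "c", "d"], influences))
```

A representation like this says nothing about magnitudes, which is why a qualitative environment needs a separate step to relate such signs to quantitative data.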

GenePath: Model Construction GenePath is an interactive modeling system for qualitative model construction. The network reflects how particular genes affect aggregation in D. discoideum. When starved, this organism transitions from a unicellular to a multicellular form by aggregating. (Figures: knowledge of relationships in the genetic network; graphical representation of the genetic network, in which red lines are inhibition, green lines are activation, and numbers indicate confidence. GenePath images from an interactive session on the website.)

GenePath: Incorporating Data Adding data, experimental or observational, to GenePath results in an automatic revision of the qualitative model. The program integrates knowledge and data to find a network that reflects current knowledge. An updated model of D. discoideum aggregation that includes new links supported by various data sets.

Hybrow Hybrow is designed to evaluate hypotheses against a knowledge base and data sets. The hypotheses are actually qualitative models. The model contains spatial relationships (Gal4p is in the nucleus). The model also includes biological operators (binds, transports, etc.). (Screenshot from the Hybrow website.)

Hybrow: Model (Hypothesis) Text Models consist of events:
ev1 = Gal4p binds Gal80p in nuc(leus) in w(ild)t(ype)
ev2 = Mig1p repress gal4 in nuc in wt in presence_of glucose
ev3 = Mig1p not repress gal4 in nuc in wt in absence_of
hy1 = ev1 + ev2 + ev3
Submitting this model to Hybrow results in a collection of confirmatory and contradictory evidence.
ev2:: Ontology: Agent b has to be gene for repress
      Data: Mig1p repress gal4 wt nuc (PubMed link)
ev3:: Ontology: Agent b has to be gene for repress
(Example from the Hybrow website.)
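The ontology message "Agent b has to be gene for repress" suggests a type constraint on the second agent of an event. The following toy Python sketch shows what such a check could look like. It is not Hybrow's parser or ontology: the token layout assumed by `parse_event`, the `ENTITY_TYPES` table, and the `REQUIRED_TYPE_OF_B` rule are all hypothetical and exist only to illustrate the idea.

```python
# Hypothetical lookup tables, for illustration only.
ENTITY_TYPES = {"Gal4p": "protein", "Gal80p": "protein", "Mig1p": "protein", "gal4": "gene"}
REQUIRED_TYPE_OF_B = {"repress": "gene"}  # operator -> required type of agent b

def parse_event(text):
    """Split 'A [not] operator B in ...' into its leading parts (assumed layout)."""
    tokens = text.split()
    negated = tokens[1] == "not"
    offset = 2 if negated else 1
    return {
        "agent_a": tokens[0],
        "negated": negated,
        "operator": tokens[offset],
        "agent_b": tokens[offset + 1],
    }

def satisfies_ontology(event):
    """True if the operator imposes no type on agent b, or agent b has that type."""
    required = REQUIRED_TYPE_OF_B.get(event["operator"])
    return required is None or ENTITY_TYPES.get(event["agent_b"]) == required

ev1 = parse_event("Gal4p binds Gal80p in nuc in wt")
ev2 = parse_event("Mig1p repress gal4 in nuc in wt in presence_of glucose")
print(satisfies_ontology(ev1), satisfies_ontology(ev2))
```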

Causal Modeling: Summary The software presented in this lecture shares several features: all the systems let scientists specify models; apart from Hybrow, all the systems use a simplified representation for variable interaction; and, although not discussed here, all the systems incorporate discovery components. Curiously, the quantitative systems lack support for evaluating models against data. In contrast, the qualitative systems have minimal utility without their interaction with external data and knowledge.