Evidence for Probabilistic Hypotheses: With Applications to Causal Modeling Malcolm R. Forster Department of Philosophy University of Wisconsin-Madison.


Evidence for Probabilistic Hypotheses: With Applications to Causal Modeling. Malcolm R. Forster, Department of Philosophy, University of Wisconsin-Madison. Vals, Switzerland, August 7, 2013.

References
Forster, Malcolm R. (1984): Probabilistic Causality and the Foundations of Modern Science. Ph.D. Thesis, University of Western Ontario.
Forster, Malcolm R. (1988): “Sober’s Principle of Common Cause and the Problem of Incomplete Hypotheses.” Philosophy of Science 55: 538-59.
Forster, Malcolm R. (2006): “Counterexamples to a Likelihood Theory of Evidence.” Minds and Machines 16:
Whewell, William (1858): The History of Scientific Ideas, 2 vols. London: John W. Parker.
Wright, Sewall (1921): “Correlation and Causation.” Journal of Agricultural Research 20:

How to discover causes… TWO THESES
Thesis (a): Probabilistic independences provide a way to discover causal relations.
Thesis (b): Probabilistic independences provide the only way to discover causal relations.
The simplest way to argue against (b) is to show how data can favor X → Y over Y → X.

Back to first principles… Hypothesis testing in general...
Modus Tollens: Hypothesis H entails observation O; O is false; therefore H is false.
Probabilistic Modus Tollens: H entails that observation O is highly probable; O is false; therefore H is false.
THE PROBLEM: In most situations, all rival hypotheses give the total evidence E very low probability. Put O = not-E, run probabilistic modus tollens, and we end up rejecting EVERY hypothesis!
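The PROBLEM can be made concrete with a small sketch. Everything in it is an assumption chosen for illustration and is not taken from the slides: three Bernoulli hypotheses about a coin, evidence E consisting of one particular sequence of 100 tosses with 50 heads, and a cutoff of 0.95 for "highly probable".

```python
# Hypothetical illustration: three rival Bernoulli hypotheses about a coin.
hypotheses = {"theta=0.3": 0.3, "theta=0.5": 0.5, "theta=0.7": 0.7}

# Assumed total evidence E: one specific sequence with 50 heads and 50 tails.
n_heads, n_tails = 50, 50

def prob_of_sequence(theta, heads, tails):
    """Probability of one particular sequence of independent tosses."""
    return theta**heads * (1.0 - theta)**tails

threshold = 0.95  # assumed cutoff for "highly probable"
rejected = {}
for name, theta in hypotheses.items():
    p_E = prob_of_sequence(theta, n_heads, n_tails)
    p_O = 1.0 - p_E  # O = not-E
    # H makes O highly probable, yet O is false (E occurred), so H is rejected.
    rejected[name] = p_O > threshold
    print(f"{name}: P(E) = {p_E:.3e}, rejected = {rejected[name]}")
```

Every hypothesis gives E a tiny probability, so every hypothesis is rejected, including whichever one fits the data best.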

A response to the PROBLEM
(1) We should not focus exclusively on the total evidence E.
(2) We should focus on those aspects of the data O that are central to what the hypothesis says.
Example 1: The agreement of independent measurements of the parameters postulated by the model, e.g., of the parameter in the Bernoulli model, or the agreement of independent measurements of the Earth’s mass.
Example 2: The independencies entailed by d-separation in causal models.

A response to the PROBLEM …continued.
(3) We should look at what is entailed by the models by themselves, without the help of other data. Examples 1 and 2 meet this desideratum.
This also justifies a faithfulness principle: favor a model that entails an independency over one that is merely able to accommodate it (even if the likelihoods go the other way). (I don’t see this as appealing to non-empirical biases, such as simplicity.)

Now apply the agreement-of-measurements idea to the testing of causal models…
What does Forward, X → Y, entail?
The independencies entailed by a DAG are part of what a causal model entails. But the model often says something more: it says something about the forward probabilities (or densities) p(y|x), and nothing (directly) about p(x), p(x,y), or p(x|y).
X → Y says:
If p_1(x), then p_1(x,y) = p_1(x) p(y|x),
If p_2(x), then p_2(x,y) = p_2(x) p(y|x), and so on.

The key idea…
We can use data generated by p_1(x,y) to estimate parameters in p(y|x).
We can use data generated by p_2(x,y) to estimate the same parameters in p(y|x).
The two data clusters provide independent estimates of the parameters. If the estimates agree, then we have an agreement of independent measurements.
The hypothesis “stuck its neck out”: it risked falsification, it survived the test, and it is thereby confirmed.
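The key idea can be sketched in a few lines of simulation. This is a minimal sketch under stated assumptions, not the author's own procedure: data come from Y = X + U with U distributed N(0, 1), the two clusters have input means -10 and +10, each cluster has 5000 points, and p(y|x) is fitted as a straight line by least squares. The cluster means, sample size, and the helper names `make_cluster` and `fit_line` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_cluster(x_mean, n=5000):
    """Draw one cluster from the assumed process Y = X + U, with U ~ N(0, 1)."""
    x = rng.normal(x_mean, 1.0, n)
    y = x + rng.normal(0.0, 1.0, n)
    return x, y

def fit_line(u, v):
    """Least-squares fit v ~ intercept + slope * u; returns (intercept, slope)."""
    slope, intercept = np.polyfit(u, v, 1)
    return intercept, slope

# Two clusters that differ only in the input distribution p(x) (assumed means).
x1, y1 = make_cluster(-10.0)
x2, y2 = make_cluster(10.0)

# Forward model X -> Y: estimate the parameters of p(y|x) from each cluster.
f1 = fit_line(x1, y1)
f2 = fit_line(x2, y2)
print("Cluster 1, y on x: intercept %.2f, slope %.2f" % f1)
print("Cluster 2, y on x: intercept %.2f, slope %.2f" % f2)
```

Under these assumptions both clusters should return an intercept near 0 and a slope near 1, which is what an agreement of independent measurements looks like in this setting.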

Prediction versus Accommodation
[Figure: scatter plot of y against x. Cluster 1 is generated by p_1(x,y); Cluster 2 is generated by p_2(x,y).]
Both X → Y and Y → X are able to accommodate (that is, fit) the total evidence well. So a maximum likelihood comparison is not going to discriminate well.
But suppose we fit a model to Cluster 1, and then to Cluster 2, to see whether the independent measurements of the parameters agree.

The content of X → Y
X → Y says:
If p_1(x), then p_1(x,y) = p_1(x) p(y|x),
If p_2(x), then p_2(x,y) = p_2(x) p(y|x), and so on.
X → Y also says:
If p_1(x), then p_1(x|y) = p_1(x,y)/p_1(y),
If p_2(x), then p_2(x|y) = p_2(x,y)/p_2(y), and so on.
In general, p_1(x|y) ≠ p_2(x|y). That is, X → Y says that the backward probabilities vary. If X → Y is right, then Y → X is wrong.
It’s metaphysically possible for a forward model to say that the forward probabilities depend on the input distribution. But we need to search for uniformities of nature…

The Asymmetry of Regression…
[Figure: scatter plot of y against x.]
The data are generated from Y = X + U, where X is N(−10, 1), U is N(0, 1), and U is independent of X. The y-on-x regression is different from the x-on-y regression.
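The asymmetry can be worked out in closed form for this example. The derivation below is a sketch using standard properties of the bivariate normal distribution. As an assumption made only for illustration, the input mean is written as a general μ in place of −10, so that the dependence on the input distribution is visible.

```latex
\begin{align*}
X &\sim N(\mu, 1), \qquad U \sim N(0, 1), \qquad Y = X + U, \qquad U \text{ independent of } X \\
E[Y \mid X = x] &= x \\
E[Y] &= \mu, \qquad \operatorname{Var}(Y) = 2, \qquad \operatorname{Cov}(X, Y) = 1 \\
E[X \mid Y = y] &= \mu + \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(Y)}\,(y - \mu)
                 = \frac{\mu}{2} + \frac{y}{2} \\
\mu = -10 &\;\Rightarrow\; E[X \mid Y = y] = -5 + \frac{y}{2}
\end{align*}
```

The y-on-x line has slope 1 and intercept 0 whatever μ is, while the x-on-y line has slope 1/2 and an intercept μ/2 that moves with the input mean. Inverting the forward line would give x = y, which is not the x-on-y regression.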

Forward Model: X → Y
[Figure: plot of y against x showing Cluster 1 and Cluster 2.]
X → Y passes the test because… Independent measurements agree!

Backward Model: Y → X
[Figure: plot of y against x.]
Y → X fails the test because… Not all independent measurements agree.
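The failure of the backward model can be reproduced with a short simulation. As before, this is a sketch under assumptions that are not stated on the slides: data come from Y = X + U with U distributed N(0, 1), the two clusters have input means -10 and +10, each has 5000 points, and the backward model is fitted as a least-squares line of x on y. The helper name `backward_fit` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5000  # assumed sample size per cluster

def backward_fit(x_mean):
    """Simulate one cluster from Y = X + U and regress x on y.

    Returns (intercept, slope) of the least-squares line x ~ intercept + slope * y.
    """
    x = rng.normal(x_mean, 1.0, n)
    y = x + rng.normal(0.0, 1.0, n)
    slope, intercept = np.polyfit(y, x, 1)
    return intercept, slope

# Backward model Y -> X: estimate the parameters of p(x|y) from each cluster.
b1 = backward_fit(-10.0)  # Cluster 1 (assumed input mean -10)
b2 = backward_fit(10.0)   # Cluster 2 (assumed input mean +10)
intercept_gap = abs(b1[0] - b2[0])

print("Cluster 1, x on y: intercept %.2f, slope %.2f" % b1)
print("Cluster 2, x on y: intercept %.2f, slope %.2f" % b2)
print("Gap between intercept estimates: %.2f" % intercept_gap)
```

Under these assumptions the two slope estimates agree (both near 1/2), but the intercept estimates sit near -5 and +5, so not all of the independent measurements agree.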

Another way of seeing the same thing...
[Figure: plot of y against x.]
The forwards model fits Cluster 2 (top right) better than the backwards model.

Summary Bullets
The phenomenon is completely general. It does not depend on any special features of the distribution; it depends only on a judicious splitting of the data into clusters.
Bayesians (and likelihoodists) do not split data. (They consider only the likelihoods relative to the total evidence.)
If you don’t split data, then it is more difficult to show that X → Y is right and Y → X is wrong.

FORWARD CAUSAL MODEL
Alice: loss, win; 1 euro, euros; 80, 20
Bob: loss, win; 1 euro, euros; 160, 40
Independent measurements agree!

BACKWARD CAUSAL MODEL
Alice: loss, win; 1 euro, euros; 80, 20
Bob: loss, win; 1 euro, euros; 160, 40
Independent measurements do NOT agree.

Robustness of the Phenomenon
[Figure: plot of y against x.]
In 15 runs the forwards regression is closer to the generating curve, y = x, than the backwards regression.