MCMC Estimation
MCMC = Markov chain Monte Carlo, an alternative approach to estimating models

What is the big deal about Markov chain Monte Carlo methods? While MCMC methods are not new, recent advances in algorithms using these methods have led to something of a revolution in statistics. This revolution is typically described as a "Bayesian" revolution because MCMC methods have mostly been put to work by relying on Bayes' theorem. It turns out that combining Bayes' theorem with MCMC sampling permits an extremely flexible framework for data analysis. However, it is important to keep in mind that a Bayesian approach to statistics and MCMC are separate things, not one and the same. For the past several years, MCMC estimation has given those adopting the Bayesian philosophical perspective a large advantage in modeling flexibility over those using other statistical approaches (frequentists and likelihoodists). Very recently, it has become clear that the MCMC-Bayesian machinery can also be used to obtain likelihood estimates, meaning that one does not have to adopt a Bayesian philosophical perspective to use MCMC methods. The good news is that there are many new capabilities for analyzing data. The bad news is that the literature contains many disparate perspectives (e.g., authors attributing the merits of MCMC to inherent advantages of the Bayesian perspective, plus different schools of Bayesian analysis).

Bayesian Fundamentals 1. Bayes' Theorem:

P(M|D) = P(D|M) × P(M) / P(D)

where:
P(M|D) = the probability of a model/parameter value given the data
P(D|M) = the probability (likelihood) of the data given the model
P(M) = the prior probability of the model/parameter value, based on previous information
P(D) = the probability of observing these data given the data-generating mechanism
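
To make the theorem concrete, here is a minimal sketch in base R (the data values are made up for illustration): a grid approximation of the posterior for a proportion, showing that the posterior is just the normalized product of the likelihood and the prior.

    # Hypothetical example: posterior for a proportion p after
    # observing 7 successes in 10 trials, over a grid of p values.
    p     <- seq(0, 1, by = 0.01)           # candidate parameter values (the "M" in P(M|D))
    prior <- dbeta(p, 1, 1)                 # a flat (uninformative) prior, P(M)
    lik   <- dbinom(7, size = 10, prob = p) # likelihood of the data given each p, P(D|M)
    post  <- lik * prior                    # numerator of Bayes' theorem
    post  <- post / sum(post)               # normalizing plays the role of dividing by P(D)
    p[which.max(post)]                      # posterior mode; here 0.7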

Bayesian Fundamentals (cont.) 2. The context of the Bayesian approach is to reduce uncertainty through the acquisition of new data. 3. What about that prior?
a. When our interest is in predicting the next event, prior information may be very helpful.
b. When our interest is in analyzing data, we usually try to use uninformative priors.
c. When we know something, such as the fact that percentage data cannot take values below 0 or above 100, that can be useful prior information to include.
d. The biggest worry about priors is that they may have unknown influences in some cases.

Bayesian Fundamentals (cont.) 4. Bayesian estimation is now frequently conducted using Markov chain Monte Carlo (MCMC) methods. Such methods are like a kind of bootstrapping that estimates the shape of the posterior distribution. 5. MCMC methods can also be used to obtain likelihoods; remember, posterior ∝ likelihood × prior. Using the data cloning approach described in Lele et al. (2007)*, it is possible to obtain pure likelihood estimates with MCMC.
*Lele, Dennis, and Lutscher (2007) Data cloning: easy maximum likelihood estimation for complex ecological models using Bayesian Markov chain Monte Carlo methods. Ecology Letters 10:551-563.
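
To give a feel for what MCMC estimation actually does, here is a minimal sketch of a random-walk Metropolis sampler in base R. It is a toy illustration (estimating a normal mean under a flat prior, with all values chosen arbitrarily), not how Amos or winBUGS implements the machinery.

    # Toy Metropolis sampler: posterior for the mean of normal data,
    # flat prior, known sd = 1 (illustrative sketch only).
    set.seed(1)
    y <- rnorm(50, mean = 2, sd = 1)        # simulated data
    log_post <- function(mu) sum(dnorm(y, mean = mu, sd = 1, log = TRUE))
    n_iter <- 10000
    draws  <- numeric(n_iter)
    mu <- 0                                 # arbitrary starting value
    for (i in 1:n_iter) {
      prop <- mu + rnorm(1, sd = 0.3)       # random-walk proposal
      # accept with probability min(1, posterior ratio)
      if (log(runif(1)) < log_post(prop) - log_post(mu)) mu <- prop
      draws[i] <- mu
    }
    draws <- draws[-(1:1000)]               # discard an initial burn-in period
    mean(draws)                             # posterior mean; should be close to 2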

MCMC Estimation in Amos The next few slides show a few screenshots of Bayesian/MCMC estimation in Amos. I highly recommend the brief video developed by Jim Arbuckle that can be found at www.amosdevelopment.com/site_map.htm. Just go to this site and look under "videos" for Bayesian Estimation: Intro.

Illustration of Bayesian Estimation in Amos (screenshot, with a callout indicating the icon that initiates MCMC)

Illustration (cont.) (screenshot; a frown symbol means the chain has not yet converged)

Illustration (cont.) (screenshot showing point estimates) A smile symbol means the program has converged; once you have convergence, you can pause the simulation. None of the 95% credible intervals includes the value of 0. This indicates that we are 95% sure that the true values of the parameters fall within the intervals and are nonzero.
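
For readers curious where such intervals come from: given the vector of MCMC draws for a parameter (for instance, the draws object from the toy sampler sketched earlier), a 95% credible interval is simply a pair of quantiles of the sampled posterior.

    # 95% credible interval from MCMC draws (continuing the toy example)
    ci <- quantile(draws, probs = c(0.025, 0.975))
    ci
    # The interval excludes 0 when both endpoints fall on the same side of 0:
    all(ci > 0) | all(ci < 0)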

Illustration (cont.) Some measures of model fit. Posterior predictive p values provide some information on overall model fit, with values closer to 0.50 being better than ones larger or smaller. DIC values for different models can be compared in a fashion similar to the use of AIC or BIC. Model comparison for models estimated with MCMC will be discussed in a separate module.

Illustration (cont.) (screenshot showing the shape of the prior for the parameter for the path from cover to richness) Right-click on a parameter row to select either the prior or the posterior for viewing.

Illustration (cont.) (screenshot showing the shape of the posterior for the parameter for the path from cover to richness) S.D. is the standard deviation of the posterior distribution of the parameter. S.E. is the precision of the MCMC estimate, which is determined by how long you let the process run; it is not the standard error of the parameter! There are important options available at the bottom of the window, such as viewing the trace, the autocorrelation, or the first-half and last-half estimates.
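
The distinction between the two quantities is easy to reproduce from raw draws. The S.D. describes the spread of the posterior itself; the Monte Carlo S.E. shrinks as the chain runs longer. A sketch, continuing the toy example and using the coda package (assuming it is installed) to correct for autocorrelation:

    # Posterior S.D. vs. Monte Carlo S.E. of the posterior-mean estimate
    sd(draws)                            # S.D.: spread of the posterior
    sd(draws) / sqrt(length(draws))      # naive S.E., valid only for independent draws
    library(coda)                        # assumes the coda package is installed
    sd(draws) / sqrt(effectiveSize(mcmc(draws)))  # S.E. corrected for autocorrelation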

Illustration (cont.) (screenshot showing the trace for the parameter for the path from cover to richness) The trace is our evaluation of how stable the estimate was during the analysis. Believe it or not, this is how you want the trace to look! It should not be making long-term directional changes.

Illustration (cont.) (screenshot showing the autocorrelation curve for the parameter for the path from cover to richness) The autocorrelation curve measures the decline toward independence among the solution values as the lag increases. You want it to decline and level off near zero, as it has done here.
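
Both diagnostics are easy to produce from raw draws; the two base-R plots below (again using the draws vector from the toy example) correspond to what Amos displays.

    # Trace plot: should look like stationary noise with no long-term drift
    plot(draws, type = "l", xlab = "iteration", ylab = "parameter value")
    # Autocorrelation plot: should decline toward zero as the lag grows
    acf(draws, main = "Autocorrelation of MCMC draws")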

Standardized Coefficients To get standardized coefficients, including a full set of moments plus their posteriors, you need to select "Analysis Properties", then the "Output" tab, and then place a check mark in front of both "Standardized Estimates" and "Indirect, direct & total effects". If you don't ask for "Indirect, direct & total effects", you will not actually get the standardized estimates. Then, when your MCMC run has converged, go to the "View" dropdown and select "Additional Estimands". You will probably have to grab and drag the upper boundary of the subwindows on the left to see everything produced, but there should be a column of choices for you to view (shown on the next slide). For more information about standardized coefficients in SEM, see, for example, Grace, J.B. and K.A. Bollen. 2005. Interpreting the results from multiple regression and structural equation models. Bulletin of the Ecological Society of America 86:283-295.
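
As background to what Amos computes here: a standardized path coefficient rescales the unstandardized coefficient by the ratio of the standard deviations of the predictor and the response (see Grace and Bollen 2005). A minimal sketch with made-up values:

    # Hypothetical illustration of standardizing a path coefficient
    b     <- 0.45            # unstandardized coefficient (made-up value)
    sd_x  <- 1.8             # standard deviation of the predictor (made-up)
    sd_y  <- 2.6             # (implied) standard deviation of the response (made-up)
    b_std <- b * sd_x / sd_y # standardized coefficient
    b_std                    # about 0.31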

Standardized Coefficients (cont.) (screenshot; one pane shows the results, and another offers various viewing options)

Calculating R2 Values Amos does not give the R2 values for response variables when the MCMC method is used for estimation. Some statisticians tend to shy away from making a big deal about R2 values because they are properties of the sample rather than of the population. However, other statisticians and most subject-area scientists are usually quite interested in standardized parameters, such as standardized coefficients and R2 values, which measure the "strength of relationships". On the next slide I show one way to calculate R2 from the MCMC output. The reader should note that R2 values from MCMC analyses are (in my personal view) sometimes problematic in that they are noticeably lower than a likelihood estimation process would produce. I intend to develop a module on this advanced topic at some point.

Calculating R2 Values (screenshot; e1 is the error variance for the response variable)
R2 = 1 − (e1 / implied variance of salt_log)
We need the implied variances of the response variables to calculate R2. To get implied variances in Amos, you can select that choice in the Output tab of the Analysis Properties window. With the MCMC procedure, you have to request Additional Estimands from the View dropdown after the solution has converged. For this example, we get an estimate of the implied variance of salt_log of 0.119. So, R2 = 1 − (0.034/0.119) = 0.714. This compares to the ML-estimated R2 of 0.728. Again, I will have more to say in a later module about variances and errors estimated using MCMC.
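
The arithmetic on this slide can be checked directly:

    # R2 for salt_log, using the numbers reported on this slide
    e1     <- 0.034          # error variance for the response
    v_salt <- 0.119          # implied variance of salt_log (Additional Estimands)
    1 - e1 / v_salt          # = 0.714, vs. 0.728 from ML estimation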

Final Bit Amos makes Bayesian estimation (very!) easy. Amos can do a great deal more than what I have illustrated, such as estimating custom parameters (like the differences between values). Unfortunately, Amos cannot do all the kinds of things that can be done in lower-level packages like winBUGS or R. This may change before too long (James Arbuckle, the developer of Amos, is not saying at the moment). For now, tapping the full potential of MCMC methods requires the use of another software package, such as winBUGS or R. I will be developing separate modules on SEM using winBUGS in the near future for those who want to use more complex models (and are willing to invest considerably more time).