Statistical Modeling and Analysis of MOFEP. Chong He (with John Kabrick, Xiaoqian Sun, Mike Wallendorf), Department of Statistics, University of Missouri-Columbia.

Presentation transcript:

Outline
Review current statistical analysis
Spatial structure
Bayesian multivariate spatial modeling
Our progress and challenge
New research: sampling design?

Current statistical analysis models for MOFEP studies
Randomized complete block (Sheriff & He):
  -- using compartment as unit,
  -- 9 data points each year,
  -- 5 unknown parameters: 2 for blocks, 2 for treatments, and 1 for variance;
Split-plot (Sheriff & He):
  -- using ELT as unit,
  -- to test treatment effect: 9 data points and 5 unknown parameters,
  -- to test ELT-related effects: 18 data points and 10 unknown parameters (assuming 2 ELTs per compartment).
Split-plot with repeated measurements (Sheriff & He):
  -- using ELT as unit, repeated over years.
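To make the parameter counting in the block design concrete, here is a minimal sketch of the sum-of-squares decomposition for a randomized complete block layout. It assumes 3 blocks and 3 treatments with one value per compartment, which is what "2 for blocks, 2 for treatments" and 9 data points suggest; the numbers in `y` and the function name `rcbd_anova` are hypothetical and are not MOFEP data.

```python
# Hypothetical data: rows are 3 blocks, columns are 3 treatments,
# one value per compartment (9 data points in total).
y = [
    [10.0, 12.0, 15.0],
    [11.0, 14.0, 15.0],
    [9.0, 13.0, 17.0],
]


def rcbd_anova(y):
    """Sum-of-squares decomposition for a randomized complete block design."""
    b, t = len(y), len(y[0])
    grand = sum(sum(row) for row in y) / (b * t)
    block_means = [sum(row) / t for row in y]
    trt_means = [sum(y[i][j] for i in range(b)) / b for j in range(t)]

    ss_total = sum((v - grand) ** 2 for row in y for v in row)
    ss_block = t * sum((m - grand) ** 2 for m in block_means)
    ss_trt = b * sum((m - grand) ** 2 for m in trt_means)
    ss_err = ss_total - ss_block - ss_trt

    # With 9 points, only (b - 1) * (t - 1) = 4 degrees of freedom
    # remain for estimating the error variance.
    df_trt, df_err = t - 1, (b - 1) * (t - 1)
    f_trt = (ss_trt / df_trt) / (ss_err / df_err)
    return {
        "ss_total": ss_total, "ss_block": ss_block,
        "ss_trt": ss_trt, "ss_err": ss_err,
        "df_trt": df_trt, "df_err": df_err, "f_trt": f_trt,
    }


result = rcbd_anova(y)
print(result)
```

The treatment F statistic is compared against an F distribution with 2 and 4 degrees of freedom, which shows how little information is left for the error term when the compartment is the unit.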

Current statistical analysis models for MOFEP studies (cont.)
Meta-analysis (Gram et al.):
  -- using compartment as unit,
  -- based on effect size $d_j = (M_T - M_C)/SD_{TC}$ and cumulative effect size $d_+$.
Others, such as regression & ANOVA:
  -- using sample plot as unit,
  -- lots of data (assuming data points are independent),
  -- resulting in a large type I error rate (indicating a significant treatment effect when there is none); the error rate could be as high as 40%. α = .05 is based on the independence assumption.
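The inflation of the type I error rate when sample plots are treated as independent can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reconstruction of the MOFEP analysis: two treatments with no true difference, 3 compartments per treatment, 50 plots per compartment, and a shared random compartment effect. The standard deviations `sd_comp` and `sd_plot` and all function names are hypothetical.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def pooled_t(a, b):
    """Absolute two-sample t statistic with pooled variance."""
    ma, mb = mean(a), mean(b)
    ss = sum((x - ma) ** 2 for x in a) + sum((x - mb) ** 2 for x in b)
    pooled_var = ss / (len(a) + len(b) - 2)
    se = (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5
    return abs(ma - mb) / se


def rejection_rates(n_sims=1000, n_comp=3, n_plots=50,
                    sd_comp=0.3, sd_plot=1.0, seed=1):
    """Share of simulations rejecting H0 at alpha = .05 when H0 is true."""
    rng = random.Random(seed)
    naive = correct = 0
    for _ in range(n_sims):
        groups = []
        for _ in range(2):  # two treatments, no true difference
            comps = []
            for _ in range(n_comp):
                # Effect shared by every plot in the same compartment.
                shift = rng.gauss(0, sd_comp)
                comps.append([shift + rng.gauss(0, sd_plot)
                              for _ in range(n_plots)])
            groups.append(comps)
        plots = [[v for c in g for v in c] for g in groups]
        comp_means = [[mean(c) for c in g] for g in groups]
        # Plot as unit: 298 df, critical value about 1.968.
        naive += pooled_t(plots[0], plots[1]) > 1.968
        # Compartment as unit: 4 df, critical value 2.776.
        correct += pooled_t(comp_means[0], comp_means[1]) > 2.776
    return naive / n_sims, correct / n_sims


naive_rate, compartment_rate = rejection_rates()
print(f"plot as unit: {naive_rate:.3f}, "
      f"compartment as unit: {compartment_rate:.3f}")
```

With these settings the plot-level test should reject far more often than the nominal 5%, while the compartment-level test should stay close to it. The size of the inflation depends on how strongly plots within a compartment are correlated.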

Spatial structure
Physical and biological variables observed in nature display spatial patterns (gradients and patches);
Patterns may result either from deterministic processes or from processes causing spatial autocorrelation, or both;
Model 1 (spatial dependence):
  $y_j = \mu_j + f(\text{explanatory variables}_j) + \varepsilon_j$
Model 2 (spatial autocorrelation):
  $y_j = \mu_j + \sum_i f(y_i - \mu_y) + \varepsilon_j$
Model 3 (combination of models 1 & 2):
  $y_j = \mu_j + f_1(\text{explanatory variables}_j) + \sum_i f_2(y_i - \mu_y) + \varepsilon_j$
Model 4: the explanatory variables$_j$ may themselves be modeled by model 3.
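One common way to check whether a variable shows the kind of spatial autocorrelation described in Model 2 is Moran's I. The slides do not name this statistic, so the sketch below is illustrative only. The transect layout, the adjacency weights, and the two example series are hypothetical.

```python
def morans_i(values, weights):
    """Moran's I for values at n sites, given an n x n spatial weight matrix."""
    n = len(values)
    m = sum(values) / n
    dev = [v - m for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / w_sum) * cross / sum(d * d for d in dev)


def transect_weights(n):
    """Weight 1 for adjacent sites on a line, 0 otherwise (hypothetical layout)."""
    return [[1.0 if abs(i - j) == 1 else 0.0 for j in range(n)]
            for i in range(n)]


w = transect_weights(10)
gradient = [float(i) for i in range(10)]                     # smooth gradient
alternating = [1.0 if i % 2 == 0 else -1.0 for i in range(10)]  # checkerboard

print(morans_i(gradient, w))      # positive: neighbours are similar
print(morans_i(alternating, w))   # negative: neighbours are dissimilar
```

Values near zero suggest no spatial autocorrelation under this weighting; a clearly positive value is the situation in which treating sample plots as independent becomes questionable.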

Bayesian multivariate spatial model
Bayesian method:
  likelihood f(y | θ) + prior (θ) → posterior (θ | y)
  -- all inference is based on the posterior
  -- informative & non-informative priors
Bayesian multivariate spatial model:
  $y_j = \mu_j + f_1(\text{explanatory variables}_j) + \sum_i f_2(y_i - \mu_y) + \varepsilon_j$, where $y_j$ and $\mu_j$ are vectors
  -- priors on unknown parameters
  -- latent variables: the response variable and explanatory variables may be measured at different locations or scales.
  -- Please discuss your research questions with us and we can help you!
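The step from likelihood and prior to posterior can be shown in the simplest conjugate case, where the posterior is proportional to the likelihood times the prior. This is a sketch of a univariate normal mean with known variance, far simpler than the multivariate spatial model above. The data, the prior settings, and the function name `normal_posterior` are hypothetical.

```python
def normal_posterior(data, sigma2, prior_mean, prior_var):
    """Posterior mean and variance of theta when
    y_i ~ N(theta, sigma2) with sigma2 known and theta ~ N(prior_mean, prior_var)."""
    n = len(data)
    ybar = sum(data) / n
    # Precisions (inverse variances) of prior and data add up.
    precision = 1.0 / prior_var + n / sigma2
    post_mean = (prior_mean / prior_var + n * ybar / sigma2) / precision
    return post_mean, 1.0 / precision


y = [4.8, 5.6, 5.1, 6.0, 5.5]  # hypothetical observations

# Informative prior: pulls the estimate towards the prior mean.
informative = normal_posterior(y, sigma2=1.0, prior_mean=3.0, prior_var=0.5)
# Nearly non-informative prior: the posterior mean is close to the sample mean.
vague = normal_posterior(y, sigma2=1.0, prior_mean=3.0, prior_var=1e6)

print("informative prior:", informative)
print("vague prior:", vague)
```

The posterior mean is a precision-weighted average of the prior mean and the sample mean, which is why the choice between informative and non-informative priors matters most when data are scarce.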

Our progress and challenge
One Ph.D. student started working on the modeling this semester.
Transferring geo-data from the GIS system to the Splus system.
Started developing a Bayesian spatial model on vegetation data.
Challenge: too many variables to work with.

New research: sampling design?
We may use the developed model to address sampling problems such as:
  -- do we need more or fewer sample points?
  -- where should sample points be added?
  -- how often should we sample?