Finite Population Inference for Latent Values Measured with Error that Partially Account for Identifiable Subjects from a Bayesian Perspective

Presentation transcript:

1 Finite Population Inference for Latent Values Measured with Error that Partially Account for Identifiable Subjects from a Bayesian Perspective. Edward J. Stanek III, Department of Public Health, University of Massachusetts, Amherst, MA

2 Collaborators: Parimal Mukhopadhyay, Indian Statistical Institute, Kolkata, India; Viviana Lencina, Facultad de Ciencias Economicas, Universidad Nacional de Tucumán, CONICET, Argentina; Luz Mery Gonzalez, Departamento de Estadística, Universidad Nacional de Colombia, Bogotá, Colombia; Julio Singer, Departamento de Estatística, Universidade de São Paulo, Brazil; Wenjun Li, Department of Behavioral Medicine, UMASS Medical School, Worcester, MA; Rongheng Li, Shuli Yu, Guoshu Yuan, Ruitao Zhang, Faculty and Students in the Biostatistics Program, UMASS, Amherst

3 Outline: Review of Finite Population Bayesian Models
1. Populations, Prior, and Posterior
2. Notation
3. Exchangeable distribution
4. Sample Space and Data
5. Posterior Distribution, given data

4 Bayesian Model: General Idea (Review). [Diagram: Populations; Prior (# Prior Populations, Prior Probabilities); Data; Posterior (# Posterior Populations, Posterior Probabilities)]

5 Bayesian Model: Population and Data Notation (Review). Notation for: populations; label; latent value; labels; parameter vector; data vector; response; expected response; measurement error variance.

6 Exchangeable Prior Populations: General Idea (Review). When N=3, for each permutation p of the subjects in L (i.e., each different listing), the joint probability density must be identical; random variables with this property are exchangeable. General notation: the common distribution assigns (usually) equal probability to each permutation of subjects in the population.

7 Exchangeable Prior Populations, N=3 (Review). Potential response for each listing of subjects: listings; latent values for a listing; latent values for permutations of a listing.

8 Exchangeable Prior Population: Permutations (Review). Listing p=1: Rose, Daisy, Lily.

9 Exchangeable Prior Populations, N=3: Permutations (Review). Listing p=1: Rose, Daisy, Lily.

10 Exchangeable Prior Populations, N=3: Permutations of Listings (Review). [Figure: panels for Listing p=1 through Listing p=6; annotation: "Same Point in Listing"]
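A minimal sketch of the listings on the slides above, assuming the usual case in which the exchangeable prior puts equal probability on every listing. The subject names and the order of listing p=1 come from the slides; the numbering of listings p=2 through p=6 is simply the order in which `itertools.permutations` generates them and is an assumption, not the numbering used in the presentation.

```python
from fractions import Fraction
from itertools import permutations

# Subject labels in the order given for listing p=1 on the slides.
subjects = ("Rose", "Daisy", "Lily")

# Each distinct ordering of the N = 3 subjects is one listing, p = 1, ..., N!.
listings = list(permutations(subjects))

# Assumption: equal prior probability for every listing.
prior_prob = {listing: Fraction(1, len(listings)) for listing in listings}

for p, listing in enumerate(listings, start=1):
    print(f"p={p}: {', '.join(listing)}  prob={prior_prob[listing]}")
```

Exact fractions are used so that the probabilities sum to one without rounding.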

11 Measurement Error Model Prior Random Variables Population h, Prior # Prior Populations: Vectors Assume Random Variables representing a population are exchangeable When p=1, define Sets Prior

12 Prior Random Variables and Data with Measurement Error Prior Random Variables that will correspond to Latent values for subjects In the data Remaining Prior Random Variables Prior Data

13 Bayesian Model: Exchangeable Prior Populations, N=3, for h when Listing p=1 (Review). Prior. Sample space, n=2.

14 Bayesian Model: Exchangeable Prior Populations, N=3: Sample Point (Review). [Figure: panels for Listing p=1 through Listing p=6]

15 Exchangeable Prior Populations, N=3: Permutations (Review). Listing p=1: Rose, Daisy, Lily.

16 Bayesian Model: Exchangeable Prior Populations, N=3, for h when Listing p=1 (Review). Prior. Sample space, n=2, when Listing p=1.
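The sample space for n=2 can be sketched by enumeration. This assumes that a sample point is the ordered set of subjects in the first n positions of a permutation and that permutations are equally likely; both are assumptions made for illustration, since the slide's own definitions are not preserved in the transcript.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations

subjects = ("Rose", "Daisy", "Lily")  # listing p=1 from the slides
n = 2                                 # sample size from the slides

perms = list(permutations(subjects))

# Assumption: a sample point is the first n positions of a permutation.
counts = Counter(perm[:n] for perm in perms)

# Equal probability per permutation gives the probability of each sample point.
sample_space = {point: Fraction(c, len(perms)) for point, c in counts.items()}

for point, prob in sample_space.items():
    print(point, prob)
```

With N=3 and n=2 every ordered pair of distinct subjects arises from exactly one permutation, so the six sample points are equally likely.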

17 Exchangeable Prior Populations, N=3: Sample Points (Review). [Figure: panels for Listing p=1 through Listing p=6; annotation: "Positive Prob."]

18 Data (Review). [Figure: axis]

19 Data (Review). [Figure: axis with Rose and Daisy]

20 Data (Review). [Figure: axis with Rose and Daisy]

21 Data, n=2: Adding Measurement Error to Rose. [Figure: axis with Rose and Daisy]

22 Exchangeable Prior Populations, N=3: Sample Points with Positive Probability (Review). [Figure: panels for Listing p=1 through Listing p=6]

23 Exchangeable Prior Populations, N=3: Posterior Random Variables (Review). Prior; data. If permutations of subjects in listing p are equally likely: (1) the random variables representing the data are independent of the remaining random variables; (2) the expected value of the random variables for the data is the mean for the data.
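The second statement on this slide can be checked by brute force in the case without measurement error. The latent values below are hypothetical numbers chosen for illustration, and the conditioning rule (keep the permutations whose first n positions hold exactly the observed subjects) is an assumed reading of "points in the prior that match the data".

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical latent values; not taken from the presentation.
latent = {"Rose": Fraction(4), "Daisy": Fraction(10), "Lily": Fraction(7)}

observed = {"Rose", "Daisy"}  # subjects in the data, n = 2
n = len(observed)

# Keep the permutations whose first n positions hold exactly the observed subjects.
matching = [p for p in permutations(latent) if set(p[:n]) == observed]

# Equal prior probabilities renormalize to equal posterior probabilities.
post_prob = Fraction(1, len(matching))

# Expected latent value at each sample position under this posterior.
expected = [sum(post_prob * latent[p[i]] for p in matching) for i in range(n)]

data_mean = sum(latent[s] for s in observed) / n
print(expected, data_mean)
```

Each sample position is equally likely to hold either observed subject, so its expected latent value equals the mean of the observed latent values.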

24 Posterior Random Variables no Measurement Error If permutations of subjects in listing p are equally likely: Review Data Populations Prior # Prior Populations:

25 Posterior Random Variables no Measurement Error If permutations of subjects in listing p are equally likely: Review Data Populations Prior # Prior Populations: nxn random permutation matrix Data

26 Posterior Random Variables no Measurement Error If permutations of subjects in listing p are equally likely: Review Data Populations Prior # Prior Populations:

27 Posterior Random Variables no Measurement Error If permutations of subjects in listing p are equally likely: Review Data Populations Prior # Prior Populations:

28 Data without Measurement Error. Data (set); vectors; latent value. Let ... be an ... permutation matrix, k=1,…,n!. Data (set of vectors).

29 Data with and without Measurement Error. No measurement error: latent value; data. With measurement error: response at t; potential response; data. Vectors; sets.

30 Data with Measurement Error. Sets; data; the latent value; the realization of the response on occasion t. Assume: measurement errors are independent when a subject is measured repeatedly.
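The equations on this slide are not preserved in the transcript. A common way to write an additive measurement error model that is consistent with the surviving text is sketched below. All symbols are assumptions introduced for illustration and may differ from the presentation's notation.

```latex
% Assumed notation (not taken from the slides):
%   y_s         latent value of subject s
%   Y_{st}      response of subject s on occasion t
%   E_{st}      measurement error for subject s on occasion t
%   \sigma_s^2  measurement error variance for subject s
\begin{align*}
Y_{st} &= y_s + E_{st}, \\
\mathrm{E}(E_{st}) &= 0, \qquad \mathrm{Var}(E_{st}) = \sigma_s^2, \\
E_{st} &\text{ independent of } E_{st'} \text{ for } t \neq t'.
\end{align*}
```

Under these assumptions the expected response for subject s is the latent value y_s, and repeated measures on the same subject differ only through independent errors.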

31 Measurement Error Model: The Data. Vectors. Define: latent values; potential response with error; data.

32 Measurement Error Model Prior Random Variables Populations Prior # Prior Populations: Population h Labels Parameter Vector Assume Random Variables representing a population are exchangeable Defines the axes for a cloud of points in the prior When p=1, define

33 Exchangeable Prior Populations, N=3, No Measurement Error. [Figure: axes Rose, Daisy, Lily; single point]

34 Exchangeable Prior Populations, N=3, with Measurement Error. [Figure: axes Rose, Daisy, Lily; cloud of points]

35 Measurement Error Model Prior Random Variables Population h, Prior # Prior Populations: Vectors Assume Random Variables representing a population are exchangeable Defines the axes for a cloud of points in the prior When p=1, define Sets Prior and, Labels Latent Values Potential Response Vectors of Population

36 Prior Random Variables and Data with Measurement Error If permutations of subjects in listing p are equally likely: Assume Random Variables representing a population are exchangeable in each population Since Response for subject or Population Prior Data

37 Posterior Random Variables with Measurement Error If permutations of subjects in listing p are equally likely: Data Prior Points in the Prior that match the data where

38 Posterior Random Variables with Measurement Error. Data; prior; points in the prior that match the data. Finite population mixed model for the subjects in the data, with a random subject effect. Use this model to obtain the best linear unbiased predictor of the latent value for a subject in the data (which we call the BLUP for a realized subject).
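The predictor itself is not preserved in the transcript. As a rough illustration of how a BLUP pulls a subject's observed mean toward the overall mean, the sketch below uses the textbook shrinkage form for a balanced random-intercept model with variances treated as known. This is a stand-in, not the finite population predictor derived in the presentation; the function name, argument names, and all numbers are hypothetical.

```python
from statistics import mean


def shrinkage_predictor(subject_means, sigma2_between, sigma2_error, m):
    """Textbook shrinkage predictor for a balanced random-intercept model.

    Assumes known variances, m replicate measures per subject, and the
    simple average of subject means as the estimate of the overall mean.
    """
    grand_mean = mean(subject_means)
    # Shrinkage constant: share of the variance of a subject mean that is
    # due to differences between subjects rather than measurement error.
    k = sigma2_between / (sigma2_between + sigma2_error / m)
    return [grand_mean + k * (ybar - grand_mean) for ybar in subject_means]


# Hypothetical observed means for two subjects in the data.
preds = shrinkage_predictor([4.0, 10.0], sigma2_between=9.0, sigma2_error=3.0, m=1)
print(preds)
```

Larger measurement error variance gives a smaller shrinkage constant, so predictions move closer to the overall mean.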

39 Capturing Partial Label Information in the Posterior Distribution Usual Posterior Prior Data for listing p=1 for points where where We want to use the label information for the response error, but not use it for the mean. In the posterior, we want to replace by

40 Capturing Partial Label Information in the Posterior Distribution Usual Posterior Prior Data We want to define In the posterior, we can list the subjects that are in the data in an order, say =realized order for posterior Now for all k=1,…,n! such that Measurement Error variance will be block diagonal, in the realized order for the posterior.
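The idea that the measurement error variance follows the realized order of the subjects can be sketched with a permutation matrix. For simplicity this assumes one measure per subject, so the block diagonal matrix reduces to a diagonal one. The variances and the realized order are hypothetical, and the helper functions are written out here rather than taken from any library.

```python
def permutation_matrix(order):
    """Row i has a 1 in column order[i], so P @ v reorders v by `order`."""
    n = len(order)
    return [[1.0 if j == order[i] else 0.0 for j in range(n)] for i in range(n)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


labels = ["Rose", "Daisy"]
error_var = {"Rose": 2.0, "Daisy": 0.5}  # hypothetical variances

# Diagonal variance matrix in the original label order.
sigma_labeled = [[error_var[labels[i]] if i == j else 0.0
                  for j in range(len(labels))] for i in range(len(labels))]

realized_order = [1, 0]  # hypothetical realized order: Daisy, then Rose
P = permutation_matrix(realized_order)

# Reordering rows and columns together keeps each variance with its subject.
sigma_realized = matmul(matmul(P, sigma_labeled), transpose(P))
realized_labels = [labels[i] for i in realized_order]
print(realized_labels, sigma_realized)
```

Applying the same permutation to rows and columns is what keeps the matrix diagonal while moving each subject's variance to its realized position.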

41 Partially Labeled Posterior Distribution Partially Labeled Posterior Prior Data Latent values for response in the posterior are in random order =realized order for posterior for realized response ??? Measurement error variance is heterogeneous and matches the order of the subjects in the posterior.

42 Partially Labeled Posterior Distribution Prior Data Latent values for response in the posterior are in random order =realized order for posterior for realized response ??? Measurement error variance is heterogeneous and matches the order of the subjects in the posterior. How do we define a prior distribution that will result in such a posterior distribution? Partially Labeled Posterior

43 Capturing Partial Label Information in the Posterior Distribution Usual Posterior Prior Data for points where To form a partially labeled posterior, we need to be equal to for all k=1,…,n!

44 Capturing Partial Label Information in the Posterior Distribution Partially Labeled Posterior Prior Data for points where since ??? How do we define a prior distribution that will result in such a posterior distribution?

45 Capturing Partial Label Information in the Posterior Distribution Partially Labeled Posterior Partially Labeled Prior Data for points where Define =realized order for posterior for realized response

46 Capturing Partial Label Information in the Posterior Distribution Partially Labeled Posterior Partially Labeled Prior Data =realized order for posterior for realized response Measurement error variance is heterogeneous and matches the order of the subjects in the posterior.

47 An Example Posterior Random Variables with Measurement Error where random subject effect Use this model to obtain the best linear unbiased predictor of the latent value for a subject in the data (which we call the BLUP for a realized subject) Prior Population Data

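48 Prior: An Exchangeable Prior, N=3, No Measurement Error. Listings for population h (labeled p=1, p=2, ..., p=6=N! on the slide; three are shown), each with its permutations p*=1,...,6 (R = Rose, L = Lily, D = Daisy):

Permutation   Listing: Rose Lily Daisy   Listing: Lily Rose Daisy   Listing: Daisy Lily Rose
p*=1          R L D                      L R D                      D L R
p*=2          R D L                      L D R                      D R L
p*=3          L R D                      R L D                      L D R
p*=4          L D R                      R D L                      L R D
p*=5          D R L                      D L R                      R D L
p*=6          D L R                      D R L                      R L D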