Ed Stanek and Julio Singer

Slides:

Advertisements

Similar presentations

CmpE 104 SOFTWARE STATISTICAL TOOLS & METHODS MEASURING & ESTIMATING SOFTWARE SIZE AND RESOURCE & SCHEDULE ESTIMATING.

Advertisements

Chap 8-1 Statistics for Business and Economics, 6e © 2007 Pearson Education, Inc. Chapter 8 Estimation: Single Population Statistics for Business and Economics.

1 When are BLUPs Bad Ed Stanek UMass- Amherst Julio Singer USP- Brazil George ReedUMass- Worc. Wenjun LiUMass- Worc.

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 7-1 Chapter 7 Confidence Interval Estimation Statistics for Managers.

SPH&HS, UMASS Amherst 1 Sampling, WLS, and Mixed Models Festschrift to Honor Professor Gary Koch Ed Stanek and Julio Singer U of Mass, Amherst, and U of.

1 Finite Population Inference for the Mean from a Bayesian Perspective Edward J. Stanek III Department of Public Health University of Massachusetts Amherst,

1 Finite Population Inference for Latent Values Measured with Error that Partially Account for Identifable Subjects from a Bayesian Perspective Edward.

1 Finite Population Inference for Latent Values Measured with Error from a Bayesian Perspective Edward J. Stanek III Department of Public Health University.

1 Sampling Models for the Population Mean Ed Stanek UMASS Amherst.

Chapter 7 Estimation: Single Population

8-1 Copyright ©2011 Pearson Education, Inc. publishing as Prentice Hall Chapter 8 Confidence Interval Estimation Statistics for Managers using Microsoft.

Copyright ©2011 Pearson Education 8-1 Chapter 8 Confidence Interval Estimation Statistics for Managers using Microsoft Excel 6 th Global Edition.

Separate multivariate observations

Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 7-1 Chapter 7 Confidence Interval Estimation Statistics for Managers.

Basic Business Statistics, 11e © 2009 Prentice-Hall, Inc. Chap 8-1 Chapter 8 Confidence Interval Estimation Basic Business Statistics 11 th Edition.

Confidence Interval Estimation

PROBABILITY (6MTCOAE205) Chapter 6 Estimation. Confidence Intervals Contents of this chapter: Confidence Intervals for the Population Mean, μ when Population.

Statistics for Managers Using Microsoft Excel, 5e © 2008 Pearson Prentice-Hall, Inc.Chap 8-1 Statistics for Managers Using Microsoft® Excel 5th Edition.

1 Chapter 7 Sampling Distributions. 2 Chapter Outline  Selecting A Sample  Point Estimation  Introduction to Sampling Distributions  Sampling Distribution.

Chap 8-1 Chapter 8 Confidence Interval Estimation Statistics for Managers Using Microsoft Excel 7 th Edition, Global Edition Copyright ©2014 Pearson Education.

Chapter 8: Simple Linear Regression Yang Zhenlin.

Statistics for Business and Economics 8 th Edition Chapter 7 Estimation: Single Population Copyright © 2013 Pearson Education, Inc. Publishing as Prentice.

Week 21 Statistical Model A statistical model for some data is a set of distributions, one of which corresponds to the true unknown distribution that produced.

Tom.h.wilson Department of Geology and Geography West Virginia University Morgantown, WV.

Chapter 8 Confidence Interval Estimation Statistics For Managers 5 th Edition.

STA302/1001 week 11 Regression Models - Introduction In regression models, two types of variables that are studied:  A dependent variable, Y, also called.

Virtual University of Pakistan

Stats Methods at IC Lecture 3: Regression.

Chapter 8: Estimating with Confidence

Dependent-Samples t-Test

CHAPTER 9 Testing a Claim

Chapter 7 Confidence Interval Estimation

Linear Algebra Review.

Chapter 7. Classification and Prediction

Chapter 14 Inference on the Least-Squares Regression Model and Multiple Regression.

Chapter 7 (b) – Point Estimation and Sampling Distributions

Point and interval estimations of parameters of the normally up-diffused sign. Concept of statistical evaluation.

business analytics II ▌assignment one - solutions autoparts 

5 Systems of Linear Equations and Matrices

Relative Values.

Comparing Three or More Means

Hypothesis Testing and Confidence Intervals (Part 1): Using the Standard Normal Lecture 8 Justin Kern October 10 and 12, 2017.

12 Inferential Analysis.

Statistics and Art: Sampling, Response Error, Mixed Models, Missing Data, and Inference Ed Stanek And others: Recai Yucel, Julio Singer, and others on.

Introduction to Summary Statistics

CONCEPTS OF ESTIMATION

Introduction to Summary Statistics

Econ 3790: Business and Economics Statistics

Inferential Statistics

CHAPTER 9 Testing a Claim

Geology Geomath Chapter 7 - Statistics tom.h.wilson

Confidence Interval Estimation

One-Way Analysis of Variance

12 Inferential Analysis.

Sampling Distributions

Lecture 7 Sampling and Sampling Distributions

Product moment correlation

Chapter 8: Estimating with Confidence

Chapter 8: Estimating with Confidence

CHAPTER 9 Testing a Claim

8.3 Estimating a Population Mean

CHAPTER 9 Testing a Claim

Chapter 8: Estimating with Confidence

2/5/ Estimating a Population Mean.

Chapter 8: Estimating with Confidence

Linear Regression and Correlation

MGS 3100 Business Analysis Regression Feb 18, 2016

STATISTICS INFORMED DECISIONS USING DATA

Interval Estimation of mean response

Presentation transcript:

Ed Stanek and Julio Singer Sampling, WLS, and Mixed Models Festschrift to Honor Professor Gary Koch Ed Stanek and Julio Singer U of Mass, Amherst, and U of Sao Paulo, Brazil Talk given on Tuesday, October 13, 2009 at the Carolina Club, UNC Chapel Hill at the Festschrift in Honor of Professor Gary Koch by Ed Stanek SPH&HS, UMASS Amherst

Finite Population Mixed Models Research Group Finite Population Mixed Model Research Group: From left to right: Anne Stanek; Luzmery Gonzalas; Viviana Lencina; Julio Singer; Alice Singer; Ed Stanek; Silvina San Martino; Maria Lucia Singer; and Wenjun Li Luzmery Gonzalas, Columbia; Viviana Lencina, Argentina; Julio Singer, Brazil; Silvina San Martino, Argentina; Wenjun Li, US; and Ed Stanek US

How do we make up models to get better insight using limited information? Study Design- Sampling Response Error Model Assumptions An Example: What is a subject’s saturated fat intake? In statistics, we look at information, and try to draw conclusions or make statements that go beyond the observations themselves. These conclusions, or insights, are the realm of inference in statistics. They are highly ritualized, but have as their origin the desire to simplify or summarize information in a meaningful way. Inference may use the study design and sampling plan, additional assumptions, and model frameworks. The example we consider is trying to determine a subject’s ‘true’ saturated fat intake from food/drink. True saturated fat intake is defined as the average saturated fat intake over a consecutive 21 day period prior to a blood draw. This time period is relevant for biological effects on serum cholesterol. SPH&HS, UMASS Amherst

Seasons Study UMASS Worc The Seasons Study was a longitudinal study of volunteers between the age of 18 and 75 from the Fallon HMO in Worcester, Massachusetts. The study aim was to identify factors associated with seasonal variation in cholesterol, accounting for changes in diet and physical activity over the seasons. A total of 641 volunteer subjects were enrolled. The protocol involved collection of baseline data including serum cholesterol measures, and subsequent quarterly follow-up measures for four subsequent quarters. Three weeks preceding the quarter’s measure, three un-announced 24-hour telephone dietary recalls were conducted by trained nutritionists on each subject, with two recalls during the week, and one on the weekend. More details and the results are given in the following references, with example data given at the study website at http://www.umass.edu/seasons/ Ockene, I.S., Chiriboga, D.E., Stanek, E.J.III, Harmatz, M.G., Nicolosi, R., Saperia, G., Well, A.D., Merriam, P.A., Reed, G., Ma, Y., Matthews, C.E. and Hebert, J.R. (2004). “Seasonal variation in serum cholesterol: Treatment implications and possible mechanisms.” Archives of Internal Medicine, 164:863-870. SPH&HS, UMASS Amherst

Seasons Study UMASS Worc- focus on 3 subjects This scatter plot is constructed using the mean and standard deviation for saturated fat intake for subjects with 10 or more 24-hour dietary recalls in the Season’s study. We will discuss a simpler problem- one where the population consists of N=3 subjects. SPH&HS, UMASS Amherst

The Problem-Simplified Observe: 1 Measure of SFat on each Subject Assume: Response Error (RE) Variance known Question: How do we estimate Subject’s True Sat Fat intake? Begin with a Response Error Model … which leads to…. Mixed Model Finite Population Mixed Model We discuss how to estimate a subject’s true saturated fat intake with minimal observed data (1 measure per subject), and fairly minimal assumptions (we assume response error variance is known). First, we use simple observations to motivate a response error model, and define the latent value. Next, we discuss two approaches to add to the response error model: A ‘model-based’ approach where we replace a subject effect by a random variable. A ‘design-based’ approach where we assume the subjects were a realization of a sample. Daisy Lily Rose SPH&HS, UMASS Amherst

Population We start with a population (where N=3). Subjects are identifiable, and correspond to Daisy, Lily, and Rose. It is not necessary to define the population to begin with. Instead, we could simply begin with a set of subjects, say Daisy and Rose, where a response is observed on each subject. SPH&HS, UMASS Amherst

Population Set We will assume that we only observe response on two subjects. Thus, whether or not Lily is in the population is not important. SPH&HS, UMASS Amherst

Response 4 11 In a study, we assume that we observe response on each subject in a set. For example, suppose the response corresponds to the grams of saturated fat reported to have been eaten in the previous 24 hour period. For ease of presentation, we assume that 10 (g/day) has been subtracted from each observed value to keep the numbers small. As a result, to know the actual observed response, 10 must be added to the value given. SPH&HS, UMASS Amherst

Response 11 The response that we observe is not always the same, and may vary over days for the same subjects. SPH&HS, UMASS Amherst

Response 4 9 Other possible responses are given. SPH&HS, UMASS Amherst

Response 4 11 Here are some more responses. SPH&HS, UMASS Amherst

Response 4 9 And some more responses. SPH&HS, UMASS Amherst

Response 11 After a while, we gain some experience as to what we may expect the response to be for a subject. SPH&HS, UMASS Amherst

Response…….. 11 4 We would still be hard pressed to guess the saturated fat intake for subject on a new day. SPH&HS, UMASS Amherst

Response Error Model for Set k 1 2 9 11 0.5 k 1 2 4 0.5 A response error model is a natural way of representing what we see. First notice that we refer Daisy with j=1, and to Rose with j=2. These subscripts are assigned as consecutive numbers to subjects after arranging subjects by name in ascending alphabetical order. This is the order commonly used in survey sampling when referring to elements in a set. Response is expressed as the sum of what we expect to see, plus a deviation. The response error model that we consider is additive. Multiplicative models could also be considered. We consider a simple representation of response error with only two possible values, each equal in absolute value to the square root of “sigma2”. Note that “simga2” is the usual population variance. SPH&HS, UMASS Amherst

Summary Response Error Model Latent Value A subject’s latent value is defined to be the expected response for the subject. Notice that we never observe a response equal to the latent value for a subject in this example. The latent value is an average that we define, and claim to be able to interpret. For example, if a subject’s latent value is high (say 30 mg/day saturated fat intake), a nutritionist may advise the subject to lower their saturated fat intake. However, on some days, the subject may already have low saturated fat intake. More clearly, the nutritionist may want to focus advice on days where saturated fat intake is likely to be high, and pinpoint advice to those days. This focus is on responses, not latent values. SPH&HS, UMASS Amherst

Re-parameterized RE Model Mean Latent Value: We define an average latent value for Daisy and Rose. This is an average for the set. Deviations for subjects are defined relative to this average, so that the sum of beta1 and beta2 is zero. SPH&HS, UMASS Amherst

Generating Response in the RE Model This is an urn model where we sample with replacement from possible response errors. SPH&HS, UMASS Amherst

Generating Response in the RE Model SPH&HS, UMASS Amherst

Generating an Observed Response in the RE Model -1 1 -2 2 -2 -1 Selections of response error give rise under the response error model to the observed response. SPH&HS, UMASS Amherst

Sample Space Response Error Model 9 4 11 4 11 9 9 The pair of observed response, or response vector, is a point in the sample space. Other points correspond to other possible response vectors. We assume each subject has 2 possible responses, leading to four points in the sample space. SPH&HS, UMASS Amherst

Response Error Model 9 4 11 4 9 11 This summarizes the setting. 11 SPH&HS, UMASS Amherst

Mixed Model (MM) Random Effect The mixed model substitutes a random variable for the ‘subject effect’, betaj. The subject is identified in a mixed model. SPH&HS, UMASS Amherst

Mixed Model (MM) Latent Value Adding the random effect to the mean, we have the mixed model latent value. This value, when realized, is interpreted as the latent value for the subject. We represent it by Pj. We assume the expected value (i.e. mean) is zero, and the variance is gamma2 for the random effect. Expectation is with respect to the random effect. One way to think of the random effect is as follows. Suppose there is a population of possible subject effects which we call a superpopulation. Let us select one of these effects at random. We may interpret the random effect as a selection from this superpopulation. The superpopulation may (or may not) be identifiable. Notice that the subscript for expectation is over the assumed ‘model’. It is the assumptions on the random effect that define this model. The use this idea to define the random effect. We take as the superpopulation the random effects for the two subjects, Daisy and Rose. SPH&HS, UMASS Amherst

Mixed Model (MM) in action Notice that in the mixed model, the subjects are identifiable. We have Daisy, (j=1) with her possible response error values, and Rose, (j=2) with her possible response error values. We can obtain response with a two step process. First, we pick a subject effect. Second, we pick response error. However, response could be obtained in one step: simultaneously select a subject effect and response error. We do not need to know what subject effect is realized to pick from possible response errors for a subject. SPH&HS, UMASS Amherst

Mixed Model (MM) …. SPH&HS, UMASS Amherst

Mixed Model (MM) ….?? Who Are They? ? ? We have illustrated the subject effects as if they are an attribute of a subject (keeping the same color for effect as we had when we formed the superpopulation of subject effects). This is consistent with subjects have different latent values, and the subject effects colored to keep the correspondence with the subject. Who Are They? ? ? SPH&HS, UMASS Amherst

Mixed Model (MM) ?? ?? What Does it Mean? ? ? The outcomes illustrated here for Daisy and Rose are never observed in practice. They have positive probability of occurring in the mixed model. What Does it Mean? ? ? SPH&HS, UMASS Amherst

Sample Space (MM) 3 8 1 8 Artificial 1 12 3 12 9 4 11 Real Some sample point may be observed. These are called “Real”. Other sample points have positive probability in the mixed model, but are never observed. These we call “Artificial”. 9 4 11 Real SPH&HS, UMASS Amherst

MM-Latent Values? Daisy (j=1) Rose (j=2) 2 10 Samples 3 1 12 -1 8 -2 11 4 9 Samples This table lists the MM response for the 8 possible sample points, with one row corresponding to a sample point. The MM-responses in the top 4 rows are not potentially observable, as indicated by the shaded region. The four sample points illustrated in the bottom 4 rows of the on the bottom of the table are potentially observable. SPH&HS, UMASS Amherst

What are they (for Daisy)? Daisy (j=1) Rose (j=2) 3 2 1 12 10 -1 8 -2 11 4 9 Samples Notice that the MM-latent value changes for different sample points for the same subject. This is inconsistent with the definition of the latent value from the response error model. This is a reason for concern over interpretation when predicting the MM-latent value, since such latent values are not the same as the latent values under the response error model. The inconsistency in interpretation of the MM-latent values is a direct result of allowing the subject effects to be random. SPH&HS, UMASS Amherst

What are they (for Rose)? Daisy (j=1) Rose (j=2) 3 2 1 12 10 -1 8 -2 11 4 9 Samples The same interpretation conflict occurs for the latent value of Rose. We use the term “MM-latent value” to refer to the set of latent values that have positive probability under the MM. These are distinguished from the “Latent Values” that correspond to the latent value for the subject. SPH&HS, UMASS Amherst

BLUPs of the MM-Latent Value The solution to the MM equations results in the best linear unbiased predictor (BLUP). The mean, mu-hat, is a weighted least squares mean. The shrinkage constant kj, depends on the response error variance for the subject. The shrinkage constant also depends on the variance of the random effect. We have defined this variance in terms of the two latent values in the set. It could be defined in a different way, as for example, the variance of latent values in a larger finite population, or as the variance of a distribution of latent values (assumed infinite). Notice that the expression for the variance treats the latent value as a random variable, even for the same subject. We refer to these changing latent values as MM-latent values. Also notice that the expression for the variance depends on the subject, j. SPH&HS, UMASS Amherst

MSE of BLUPs for MM-Latent Values Daisy (j=1) Rose (j=2) MSE 3.1 2 0.81 11.5 10 2.19 1.2 1.15 11.4 1.86 0.71 7.7 5.24 1.1 1.28 7.6 5.79 10.9 4.4 8.9 4.3 10.8 0.64 0.52 Samples Each row in this table gives the BLUP for a subject. Notice that each row corresponds to a point in the Mixed Model sample space. The MM-latent value changes for different rows for the same subject. Finally, the term in the column for MSE corresponds to the square of the difference between the BLUP and the MM-latent value. For Daisy, the response error variance is equal to 1. Using the BLUP, the average MSE is 0.986. Since this is less 1, the MM BLUP is usually said to be a more efficient predictor of the realized latent value than the observed response (which is the subject mean response, since there is only one measure per subject). Ave=0.986 Ave=3.768 SPH&HS, UMASS Amherst

MSE of BLUPs for True-Latent Values Daisy (j=1) Rose (j=2) MSE 3.1 10 47.2 11.5 2 89.8 1.2 78.2 11.4 87.6 48.0 7.7 32.6 1.1 79.2 7.6 31.3 10.9 1.28 4.4 5.79 8.9 0.71 4.3 5.24 10.8 1.15 0.64 1.86 0.81 0.52 2.19 Samples If we use the ‘true’ latent value for Daisy to form the differences in the mixed model, we obtain average MSE values that are much larger. This illustrates the importance of distinguishing between the MM-latent values, and the ‘true’ latent value for a subject. It is the MM-latent value that is referred to in a mixed model, and its assessments. A problem is that there are several MM-latent values for a subject, contradicting the notion that a subject’s latent value is constant. Ave=32.06 Ave=32.06 SPH&HS, UMASS Amherst

MSE of BLUPs |P=y Daisy (j=1) Rose (j=2) MSE 10.90 10 0.81 4.41 2 5.79 8.93 1.15 4.29 5.24 10.84 0.71 0.64 1.86 8.87 1.28 0.52 2.19 Samples Ave=0.986 Ave=3.768 Although the MM BLUP is developed over a MM sample space that includes points that can not occur, the sample space also includes all points that are potentially observable for the set of subjects. We can evaluate the MSE of the MM BLUP over this sample space (i.e. conditional on the true latent values). In this example (with n=2), we obtain the same numeric values for the average MSE. SPH&HS, UMASS Amherst

Finite Population Mixed Model (FPMM) A finite population mixed model is developed directly from random variables underlying the two-stage sampling. The first stage is selection of subjects in the sample. The second stage is measuring each of the selected subjects. SPH&HS, UMASS Amherst

Response Error Model Latent Value Notice that the parameter corresponding to mu has a different definition here than it did in the mixed model. Here, mu is the average latent value of the subjects in the population. Also notice that we specify the response error variance of Lily. As we proceed, the FPMM-BLUP will depend in part on this response error variance. It was not necessary to specify this variance for the MM development. In the mixed model, there are no additional subjects beyond the subjects in the set. The MM mean was defined as the average latent value of the subjects in the set. SPH&HS, UMASS Amherst

Finite Population Mixed Model (FPMM) This illustrates the two-stage procedure that leads to a response on each sample subject. Notice that the subscript, i=1, refers to an order of selection, not to a particular subject. As a result, this representation is for sample sequences. For different samples, different subjects may occupy the first position in the sample sequence. In the two-stage sampling approach, there is never a mis-match between the subject and the subject’s latent value. SPH&HS, UMASS Amherst

Finite Population Mixed Model (FPMM) Notice that there is no confusion over which subject is ‘realized’ in the finite population mixed model. SPH&HS, UMASS Amherst

Finite Population Mixed Model (FPMM) SPH&HS, UMASS Amherst

Finite Population Mixed Model (FPMM) SPH&HS, UMASS Amherst

FPMM- Sample Space … Each block represents a sample point in the sample space. 9 4 11 4 9 11 SPH&HS, UMASS Amherst

FPMM- Sample Space … Here are four more sample points. Notice that these are considered to be different points in the FPMM than the previous sample points, since this sample sequence is different from the previous sample sequence. Notice that sample sequences are used as opposed to sample sets. 11 11 4 4 9 9 SPH&HS, UMASS Amherst

FPMM- Sample Space … These are additional sample points. 4 4 -7 13 13 -7 SPH&HS, UMASS Amherst

FPMM- Sample Space … 13 13 4 -7 -7 4 Here are some more sample points. 4 -7 -7 4 SPH&HS, UMASS Amherst

FPMM- Sample Space … And some more sample points. 11 11 -7 13 9 9 13 -7 SPH&HS, UMASS Amherst

FPMM- Sample Space … Here are the final set of sample points. 13 13 9 11 -7 -7 11 9 SPH&HS, UMASS Amherst

All sample points are Potentially Observable FPMM- Sample Space All sample points are Potentially Observable This is a summary illustration of all the sample points in a finite population mixed model. The bottom row in this diagram illustrates the possible sample sets, and the sample points for each set. Additional rows illustrate permutations of the subjects in the sets. Since a sample is a sample sequence, the sample space consists of all points for each possible sample sequence. -7 11 13 9 11 4 9 -7 4 13 9 4 11 11 13 9 -7 13 4 -7 SPH&HS, UMASS Amherst

FPMM- BLUPs of Realized Latent Values This illustrates the two-stage process that is assumed to occur when generating the response for a sample. The idea of a latent value in a FPMM does not identify a subject. SPH&HS, UMASS Amherst

FPMM- BLUPs of Realized Latent Values In the FPMM, by a ‘realized subject’, we mean that we know who was selected. In this example, Daisy was selected first. SPH&HS, UMASS Amherst

FPMM- BLUPs of Realized Latent Values The variance of the random effects corresponds to the variance of the latent values in the population. This definition differs from that used for the example with the MM. Since all subjects can be selected at any position in the sample, the response error is the average response error over all subjects in the population. It is possible to make similar comparisons between the FPMM and MM BLUPs using the same definitions for the mean of the latent values, and the variance of the random effects. To do so, we begin with a set of subjects. The MM is developed in the same manner as the earlier development. The FPMM is developed assuming that the sample size is n=N (where all subjects are selected in the sample). In the FPMM, sample points correspond to sequences of subjects. SPH&HS, UMASS Amherst

FPMM- BLUPs of Realized Latent Values The MSE for the FPMM- BLUP is typically defined over all possible sample points. Here, we limit the MSE to a sample sequence. Sample Sequence SPH&HS, UMASS Amherst

Comparison of MM-BLUP and FPMM-BLUP Target Random Variable MM-Latent Value Latent Value This illustrates the different possible latent values that are conceptualized in the mixed model and the finite population mixed model. Supposing response on two individuals (Daisy and Rose) is obtained. Then the latent value for these two individuals occurs in the set of MM-latent values and the set of FPMM latent values. SPH&HS, UMASS Amherst

Comparison of MM-BLUP and FPMM-BLUP Predictor Notice the similarity of the predictors. The WLS mean is used in the MM as opposed to the simple sample mean in the FPMM. SPH&HS, UMASS Amherst

Comparison of FPMM-BLUP and MM-BLUP-Sample Space Artificial 1 12 3 8 -7 11 13 9 9 4 11 -7 4 13 11 13 9 -7 11 4 9 13 4 -7 SPH&HS, UMASS Amherst

To Compare, Focus on …THIS Sample Space 1 12 3 8 From the standpoint of sample sequences in a FPMM, a single sample sequence is selected. However, the subjects in the sequence form a set, and the same set will occur for permutations of the sample sequence. For this reason, we can compare the sample set in the mixed model with sample sequences representing permutations of the sample set in the FPMM. -7 11 13 9 9 4 11 -7 4 13 11 13 9 -7 11 4 9 13 4 -7 SPH&HS, UMASS Amherst

Bigger Sample (n=3) Population (N=4) Here we consider a slightly larger population where we have added another subject, Violet. SPH&HS, UMASS Amherst

n=3, What is Lily’s Latent value? Use n=3 subject effects for MM 1 possible sample set 11 4 13 3 9 -7 11 4 13 -7 9 SPH&HS, UMASS Amherst 60

n=3, What is Lily’s Latent value? 8 sample points 11 13 -7 SPH&HS, UMASS Amherst 61

n=3, What is Lily’s Latent value? 8x(6 permutations)=48 sample points SPH&HS, UMASS Amherst 62

n=3, What is Lily’s Latent value? Combinations SPH&HS, UMASS Amherst 63

n=3, What is Lily’s Latent value? 192 Sample Points SPH&HS, UMASS Amherst 64

Select one sequence SPH&HS, UMASS Amherst 65

Select one sequence, Observe Sample Point SPH&HS, UMASS Amherst 66

FPMM-Average MSE of Predictor over Permutations SPH&HS, UMASS Amherst 67

X Ave MSE 5.0 4.6 16.2 FPMM MM 11 13 11 11 11 -7 SPH&HS, UMASS Amherst 68

Ave MSE X 29.4 MM 11 13 -7 17.7 11 11 11 34.3 FPMM SPH&HS, UMASS Amherst 69

Conclusions Design Based Population FPMM-BLUP Sample Space 11 13 11 11 -7 11 11 11 SPH&HS, UMASS Amherst 70

Evaluate Performance Conditional on the Sample Conclusions Population Evaluate Performance Conditional on the Sample Design Based FPMM-BLUP 11 13 -7 11 11 11 SPH&HS, UMASS Amherst 71

Conclusions Model Based Conceptual “Priors” MM-BLUP 11 13 -7 SPH&HS, UMASS Amherst 72

Evaluate Performance Conditional on the Sample Conclusions Evaluate Performance Conditional on the Sample Model Based MM-BLUP 11 13 -7 SPH&HS, UMASS Amherst 73

Evaluate Performance Conditional on the Sample Conclusions Evaluate Performance Conditional on the Sample MSE for BLUPs not evaluated Correctly Extends to WLS estimate of mean MM-BLUP not always best Sample Design is relevant only for derivation- not performance Prior Distributions relevant only for derivation- not performance 11 13 -7 SPH&HS, UMASS Amherst 74

Thanks to Gary For keeping the focus on “reality” SPH&HS, UMASS Amherst 75