Term 4, 2006, BIO656--Multilevel Models


1 PROJECTS ARE DUE by midnight, Friday, May 19th. Electronic submission only, to tlouis@jhsph.edu. Please name the file: [myname]-project.[filetype] or [name1_name2]-project.[filetype]

2 Efficiency-Robustness Trade-offs
First, we consider alternatives to the Gaussian distribution for random effects.
Then, we move to issues of weighting, starting with some formalism.
Then, we move to an example of informative sample size.
Finally, we give a basic example of choosing among weighting schemes that has broad implications.

3 Alternatives to the Gaussian Distribution for Random Effects

4 The t-distribution
Broader tails than the Gaussian.
So, it shrinks less for deviant Y-values.
The t-prior allows “outlying” parameters, and so a deviant Y is not so indicative of a large, level 1 residual.
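The "shrinks less for deviant Y-values" behavior can be checked numerically. The sketch below is an illustration under stated assumptions, not the lecture's own computation: it assumes Y | θ ~ N(θ, 1), a prior centered at 0, and compares a N(0, 1) prior with a t prior on 3 df and unit scale. The helper names (`posterior_mean`, `data_weight`) are hypothetical. "Data weight" here means the fraction of Y retained by the posterior mean.

```python
import math

def normal_pdf(x, mean=0.0, sd=1.0):
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def t_pdf(x, df=3.0, scale=1.0):
    # Student-t density with `df` degrees of freedom and a scale parameter.
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + (x / scale) ** 2 / df) ** (-(df + 1) / 2) / scale

def posterior_mean(y, prior_pdf, lo=-40.0, hi=40.0, steps=8000):
    # Riemann-sum approximation of E[theta | y] when y | theta ~ N(theta, 1).
    h = (hi - lo) / steps
    num = den = 0.0
    for i in range(steps + 1):
        theta = lo + i * h
        w = normal_pdf(y, mean=theta) * prior_pdf(theta)
        num += theta * w
        den += w
    return num / den

def data_weight(y, prior_pdf):
    # Fraction of y kept by the posterior mean (the prior center is 0).
    return posterior_mean(y, prior_pdf) / y

for y in (1.0, 3.0, 6.0):
    g = data_weight(y, normal_pdf)  # Gaussian prior: the same weight for every y
    t = data_weight(y, t_pdf)       # t prior: the weight depends on y
    print(f"y={y:4.1f}  Gaussian prior: {g:.3f}   t_3 prior: {t:.3f}")
```

Under the Gaussian prior the data weight is constant in Y, while under the t prior it grows as Y moves away from the center, which is the reduced shrinkage described above.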

5 Creating a t-distribution
Assume a Gaussian sampling distribution.
Using the sample standard deviation produces the t-distribution.
Z is t with a large df.
$t_3$ is the most different from Z among t-distributions (with integer df) that have a finite variance.
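A small simulation can illustrate how standardizing with the sample standard deviation produces heavier tails. This is a sketch with assumed settings (standard Gaussian data, a cutoff of 2, a fixed seed); the function names are hypothetical.

```python
import math
import random

def t_statistic(sample):
    # Standardize the sample mean with the *sample* standard deviation.
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return mean / (sd / math.sqrt(n))

def z_statistic(sample):
    # Standardize the sample mean with the *known* standard deviation (1).
    n = len(sample)
    return (sum(sample) / n) * math.sqrt(n)

def tail_fraction(stat, n, reps=20000, cutoff=2.0, seed=1):
    # Share of simulated statistics that exceed the cutoff in absolute value.
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(stat(sample)) > cutoff:
            hits += 1
    return hits / reps

frac_t3 = tail_fraction(t_statistic, n=4)    # n - 1 = 3 df
frac_t30 = tail_fraction(t_statistic, n=31)  # 30 df, close to Z
frac_z = tail_fraction(z_statistic, n=4)
print(frac_t3, frac_t30, frac_z)
```

With only 4 observations (3 df) the statistic lands beyond the cutoff far more often than Z does; with 31 observations the two are close.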


7 With a t-prior, B is B(Y), increasing with $|Y - \mu|$

8 Z is distance from the center; (1 - B) = ½ = 0.50

9 Z is distance from the center; (1 - B) = 2/3 ≈ 0.667

10 Estimated Gaussian & Fully Non-parametric priors for the USRDS data

11 USRDS estimated Priors


17 Informative Sample Size (Similar to informative Censoring). See Louis et al. SMMR 2006


25 Choosing among weighting schemes: “Optimality” versus goal achievement

26 Inferential Context
Question: What is the average length of in-hospital stay?
A more specific question: What is the average length of stay for:
– Several hospitals of interest?
– Maryland hospitals?
– All hospitals?
– .......

27 “Data” Collection & Goal
Data are gathered from 5 hospitals.
Hospitals are selected by some method.
In each hospital, $n_{hosp}$ patient records are sampled at random.
Length of stay (LOS) is recorded.
Goal: estimate the “population” mean.

28 Procedure
Compute hospital-specific means.
“Average” them.
– For simplicity, assume that the population variance is known and the same for all hospitals.
How should we compute the average?
We need a goal and then a good/best way to combine information.

29 “DATA”

| Hospital | # sampled $n_{hosp}$ | Hospital size | % of total size: $100\pi_{hosp}$ | Mean LOS | Within-hospital variance |
|---|---|---|---|---|---|
| 1 | 30 | 100 | 10 | 25 | $\sigma^2/30$ |
| 2 | 60 | 150 | 15 | 35 | $\sigma^2/60$ |
| 3 | 15 | 200 | 20 | 15 | $\sigma^2/15$ |
| 4 | 30 | 250 | 25 | 40 | $\sigma^2/30$ |
| 5 | 15 | 300 | 30 | 10 | $\sigma^2/15$ |
| Total | 150 | 1000 | 100 | ? | ? |

30 Weighted averages & Variances (variances are based on fixed effects (FE), not random effects (RE))

| Weighting approach | Weights x100 | Mean | Variance ratio 100*(Var/min) |
|---|---|---|---|
| Equal | 20 20 20 20 20 | 25.0 | 130 |
| Proportional to reciprocal variance | 20 40 10 20 10 | 29.5 | 100 |
| Population $\pi_{hosp}$ | 10 15 20 25 30 | 23.8 | 172 |

Each weighted average is mean = $\sum_k w_k \bar{Y}_k$, where $\bar{Y}_k$ is the mean LOS for hospital $k$.
Reciprocal variance weights minimize variance.
Is that our goal?
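The means and variance ratios on this slide can be reproduced from the hospital data. The sketch below assumes, as the slides do, that the variance of each hospital mean is σ²/n for that hospital, so σ² cancels in the ratio. Variable and function names are illustrative.

```python
n = [30, 60, 15, 30, 15]          # records sampled per hospital
size = [100, 150, 200, 250, 300]  # hospital sizes
mean_los = [25, 35, 15, 40, 10]   # hospital-specific mean LOS

def normalize(raw):
    # Rescale raw weights so that they sum to 1.
    total = sum(raw)
    return [r / total for r in raw]

def weighted_mean(w, means):
    return sum(wk * mk for wk, mk in zip(w, means))

def variance_over_sigma2(w, n):
    # Var(sum_k w_k * Ybar_k) / sigma^2 when Var(Ybar_k) = sigma^2 / n_k.
    return sum(wk ** 2 / nk for wk, nk in zip(w, n))

schemes = {
    "equal": normalize([1] * 5),
    "reciprocal variance": normalize(n),  # 1 / (sigma^2 / n_k) is proportional to n_k
    "population": normalize(size),
}

min_var = min(variance_over_sigma2(w, n) for w in schemes.values())
results = {
    name: (weighted_mean(w, mean_los), 100 * variance_over_sigma2(w, n) / min_var)
    for name, w in schemes.items()
}
for name, (m, ratio) in results.items():
    print(f"{name:20s} mean={m:6.2f}  variance ratio={ratio:6.1f}")
```

The population-weighted mean comes out as 23.75, which the slide reports rounded to 23.8.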

31 There are many weighting choices and weighting goals
Minimize variance by using reciprocal variance weights.
Minimize bias for the population mean by using population weights (“survey weights”).
Use policy weights (e.g., equal weighting).
Use “my weights,” ...

32 General Setting
When the model is correct:
– All weighting schemes estimate the same quantities (e.g., the same value for slopes in a multiple regression).
– So, it is clearly best to minimize variance by using reciprocal variance weights.
When the model is incorrect:
– We must consider analysis goals and use appropriate weights.
Of course, it is generally true that our model is not correct!

33 Weights and their properties
If $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu$, then all weighted averages estimate the population mean: $\mu = \sum_k \pi_k \mu_k$.
– So, it’s best to minimize the variance.
But, if the hospital-specific $\mu_k$ are not all equal, then:
– Each set of weights estimates a different target.
– Minimizing variance might not be “best.”
– For an unbiased estimate of the population mean $\sum_k \pi_k \mu_k$, set $w_k = \pi_k$.
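The claim that each set of weights estimates a different target follows from a one-line expectation. The derivation below is a sketch that assumes each hospital mean $\bar{Y}_k$ is unbiased for its own $\mu_k$ and that the weights sum to 1; $\bar{Y}_k$ is notation introduced here for the hospital-specific sample mean.

```latex
\begin{aligned}
E\Big[\sum_k w_k \bar{Y}_k\Big] &= \sum_k w_k \mu_k, \qquad \sum_k w_k = 1, \\
\mathrm{Bias}(w) &= \sum_k w_k \mu_k - \sum_k \pi_k \mu_k
                  = \sum_k (w_k - \pi_k)\,\mu_k .
\end{aligned}
```

If every $\mu_k$ equals a common $\mu$, the bias is $\mu \sum_k (w_k - \pi_k) = 0$ for any weights that sum to 1. Otherwise the bias is zero in general only when $w_k = \pi_k$.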

34 The variance-bias tradeoff
General idea: trade off variance & bias to produce low Mean Squared Error (MSE).
$\mathrm{MSE} = E(\text{Estimate} - \text{True})^2 = \text{Variance} + (\text{Bias})^2$
Bias is unknown unless we know the $\mu_k$ (the true hospital-specific mean LOS).
But, we can study MSE($\mu$, $w$, $\sigma$).
In practice, make some “guesses” and do sensitivity analyses.
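As one possible "guess" for a sensitivity analysis, the sketch below treats the observed hospital means as if they were the true means and picks an arbitrary within-hospital variance. Both choices (`mu_guess` and `sigma2 = 100`) are assumptions made for illustration only, and the function names are hypothetical.

```python
n = [30, 60, 15, 30, 15]              # records sampled per hospital
pi = [0.10, 0.15, 0.20, 0.25, 0.30]   # population shares
mu_guess = [25, 35, 15, 40, 10]       # ASSUMED true means (a guess)
sigma2 = 100.0                        # ASSUMED within-hospital variance

def bias(w, mu, pi):
    # Target is the population mean sum_k pi_k * mu_k.
    return sum((wk - pk) * mk for wk, pk, mk in zip(w, pi, mu))

def variance(w, n, sigma2):
    # Var(Ybar_k) = sigma2 / n_k, hospitals sampled independently.
    return sigma2 * sum(wk ** 2 / nk for wk, nk in zip(w, n))

def mse(w, mu, pi, n, sigma2):
    return variance(w, n, sigma2) + bias(w, mu, pi) ** 2

schemes = {
    "equal": [0.2] * 5,
    "reciprocal variance": [nk / sum(n) for nk in n],
    "population": pi,
}
for name, w in schemes.items():
    print(f"{name:20s} bias={bias(w, mu_guess, pi):6.2f}  "
          f"var={variance(w, n, sigma2):5.2f}  "
          f"MSE={mse(w, mu_guess, pi, n, sigma2):6.2f}")
```

Under this particular guess the minimum-variance weights have by far the largest MSE, because their bias for the population mean dominates. A different guess for the means or for the variance can change the ranking, which is the point of doing sensitivity analyses.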

35 Variance, Bias and MSE as a function of (the $\mu$s, $w$, $\sigma$)
Consider a true value for the variation of the between-hospital means ($\mu^*$ is the “overall mean”): $T = \sum_k (\mu_k - \mu^*)^2$
Study bias, variance, and MSE for weights that optimize MSE for an assumed value (A) of the between-hospital variance.
So, when A = T, MSE is minimized by this optimizer.
In the following plot, A is converted to a fraction of the total variance, A/(A + within-hospital):
– Fraction = 0 ⇒ minimize variance
– Fraction = 1 ⇒ minimize bias
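One way to formalize weights that "optimize MSE for an assumed value A" is sketched below. It is an assumption, not necessarily the optimizer used in the lecture: the hospital means are treated as random with common mean and variance A, so the expected squared error of a weighted average for the population mean is $\sum_k w_k^2 V_k + A \sum_k (w_k - \pi_k)^2$, with $V_k = \sigma^2/n_k$. Minimizing this subject to $\sum_k w_k = 1$ gives $w_k = (A\pi_k + \lambda)/(V_k + A)$. The function name `compromise_weights` and the value of `sigma2` are hypothetical.

```python
def compromise_weights(A, V, pi):
    # Minimize sum_k w_k^2 V_k + A * sum_k (w_k - pi_k)^2 subject to sum_k w_k = 1.
    # Stationarity gives w_k = (A * pi_k + lam) / (V_k + A); lam enforces the constraint.
    inv = [1.0 / (vk + A) for vk in V]
    shrunk = [A * pk / (vk + A) for pk, vk in zip(pi, V)]
    lam = (1.0 - sum(shrunk)) / sum(inv)
    return [s + lam * i for s, i in zip(shrunk, inv)]

n = [30, 60, 15, 30, 15]
pi = [0.10, 0.15, 0.20, 0.25, 0.30]
sigma2 = 100.0                      # ASSUMED within-hospital variance
V = [sigma2 / nk for nk in n]       # variance of each hospital mean

for A in (0.0, 1.0, 10.0, 1e9):
    w = compromise_weights(A, V, pi)
    print(f"A={A:>12.1f}  weights x100: " + " ".join(f"{100 * wk:5.1f}" for wk in w))
```

At A = 0 the result is the reciprocal variance weights (minimize variance). As A grows the weights approach the population shares (minimize bias), and intermediate values of A give compromise weights between the two.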

36 The bias-variance trade-off
[Plot: X-axis is the assumed variance fraction; Y-axis is performance computed under the true fraction.]

37 Summary
Much of statistics depends on weighted averages.
Weights should depend on assumptions and goals.
If you trust your (regression) model:
– Then, minimize the variance, using “optimal” weights.
– This generalizes the equal $\mu$ case.
If you worry about model validity (bias for the population mean):
– You can buy full insurance by using population weights.
– But, you pay in variance (efficiency).
– So, consider purchasing only the insurance you need by using compromise weights.

