Statistics and Art: Sampling, Response Error, Mixed Models, Missing Data, and Inference Ed Stanek And others: Recai Yucel, Julio Singer, and others on.


1 Statistics and Art: Sampling, Response Error, Mixed Models, Missing Data, and Inference
Ed Stanek And others: Recai Yucel, Julio Singer, and others on the Cluster Team 11/11/2018

2 Anne Stanek, Viviana Lencina, Alice Singer, Silvia San Martino, Wenjun Li, Luz Mery Gonzalas, Julio Singer, Ed Stanek, Maria Lucia Singer

3 Outline
Example: Dose-response Models in Toxicology - Threshold vs Hormetic Models
What is truth? Predict what?
Subsets - sampling
Prediction
Results on Predictor of Realized Subject True Value
Illustration and Dilemma
Extension to two-stage problems
Missing data framework
Conclusions
And others: Recai Yucel, Bo Xu, Ruitao Zhang, and others on the Cluster Team

4 1. Example: Dose-response Models - Threshold vs Hormetic Models
Yeast data: chemicals, 13 yeast strains, 5 doses x 2 replications. Focus on doses below the BMD.
These plots are of hypothetical 'true' responses. Response is represented as percent of control; 100% is the response when dose = 0.
Question: Is there evidence of hormesis?
The point where the true response drops below 100% is the zero effect point. In practice, a 'benchmark dose' (BMD) is estimated as a dose where the observed response drops below 95%.

5 i = chemical, j = dose, k = replication
A mixed model is fit to the response for doses in the hormetic range.
Only 5 doses are available. Identify BMD(5), the benchmark dose at 5% (the dose above which the response is less than (100-5)% = 95%), and the doses below the BMD.
When 3 doses fall below the BMD, predict the average response over the below-BMD range.
Results: order the predicted responses for the realized chemicals from low to high, under equal response error and under unequal response error.
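The BMD(5) rule described on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function names, the dose values, and the responses are all hypothetical, and the plain average below stands in for the mixed-model prediction used in the presentation.

```python
def benchmark_dose(doses, responses, threshold=95.0):
    """Lowest dose whose observed response (percent of control) is below threshold."""
    for dose, resp in sorted(zip(doses, responses)):
        if resp < threshold:
            return dose
    return None  # response never dropped below the threshold


def mean_response_below(doses, responses, bmd):
    """Plain average of the observed responses at doses strictly below the BMD."""
    below = [r for d, r in zip(doses, responses) if bmd is None or d < bmd]
    return sum(below) / len(below) if below else None


# Hypothetical chemical: 5 doses, responses as percent of control.
doses = [1.0, 2.0, 4.0, 8.0, 16.0]
responses = [104.0, 108.0, 99.0, 91.0, 60.0]

bmd = benchmark_dose(doses, responses)
avg_below = mean_response_below(doses, responses, bmd)
print(bmd, avg_below)
```

With these made-up numbers, three doses fall below the benchmark dose, which is the case the slide singles out for prediction.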

6 Plot of predicted response for the strain 'wild type yeast' for 253 chemicals with 3 doses below a benchmark dose of 95%, using pooled (equal) response error based on a mixed model. The black line is the expected distribution of mean response if the threshold model held.

7 Similar plot with unequal response error.
This was constructed by fitting mixed models to each chemical, and estimating the response variance.
Which results should be used? Does it depend on whether the model has heterogeneous response error? No: theoretically, a derivation with heterogeneous response error pools the response error variances. However, in a simple example, we can show that better results occur if response error is separated. The theory doesn't match; we don't understand the theory for the 'better results'.
Next steps: Review what we do understand. Keep the context simple.

8 2. What is truth? Predict what?
Population, subjects, true response
Subject Labels:
True Response:
Population Parameters: Mean: Variance: Subject Deviation:
Subjects == chemicals
True Response == Average response in hormetic range
Need to define parameters to represent the problem
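The formulas on this slide did not survive in the transcript. One conventional way to write the parameters it names is given here as an assumption: the labels s = 1, ..., N, the symbol y_s for a subject's true response, and the divisor N - 1 in the variance (some authors use N) are choices made for illustration, not recovered from the slide.

```latex
\mu = \frac{1}{N}\sum_{s=1}^{N} y_s, \qquad
\sigma^2 = \frac{1}{N-1}\sum_{s=1}^{N} \left(y_s - \mu\right)^2, \qquad
\beta_s = y_s - \mu .
```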

9 Non-Stochastic Model, Index for Response, Response Error
Non-Stochastic Model: Index for response: Response error: Assume:
Response Error Model: For each subject:
Response process: in the hormetic range, pick a dose at random and measure the response.
Assumptions: response error is unbiased and heteroskedastic.
The response error model is a stochastic model. Response error is a random effect.
The sum of subject effects is zero (over the population).
Information: (subject label, response)
Subsequently, take r=1 (one measure per subject).
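The equations for this slide are missing from the transcript. A sketch consistent with the stated assumptions (unbiased, heteroskedastic response error; subject effects summing to zero) might read as follows. All symbols are assumed for illustration: y_s is the true response of subject s, mu is the population mean of the y_s, k = 1, ..., r indexes repeated responses, and W_sk is the response error.

```latex
y_s = \mu + \beta_s, \qquad \sum_{s=1}^{N} \beta_s = 0, \\
Y_{sk} = y_s + W_{sk}, \qquad E\left(W_{sk}\right) = 0, \qquad
\operatorname{var}\left(W_{sk}\right) = \sigma_s^2 .
```

The subscript on the error variance is what makes the model heteroskedastic: each subject may have its own response error variance.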

10 3. Subsets, Sampling
Select n of N subjects (a subset, “sample”).
Let all subsets be equally likely:
Sample Mean:
Note difference with:
Sample is a set (un-ordered) of different subjects.
Usually represent Common

11 Sample as a Sequence (part of Permutation)
Represent positions in a permutation:
Assume all permutations equally likely:
Define: Sample = positions
Sample Mean:
The random variable Y(ik) is not clearly defined.
Sample is now a sequence (order matters)!
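The two views of a sample (an un-ordered subset versus the first n positions of a permutation) can be compared by brute-force enumeration. The sketch below uses N=3 and n=2 with hypothetical true values; the variable names are illustrative, and exact fractions are used so the comparison involves no rounding.

```python
from fractions import Fraction
from itertools import combinations, permutations

# Hypothetical true values, keyed by subject label s.
y = {1: Fraction(10), 2: Fraction(20), 3: Fraction(60)}
N, n = len(y), 2
mu = sum(y.values()) / N  # population mean

# Sample as a set: every subset of size n is equally likely.
subsets = list(combinations(y, n))
set_means = [sum(y[s] for s in sub) / n for sub in subsets]
expected_set_mean = sum(set_means) / len(set_means)

# Sample as a sequence: the first n positions of an equally likely permutation.
perms = list(permutations(y))
seq_means = [sum(y[s] for s in p[:n]) / n for p in perms]
expected_seq_mean = sum(seq_means) / len(seq_means)

print(len(subsets), len(perms), mu, expected_set_mean, expected_seq_mean)
```

Both representations give a sample mean whose expected value is the population mean; what changes is that the sequence view attaches a random variable to each position.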

12 Population
Population of N=3 subjects, shown left to right: s=2 Ed, s=3 Wenjun, s=1 Julio. The sample is the first two subjects on the left.

13 Position in Permutation
Population of N=3 subjects. Note labels and positions: i=1: s=2, i=2: s=3, i=3: s=1.

14 Position in Permutation
Different permutation: Ed, Julio, and Wenjun. i=1: s=2, i=2: s=1, i=3: s=3. (A second arrangement, s=1, s=3, s=2, also appears on the slide.)

15 Position in Permutation
Different permutation: Ed, Wenjun, Julio. i=1: s=3, i=2: s=1, i=3: s=2.

16 Position in Permutation
Different permutation: Julio, Wenjun, Ed. i=1: s=3, i=2: s=2, i=3: s=1.

17 Position in Permutation: Sample | Remainder
Different permutation with sample and remainder: Wenjun, Julio, and Ed. i=1: s=1, i=2: s=2, i=3: s=3.

18 Position in Permutation: Sample | Remainder
Wenjun, Ed, and Julio (using sample and remainder). i=1: s=2, i=2: s=1, i=3: s=3.

19 Population size (N) is most likely > 3
We only see “n” subjects in the sample.
For example: Suppose n=3, and N=7.
We may see …

20 Position in Permutation: Sample | Remainder
Luzmery, Wenjun, and Viviana in sample. Sample: i=1, i=2, i=3 with s=3, s=4, s=5. Remainder: i=4, i=…

21 Position in Permutation: Sample | Remainder
Viviana, Ed, Silvina in a sample. Sample: i=1, i=2, i=3 with s=2, s=4, s=7. Remainder: i=…

22 Traditional Sampling Approach
Subject labels: 1, 2, ..., N
Horvitz-Thompson Estimator:
First-order inclusion probabilities = Prob(subject included in a sample)
Bold y is a vector of population values.
Missing Data
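The estimator's formula is not in the transcript, so the sketch below uses the standard textbook form of the Horvitz-Thompson estimator of a population total: each sampled value is divided by its first-order inclusion probability. The population values are hypothetical, and simple random sampling without replacement (inclusion probability n/N for every subject) is assumed here.

```python
from fractions import Fraction
from itertools import combinations


def horvitz_thompson_total(sample_values, inclusion_probs):
    """Estimate the population total by weighting each value by 1 / pi_s."""
    return sum(v / p for v, p in zip(sample_values, inclusion_probs))


# Hypothetical population values y_1, ..., y_N.
y = [Fraction(10), Fraction(20), Fraction(60)]
N, n = len(y), 2
pi = Fraction(n, N)  # inclusion probability under simple random sampling

# Enumerate every equally likely sample and average the estimates.
estimates = [
    horvitz_thompson_total([y[s] for s in sub], [pi] * n)
    for sub in combinations(range(N), n)
]
expected_total = sum(estimates) / len(estimates)
print(estimates, expected_total)
```

Averaging over all samples recovers the true total, which is the design-unbiasedness property of the estimator. Unsampled subjects never appear explicitly, which is why the slide pairs this approach with missing data.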

23 With Response Error Model
Sample Mean
Sample is a set
Sample is a sequence
To represent positions: U(is) is an indicator variable that has a value of 1 if subject s is in position i.

24 Position in Permutation: i=1: s=1, i=2: s=2, i=3: s=3 (Sample)

25 First Position in Permutation
Suppose s=1,…,3=N
First position in permutation: Then:
Formal expression of response for position i=1 in a permutation
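The indicator representation can be made concrete with a small sketch. Here U[i][s] is 1 when subject s+1 occupies position i+1 (the code is zero-indexed), and the response in a position is the indicator-weighted sum of the subject values. The values in y and the helper name are hypothetical; the permutation (2, 3, 1) is the one shown on slide 13.

```python
def indicator_matrix(perm, N):
    """U[i][s] = 1 if subject s+1 is in position i+1 of the permutation, else 0."""
    return [[1 if perm[i] == s + 1 else 0 for s in range(N)] for i in range(N)]


N = 3
y = [10, 20, 60]   # hypothetical values y_s for s = 1, 2, 3
perm = (2, 3, 1)   # subject 2 in position 1, subject 3 in position 2, subject 1 in position 3

U = indicator_matrix(perm, N)

# Response attached to each position: Y_i = sum over s of U_is * y_s.
Y = [sum(U[i][s] * y[s] for s in range(N)) for i in range(N)]
print(U, Y)
```

For the first position this reduces to Y_1 = U_11 y_1 + U_12 y_2 + U_13 y_3, where exactly one indicator equals 1 in any realized permutation.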

26 Positions in Sample Sequences
Sample and Remainder representation

27 Basic Random Variables
Sample, Remainder, Population

28 Finite Population Mixed Model
Response Error Model
Finite Population Mixed Model
Combine the response error model with the permutation to get a mixed model.

29 Mixed Model
Alpha = fixed effects
B = random effects
W* = response error
Note that the subscript is POSITION, not SUBJECT.
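The model equations themselves are not in the transcript. A scalar sketch of how the combination could be written is given below, with every symbol an assumption: U_is is 1 if subject s is in position i and 0 otherwise, beta_s = y_s - mu is the deviation of subject s from the population mean mu, and W_s is that subject's response error (taking one measure per subject).

```latex
Y_i^{*} = \sum_{s=1}^{N} U_{is}\left(y_s + W_s\right) = \mu + B_i + W_i^{*}, \qquad
B_i = \sum_{s=1}^{N} U_{is}\,\beta_s, \qquad
W_i^{*} = \sum_{s=1}^{N} U_{is}\,W_s .
```

In this form the fixed effect is the single mean mu, and both B_i and W_i^* are random because the subject occupying position i is random.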

30 Properties of Basic Random Variables (N=3)
Table margins: Sum, Expected Value, Average

31 Sample Random Variables (n=2)
Table margins: Sum, Expected Value
Sum over rows: get the usual random variable, with expected value mu.
Sum over columns: get random variables with different expected values.
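The row and column statements can be checked by enumerating all permutations for N=3 and n=2. The sketch assumes the basic random variables are the products U_is * y_s, where U_is indicates that subject s is in position i; the values of y and the variable names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

y = [Fraction(10), Fraction(20), Fraction(60)]  # hypothetical subject values
N, n = len(y), 2
mu = sum(y) / N
perms = list(permutations(range(N)))  # p[i] = index of the subject in position i

# Sum over subjects within a sample position: Y_i = sum_s U_is * y_s.
expected_by_position = [
    sum(y[p[i]] for p in perms) / len(perms) for i in range(n)
]

# Sum over sample positions for a subject: y_s if s is sampled, else 0.
expected_by_subject = [
    sum(y[s] for p in perms if s in p[:n]) / len(perms) for s in range(N)
]

print(expected_by_position, expected_by_subject)
```

Each position has expected value mu, while each subject's sum has expected value (n/N) times its own value, so the subject sums have different expectations.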

32 Prediction of Mean in a Simple Case: No Response Error (N=3, n=2)
Sample, Remainder
Note: Need to predict a function of the remainder.
Criteria:
Linear function of the sample
Unbiased
Smallest mean squared error
Called the Best Linear Unbiased Predictor (note that we use the term “Predictor” here for a parameter, not a random variable).

33 Prediction of Mean: No Response Error (N=3, n=2)
Target Sample Data Realized
We predict the un-observed values in the population.
Best Linear Unbiased Predictor:
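The predictor's formula is not in the transcript. Under simple random sampling with no response error, a familiar result is that the observed sample total is kept as is and each unobserved value is predicted by the sample mean, which makes the predictor of the population mean equal to the sample mean. The sketch below assumes that form and checks unbiasedness and mean squared error by enumeration; the data and function name are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def predict_population_mean(sample, N):
    """Observed sample total plus the predicted remainder total, divided by N."""
    n = len(sample)
    sample_mean = sum(sample) / n
    predicted_remainder_total = (N - n) * sample_mean
    return (sum(sample) + predicted_remainder_total) / N


y = [Fraction(10), Fraction(20), Fraction(60)]  # hypothetical true values
N, n = len(y), 2
mu = sum(y) / N

samples = [list(s) for s in combinations(y, n)]
predictions = [predict_population_mean(s, N) for s in samples]

bias = sum(predictions) / len(predictions) - mu
mse = sum((p - mu) ** 2 for p in predictions) / len(predictions)
print(predictions, bias, mse)
```

The enumerated mean squared error agrees with the usual expression (1 - n/N) * sigma^2 / n when sigma^2 uses the divisor N - 1.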

34 Prediction of a Subject’s Mean in Position i with No Resp. Error (N=3, n=2)
Target Sample Data Realized
We predict the un-observed values in the population.
Best Linear Unbiased Predictor:

35 Prediction of a Subject’s Mean in Position i with Response Error
Target Sample Data Realized
We predict the un-observed values in the population.
Best Linear Unbiased Predictor:

36 Prediction of Realized Random Effect – Other Examples
SRS + subject response error
SRS + position response error
Cluster sampling: balanced
Cluster sampling: un-balanced (similar form, more complicated)
Return to the basic question: which predictor should be used?
Common response error: optimal via the theory.
Allowing K to depend on the realized subject: had smaller MSE.
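The two choices of K can be set side by side in code. The predictor formulas are not in the transcript, so this sketch assumes the familiar shrinkage form: sample mean plus k times the deviation of the observed response from the sample mean, with k = sigma^2 / (sigma^2 + response error variance). All numbers, names, and the exact form of k are assumptions for illustration; no claim about which version has smaller MSE is made here.

```python
def shrinkage_constant(sigma2, error_var):
    """k = between-subject variance / (between-subject + response error variance)."""
    return sigma2 / (sigma2 + error_var)


def predict_latent_values(Y, ks):
    """Shrink each observed response toward the sample mean by its constant k."""
    ybar = sum(Y) / len(Y)
    return [ybar + k * (yi - ybar) for k, yi in zip(ks, Y)]


# Hypothetical inputs: observed responses, variance components.
Y = [90.0, 100.0, 110.0]
sigma2 = 100.0                            # between-subject variance (assumed known)
pooled_error_var = 100.0                  # common response error variance
subject_error_vars = [25.0, 100.0, 300.0] # response error variance per realized subject

# Pooled: one k for every position.
k_pooled = shrinkage_constant(sigma2, pooled_error_var)
pred_pooled = predict_latent_values(Y, [k_pooled] * len(Y))

# Subject-specific: k depends on the realized subject's own error variance.
ks_subject = [shrinkage_constant(sigma2, v) for v in subject_error_vars]
pred_subject = predict_latent_values(Y, ks_subject)

print(pred_pooled, pred_subject)
```

A subject measured with little error is shrunk less under the subject-specific constants, which is the intuitive reason that version might do better even though the theory cited on the slide supports the pooled one.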

37 Plot of predicted response for the strain 'wild type yeast' for 253 chemicals with 3 doses below a benchmark dose of 95%, using pooled (equal) response error based on a mixed model.

38 Plot with unequal response error
Which results should be used? Does it depend on whether the model has heterogeneous response error? No: theoretically, a derivation with heterogeneous response error pools the response error variances. However, in a simple example, we can show that better results occur if response error is separated. The theory doesn't match; we don't understand the theory for the 'better results'.
Review what we do understand. Keep the context simple.

39 Dilemma
Pooled response error variance should be used for K (using theoretical results).
Empirical example illustrates smaller MSE results with K depending on the realized subject, but there is no theory!
What should we do? Is there a 'gap' in the framework?

40 Basic Sample Random Variables
Usual modelling approach (work with the right column): the properties of these random variables (exchangeable) give a natural lead-in to Bayesian inference.
Traditional sampling (and missing data) approach (work with the bottom row): don't use explicit notation for the sample; use inclusion probabilities. Some values are missing.
Super-population models: use the bottom row, but re-arrange elements so that those in the sample are first. Assume the random variables are exchangeable (as for the right column). This really doesn't make sense.

41 Basic Random Variables
Sample and Remainder
What is potentially observable? What is observed?

42 Thanks. More work is needed!
Anne Stanek, Viviana Lencina, Alice Singer, Silvia San Martino, Wenjun Li, Luz Mery Gonzalas, Julio Singer, Ed Stanek, Maria Lucia Singer

