Published by Σοφοκλής Δάβης. Modified over 6 years ago.
1
Statistics and Art: Sampling, Response Error, Mixed Models, Missing Data, and Inference
Ed Stanek, and others: Recai Yucel, Julio Singer, and others on the Cluster Team. 11/11/2018
2
Anne Stanek Viviana Lencina Alice Singer Silvia San Martino Wenjun Li
Luz Mery Gonzalas Julio Singer Ed Stanek Maria Lucia Singer
3
Outline
Example: Dose-response Models in Toxicology - Threshold vs Hormetic Models
What is truth?: Predict what?
Subsets - sampling
Prediction
Results on Predictor of Realized Subject True Value
Illustration and Dilemma
Extension to two-stage problems
Missing data framework
Conclusions
And others: Recai Yucel, Bo Xu, Ruitao Zhang, and others on the Cluster Team
4
1. Example: Dose-response Models - Threshold vs Hormetic Models
Yeast data: chemicals, 13 yeast strains, 5 doses x 2 replications. Focus on doses below the BMD (benchmark dose).
These plots are of hypothetical ‘true’ responses. Response is represented as percent of control: 100% is the response when the dose = 0.
Question: Is there evidence of hormesis?
The point where the true response drops below 100% is the zero-effect point. In practice, a ‘benchmark dose’ is estimated as a dose where the observed response drops below 95%.
5
i = chemical, j = dose, k = replication
A mixed model is fit to the response for doses in the hormetic range. Only 5 doses are available.
Identify BMD(5), the 5% benchmark dose (the dose above which the response is less than (100-5)% = 95%), and the doses below the BMD.
When there are 3 doses below the BMD, predict the average response over the below-BMD range.
Results: order the predicted responses for the realized chemicals from low to high, once with equal response error and once with unequal response error.
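The benchmark-dose step described here can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `below_bmd_summary`, the dose values, and the responses are hypothetical, the responses are taken to be already averaged over replications, and the rule "first dose whose observed response is below 95%" is one simple reading of the slide rather than the authors' actual procedure.

```python
from statistics import mean

def below_bmd_summary(doses, responses, bmr=5.0):
    """Locate a benchmark dose and average the response below it.

    doses: dose levels in increasing order (dose 0 excluded).
    responses: observed percent-of-control response at each dose.
    bmr: benchmark response in percent; 5 gives the 95% cut-off.
    Returns (bmd, doses_below, mean_below). bmd is None when the
    response never drops below the cut-off.
    """
    cutoff = 100.0 - bmr
    for idx, resp in enumerate(responses):
        if resp < cutoff:
            below = list(doses[:idx])
            avg = mean(responses[:idx]) if idx > 0 else None
            return doses[idx], below, avg
    return None, list(doses), mean(responses)

# Hypothetical percent-of-control responses for one chemical at 5 doses.
example_doses = [1, 2, 4, 8, 16]
example_resp = [104.0, 108.0, 100.0, 90.0, 60.0]
bmd, below, avg_below = below_bmd_summary(example_doses, example_resp)
```

In this made-up example three doses fall below the benchmark dose, which is the case the slide selects for prediction.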
6
Plot of predicted response for the strain ‘wild type yeast’ for 253 chemicals with 3 doses below a benchmark dose of 95%, using pooled (equal) response error based on a mixed model. The black line is the expected distribution of the mean response if the threshold model held.
7
Similar plot with unequal response error.
This was constructed by fitting mixed models to each chemical and estimating the response variance.
Which results should be used? Does it depend on whether the model has heterogeneous response error?
No: theoretically, a derivation with heterogeneous response error pools the response error variances. However, in a simple example, we can show that better results occur if response error is separated. The theory doesn’t match: we don’t understand the theory for the ‘better results’.
Next steps: Review what we do understand. Keep the context simple.
8
2. What is truth? Predict what?
Population, subjects, true response
Subject labels:
True response:
Population parameters
Mean:
Variance:
Subject deviation:
Subjects = chemicals
True response = average response in hormetic range
Need to define parameters to represent the problem
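The formulas on this slide did not survive extraction. A conventional way to write the parameters it names is sketched below. The symbols are assumptions rather than the slide's own notation: s labels subjects, y_s is the true response, and the N-1 divisor in the variance is one common finite-population convention (a divisor of N is the other).

```latex
s = 1, \dots, N, \qquad
\mu = \frac{1}{N} \sum_{s=1}^{N} y_s, \qquad
\sigma^2 = \frac{1}{N-1} \sum_{s=1}^{N} (y_s - \mu)^2, \qquad
\beta_s = y_s - \mu .
```

With these definitions the subject deviations sum to zero over the population by construction.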
9
Non-stochastic model:
Index for response:
Response error:
Assume:
Response error model (for each subject):
Response process: in the hormetic range, pick a dose at random and measure the response.
Assumptions: unbiased response error, heteroskedastic.
The response error model is a stochastic model. Response error is a random effect. The sum of subject effects is zero (over the population).
Information: (subject label, response).
Subsequently, take r = 1 (one measure per subject).
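The equations behind these labels were images and are missing. A sketch consistent with the stated assumptions (unbiased, heteroskedastic response error, subject effects summing to zero, r measures per subject) might read as follows. The symbols y_s, mu, beta_s, W_sk, and sigma_s^2 are assumed notation, not recovered from the slide.

```latex
\text{Non-stochastic model:}\quad y_s = \mu + \beta_s, \qquad \sum_{s=1}^{N} \beta_s = 0 \\
\text{Response error model:}\quad Y_{sk} = y_s + W_{sk}, \qquad k = 1, \dots, r \\
\text{Assume:}\quad E(W_{sk}) = 0, \qquad \operatorname{var}(W_{sk}) = \sigma_s^2 .
```

Allowing sigma_s^2 to differ across subjects is what "heteroskedastic" means here; a common sigma^2 would be the pooled (equal) response error case.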
10
3. Subsets, Sampling
Select n of N subjects (a subset, “sample”). Let all subsets be equally likely:
Sample mean:
Note difference with:
The sample is a set (unordered) of different subjects.
Usually represent Common
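To make the "all subsets equally likely" idea concrete, the sketch below enumerates every size-n subset of a small population and checks that the sample mean averages to the population mean. The true responses in `y_true` are hypothetical, and exact fractions are used so that no rounding is involved.

```python
from fractions import Fraction
from itertools import combinations

def subset_sample_means(y, n):
    """Map each size-n subset of subject labels to its sample mean."""
    labels = range(1, len(y) + 1)
    return {
        subset: Fraction(sum(y[s - 1] for s in subset), n)
        for subset in combinations(labels, n)
    }

y_true = [10, 14, 21]   # hypothetical true responses for s = 1, 2, 3
means = subset_sample_means(y_true, n=2)

# Every subset has probability 1 / (number of subsets), so the
# expected sample mean is the plain average over subsets.
expected_mean = sum(means.values()) / len(means)
population_mean = Fraction(sum(y_true), len(y_true))
```

Here `expected_mean` and `population_mean` coincide, which is the usual unbiasedness of the sample mean under simple random sampling without replacement.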
11
Sample as a Sequence (part of Permutation)
Represent positions in a permutation:
Assume all permutations equally likely:
Define:
Sample = positions
Sample mean:
The random variable Y(ik) is not clearly defined.
The sample is now a sequence (order matters)!
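The difference between a sample as a set and a sample as a sequence can be checked by brute force. The sketch below, with hypothetical true responses, lists all permutations of N=3 subjects, takes the first n=2 positions as the sample, and computes the expected response in each position.

```python
from fractions import Fraction
from itertools import permutations

def position_expectations(y):
    """Expected response in each position i over all N! equally
    likely permutations of the subjects."""
    perms = list(permutations(range(len(y))))
    return [
        Fraction(sum(y[p[i]] for p in perms), len(perms))
        for i in range(len(y))
    ]

y_true = [10, 14, 21]          # hypothetical true responses, s = 1, 2, 3
n = 2
perms = list(permutations(range(len(y_true))))
ordered_samples = {p[:n] for p in perms}                 # sequences: order matters
unordered_samples = {frozenset(p[:n]) for p in perms}    # sets: order ignored
expectations = position_expectations(y_true)
```

There are twice as many ordered samples as unordered ones in this case, and every position has the same expected response, namely the population mean.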
12
Population of N=3 subjects: Ed (s=2), Wenjun (s=3), Julio (s=1). The sample is the first two subjects on the left.
13
Population of N=3 subjects. Note labels and positions.
Position in permutation: i=1, i=2, i=3. Subject labels: s=2, s=3, s=1.
14
Different permutation: Ed, Julio, and Wenjun.
Position in permutation: i=1, i=2, i=3. Subject labels shown: s=2, s=1, s=3 and s=1, s=3, s=2.
15
Different permutation: Ed, Wenjun, Julio.
Position in permutation: i=1, i=2, i=3. Subject labels: s=3, s=1, s=2.
16
Different permutation: Julio, Wenjun, Ed.
Position in permutation: i=1, i=2, i=3. Subject labels: s=3, s=2, s=1.
17
Different permutation with sample and remainder: Wenjun, Julio, and Ed.
Position in permutation: i=1, i=2, i=3, divided into sample | remainder. Subject labels: s=1, s=2, s=3.
18
Wenjun, Ed, and Julio (using sample and remainder).
Position in permutation: i=1, i=2, i=3, divided into sample | remainder. Subject labels: s=2, s=1, s=3.
19
Population size (N) is most likely > 3
We only see “n” subjects in the sample. For example, suppose n=3 and N=7. We may see …
20
Luzmery, Wenjun, and Viviana in sample.
Position in permutation: i=1, i=2, i=3 (sample) | i=4, i=… (remainder). Subject labels in sample: s=3, s=4, s=5.
21
Viviana, Ed, Silvina in a sample.
Position in permutation: i=1, i=2, i=3 (sample) | i=… (remainder). Subject labels in sample: s=2, s=4, s=7.
22
Traditional Sampling Approach
Subjects: 1, 2, …, N
Horvitz-Thompson estimator:
First-order inclusion probabilities = Prob(subject included in a sample)
Bold y is a vector of population values.
Missing data
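As a reminder of how the Horvitz-Thompson estimator uses first-order inclusion probabilities, the sketch below weights each sampled value by the inverse of its inclusion probability and checks unbiasedness for the population total by enumerating all samples. It assumes simple random sampling without replacement, where every inclusion probability is n/N; the data and the function name `horvitz_thompson_total` are hypothetical, and the slide's own formula was not recoverable.

```python
from fractions import Fraction
from itertools import combinations

def horvitz_thompson_total(sample_values, inclusion_probs):
    """Horvitz-Thompson estimate of the population total:
    the sum of y_s / pi_s over the sampled subjects."""
    return sum(Fraction(v) / p for v, p in zip(sample_values, inclusion_probs))

y_true = [10, 14, 21]            # hypothetical population values
N, n = len(y_true), 2
pi = Fraction(n, N)              # inclusion probability under SRS without replacement

estimates = [
    horvitz_thompson_total([y_true[s] for s in subset], [pi] * n)
    for subset in combinations(range(N), n)
]
average_estimate = sum(estimates) / len(estimates)
```

Averaged over all equally likely samples, the estimate equals the true total `sum(y_true)`. Unsampled subjects never enter the formula, which is why this view treats them as missing data.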
23
With Response Error Model
Sample mean:
Sample is a set.
Sample is a sequence.
To represent positions: U(is) is an indicator variable that has a value of 1 if subject s is in position i (and 0 otherwise).
24
Position in permutation: i=1, i=2, i=3 (sample). Subject labels: s=1, s=2, s=3.
25
First Position in Permutation
Suppose s = 1, …, 3 = N.
First position in permutation:
Then:
Formal expression of response for position i=1 in a permutation.
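The expression itself is missing from the transcript. Given the indicator U(is) (equal to 1 when subject s occupies position i, and 0 otherwise), one natural reconstruction for N=3 is sketched below. The symbols y_s for the true response and W_sk for the response error on measure k are assumed notation.

```latex
Y_{1} = U_{11}\, y_1 + U_{12}\, y_2 + U_{13}\, y_3 = \sum_{s=1}^{3} U_{1s}\, y_s ,
\qquad
Y_{1k} = \sum_{s=1}^{3} U_{1s}\,(y_s + W_{sk}) \quad \text{with response error.}
```

Exactly one of the three indicators equals 1 in any permutation, so the sum picks out the response of whichever subject lands in the first position.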
26
Positions in Sample Sequences
Sample and remainder representation
27
Basic Random Variables
Sample, remainder, population
28
Finite Population Mixed Model
Response error model
Finite population mixed model: combine the response error model with the permutation to get the mixed model.
29
Mixed Model
Alpha = fixed effects
B = random effects
W* = response error
Note that the subscript is POSITION, not SUBJECT.
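The model equation was an image and is not in the transcript. A sketch of how combining a response error model with the permutation indicators leads to a mixed model indexed by position is given below. It assumes the notation y_s = mu + beta_s for the true response, W_sk for response error, and U_is for the indicator that subject s is in position i; the identification of alpha with mu is likewise an assumption for this simple case.

```latex
Y_{ik} = \sum_{s=1}^{N} U_{is}\,(\mu + \beta_s + W_{sk})
       = \mu + B_i + W^{*}_{ik},
\qquad
B_i = \sum_{s=1}^{N} U_{is}\,\beta_s,
\quad
W^{*}_{ik} = \sum_{s=1}^{N} U_{is}\,W_{sk}.
```

The step from the first to the second expression uses the fact that the indicators for a given position sum to 1 over subjects. B_i is random because the subject in position i is random, which is why the subscript refers to a position rather than a subject.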
30
Properties of Basic Random Variables (N=3)
Table of the basic random variables, with margins for sum, average, and expected value.
31
Sample Random Variables (n=2)
Table of the sample random variables, with margins for sum and expected value.
Sum over rows: get the usual random variable, with expected value mu.
Sum over columns: get a random variable with different expected values.
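The two margins can be verified by enumeration. The sketch below assumes the table has one row per sample position i and one column per subject s, with cell U(is) times y(s), where U(is) = 1 when subject s is in position i. That layout is an inference from the slide text, and the data are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def expected_margins(y, n):
    """Expected row sums (one per sample position) and column sums
    (one per subject) of the table with cells U_is * y_s, averaged
    over all equally likely permutations."""
    N = len(y)
    perms = list(permutations(range(N)))
    row = [Fraction(0)] * n
    col = [Fraction(0)] * N
    for p in perms:
        for i in range(n):       # only the first n positions are sampled
            s = p[i]             # subject occupying position i
            row[i] += y[s]
            col[s] += y[s]
    row = [r / len(perms) for r in row]
    col = [c / len(perms) for c in col]
    return row, col

y_true = [10, 14, 21]            # hypothetical true responses
row_exp, col_exp = expected_margins(y_true, n=2)
```

Each row sum has expected value equal to the population mean, while the column sum for subject s has expected value (n/N) times y_s, so the column margins differ from subject to subject.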
32
Prediction of Mean in a Simple Case: No Response Error (N=3, n=2)
Sample, remainder
Note:
Need to predict a function of the remainder.
Criteria: linear function of the sample; unbiased; smallest mean squared error.
Called the Best Linear Unbiased Predictor (note that we use the term “Predictor” here for a parameter, not a random variable).
33
Prediction of Mean No Response Error (N=3, n=2)
Target:
Sample data realized:
We predict the unobserved values in the population.
Best Linear Unbiased Predictor:
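The predictor on the slide is not recoverable, but the standard result for this setting is easy to state. With no response error and simple random sampling, the population mean splits into an observed sample part and an unobserved remainder part, and the remainder is predicted by the sample mean. The notation below (subscripts I and II for sample and remainder) is assumed.

```latex
\bar{Y} = \frac{n}{N}\,\bar{Y}_{I} + \frac{N-n}{N}\,\bar{Y}_{II},
\qquad
\hat{\bar{Y}} = \frac{n}{N}\,\bar{Y}_{I} + \frac{N-n}{N}\,\hat{\bar{Y}}_{II},
\qquad
\hat{\bar{Y}}_{II} = \bar{Y}_{I}
\;\Longrightarrow\;
\hat{\bar{Y}} = \bar{Y}_{I}.
```

For N=3 and n=2 this means the realized values in positions 1 and 2 are kept as they are, and the unobserved value in position 3 is predicted by their average.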
34
Prediction of a Subject’s Mean in Position i with No Response Error (N=3, n=2)
Target:
Sample data realized:
We predict the unobserved values in the population.
Best Linear Unbiased Predictor:
35
Prediction of a Subject’s Mean in Position i with Response Error
Target:
Sample data realized:
We predict the unobserved values in the population.
Best Linear Unbiased Predictor:
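The formula is missing here. Elsewhere the slides refer to a constant K built from a pooled response error variance, which suggests a shrinkage form along the following lines. This is a hedged sketch for a sampled position i with one measure per subject, not a transcription of the slide: sigma^2 denotes the variance of true responses, sigma_s^2 the response error variance of subject s, and the exact definition of K on the slide may differ.

```latex
\hat{P}_i = \bar{Y} + K\,(Y_i - \bar{Y}),
\qquad
K = \frac{\sigma^2}{\sigma^2 + \sigma_e^2},
\qquad
\sigma_e^2 = \frac{1}{N} \sum_{s=1}^{N} \sigma_s^2 .
```

When response error is absent, K equals 1 and the predictor reduces to the observed value Y_i; as response error grows, the predictor is pulled toward the sample mean.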
36
Prediction of Realized Random Effect – Other Examples
SRS + subject response error
SRS + position response error
Cluster sampling, balanced
Cluster sampling, unbalanced: similar form, more complicated
Return to the basic question: which predictor should be used?
Common response error: optimal via the theory.
Allowing K to depend on the realized subject: had smaller MSE.
37
Plot of predicted response for the strain ‘wild type yeast’ for 253 chemicals with 3 doses below a benchmark dose of 95%, using pooled (equal) response error based on a mixed model.
38
Plot with unequal response error.
Which results should be used? Does it depend on whether the model has heterogeneous response error?
No: theoretically, a derivation with heterogeneous response error pools the response error variances. However, in a simple example, we can show that better results occur if response error is separated. The theory doesn’t match: we don’t understand the theory for the ‘better results’.
Review what we do understand. Keep the context simple.
39
Dilemma
The pooled response error variance should be used for K (using the theoretical results).
An empirical example illustrates smaller MSE with K depending on the realized subject, but there is no theory for this!
What should we do? Is there a ‘gap’ in the framework?
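The two choices of K can be placed side by side in a small numerical sketch. It assumes a shrinkage predictor of the form ybar + K (y - ybar) with K = sigma2 / (sigma2 + response error variance); all numbers and the function name `shrinkage_predictor` are hypothetical, and the sketch only shows how the two predictions differ, not which one has the smaller MSE.

```python
def shrinkage_predictor(y_obs, ybar, sigma2, resp_var):
    """Shrink an observed response toward the sample mean.

    sigma2: variance of true responses between subjects.
    resp_var: response error variance used to build K
              (pooled, or specific to the realized subject).
    """
    k = sigma2 / (sigma2 + resp_var)
    return ybar + k * (y_obs - ybar)

sigma2 = 4.0                        # hypothetical between-subject variance
resp_vars = [1.0, 4.0, 7.0]         # hypothetical subject response error variances
pooled_var = sum(resp_vars) / len(resp_vars)

y_obs, ybar = 110.0, 100.0          # hypothetical observed response and sample mean

# Theory-based choice: the same K for every realized subject.
pred_pooled = shrinkage_predictor(y_obs, ybar, sigma2, pooled_var)

# Empirical alternative: K depends on the realized subject's own variance.
pred_specific = [shrinkage_predictor(y_obs, ybar, sigma2, v) for v in resp_vars]
```

A subject measured precisely is shrunk less under the subject-specific K than under the pooled K, and a noisily measured subject is shrunk more, which is where the two sets of results on the plots part ways.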
40
Basic Sample Random Variables
Table with margin sums (right column and bottom row).
Usual modelling approach (work with the right column): these random variables are exchangeable, a natural lead-in to Bayesian inference.
Traditional sampling (and missing data) approach (work with the bottom row): don’t use explicit notation for the sample; use inclusion probabilities. Some values are missing.
Super-population models: use the bottom row, but re-arrange the elements so that those in the sample are first. Assume the random variables are exchangeable (as for the right column). This really doesn’t make sense.
41
Basic Random Variables
Sample and remainder
What is potentially observable?
What is observed?
42
Thanks. More work is needed!
Anne Stanek Viviana Lencina Alice Singer Silvia San Martino Wenjun Li Luz Mery Gonzalas Julio Singer Ed Stanek Maria Lucia Singer