SPH 247 Statistical Analysis of Laboratory Data 1April 2, 2013SPH 247 Statistical Analysis of Laboratory Data.

Slides:



Advertisements
Similar presentations
Lecture 2 ANALYSIS OF VARIANCE: AN INTRODUCTION
Advertisements

ANALYSIS OF VARIANCE (ONE WAY)
Test of (µ 1 – µ 2 ),  1 =  2, Populations Normal Test Statistic and df = n 1 + n 2 – 2 2– )1– 2 ( 2 1 )1– 1 ( 2 where ] 2 – 1 [–
1-Way Analysis of Variance
Lecture 10 F-tests in MLR (continued) Coefficients of Determination BMTRY 701 Biostatistical Methods II.
Multiple Comparisons in Factorial Experiments
Factorial Models Random Effects Random Effects Gauge R&R studies (Repeatability and Reproducibility) have been an expanding area of application Gauge R&R.
Factorial ANOVA More than one categorical explanatory variable.
Hypothesis Testing Steps in Hypothesis Testing:
ANOVA notes NR 245 Austin Troy
Design of Engineering Experiments - Experiments with Random Factors
Lecture 23: Tues., Dec. 2 Today: Thursday:
PSY 307 – Statistics for the Behavioral Sciences
1 Some R Basics EPP 245/298 Statistical Analysis of Laboratory Data.
1 Analysis of Variance (ANOVA) EPP 245 Statistical Analysis of Laboratory Data.
PSY 307 – Statistics for the Behavioral Sciences Chapter 19 – Chi-Square Test for Qualitative Data Chapter 21 – Deciding Which Test to Use.
Analysis of Variance Introduction The Analysis of Variance is abbreviated as ANOVA The Analysis of Variance is abbreviated as ANOVA Used for hypothesis.
Chapter 12: Analysis of Variance
F-Test ( ANOVA ) & Two-Way ANOVA
The Chi-Square Distribution 1. The student will be able to  Perform a Goodness of Fit hypothesis test  Perform a Test of Independence hypothesis test.
1 Experimental Statistics - week 7 Chapter 15: Factorial Models (15.5) Chapter 17: Random Effects Models.
1 Tests with two+ groups We have examined tests of means for a single group, and for a difference if we have a matched sample (as in husbands and wives)
© 2002 Prentice-Hall, Inc.Chap 14-1 Introduction to Multiple Regression Model.
 Combines linear regression and ANOVA  Can be used to compare g treatments, after controlling for quantitative factor believed to be related to response.
1 Design of Engineering Experiments Part 10 – Nested and Split-Plot Designs Text reference, Chapter 14, Pg. 525 These are multifactor experiments that.
23-1 Analysis of Covariance (Chapter 16) A procedure for comparing treatment means that incorporates information on a quantitative explanatory variable,
PSY 307 – Statistics for the Behavioral Sciences Chapter 16 – One-Factor Analysis of Variance (ANOVA)
Testing Multiple Means and the Analysis of Variance (§8.1, 8.2, 8.6) Situations where comparing more than two means is important. The approach to testing.
POSC 202A: Lecture 12/10 Announcements: “Lab” Tomorrow; Final ed out tomorrow or Friday. I will make it due Wed, 5pm. Aren’t I tender? Lecture: Substantive.
Lecture 9: ANOVA tables F-tests BMTRY 701 Biostatistical Methods II.
Chapter 10: Analyzing Experimental Data Inferential statistics are used to determine whether the independent variable had an effect on the dependent variance.
MGS3100_04.ppt/Sep 29, 2015/Page 1 Georgia State University - Confidential MGS 3100 Business Analysis Regression Sep 29 and 30, 2015.
Lesson Multiple Regression Models. Objectives Obtain the correlation matrix Use technology to find a multiple regression equation Interpret the.
Exercise 1 The standard deviation of measurements at low level for a method for detecting benzene in blood is 52 ng/L. What is the Critical Level if we.
The Completely Randomized Design (§8.3)
Exercise 1 You have a clinical study in which 10 patients will either get the standard treatment or a new treatment Randomize which 5 of the 10 get the.
© Copyright McGraw-Hill Correlation and Regression CHAPTER 10.
Analysis of Variance (One Factor). ANOVA Analysis of Variance Tests whether differences exist among population means categorized by only one factor or.
STA 286 week 131 Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression.
1 ANALYSIS OF VARIANCE (ANOVA) Heibatollah Baghi, and Mastee Badii.
Statistics for Differential Expression Naomi Altman Oct. 06.
Chapter 22: Building Multiple Regression Models Generalization of univariate linear regression models. One unit of data with a value of dependent variable.
Chapter 13 Repeated-Measures and Two-Factor Analysis of Variance
1 Experimental Statistics - week 9 Chapter 17: Models with Random Effects Chapter 18: Repeated Measures.
CHAPTER 27: One-Way Analysis of Variance: Comparing Several Means
IE241: Introduction to Design of Experiments. Last term we talked about testing the difference between two independent means. For means from a normal.
Statistics for Political Science Levin and Fox Chapter Seven
Copyright c 2001 The McGraw-Hill Companies, Inc.1 Chapter 11 Testing for Differences Differences betweens groups or categories of the independent variable.
1 Statistics 262: Intermediate Biostatistics Regression Models for longitudinal data: Mixed Models.
EPP 245 Statistical Analysis of Laboratory Data 1April 23, 2010SPH 247 Statistical Analysis of Laboratory Data.
Experimental Statistics - week 9
Exercise 1 You have a clinical study in which 10 patients will either get the standard treatment or a new treatment Randomize which 5 of the 10 get the.
Exercise 1 You have a clinical study in which 10 patients will either get the standard treatment or a new treatment Randomize which 5 of the 10 get the.
EDUC 200C Section 9 ANOVA November 30, Goals One-way ANOVA Least Significant Difference (LSD) Practice Problem Questions?
© 2006 by The McGraw-Hill Companies, Inc. All rights reserved. 1 Chapter 11 Testing for Differences Differences betweens groups or categories of the independent.
Formula for Linear Regression y = bx + a Y variable plotted on vertical axis. X variable plotted on horizontal axis. Slope or the change in y for every.
1 Experimental Statistics - week 8 Chapter 17: Mixed Models Chapter 18: Repeated Measures.
1 Analysis of Variance (ANOVA) EPP 245/298 Statistical Analysis of Laboratory Data.
Differences Among Groups
> xyplot(cat~time|id,dd,groups=ses,lty=1:3,type="l") > dd head(dd) id ses cat pre post time
1 Analysis of Variance (ANOVA) EPP 245/298 Statistical Analysis of Laboratory Data.
Statistical Data Analysis - Lecture /04/03
Comparing Three or More Means
Mixed models and their uses in meta-analysis
POSC 202A: Lecture Lecture: Substantive Significance, Relationship between Variables 1.
CHAPTER 29: Multiple Regression*
Chapter 13 Group Differences
Alexandra Kuznetsova Biostatistician, Leo Pharma
MGS 3100 Business Analysis Regression Feb 18, 2016
Analysis of Variance (ANOVA)
Presentation transcript:

SPH 247 Statistical Analysis of Laboratory Data 1April 2, 2013SPH 247 Statistical Analysis of Laboratory Data

ANOVA—Fixed and Random Effects We will review the analysis of variance (ANOVA) and then move to random and fixed effects models Nested models are used to look at levels of variability (days within subjects, replicate measurements within days) Crossed models are often used when there are both fixed and random effects. April 2, 2013SPH 247 Statistical Analysis of Laboratory Data2

The Basic Idea The analysis of variance is a way of testing whether observed differences between groups are too large to be explained by chance variation One-way ANOVA is used when there are k ≥ 2 groups for one factor, and no other quantitative variable or classification factor. April 2, 2013SPH 247 Statistical Analysis of Laboratory Data3

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data4 ABC

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data5 Data = Grand Mean + Column Deviations from grand mean + Cell Deviations from column mean Are the column deviations from the grand mean too big to be accounted for by the cell deviations from the column means?

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data6 ABC Data

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data7 ABC Column Means

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data8 ABC Deviations from Column Means

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data9 red.cell.folate package:ISwR R Documentation Red cell folate data Description: The 'folate' data frame has 22 rows and 2 columns. It contains data on red cell folate levels in patients receiving three different methods of ventilation during anesthesia. Format: This data frame contains the following columns: folate: a numeric vector. Folate concentration (μg/l). ventilation: a factor with levels: 'N2O+O2,24h': 50% nitrous oxide and 50% oxygen, continuously for 24~hours; 'N2O+O2,op': 50% nitrous oxide and 50% oxygen, only during operation; 'O2,24h': no nitrous oxide, but 35-50% oxygen for 24~hours.

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data10 > data(red.cell.folate) > help(red.cell.folate) > summary(red.cell.folate) folate ventilation Min. :206.0 N2O+O2,24h:8 1st Qu.:249.5 N2O+O2,op :9 Median :274.0 O2,24h :5 Mean : rd Qu.:305.5 Max. :392.0 > attach(red.cell.folate) > plot(folate ~ ventilation)

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data11

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data12 > folate.lm <- lm(folate ~ ventilation) > summary(folate.lm) Call: lm(formula = folate ~ ventilation) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) e-14 *** ventilationN2O+O2,op * ventilationO2,24h Signif. codes: 0 `***' `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 Residual standard error: on 19 degrees of freedom Multiple R-Squared: , Adjusted R-squared: F-statistic: on 2 and 19 DF, p-value:

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data13 > anova(folate.lm) Analysis of Variance Table Response: folate Df Sum Sq Mean Sq F value Pr(>F) ventilation * Residuals Signif. codes: 0 `***' `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Two- and Multi-way ANOVA If there is more than one factor, the sum of squares can be decomposed according to each factor, and possibly according to interactions One can also have factors and quantitative variables in the same model (cf. analysis of covariance) All have similar interpretations April 2, 2013SPH 247 Statistical Analysis of Laboratory Data14

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data15 Heart rates after enalaprilat (ACE inhibitor) Description: 36 rows and 3 columns. data for nine patients with congestive heart failure before and shortly after administration of enalaprilat, in a balanced two-way layout. Format: hr a numeric vector. Heart rate in beats per minute. subj a factor with levels '1' to '9'. time a factor with levels '0' (before), '30', '60', and '120' (minutes after administration).

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data16 > data(heart.rate) > attach(heart.rate) > heart.rate hr subj time

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data17 > plot(hr~subj) > plot(hr~time) > hr.lm <- lm(hr~subj+time) > anova(hr.lm) Analysis of Variance Table Response: hr Df Sum Sq Mean Sq F value Pr(>F) subj e-16 *** time * Residuals Signif. codes: 0 `***' `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 > sres <- resid(lm(hr~subj)) > plot(sres~time) Note that when the design is orthogonal, the ANOVA results don’t depend on the order of terms.

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data18

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data19

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data20

Fixed and Random Effects A fixed effect is a factor that can be duplicated (dosage of a drug) A random effect is one that cannot be duplicated Patient/subject Repeated measurement There can be important differences in the analysis of data with random effects The error term is always a random effect April 2, 2013SPH 247 Statistical Analysis of Laboratory Data21

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data22

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data23

Estradiol data from Rosner 5 subjects from the Nurses’ Health Study One blood sample each Each sample assayed twice for estradiol (and three other hormones) The within variability is strictly technical/assay Variability within a person over time will be much greater April 2, 2013SPH 247 Statistical Analysis of Laboratory Data24

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data25

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data26 > anova(lm(Estradiol ~ Subject,data=endocrin)) Analysis of Variance Table Response: Estradiol Df Sum Sq Mean Sq F value Pr(>F) Subject ** Residuals Signif. codes: 0 ‘***’ ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Replication error variance is 6.043, so the standard deviation of replicates is 2.46 pg/mL This compared to average levels across subjects from 8.05 to Estimated variance across subjects is ( − 6.043)/2 = Standard deviation across subjects is 8.43 pg/mL If we average the replicates, we get five values, the standard deviation of which is also 71.1

Fasting Blood Glucose Part of a larger study that also examined glucose tolerance during pregnancy Here we have 53 subjects with 6 tests each at intervals of at least a year The response is glucose as mg/100mL April 2, 2013SPH 247 Statistical Analysis of Laboratory Data27

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data28 > anova(lm(FG ~ Subject,data=fg2)) Analysis of Variance Table Response: FG Df Sum Sq Mean Sq F value Pr(>F) Subject e-09 *** Residuals Signif. codes: 0 ‘***’ ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 > Estimated within-Subject variance is , so the standard deviation is 8.48 mg/100mL Estimated between-Subject variance is ( − )/6 = , sd = 4.80 mg/100mL The variance of the 53 means is 35.05, which is larger because it includes a component of the within-subject variance

Nested Random Effects Models Cooperative trial with 6 laboratories, one analyte (7 in the full data set), 3 batches per lab (a month apart), and 2 replicates per batch Estimate the variance components due to labs, batches, and replicates Test for significance if possible Effects are lab, batch-in-lab, and error April 2, 2013SPH 247 Statistical Analysis of Laboratory Data29

Analysis using lm or aov April 2, 2013SPH 247 Statistical Analysis of Laboratory Data30 > anova(lm(Conc ~ Lab + Lab:Bat,data=coop2)) Analysis of Variance Table Response: Conc Df Sum Sq Mean Sq F value Pr(>F) Lab e-10 *** Lab:Bat * Residuals The test for batch-in-lab is correct, but the test for lab is not—the denominator should be The Lab:Bat MS, so F(5,12) = / = so p = 3.47e-4, still significant Residual Batch Lab

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data31

Analysis using lme R package nlme Two separate formulas, one for the fixed effects and one for the random effects In this case, no fixed effects Nested random effects use the / notation lme(Conc ~1, random = ~1 | Lab/Bat,data=coop2) April 2, 2013SPH 247 Statistical Analysis of Laboratory Data32

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data33 lme(Conc ~1, random = ~1 | Lab/Bat,data=coop2) Linear mixed-effects model fit by REML Data: coop2 Log-restricted-likelihood: Fixed: Conc ~ 1 (Intercept) Random effects: Formula: ~1 | Lab (Intercept) StdDev: Formula: ~1 | Bat %in% Lab (Intercept) Residual StdDev: Number of Observations: 36 Number of Groups: Lab Bat %in% Lab 6 18 Average Concentration SD of Labs SD of Batches and Replicates

Hypothesis Tests When data are balanced, one can compute expected mean squares, and many times can compute a valid F test. In more complex cases, or when data are unbalanced, this is more difficult One requirement for certain hypothesis tests to be valid is that the null hypothesis value is not on the edge of the possible values For H 0 : α = 0, we have that α could be either positive or negative For H 0 : σ 2 = 0, negative variances are not possible April 2, 2013SPH 247 Statistical Analysis of Laboratory Data34

April 2, 2013SPH 247 Statistical Analysis of Laboratory Data35 EffectVarianceSD Residual Batch Lab The variance among replicates a month apart ( = ) is about twice that of those on the same day ( ), and the standard deviations are and These are CV’s on the average of 21% and 16% respectively The variance among values from different labs is about , with a standard deviation of and a CV of about 33%

More complex models When data are balanced and the expected mean squares can be computed, this is a valid way for testing and estimation Programs like lme and lmer in R and Proc Mixed in SAS can handle complex models But most likely this is a time when you may need to consult an expert April 2, 2013SPH 247 Statistical Analysis of Laboratory Data36