Exercise 1 The standard deviation of measurements at low level for a method for detecting benzene in blood is 52 ng/L. What is the Critical Level if we.

Slides:



Advertisements
Similar presentations
Lecture 10 F-tests in MLR (continued) Coefficients of Determination BMTRY 701 Biostatistical Methods II.
Advertisements

SPH 247 Statistical Analysis of Laboratory Data 1April 2, 2013SPH 247 Statistical Analysis of Laboratory Data.
BA 275 Quantitative Business Methods
Inference for Regression
6-1 Introduction To Empirical Models 6-1 Introduction To Empirical Models.
Review of Univariate Linear Regression BMTRY 726 3/4/14.
Workshop in R & GLMs: #2 Diane Srivastava University of British Columbia
5/18/ lecture 101 STATS 330: Lecture 10. 5/18/ lecture 102 Diagnostics 2 Aim of today’s lecture  To describe some more remedies for non-planar.
Confidence Intervals Underlying model: Unknown parameter We know how to calculate point estimates E.g. regression analysis But different data would change.
Regression with ARMA Errors. Example: Seat-belt legislation Story: In February 1983 seat-belt legislation was introduced in UK in the hope of reducing.
Multiple Regression Predicting a response with multiple explanatory variables.
Zinc Data SPH 247 Statistical Analysis of Laboratory Data.
x y z The data as seen in R [1,] population city manager compensation [2,] [3,] [4,]
SPH 247 Statistical Analysis of Laboratory Data 1April 23, 2010SPH 247 Statistical Analysis of Laboratory Data.
Examining Relationship of Variables  Response (dependent) variable - measures the outcome of a study.  Explanatory (Independent) variable - explains.
Nemours Biomedical Research Statistics April 2, 2009 Tim Bunnell, Ph.D. & Jobayer Hossain, Ph.D. Nemours Bioinformatics Core Facility.
7/2/ Lecture 51 STATS 330: Lecture 5. 7/2/ Lecture 52 Tutorials  These will cover computing details  Held in basement floor tutorial lab,
Crime? FBI records violent crime, z x y z [1,] [2,] [3,] [4,] [5,]
1 Regression and Calibration EPP 245 Statistical Analysis of Laboratory Data.
Correlation and Regression Analysis
Regression Transformations for Normality and to Simplify Relationships U.S. Coal Mine Production – 2011 Source:
SPH 247 Statistical Analysis of Laboratory Data April 9, 2013SPH 247 Statistical Analysis of Laboratory Data1.
Checking Regression Model Assumptions NBA 2013/14 Player Heights and Weights.
How to plot x-y data and put statistics analysis on GLEON Fellowship Workshop January 14-18, 2013 Sunapee, NH Ari Santoso.
Introduction to Linear Regression and Correlation Analysis
BIOL 582 Lecture Set 19 Matrices, Matrix calculations, Linear models using linear algebra.
PCA Example Air pollution in 41 cities in the USA.
9/14/ Lecture 61 STATS 330: Lecture 6. 9/14/ Lecture 62 Inference for the Regression model Aim of today’s lecture: To discuss how we assess.
Analysis of Covariance Harry R. Erwin, PhD School of Computing and Technology University of Sunderland.
 Combines linear regression and ANOVA  Can be used to compare g treatments, after controlling for quantitative factor believed to be related to response.
7.1 - Motivation Motivation Correlation / Simple Linear Regression Correlation / Simple Linear Regression Extensions of Simple.
23-1 Analysis of Covariance (Chapter 16) A procedure for comparing treatment means that incorporates information on a quantitative explanatory variable,
Lecture 3: Inference in Simple Linear Regression BMTRY 701 Biostatistical Methods II.
Testing Multiple Means and the Analysis of Variance (§8.1, 8.2, 8.6) Situations where comparing more than two means is important. The approach to testing.
1 1 Slide © 2014 Cengage Learning. All Rights Reserved. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole.
Use of Weighted Least Squares. In fitting models of the form y i = f(x i ) +  i i = 1………n, least squares is optimal under the condition  1 ……….  n.
Regression and Analysis Variance Linear Models in R.
Exercise 8.25 Stat 121 KJ Wang. Votes for Bush and Buchanan in all Florida Counties Palm Beach County (outlier)
Lecture 9: ANOVA tables F-tests BMTRY 701 Biostatistical Methods II.
Regression Model Building LPGA Golf Performance
SPH 247 Statistical Analysis of Laboratory Data April 9, 2013SPH 247 Statistical Analysis of Laboratory Data1.
Lecture 8 Simple Linear Regression (cont.). Section Objectives: Statistical model for linear regression Data for simple linear regression Estimation.
Inference for Regression Simple Linear Regression IPS Chapter 10.1 © 2009 W.H. Freeman and Company.
Using R for Marketing Research Dan Toomey 2/23/2015
FACTORS AFFECTING HOUSING PRICES IN SYRACUSE Sample collected from Zillow in January, 2015 Urban Policy Class Exercise - Lecy.
STA 286 week 131 Inference for the Regression Coefficient Recall, b 0 and b 1 are the estimates of the slope β 1 and intercept β 0 of population regression.
Tutorial 4 MBP 1010 Kevin Brown. Correlation Review Pearson’s correlation coefficient – Varies between – 1 (perfect negative linear correlation) and 1.
Environmental Modeling Basic Testing Methods - Statistics III.
Lecture 6: Multiple Linear Regression Adjusted Variable Plots BMTRY 701 Biostatistical Methods II.
Determining Factors of GPA Natalie Arndt Allison Mucha MA /6/07.
Lecture 6: Multiple Linear Regression Adjusted Variable Plots BMTRY 701 Biostatistical Methods II.
Linear Models Alan Lee Sample presentation for STATS 760.
Exercise 1 The standard deviation of measurements at low level for a method for detecting benzene in blood is 52 ng/L. What is the Critical Level if we.
EPP 245 Statistical Analysis of Laboratory Data 1April 23, 2010SPH 247 Statistical Analysis of Laboratory Data.
Tutorial 5 Thursday February 14 MBP 1010 Kevin Brown.
1 Analysis of Variance (ANOVA) EPP 245/298 Statistical Analysis of Laboratory Data.
WSUG M AY 2012 EViews, S-Plus and R Damian Staszek Bristol Water.
Chapter 12 Simple Linear Regression and Correlation
Résolution de l’ex 1 p40 t=c(2:12);N=c(55,90,135,245,403,665,1100,1810,3000,4450,7350) T=data.frame(t,N,y=log(N));T; > T t N y
CHAPTER 7 Linear Correlation & Regression Methods
Checking Regression Model Assumptions
BPK 304W Correlation.
Checking Regression Model Assumptions
Console Editeur : myProg.R 1
PSY 626: Bayesian Statistics for Psychological Science
Chapter 12 Simple Linear Regression and Correlation
Multi Linear Regression Lab
Essentials of Statistics for Business and Economics (8e)
Estimating the Variance of the Error Terms
Exercise 1 The standard deviation of measurements at low level for a method for detecting benzene in blood is 52 ng/L. What is the Critical Level if we.
Presentation transcript:

Exercise 1 The standard deviation of measurements at low level for a method for detecting benzene in blood is 52 ng/L. What is the Critical Level if we use a 1% probability criterion? What is the Minimum Detectable Value? If we can use 52 ng/L as the standard deviation, what is a 95% confidence interval for the true concentration if the measured concentration is 175 ng/L? If the CV at high levels is 12%, about what is the standard deviation at high levels for the natural log measured concentration? Find a 95% confidence interval for the concentration if the measured concentration is 1850 ng/L? January 29, 2014BST 226 Statistical Methods for Bioinformatics1

January 29, 2014BST 226 Statistical Methods for Bioinformatics2

Exercise 2 Download data on measurement of zinc in water by ICP/MS (“Zinc.csv”). Use read.csv() to load. Conduct a regression analysis in which you predict peak area from concentration Which of the usual regression assumptions appears to be satisfied and which do not? What would the estimated concentration be if the peak area of a new sample was 1850? From the blanks part of the data, how big should a result be to indicate the presence of zinc with some degree of certainty? Try using weighted least squares for a better estimate of the calibration curve. Does it seem to make a difference? January 29, 2014BST 226 Statistical Methods for Bioinformatics3

January 29, 2014BST 226 Statistical Methods for Bioinformatics4 zinc <- read.csv("zinc.csv") > names(zinc) [1] "Concentration" "Peak.Area" > zinc.lm <- lm(Peak.Area ~ Concentration,data=zinc) > summary(zinc.lm) Call: lm(formula = Peak.Area ~ Concentration, data = zinc) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) Concentration <2e-16 *** --- Signif. codes: 0 ‘***’ ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: 2201 on 89 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: 5.512e+04 on 1 and 89 DF, p-value: < 2.2e-16 > plot(zinc$Concentration,zinc$Peak.Area) > abline(coef(zinc.lm)) > plot(fitted(zinc.lm),resid(zinc.lm)) > ( )/ [1]

January 29, 2014BST 226 Statistical Methods for Bioinformatics5

January 29, 2014BST 226 Statistical Methods for Bioinformatics6

January 29, 2014BST 226 Statistical Methods for Bioinformatics7 > zinc[zinc$Concentration==0,] Concentration Peak.Area > mean(zinc[zinc$Concentration==0,2]) [1] > sd(zinc[zinc$Concentration==0,2]) [1] *205.3 [1] The CL in peak area terms is 742, with zero concentration indicated at 265. The CL in ppt is (742 − )/ = 88 ppt. The MDV is 176 ppt. No samples for true concentrations of 0, 10, or 20 had peak areas above the CL. For the samples at 100ppt or above, all had peak areas above the CL of 742.

> summary(lm(Peak.Area ~ Concentration,data=zinc)) Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) Concentration <2e-16 *** > vars <- tapply(zinc$Peak,zinc$Conc,var) e e e e e e e e e e e+07 > concs <- unique(zinc$Conc) > concnums <- table(zinc$Conc) January 29, 2014BST 226 Statistical Methods for Bioinformatics8

> summary(lm(Peak.Area ~ Concentration,data=zinc)) Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) Concentration <2e-16 *** > vars <- tapply(zinc$Peak,zinc$Conc,var) > concs <- unique(zinc$Conc) > concnums <- table(zinc$Conc) > concs2 <- concs^2 > var.lm <- lm(vars ~ concs2) > rbind(vars,predict(var.lm)) vars > wt1 <- 1/rep(vars,concnums) > wt2 <- 1/rep(predict(var.lm),concnums) > summary(lm(Peak.Area ~ Concentration,data=zinc,weights=wt1)) Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) <2e-16 *** Concentration <2e-16 *** > summary(lm(Peak.Area ~ Concentration,data=zinc,weights=wt2)) Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) e-06 *** Concentration < 2e-16 *** January 29, 2014BST 226 Statistical Methods for Bioinformatics9

Exercise 3 The file hiv.csv contains data on an HIV PCR assay calibration. These are dilutions of ten samples at 15 copy numbers from 25 to 20,000,000. In theory, the Ct value (Target in the data set) should be linear in log copy number. Fit the calibration line and look at plots to examine the assumptions of linear regression. What is the estimated copy number for an unknown if Ct = 25? The column QS is the Ct value for an in-tube standard. Consider calibrating Ct(Target) – Ct(Standard) instead. Does this work better or not? What is a good criterion? January 29, 2014BST 226 Statistical Methods for Bioinformatics10

January 29, 2014BST 226 Statistical Methods for Bioinformatics11 > hiv.lm <- lm(Target ~ log(Nominal),data=hiv) > summary(hiv.lm) Call: lm(formula = Target ~ log(Nominal), data = hiv) Residuals: Min 1Q Median 3Q Max Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) <2e-16 *** log(Nominal) <2e-16 *** --- Signif. codes: 0 ‘***’ ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 Residual standard error: on 878 degrees of freedom Multiple R-squared: , Adjusted R-squared: F-statistic: 5.649e+04 on 1 and 878 DF, p-value: < 2.2e-16

January 29, 2014BST 226 Statistical Methods for Bioinformatics12 > anova(hiv.lm) Analysis of Variance Table Response: Target Df Sum Sq Mean Sq F value Pr(>F) log(Nominal) < 2.2e-16 *** Residuals Signif. codes: 0 ‘***’ ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

January 29, 2014BST 226 Statistical Methods for Bioinformatics13

Calibration Results Variance is not constant, being higher at higher Ct levels. Adding the standard helps unless copy number is very high (more than 1 million) Using the Target ~ regression, a Ct value of 25 corresponds to a log copy number of (25 − 38.93)/(− 1.386) = Copy number = exp(10.05) = 23,167 January 29, 2014BST 226 Statistical Methods for Bioinformatics14

January 29, 2014BST 226 Statistical Methods for Bioinformatics15 SD ratio Target ~ vs Target – QS ~ Use of standard is better when copy number > 100 and less than 5 million

Exercise 4 The file AD-Luminex.csv contains Luminex protein assays for 124 proteins on 104 patients who are either AD (Alzheimer's Disease, OD (other dementia) or NDC (non-demented controls). See if the measured levels of ApoE are associated with diagnosis. See if the measured levels of IL.1beta are associated with diagnosis. January 29, 2014BST 226 Statistical Methods for Bioinformatics16

January 29, 2014BST 226 Statistical Methods for Bioinformatics17 > ad.data <- read.csv("AD-Luminex.csv") > anova(lm(ApoE ~ Diagnosis,data=ad.data)) Analysis of Variance Table Response: ApoE Df Sum Sq Mean Sq F value Pr(>F) Diagnosis * Residuals Signif. codes: 0 ‘***’ ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 > anova(lm(log(ApoE) ~ Diagnosis,data=ad.data)) Analysis of Variance Table Response: log(ApoE) Df Sum Sq Mean Sq F value Pr(>F) Diagnosis ** Residuals Signif. codes: 0 ‘***’ ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

January 29, 2014BST 226 Statistical Methods for Bioinformatics18 > anova(lm(IL.1beta ~ Diagnosis,data=ad.data)) Analysis of Variance Table Response: IL.1beta Df Sum Sq Mean Sq F value Pr(>F) Diagnosis Residuals > anova(lm(log(IL.1beta) ~ Diagnosis,data=ad.data)) Analysis of Variance Table Response: log(IL.1beta) Df Sum Sq Mean Sq F value Pr(>F) Diagnosis Residuals Signif. codes: 0 ‘***’ ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

January 29, 2014BST 226 Statistical Methods for Bioinformatics19

January 29, 2014BST 226 Statistical Methods for Bioinformatics20

January 29, 2014BST 226 Statistical Methods for Bioinformatics21

January 29, 2014BST 226 Statistical Methods for Bioinformatics22