SPH 247 Statistical Analysis of Laboratory Data April 9, 2013SPH 247 Statistical Analysis of Laboratory Data1.

Slides:



Advertisements
Similar presentations
Chapter 12 Inference for Linear Regression
Advertisements

Irwin/McGraw-Hill © Andrew F. Siegel, 1997 and l Chapter 12 l Multiple Regression: Predicting One Factor from Several Others.
Copyright © 2009 Pearson Education, Inc. Chapter 29 Multiple Regression.
Inference for Regression
6-1 Introduction To Empirical Models 6-1 Introduction To Empirical Models.
Review of Univariate Linear Regression BMTRY 726 3/4/14.
EPI 809/Spring Probability Distribution of Random Error.
SPH 247 Statistical Analysis of Laboratory Data April 2, 2010SPH 247 Statistical Analysis of Laboratory Data1.
Lecture 9 Today: Ch. 3: Multiple Regression Analysis Example with two independent variables Frisch-Waugh-Lovell theorem.
Objectives (BPS chapter 24)
Multiple Regression Predicting a response with multiple explanatory variables.
LINEAR REGRESSION: Evaluating Regression Models Overview Assumptions for Linear Regression Evaluating a Regression Model.
Zinc Data SPH 247 Statistical Analysis of Laboratory Data.
LINEAR REGRESSION: Evaluating Regression Models. Overview Assumptions for Linear Regression Evaluating a Regression Model.
Copyright © 2010, 2007, 2004 Pearson Education, Inc. *Chapter 29 Multiple Regression.
x y z The data as seen in R [1,] population city manager compensation [2,] [3,] [4,]
Chapter 12 Simple Regression
SPH 247 Statistical Analysis of Laboratory Data 1April 23, 2010SPH 247 Statistical Analysis of Laboratory Data.
Regression Hal Varian 10 April What is regression? History Curve fitting v statistics Correlation and causation Statistical models Gauss-Markov.
Chapter 13 Introduction to Linear Regression and Correlation Analysis
Examining Relationship of Variables  Response (dependent) variable - measures the outcome of a study.  Explanatory (Independent) variable - explains.
1 Some R Basics EPP 245/298 Statistical Analysis of Laboratory Data.
Nemours Biomedical Research Statistics April 2, 2009 Tim Bunnell, Ph.D. & Jobayer Hossain, Ph.D. Nemours Bioinformatics Core Facility.
7/2/ Lecture 51 STATS 330: Lecture 5. 7/2/ Lecture 52 Tutorials  These will cover computing details  Held in basement floor tutorial lab,
Inferences About Process Quality
Chapter 14 Introduction to Linear Regression and Correlation Analysis
1 Regression and Calibration EPP 245 Statistical Analysis of Laboratory Data.
Correlation and Regression Analysis
Introduction to Regression Analysis, Chapter 13,
Copyright ©2006 Brooks/Cole, a division of Thomson Learning, Inc. More About Regression Chapter 14.
Forecasting Revenue: An Example of Regression Model Building Setting: Possibly a large set of predictor variables used to predict future quarterly revenues.
Checking Regression Model Assumptions NBA 2013/14 Player Heights and Weights.
Introduction to Linear Regression and Correlation Analysis
Inference for regression - Simple linear regression
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 12-1 Chapter 12 Simple Linear Regression Statistics for Managers Using.
 Combines linear regression and ANOVA  Can be used to compare g treatments, after controlling for quantitative factor believed to be related to response.
BPS - 3rd Ed. Chapter 211 Inference for Regression.
Chapter 14 Simple Regression
23-1 Analysis of Covariance (Chapter 16) A procedure for comparing treatment means that incorporates information on a quantitative explanatory variable,
Lecture 3: Inference in Simple Linear Regression BMTRY 701 Biostatistical Methods II.
+ Chapter 12: Inference for Regression Inference for Linear Regression.
Copyright © 2007 Pearson Education, Inc. Publishing as Pearson Addison-Wesley Slide
Business Statistics: A First Course, 5e © 2009 Prentice-Hall, Inc. Chap 12-1 Correlation and Regression.
Introduction to Linear Regression
Chap 12-1 A Course In Business Statistics, 4th © 2006 Prentice-Hall, Inc. A Course In Business Statistics 4 th Edition Chapter 12 Introduction to Linear.
Applied Quantitative Analysis and Practices LECTURE#23 By Dr. Osman Sadiq Paracha.
Regression and Analysis Variance Linear Models in R.
Lecture 8 Simple Linear Regression (cont.). Section Objectives: Statistical model for linear regression Data for simple linear regression Estimation.
Inference for Regression Simple Linear Regression IPS Chapter 10.1 © 2009 W.H. Freeman and Company.
MGS3100_04.ppt/Sep 29, 2015/Page 1 Georgia State University - Confidential MGS 3100 Business Analysis Regression Sep 29 and 30, 2015.
Review of Building Multiple Regression Models Generalization of univariate linear regression models. One unit of data with a value of dependent variable.
Exercise 1 The standard deviation of measurements at low level for a method for detecting benzene in blood is 52 ng/L. What is the Critical Level if we.
Lecture 11 Multicollinearity BMTRY 701 Biostatistical Methods II.
Tutorial 4 MBP 1010 Kevin Brown. Correlation Review Pearson’s correlation coefficient – Varies between – 1 (perfect negative linear correlation) and 1.
Lecture 10: Correlation and Regression Model.
Chapter 22: Building Multiple Regression Models Generalization of univariate linear regression models. One unit of data with a value of dependent variable.
Lecture 6: Multiple Linear Regression Adjusted Variable Plots BMTRY 701 Biostatistical Methods II.
28. Multiple regression The Practice of Statistics in the Life Sciences Second Edition.
Lecture 6: Multiple Linear Regression Adjusted Variable Plots BMTRY 701 Biostatistical Methods II.
Linear Models Alan Lee Sample presentation for STATS 760.
Tutorial 5 Thursday February 14 MBP 1010 Kevin Brown.
Lecturer: Ing. Martina Hanová, PhD.. Regression analysis Regression analysis is a tool for analyzing relationships between financial variables:  Identify.
BPS - 5th Ed. Chapter 231 Inference for Regression.
Bivariate Regression. Bivariate Regression analyzes the relationship between two variables. Bivariate Regression analyzes the relationship between two.
Lecture 11: Simple Linear Regression
Chapter 12: Regression Diagnostics
Simple Linear Regression
Simple Linear Regression
CHAPTER 12 More About Regression
Presentation transcript:

SPH 247 Statistical Analysis of Laboratory Data April 9, 2013SPH 247 Statistical Analysis of Laboratory Data1

Quantitative Prediction Regression analysis is the statistical name for the prediction of one quantitative variable (fasting blood glucose level) from another (body mass index) Items of interest include whether there is in fact a relationship and what the expected change is in one variable when the other changes April 9, 2013SPH 247 Statistical Analysis of Laboratory Data2

Assumptions Inference about whether there is a real relationship or not is dependent on a number of assumptions, many of which can be checked When these assumptions are substantially incorrect, alterations in method can rescue the analysis No assumption is ever exactly correct April 9, 2013SPH 247 Statistical Analysis of Laboratory Data3

Linearity This is the most important assumption If x is the predictor, and y is the response, then we assume that the average response for a given value of x is a linear function of x E(y) = a + bx y = a + bx + ε ε is the error or variability April 9, 2013SPH 247 Statistical Analysis of Laboratory Data4

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data5

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data6

In general, it is important to get the model right, and the most important of these issues is that the mean function looks like it is specified If a linear function does not fit, various types of curves can be used, but what is used should fit the data Otherwise predictions are biased April 9, 2013SPH 247 Statistical Analysis of Laboratory Data7

Independence It is assumed that different observations are statistically independent If this is not the case inference and prediction can be completely wrong There may appear to be a relationship even though there is not Randomization and then controlling the treatment assignment prevents this in general April 9, 2013SPH 247 Statistical Analysis of Laboratory Data8

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data9

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data10

Note no relationship between x and y These data were generated as follows: April 9, 2013 SPH 247 Statistical Analysis of Laboratory Data11

Constant Variance Constant variance, or homoscedacticity, means that the variability is the same in all parts of the prediction function If this is not the case, the predictions may be on the average correct, but the uncertainties associated with the predictions will be wrong Heteroscedacticity is non-constant variance April 9, 2013SPH 247 Statistical Analysis of Laboratory Data12

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data13

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data14

Consequences of Heteroscedacticity Predictions may be unbiased (correct on the average) Prediction uncertainties are not correct; too small sometimes, too large others Inferences are incorrect (is there any relationship or is it random?) April 9, 2013SPH 247 Statistical Analysis of Laboratory Data15

Normality of Errors Mostly this is not particularly important Very large outliers can be problematic Graphing data often helps If in a gene expression array experiment, we do 40,000 regressions, graphical analysis is not possible Significant relationships should be examined in detail April 9, 2013SPH 247 Statistical Analysis of Laboratory Data16

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data17

Statistical Lab Books You should keep track of what things you try The eventual analysis is best recorded in a file of commands so it can later be replicated Plots should also be produced this way, at least in final form, and not done on the fly Otherwise, when the paper comes back for review, you may not even be able to reproduce your own analysis April 9, 2013SPH 247 Statistical Analysis of Laboratory Data18

Fluorescein Example Standard aqueous solutions of fluorescein (in pg/ml) are examined in a fluorescence spectrometer and the intensity (arbitrary units) is recorded What is the relationship of intensity to concentration Use later to infer concentration of labeled analyte April 9, 2013SPH 247 Statistical Analysis of Laboratory Data19 Concentration (pg/ml) Intensity

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data20 > fluor.lm <- lm(intensity ~ concentration) > summary(fluor.lm) Call: lm(formula = intensity ~ concentration) Residuals: Coefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) ** concentration e-08 *** --- Signif. codes: 0 `***' `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1 Residual standard error: on 5 degrees of freedom Multiple R-Squared: , Adjusted R-squared: F-statistic: 2228 on 1 and 5 DF, p-value: 8.066e-08

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data21 Use of the calibration curve

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data22

Measurement and Calibration Essentially all things we measure are indirect The thing we wish to measure produces an observed transduced value that is related to the quantity of interest but is not itself directly the quantity of interest Calibration takes known quantities, observes the transduced values, and uses the inferred relationship to quantitate unknowns April 9, 2013SPH 247 Statistical Analysis of Laboratory Data23

Measurement Examples Weight is observed via deflection of a spring (calibrated) Concentration of an analyte in mass spec is observed through the electrical current integrated over a peak (possibly calibrated) Gene expression is observed via fluorescence of a spot to which the analyte has bound (usually not calibrated) April 9, 2013SPH 247 Statistical Analysis of Laboratory Data24

Correlation Wright peak-flow data set has two measures of peak expiratory flow rate for each of 17 patients in l/min. ISwR library, data(wright) Both are subject to measurement error In ordinary regression, we assume the predictor is known For two measures of the same thing with no error-free gold standard, one can use correlation to measure agreement April 9, 2013SPH 247 Statistical Analysis of Laboratory Data25

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data26 > setwd("c:/td/classes/SPH Spring") > source(“wright.r”) > cor(wright) std.wright mini.wright std.wright mini.wright > wplot1() File wright.r: library(ISwR) data(wright) attach(wright) wplot1 <- function() { plot(std.wright,mini.wright,xlab="Standard Flow Meter", ylab="Mini Flow Meter",lwd=2) title("Mini vs. Standard Peak Flow Meters") wright.lm <- lm(mini.wright ~ std.wright) abline(coef(wright.lm),col="red",lwd=2) } detach(wright)

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data27

Issues with Correlation For any given relationship between two measurement devices, the correlation will depend on the range over which the devices are compared. If we restrict the Wright data to the range , the correlation falls from 0.94 to Correlation only measures linear agreement April 9, 2013SPH 247 Statistical Analysis of Laboratory Data28

April 9, 2013SPH 247 Statistical Analysis of Laboratory Data29

Measurement with no Gold Standard April 9, 2013SPH 247 Statistical Analysis of Laboratory Data30