Design of Experiments and Data Analysis. Let’s Work an Example Data obtained from MS Thesis Studied the “bioavailability” of metals in sediment cores.

Presentation transcript:

Design of Experiments and Data Analysis

Let’s Work an Example
Data obtained from an MS thesis
The thesis studied the “bioavailability” of metals in sediment cores
We’ll analyze the chromium data

Pt. Mugu Marsh

Analytical Techniques
Sediment samples were taken with cores
Cores were sliced into 1 cm slices
Sediment in each slice was extracted using a strong acid
Extracts were analyzed using an Inductively Coupled Plasma Mass Spectrometer (ICP-MS)
Calibrations were also conducted
Surface areas (SA) and organic carbon (OC) contents of the sediment in each slice were also measured

Core processing: 1-cm slices; organic carbon; surface areas; Tessier extractions

Objectives
To determine if there is a correlation between sediment surface area and organic carbon content
To determine if there is a relationship between the concentration of a specific metal and sediment SA and/or OC
To determine if there is a relationship between or among metal concentrations

Example of Results

Data File
Create a folder entitled “REU” in the C:\My Documents folder
Create a folder entitled “2006” in this REU folder
Create a folder entitled “Data Analysis Workshop” in this 2006 folder
Download the Excel file REU_dataanalysis_data.xls from instructional1.calstatela.edu/ckhachi into the Data Analysis Workshop folder
Open the file

Data File Structure
There should be 2 worksheets in the workbook:
–Data: raw SA, OC, and metals concentration data
–Calibration Curves: ICP-MS calibration data (relating raw metals concentrations to known calibration concentrations)
Data for the cores are separated by yellow bands

Data File Structure
Data columns include:
–ID: random sample ID
–Ave Depth: average depth of each slice
–Solid Mass: mass of sediment in each slice
–Raw ICP-MS data for each of five metals
Calibration columns include:
–Conc: concentration of standards in parts per billion (ppb)
–ICP-MS responses for the 5 metals

Let’s Start with Calibration Curves
Most instruments have linear responses over reasonable ranges (i.e., calibration curves are straight lines)
We need to “model” the data: regression analysis determines the best-fit line that relates ICP-MS response to concentration
We will then use these calibration equations to calculate concentrations for our samples
Note: because we know that calibrations are usually linear, we will choose a linear regression model. If you don’t know the relationship between 2 variables, it sometimes helps to start with plots
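The best-fit line described above can be sketched in a few lines of Python. This is a minimal illustration, not the workshop's Excel procedure: the concentrations and responses are made-up numbers, and `fit_line` is a hypothetical helper name.

```python
# Hypothetical calibration standards (NOT the workshop's data):
# concentration in ppb and the corresponding instrument response.
conc = [0.0, 1.0, 5.0, 10.0, 25.0, 50.0, 100.0]
response = [120.0, 1150.0, 5080.0, 10210.0, 25100.0, 49800.0, 100300.0]

def fit_line(x, y):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_mean - m * x_mean
    return m, b

m, b = fit_line(conc, response)
print(f"slope m = {m:.2f} per ppb, intercept b = {b:.1f}")
```

Seven standards are used here so that the fit has n − 2 = 5 degrees of freedom, matching the df quoted later in the slides.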

Calibration Curve for Cr
Linear response
We know the slope and intercept
R² value provided
Best-fit line drawn (looks good to me)
Not enough statistical information provided to be able to conduct a proper error analysis

Regression Analysis for Cr: rename the worksheet “Cr Analysis”

Assumptions
On average, errors are neither consistently positive nor consistently negative.
–Linear model: $y_i = m x_i + b + e_i$, where $e_i$ is the error associated with each observation
–Line goes through the middle of the data
Variance of the error terms is the same across all observations
Data are independent of each other
Error terms are normally distributed (not that important)

Residual Plot
Look at the data and the linear fit carefully; points lie above the line for smaller values of concentration.
If you delete the last point, you get a very different result.
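A residual is the observed response minus the response predicted by the fitted line. The sketch below computes residuals for made-up calibration data and refits the line without the last point to see how much one standard can move the result. The data and the helper names `fit_line` and `residuals` are assumptions for illustration.

```python
# Hypothetical calibration data (made up for illustration only).
conc = [0.0, 1.0, 5.0, 10.0, 25.0, 50.0, 100.0]
response = [120.0, 1150.0, 5080.0, 10210.0, 25100.0, 49800.0, 100300.0]

def fit_line(x, y):
    """Ordinary least-squares fit of y = m*x + b; returns (m, b)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    return m, y_mean - m * x_mean

def residuals(x, y):
    """Observed y minus the y predicted by the least-squares line."""
    m, b = fit_line(x, y)
    return [yi - (m * xi + b) for xi, yi in zip(x, y)]

res = residuals(conc, response)
for c, r in zip(conc, res):
    print(f"conc = {c:6.1f} ppb   residual = {r:+9.2f}")

# Refit without the highest standard to see how sensitive the line is.
m_all, b_all = fit_line(conc, response)
m_cut, b_cut = fit_line(conc[:-1], response[:-1])
print(f"all points:      m = {m_all:.2f}, b = {b_all:.1f}")
print(f"last point cut:  m = {m_cut:.2f}, b = {b_cut:.1f}")
```

Plotting `res` against `conc` gives the residual plot; a run of residuals with the same sign at one end suggests the straight line is not capturing the trend there.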

Regression Statistics
Multiple R (or just r) is the correlation:
–+1 → perfectly positively correlated (as x goes up, so does y)
–0 → not correlated
–-1 → perfectly negatively correlated (as x goes up, y goes down)
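For the first objective (surface area versus organic carbon), r can be computed directly from its definition. The sketch below uses invented SA and OC values, and `pearson_r` is a hypothetical helper name.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    syy = sum((yi - y_mean) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical surface areas and organic carbon contents (not the thesis data)
surface_area = [4.2, 5.1, 6.3, 7.0, 8.4, 9.9]
organic_carbon = [0.8, 1.1, 1.2, 1.6, 1.7, 2.3]

print(f"r = {pearson_r(surface_area, organic_carbon):.3f}")
```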

Regression Statistics
R Square (R²): coefficient of determination
–Between 0 and 1: 0 → no linear relationship; 1 → perfect linear relationship (+ or -)
–Square of the r value
–R² never decreases when more predictor variables are added to a model, even if they add no real explanatory power
Adjusted R Square: corrects for this by accounting for the number of predictors relative to the number of data points; it is probably a better measure of how strong the linear relationship is (R² is more commonly reported)
Use 2 or 3 significant figures to report these numbers
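Both statistics can be computed from the residual and total sums of squares. The sketch below uses the standard adjusted R² formula, 1 − (1 − R²)(n − 1)/(n − k − 1) with k predictors (k = 1 for a straight line); the data and function names are assumptions for illustration.

```python
def r_squared(x, y):
    """Coefficient of determination for a least-squares straight line."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_mean - m * x_mean
    ss_res = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k=1):
    """Adjusted R^2 for n data points and k predictor variables."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Hypothetical data, chosen only to make the arithmetic easy to follow
x = [0.0, 1.0, 2.0, 3.0]
y = [0.0, 1.0, 2.0, 4.0]

r2 = r_squared(x, y)
adj = adjusted_r_squared(r2, n=len(x))
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj:.3f}")
```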

Regression Statistics
Standard Error: a measure of the amount of error in the prediction of y for an individual x.
Observations: number of data points

ANOVA
ANalysis Of VAriance (sometimes called an F test)
df: degrees of freedom
SS: sum of squares; $R^2 = 1 - SS_{\text{residual}}/SS_{\text{total}}$
MS: mean squares = SS/df
F = MS regression/MS residual; a larger F → reject the null hypothesis (no correlation)
Not very useful for a single treatment
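The entries of the ANOVA table follow directly from the fitted line. A minimal sketch for a one-predictor regression (df regression = 1, df residual = n − 2), using invented data and a hypothetical `anova_table` helper:

```python
def anova_table(x, y):
    """Return the ANOVA quantities for a least-squares straight line."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_mean - m * x_mean
    y_hat = [m * xi + b for xi in x]

    ss_reg = sum((yh - y_mean) ** 2 for yh in y_hat)        # explained by the line
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # left over
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)

    df_reg, df_res = 1, n - 2
    ms_reg = ss_reg / df_reg
    ms_res = ss_res / df_res
    return {
        "SS_regression": ss_reg, "SS_residual": ss_res, "SS_total": ss_tot,
        "df_regression": df_reg, "df_residual": df_res,
        "MS_regression": ms_reg, "MS_residual": ms_res,
        "F": ms_reg / ms_res,
        "R2": 1.0 - ss_res / ss_tot,
    }

# Hypothetical data for illustration only
table = anova_table([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 4.0])
for name, value in table.items():
    print(f"{name:>14}: {value:.4f}")
```

For a straight-line fit, SS regression + SS residual equals SS total, which is a quick check on the arithmetic.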

Correlation results
Linear calibration: y = mx + b
–Slope (m) = (value shown on the original slide)
–Intercept (b) = (value shown on the original slide)
Standard Error: used for hypothesis testing and confidence band formation

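Correlation results
Confidence intervals (numeric values were shown on the original slide)
–Intercept: Lower = Coefficient − Standard Error × 2.571; Upper = Coefficient + Standard Error × 2.571, where 2.571 comes from a standard two-tailed t-test table with df = 5 and probability = 0.05
–Slope: Lower = Coefficient − Standard Error × 2.571; Upper = Coefficient + Standard Error × 2.571
t stat = Coefficient/Standard Error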
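The standard errors, confidence limits, and t statistics can be reproduced with the usual simple-regression formulas. In this sketch the seven calibration points are invented (so df = 7 − 2 = 5, which is why the tabulated value 2.571 applies), and `fit_with_errors` is a hypothetical helper name.

```python
import math

def fit_with_errors(x, y):
    """Least-squares line plus standard errors: returns (m, b, se_m, se_b)."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = y_mean - m * x_mean
    ss_res = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(ss_res / (n - 2))                 # standard error of the regression
    se_m = s / math.sqrt(sxx)                       # standard error of the slope
    se_b = s * math.sqrt(1.0 / n + x_mean ** 2 / sxx)  # standard error of the intercept
    return m, b, se_m, se_b

# Hypothetical calibration data: 7 standards, so df = 5
conc = [0.0, 1.0, 5.0, 10.0, 25.0, 50.0, 100.0]
response = [120.0, 1150.0, 5080.0, 10210.0, 25100.0, 49800.0, 100300.0]

T_CRIT = 2.571  # two-tailed t value for df = 5, probability = 0.05 (from the slide)

m, b, se_m, se_b = fit_with_errors(conc, response)
for name, coef, se in (("intercept", b, se_b), ("slope", m, se_m)):
    lower = coef - T_CRIT * se
    upper = coef + T_CRIT * se
    print(f"{name}: {coef:.2f}  95% CI [{lower:.2f}, {upper:.2f}]  t stat = {coef / se:.1f}")
```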

Correlation results
P-value: the probability of obtaining a result at least as extreme as the one observed if the null hypothesis (H₀; in this case, no correlation) is in fact true
–p > 0.10 → null hypothesis may be OK
–0.05 < p < 0.10 → slight evidence against the null hypothesis
–0.01 < p < 0.05 → moderate evidence against the null hypothesis
–p < 0.01 → strong evidence against the null hypothesis

Linear Model: Correlation results
Consult statistical tables again:
–For df = 5 and t stat = 25.4, p < (value shown on the original slide)
–For df = 5 and t stat = 71.4, p < (value shown on the original slide)
Very, very strong evidence that H₀ is false → the calibration curves are linear!
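Instead of a printed table, a two-tailed p-value can be estimated by numerically integrating the Student t density. The sketch below uses only the Python standard library and Simpson's rule; `t_pdf` and `two_tailed_p` are hypothetical helper names, and the approach is an illustration rather than how Excel computes its P-value column.

```python
import math

def t_pdf(t, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)

def two_tailed_p(t_stat, df, steps=20000):
    """Two-tailed p-value via Simpson's rule (steps must be even)."""
    t = abs(t_stat)
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = 2 * total * h / 3      # P(-t < T < t), using symmetry about 0
    return max(0.0, 1.0 - central)

# t statistics quoted on the slide, with df = 5
for t_stat in (25.4, 71.4):
    print(f"t stat = {t_stat}: p = {two_tailed_p(t_stat, 5):.2e}")
```

As a sanity check, the tabulated critical value 2.571 for df = 5 should give a p-value very close to 0.05.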

Using Calibration Equations
Now we have an equation that relates the response of our equipment to concentrations
Let’s use this equation to determine concentrations in our samples

Raw Data Excel Sheet

Measurement Errors
Add 2 columns to the right of the Cr data
Assume the instrument has a 3% error (in reality, you need to run each sample 3 times to get the proper error)

Propagation of Errors
Let us assume that X depends on the experimental variables p, q, and r, which fluctuate in a random and independent way.
Addition or subtraction, X = p + q - r: $s_X = \sqrt{s_p^2 + s_q^2 + s_r^2}$
where “s” is the standard deviation or error for each of the variables

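Propagation of Errors (cont’d)
Multiplication or division, X = p * (q/r): $\frac{s_X}{X} = \sqrt{\left(\frac{s_p}{p}\right)^2 + \left(\frac{s_q}{q}\right)^2 + \left(\frac{s_r}{r}\right)^2}$, where “s” is the standard deviation or error for each of the variables
Other equations exist for logs, etc.
Round +/- results to the number of decimal places of the component number with the fewest decimal places
Round x/÷ results to the number of significant digits of the component number with the fewest significant digits.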
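The two standard propagation rules for independent random errors (absolute errors add in quadrature for +/-, relative errors add in quadrature for x/÷) can be written as small helpers. The function names and the example numbers are assumptions for illustration.

```python
import math

def err_add_sub(*errors):
    """Error in X = p + q - r: absolute errors combine in quadrature."""
    return math.sqrt(sum(s ** 2 for s in errors))

def err_mul_div(result, *value_error_pairs):
    """Error in X = p * (q / r): relative errors combine in quadrature.

    result is the computed X; each pair is (value, error) for one variable.
    """
    rel = math.sqrt(sum((s / v) ** 2 for v, s in value_error_pairs))
    return abs(result) * rel

# Hypothetical example: X = p + q - r
p, q, r = 10.0, 4.0, 2.0
print("X =", p + q - r, "±", err_add_sub(0.3, 0.4, 0.0))

# Hypothetical example: X = p * (q / r), each variable with a 10% error
x = p * (q / r)
print("X =", x, "±", err_mul_div(x, (p, 1.0), (q, 0.4), (r, 0.2)))
```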

Let’s use the Calibration Eqn
Response = m × Conc + b
Response → detector output
Concentration → what we are looking for, in the column labeled “Cr Conc (ppb)”

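Let’s use the Calibration Eqn
Let’s look at the first line: Response = m × Conc + b (numeric values were shown on the original slide)
Rearrange to solve for Conc: Conc = (Response − b)/m
Let’s look at the numerator: Num = Response − b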

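Let’s use the Calibration Eqn
Num = Response − b (numeric values were shown on the original slide)
Error in Num:
–Recall for +/-: $s_{\text{Num}} = \sqrt{s_{\text{Response}}^2 + s_b^2}$
So now: Conc = Num/m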

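Let’s use the Calibration Eqn
Conc = Num/m (numeric values were shown on the original slide)
Recall, for x/÷: $\frac{s_{\text{Conc}}}{\text{Conc}} = \sqrt{\left(\frac{s_{\text{Num}}}{\text{Num}}\right)^2 + \left(\frac{s_m}{m}\right)^2}$ or $s_{\text{Conc}} = \text{Conc}\sqrt{\left(\frac{s_{\text{Num}}}{\text{Num}}\right)^2 + \left(\frac{s_m}{m}\right)^2}$
So, Err Conc = $s_{\text{Conc}}$
Final result → Conc ± 1.04 (the concentration value itself was shown on the original slide)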
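The whole calculation, from raw response to concentration with its error, can be sketched as one function. Every number below is invented (the slide's own values did not survive in the transcript), `conc_with_error` is a hypothetical name, and the errors in slope and intercept are treated as independent, which is a simplifying assumption.

```python
import math

def conc_with_error(response, s_response, m, s_m, b, s_b):
    """Invert Response = m*Conc + b and propagate the errors.

    Assumes the errors in response, m, and b are random and independent.
    """
    num = response - b                                   # numerator
    s_num = math.sqrt(s_response ** 2 + s_b ** 2)        # rule for +/-
    conc = num / m
    rel = math.sqrt((s_num / num) ** 2 + (s_m / m) ** 2)  # rule for x/÷
    return conc, abs(conc) * rel

# Hypothetical sample and calibration values (not from the workshop file)
response = 5000.0
s_response = 0.03 * response      # the assumed 3% instrument error
conc, s_conc = conc_with_error(response, s_response,
                               m=1000.0, s_m=10.0, b=100.0, s_b=20.0)
print(f"Cr conc = {conc:.2f} ± {s_conc:.2f} ppb")
```

Applying the function to each row of the raw data fills in the concentration column and its error column together.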

Final Results: use error bars in the plots

Plotting Error Bars
Error bars can be:
–1-3 standard deviation(s)
–Standard error
–etc…
Just be clear in your figure caption about what your error bars represent
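The difference between the two most common choices is easy to see numerically: the standard error is the sample standard deviation divided by the square root of the number of replicates. A minimal sketch with invented replicate measurements and a hypothetical `error_bar` helper:

```python
import math
import statistics

def error_bar(values, kind="sd", k=1):
    """Half-length of an error bar for replicate measurements.

    kind="sd": k sample standard deviations; kind="se": standard error of the mean.
    """
    sd = statistics.stdev(values)        # sample standard deviation (n - 1)
    if kind == "sd":
        return k * sd
    if kind == "se":
        return sd / math.sqrt(len(values))
    raise ValueError("kind must be 'sd' or 'se'")

# Hypothetical triplicate Cr measurements for one slice, in ppb
replicates = [4.7, 4.9, 5.2]
mean = statistics.mean(replicates)
print(f"mean = {mean:.2f} ppb")
print(f"±1 SD = {error_bar(replicates, 'sd'):.2f}")
print(f"±2 SD = {error_bar(replicates, 'sd', k=2):.2f}")
print(f"±SE   = {error_bar(replicates, 'se'):.2f}")
```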

Next Presentation
A little about design of experiments
A little more about errors, hypothesis testing, etc…