So far... We have been estimating differences caused by application of various treatments, and determining the probability that an observed difference.

Presentation transcript:

So far...
- We have been estimating differences caused by the application of various treatments, and determining the probability that an observed difference was due to chance
- The presence of interactions may indicate that two or more treatment factors have a joint effect on a response variable
- But we have not learned anything about how two (or more) variables are related

Types of Variables in Crop Experiments
- Treatments, such as fertilizer rates, varieties, and weed control methods, which are the primary focus of the experiment
- Environmental factors, such as rainfall and solar radiation, which are not within the researcher’s control
- Responses, which represent the biological and physical features of the experimental units that are expected to be affected by the treatments being tested

What is Regression?
- The way that one variable is related to another
- As you change one, how are others affected?
- [Figure: plot with axes labelled Yield and Grain Protein %]
- May want to
  - Develop and test a model for a biological system
  - Predict the values of one variable from another

Usual associations within ANOVA...
- Agronomic experiments frequently consist of different levels of one or more quantitative variables:
  - Varying amounts of fertilizer
  - Several different row spacings
  - Two or more depths of seeding
- It would be useful to develop an equation to describe the relationship between plant response and treatment level
  - the response could then be specified not only for the treatment levels actually tested but also for all intermediate points within the range of those treatments
- The simplest form of response is a straight line

Fitting the Linear Regression Model
[Figure: Wheat Yield (Y) plotted against Applied N Level, with observed points $(X_1, Y_1)$ to $(X_4, Y_4)$ scattered about a fitted line]
$Y = \beta_0 + \beta_1 X + \varepsilon$
where:
- Y = wheat yield
- X = nitrogen level
- $\beta_0$ = yield with no N
- $\beta_1$ = change in yield per unit of applied N
- $\varepsilon$ = random error
Choose the line that minimizes the sum of squared deviations of the observed values from the line (predicted values)
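As a rough illustration of the least-squares idea on this slide, the sketch below estimates the intercept and slope of a straight line in plain Python. The function name `fit_line` and the nitrogen and yield numbers are hypothetical, chosen only so the arithmetic is easy to follow; they are not data from the presentation.

```python
def fit_line(x, y):
    """Least-squares estimates (b0, b1) for the line Y = b0 + b1 * X."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares of X about its mean
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx            # change in Y per unit of X
    b0 = y_bar - b1 * x_bar   # the fitted line passes through (x_bar, y_bar)
    return b0, b1


# Hypothetical example: applied N and wheat yield lying exactly on a line
n_level = [0, 50, 100, 150]
wheat_yield = [2.0, 3.0, 4.0, 5.0]
b0, b1 = fit_line(n_level, wheat_yield)
print(b0, b1)
```

With real data the points would scatter about the line, and `b0` and `b1` would be the values that make the sum of squared vertical deviations as small as possible.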

Types of regression models
- Model I
  - Values of the independent variable X are controlled by the experimenter
  - X is assumed to be measured without error
  - We measure the response of the dependent variable Y to changes in X
- Model II
  - Both the X and the Y variables are measured and subject to error (e.g., in an observational study)
  - Either variable could be considered as the independent variable; the choice depends on the context of the experiment
  - Often interested in correlations between variables
  - May be descriptive, but might not be reliable for prediction

Sums of Squares due to Regression Because the line passes through

Partitioning SST
- Sums of Squares for Treatments (SST) contains:
  - $SS_{LIN}$ = sum of squares associated with the linear regression of Y on X (with 1 df)
  - $SS_{LOF}$ = sum of squares for the failure of the regression model to describe the relationship between Y and X (lack of fit) (with t-2 df)

One way:
- Find a set of coefficients that define a linear contrast
  - use the deviations of the treatment levels from the mean level of all treatments
  - so that $k_j = X_j - \bar{X}$
- Therefore $L_{LIN} = \sum_j k_j \bar{Y}_j = \sum_j (X_j - \bar{X}) \bar{Y}_j$
- The sum of the coefficients will be zero, satisfying the definition of a contrast

Computing SS LIN
- $SS_{LIN} = r L_{LIN}^2 / \sum_j (X_j - \bar{X})^2$
  - really no different from any other contrast: df is always 1
- $SS_{LOF}$ (sum of squares for lack of fit) is computed by subtraction
  - $SS_{LOF} = SST - SS_{LIN}$ (df is df for treatments - 1)
- Not to be confused with SSE, which is still the SS for pure error (experimental error)
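The computation on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the contrast is taken over treatment means, the helper names are made up for the example, and the five means and r = 4 are hypothetical numbers rather than results from the presentation.

```python
def linear_contrast_ss(levels, means, r):
    """Return (k, L_lin, SS_lin) for the linear contrast on treatment means."""
    x_bar = sum(levels) / len(levels)
    k = [x - x_bar for x in levels]                  # deviations from mean level
    l_lin = sum(kj * m for kj, m in zip(k, means))   # contrast value
    ss_lin = r * l_lin ** 2 / sum(kj ** 2 for kj in k)
    return k, l_lin, ss_lin


def treatment_ss(means, r):
    """Sum of squares for treatments (SST) from equally replicated means."""
    grand = sum(means) / len(means)
    return r * sum((m - grand) ** 2 for m in means)


# Hypothetical treatment means at five N levels, four replications each
levels = [30, 60, 90, 120, 150]
means = [10.0, 14.0, 17.0, 19.0, 20.0]
r = 4

k, l_lin, ss_lin = linear_contrast_ss(levels, means, r)
sst = treatment_ss(means, r)
ss_lof = sst - ss_lin        # lack of fit, by subtraction
df_lof = len(levels) - 2     # (t - 1) - 1
print(ss_lin, ss_lof, df_lof)
```

For these made-up means most of the treatment sum of squares falls into the linear component, and the remainder is what the lack-of-fit test examines.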

F Ratios and their meaning
- All F ratios have MSE as a denominator
- $F_T = MST/MSE$ tests
  - significance of differences among the treatment means
- $F_{LIN} = MS_{LIN}/MSE$ tests
  - $H_0$: no linear relationship between X and Y ($\beta_1 = 0$)
  - $H_a$: there is a linear relationship between X and Y ($\beta_1 \neq 0$)
- $F_{LOF} = MS_{LOF}/MSE$ tests
  - $H_0$: the simple linear regression model describes the data, $E(Y) = \beta_0 + \beta_1 X$
  - $H_a$: there is significant deviation from a linear relationship between X and Y, $E(Y) \neq \beta_0 + \beta_1 X$

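The linear relationship
- The expected value of Y given X is described by the equation:
  $\hat{Y} = \bar{Y} + b_1 (X_j - \bar{X})$
- where:
  - $\bar{Y}$ = grand mean of Y
  - $X_j$ = value of X (treatment level) at which Y is estimated
  - $b_1$ = estimated slope, and $\bar{X}$ = mean of the treatment levels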

Orthogonal Polynomials
- If the relationship is not linear, we can simplify curve fitting within the ANOVA with the use of orthogonal polynomial coefficients under these conditions:
  - equal replication
  - the levels of the treatment variable must be equally spaced, e.g., 20, 40, 60, 80, 100 kg of fertilizer per plot

Curve fitting
- Model: $E(Y) = \beta_0 + \beta_1 X + \beta_2 X^2 + \beta_3 X^3 + \dots$
- Determine the coefficients for 2nd order and higher polynomials from a table
- Use the F ratio to test the significance of each contrast
- Unless there is prior reason to believe that the equation is of a particular order, it is customary to fit the terms sequentially
- Include all terms in the equation up to and including the term at which lack of fit first becomes nonsignificant
- Table of coefficients

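Where do linear contrast coefficients come from? (revisited)
- Assume 5 nitrogen levels: 30, 60, 90, 120, 150
  - $\bar{x} = 90$
  - $k_1 = (-60, -30, 0, 30, 60)$
- If we code the treatments as 1, 2, 3, 4, 5
  - $\bar{x} = 3$
  - $k_1 = (-2, -1, 0, 1, 2)$
- $b_1 = L_{LIN} / \sum_j (x_j - \bar{x})^2$ when $L_{LIN}$ is computed from treatment means (divide by r as well if it is computed from treatment totals), but a slope obtained from coded levels must be decoded back to the original scale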
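To make the coding and decoding step concrete, here is a small sketch that computes the slope both ways. It assumes the contrast is taken over treatment means, so the slope is the contrast divided by the sum of squared coefficients; the function name and the five means are hypothetical.

```python
def slope_from_means(levels, means):
    """Slope b1 = L / sum(k^2), with k the deviations of levels from their mean."""
    x_bar = sum(levels) / len(levels)
    k = [x - x_bar for x in levels]
    contrast = sum(kj * m for kj, m in zip(k, means))
    return contrast / sum(kj ** 2 for kj in k)


means = [10.0, 14.0, 17.0, 19.0, 20.0]   # hypothetical treatment means
original = [30, 60, 90, 120, 150]        # N levels on the original scale
coded = [1, 2, 3, 4, 5]                  # the same levels, coded

spacing = original[1] - original[0]      # one coded unit = 30 units of N

b1_coded = slope_from_means(coded, means)       # change in Y per coded unit
b1_decoded = b1_coded / spacing                 # change in Y per unit of N
b1_direct = slope_from_means(original, means)   # computed on original scale

print(b1_coded, b1_decoded, b1_direct)
```

The coded coefficients are easier to work with by hand, and dividing by the spacing between levels recovers the slope that the original-scale coefficients give directly.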

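Consider an experiment
- Five levels of N (10, 30, 50, 70, 90) with four replications
- Linear contrast
  - $L_{LIN} = -2\bar{Y}_1 - \bar{Y}_2 + 0\bar{Y}_3 + \bar{Y}_4 + 2\bar{Y}_5$
  - $SS_{LIN} = 4 L_{LIN}^2 / 10$
- Quadratic
  - $L_{QUAD} = 2\bar{Y}_1 - \bar{Y}_2 - 2\bar{Y}_3 - \bar{Y}_4 + 2\bar{Y}_5$
  - $SS_{QUAD} = 4 L_{QUAD}^2 / 14$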

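LOF still significant? Keep going…
- Cubic
  - $L_{CUB} = -\bar{Y}_1 + 2\bar{Y}_2 + 0\bar{Y}_3 - 2\bar{Y}_4 + \bar{Y}_5$
  - $SS_{CUB} = 4 L_{CUB}^2 / 10$
- Quartic
  - $L_{QUAR} = \bar{Y}_1 - 4\bar{Y}_2 + 6\bar{Y}_3 - 4\bar{Y}_4 + \bar{Y}_5$
  - $SS_{QUAR} = 4 L_{QUAR}^2 / 70$
- Each contrast has 1 degree of freedom
- Each F has MSE in denominator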
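The divisors 10, 14, 10 and 70 used above are the sums of squared coefficients of the standard orthogonal polynomial contrasts for five equally spaced levels. The sketch below checks that these coefficient sets are contrasts, that they are mutually orthogonal, and then computes each sum of squares. The treatment means are hypothetical; only the coefficients and r = 4 follow the slides.

```python
from itertools import combinations

# Standard orthogonal polynomial coefficients for 5 equally spaced levels
COEFFS = {
    "linear": [-2, -1, 0, 1, 2],
    "quadratic": [2, -1, -2, -1, 2],
    "cubic": [-1, 2, 0, -2, 1],
    "quartic": [1, -4, 6, -4, 1],
}


def contrast_ss(k, means, r):
    """Sum of squares (1 df) for contrast k applied to treatment means."""
    contrast = sum(kj * m for kj, m in zip(k, means))
    return r * contrast ** 2 / sum(kj ** 2 for kj in k)


# Each set sums to zero, so each is a contrast
sums_to_zero = all(sum(k) == 0 for k in COEFFS.values())

# Every pair has a zero cross-product, so the contrasts are orthogonal
orthogonal = all(
    sum(a * b for a, b in zip(COEFFS[p], COEFFS[q])) == 0
    for p, q in combinations(COEFFS, 2)
)

# Divisors used in the SS formulas
divisors = {name: sum(kj ** 2 for kj in k) for name, k in COEFFS.items()}

means = [10.0, 14.0, 17.0, 19.0, 20.0]   # hypothetical treatment means
ss = {name: contrast_ss(k, means, r=4) for name, k in COEFFS.items()}
print(divisors)
print(ss)
```

Because the contrasts are orthogonal, each sum of squares can be tested separately against MSE without double-counting any of the treatment variation.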

Numerical Example
- An experiment to determine the effect of nitrogen on the yield of sugarbeet roots:
  - RBD
  - three blocks
  - 5 levels of N (0, 35, 70, 105, and 140 kg/ha)
- Meets the criteria
  - N is a quantitative variable
  - levels are equally spaced
  - equally replicated
- Significant SST so we go to contrasts

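Orthogonal Partition of SST
[Table: one column per N level (kg/ha); rows for the treatment means and for the Linear, Quadratic, Cubic and Quartic contrast coefficients; further columns for $L_i$, $\sum_j k_j^2$ and $SS(L)_i$. The numerical entries are not preserved in this transcript.]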

Sequential Test of Nitrogen Effects
[Table with columns Source, df, SS, MS, F; only the significance of each F is preserved in this transcript]
(1) Nitrogen      **
(2) Linear        **
    Dev (LOF)     **
(3) Quadratic     **
    Dev (LOF)     ns
- Choose a quadratic model
  - First point at which the LOF is not significant
  - Implies that a cubic term would not be significant
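The sequential procedure can be written as a short loop: add one polynomial term at a time, move its sum of squares out of the remainder, and stop at the first order where the lack-of-fit F is not significant. The sketch below is one possible reading of that rule, not the presenter's code. The function name, the sums of squares and the MSE are hypothetical; the critical values are the usual tabled 5% points of F for 8 error df, which is what an RBD with 5 treatments and 3 blocks would have.

```python
def choose_order(contrast_ss, sst, mse, f_crit):
    """Return the lowest polynomial order whose lack of fit is nonsignificant.

    contrast_ss: SS for linear, quadratic, ... contrasts, in that order
    sst:         sum of squares for treatments
    mse:         mean square for pure (experimental) error
    f_crit:      critical F values keyed by lack-of-fit df (assumed supplied
                 by the user from a table for the chosen alpha and error df)
    """
    df_treatments = len(contrast_ss)
    remaining = sst
    for order, ss in enumerate(contrast_ss, start=1):
        remaining -= ss                  # SS left over = lack of fit
        df_lof = df_treatments - order
        if df_lof == 0:
            return order                 # nothing left to test
        f_lof = (remaining / df_lof) / mse
        if f_lof <= f_crit[df_lof]:
            return order                 # lack of fit no longer significant
    return df_treatments


# Hypothetical sums of squares for 5 treatments (4 df) and a hypothetical MSE
contrast_ss = [250.0, 60.0, 2.0, 1.0]
sst = sum(contrast_ss)
mse = 2.0

# Tabled F(0.05) for 1, 2 and 3 numerator df with 8 error df
f_crit_8 = {1: 5.32, 2: 4.46, 3: 4.07}

order = choose_order(contrast_ss, sst, mse, f_crit_8)
print(order)
```

With these made-up numbers the deviations from the linear model are significant but the deviations from the quadratic model are not, so the loop stops at order 2, mirroring the pattern in the table above.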

Regression Equation
- $b_i = L_{REG} / \sum_j k_j^2$
- Coefficient: $b_0$, $b_1$, $b_2$
- To scale to original X values
- Easier way
  1) use contrasts to find the best model and estimate pure error
  2) get the equation from a graph or from regression analysis
- Useful for prediction

Common misuse of regression...
- Broad generalization
  - Extrapolating the result of a regression line outside the range of X values tested
  - Don’t go beyond the highest nitrogen rate tested, for example
  - Or don’t generalize over all varieties when you have just tested one
- Do not overinterpret higher order polynomials
  - with t-1 df, they will explain all of the variation among treatments, whether there is any meaningful pattern to the data or not
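The warning about higher order polynomials can be checked numerically. In the sketch below, five treatment means with no obvious trend (hypothetical numbers, as is r = 3) are partitioned with the four orthogonal polynomial contrasts for five equally spaced levels. The four single-df sums of squares add up to the whole treatment sum of squares, so a quartic "fits" perfectly even though nothing about the pattern is meaningful.

```python
# Orthogonal polynomial coefficients for 5 equally spaced levels
COEFFS = [
    [-2, -1, 0, 1, 2],    # linear
    [2, -1, -2, -1, 2],   # quadratic
    [-1, 2, 0, -2, 1],    # cubic
    [1, -4, 6, -4, 1],    # quartic
]


def contrast_ss(k, means, r):
    """Sum of squares (1 df) for contrast k applied to treatment means."""
    contrast = sum(kj * m for kj, m in zip(k, means))
    return r * contrast ** 2 / sum(kj ** 2 for kj in k)


def treatment_ss(means, r):
    """Sum of squares for treatments from equally replicated means."""
    grand = sum(means) / len(means)
    return r * sum((m - grand) ** 2 for m in means)


means = [5.0, 9.0, 2.0, 8.0, 4.0]   # hypothetical, patternless means
r = 3

parts = [contrast_ss(k, means, r) for k in COEFFS]
sst = treatment_ss(means, r)

# Proportion of treatment variation "explained" by all t - 1 terms together
explained = sum(parts) / sst
print(parts, sst, explained)
```

Here nearly all of the variation lands in the quartic term, which is a hint that the polynomial is tracing noise rather than describing a response curve.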

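Class vs nonclass variables
- General linear model in matrix notation: $Y = X\beta + \varepsilon$
- X is the design matrix
  - Assume a CRD with 3 fertilizer treatments, 2 replications
- [Figure: three versions of the design matrix, side by side]
  - ANOVA (class variables): columns $\mu$, $x_1$, $x_2$, $x_3$
  - Orthogonal polynomials: columns $\mu$, $L_1$, $L_2$
  - Regression (continuous variables): columns $b_0$, $x$, $x^2$
- In the ANOVA matrix one class-variable column is dropped - it provides no additional information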
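The three design matrices on this slide can be written out for the CRD with 3 fertilizer treatments and 2 replications. The sketch below is an assumption-laden reconstruction, since the matrix entries are not in the transcript: the fertilizer rates 0, 50 and 100 are hypothetical, and the polynomial columns use the standard coefficients for three equally spaced levels.

```python
# Treatment label of each of the 6 plots (3 treatments x 2 replications)
treatments = [1, 1, 2, 2, 3, 3]

# ANOVA (class variables): intercept plus one indicator column per treatment
anova_X = [[1] + [1 if t == j else 0 for j in (1, 2, 3)] for t in treatments]

# Orthogonal polynomials: intercept, linear and quadratic contrast columns
L1 = {1: -1, 2: 0, 3: 1}
L2 = {1: 1, 2: -2, 3: 1}
poly_X = [[1, L1[t], L2[t]] for t in treatments]

# Regression (continuous variables): intercept, x and x squared
rate = {1: 0, 2: 50, 3: 100}   # hypothetical fertilizer rates
reg_X = [[1, rate[t], rate[t] ** 2] for t in treatments]


def column(matrix, j):
    return [row[j] for row in matrix]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# The three indicator columns add up to the intercept column, so one of
# them carries no additional information and can be dropped
redundant = all(row[1] + row[2] + row[3] == row[0] for row in anova_X)

# The polynomial columns are orthogonal to the intercept and to each other
poly_orthogonal = (
    dot(column(poly_X, 0), column(poly_X, 1)) == 0
    and dot(column(poly_X, 0), column(poly_X, 2)) == 0
    and dot(column(poly_X, 1), column(poly_X, 2)) == 0
)

for row_a, row_p, row_r in zip(anova_X, poly_X, reg_X):
    print(row_a, row_p, row_r)
```

All three matrices span the same treatment information with 3 independent columns; they differ only in how the 2 treatment df are parameterized.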