Fundamental Statistics in Applied Linguistics Research Spring 2010 Weekend MA Program on Applied English Dr. Da-Fu Huang.

1 Fundamental Statistics in Applied Linguistics Research Spring 2010 Weekend MA Program on Applied English Dr. Da-Fu Huang

2 6. Looking for groups of explanatory variables through multiple regression
Explanatory variables vs. response variables
Multiple regression (MR) examines whether the explanatory variables (EV) we have posited explain much of what is going on in the response variable (RV).
General model: $Y_i = \alpha + \beta_1 x_{i1} + \dots + \beta_k x_{ik} + \text{error}_i$
Example: TOEFL score = a constant (the intercept, $\alpha$) + $\beta_1$ × time spent on English per week + $\beta_2$ × aptitude score + a number that fluctuates for each individual (the error)
MR can also predict how people will score on the response variable in the future.
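As a rough illustration of how the model above turns explanatory variables into a predicted score, here is a minimal Python sketch. The intercept, the two coefficients, and the learner's values are hypothetical numbers chosen for easy arithmetic; they are not estimates from the course data.

```python
def predict(alpha, betas, xs):
    """Predicted response: the intercept plus each coefficient times its EV value."""
    return alpha + sum(beta * x for beta, x in zip(betas, xs))

# Hypothetical values (assumptions for illustration only).
alpha = 400.0            # intercept
betas = [5.0, 2.0]       # coefficients for weekly study hours and aptitude score
learner = [10.0, 30.0]   # one learner: 10 hours per week, aptitude score of 30

predicted = predict(alpha, betas, learner)   # 400 + 5*10 + 2*30
observed = 525.0                             # hypothetical actual TOEFL score
error = observed - predicted                 # the individual's error term

print(f"predicted = {predicted}, error = {error}")
```

The error term is whatever is left of the observed score after the prediction is subtracted, which is why it differs for each individual.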

3 Venn diagram of regression variables
[Figure: Venn diagram with circles labeled Hours of study per week, MLAT score, Personality, and TOEFL score]

4 The mathematical formula of a line
Line equation: $y = 2 + 0.5x$ (slope = 0.5, intercept = 2)
[Figure: scatterplot with a regression line; for one data point, the actual value $Y_1$, the predicted value $Y'_1$, and the error or residual between them are marked]

5 The regression line
[Figure: scatterplot of data points around a fitted line, with one error or residual marked]
The best-fitting line is the one closest to the data points.
The least squares regression line is the line that minimizes the sum of the squared errors about the line; that is, $\sum (Y - Y')^2$ is minimized.
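The least squares idea can be sketched in a few lines of Python. The data points below are made up for illustration (they scatter around the earlier example line y = 2 + 0.5x), and the function names are my own, not part of SPSS or the textbook.

```python
def fit_line(xs, ys):
    """Least squares intercept and slope for one explanatory variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def sum_squared_errors(intercept, slope, xs, ys):
    """Sum of (Y - Y')^2, where Y' is the value predicted by the line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data points (illustration only).
xs = [1, 2, 3, 4, 5]
ys = [2.4, 3.1, 3.4, 4.2, 4.4]

intercept, slope = fit_line(xs, ys)
best = sum_squared_errors(intercept, slope, xs, ys)
other = sum_squared_errors(2.0, 0.5, xs, ys)   # a nearby line, for comparison

print(f"fitted line: y = {intercept:.2f} + {slope:.2f}x")
print(f"squared error: fitted line {best:.4f}, comparison line {other:.4f}")
```

Any other line, including the nearby comparison line, gives a sum of squared errors at least as large as that of the fitted line.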

6 6. Looking for groups of explanatory variables through multiple regression
6.1 Standard multiple regression (SMR)
In SMR, the importance of an EV depends on how much it uniquely overlaps with the RV.
SMR answers two questions:
What are the nature and size of the relationship between the RV and the set of EVs?
How much of the relationship is contributed uniquely by each EV?

7 Venn diagram of standard regression design
[Figure: Venn diagram with circles labeled Hours of study per week, MLAT score, Personality, and TOEFL score; regions labeled a, b, c, d, e]

8 6. Looking for groups of explanatory variables through multiple regression
6.2 Sequential (hierarchical) multiple regression (HMR)
In HMR, all of the areas of the EVs that overlap with the RV are counted, but how they are counted depends on the order in which the researcher enters the variables into the equation.
The importance of any variable can be emphasized in HMR, depending on the order in which it is entered.
If two variables overlap to a large degree, entering one of them first leaves little room for the second variable to explain anything.
HMR answers the question: do the variables entered at each subsequent step add to the prediction of the RV after differences in the variables from the previous step have been eliminated?
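The order effect described above can be sketched in Python by computing R square and R square change as variables are entered one step at a time. Everything here is an assumption for illustration: the eight learners' scores are invented, and `r_squared` is a small hand-written ordinary least squares helper, not the routine SPSS uses.

```python
def r_squared(columns, y):
    """R^2 from an ordinary least squares fit of y on the columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Normal equations (X'X)b = X'y, stored as an augmented matrix.
    aug = [[sum(r[a] * r[b] for r in rows) for b in range(p)]
           + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda idx: abs(aug[idx][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for idx in range(p):
            if idx != c:
                factor = aug[idx][c] / aug[c][c]
                aug[idx] = [v - factor * w for v, w in zip(aug[idx], aug[c])]
    coefs = [aug[a][p] / aug[a][a] for a in range(p)]
    preds = [sum(b * v for b, v in zip(coefs, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical scores for eight learners (not from the slides).
data = {
    "hours": [2, 4, 5, 7, 8, 10, 12, 15],
    "mlat": [20, 25, 22, 30, 28, 35, 33, 40],
}
toefl = [420, 450, 455, 490, 480, 530, 520, 570]

def sequential_steps(order):
    """Enter EVs one per step; return (name, R^2, R^2 change) for each step."""
    steps, entered, previous = [], [], 0.0
    for name in order:
        entered.append(data[name])
        r2 = r_squared(entered, toefl)
        steps.append((name, r2, r2 - previous))
        previous = r2
    return steps

hours_first = sequential_steps(["hours", "mlat"])
mlat_first = sequential_steps(["mlat", "hours"])
for label, steps in (("hours first", hours_first), ("MLAT first", mlat_first)):
    for name, r2, change in steps:
        print(f"{label}: step {name}: R2 = {r2:.3f}, R2 change = {change:.3f}")
```

Because the two invented EVs overlap heavily, whichever one is entered first claims most of the explained variance, while the final R square is the same for both orders.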

9 Venn diagram of sequential regression design
[Figure: Venn diagram with circles labeled Hours of study per week, MLAT score, Personality, and TOEFL score; regions labeled a, b, c, d, e; variables marked H, M, P]

10 Assumptions for MR: Table 7.1 (p. 184)
Normal distribution
Homogeneity of variances
Linearity
No multicollinearity (the EVs involved in the regression should not be highly intercorrelated)
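One simple way to screen for multicollinearity before running the regression is to inspect the correlations among the EVs. The sketch below uses invented scores, and the 0.80 cutoff is an assumed rule of thumb rather than a value taken from the slides or from Table 7.1.

```python
from itertools import combinations
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical EV scores for eight learners (illustration only).
evs = {
    "hours": [2, 4, 5, 7, 8, 10, 12, 15],
    "mlat": [20, 25, 22, 30, 28, 35, 33, 40],
    "personality": [3, 5, 2, 4, 5, 1, 4, 3],
}

CUTOFF = 0.80  # assumed rule-of-thumb threshold, not from the slides

flagged = []
for a, b in combinations(evs, 2):
    r = pearson_r(evs[a], evs[b])
    print(f"r({a}, {b}) = {r:.2f}")
    if abs(r) > CUTOFF:
        flagged.append((a, b))

print("Highly intercorrelated pairs:", flagged)
```

Pairs that end up in `flagged` would be candidates for dropping or combining one of the variables before the regression is run.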

11 6. Looking for groups of explanatory variables through multiple regression
6.4 Starting the MR (pp. 187-188)
Analyze > Regression > Linear
Put the RV in the “Dependent” box.
For standard regression: put all EVs into the “Independent” box with the Method set to “Enter”.
For sequential regression: put the EVs into the “Independent” box one at a time, in the order you want them to enter the regression equation, with the Method set to “Enter”; push the Next button after entering each one.
Open the Statistics, Plots, and Options buttons.

25 6. Looking for groups of explanatory variables through multiple regression
6.5 Regression output in SPSS
Analyze > Regression > Linear

26 Regression Output

28 Regression Output (Standard)

29 Regression Output (Sequential)

30 Regression output (Standard)

31 Regression Output (Sequential)

32 Regression output (Standard)
Y = 8.62 + 4.14*EngProf + 1.21*Anx + 0.46*Mid + 2.50*EvaTch + 3.63*Motiv
[Annotations on the output table: “Under 5”; “T-test”]

33 Regression Output (Sequential) Y= 42.87 + (6.73)*EngProf + (5.56)*Motiv

34 Regression Output Check outliers

35 Regression Output: P-P plot for diagnosing normal distribution of data
Check the normality assumption.
Look at the distribution of the residuals, not of the individual variables.

36 Regression Output: Plot of studentized residuals crossed with fitted values
Check homogeneity of variances.
The plot should show a cloud of data points scattered randomly.

37 6. Looking for groups of explanatory variables through multiple regression
6.6 Reporting the results of regression analysis
Correlations between the explanatory variables and the response variable
Correlations among the explanatory variables
A correlation matrix with r-value, p-value, and N
Whether standard or sequential regression was used
R square, or R square change for each step of the model
Regression coefficients for all regression models (especially the unstandardized coefficients, labeled B, and the coefficient for the intercept, labeled “constant” in SPSS output)
For standard regression, the t-tests for the contribution of each variable to the model

38 6. Looking for groups of explanatory variables through multiple regression
6.6 Reporting the results of regression analysis
The squared multiple correlation coefficient, $R^2$, expresses how much of the variance in scores on the response variable can be explained by the variance in the explanatory variables.
The squared semipartial correlations ($sr^2$) provide a way of assessing the unique contribution of each variable to the overall $R^2$. These numbers are already a percentage-of-variance effect size (of the r family).
Example report based on Lafrance & Gottardo (2005): p. 198
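A squared semipartial correlation can be read as the drop in R square when one EV is removed from the full model. The Python sketch below computes it that way. The learner scores are invented for illustration, and `r_squared` is a hand-written least squares helper rather than SPSS's own procedure.

```python
def r_squared(columns, y):
    """R^2 from an ordinary least squares fit of y on the columns plus an intercept."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(rows[0])
    # Normal equations (X'X)b = X'y, stored as an augmented matrix.
    aug = [[sum(r[a] * r[b] for r in rows) for b in range(p)]
           + [sum(r[a] * yi for r, yi in zip(rows, y))] for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda idx: abs(aug[idx][c]))
        aug[c], aug[pivot] = aug[pivot], aug[c]
        for idx in range(p):
            if idx != c:
                factor = aug[idx][c] / aug[c][c]
                aug[idx] = [v - factor * w for v, w in zip(aug[idx], aug[c])]
    coefs = [aug[a][p] / aug[a][a] for a in range(p)]
    preds = [sum(b * v for b, v in zip(coefs, r)) for r in rows]
    mean_y = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot

# Hypothetical scores for eight learners (not from the slides).
evs = {
    "hours": [2, 4, 5, 7, 8, 10, 12, 15],
    "mlat": [20, 25, 22, 30, 28, 35, 33, 40],
    "personality": [3, 5, 2, 4, 5, 1, 4, 3],
}
toefl = [420, 450, 455, 490, 480, 530, 520, 570]

r2_full = r_squared(list(evs.values()), toefl)

# sr^2 for each EV: full-model R^2 minus the R^2 of the model without that EV.
sr2 = {}
for name in evs:
    others = [col for other, col in evs.items() if other != name]
    sr2[name] = r2_full - r_squared(others, toefl)

print(f"R2 (all EVs) = {r2_full:.3f}")
for name, value in sr2.items():
    print(f"sr2({name}) = {value:.3f}")
```

When the EVs overlap, the unique contributions typically add up to less than the overall R square, because the shared portion is not credited to any single variable.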

39 Application activities 7.4.5 (Q1-Q6): PP199-200

