School of Information - The University of Texas at Austin LIS 397.1, Introduction to Research in Library and Information Science LIS 397.1 Introduction.

Presentation transcript:

School of Information - The University of Texas at Austin
LIS 397.1, Introduction to Research in Library and Information Science
Working with Two Variables: Correlation, Regression, and Chi-Square
R. E. Wyllys
Copyright 2003 by R. E. Wyllys. Last revised 2003 Jan 15

Standardized Tests of Statistical Hypotheses
To each type of statistical hypothesis corresponds a particular standardized test procedure or procedures. Each test procedure includes a formula, the "test statistic." You
- place data from the observed sample or samples into the test statistic
- obtain a number, the observed value of the test statistic

Standardized Tests of Statistical Hypotheses
Traditional Method: Compare the absolute value of the observed value of the test statistic against the threshold value from the pertinent table.
- If |test statistic| ≤ tabled threshold: Accept H0
- If |test statistic| > tabled threshold: Reject H0
Computer-Era Method: Use the probability of getting the observed value of the test statistic when the null hypothesis H0 is true (OVTSWNHT).
- If P(OVTSWNHT) ≥ α: Accept H0
- If P(OVTSWNHT) < α: Reject H0
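The two decision rules can be written as a minimal Python sketch. The function names and the numbers passed to them are hypothetical, chosen only to exercise each rule; they do not come from the slides.

```python
def decide_traditional(observed_statistic: float, tabled_threshold: float) -> str:
    """Traditional method: compare |test statistic| with the tabled threshold."""
    if abs(observed_statistic) > tabled_threshold:
        return "reject H0"
    return "accept H0"


def decide_by_probability(p_value: float, alpha: float) -> str:
    """Computer-era method: compare P(OVTSWNHT) with the significance level alpha."""
    if p_value < alpha:
        return "reject H0"
    return "accept H0"


# Hypothetical inputs, for illustration only.
print(decide_traditional(-2.5, 1.96))     # |-2.5| exceeds the threshold
print(decide_by_probability(0.20, 0.05))  # probability is not below alpha
```

Both rules lead to the same decision when the tabled threshold corresponds to the chosen α; the second simply lets the software do the table lookup.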

Common Types of Two-Variable Statistical Hypotheses
H0: ρ_XY = 0
- Interval variables X and Y are not correlated in the population: "There is no correlation between the age and the salary of a typical librarian"
H0: Categorical variables X and Y are not associated in the population
- "There is no association between the sex of a library patron and the type of book the patron prefers"

H0: ρ_XY = 0
The test statistic r_XY can be as large as +1 and as small as -1. It expresses the tendency, if any, toward "preferential co-occurrence": the tendency of certain values of X to "prefer" to occur together with certain values of Y.

How Correlation Works
The numerator of r_XY is where paired behavior is analyzed: it is the sum of the products (X_i - mean of X)(Y_i - mean of Y). Each pair of parentheses contains a deviation from the mean, and the two deviations in a product have like or unlike signs. Pairs (+)(+) and (-)(-) yield positive products; pairs (+)(-) and (-)(+) yield negative products.

If large values of X tend to occur along with large values of Y, and small values of X along with small values of Y, then most of the pairs will be either (+)(+) or (-)(-) and will contribute positive numbers to the sum. If large values of X tend to occur along with small values of Y, and small values of X along with large values of Y, then most of the pairs will be either (+)(-) or (-)(+) and will contribute negative numbers to the sum. The first situation is called positive correlation; the second, negative correlation.

The numbers in the denominator of r_XY simply adjust for the units in which the variables are observed and for sample size, so as to yield a range from -1 to +1.
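The sign-product reasoning above can be sketched in Python. The formula image from this slide did not survive, so the function below assumes the standard product-moment formula: the sum of products of deviations, divided by the square root of the product of the two sums of squared deviations. The function name and the age and salary figures are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation of paired observations (assumed standard formula)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Numerator: products of paired deviations; like signs add, unlike signs subtract.
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Denominator: rescales for units and sample size so that -1 <= r <= +1.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("a variable with no variation has no defined correlation")
    return numerator / math.sqrt(ss_x * ss_y)


# Hypothetical ages and salaries (in thousands), for illustration only.
ages = [25, 30, 35, 40, 45]
salaries = [30, 34, 41, 44, 51]
print(round(pearson_r(ages, salaries), 3))
```

Reversing the order of one list flips most of the products to unlike signs, so the result becomes negative.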

H0: ρ_XY = 0, Example
(Adapted from Stephens, p. 320, Example)
Excel's output for its Correlation procedure is rather brief, as shown next. However, the same information, and much more, is supplied by its Regression procedure.

H0: ρ_XY = 0, Example
[Figure: Output of Excel's Correlation procedure]

H0: ρ_XY = 0, Example
[Figure: Partial output of Excel's Regression procedure]

Linear Regression
Linear regression is applicable only when you have a pair of correlated variables. Regression allows you to use an observed value of one of the variables to provide an estimated corresponding value for the other variable. This is especially valuable when one of the variables can be observed easily and/or now, and the other variable can be observed only with difficulty and/or in the future.

The "predictor" variable is denoted by X; the "predicted," or "criterion," variable, by Y. The LR equation is shown below; the coefficients B0 and B1 are calculated from the observed pairs of values of X and Y.

Estimated Y = B0 + B1·X

Linear Regression
The LR equation is the algebraic equation of a straight line, written in "slope-intercept" form. The coefficient B0 is called the "intercept coefficient" because it tells us where the line intercepts the vertical axis (the Y axis). The coefficient B1 is called the "slope coefficient" because it tells us the slope of the line, which is the tangent of the angle the line makes with the horizontal axis: e.g., a line that slopes up to the right at a 45° angle has a slope of 1; a line that slopes down to the right at an angle of about 26.6° has a slope of -0.5. Programs that calculate the coefficients B_i often label B0 as "Intercept" and label B1 with the name of the X variable.

Calculation of the Linear Regression Coefficients
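The formulas on this slide were an image that did not survive. As a stand-in, the sketch below assumes the usual least-squares formulas: B1 is the sum of products of deviations divided by the sum of squared deviations of X, and B0 is the mean of Y minus B1 times the mean of X. The function names and data are hypothetical, not the Stephens example.

```python
def fit_line(xs, ys):
    """Return (B0, B1) for the least-squares line through the points (x, y)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    if sum_xx == 0:
        raise ValueError("X must vary for a slope to be defined")
    b1 = sum_xy / sum_xx          # slope coefficient
    b0 = mean_y - b1 * mean_x     # intercept coefficient
    return b0, b1


def predict(b0, b1, x):
    """Estimated Y for an observed X, using the LR equation."""
    return b0 + b1 * x


# Hypothetical predictor and criterion values, for illustration only.
b0, b1 = fit_line([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1])
print(f"Intercept = {b0:.3f}, slope = {b1:.3f}")
print(f"Estimated Y at X = 6: {predict(b0, b1, 6):.3f}")
```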

[Figure: Partial output of Excel's Regression procedure for data in Stephens's Table 13.2]

[Figure: Excel plot of observed points (X, Y) and trendline for data in Stephens's Table 13.2]

[Figure: SPSS plot of observed points (X, Y) and trendline for data in Stephens's Table 13.2]

Multilinear Regression
Multilinear regression extends the idea of linear regression. It permits the estimation of a predicted variable on the basis of observations of 2 or more predictor variables. It is a very powerful tool, widely used in research and in modern management, for analyzing complicated situations and figuring out which factors are important to some result (the predicted variable) and which factors are not important. In essence, the important factors will have large (positive or negative) values for their associated B_i coefficients; the unimportant factors, small (near zero) values for their B_i coefficients.
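A minimal sketch of how the B_i coefficients might be computed for several predictors follows. It assumes ordinary least squares solved through the normal equations with plain Gaussian elimination; the slides do not say which method Excel or SPSS uses, and the function names and data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("predictors are linearly dependent")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back-substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_multilinear(rows, ys):
    """rows: one tuple of predictor values per observation. Returns [B0, B1, ..., Bk]."""
    design = [[1.0] + list(row) for row in rows]  # leading 1 carries the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: two predictors per observation, for illustration only.
predictors = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
outcomes = [4.2, 5.4, 9.1, 10.4, 13.6]
print([round(b, 3) for b in fit_multilinear(predictors, outcomes)])
```

When reading the resulting coefficients, note that the size of a B_i also depends on the units of its predictor, so comparisons are most meaningful when the predictors are on comparable scales.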

Chi-Square Test of Association
The chi-square test of association is applicable when you have 2 categorical variables and wonder whether there is any association between them. This is analogous to the search for a tendency (correlation) for preferential co-occurrence of values of a pair of interval variables.

You cross-tabulate the elements of your sample according to the 2 variables, thus obtaining the observed frequencies of occurrence of the various values of the variables. You calculate expected frequencies and calculate χ² as shown below.

χ² = Σ (O_i - E_i)² / E_i

Chi-Square Test of Association
The observed value of χ² is compared with a tabled threshold value, chosen according to
- Chosen level of significance, α
- Number of degrees of freedom: df = (# of rows - 1)(# of columns - 1)
Most statistical processing programs, including SPSS, also calculate the level of significance of the observed value of χ².
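The cross-tabulation steps can be sketched in Python. The sketch assumes the usual expected frequency for a cell (row total times column total, divided by the sample size) and sums (O - E)²/E over all cells. The function name and the observed counts are hypothetical, and every row and column is assumed to have a nonzero total.

```python
def chi_square_association(observed):
    """Return (chi2, df, expected) for a table of observed frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected frequency of a cell if the two variables were not associated.
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df, expected


# Hypothetical counts: rows = sex of patron, columns = preferred type of book.
table = [[30, 20, 10],
         [20, 20, 20]]
chi2, df, expected = chi_square_association(table)
print(f"chi-square = {chi2:.3f}, df = {df}")
# For df = 2 and alpha = 0.05 the tabled threshold is 5.991.
print("reject H0" if chi2 > 5.991 else "accept H0")
```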

Example of χ² Association Test
Observed values are shown in blue; expected values, in red.

Example of χ² Association Test
[Figure: Output from SPSS's Crosstabs procedure with Chi-Square option]

Special Notes re the χ² Test of Association
Don't use χ² if any E_i < 5 (the size of the O_i values does not matter).
- If that happens, try collapsing categories (i.e., merging rows or columns) till every expected frequency is at least 5
In the 2x2 case, use Yates's correction for continuity (SPSS uses it automatically in the 2x2 case), as shown in the formula below. Here a and b represent the observed frequencies in the top row of the 2x2 table; c and d represent the observed frequencies in the bottom row; and n is the size of the sample.

χ² = n(|ad - bc| - n/2)² / [(a + b)(c + d)(a + c)(b + d)]
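A short Python sketch of both notes follows. It assumes the standard shortcut form of Yates's correction for a 2x2 table, n(|ad - bc| - n/2)² divided by the product of the four marginal totals, since the slide's formula image did not survive. The function names and counts are hypothetical.

```python
def yates_chi_square(a, b, c, d):
    """Continuity-corrected chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def smallest_expected(observed):
    """Smallest expected frequency in a table, for the 'no E below 5' check."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return min(rt * ct / n for rt in row_totals for ct in col_totals)


# Hypothetical 2x2 counts, for illustration only.
a, b, c, d = 10, 20, 20, 10
if smallest_expected([[a, b], [c, d]]) < 5:
    print("an expected frequency is below 5; collapse categories first")
else:
    print(f"corrected chi-square = {yates_chi_square(a, b, c, d):.3f}")
```

The corrected value is smaller than the uncorrected one for the same table, which makes the 2x2 test somewhat more conservative.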

Two-Variable Problems Usually Concern Interactions and Patterns of Co-Occurrence