Measurement Math (DeShon, 2006)


Measurement Math DeShon

Univariate Descriptives
- Mean
- Variance, standard deviation
- Skew & kurtosis
- If the distribution is normal, the mean and SD are sufficient statistics

Normal Distribution

Univariate Probability Functions

Bivariate Descriptives
- The mean and SD of each variable, plus the correlation (ρ) between them, are sufficient statistics for a bivariate normal distribution
- Distributions are abstractions or models, used to simplify
- They are useful to the extent the assumptions of the model are met

2D: Ellipse or Scatterplot (Galton's original graph)

3D Probability Density

Covariance
- Covariance is the extent to which two variables co-vary from their respective means
- For each case, compute the deviations x = X − X̄ and y = Y − Ȳ, then sum the products xy
- [Case-level table not preserved in this transcript] With n = 4 cases and Σxy = 17: Cov(X, Y) = 17/(4 − 1) = 5.667

Covariance
- Covariance ranges from negative to positive infinity
- Variance-covariance matrix: variance is the covariance of a variable with itself

Correlation
- Covariance is an unbounded statistic
- Standardize the covariance by dividing by the two standard deviations: r = Cov(X, Y)/(sX sY)
- −1 ≤ r ≤ 1
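
The covariance and correlation computations above can be sketched in plain Python. The data below are hypothetical, chosen only so that n = 4, the means are 3 and 4, and the deviation products sum to 17, matching the slides' worked covariance of 17/3 ≈ 5.667:

```python
# Covariance and correlation from raw scores (sample formulas, n - 1 in
# the denominator). X and Y are hypothetical illustrative data.
X = [1, 2, 3, 6]
Y = [2, 3, 3, 8]
n = len(X)

mean_x = sum(X) / n            # 3.0
mean_y = sum(Y) / n            # 4.0

# Covariance: average product of deviations from the means
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y)) / (n - 1)

# Standard deviations
sd_x = (sum((x - mean_x) ** 2 for x in X) / (n - 1)) ** 0.5
sd_y = (sum((y - mean_y) ** 2 for y in Y) / (n - 1)) ** 0.5

# Correlation: the covariance standardized by the two SDs, so -1 <= r <= 1
r = cov_xy / (sd_x * sd_y)

print(round(cov_xy, 3))  # 5.667
print(round(r, 3))
```

Note that r is unitless while the covariance is in the product of the two variables' units, which is why only r is comparable across variable pairs.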

Correlation Matrix
Table 1. Descriptive statistics (means, s.d.) and correlations for the variables: self-rated cognitive ability, self-enhancement, individualism, horizontal individualism, vertical individualism, collectivism, age, gender, academic seniority, actual cognitive ability. [Numeric entries not preserved in this transcript]
Notes: N = 608; gender was coded 1 for male and 2 for female. Reliabilities (coefficient alpha) are on the diagonal.

Coefficient of Determination
- r² = proportion of variance in Y accounted for by X
- Ranges from 0 to 1 (positive only)
- This number is a meaningful proportion

Other Measures of Association
- Point-biserial correlation
- Biserial correlation
- Tetrachoric correlation (binary variables)
- Polychoric correlation (ordinal variables)
- Odds ratio (binary variables)

Point-Biserial Correlation
- Used when one variable is a natural (real) dichotomy (two categories) and the other variable is interval or continuous
- Just an ordinary correlation between a continuous and a dichotomous variable

Biserial Correlation
- Used when one variable is an artificial dichotomy (two categories) and the criterion variable is interval or continuous

Tetrachoric Correlation
- Estimates what the correlation between two binary variables would be if you could measure both on continuous scales
- Example: difficulty walking up 10 steps and difficulty lifting 10 lbs

Tetrachoric Correlation
- Assumes that both "traits" are normally distributed
- The correlation, r, measures how narrow the ellipse is
- a, b, c, d are the proportions in each quadrant of the 2 × 2 table

Tetrachoric Correlation
For α = ad/bc:
- Approximation 1: r ≈ cos(π / (1 + √α))
- Approximation 2 (Digby): r ≈ (α^(3/4) − 1) / (α^(3/4) + 1)
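
A minimal sketch of the two approximations, assuming α = ad/bc as defined above. The quadrant proportions used in the demonstration are made-up illustrative values, not the slide's data:

```python
import math

def tetrachoric_cos(a, b, c, d):
    """Cosine approximation to the tetrachoric correlation, alpha = ad/bc."""
    alpha = (a * d) / (b * c)
    return math.cos(math.pi / (1 + math.sqrt(alpha)))

def tetrachoric_digby(a, b, c, d):
    """Digby's approximation: (alpha^(3/4) - 1) / (alpha^(3/4) + 1)."""
    alpha = (a * d) / (b * c)
    return (alpha ** 0.75 - 1) / (alpha ** 0.75 + 1)

# Hypothetical quadrant proportions (alpha = 16)
a, b, c, d = 0.40, 0.10, 0.10, 0.40

print(round(tetrachoric_cos(a, b, c, d), 3))    # 0.809
print(round(tetrachoric_digby(a, b, c, d), 3))  # 0.778
# When alpha = 1 (no association) both approximations return 0;
# as alpha -> infinity both approach 1, and as alpha -> 0 both approach -1.
```

Both are closed-form stand-ins for the exact tetrachoric correlation, which requires iterative evaluation of the bivariate normal distribution.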

Tetrachoric Correlation
- Example [2 × 2 table not preserved in this transcript]: tetrachoric correlation = 0.61; Pearson correlation = 0.41
- Assumes the threshold is the same across people
- Strong assumption that the underlying quantity of interest is truly continuous

Odds Ratio
- Measure of association between two binary variables
- Risk associated with x given y
- For a 2 × 2 table with cells a, b, c, d, the odds ratio is OR = ad/bc
- Example: the odds of difficulty walking up 10 steps relative to the odds of difficulty lifting 10 lbs

Pros and Cons
- Tetrachoric correlation: same interpretation as Spearman and Pearson correlations; "difficult" to calculate exactly; makes assumptions
- Odds ratio: easy to understand and easy to calculate, but "perfect" association is not a manageable number (it corresponds to 0 or ∞), and it is not comparable to correlations
- The two may give you different results/inferences!

Dichotomized Data: A Bad Habit of Psychologists
- Sometimes perfectly good quantitative data are made binary because it seems easier to talk about "high" vs. "low"
- The worst habit is the median split
- Usually the high and low groups are mixtures of the continua
- Rarely is the median interpreted rationally
- References:
  Cohen, J. (1983). The cost of dichotomization. Applied Psychological Measurement, 7.
  MacCallum, R. C., Zhang, S., Preacher, K. J., & Rucker, D. D. (2002). On the practice of dichotomization of quantitative variables. Psychological Methods, 7.

Simple Regression
- The simple linear regression MODEL is: y = β0 + β1x + ε
- It describes how y is related to x
- β0 and β1 are called the parameters of the model
- ε is a random variable called the error term

Simple Regression
- The graph of the regression equation is a straight line
- β0 is the population y-intercept of the regression line
- β1 is the population slope of the regression line
- E(y) is the expected value of y for a given x value

Simple Regression E(y)E(y)E(y)E(y) x Slope  1 is positive Regression line Intercept  0

Simple Regression E(y)E(y)E(y)E(y) x Slope  1 is 0 Regression line Intercept  0

Estimated Simple Regression
- The estimated simple linear regression equation is: ŷ = b0 + b1x
- Its graph is called the estimated regression line
- b0 is the y-intercept of the line
- b1 is the slope of the line
- ŷ is the estimated/predicted value of y for a given x value

Estimation Process
- Regression model: y = β0 + β1x + ε; regression equation: E(y) = β0 + β1x; unknown parameters: β0, β1
- Sample data: (x1, y1), …, (xn, yn)
- Estimated regression equation: ŷ = b0 + b1x, with sample statistics b0, b1
- b0 and b1 provide estimates of β0 and β1

Least Squares Estimation
- Least squares criterion: choose b0 and b1 to minimize Σ(yi − ŷi)²
where:
yi = observed value of the dependent variable for the ith observation
ŷi = predicted/estimated value of the dependent variable for the ith observation

Least Squares Estimation
- Estimated slope: b1 = Σ(xi − x̄)(yi − ȳ) / Σ(xi − x̄)²
- Estimated y-intercept: b0 = ȳ − b1x̄
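
The two estimation formulas translate directly into code. A minimal sketch (the function name `ols_fit` is my own, not from the slides):

```python
def ols_fit(xs, ys):
    """Least-squares estimates for simple linear regression:
    b1 = sum((x - xbar)(y - ybar)) / sum((x - xbar)^2)
    b0 = ybar - b1 * xbar
    """
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Usage: data lying exactly on a line with intercept 1 and slope 2
# are recovered exactly.
b0, b1 = ols_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(b0, b1)  # 1.0 2.0
```

Because the criterion is quadratic in b0 and b1, these closed-form estimates are the unique minimizers; no iterative search is needed.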

Model Assumptions
1. X is measured without error.
2. X and ε are independent.
3. The error ε is a random variable with a mean of zero.
4. The variance of ε, denoted σ², is the same for all values of the independent variable (homogeneity of error variance).
5. The values of ε are independent.
6. The error ε is a normally distributed random variable.

Example: Consumer Warfare
[Data table: number of ads (X) and purchases (Y); the values are not preserved in this transcript]

Example
- Slope for the estimated regression equation: b1 = (Σxiyi − (10)(100)/5) / (Σxi² − (10)²/5) = 5
- y-intercept for the estimated regression equation: b0 = ȳ − b1x̄ = 20 − 5(2) = 10
- Estimated regression equation: ŷ = 10 + 5x

Example
[Scatter plot of the data with the estimated regression line ŷ = 10 + 5x]

Evaluating Fit
- Coefficient of determination: r² = SSR/SST
where:
SST = total sum of squares = Σ(yi − ȳ)²
SSR = sum of squares due to regression = Σ(ŷi − ȳ)²
SSE = sum of squares due to error = Σ(yi − ŷi)²
SST = SSR + SSE

Evaluating Fit
- Coefficient of determination: r² = SSR/SST = 100/114 = .8772
- The regression relationship is very strong: 88% of the variation in the number of purchases can be explained by the linear relationship with the number of TV ads
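
The slide's numbers can be reproduced end to end. The raw data below are hypothetical (the slide's table is not preserved) but were chosen to be consistent with the reported summary results b0 = 10, b1 = 5, and r² = 100/114 = .8772:

```python
# Hypothetical ads (x) / purchases (y) data consistent with the slides'
# reported results; n = 5, sum(x) = 10, sum(y) = 100.
xs = [1, 3, 2, 1, 3]
ys = [14, 24, 18, 17, 27]
n = len(xs)

xbar, ybar = sum(xs) / n, sum(ys) / n                 # 2.0, 20.0
b1 = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
      / sum((x - xbar) ** 2 for x in xs))             # slope
b0 = ybar - b1 * xbar                                 # intercept

yhat = [b0 + b1 * x for x in xs]                      # fitted values
sst = sum((y - ybar) ** 2 for y in ys)                # total sum of squares
ssr = sum((yh - ybar) ** 2 for yh in yhat)            # regression sum of squares
sse = sum((y - yh) ** 2 for y, yh in zip(ys, yhat))   # error sum of squares

r2 = ssr / sst
mse = sse / (n - 2)        # estimate of sigma^2 (see the MSE slide below)
s = mse ** 0.5             # standard error of the estimate

print(b0, b1)          # 10.0 5.0
print(round(r2, 4))    # 0.8772
```

Note SST = SSR + SSE holds here (114 = 100 + 14), as the decomposition requires.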

Mean Square Error
- An estimate of σ²: the mean square error (MSE) provides the estimate of σ²
- s² = MSE = SSE/(n − 2), where SSE = Σ(yi − ŷi)²

Standard Error of Estimate
- An estimate of σ: to estimate σ we take the square root of s² = MSE
- The resulting s is called the standard error of the estimate
- Also called the "root mean squared error"

Linear Composites
- Linear composites are fundamental to behavioral measurement:
  prediction & multiple regression, principal component analysis, factor analysis, confirmatory factor analysis, scale development
- Example: unit weighting of items in a test: Test = 1·X1 + 1·X2 + 1·X3 + … + 1·Xn

Linear Composites
- Sum scale: Scale A = X1 + X2 + X3 + … + Xn
- Unit-weighted linear composite: Scale A = 1·X1 + 1·X2 + 1·X3 + … + 1·Xn
- Weighted linear composite: Scale A = b1X1 + b2X2 + b3X3 + … + bnXn

Variance of a Weighted Composite
For a composite of X and Y, the variance follows from the variance-covariance matrix:
  [ Var(X)    Cov(X,Y) ]
  [ Cov(X,Y)  Var(Y)   ]
Var(aX + bY) = a²Var(X) + b²Var(Y) + 2ab·Cov(X,Y)
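
The composite-variance formula can be checked numerically by comparing it with the variance of the composite scores computed directly. The scores and weights below are hypothetical illustrative values:

```python
# Variance of a weighted composite aX + bY, two ways:
# (1) a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X,Y), and
# (2) the variance of the composite scores computed directly.

def var(v):
    m = sum(v) / len(v)
    return sum((x - m) ** 2 for x in v) / (len(v) - 1)

def cov(u, v):
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((x - mu) * (y - mv) for x, y in zip(u, v)) / (len(u) - 1)

X = [1, 2, 3, 6]
Y = [2, 3, 3, 8]
a, b = 0.7, 0.3   # nominal weights (hypothetical)

composite = [a * x + b * y for x, y in zip(X, Y)]

by_formula = a**2 * var(X) + b**2 * var(Y) + 2 * a * b * cov(X, Y)
direct = var(composite)

print(abs(by_formula - direct) < 1e-9)  # True
```

The covariance term is why effective weights differ from nominal weights: when the components are correlated, each one's contribution to the composite variance depends on the others.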

Effective vs. Nominal Weights
- Nominal weights: the desired weight assigned to each component
- Effective weights: the actual contribution of each component to the composite; a function of the desired weights, standard deviations, and covariances of the components

Principles of Composite Formation
- Standardize before combining!
- Weighting doesn't matter much when the correlations among the components are moderate to large
- As the number of components increases, the importance of weighting decreases
- Differential weights are difficult to replicate/cross-validate

Decision Accuracy
                 Truth: Yes       Truth: No
Decision: Pass   True positive    False positive
Decision: Fail   False negative   True negative

Signal Detection Theory

Polygraph Example
- Sensitivity, etc.
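
The accuracy measures gestured at here follow directly from the 2 × 2 decision table above. The counts below are hypothetical illustrative values, not actual polygraph data, and "pass" is treated as the positive decision:

```python
# Classification accuracy measures from a 2x2 decision table.
# Counts are hypothetical illustrative values.
tp, fp = 40, 10   # decision "pass": truth yes (hit), truth no (false alarm)
fn, tn = 5, 45    # decision "fail": truth yes (miss), truth no (correct rejection)

sensitivity = tp / (tp + fn)   # true positive rate: P(pass | truth yes)
specificity = tn / (tn + fp)   # true negative rate: P(fail | truth no)
ppv = tp / (tp + fp)           # positive predictive value: P(truth yes | pass)
npv = tn / (tn + fn)           # negative predictive value: P(truth no | fail)
accuracy = (tp + tn) / (tp + fp + fn + tn)

print(round(sensitivity, 3), round(specificity, 3), round(accuracy, 3))
```

Sensitivity and specificity condition on the truth, while PPV and NPV condition on the decision; signal detection theory separates these from the decision threshold itself.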