Multivariate Statistics


BSAD 6314: Multivariate Statistics

Major Goals:
- Expand your repertoire of analytical options.
- Choose appropriate techniques.
- Conduct analyses using available software.
- Interpret findings.
- Know where to go for additional help.

What are Multivariate Stats? Techniques organized by number of IVs and DVs:
- 1 IV, 1 DV: correlation; t test; one-way ANOVA
- 1 IV, >1 DVs: canonical correlation; one-way MANOVA
- >1 IVs, 1 DV: multiple regression; factorial ANOVA
- >1 IVs, >1 DVs: canonical correlation; factorial MANOVA

Hair Taxonomy
First question: what type of relationship — dependence, or structure (independence)?
- Dependence:
  - One DV, multiple IVs: DV continuous -> multiple regression; DV categorical -> discriminant analysis or logistic regression
  - Single relationship, multiple DVs: IVs categorical -> MANOVA; IVs continuous -> canonical correlation
  - Multiple relationships among IVs and DVs -> SEM
- Structure (independence):
  - Structure among variables -> factor analysis
  - Structure among cases/people -> cluster analysis

Matrices: The Math Behind the Statistics
Two-dimensional math represented in an array:
- People x Variables (OB/HR)
- Organizations x Variables (Strategy/OT)

[Data matrix: rows are cases P1 ... PN, columns are variables V1 ... V12.] The variables (V) can be continuous measures, categories represented by numbers, or transformations, products, or combinations of other variables.

Nearly all statistical procedures, univariate and multivariate, are based on linear combinations. Understanding that basic fact has far-reaching implications for using statistical procedures to their fullest advantage. A linear combination (LC) is nothing more than a weighted sum of variables: LC = W1V1 + W2V2 + ... + WKVK

LC = W1V1 + W2V2 + ... + WKVK. A very simple example is the total score on a questionnaire. The individual items on the questionnaire are the variables V1, V2, V3, etc. The weights are all set to a value of 1 (i.e., W1 = W2 = ... = WK = 1). We call this unit weighting.

The items combined in a linear combination need not be variables. The items combined are often cases (e.g., people, groups, organizations): LC = W1P1 + W2P2 + ... + WNPN. A good example is the sample mean. In this case the weights are set to the reciprocal of the sample size (i.e., W1 = W2 = ... = WN = 1/N).
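As a concrete sketch of both special cases (the item values are made up for illustration), unit weighting and mean weighting are the same weighted-sum operation with different Ws:

```python
import numpy as np

# Hypothetical responses to a 5-item questionnaire (made-up values).
items = np.array([4.0, 3.0, 5.0, 2.0, 4.0])

# Unit weighting: every weight is 1, so the LC is the total score.
unit_weights = np.ones_like(items)
total_score = unit_weights @ items      # 4 + 3 + 5 + 2 + 4 = 18

# Weighting by 1/N instead yields the sample mean of the same numbers.
n = len(items)
mean_weights = np.full(n, 1.0 / n)
sample_mean = mean_weights @ items      # 18 / 5 = 3.6
```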

Different statistical procedures derive the weights (W) in a linear combination to either maximize some desirable property (e.g., a correlation) or to minimize some undesirable property (e.g., error). The weights are sometimes assigned rather than derived (e.g., dummy, effect, and contrast codes) to produce linear combinations of particular interest.

Choice of Statistical Techniques
- Research purpose: prediction, explanation, data reduction
- Number of IVs
- Number of DVs
- Measurement scale: continuous, categorical

Dependence/Prediction
- Correlation
- Regression
- Multiple regression (with nominal variables; polynomials)
- Logistic regression / discriminant analysis
- Canonical correlation
- MANOVA

The simplest possible inferential statistic, the bivariate correlation (r), involves just two variables.

In its usual form, the correlation is calculated on variables that are continuous.

When one of the variables is categorical, the calculation produces a point-biserial correlation.

When both variables are categorical, the calculation produces a phi coefficient.
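All three coefficients come from the same Pearson formula applied to differently coded variables. A minimal sketch (the data values are invented for illustration):

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation as a normalized cross-product of centered scores."""
    xc, yc = x - x.mean(), y - y.mean()
    return (xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc))

cont1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # continuous
cont2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])   # continuous
group = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])   # categorical, 0/1 coded
binary = np.array([0.0, 1.0, 0.0, 1.0, 1.0, 1.0])  # categorical, 0/1 coded

r = pearson_r(cont1, cont2)        # ordinary Pearson r
r_pb = pearson_r(cont1, group)     # point-biserial correlation
r_phi = pearson_r(group, binary)   # phi coefficient
```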

Regression. All forms of these correlations, however, can be recast as a linear combination:
V2predicted = A + B*V1

OLS (Ordinary Least Squares) Regression. B and A can be chosen so that the sum of the squared deviations between V2predicted and V2 is minimized. Solving for B and A using this rule also produces the maximum possible correlation between V1 and V2.
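A sketch of the closed-form least-squares solution for B and A (toy data, assumed roughly linear):

```python
import numpy as np

v1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
v2 = np.array([2.1, 3.9, 6.2, 7.8, 10.1])   # made-up outcome values

# Closed-form OLS: B = cov(v1, v2) / var(v1), A = mean(v2) - B * mean(v1).
v1c, v2c = v1 - v1.mean(), v2 - v2.mean()
B = (v1c @ v2c) / (v1c @ v1c)
A = v2.mean() - B * v1.mean()

v2_predicted = A + B * v1
residual_ss = np.sum((v2 - v2_predicted) ** 2)  # minimized by this B and A
```

Perturbing B away from the least-squares value can only increase the residual sum of squares, which is what "least squares" means.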

Least Squared Error

The problem can be easily expanded to include more than one "predictor." This is multiple regression:
V4predicted = B1V1 + B2V2 + B3V3 + A

The values for B1, B2, B3, and A are found by the least squares rule.
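In matrix form, the same least-squares rule is solved in one call. A sketch with synthetic data whose true weights are known, so the recovered coefficients can be checked:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
v1, v2, v3 = rng.normal(size=(3, n))
# Outcome built with known weights plus a little noise (for illustration).
v4 = 1.5 * v1 - 2.0 * v2 + 0.5 * v3 + 3.0 + 0.1 * rng.normal(size=n)

# Design matrix: the predictors plus a column of ones for the intercept A.
X = np.column_stack([v1, v2, v3, np.ones(n)])
(b1, b2, b3, a), *_ = np.linalg.lstsq(X, v4, rcond=None)
```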

V1, V2, and V3 could be categorical contrast variables, perhaps coding the two main effects and the interaction from an experimental design. In that case, the multiple regression produces an analysis of variance.

Or V2 might be the square of V1 and V3 might be the cube of V1. Then the multiple regression examines the polynomial trends.
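A sketch of polynomial trend analysis through exactly the same multiple-regression machinery (the quadratic outcome is made up so the trend coefficients are known in advance):

```python
import numpy as np

v1 = np.linspace(-2.0, 2.0, 21)
# Outcome with a known quadratic trend and no noise (for illustration).
y = 1.0 + 0.5 * v1 - 2.0 * v1 ** 2

# V2 = V1 squared, V3 = V1 cubed; ordinary least squares then recovers
# the linear, quadratic, and cubic trend coefficients.
X = np.column_stack([v1, v1 ** 2, v1 ** 3, np.ones_like(v1)])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
b_linear, b_quadratic, b_cubic, intercept = coefs
```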

If the "outcome" variable is categorical, the basic nature of the analysis does not change. We still seek an "optimal" linear combination of V1, V2, and V3.

The basic multiple regression problem can be generalized to situations that involve more than one "outcome" variable: a Set A of variables and a Set B of variables.

One linear combination is formed from each set, and RAB is the correlation between the two:
LCA = W1V1 + W2V2 + W3V3 + W4V4 + W5V5 + W6V6 (Set A)
LCB = W7V7 + W8V8 + W9V9 + W10V10 + W11V11 + W12V12 (Set B)

Structure of Data
The goal is typically data reduction: how do the data "hang together"? Techniques include:
- Factor analysis (the items)
- Cluster analysis (the cases: people, organizations)
- MDS (the objects)

Sometimes we are not interested in relations between sets of variables but instead focus on a single set and seek a linear combination that has desirable properties.

Or we might wonder how many "dimensions" underlie the 12 variables. These multiple dimensions also would be represented by linear combinations, perhaps constrained to be uncorrelated. These questions are addressed in principal components analysis and factor analysis.
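A principal-components sketch: the eigenvectors of the variance-covariance matrix define uncorrelated linear combinations, and the eigenvalues show how many dimensions carry the variance. The two-factor structure of the data below is made up so the answer is known:

```python
import numpy as np

rng = np.random.default_rng(1)
# Four variables driven by two underlying factors (synthetic structure).
f1, f2 = rng.normal(size=(2, 200))
noise = 0.05 * rng.normal(size=(2, 200))
data = np.column_stack([f1, f1 + noise[0], f2, f2 + noise[1]])

S = np.cov(data, rowvar=False)            # variance-covariance matrix
eigvals, eigvecs = np.linalg.eigh(S)      # eigh returns ascending order
order = np.argsort(eigvals)[::-1]         # re-sort descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

proportions = eigvals / eigvals.sum()     # variance explained per component
components = data @ eigvecs               # uncorrelated linear combinations
```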

[Figure: dimensions A, B, C, and D underlying the variables V1 ... V12.]

Sometimes we might shift the status of "people" and "variables" in our analysis. Our interest might be in whether a smaller number of dimensions or clusters might underlie the larger collection of people.

Approaches such as multidimensional scaling and cluster analysis can address such questions. These are conceptually similar to principal components analysis, but on a transposed matrix.

Structural Equation Modeling
Examines both the structure of the data and predictive relationships simultaneously:
- Measurement model
- Structural model

Structural equation models examine the relations among latent variables. Two kinds of linear combinations are needed. [Path diagram: latent variables A, B, C, and D, with indicators A1-A2, B1-B2, C1-C2, and D1-D3.]

The key idea is that the original data matrix can be transformed using linear combinations to provide useful ways to summarize the data and to test hypotheses about how the data are structured. Sometimes the linear combinations are of variables and sometimes they are of people.

Once a linear combination is created, we also need to know something about its variability. Everything we need to know about the variability of a linear combination is contained in the variance-covariance matrix of the original variables (S). The weights that are applied to create a linear combination can also be applied to S to get the variance of that LC.

        V1     V2     V3     V4     V5
V1    s1^2   s12    s13    s14    s15
V2    s21    s2^2   s23    s24    s25
V3    s31    s32    s3^2   s34    s35
V4    s41    s42    s43    s4^2   s45
V5    s51    s52    s53    s54    s5^2

For example, the correlation between V1 and V2 is r12 = s12 / (s1^2 * s2^2)^(1/2).
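A sketch verifying both identities numerically, using random data and arbitrary weights:

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(size=(500, 5))            # 500 cases on V1 .. V5
S = np.cov(data, rowvar=False)              # variance-covariance matrix

w = np.array([1.0, 2.0, -1.0, 0.5, 0.0])    # arbitrary LC weights
lc = data @ w                               # the linear combination itself

# Var(LC) = w' S w: S contains everything needed about LC variability.
var_from_S = w @ S @ w
var_direct = np.var(lc, ddof=1)             # agrees with the quadratic form

# r12 = s12 / sqrt(s1^2 * s2^2), straight from the entries of S.
r12 = S[0, 1] / np.sqrt(S[0, 0] * S[1, 1])
```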