
Basics of Group Analysis
FreeSurfer Course, Spring 2016
Emily Lindemer

Outline
- Objective
- Types of questions
- Linear algebra review
- Statistics review
- Example

Objective
To create a model that can describe patterns of interactions and associations. The parameters of the model provide measures of the strength of those associations. A General Linear Model (GLM) estimates the parameters of the model so that it can be applied to new data sets to make reasonable inferences.

What kinds of questions are we interested in?
- Does a specific variable have a significant association with an outcome?
- If we control for the effects of a second variable, is the association still significant?
- Is there a group difference in outcome?
- Does a specific variable affect individual outcome differently between groups of individuals?

Linear algebra review (stay calm…)
[Figure: scatter plot of y vs. x with a best-fit line y = mx + b]

Linear algebra review (stay calm…)
We can put y = mx + b in matrix format:
- One row per data point
- Add a column of 1's for the offset term (b)
- One set of parameters

[y1]   [1  x1]
[y2] = [1  x2] * [b]
[y3]   [1  x3]   [m]
[y4]   [1  x4]

  y    Design      Regression
       Matrix     Coefficients
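To make this concrete, here is a minimal numpy sketch of building such a design matrix; the x and y values are hypothetical, not from the slides:

```python
import numpy as np

# Hypothetical data points (x, y) for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# Design matrix: a column of 1's for the offset term b,
# and a column holding x for the slope term m.
X = np.column_stack([np.ones_like(x), x])
print(X)
# [[1. 1.]
#  [1. 2.]
#  [1. 3.]
#  [1. 4.]]
```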

Matrix multiplication
Multiplying the design matrix by the regression coefficients recovers a system of linear equations, one per data point:
y1 = 1*b + x1*m
y2 = 1*b + x2*m
y3 = 1*b + x3*m
y4 = 1*b + x4*m

Error
BUT… if we use the same m and b for all data points, we will have errors (residuals):
y1 = 1*b + x1*m + r1
y2 = 1*b + x2*m + r2
y3 = 1*b + x3*m + r3
y4 = 1*b + x4*m + r4
GOAL: minimize the sum of the squared error terms when estimating our m and b terms. There are lots of ways to do this! (Beyond the scope of this talk, but FreeSurfer does it for you!)
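One standard way to minimize the squared errors is ordinary least squares. A minimal sketch, continuing the hypothetical x and y above (this is an illustration, not the FreeSurfer implementation):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])          # hypothetical predictor
y = np.array([2.1, 3.9, 6.2, 7.8])          # hypothetical outcome
X = np.column_stack([np.ones_like(x), x])   # design matrix [1, x]

# Ordinary least squares: find beta = [b, m] minimizing ||y - X @ beta||^2.
beta, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
b, m = beta
r = y - X @ beta                            # residuals r1..r4
print(f"b = {b:.3f}, m = {m:.3f}, sum of squared errors = {r @ r:.4f}")
```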

Forming a hypothesis
Now we can fit our parameters, but we need a hypothesis.
Simple example: Is there a significant association between age and memory?
Formal hypothesis: The slope of memory vs. age (m) is significantly different from zero.
Null hypothesis: m = 0
[Figure: scatter plot of memory vs. age with a fitted line]

Testing our hypothesis
Once we fit our model for the optimal regression coefficients (m and b), we need to test them for significance as well as test the direction of the effect.
We do this by forming something called a contrast matrix that isolates our parameter of interest. We can multiply our contrast matrix by our regression coefficient matrix to compute a variable g, which tells us the direction of our effect.
In this example the model is:

[memory1]   [1  age1]
[memory2] = [1  age2] * [b]
[memory3]   [1  age3]   [m]
[memory4]   [1  age4]

Since our hypothesis is about the slope m, we design our contrast matrix to be C = [0 1]:

g = [0 1] * [b] = m
            [m]

If g is negative, then the direction of our effect (slope) is also negative.
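In code, applying a contrast is just a dot product. A small sketch with hypothetical fitted coefficients:

```python
import numpy as np

beta = np.array([20.0, -0.5])   # hypothetical fit: [b, m]
C = np.array([0.0, 1.0])        # contrast isolating the slope m

g = C @ beta                    # direction of the effect
print(g)                        # -0.5: a negative slope
```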

Testing our hypothesis
To test significance, we compute a t-value: the contrast estimate g divided by its standard error,

t = g / sqrt(σ² * C * (X'X)⁻¹ * C')

where σ² is the variance of the residuals. This t-value corresponds to a p-value that depends on your sample size. The p-value is between 0 and 1; values closer to 0 indicate a more significant result.
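Putting the fit, contrast, and test together, here is a sketch of the whole procedure under standard ordinary-least-squares assumptions; the age and memory values are hypothetical:

```python
import numpy as np
from scipy import stats

age = np.array([65.0, 70.0, 75.0, 80.0])     # hypothetical ages
memory = np.array([30.0, 28.0, 27.0, 24.0])  # hypothetical memory scores

X = np.column_stack([np.ones_like(age), age])
beta, _, _, _ = np.linalg.lstsq(X, memory, rcond=None)

n, p = X.shape
resid = memory - X @ beta
sigma2 = resid @ resid / (n - p)             # residual variance, DOF = n - p

C = np.array([0.0, 1.0])                     # test the slope m
g = C @ beta
var_g = sigma2 * C @ np.linalg.inv(X.T @ X) @ C
t = g / np.sqrt(var_g)
p_value = 2 * stats.t.sf(abs(t), df=n - p)   # two-sided p-value
print(f"g = {g:.3f}, t = {t:.2f}, p = {p_value:.4f}")
```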

Putting it all together
- We used our empirical data to form a design matrix X.
- We fit regression coefficients (b and m) to our x, y data.
- We created a contrast matrix C to test our hypothesis for:
  - Direction of effect: g = C*β
  - Significance of effect: t-test

Example: using 2 groups
What if we have two groups? (e.g. does age affect memory differently in men and women?)
Real data:

Age   Sex   Memory Score
74    M     28
75    M
68    M     30
81    M     24
79    M     26
65    F
77    F     27
79    F
80    F
69    F     29

Example: using 2 groups
What will our linear equation look like now? How will this change our design matrix?

y = m1*x1 + b1 + m2*x2 + b2

Each group now gets its own set of regression coefficients, and the design matrix looks as follows (columns: female offset, male offset, female age, male age):

         F   M   F*age  M*age
        [0   1     0     74]
        [0   1     0     75]
        [0   1     0     68]
        [0   1     0     81]
X   =   [0   1     0     79]
        [1   0    65      0]
        [1   0    77      0]
        [1   0    79      0]
        [1   0    80      0]
        [1   0    69      0]

#columns = #groups * #parameters to estimate
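A sketch of how such a design matrix could be built in numpy from the age and sex columns, using the ages shown above:

```python
import numpy as np

age = np.array([74, 75, 68, 81, 79, 65, 77, 79, 80, 69], dtype=float)
is_male = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 0], dtype=float)
is_female = 1.0 - is_male

# Columns: female offset, male offset, female age slope, male age slope.
X = np.column_stack([is_female, is_male, is_female * age, is_male * age])
print(X.shape)   # (10, 4): #columns = #groups * #parameters = 2 * 2
```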

Example: using 2 groups
Solving for the regression coefficients with this design matrix gives:

[b1]   [45.5 ]
[b2] = [60.3 ]
[m1]   [-0.24]
[m2]   [-0.44]

where b1 and m1 are the offset and slope for women, and b2 and m2 are the offset and slope for men.

Example: using 2 groups
How do we test whether the slopes are significantly different?
Null hypothesis: the difference in slopes between men and women is zero.

Regression coefficients: β = [45.5  60.3  -0.24  -0.44]'
Contrast matrix: C = [0  0  1  -1]

g = C*β = m1 - m2 = -0.24 - (-0.44) = 0.20
t = 14.14 with DOF = 6 (10 data points minus 4 estimated parameters), giving p < 0.0001.

We reject the null hypothesis: the slopes differ significantly between men and women.
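A quick check of the direction-of-effect computation with the coefficients above; the p-value here uses the t-value and DOF reported on the slide, since the residual variance itself is not shown:

```python
import numpy as np
from scipy import stats

beta = np.array([45.5, 60.3, -0.24, -0.44])  # [b1, b2, m1, m2] from the fit
C = np.array([0.0, 0.0, 1.0, -1.0])          # slope difference: m1 - m2

g = C @ beta
print(f"g = {g:.2f}")                        # 0.20: memory declines more slowly with age in women

# With the t-value and DOF from the slide, the two-sided p-value follows:
p = 2 * stats.t.sf(14.14, df=6)
print(f"p = {p:.6f}")                        # well below 0.0001
```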

Next…
If you understand the basics of this talk, you can test much more complex hypotheses. We will cover more difficult examples in the formal group analysis talk later this morning.

Quiz: what would your design and contrast matrices look like if you wanted to test for a significant association between age and memory, while controlling for IQ?

Answer: add IQ as a column in the design matrix, and put the 1 in the contrast on the age column:

    [1  age1  IQ1]
X = [1  age2  IQ2]        C = [0  1  0]
    [1  age3  IQ3]
    [1  age4  IQ4]

QUESTIONS?
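As a sketch of the quiz answer in numpy (age and IQ values hypothetical):

```python
import numpy as np

age = np.array([70.0, 75.0, 80.0, 85.0])    # hypothetical ages
iq = np.array([110.0, 95.0, 105.0, 100.0])  # hypothetical IQ covariate

# Design matrix: offset, age, IQ; contrast isolates the age effect
# while the model accounts for IQ.
X = np.column_stack([np.ones_like(age), age, iq])
C = np.array([0.0, 1.0, 0.0])
print(X)
print(C)
```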