Analyzing Reading time data LabSyntax, 03/01/06 T. Florian Jaeger.


[2] Self-paced RT studies: a measure of processing complexity. Say we hypothesize that some supposedly ungrammatical wh-orders are actually just hard to process (cf. superiority violations). As part of this hypothesis, we predict that the accessibility of the wh-fillers and the accessibility of the interveners result in more processing at the integration site (the verb).

[3] Copyright The following slides refer to a data set (downloadable along with these slides) that has been collected by the WH-Research Group, Linguistics Department, Stanford University. Please do not use, cite, or distribute any results based on that dataset (data-accessibility.rtm) without our permission. or for more

[4] Follow along You can follow along with this tutorial in R by downloading the dataset and the R script from:
- The .cnd file is used to extract the results from Linger and to define the regions of interest.
- The .rtm file contains all the reading time data, including the practice items (see the Lingeralyzer documentation).
- The .r file contains the R script used here. I haven't documented things carefully, but with some R experience you should be able to figure things out.

[5] Input file
    # prin2n3 1 BARE_BARE
    Mary wondered what who read but later the teacher told her.
    ?Did Mary want to know what was painted? N
    # prin2n3 1 BARE_WHICH
    Mary wondered what which student read but later the teacher told her.
    ?Did Mary want to know what was painted? N
    # prin2n3 1 WHICH_BARE
    Mary wondered which book who read but later the teacher told her.
    ?Did Mary want to know what was painted? N
    # prin2n3 1 WHICH_WHICH
    Mary wondered which book which student read but later the teacher told her.
    ?Did Mary want to know what was painted? N
- Stimulus identifier: # experimentID itemID conditionID
- Stimulus (regions separated by “|”; default: word-by-word)
- Content question and answer (Y/N)

[6] Extracting results (.cnd)
    set COND_NAME "prin2n3 BARE_BARE"
    set ANOVA_FACTORS "WH1 WH2"
    set REGIONS {1:1-2 2:3-4 3:5-8 4:9-99}
    addCondition
    set COND_NAME "prin2n3 BARE_WHICH"
    set ANOVA_FACTORS "WH1 WH2"
    set REGIONS {1:1-2 2:3-5 3:6-9 4:10-99}
    addCondition
    set COND_NAME "prin2n3 WHICH_BARE"
    set ANOVA_FACTORS "WH1 WH2"
    set REGIONS {1:1-2 2:3-5 3:6-9 4:10-99}
    addCondition
    …
Example stimulus: Mary wondered what which student read but later the teacher told her.

[7] Output file

[8] Import into R
    # Read the Lingeralyzer output (adjust the path to your own copy of the file)
    data <- read.table("C:/Documents and Settings/tiflo/Desktop/CLASS/RT-example/data-accessibility.rtm")
    # Name the 14 columns of the .rtm file
    colnames(data) <- c("expt", "extraction", "attachment", "item", "subj", "order",
                        "position", "word", "region", "rt", "rtz", "resrt", "resrtz", "qa")
Let’s do some data exploration and cleaning.

[9] Testing the assumptions of ANOVA
- Homogeneity of variances: The variances of all conditions (and the variance of the error) are assumed to be identical. → Violations of this assumption are tolerable as long as the variances are correlated (cf. Howell, 1995:340-1).
- Normality: The dependent variable is assumed to be normally distributed within each condition. → ANOVA is relatively robust against violations of normality.
- Independence of observations: This assumption forces us to include subjects and items as factors. → Repeated measures ANOVA; mixed effect models.
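The homogeneity assumption can be inspected directly. The following is a minimal sketch on simulated data, not on the tutorial's dataset: the data frame `d`, the column `cond`, and all numbers are made up for illustration, and Bartlett's test is just one common choice that the slides themselves do not prescribe.

```r
# Simulated 2 x 2 design with 50 made-up log reading times per condition
set.seed(1)
d <- expand.grid(rep = 1:50,
                 filler = c("BARE", "WHICH"),
                 intervener = c("BARE", "WHICH"))
d$logRT <- rnorm(nrow(d), mean = 6, sd = 0.3)

# One factor with four levels, one per filler/intervener combination
d$cond <- interaction(d$filler, d$intervener)

# Variance of the dependent variable within each condition
cond.var <- tapply(d$logRT, d$cond, var)
print(cond.var)

# Bartlett's test of the null hypothesis that all condition variances are equal
bt <- bartlett.test(logRT ~ cond, data = d)
print(bt)
```

Bartlett's test is itself sensitive to non-normality, so it is probably best read alongside the per-condition variances rather than on its own.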

Normality

[11] Outlier exclusion

[12] Transformations Reading times should be log-transformed (this also works for magnitude estimation judgments).
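To see why the log transform helps, here is a small sketch on simulated reading times. The data are drawn from a log-normal distribution purely as an assumption for illustration, and `skew` is a hypothetical helper, not a function from the tutorial's script.

```r
set.seed(2)

# Made-up raw reading times in ms: log-normal, so right-skewed by construction
rt <- exp(rnorm(1000, mean = 6, sd = 0.4))
logRT <- log(rt)

# Simple moment-based skewness (hypothetical helper for this sketch)
skew <- function(x) mean((x - mean(x))^3) / sd(x)^3

skew.raw <- skew(rt)
skew.log <- skew(logRT)

# The raw RTs are clearly right-skewed; the log RTs are close to symmetric
print(c(raw = skew.raw, log = skew.log))
```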

[13] Normality check Within each condition, the dependent variable (logRT) is approximately normally distributed
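A per-condition normality check might look like the sketch below. It runs on simulated data (the data frame `d` and its values are assumptions), and the Shapiro-Wilk test is one possible check rather than the one used in the slides.

```r
set.seed(3)

# Simulated 2 x 2 design with 60 made-up log reading times per condition
d <- expand.grid(rep = 1:60,
                 filler = c("BARE", "WHICH"),
                 intervener = c("BARE", "WHICH"))
d$logRT <- rnorm(nrow(d), mean = 6, sd = 0.3)
d$cond <- interaction(d$filler, d$intervener)

# Shapiro-Wilk p-value for each of the four conditions
sw.p <- tapply(d$logRT, d$cond, function(x) shapiro.test(x)$p.value)
print(round(sw.p, 3))
```

For a visual check, `qqnorm()` and `qqline()` applied to the logRTs of a single condition give a quantile-quantile plot against the normal distribution.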

Independence

[15] A simple regression
    # Keep only the verb region (region 3) of experiment prin2n3
    data.verb <- subset(data.oe.clean, region == "3" & expt == "prin2n3")
    # Linear model with both main effects and their interaction
    lm <- lm(logRT ~ filler * intervener, data = data.verb)
    summary(lm)
Output:
                     Estimate  Std. Error  t value  Pr(>|t|)
    (Intercept)                                     < 2e-16 ***
    fillerBARE                                              **
    intervenerBARE                                          *
    filBARE:intBARE
    Multiple R-Squared: , Adjusted R-squared:
NB: Coefficients are given for logRT.
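Because the coefficients are on the log scale, it can help to back-transform them. The sketch below fits the same kind of model to simulated data; the effect sizes, the data frame `d`, and the name `fit` are all assumptions made for illustration.

```r
set.seed(4)

# Simulated 2 x 2 design, 100 made-up trials per condition
d <- expand.grid(rep = 1:100,
                 filler = c("WHICH", "BARE"),
                 intervener = c("WHICH", "BARE"))

# Make WHICH the reference level so the coefficients are named ...BARE
d$filler <- relevel(factor(d$filler), ref = "WHICH")
d$intervener <- relevel(factor(d$intervener), ref = "WHICH")

# Made-up effects on the log scale: +0.10 for a bare filler, +0.05 for a bare intervener
d$logRT <- 6 + 0.10 * (d$filler == "BARE") + 0.05 * (d$intervener == "BARE") +
  rnorm(nrow(d), sd = 0.3)

fit <- lm(logRT ~ filler * intervener, data = d)
print(coef(summary(fit)))

# exp() of the intercept is the baseline RT (geometric mean, in ms);
# exp() of every other coefficient is a multiplicative factor on the raw RT scale
print(exp(coef(fit)))
```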

[16] Overview

Overview

[18] Clusters in your data The assumption of independence is violated if clusters in your data are correlated:
- Several trials by the same subject
- Several trials of the same item
Do subjects really differ?

[19] Some example subjects
    lms1 <- lm(logRT ~ filler*intervener, data= data.verb, subset= subj== "1")
    lms2 <- lm(logRT ~ filler*intervener, data= data.verb, subset= subj== "2")
    lms3 <- lm(logRT ~ filler*intervener, data= data.verb, subset= subj== "3")
    coefficients(lms1)
    coefficients(lms2)
    coefficients(lms3)

[20] Three random subjects
    > coefficients(lms1)
    (Intercept) fillerBARE intBARE fillerBARE:intBARE
    > coefficients(lms2)
    (Intercept) fillerBARE intBARE fillerBARE:intBARE
    > coefficients(lms3)
    (Intercept) fillerBARE intBARE fillerBARE:intBARE
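Fitting one model per subject by hand gets tedious beyond a few subjects. A base-R sketch of the same idea, using `split()` and `sapply()` on simulated data, is shown below; the data frame `d`, the per-subject parameters, and the name `by.subj` are assumptions for illustration only.

```r
set.seed(5)
n.subj <- 6

# Simulated 2 x 2 design: 10 made-up trials per condition for each of 6 subjects
d <- expand.grid(trial = 1:10,
                 filler = c("WHICH", "BARE"),
                 intervener = c("WHICH", "BARE"),
                 subj = factor(1:n.subj))
d$filler <- relevel(factor(d$filler), ref = "WHICH")
d$intervener <- relevel(factor(d$intervener), ref = "WHICH")

# Each simulated subject gets their own baseline and their own filler effect
subj.int <- rnorm(n.subj, mean = 6, sd = 0.3)
subj.eff <- rnorm(n.subj, mean = 0.1, sd = 0.1)
d$logRT <- subj.int[d$subj] + subj.eff[d$subj] * (d$filler == "BARE") +
  rnorm(nrow(d), sd = 0.2)

# Fit the same model separately for every subject; one row of coefficients per subject
by.subj <- t(sapply(split(d, d$subj),
                    function(s) coef(lm(logRT ~ filler * intervener, data = s))))
print(round(by.subj, 2))
```

The spread of the intercepts and slopes across rows gives a first impression of how much subjects differ.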

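[21] Plotting data for all subjects (from Fox, 2002)
    library(lattice)  # provides xyplot() and the panel functions
    trellis.device(color = FALSE)
    # One panel per subject: raw points plus a dashed regression line
    xyplot(logRT ~ filler | subj, data = data.verb,
           main = "Verb logRTs", ylim = c(5, 7),
           panel = function(x, y) {
             panel.xyplot(x, y)
             # panel.loess(x, y, span = 1)
             panel.lmline(x, y, lty = 2)
           })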

[22]

[23] A more convenient way: lmList (in package lme4)
    lmList(formula = logRT ~ filler * intervener | subj, data = data.verb)
    Coefficients:
    (Intercept) fillerBARE intBARE fillerBARE:intBARE
    …

[24] Conclusion That’s why we do repeated measures or mixed effect analyses: to capture the differences between subjects as well as the commonalities of all trials by the same participant.

[25] Repeated Measures ANOVA in R
    # Mean logRT per subject and condition (F1) and per item and condition (F2)
    data.verb.F1 <- aggregate(logRT ~ subj + filler + intervener, data = data.verb, FUN = mean)
    data.verb.F2 <- aggregate(logRT ~ item + filler + intervener, data = data.verb, FUN = mean)
    # The grouping variable in Error() must be a factor, not a number
    data.verb.F1$subj <- factor(data.verb.F1$subj)
    data.verb.F2$item <- factor(data.verb.F2$item)
    # By-subject (F1) and by-item (F2) repeated measures ANOVAs
    F1 <- aov(logRT ~ filler*intervener + Error(subj/(filler*intervener)), data.verb.F1)
    F2 <- aov(logRT ~ filler*intervener + Error(item/(filler*intervener)), data.verb.F2)
    summary(F1)
    summary(F2)
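For readers without access to the dataset, the by-subject (F1) analysis can be tried on simulated data. This is only a sketch: the data frame `d`, the number of subjects and trials, and the effect sizes are assumptions, and only the aggregate-then-`aov` pattern mirrors the slide.

```r
set.seed(6)
n.subj <- 12

# Simulated 2 x 2 within-subject design: 8 made-up trials per subject and condition
d <- expand.grid(trial = 1:8,
                 filler = c("WHICH", "BARE"),
                 intervener = c("WHICH", "BARE"),
                 subj = factor(1:n.subj))

# Subject-specific baselines plus a made-up filler effect of 0.1 log units
subj.int <- rnorm(n.subj, mean = 6, sd = 0.3)
d$logRT <- subj.int[d$subj] + 0.1 * (d$filler == "BARE") + rnorm(nrow(d), sd = 0.2)

# One mean per subject and condition (the by-subject, or F1, aggregation)
d.F1 <- aggregate(logRT ~ subj + filler + intervener, data = d, FUN = mean)

# Repeated measures ANOVA with subject as the error stratum
F1 <- aov(logRT ~ filler * intervener + Error(subj / (filler * intervener)), data = d.F1)
print(summary(F1))
```

The by-item (F2) analysis follows the same pattern with an item column in place of `subj`.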