Analysis of Variance ANOVA and its terminology Within and between subject designs Case study Slide deck by Saul Greenberg. Permission is granted to use.

Slides:



Advertisements
Similar presentations
Controlled Experiments Analysis of Variance Lecture /slide deck produced by Saul Greenberg, University of Calgary, Canada Notice: some material in this.
Advertisements

The Keyboard Study Lecture /slide deck produced by Saul Greenberg, University of Calgary, Canada Notice: some material in this deck is used from other.
PSY 307 – Statistics for the Behavioral Sciences
Dr George Sandamas Room TG60
Quantitative Evaluation
The Two Factor ANOVA © 2010 Pearson Prentice Hall. All rights reserved.
January 7, afternoon session 1 Multi-factor ANOVA and Multiple Regression January 5-9, 2008 Beth Ayers.
Chapter 3 Analysis of Variance
Statistics for Managers Using Microsoft® Excel 5th Edition
CS 544 Experimental Design
Chapter 9 - Lecture 2 Computing the analysis of variance for simple experiments (single factor, unrelated groups experiments).
PSY 307 – Statistics for the Behavioral Sciences Chapter 19 – Chi-Square Test for Qualitative Data Chapter 21 – Deciding Which Test to Use.
Today Concepts underlying inferential statistics
Experimental Group Designs
Chapter 14 Inferential Data Analysis
Go to Table of ContentTable of Content Analysis of Variance: Randomized Blocks Farrokh Alemi Ph.D. Kashif Haqqi M.D.
1 CSCI 4163/6610 Statistics Acknowledgement: Most of the material in this lecture is based on material prepared for similar courses by Saul Greenberg (University.
Chapter 12 Inferential Statistics Gay, Mills, and Airasian
Psy B07 Chapter 1Slide 1 ANALYSIS OF VARIANCE. Psy B07 Chapter 1Slide 2 t-test refresher  In chapter 7 we talked about analyses that could be conducted.
ANOVA Chapter 12.
Basic Statistics Michael Hylin. Scientific Method Start w/ a question Gather information and resources (observe) Form hypothesis Perform experiment and.
 The “4 Steps” of Hypothesis Testing: 1. State the hypothesis 2. Set decision criteria 3. Collect data and compute sample statistic 4. Make a decision.
Copyright © 2013, 2010 and 2007 Pearson Education, Inc. Chapter Comparing Three or More Means 13.
Chapter 14: Repeated-Measures Analysis of Variance.
Analysis of Variance ( ANOVA )
Week 10 Chapter 10 - Hypothesis Testing III : The Analysis of Variance
Chapter 8 Experimental Design: Dependent Groups and Mixed Groups Designs.
Which Test Do I Use? Statistics for Two Group Experiments The Chi Square Test The t Test Analyzing Multiple Groups and Factorial Experiments Analysis of.
Single-Factor Experimental Designs
Between- Subjects Design Chapter 8. Review Two types of Ex research Two basic research designs are used to obtain the groups of scores that are compared.
t(ea) for Two: Test between the Means of Different Groups When you want to know if there is a ‘difference’ between the two groups in the mean Use “t-test”.
A Repertoire of Hypothesis Tests  z-test – for use with normal distributions and large samples.  t-test – for use with small samples and when the pop.
Factorial Analysis of Variance One dependent variable, more than one independent variable (“factor”) 1 2.
Psychology 301 Chapters & Differences Between Two Means Introduction to Analysis of Variance Multiple Comparisons.
1 Experimental Design. 2  Single Factor - One treatment with several levels.  Multiple Factors - More than one treatment with several levels each. 
Statistics (cont.) Psych 231: Research Methods in Psychology.
Statistics for Managers Using Microsoft Excel, 4e © 2004 Prentice-Hall, Inc. Chap 10-1 Chapter 10 Analysis of Variance Statistics for Managers Using Microsoft.
Chapter 10: Analyzing Experimental Data Inferential statistics are used to determine whether the independent variable had an effect on the dependent variance.
Educational Research Chapter 13 Inferential Statistics Gay, Mills, and Airasian 10 th Edition.
Human-Computer Interaction. Overview What is a study? Empirically testing a hypothesis Evaluate interfaces Why run a study? Determine ‘truth’ Evaluate.
Analysis of Variance (One Factor). ANOVA Analysis of Variance Tests whether differences exist among population means categorized by only one factor or.
Ledolter & Hogg: Applied Statistics Section 6.2: Other Inferences in One-Factor Experiments (ANOVA, continued) 1.
General Linear Model.
Chapter 13 Repeated-Measures and Two-Factor Analysis of Variance
IE241: Introduction to Design of Experiments. Last term we talked about testing the difference between two independent means. For means from a normal.
Introduction to ANOVA Research Designs for ANOVAs Type I Error and Multiple Hypothesis Tests The Logic of ANOVA ANOVA vocabulary, notation, and formulas.
1 Psych 5510/6510 Chapter 14 Repeated Measures ANOVA: Models with Nonindependent ERRORs Part 2 (Crossed Designs) Spring, 2009.
ANOVA Overview of Major Designs. Between or Within Subjects Between-subjects (completely randomized) designs –Subjects are nested within treatment conditions.
Oneway/Randomized Block Designs Q560: Experimental Methods in Cognitive Science Lecture 8.
Factorial Design of Experiments. An Economy of Design Two or more levels of each factor (variable) are administered in combination with the two or more.
Analysis of Variance ANOVA and its terminology Within and between subject designs Case study Slide deck by Saul Greenberg. Permission is granted to use.
Differences Among Groups
Saul Greenberg Analysis of Variance What terminology do I need to know to understand Anova? How can Anova handle within and between subject designs? A.
Educational Research Inferential Statistics Chapter th Chapter 12- 8th Gay and Airasian.
Inferential Statistics Psych 231: Research Methods in Psychology.
Choosing and using your statistic. Steps of hypothesis testing 1. Establish the null hypothesis, H 0. 2.Establish the alternate hypothesis: H 1. 3.Decide.
Oneway ANOVA comparing 3 or more means. Overall Purpose A Oneway ANOVA is used to compare three or more average scores. A Oneway ANOVA is used to compare.
The 2 nd to last topic this year!!.  ANOVA Testing is similar to a “two sample t- test except” that it compares more than two samples to one another.
CHAPTER 15: THE NUTS AND BOLTS OF USING STATISTICS.
Chapter 11 Analysis of Variance
Comparing Multiple Groups:
Chapter 12 Chi-Square Tests and Nonparametric Tests
Factorial Designs Q560: Experimental Methods in Cognitive Science Lecture 11.
Factorial Experiments
Controlled Experiments
An Introduction to Two-Way ANOVA
Applied Business Statistics, 7th ed. by Ken Black
Comparing Three or More Means
Comparing Multiple Groups: Analysis of Variance ANOVA (1-way)
STATISTICS INFORMED DECISIONS USING DATA
Presentation transcript:

Analysis of Variance ANOVA and its terminology Within and between subject designs Case study Slide deck by Saul Greenberg. Permission is granted to use this for non-commercial purposes as long as general credit to Saul Greenberg is clearly maintained. Warning: some material in this deck is used from other sources without permission. Credit to the original source is given if it is known.

Analysis of Variance (Anova) Statistical Workhorse –supports moderately complex experimental designs and statistical analysis –Lets you examine multiple independent variables at the same time –Examples: There is no difference between people’s mouse typing ability on the Random, Alphabetic and Qwerty keyboard There is no difference in the number of cavities of people aged under 12, between 12-16, and older than 16 when using Crest vs No-teeth toothpaste

Analysis of Variance (Anova) Terminology –Factor = independent variable –Factor level= specific value of independent variable Keyboard Qwerty Random Alphabetic Factor Toothpaste type CrestNo-teeth Age < >16 Factor level Factor

Anova terminology Factorial design –cross combination of levels of one factor with levels of another –eg keyboard type (3) x size (2) Cell –unique treatment combination –eg qwerty x large Qwerty Random Alphabetic Keyboard Size large small

Anova terminology Between subjects (aka nested factors) –subject assigned to only one factor level of treatment –control is general population –advantage: –guarantees independence i.e., no learning effects –problem: –greater variability, requires more subjects Qwerty S1-20 Random S21-40 Alphabetic S41-60 Keyboard different subjects in each cell

Anova terminology Within subjects (aka crossed factors) –subjects assigned to all factor levels of a treatment –advantages requires fewer subjects subjects act as their own control less variability as subject measures are paired –problems: order effects Qwerty S1-20 Random S1-20 Alphabetic S1-20 Keyboard same subjects in each cell

Anova terminology Order effects –within subjects only –doing one factor level affects performance in doing the next factor level, usually through learning Example: learning to mouse type on any keyboard likely improves performance on the next keyboard even if there was really no difference between keyboards: Alphabetic > Random > Qwerty performance S1: Q then R then A S2: Q then R then A S3: Q then R then A S4: Q then R then A …

Anova terminology Counter-balanced ordering –mitigates order problem –subjects do factor levels in different orders –distributes order effect across all conditions, but does not remove them Works only if order effects are equal between conditions –e.g., people’s performance improves when starting on Qwerty but worsens when starting on Random S1: Q then R then A q > (r < a) S2: R then A then Qr << a < q S3: A then Q then Ra < q < r S4: Q then A then R…q > (a < r) …

Anova terminology Mixed factor –contains both between and within subject combinations –within subjects: keyboard type –between subjects: size Qwerty Random Alphabetic Keyboard S1-20 S21-40 Size Large Small S1-20

Single Factor Analysis of Variance Compare means between two or more factor levels within a single factor example: –independent variable (factor): keyboard –dependent variable: mouse-typing speed Qwerty Alphabetic Random S1: 25 secs S2: 29 … S20: 33 S21: 40 secs S22: 55 … S40: 33 S51: 17 secs S52: 45 … S60: 23 Keyboard Qwerty Alphabetic Random S1: 25 secs S2: 29 … S20: 33 S1: 40 secs S2: 55 … S20: 43 S1: 41 secs S2: 54 … S20: 47 Keyboard between subject designwithin subject design

Anova Compares relationships between many factors In reality, we must look at multiple variables to understand what is going on Provides more informed results –considers the interactions between factors

Anova Interactions Example interaction –typists are: faster on Qwerty-large keyboards slower on the Alpha-small same on all other keyboards is the same –cannot simply say that one layout is best without talking about size Qwerty Random Alpha S1-S10 S11-S20 S21-S30 S31-S40 S41-S50 S51-S60 large small

Anova Interactions Example interaction –typists are faster on Qwerty than the other keyboards –non-typists perform the same across all keyboards –cannot simply say that one keyboard is best without talking about typing ability Qwerty Random Alpha S1-S10 S11-S20 S21-S30 S31-S40 S41-S50 S51-S60 non-typist typist

Anova - Interactions Example: –t-test: crest vs no-teeth subjects who use crest have fewer cavities –interpretation: recommend crest cavities 0 5 crest no-teeth Statistically different

Anova - Interactions Example: –anova: toothpaste x age subjects 14 or less have fewer cavities with crest. subjects older than 14 have fewer cavities with no-teeth. –interpretation? the sweet taste of crest makes kids use it more, while it repels older folks cavities 0 5 crestno-teeth age 0-6 age 7-14 age >14 Statistically different

Anova case study The situation –text-based menu display for large telephone directory –names listed as a range within a selectable menu item –users navigate menu until unique names are reached 1) Arbor- Kalmer 2) Kalmerson- Ulston 3) Unger- Zlotsky 1) Arbor- Farquar 2) Farston- Hoover 3) Hover- Kalmer 1) Horace - Horton 2) Hoster, James 3) Howard, Rex …

Anova case study The problem –we can display these ranges in several possible ways –expected users have varied computer experiences General question –which display method is best for particular classes of user expertise?

1) Arbor 2) Barrymore 3) Danby 4) Farquar 5) Kalmerson 6) Moriarty 7) Proctor 8) Sagin 9) Unger --(Zlotsky) -- (Arbor) 1) Barney 2) Dacker 3) Estovitch 4) Kalmer 5) Moreen 6) Praleen 7) Sageen 8) Ulston 9) Zlotsky 1) Arbor- Barney 2) Barrymore- Dacker 3) Danby- Estovitch 4) Farquar- Kalmer 5) Kalmerson- Moreen 6) Moriarty- Praleen 7) Proctor- Sageen 8) Sagin- Ulston 9) Unger- Zlotsky Range Delimeters FullLowerUpper

1) Arbor 2) Barrymore 3) Danby 4) Farquar 5) Kalmerson 6) Moriarty 7) Proctor 8) Sagin 9) Unger --(Zlotsky) 1) A 2) Barr 3) Dan 4) F 5) Kalmers 6) Mori 7) Pro 8) Sagi 9) Un --(Z) -- (Arbor) 1) Barney 2) Dacker 3) Estovitch 4) Kalmer 5) Moreen 6) Praleen 7) Sageen 8) Ulston 9) Zlotsky 1) Arbor- Barney 2) Barrymore- Dacker 3) Danby- Estovitch 4) Farquar- Kalmer 5) Kalmerson- Moreen 6) Moriarty- Praleen 7) Proctor- Sageen 8) Sagin- Ulston 9) Unger- Zlotsky -- (A) 1) Barn 2) Dac 3) E 4) Kalmera 5) More 6) Pra 7) Sage 8) Ul 9) Z 1) A- Barn 2) Barr- Dac 3) Dan- E 4) F- Kalmerr 5) Kalmers- More 6) Mori- Pra 7) Pro- Sage 8) Sagi- Ul 9) Un- Z Range Delimeters Truncation FullLowerUpper None Truncated

1) Arbor 2) Barrymore 3) Danby 4) Farquar 5) Kalmerson 6) Moriarty 7) Proctor 8) Sagin 9) Unger --(Zlotsky) 1) Danby 2) Danton 3) Desiran 4) Desis 5) Dolton 6) Dormer 7) Eason 8) Erick 9) Fabian --(Farquar) Wide Span Narrow Span Span as one descends the menu hierarchy, name suffixes become similar Span

Null Hypothesis –six menu display systems based on combinations of truncation and range delimiter methods do not differ significantly from each other as measured by people’s scanning speed and error rate –menu span and user experience has no significant effect on these results –2 level (truncation) x 2 level (menu span) x 2 level (experience) x 3 level (delimiter) S1-8 Novice S9-16 Expert S17-24 Novice S25-32 Expert S33-40 Novice S40-48 Expert Full Upper Lower narrowwidenarrowwide TruncatedNot Truncated

Statistical results Scanning speed F-ratio.p Range delimeter (R)2.2*<0.5 Truncation (T)0.4 Experience (E)5.5*<0.5 Menu Span (S)216.0**<0.01 RxT0.0 RxE1.0 RxS3.0 TxE1.1 Trunc. X Span14.8*<0.5 ExS1.0 RxTxE0.0 RxTxS1.0 RxExS1.7 TxExS0.3 RxTxExS0.5

Statistical results Scanning speed: Truncation x SpanMain effects (means) Results on Selection time Full range delimiters slowest Truncation has very minor effect on time: ignore Narrow span menus are slowest Novices are slower speed 4 6 wide narrow not truncated truncated FullLowerUpper Full *1.31* Lower Upper---- Span: Wide4.35 Narrow5.54 Experience Novice5.44 Expert4.36

Statistical results Error rate F-ratio.p Range delimeter (R)3.7*<0.5 Truncation (T)2.7 Experience (E)5.6*<0.5 Menu Span (S)77.9**<0.01 RxT1.1 RxE4.7* <0.5 RxS5.4* <0.5 TxE1.2 TxS1.5 ExS2.0 RxTxE0.5 RxTxS1.6 RxExS1.4 TxExS0.1 RxTxExS0.1

Statistical results Error rates Range x ExperienceRange x Span Results on Errors –more errors with lower range delimiters at narrow span –truncation has no effect on errors –novices have more errors at lower range delimiter novice errors 0 16 full upper expert lower errors 0 16 wide narrow lower upper full

Conclusions Upper range delimiter is best Truncation up to the implementers Keep users from descending the menu hierarchy Experience is critical in menu displays

You now know Anova terminology –factors, levels, cells –factorial design between, within, mixed designs You should be able to: Find a paper in CHI proceedings that uses Anova Draw the Anova table, and state dependant variables independant variables / factors factor levels between/within subject design