Qualitative vs. Quantitative


Qualitative vs. Quantitative (slides taken from Ben Bederson)
  Qualitative: Develop understanding of human experience
  Quantitative: Objectively measure human performance
  Less about more vs. more about less
  When is each appropriate?
  The act of getting a PhD is to know absolutely everything about nothing

Quantitative Evaluation
  Gather (performance) measurements
  Methods
    User interaction collection (i.e., logging)
      Mouse clicks, keys pressed, …
      Data collected during system use
      Google, Amazon
    Controlled experiments
      Set forth a testable hypothesis
      Manipulate one or more independent variables
      Observe effect on one or more dependent variables
      Can be reproduced by others
  Goal of controlled experiments is to know that results aren’t due to chance
  Example: Does caffeine intake affect the ability of people to recall a list of words?
    Independent variable: caffeine intake
    Dependent variable: ability to recall the words

Controlled experiment
  State a lucid, testable hypothesis
  Identify independent and dependent variables
  Design the experimental protocol
  Choose the user population
  Apply for human subjects protocol review (IRB)
  Run some pilot participants
  Fix the experimental protocol
  Run the experiment
  Perform statistical analysis
  Draw conclusions
  Communicate results

Experiment Design
  Is it reliable (repeatable)?
    Will you get the same result if someone else repeats the experiment?
    Confounding variables?
    Does the experiment take into account variations between subjects?
  Is it valid?
    Does the experiment reflect target use?
    Were users typical?
    Were tasks typical?
    Was the setting realistic?
    Was the experience biased?
  Non-reliable:
    * Time of day biased results (confounding variable)
  Invalid:
    * Undergrads not representative of actual market
    * Clicking rapidly between items disregarded time to think about how to use calculator

Are results significant?
  Statistical significance
    Is the observed result due to chance?
    Type I errors are the most disruptive
  Design significance?
    3.00s versus 3.05s?
  Researcher’s Decision    Actual Situation: No effect    Actual Situation: Effect
  No effect                Correct decision               Type II error
  Effect                   Type I error                   Correct decision

Are results significant?
  Statistical significance
    Comparing to the null hypothesis: “There is no effect”
    Type I errors are the most disruptive
  Design significance?
    3.00s versus 3.05s?
  Researcher’s Decision                 Null Hypothesis is True    Null Hypothesis is False
  Fail to reject the null hypothesis    Correct decision           Type II error
  Reject the null hypothesis            Type I error               Correct decision
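
The four outcomes of comparing a decision against the null hypothesis can be written as a small lookup. This is only an illustrative sketch; the function name `classify_outcome` and its parameter names are assumptions made for this example and do not come from the slides.

```python
def classify_outcome(null_is_true: bool, reject_null: bool) -> str:
    """Name the outcome of a test decision (hypothetical helper)."""
    if reject_null:
        # Rejecting a true null hypothesis is a false alarm.
        return "Type I error" if null_is_true else "Correct decision"
    # Failing to reject a false null hypothesis is a missed effect.
    return "Correct decision" if null_is_true else "Type II error"


for null_is_true in (True, False):
    for reject_null in (False, True):
        print(null_is_true, reject_null, classify_outcome(null_is_true, reject_null))
```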

Example
  Examine value of animation during scrolling [Klein & Bederson 2005]
  http://portal.acm.org/citation.cfm?doid=1056808.1057068
  Various reading tasks
  For various document types
  While scrolling with varying kinds of animated transitions

State a lucid, testable hypothesis
  “Animated scrolling will speed up reading and decrease errors, especially for plain documents.”

Choose the variables
  Manipulate one or more independent variables (the thing you change)
    Document type
    Animation speed
  Observe effect on one or more dependent variables (the thing you measure)
    Time to completion
    Accuracy (i.e., error rate)

Design the experimental protocol
  Between or within subjects?
    Between subjects: each subject runs one condition
      - Need more subjects
      - Differences between subjects might introduce a bias
      + No learning effects across conditions
    Within subjects: each subject runs several conditions
      + Need fewer subjects
      + No bias across participants
      - Possible learning effect across conditions
  Which tasks?
    Must reflect the hypothesis
    Must avoid bias
      Instructions, ordering, …
    In doubt, always favor the null hypothesis
  How might you handle learning effects?
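
The slides leave the learning-effects question open. One common approach in within-subjects designs is counterbalancing: rotating the order of conditions across participants so that each condition appears in each position equally often. The sketch below is an assumption about how that could be done, not the protocol used in the running example; the helper names and the condition labels are hypothetical.

```python
def rotated_orders(conditions):
    """Return one rotated ordering per condition (a simple Latin square)."""
    n = len(conditions)
    return [conditions[i:] + conditions[:i] for i in range(n)]


def order_for(participant_index, conditions):
    """Assign orderings to participants in round-robin fashion."""
    orders = rotated_orders(conditions)
    return orders[participant_index % len(orders)]


# Hypothetical condition labels, for illustration only.
conditions = ["no animation", "fast animation", "slow animation"]
for participant in range(6):
    print(participant, order_for(participant, conditions))
```

A simple rotation balances position but not which condition precedes which; a fully balanced design needs more orderings.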

Design the experimental protocol
  Running Example: Reading while scrolling
  Hypothesis: landmarks will help, and thus decrease the benefit of animation
  2 reading tasks, 1 counting task

Choose the user population
  Pick a well-balanced sample
    Novices, experts, average
    Age group
    Gender, …
  Population group may be one of the independent variables
  Running example
    Varied population, did not control

Run the experiment
  Always run pilots first!
    There are always unexpected problems!
  When the experiment has started you cannot pick and choose
  Use a check-list so that all subjects follow the same steps
  IRB - Don’t forget the consent form!
  Don’t forget to debrief each subject
  People not comfortable with apparatus
  People tricking the logging mechanism

Run statistical analysis
  Properties of our result data
    Mean, variance, …
  How different data sets relate to each other
    Are we sampling from similar or different distributions?
  How confident we can be in our claims
    Statistical significance: “Technique X is significantly faster (p < .05)” means that, if there were truly no difference, a result at least this extreme would occur less than 5% of the time
    It does not mean there is a 95% chance that the hypothesis is true
    Typical levels are .05 and .01
    These levels are socially determined
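
To make the meaning of a p-value concrete: it is the probability, assuming the null hypothesis is true, of seeing a difference at least as large as the one observed. The sketch below computes this exactly for two tiny samples with a permutation test, which is a different technique from the t-test named later in these slides and is used here only because it needs no distribution tables. The function name and the sample values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(sample_a, sample_b):
    """Two-sided exact permutation p-value for a difference in means."""
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    observed = abs(mean(sample_a) - mean(sample_b))
    indices = range(len(pooled))
    extreme = 0
    total = 0
    # Under the null hypothesis the group labels are arbitrary,
    # so every split of the pooled data is equally likely.
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical task times for two techniques.
print(exact_permutation_p([1, 2, 3], [4, 5, 6]))
```

Enumerating every split is only practical for very small samples; larger studies would sample random relabelings or use a parametric test.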

Statistical tools I - Descriptive
  Mean: average
  Median: central value
  Standard deviation: measure of variation
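
These three descriptive statistics are available in Python's standard `statistics` module. The sketch below wraps them in a hypothetical `describe` helper; the task times are made-up values for illustration, and `stdev` is the sample (n - 1) standard deviation.

```python
from statistics import mean, median, stdev


def describe(values):
    """Return mean, median and sample standard deviation (hypothetical helper)."""
    return {
        "mean": mean(values),
        "median": median(values),
        "sd": stdev(values),  # sample standard deviation, divides by n - 1
    }


# Hypothetical task completion times in seconds.
times = [3.1, 2.8, 3.4, 3.0, 2.7]
print(describe(times))
```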

Statistical tools I - Descriptive
  Correlation
    Measures the extent to which 2 concepts are related
  Caveats
    Correlation does not imply cause and effect (hidden variable)
      Ice cream consumption and drowning
    Third variable problem
    Directionality problem
    Need a large enough group
  r = 0 (no correlation)
  r = 1 (perfect positive correlation)
  r = -1 (perfect negative correlation)

Example (random data)
  Height    Self Esteem
  68        4.1
  71        4.6
  62        3.8
  75        4.4
  58        3.2
  60        3.1
  67        3.8
  71        4.3
  69        3.7
  68        3.5
  67        3.2
  63        3.7
  62        3.3
  60        3.4
  63        4.0
  65        4.1
  63        3.4
  61        3.6

  Mean      65.4      3.8
  Median    66.0      3.8
  SD        4.4       0.4

Example (continued)
  Correlation: r = 0.73
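
For readers who want to see how r is obtained, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The function name `pearson_r` and the small data sets are assumptions chosen to show the three reference values (1, -1 and 0); they are not the slide's height and self-esteem data.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    # Sum of co-deviations and sums of squared deviations.
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs)
    sy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(sx * sy)


# Hypothetical data illustrating the extremes.
print(pearson_r([1, 2, 3], [2, 4, 6]))         # perfectly positive
print(pearson_r([1, 2, 3], [6, 4, 2]))         # perfectly negative
print(pearson_r([1, 2, 3, 4], [1, -1, -1, 1])) # no linear relationship
```

The function raises a division error if either sequence is constant, since correlation is undefined in that case.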

Statistical tools II - Analytical
  T-test
    Compare the means of 2 populations
    Null hypothesis: no difference between means
    Can only examine a single independent variable
  Assumptions
    Samples are normally distributed
      Very robust in practice
    Population variances are equal
      Reasonably robust for differing variances
    Individual observations in samples are independent
      Very important
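
As a rough illustration of what the test computes, the sketch below calculates the t statistic for two independent samples using the pooled variance, which matches the equal-variance assumption listed above. The function name `pooled_t` and the sample values are hypothetical; turning t into a p-value requires the t distribution (for example from a statistics package), which is left out here.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for an equal-variance two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


# Hypothetical scores for two conditions.
t, df = pooled_t([1, 2, 3], [4, 5, 6])
print(t, df)
```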

T-Tests
  Increase statistical power by:
    Increasing number of participants
    Decreasing variance
    Running 1-tail instead of 2-tail test
  Two-tail: Effect found if the value of the statistic is either sufficiently large or small.
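
The first two points follow from the standard error of a mean, which is the standard deviation divided by the square root of the sample size: more participants or less variance shrink it, so the same difference in means yields a larger test statistic. A minimal sketch, with a hypothetical helper name and made-up numbers:

```python
from math import sqrt


def standard_error(sd, n):
    """Standard error of the mean for a sample of size n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sd / sqrt(n)


# Quadrupling the number of participants halves the standard error.
for n in (4, 16, 64):
    print(n, standard_error(2.0, n))
```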

Statistical tools II - Analytical
  ANOVA
    Single factor analysis of variance
      Compare three or more means
    Analysis of variance
      Compare relationship between many factors
      Example: beginners type at the same speed on all keyboards; touch-typists type fastest on QWERTY
  Your protocol influences the kind of test you can use
  If in doubt, consult with a statistician before starting the experiment!
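
For the single-factor case, the F statistic is the ratio of between-group to within-group variability. The sketch below computes it in plain Python under that definition; the function name `one_way_anova_f` and the group values are hypothetical, and converting F to a p-value requires the F distribution, which is omitted.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    all_values = [v for g in groups for v in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variability of observations around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


# Hypothetical typing speeds on three keyboards.
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]]))
```

The sketch does not handle the degenerate case where every group has zero internal variance.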

Reporting Results
  T-TEST
    “There was a significant difference in the scores for sugar (M=4.2, SD=1.3) and no sugar (M=2.2, SD=0.84) conditions; p = 0.020.”
  ANOVA
    “As one would expect, movement times increased as either W decreased or D increased (i.e., as the task got more difficult: for W, F(2,25)=801, p<0.001; and for D, F(3,54)=1429, p<0.001).”

Running example data
  [Figure: error bars show +/- one standard deviation]

Running Example Analysis
  Measured speed and error
  Significant results for
    reading time
    reading error
    counting time
  Animated scrolling reduces
    reading errors by up to 54%
    task time by up to 3%
    task time by up to 24% for counting tasks
  300 ms seems to be the best overall animation time for reading.
  Formatted documents had a higher base rate performance and lower improvement, so visual landmarks appear to be a good idea