ICS 463, Intro to Human Computer Interaction Design: 9. Experiments Dan Suthers.

Experiments and Usability Tests

Preview:

Controlled Experiments
– Control all but one variable
– Very informative about specific issues
– Extremely resource intensive
– Questionable ecological validity

Usability Tests
– Less rigorous in terms of variables controlled
– Informative about selected benchmarks
– Somewhat resource intensive
– Better, but still questionable, ecological validity

Experiments

Decide why you are doing it
Pick a testable hypothesis
Define a method
– Subjects
– Materials
– Procedures and instructions
Choose statistical tests (see the sketch below)
Interpret results
Pilot study!
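
As a concrete illustration of the "Choose statistical tests" step, here is a minimal Python sketch comparing task-completion times under two conditions with an independent-samples t-test. The data values and variable names are hypothetical, and the right test depends on your design:

    # Hypothetical task-completion times (seconds) under two conditions.
    from scipy import stats

    times_a = [41.2, 38.5, 44.0, 39.7, 42.3, 40.1]  # Condition A
    times_b = [35.8, 33.1, 36.9, 34.4, 37.2, 35.0]  # Condition B

    # Independent-samples t-test: is the difference in means reliable?
    t_stat, p_value = stats.ttest_ind(times_a, times_b)
    print(f"t = {t_stat:.2f}, p = {p_value:.4f}")

For a within-subjects design, stats.ttest_rel (a paired t-test) would be used instead.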

Variables

Independent variable
– Hypothesized causal factor
– What you modify
– “The input to the organism”

Dependent variable
– Hypothesized effect
– What you measure
– “The output from the organism”
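
To make the distinction concrete, here is a minimal sketch of one trial record from a hypothetical menu-design study (all names and values are illustrative): menu style is the IV the experimenter sets, and selection time is the DV that gets measured.

    # One trial record: the IV is what we set, the DV is what we measure.
    trial = {
        "participant": "P07",         # subject identifier
        "menu_style": "pull-down",    # independent variable (manipulated input)
        "selection_time_ms": 842,     # dependent variable (measured output)
    }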

Subjects or Participants

Must balance for …
– Age
– Gender
– Prior experience
– Aptitude

Consider incentives to participate

Obtain informed consent
– Aware of risks and benefits
– Option to quit at any time

Experimental Designs at a Glance

Between subjects: each experimental condition has different subjects
Within subjects: each condition has the same subjects
One-way, 2x2, etc.

Between Subjects Designs

Between-subjects designs are good for avoiding cross-treatment effects, but they require more subjects and raise the question of whether the groups are equivalent. Two ways to make the groups comparable:

Independent subject design: randomly assign subjects to experimental conditions (see the sketch below)
Matched subject design: construct similar groups by matching subjects on relevant characteristics
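
A minimal sketch of the independent (random) assignment approach, assuming a hypothetical participant list:

    import random

    participants = ["P01", "P02", "P03", "P04", "P05", "P06", "P07", "P08"]

    # Independent subject design: shuffle, then split into equal-sized groups.
    random.shuffle(participants)
    half = len(participants) // 2
    condition_a = participants[:half]
    condition_b = participants[half:]
    print("Condition A:", condition_a)
    print("Condition B:", condition_b)

A matched design would instead pair subjects on the balancing variables (age, experience, etc.) and send one member of each pair to each group.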

Within Subjects Designs

Pre-post test: measure before and after treatment
– Problem: confounds with time on task

Repeated measures: each subject is tested in both experimental conditions
– Problem: time-on-task and cross-condition (carryover) effects
– Solution: counterbalance the order in which conditions are given (see the sketch below)
   Half the subjects: Condition A, then Condition B
   Other half: Condition B, then Condition A
– Can get complicated with more conditions
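
A minimal sketch of counterbalancing with two conditions, alternating the order across successive (hypothetical) participants so each order is used for half of them:

    participants = ["P01", "P02", "P03", "P04", "P05", "P06"]
    orders = [("A", "B"), ("B", "A")]

    # Alternate A-then-B and B-then-A across participants.
    schedule = {p: orders[i % 2] for i, p in enumerate(participants)}
    for p, order in schedule.items():
        print(p, "->", order)

With more conditions, a Latin square (or all permutations, if feasible) plays the same role.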

2x2 Designs

– Condition A, Condition 1
– Condition A, Condition 2
– Condition B, Condition 1
– Condition B, Condition 2

Good for finding interactions between two variables (e.g., using ANOVA; see the sketch below)
Balancing the order of treatments gets complex
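
As an illustration of testing for an interaction in a 2x2 design, here is a sketch of a two-way ANOVA using the statsmodels formula interface; the factors, data values, and column names are all hypothetical:

    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.formula.api import ols

    # Hypothetical 2x2 data: device (A/B) crossed with task (1/2).
    data = pd.DataFrame({
        "device": ["A"] * 6 + ["B"] * 6,
        "task": (["1"] * 3 + ["2"] * 3) * 2,
        "time": [40, 42, 38, 55, 53, 57,   # device A: task 1, then task 2
                 44, 46, 43, 48, 50, 47],  # device B: task 1, then task 2
    })

    # The C(device):C(task) row of the ANOVA table tests the interaction.
    model = ols("time ~ C(device) * C(task)", data=data).fit()
    print(sm.stats.anova_lm(model, typ=2))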

Critiquing Procedure

Critique your own procedure before you run it!
– Instructions
– Amount of practice
– Users’ interpretation of the IV
– Sufficient but not excessive task complexity
– Users understand the tasks
– Sufficient but not excessive time on task

Critiquing Results

Is the size of the effect meaningful? (see the effect-size sketch below)
Are there alternative explanations?
Are the results consistent?
Compare dependent variables
How general are the results?
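
One way to put a number on "size of effect" is a standardized effect size such as Cohen's d; here is a minimal sketch with hypothetical sample data:

    from statistics import mean, stdev

    def cohens_d(group1, group2):
        # Pooled-standard-deviation form of Cohen's d.
        n1, n2 = len(group1), len(group2)
        s1, s2 = stdev(group1), stdev(group2)
        pooled = (((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)) ** 0.5
        return (mean(group1) - mean(group2)) / pooled

    # Roughly: 0.2 is a small effect, 0.5 medium, 0.8 large (Cohen's rules of thumb).
    print(cohens_d([41.2, 38.5, 44.0, 39.7], [35.8, 33.1, 36.9, 34.4]))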

Usability Engineering

Instead of comparing treatment groups, specify quantitative objectives on a set of measures, then test the interface on a representative set of users until those objectives are met.

Less concerned with experimental design and controlling variables
More oriented towards design

Doing Usability Engineering

Define usability goals
– Choose metrics
– Set quantitative levels to achieve (see the sketch below)

Design: use available information to make the best choice
– Analyze design solutions with respect to these metrics
– Incorporate user-derived feedback into the design

Evaluate the resulting design against the metrics

Iterate through Design and Evaluate until the metric objectives are achieved
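
The "set quantitative levels" step can be made concrete as a table of metrics and target levels that each evaluation round is checked against. A sketch, with entirely hypothetical metric names and targets:

    # Hypothetical usability goals: metric -> (target level, direction).
    goals = {
        "mean_task_time_s":   (30.0, "max"),  # at most 30 s per benchmark task
        "error_rate":         (0.05, "max"),  # at most 5% errors
        "satisfaction_score": (4.0,  "min"),  # at least 4 on a 5-point scale
    }

    measured = {"mean_task_time_s": 34.2, "error_rate": 0.03,
                "satisfaction_score": 4.1}

    def goals_met(goals, measured):
        # Check each measured value against its target; iterate design until all pass.
        results = []
        for metric, (target, kind) in goals.items():
            value = measured[metric]
            ok = value <= target if kind == "max" else value >= target
            print(f"{metric}: {value} (target {kind} {target}) -> {'met' if ok else 'NOT met'}")
            results.append(ok)
        return all(results)

    print("All goals met:", goals_met(goals, measured))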

More on Usability Engineering

Benchmark tasks: a standard, representative set of tasks used in each iteration of evaluation
Attitude metrics may also be used
Design may require tradeoffs between measurable objectives
This method is attractive to organizations that want measurable objectives, but it is criticized for lacking ecological validity; field observations can compensate.