Controlled Experiments

Controlled Experiments Part 1: Introduction. Lecture/slide deck produced by Saul Greenberg, University of Calgary, Canada. Notice: some material in this deck is used from other sources without permission. Credit to the original source is given if it is known.

Outline
- Terminology
- What is experimental design?
- What is an experimental hypothesis?
- How do I plan an experiment?
- Why are statistics used?
- What are the important statistical methods?

Quantitative evaluation of systems
- precise measurement, numerical values
- bounds on how correct our statements are
Methods
- user performance data collection
- controlled experiments

Collecting user performance data
Data is collected on system use (often lots of data).
- Exploratory: hope something interesting shows up (e.g., patterns), but can be difficult to analyze
- Targeted: look for specific information, but may miss something
  - frequency of requests for on-line assistance (what did people ask for help with?)
  - frequency of use of different parts of the system (why are parts of the system unused?)
  - number of errors and where they occurred (why does an error occur repeatedly?)
  - time it takes to complete some operation (what tasks take longer than expected?)
A small sketch of answering such targeted questions from an interaction log follows.
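As a hedged illustration (not part of the original deck), the sketch below shows how targeted questions like these might be answered from a simple interaction log. The log format, event names, and field layout are invented for this example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical log lines: "timestamp<TAB>user<TAB>event<TAB>detail"
log_lines = [
    "2024-01-10T09:00:01\tu01\thelp_request\tprinting",
    "2024-01-10T09:00:05\tu01\ttask_start\tsave_file",
    "2024-01-10T09:00:19\tu01\ttask_end\tsave_file",
    "2024-01-10T09:01:02\tu02\thelp_request\tprinting",
]
events = [line.split("\t") for line in log_lines]

# Frequency of requests for on-line assistance, broken down by topic
help_topics = Counter(detail for _, _, event, detail in events if event == "help_request")
print("Help requests by topic:", help_topics)

# Time to complete an operation: pair task_start/task_end per (user, task)
starts = {}
for ts, user, event, detail in events:
    t = datetime.fromisoformat(ts)
    if event == "task_start":
        starts[(user, detail)] = t
    elif event == "task_end" and (user, detail) in starts:
        duration = (t - starts.pop((user, detail))).total_seconds()
        print(f"{user} completed {detail} in {duration:.0f} s")
```

In a real study the log would come from instrumentation in the system under test; the analysis logic stays much the same.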

Logging example: how people navigate with web browsers. From: Tauscher, L. and Greenberg, S. (1997). How People Revisit Web Pages: Empirical Findings and Implications for the Design of History Systems. International Journal of Human-Computer Studies, 47(1):97-138.

Controlled experiments
- Traditional scientific method
- Reductionist: clear, convincing results on specific issues
In HCI:
- insights into cognitive processes, human performance limitations, ...
- allows system comparison and fine-tuning of details, ...

example Which toothpaste is best? Images from http://www.futurederm.com/wp-content/uploads/2008/06/060308-toothpaste.jpg and http://4.bp.blogspot.com/_i2tTNonulCM/R7t3T7qDxTI/AAAAAAAAAB0/JrUU1wJMeFo/s400/ist2_2301636_tooth_paste[1].jpg

A) Lucid and testable hypothesis
State a lucid, testable hypothesis: this is a precise problem statement.
Example: There is no difference in the number of cavities in children and teenagers using Crest and No-teeth toothpaste when brushing daily over a one-month period.

Independent variables (IVs)
b) The hypothesis includes the independent variables (IVs) that are to be altered
- the things you manipulate independent of a subject's behaviour
- determine a modification to the conditions the subjects undergo
- may arise from subjects being classified into different groups

Independent variables (IVs) in toothpaste experiment
There is no difference in the number of cavities in children and teenagers using Crest and No-teeth toothpaste when brushing daily over a one-month period.
- IV1: toothpaste type: Crest or No-teeth
- IV2: age: <= 11 years or > 11 years

Dependent variables (DVs)
c) The hypothesis includes the dependent variables (DVs) that will be measured
- variables dependent on the subject's behaviour / reaction to the independent variable
- the specific things you set out to quantitatively measure / observe

Dependent variables (DVs) in toothpaste experiment
There is no difference in the number of cavities in children and teenagers using Crest and No-teeth toothpaste when brushing daily over a one-month period.
- DV: number of cavities
Other things we could have measured: frequency of brushing, preference.

Subject selection
d) Judiciously select and assign subjects to groups
Ways of controlling subject variability:
- a reasonable number of subjects
- random assignment (see the sketch below)
- make different user groups (e.g., novice vs. expert) an independent variable
- screen for anomalies in the subject group (superstars versus poor performers)
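A minimal sketch of random assignment (my own illustration, not from the deck; the subject IDs and group names are made up):

```python
import random

subjects = [f"S{i:02d}" for i in range(1, 21)]   # 20 hypothetical participants
groups = ["Crest", "No-teeth"]

random.seed(42)           # fixed seed so the assignment is reproducible for reporting
random.shuffle(subjects)  # random order removes experimenter choice from assignment

# Deal shuffled subjects into groups round-robin so group sizes stay balanced
assignment = {g: subjects[i::len(groups)] for i, g in enumerate(groups)}
for g, members in assignment.items():
    print(g, members)
```

Screening for anomalies or stratifying by expertise would happen before the shuffle.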

Controlling bias
e) Control for bias
- unbiased instructions
- unbiased experimental protocols (prepare scripts ahead of time)
- unbiased subject selection
Example of biased instructions: "Now you get to do the pop-up menus. I think you will really like them... I designed them myself!"

Statistical analysis
f) Apply statistical methods to data analysis
- confidence limits: the confidence that your conclusion is correct
- "the hypothesis that computer experience makes no difference is rejected at the .05 level" means there is a 95% chance that your statement is correct and a 5% chance you are wrong
A minimal test of this kind is sketched below.
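A minimal sketch of such a test, assuming SciPy is available; the cavity counts below are made-up numbers used only to show the mechanics, not real data:

```python
from scipy import stats  # assumes SciPy is installed

# Made-up illustrative numbers: cavities per child in each toothpaste group
crest    = [1, 0, 2, 1, 3, 0, 1, 2, 1, 0]
no_teeth = [2, 3, 1, 4, 2, 3, 2, 1, 3, 2]

t, p = stats.ttest_ind(crest, no_teeth)   # independent-samples t-test
print(f"t = {t:.2f}, p = {p:.3f}")

if p < 0.05:
    print("Reject the null hypothesis at the .05 level")
else:
    print("Cannot reject the null hypothesis at the .05 level")
```

Which test is appropriate depends on the experimental design and the data collected; the statistical methods themselves are the subject of the later parts of this lecture series.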

Interpretation
g) Interpret your results
- what you believe the results really mean
- their implications for your research
- their implications for practitioners
- how generalizable they are
- limitations and critique

Planning flowchart for experiments (image reproduced from an early ACM CHI tutorial, but I cannot recall which one):
Stage 1: Problem definition - preliminary idea, literature review, statement of problem, hypothesis development
Stage 2: Planning - define variables, controls, apparatus, procedures, select subjects, experimental design
Stage 3: Conduct research - data collection
Stage 4: Analysis - data reductions, statistics, hypothesis testing
Stage 5: Interpretation - interpretation, generalization, reporting
(with feedback loops back to earlier stages)

More examples

example: Which menu should we use? (Two candidate designs show the same File / Edit / View / Insert menu with items New, Open, Close, Save: one as a pop-up menu, one as a pull-down menu.)

A) Lucid and testable hypothesis
Example 2: There is no difference in user performance (time and error rate) when selecting a single item from a pop-up or a pull-down menu of length 3, 6, 9 or 12 items, regardless of the subject's previous expertise in using a mouse or using the different menu types.

Independent variables (IVs) in menu experiment
There is no difference in user performance (time and error rate) when selecting a single item from a pop-up or a pull-down menu of length 3, 6, 9 or 12 items, regardless of the subject's previous expertise in using a mouse or using the different menu types.
- IV1: menu type: pop-up or pull-down
- IV2: menu length: 3, 6, 9, 12
- IV3: subject type: expert or novice
Crossing these factors yields the experimental conditions, as sketched below.
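A sketch of enumerating the conditions by crossing IV1 and IV2 (the variable names are mine, not from the deck):

```python
from itertools import product

menu_types   = ["pop-up", "pull-down"]   # IV1
menu_lengths = [3, 6, 9, 12]             # IV2
# IV3 (expert vs. novice) classifies participants rather than defining extra trials

conditions = list(product(menu_types, menu_lengths))
print(f"{len(conditions)} conditions:")   # 2 x 4 = 8 cells
for menu_type, length in conditions:
    print(f"  {menu_type} menu, {length} items")
```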

Dependent variables (DVs) in menu experiment
There is no difference in user performance (time and error rate) when selecting a single item from a pop-up or a pull-down menu of length 3, 6, 9 or 12 items, regardless of the subject's previous expertise in using a mouse or using the different menu types.
- DV1: time to select an item
- DV2: selection errors made
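These DVs might be captured per trial roughly as follows; this is only a sketch, and the get_selection callback stands in for whatever UI toolkit actually presents the menu:

```python
import time

def run_trial(target_item, get_selection):
    """Time one menu selection and count wrong picks before the right one."""
    errors = 0
    start = time.perf_counter()
    while True:
        picked = get_selection()          # blocks until the subject clicks an item
        if picked == target_item:
            break
        errors += 1
    elapsed = time.perf_counter() - start
    return {"target": target_item, "time_s": elapsed, "errors": errors}

# Usage with a fake input source standing in for the real UI:
fake_clicks = iter(["Open", "Close", "Save"])
record = run_trial("Save", lambda: next(fake_clicks))
print(record)   # e.g. {'target': 'Save', 'time_s': ..., 'errors': 2}
```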

example: Choosing on-screen keyboards
Keyboard size: what is the best trade-off with screen real estate?

example: Choosing on-screen keyboards
Keyboard layout: ease of learning by non-typists vs. expert use; touch typing ≠ hunt and peck.
Candidate layouts: Qwerty, Alphabetic, Random, Dvorak.

example: Choosing on-screen keyboards
Unconventional keyboard layouts: are they 'better'? (Raynal, Vinot & Truillet, UIST '07)

example: Choosing on-screen keyboards
Effects of input device?

example: Choosing on-screen keyboards
Issues:
- can't just ask people (preference ≠ performance)
- observations alone won't work
- effects may be too small to see, yet important
- variability between people will mask differences (if any)
- need to understand differences between users (strong vs. moderate vs. weak typists, ...)
A small simulation of this masking effect follows.
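To see why variability between people can mask a small but real effect, here is a toy simulation with invented numbers: individual typing speeds vary far more than the layouts differ, so a true 2 wpm advantage is easy to miss with a small between-subjects sample.

```python
import random
import statistics

random.seed(1)
n = 12                      # participants per group

# Hypothetical population: mean 40 wpm, person-to-person SD of 12 wpm,
# and a true (but small) 2 wpm advantage for layout B.
layout_a = [random.gauss(40, 12) for _ in range(n)]
layout_b = [random.gauss(42, 12) for _ in range(n)]

print("A: mean %.1f  sd %.1f" % (statistics.mean(layout_a), statistics.stdev(layout_a)))
print("B: mean %.1f  sd %.1f" % (statistics.mean(layout_b), statistics.stdev(layout_b)))
# The 2 wpm effect is small relative to the ~12 wpm spread between people,
# so with samples this small the group means can easily come out in either order.
```

This is one reason controlled designs control for individual differences (e.g., within-subject designs or classifying users by skill) rather than relying on casual observation.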

A) Lucid and testable hypothesis
Example 3: There is no difference in user performance (time and error rate) and preference (5-point Likert scale) when typing on two sizes of an alphabetic, qwerty and random on-screen keyboard using a touch-based large screen, a mouse-based monitor, or a stylus-based PDA.

Independent variables (IVs) in keyboard experiment
There is no difference in user performance (time and error rate) and preference (5-point Likert scale) when typing on two sizes of an alphabetic, qwerty and random on-screen keyboard using a touch-based large screen, a mouse-based monitor, or a stylus-based PDA.
- IV1: keyboard type: alphabetic, qwerty, random
- IV2: keyboard size: small, large
- IV3: input/display: touch/large screen, mouse/monitor, stylus/PDA
With this many within-subject conditions, presentation order needs counterbalancing; a sketch follows.
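With 3 layouts × 2 sizes there are six within-subject keyboard conditions per device, so the order in which participants see them should be counterbalanced. One simple scheme (my choice, not prescribed by the deck) is a cyclic Latin square:

```python
from itertools import product

keyboards = [f"{layout}/{size}"
             for layout, size in product(["alphabetic", "qwerty", "random"],
                                         ["small", "large"])]   # 6 conditions

def cyclic_latin_square(items):
    """Row i presents the conditions rotated by i, so each condition
    appears in every serial position exactly once across the rows."""
    n = len(items)
    return [[items[(i + j) % n] for j in range(n)] for i in range(n)]

for participant, order in enumerate(cyclic_latin_square(keyboards), start=1):
    print(f"P{participant:02d}:", " -> ".join(order))
```

A balanced Latin square, which also spreads immediate carry-over effects, would be the next refinement; either way the number of participants is usually a multiple of the number of orders.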

Dependent variables (DVs) in keyboard experiment
There is no difference in user performance (time and error rate) and preference (5-point Likert scale) when typing on two sizes of an alphabetic, qwerty and random on-screen keyboard using a touch-based large screen, a mouse-based monitor, or a stylus-based PDA.
- DVs: typing time, error rate, preference rating
Other things we could have measured: time to learn to use it to proficiency.

You know now
Controlled experiments:
- strive for a lucid and testable hypothesis
- quantitative measurement
- a measure of confidence in the results obtained (statistics)
- replicability of the experiment
- control of variables and conditions
- removal of experimenter bias
Experimental design requires careful planning.

Permissions
You are free:
- to Share: to copy, distribute and transmit the work
- to Remix: to adapt the work
Under the following conditions:
- Attribution: You must attribute the work in the manner specified by the author (but not in any way that suggests that they endorse you or your use of the work) by citing: "Lecture materials by Saul Greenberg, University of Calgary, AB, Canada. http://saul.cpsc.ucalgary.ca/saul/pmwiki.php/HCIResources/HCILectures"
- Noncommercial: You may not use this work for commercial purposes, except to assist one's own teaching and training within commercial organizations.
- Share Alike: If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.
With the understanding that:
- Not all material has transferable rights: materials from other sources which are included here are cited.
- Waiver: Any of the above conditions can be waived if you get permission from the copyright holder.
- Public Domain: Where the work or any of its elements is in the public domain under applicable law, that status is in no way affected by the license.
- Other Rights: In no way are any of the following rights affected by the license: your fair dealing or fair use rights, or other applicable copyright exceptions and limitations; the author's moral rights; rights other persons may have either in the work itself or in how the work is used, such as publicity or privacy rights.
- Notice: For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to this web page.