Evaluation (cont.): Empirical Studies CS352

Announcements
Notice upcoming due dates (web page).
Where we are in PRICPE:
– Predispositions: Did this in Project Proposal.
– RI: Research was studying users. Hopefully led to Insights.
– CP: Concept and initial (very low-fi) Prototypes.
– Evaluate throughout, repeat iteratively!!

Evaluation
Analytical: based on your head.
Empirical: based on data.
Advantages/disadvantages of empirical:
– More expensive (time, $) to do.
+ Greater accuracy in what users REALLY do.
+ You'll get more surprises.
+ Greater credibility with your bosses.

Empirical Work with Humans
What do you want from it?
– List of problems: Usability study (e.g., 5 users in a lab).
– List of behaviors/strategies/needs: Think-aloud study or field observation.
– Compare, boolean outcome (e.g., A > B): Statistical study.
Note:
– Impossible to "prove" no problems exist. Can only find problems.

The Usability Study
Returns a list of UI problems.
Pick a task and user profile.
Users do the task with your prototype:
– In here: using your CogTool-generated HTML.

Usability Study: How
How many: 5 users find 60-80% of the problems.
How:
– Be organized!! Have everything ready, ...
– Test it first.
– Users do the task (one user at a time).
The data IS the result (no stats needed).
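The "5 users find most of the problems" rule of thumb comes from a simple discovery model: if each user independently uncovers a fixed fraction of the problems, the cumulative proportion found grows as 1 - (1 - p)^n. A minimal sketch, assuming Nielsen's often-quoted average per-user discovery rate of about 0.31 (an assumption here, not a number from these slides):

```python
# Proportion of usability problems found by n users, under the
# discovery model found(n) = 1 - (1 - p)**n.
# p = 0.31 is an assumed average per-user discovery rate.

def problems_found(n, p=0.31):
    return 1 - (1 - p) ** n

for n in (1, 3, 5, 10):
    print(n, round(problems_found(n), 2))
```

With p = 0.31, five users find roughly 84% of the problems, which is why adding a sixth or seventh user buys you comparatively little: the curve flattens quickly.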

Usability Study: How (cont.)
Q: What, exactly, do you have to do during planning?
A: Try to figure this out from the Steve Krug example:
– Reminder of where we were in "3. The Home Page Tour" (7:00-7:45).
– Doing a task: "4. The Tasks" (8:13-11:50, 12:00-13:15).
– Discussion of Steve's planning to do this.

Many Cheap Usability Studies: Steve Krug's Recommendations
– How often: one morning per month.
– When: continually, throughout the dev process.
– # participants: 3 (to make once a month viable).
– Participants: better to relax your standards about getting the "perfect" representative than to sacrifice the once-a-month frequency.
– Where: in your lab, with a one-way mirror so that the dev team can watch. Not in the same room as the participant (why?).

Steve Krug's Recommendations (cont.)
– Who watches, and when: as many of the dev team as can come, for as long as they can stay.
– Who identifies the problems, and when: everyone watching, while they watch. Then at lunch, debrief and make decisions.
– Reporting: a brief to the dev team summarizing the decisions made.

How to Test Sketches/Concepts
To spot problems with the concept.
Like the "home page tour" from Steve Krug's demo video.
Can often be done in about 5-10 minutes.

How to Test Paper Prototypes
Here you need the user to do a task, as in the "Wizard of Oz" Excel demonstration.
A short session will usually get you everything you need.

How to Test a "Working" Prototype
Such as your CogTool prototypes.
Get it into executable form:
– CogTool: export to HTML.
Then do what Steve did:
– the warm-up,
– then present the task,
– then proceed as in the part of the Steve Krug video we just watched.

Things to Say
When the user does this → say this:
– Something surprises them ("Oh!!"): "Is that what you expected to happen?"
– Participant tries to get you to give them clues: "What would you do if you were at home?" (Wait.) "Then why don't you go ahead and try that?" "I'd like you to do whatever you'd normally do." "How do you think it would work?"
– Participant says something but you don't know what triggered it: "Was there something in particular that made you think that?"
– Participant suggests that s/he's not giving you what you need: "No, this is very helpful."

About the "Think-Aloud" Part...
Steve recommends having the participant "think aloud", and this is a good idea.

Think-Aloud
Most helpful with a working prototype or system,
– but can also be used for early prototypes/concepts;
– can be part of a usability study (e.g., Steve Krug's) or used for more research-y purposes.
Returns a list of behaviors/strategies/impressions/thoughts...
Pick a task and user profile.
Users do the task with your prototype.

Think-Aloud: How
How many: 5-10 users is usual in a research study; fewer if a usability study.
– If a research study, data analysis is time-consuming.
– In here: 1-2 max!
How:
– Be organized!! Have everything ready, ...
– Test it first.
– Users do the task (one user at a time).

Think-Aloud Training (Activity)
– "We are interested in what you say to yourself as you do the task. So we'll ask you to TALK ALOUD CONSTANTLY as you work on the tasks. Say aloud EVERYTHING you say to yourself. Let's practice talking aloud."
– Add in your head (aloud). Discuss.
– How many windows are in your parents' house? (Aloud.) Discuss.
– What states can you ski in that start with "A"?

Think-Aloud Research Studies
Analyze the data for patterns, surprises, etc.
No stats: not enough subjects for this.
Sample think-aloud results:
– From VL/HCC'03 (Prabhakararao et al.)

Statistical Studies
We will not do these in this class, but you need to know some basics.
Goal: answer a binary question.
– e.g.: does system X help users create animations?
– e.g.: are people better debuggers using X than using Y?
Advantage: your audience believes it.
Disadvantage: you might not find out enough about "why or why not".

Hypothesis Testing
Hypotheses need to be specific and provable/refutable.
– e.g.: "users will debug better using system X than using system Y."
– (Strictly speaking we test the "null" hypothesis, which says there won't be a difference, but that's a fine point...)
Pick a significance level (rule of thumb: 0.05).
– If you get a p-value <= 0.05, you've shown a significant difference, but there's up to a 5% chance that a difference this large arose by chance alone.

Design the Experiment
Identify outputs (dependent variables) for the hypotheses:
– e.g.: more bugs fixed?
– e.g.: fewer minutes to fix the same number of bugs?
Identify independent variables ("treatments") we'll manipulate:
– e.g.: which system they use, X vs. Y.

Design the Experiment (cont.)
Decide on within- vs. between-subjects.
– "Within": one group experiences all treatments, in random order. "Within" is best, if possible. (Why?)
– "Between": a different group for each treatment.
How many subjects?
– Rule of thumb: 30 per treatment.
– More subjects -> more statistical power -> more likely to get p <= .05 if there really is a difference.
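In a within-subjects design, the random order matters: it washes out learning effects (whichever system a participant sees second benefits from practice). A minimal sketch of assigning a randomized treatment order per participant; the treatment names "X" and "Y" and the helper name are illustrative, not from any real study:

```python
import random

# Within-subjects sketch: every participant experiences both
# treatments, in an independently randomized order.
TREATMENTS = ["X", "Y"]

def assign_orders(n_subjects, seed=0):
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    orders = []
    for _ in range(n_subjects):
        order = TREATMENTS[:]      # copy, so each subject gets a fresh list
        rng.shuffle(order)
        orders.append(order)
    return orders

print(assign_orders(4))
```

With only two treatments you could also counterbalance exactly (half the subjects get X first, half get Y first); random per-subject shuffling is the simpler habit and generalizes to more treatments.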

Design the Experiment (cont.)
Design the task they will do.
– Since you usually run a lot of these at once and you're comparing them, you need to be careful with length:
  Long enough to get over the learning curve.
  Big enough to be convincing.
  Small enough to be do-able in the amount of time subjects have to spend with you.
– Vary the order if there are multiple tasks.

Design the Experiment (cont.)
Develop the tutorial.
– Practice it like crazy! (It must work the same for everyone!)
Plan the data to gather:
– Log files?
– Questionnaires before/after?
– Saved result files at the end?

Designing the Experiment (cont.)
Water in the beer:
– Sources of uncontrolled variation spoil your statistical power.
Sources:
– Too much variation in subject background.
– Not a good enough tutorial.
– Task not a good match for what you wanted to find out.
– Etc.
Result: no significant difference.

Finally, Analyze the Data
Choose an appropriate statistical test. (There are entire courses on this.)
Run it. Hope for p <= .05.
Summary:
– Statistical studies are a lot of work; too much to do in this class!
– But they are the right choice for answering X > Y questions.
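To make "choose an appropriate test" concrete: for a between-subjects comparison of two groups on one numeric measure (e.g., bugs fixed with X vs. with Y), a common choice is Welch's t-test, which does not assume equal variances. A sketch using only the standard library; the bug-fix counts are made-up illustration data, not results from any study:

```python
import math
from statistics import mean, variance

# Welch's t statistic for two independent samples (between-subjects).
def welch_t(a, b):
    va, vb = variance(a), variance(b)          # sample variances
    se = math.sqrt(va / len(a) + vb / len(b))  # standard error of the mean difference
    return (mean(a) - mean(b)) / se

x = [7, 9, 8, 10, 9, 8]   # bugs fixed with system X (made-up data)
y = [5, 6, 7, 5, 6, 6]    # bugs fixed with system Y (made-up data)
t = welch_t(x, y)
print(round(t, 2))
# To get a p-value, compare |t| against a t-distribution with the
# Welch-Satterthwaite degrees of freedom, or just use
# scipy.stats.ttest_ind(x, y, equal_var=False) in practice.
```

The point of the sketch is the shape of the computation, not the arithmetic: the statistic is the observed difference in means scaled by how much difference you would expect from sampling noise alone.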

Summary: For Your Project
In here, you'll be doing the Usability Study, with Think-Aloud. 1-3 users.
Your plan for doing this is part of what's due on Friday!
– To get you ready for actually doing the study the following week.