User Testing. John Kelleher, IT Sligo.

Why do User Testing?
Can't tell how good or bad a UI is until people use it!
Other methods rely on evaluators, who may know too much, or may not know enough (about tasks, etc.)
Summary: it is hard to predict what real users will do

When resources get squeezed…

Evaluating Prototypes
Before: preparing for the evaluation; selecting tasks
During: teaching the user; recording what happens
After: using the results

Considerations of User Testing & Observation
Establish trust
Beware the Hawthorne Effect
Be wary of recording media and logging
Read "Microsoft Usability Labs"
Consider ethnography

Hawthorne Effect
Hawthorne Plant of the Western Electric Company in Cicero, Illinois (1927-1932); Prof. Elton Mayo (Harvard Business School)
Findings:
The aptitudes of individuals are imperfect predictors of job performance. Although they give some indication of the physical and mental potential of the individual, the amount produced is strongly influenced by social factors.
Informal organization affects productivity. The Hawthorne researchers discovered a group life among the workers. The studies also showed that the relations supervisors develop with workers tend to influence the manner in which the workers carry out directives.
Work-group norms affect productivity. The Hawthorne researchers were not the first to recognize that work groups tend to arrive at norms of what is "a fair day's work"; however, they provided the best systematic description and interpretation of this phenomenon.
The workplace is a social system. The Hawthorne researchers came to view the workplace as a social system made up of interdependent parts.

Preparing for the Evaluation
Set an objective
Select representative users
Think through what you will do with the user
Define tasks for the user
Decide on explanations & instructions
Decide how you will observe & record what happens
Decide on use of recording equipment
Decide on setting

Choosing Participants
Representative of eventual users in terms of job-specific vocabulary / knowledge and tasks
If you can't get real users, get an approximation:
  system intended for doctors: get medical students
  system intended for electrical engineers: get engineering students
Use incentives to get participants

Ethical Considerations
Tests can sometimes be distressing: users may be embarrassed by mistakes
You have a responsibility to alleviate this:
  make participation voluntary, with informed consent
  avoid pressure to participate
  let them know they can stop at any time
  stress that you are testing the system, not them
  make collected data as anonymous as possible

Selecting Tasks
Select enough tasks to cover the range of expected use
Use the requirements to help determine the range
Go for variety first, comprehensiveness second
E.g. assume the task: "Use the water tap to wash your hands."

Selecting Tasks
Make tasks specific, but do not tell the user how to operate the interface: describe the result, not the steps to get there
Right: "Assume the sink is full of bits of food from having rinsed the dishes. Wash these bits down the drain."
Wrong: "Move the faucet to the left, right, front and back while it is on power wash."

Deciding on Data to Collect
Two types of data:
  process data: observations of what users are doing & thinking
  bottom-line data: summary of what happened (time, errors, success…), i.e., the dependent variables

Process Data vs. Bottom-Line Data
Focus on process data first: it gives a good overview of where the problems are
Bottom-line data doesn't tell you where to fix; it just says "too slow", "too many errors", etc.
It is hard to get reliable bottom-line results: you need many users for statistical significance
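
Bottom-line measures like these can be tabulated with a few lines of code. A minimal sketch in Python, using hypothetical session records (the participant names, times and error counts are invented for illustration):

```python
# Sketch: summarizing bottom-line data from a user test.
# Session records below are hypothetical illustrations.
from statistics import mean

sessions = [
    {"user": "P1", "time_s": 142, "errors": 3, "completed": True},
    {"user": "P2", "time_s": 210, "errors": 5, "completed": False},
    {"user": "P3", "time_s": 128, "errors": 1, "completed": True},
]

completed = [s for s in sessions if s["completed"]]
summary = {
    # Task time is only meaningful for completed attempts.
    "mean_time_s": mean(s["time_s"] for s in completed),
    "mean_errors": mean(s["errors"] for s in sessions),
    "completion_rate": len(completed) / len(sessions),
}
print(summary)
```

Note how the summary says only that something is slow or error-prone, not where; that is why the process data has to be collected alongside it.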

Instructions to Participants
Describe the purpose of the evaluation: "I'm testing the product; I'm not testing you"
Tell them they can quit at any time
Demonstrate the equipment
Explain how to think aloud
Explain that you will not provide help
Describe the task; give written instructions

Teaching the User
Try to avoid teaching the user: better interfaces require less instruction
Reflect real use of the system: will the user have instructions? Will the user get training?
Teaching changes the evaluation: bad teaching can ruin a good design; good teaching can save a bad design

Do NOT Teach the User if...
There will not be instructions for the user under real conditions
The task is perceived as so easy that the user expects to do it without instruction
The user will not be using the interface on any regular basis

Do Teach the User if...
There will be instructions for the user under real conditions
The interface provides new capabilities unfamiliar to the user
The interface has many different functions and capabilities
Warning! If you use instructions, write them beforehand

Observing and Recording
Know what you want to observe, and be alert for the unexpected
Do not rely on memory; be prepared to record what happens
Record the time to perform various parts of the task
Materials: stopwatch; chart to record the start time of each subtask
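
The stopwatch-and-chart idea can also be sketched in code. This hypothetical logger records a start timestamp per subtask and derives each subtask's duration from the gap to the next start event (the subtask names are invented for the tap example):

```python
# Sketch: a stopwatch-style subtask log, standing in for the paper chart.
# Subtask names below are hypothetical.
import time

class SubtaskLog:
    def __init__(self):
        self.events = []  # list of (subtask name, start time in seconds)

    def start(self, name):
        # monotonic() is unaffected by system clock changes mid-session
        self.events.append((name, time.monotonic()))

    def durations(self):
        # Duration of each subtask = gap until the next start event.
        out = {}
        for (name, t0), (_, t1) in zip(self.events, self.events[1:]):
            out[name] = t1 - t0
        return out

log = SubtaskLog()
log.start("find tap")
log.start("adjust temperature")
log.start("done")  # sentinel marking the end of the last subtask
print(log.durations())
```

In a real session the observer would press one key per subtask; the point is only that the start times, not the durations, are what get recorded live.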

Errors
Categorize each error the user makes:
  typing error
  path error: additional steps, or steps in an unexpected order
  command error: an incorrect action for the task the user is trying to do
  motor error: user accidentally selects the wrong object
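
Once each incident is tagged with one of these categories, a simple tally shows where the interface is weakest. A sketch, with a hypothetical observation log:

```python
# Sketch: tallying observed errors by the categories above.
# The observation log is a hypothetical illustration.
from collections import Counter

CATEGORIES = {"typing", "path", "command", "motor"}
observed = ["path", "typing", "command", "path", "motor", "path"]

# Guard against mistyped category names in the log.
assert set(observed) <= CATEGORIES

tally = Counter(observed)
print(tally.most_common())  # most frequent error category first
```

A dominant category points at a specific class of fix: many path errors suggest reordering or grouping steps, many motor errors suggest larger targets.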

User Understanding
Check the user's understanding after he or she has done the tasks
Give the user a short test on the interface (e.g. multiple choice)
Have the user describe to you how to do various tasks with the interface

User's Perception
Don't ask how they liked it overall; ask specific questions about the interface
Right: "Do you think the temperature is easy to adjust?"
Wrong: "Rate on a scale from one to five how easy you found the interface to use."

Watch Out for Response Bias!
Users want you to feel good about your design: if you ask them whether they like it, they will say yes
Ask them instead what parts they like & dislike
Users want to be successful: Norman notes the tendency of people to see problems as their own fault

Using the Results
There is no value unless you apply the results of the evaluation
Use each type of observation for insight into the design
Listen to the user, but do not completely abandon your own common sense

Time
Does it take too long to do the task? Reduce the number of steps
Favor frequent tasks: simple tasks should be simple
Combine changing the temperature and turning the water on
Build one-step keys for frequent tasks, e.g. a hand-washing button and one for cold drinks

Errors
Does your user make a lot of errors? Avoid designs that require precise motor performance:
  avoid very fine scales on sliders
  avoid multi-level cascading menus
  avoid moving from one end of the screen to the other and back, multiple times

Confusion
Is the user confused? Signs include:
  long pauses while thinking aloud
  one subtask taking much longer than others
  moving back and forth from one selection to another
  a vague explanation of the interface

Confusion
Cures for confusion:
  put sequential steps near each other
  put together selections that are likely to occur together
  change labels to make the mapping better, e.g. add a tap icon showing what the shower looks like
  modify the conceptual model, e.g. draw a picture of a sink for specifying the tap directions

Total Confusion
Is the user lost?
The user cannot complete the task
The user succeeds only with great difficulty or a lot of help

Total Confusion
Select a new conceptual model
Ask the user what he or she thinks the model for the interface is
Ask the user to ignore the interface you have designed and describe how they might perform the tasks you have specified

Using the Test Results
Summarize the data:
  make a list of all critical incidents (CIs), positive & negative
  include references back to the original data
  try to judge why each difficulty occurred
What does the data tell you?
  Did the UI work the way you thought it would?
  Is it consistent with your cognitive walkthrough?
  Did users take the approaches you expected?
  Is something missing?

Using the Results (cont.)
Update the task analysis and rethink the design:
  rate the severity & ease of fixing each CI
  fix the severe problems, and make the easy fixes
Will thinking aloud give the right answers? Not always: if you ask a question, people will always give an answer, even if it has nothing to do with the facts. Try to avoid leading, overly specific questions.

Measuring Bottom-Line Usability
Situations in which numbers are useful:
  time requirements for task completion
  successful task completion
  comparing two designs on speed or number of errors
Do not combine with thinking aloud: talking can affect speed & accuracy (negatively & positively)
Time is easy to record; errors and successful completion are harder: define in advance what these mean
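
Comparing two designs on speed is the classic case where numbers earn their keep. A minimal sketch, using Welch's t statistic on hypothetical task times (the times are invented, and a real analysis would also need the degrees of freedom and a p-value, plus far more participants than shown here):

```python
# Sketch: comparing task times for two designs with Welch's t statistic.
# All task times (seconds) are hypothetical illustrations.
from statistics import mean, variance
from math import sqrt

design_a = [141, 127, 156, 133, 148]
design_b = [178, 165, 192, 171, 183]

na, nb = len(design_a), len(design_b)
ma, mb = mean(design_a), mean(design_b)
va, vb = variance(design_a), variance(design_b)  # sample variances

# Welch's t: does not assume equal variances in the two groups.
t = (ma - mb) / sqrt(va / na + vb / nb)
print(f"mean A = {ma:.1f}s, mean B = {mb:.1f}s, Welch t = {t:.2f}")
```

A large |t| hints that the speed difference is not just noise, but with five users per condition the slide's warning stands: reliable bottom-line comparisons need many more participants.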

Measuring User Preference
How much users like or dislike the system:
  can ask them to rate on a scale of 1 to 10
  or have them choose among statements: "best UI I've ever…", "better than average"…
Hard to be sure what the data will mean: novelty of the UI, feelings, unrealistic setting, etc.
If many give you low ratings, you are in trouble
Can get some useful data by asking what they liked, disliked, where they had trouble, best part, worst part, etc. (redundant questions)
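
Those liked/disliked answers are more actionable when tallied per feature rather than as one overall score. A sketch, where the feature names and responses are hypothetical:

```python
# Sketch: tallying feature-specific likes/dislikes instead of one
# overall rating. Feature names and responses are hypothetical.
from collections import defaultdict

responses = [
    ("temperature control", "like"),
    ("power wash mode", "dislike"),
    ("temperature control", "like"),
    ("power wash mode", "dislike"),
    ("spout position", "like"),
]

tally = defaultdict(lambda: {"like": 0, "dislike": 0})
for feature, verdict in responses:
    tally[feature][verdict] += 1
print(dict(tally))
```

A feature that draws repeated dislikes across participants is a concrete redesign target, in a way an average "7 out of 10" never is.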

Details (cont.)
Keeping variability down:
  recruit test users with similar backgrounds
  brief users to bring them to a common level
  perform the test the same way every time; don't help some users more than others (plan in advance); make instructions clear
Debriefing test users:
  they often don't remember, so show video segments
  ask for comments on specific features
  show them the screen (online or on paper)

Summary
User testing is important, but takes time and effort
Early testing can be done on mock-ups (low-fi)
Use real tasks & representative participants
Be ethical & treat your participants well
To learn what people are doing & why, collect process data
Using bottom-line data requires more users to get statistically reliable results

Ethnography
Derived from anthropology; literally means 'writing the culture'
Situated Action method: randomly mixed groups in an experimental situation illustrate the process of group formation rather than that of a normal working group
Experimental psychology vs. more qualitative sociological analysis
Need to probe the understanding/influence of social context; crucial for CSCW
Study of the social character and activity of groups in their natural setting
Aims to find the order within an activity rather than impose any framework of interpretation on it
Observers immerse themselves in the users' environment and participate in their day-to-day work: joining in conversations, attending meetings, reading documents
E.g. Nancy Baym's study of an online community (soap operas)
E.g. Mateas et al.'s study of home PC usage using a felt board & pizza!

Ethnography
Very detailed recording (typically video) is crucial
Aim is to make the implicit explicit: to become encultured, understanding the situation from within its own cultural framework
E.g. Kuhn's report on telephone company repair men and their 'off-task' conversations
Concern over lack of methodology: reports are often in a form not readily translated into a design specification; 'it is an experience rather than a data-collection exercise'
Problem with the time-span of the study; open-ended view of the situation
Not to be confused with participatory design (the Scandinavian approach): in PD, workers come out of their work situation, either physically or mentally; in ethnography, the analyst goes into the workplace, retaining objectivity
Benefit: the analyst sees the whole group's perspective, rather than that of a single individual

Ethnography
Study by Heath et al. (1993)1: examined how dealers in a stock exchange work together
A proposed system sought to remedy perceived problems with the process of writing tickets to record deals (pen and paper, time-consuming, prone to error)
It suggested touch screens to input the details of transactions, and headphones to eliminate distracting external noise
The study revealed:
  touch screens would reduce the availability of information to others on how deals were progressing
  headphones would impede the dealers' ability to inadvertently monitor other dealers' actions, thereby risking exposure
Suggestion: pen-based mobile systems with gesture recognition
1 Heath, C., Jirotka, M., Luff, P., and Hindmarsh, J. (1993) Unpacking collaboration: the interactional organization of trading in a city dealing room. In Proceedings of the Third European Conference on Computer-Supported Cooperative Work. Dordrecht: Kluwer.