ITCS 6010 VUI Evaluation

Summative Evaluation Evaluation of the interface after it has been developed. Typically performed only once at the end of development. Rarely used in practice. Not very formal. Data is used in the next major release.

Formative Evaluation Evaluation of the interface as it is being developed. Begins as soon as possible in the development cycle. Typically, formative evaluation appears as part of prototyping. Extremely formal and well organized.

Formative Evaluation Performed several times: an average of 3 major cycles, each followed by iterative redesign, per version released. The first major cycle produces the most data. Following cycles should produce less data, if you did it right.

Formative Evaluation Data Objective Data: directly observed data. The facts! Subjective Data: opinions, generally of the user. Sometimes this is a hypothesis that leads to additional experiments.

Formative Evaluation Data Subjective data is critical for VUIs.

Formative Evaluation Data Quantitative Data: numeric. Performance metrics, opinion ratings (Likert scale). Statistical analysis. Tells you that something is wrong. Qualitative Data: non-numeric. User opinions, views, or lists of problems/observations. Tells you what is wrong.
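
A minimal sketch of the split, in Python with hypothetical data and variable names: the quantitative summary flags that something is wrong, while the qualitative notes say what.

```python
from statistics import mean, median, stdev

# Hypothetical results from one formative evaluation cycle.
task_times = [412, 371, 655, 390, 428]   # seconds per participant (objective, quantitative)
likert_ease = [2, 3, 2, 4, 2]            # 1-5 rating: "the system was easy to use" (subjective, quantitative)
observations = [                         # qualitative notes from the sessions
    "P1 repeated the command three times before it was recognized",
    "P3 did not know a 'help' command existed",
]

# Quantitative: statistics flag that something is wrong.
print(f"task time: mean={mean(task_times):.0f}s, median={median(task_times)}s, sd={stdev(task_times):.0f}s")
print(f"ease rating: mean={mean(likert_ease):.1f} out of 5")

# Qualitative: the notes say what is wrong.
for note in observations:
    print("observation:", note)
```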

Formative Evaluation Data Not all subjective data are qualitative, and not all objective data are quantitative. Quantitative Subjective Data: a Likert scale of how a user feels about something. Qualitative Objective Data: benchmark task performance measurements where the outcome is the expert's opinion on how users performed.

Steps in Formative Evaluation State hypothesis and design the experiment. Conduct the experiment. Collect the data. Analyze the data. Draw your conclusions & establish hypotheses. Redesign and do it again.

Experiment Design Subject selection Who are your participants? What are the characteristics of your participants? What skills must the participants possess? How many participants do you need (5, 8, 10, …)? Do you need to pay them?

Experiment Design Task Development What tasks do you want the subjects to perform using your interface? What do you want to observe for each task? What do you think will happen? Benchmarks? What determines success or failure?

Experiment Design Protocol & Procedures What can you say to the user without contaminating the experiment? What are all the necessary steps needed to eliminate bias? You want every subject to undergo the same experiment. Do you need consent forms (IRB)?

Experiment Trials Calculate Method Effectiveness Sears, A. (1997) “Heuristic Walkthroughs: Finding the Problems Without the Noise,” International Journal of Human-Computer Interaction, 9(3), 213-234. Follow protocol and procedures. Don’t say “say” in your experiment; telling users exactly what to say primes their responses and will bias or contaminate your experiment. Pilot Study Expect the unexpected.

Experiment Trials Pilot Study An initial run of a study (e.g., an experiment, survey, or interview) for the purpose of verifying that the test itself is well-formulated. For instance, a colleague or friend can be asked to participate in a user test to check that the test script is clear, the tasks are neither too simple nor too hard, and the data collected can be meaningfully analyzed. (See http://www.usabilityfirst.com/.)

Experiment Trials – Pilot Study Wizard of Oz You play the “Wizard”, or system. Users call the Wizard, and the Wizard pretends to be the system.
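
A Wizard-of-Oz pilot needs no speech stack at all. Here is a minimal console sketch of the idea (everything in it is hypothetical): the experimenter transcribes what the participant says, the wizard types the "system" reply and speaks it aloud, and the exchange is logged for later analysis.

```python
import time

transcript = []  # (timestamp, speaker, text)

print("Wizard-of-Oz pilot: a blank user line ends the session.")
while True:
    user = input("USER said: ")         # experimenter transcribes the participant's utterance
    if not user:
        break
    transcript.append((time.time(), "user", user))
    wizard = input("WIZARD replies: ")  # you play the system; speak this reply aloud to the user
    transcript.append((time.time(), "wizard", wizard))

# Dump the logged dialogue for later analysis.
for t, speaker, text in transcript:
    print(f"{t:.0f} {speaker}: {text}")
```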

Data Collection Collect more than enough data; more is better! Back up your data. Secure your data.

Data Analysis Use more than one method. All data lead to the same point. Your different types of data should support each other. Remember: Quantitative data tells you something is wrong. Qualitative data tells you what is wrong. Experts tell you how to fix it.

Measuring Method Effectiveness
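
A rough sketch of one way to quantify method effectiveness, assuming the definitions commonly attributed to Sears (1997): thoroughness is the fraction of real problems a method finds, validity is the fraction of reported issues that are real problems, and a combined effectiveness score multiplies the two. The Python below uses hypothetical counts.

```python
def thoroughness(real_found: int, real_existing: int) -> float:
    # Fraction of the real usability problems that the method uncovered.
    return real_found / real_existing

def validity(real_found: int, total_reported: int) -> float:
    # Fraction of reported issues that turned out to be real problems.
    return real_found / total_reported

# Hypothetical walkthrough results: 15 issues reported, 9 of 12 known real problems found.
t = thoroughness(9, 12)   # 0.75
v = validity(9, 15)       # 0.60
print(f"thoroughness={t:.2f}, validity={v:.2f}, effectiveness={t * v:.2f}")
```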

Redesign Redesign should be supported by the data findings. Set up the next experiment. Sometimes it is best to keep the same experiment; sometimes you have to change it. Is there a flaw in the experiment or in the interface?

Formative Evaluation Methods Usability Inspection Methods Usability experts are used to inspect your system during formative evaluation. Usability Testing Methods Usability tests are conducted with real users under observation by experts. Usability Inquiry Methods Usability evaluators collect information about the user’s likes, dislikes and understanding of the interface.

Usability Inspection Methods Usability experts “inspect” your interfaces during formative evaluation. Widely used in practice. Often abused by developers who consider themselves to be usability experts.

Usability Inspection Methods Heuristic Evaluation Cognitive Walkthroughs Pluralistic Walkthroughs Feature, Consistency & Standards Inspection

Heuristic Evaluation: What is it? Several evaluators independently evaluate the interface & come up with potential usability problems. It is important that there be several of these evaluators and that the evaluations be done independently. Nielsen's experience indicates that using around 5 evaluators usually results in about 75% of the overall usability problems being discovered.
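
The "~75% with 5 evaluators" figure follows from a simple model: if one evaluator finds a fraction λ of the problems, n independent evaluators are expected to find 1 − (1 − λ)^n of them. A sketch of that model (the λ values here are assumptions, roughly in line with Nielsen & Landauer's published averages):

```python
def expected_found(n_evaluators: int, lam: float = 0.31) -> float:
    # Expected fraction of problems found by n independent evaluators,
    # each finding a fraction `lam` of the problems on their own.
    return 1 - (1 - lam) ** n_evaluators

for n in (1, 3, 5, 10):
    print(f"{n:2d} evaluators -> {expected_found(n):.0%} of problems")
# With lam=0.31 (a published average), 5 evaluators give ~84%;
# lam near 0.24 reproduces the ~75% figure quoted above.
```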

Heuristic Evaluation: How can I do it? Obtain the services of 4, 5, or 6 usability experts. Each expert will perform an independent evaluation. Give the experts a heuristics inspection guide. Collect the individual evaluations. Optionally, bring the experts together and do a group heuristic evaluation.

Cognitive Walkthroughs: What is it? Cognitive walkthroughs involve one evaluator or a group of evaluators inspecting a user interface by going through a set of tasks and evaluating its understandability and ease of learning. The input to the walkthrough also includes the user profile, especially the users' knowledge of the task domain and of the interface, and the task cases. Based upon exploratory learning methods: exploration of the user interface.

Cognitive Walkthroughs: What is it? The evaluators may include human factors engineers, software developers, people from marketing, documentation, etc. Best used in the design stage of development.

Cognitive Walkthroughs: How can I do it? During the walkthrough: illustrate the task and then ask a user to perform it. Accept input from all participants; do not interrupt the demo. After the walkthrough: make interface changes and plan the next evaluation.

Pluralistic Walkthroughs: What is it? During the design stage, a group of people (users, developers, usability experts) meet to perform a walkthrough.

Pluralistic Walkthroughs: How can I do it? The group meets and one person acts as coordinator. A task is presented to the group. Paper prototypes, screen shots, etc. are presented. Each participant writes down comments on each interface. After the demo, a discussion follows.

Feature, Consistency & Standards Inspection: What is it? Features, consistency & standards are inspected by an expert.

Feature, Consistency & Standards Inspection: How can I do it? Feature Inspection The expert is given use cases/scenarios and asked to inspect the system. Consistency Inspection The expert is asked to inspect consistency within your application. Standards Inspection The expert is asked to inspect compliance with standards. Standards can be in-house, government, etc.

Usability Testing Methods Carrying out experiments to find out specific information about a design and/or product. The basis comes from experimental psychology. Uses statistical data methods, both quantitative and qualitative.

Usability Testing Methods During usability testing, users work on specific tasks using the interface/product, and evaluators use the results to evaluate and modify the interface/product. Widely used in practice, but often not used appropriately. Often abused by developers who consider themselves to be usability experts. Can be very expensive and time consuming.

Usability Testing Methods Performance Measurement Thinking-aloud Protocol Question-asking Protocol Coaching Method

Usability Testing Methods Co-discovery Learning Teaching Method Retrospective Testing Remote Testing

Performance Measurement: What is it? Used to collect quantitative data. Typically, you will be looking for benchmark data. Objectives MUST be quantifiable, e.g., 75% of users shall be able to complete the basic task in less than 30 minutes.

Performance Measurement: How can I do it? Define the goals that you expect users to achieve. Quantify the goals: the time users take to complete a specific task; the ratio between successful interactions and errors; the time spent recovering from errors; the number of user errors; the number of commands or other features that were never used by the user; the number of system features the user can remember during a debriefing after the test; the proportion of users who say that they would prefer using the system over some specified competitor.
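
A minimal sketch (with hypothetical numbers) of checking the earlier benchmark, "75% of users shall be able to complete the basic task in less than 30 minutes", against measured completion times:

```python
# Hypothetical completion times in minutes; None = participant gave up.
completion_minutes = [22, 35, 18, 27, None, 29, 41, 24]

under_30 = sum(1 for t in completion_minutes if t is not None and t < 30)
rate = under_30 / len(completion_minutes)

print(f"{rate:.0%} of participants finished the basic task in under 30 minutes")
print("benchmark met" if rate >= 0.75 else "benchmark NOT met -> redesign and retest")
```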

Performance Measurement: How can I do it? Get participants for the experiments. Conduct very controlled experiments; all variables must remain consistent across users. Problem with performance measurement: no qualitative data.

Thinking-aloud Protocol: What is it? Technique where the participant is asked to vocalize his or her thoughts, feelings, and opinions while interacting with the product.

Thinking-aloud Protocol: How can I do it? Select the participants: who will be involved? Select the tasks and design scenarios. Ask the participant to perform a task using the software. During the task, ask the user to vocalize thoughts, opinions, feelings, etc.

Thinking-aloud Protocol Problem with the thinking-aloud protocol: cognitive overload. Can you walk & chew gum at the same time? We are asking the participants to do too much.

Question-asking Protocol: What is it? Similar to the thinking-aloud protocol, but instead of the participant saying what they are thinking, the evaluator prompts the participant with questions while the participant uses the system.

Question-asking Protocol: How can I do it? Select the participants: who will be involved? Select the tasks and design scenarios. Ask the participant to perform a task using the software.

Question-asking Protocol: How can I do it? During the task, ask the user questions about the product: thoughts, opinions, feelings, etc. Problem with the question-asking protocol: cognitive overload++. Can you walk, chew gum & talk at the same time? We are asking the participants to do too much, with added pressure when the evaluator asks questions. Can be frustrating for novice users.

Coaching Method: What is it? A system expert sits with the participant and acts as a coach. Expert answers the participant’s questions. The evaluator observes their interaction.

Coaching Method: How can I do it? Select the participants: who will be involved? Select the tasks and design scenarios. Ask the participant to perform a task using the software in the presence of a coach/expert.

Coaching Method: How can I do it? During the task, the user will ask the expert questions about the product. Problem with the coaching method: in reality, there will not be a coach present. This is good for creating a coaching system, but not for evaluating an interface.

Co-Discovery Learning: What is it? Two test users attempt to perform tasks together while being observed. They are to help each other in the same manner as they would if they were working together to accomplish a common goal using the product. They are encouraged to explain what they are thinking about while working on the tasks. Thinking aloud, but more natural because of the partner.

Co-Discovery Learning: How can I do it? Select the participants: who will be involved? Select the tasks and design scenarios. Ask the participants to perform a task using the software.

Co-Discovery Learning: How can I do it? During the task, the users will help each other and voice their thoughts by talking to each other. Problem with co-discovery learning: neither is an expert; the blind leading the blind.

Teaching Method: What is it? Have one participant use the system, then ask that participant to teach a novice participant how to use the system.

Teaching Method: How can I do it? Select the participants. Select the tasks and design scenarios. Ask the 1st participant to perform a task using the software. Ask the 1st participant to teach a new participant.

Teaching Method: How can I do it? Observe their interactions. Problem with the teaching method: neither is an expert; the blind leading the blind. Advantage of the teaching method: it is possible to discover some interesting things about the learnability of your interfaces.

Retrospective Testing: What is it? A videotape of the session is reviewed by the usability expert and the participants.

Retrospective Testing: How can I do it? Select the participants: who will be involved? Select the tasks and design scenarios. Use one of the usability testing methods that we have discussed. Videotape the session.

Retrospective Testing: How can I do it? Review the videotape with the users. Problem with retrospective testing: extremely time-consuming!

Remote Testing: What is it? The participants are separated from the evaluators. No formal observation. No usability lab.

Remote Testing: How can I do it? Give the product/software to the participants. Collect information about how they use your software/product. Methods: same-time, different-place; different-time, different-place.

Remote Testing: How can I do it? Tools: Lotus Video Cam, Look@Me, SnagIt. Usability Logger (http://www.usabletools.com/). Journaled sessions.

Remote Testing: How can I do it? Problem with remote testing: the evaluator is not there and can't observe facial expressions. Great for Web-based systems.

Usability Testing Methods Select the method that works best for you. Select the method that fits your implementation. Be thorough during your experiments. The more data, the better.

Usability Testing Methods Hawthorne Effect The tendency for people to change their behavior, and thus their performance, when they know their performance is being studied.

Usability Inquiry Methods Usability experts learn about the users' likes, dislikes, needs, etc. regarding the system through: observation; verbal questioning; written questioning. Widely used in practice. Different methods have different costs, but in general this is relatively cheap.

Usability Inquiry Methods Contextual Inquiry Field Observation Questionnaires Interviews Focus Groups Logging Actual Use

Contextual Inquiry: What is it? The expert(s) visit the users' workplace and question them. This should occur before any design has been done.

Contextual Inquiry: How can I do it? Determine who your users are. Go visit them where they work. Talk to them about the system: How do you currently do your job? How would you like to do your job? What do you like about the current system/method? What don't you like about the current system/method? http://jthom.best.vwh.net/usability/context.htm

Field Observation: What is it? Usability experts observe users in the field using the system/product.

Field Observation: How can I do it? Go to the users’ workplace and simply observe. Things to look for: What is the user’s mental model? Are the users using it the way you expect? You don’t want them to know you are evaluating them.

Questionnaires: What is it? Written lists of questions that you distribute to your users.

Questionnaires: How can I do it? Develop a list of questions on paper, web, email, etc. and give the questionnaire(s) to the users. The users will answer the questions and return the questionnaires to you. http://jthom.best.vwh.net/usability/question.htm http://www.acm.org/~perlman/question.html

Interviews: What is it? You interview users and ask them questions.

Interviews: How can I do it? Develop a list of questions for the users. Meet with the users individually. Ask them the questions and log the responses (written and/or taped).

Interviews: How can I do it? Interview Tips: Make clear that this is an interview. Ask open-ended questions to get the user talking; yes-no questions are bad. Begin with less demanding topics and progress to more difficult topics. Don't ask questions that support your belief or hypothesis. Do not answer your own questions. Do not agree or disagree … remain neutral.

Interviews: How can I do it? Probes: used to encourage the subjects to continue speaking, or to guide their response in a particular direction. Addition Probe: encourages more information or clarifies certain responses from the test users. Either verbally or nonverbally, the message is, "Go on, tell me more," or "Don't stop."

Interviews: How can I do it? Reflecting Probe: uses a nondirective technique that encourages the test user to give more detailed information. The interviewer can reformulate the question or synthesize the previous response as a proposition. Directive Probe: specifies the direction in which a continuation of the reply should follow, without suggesting any particular content. A directive probe may take the form of "Why is that the case?"

Interviews: How can I do it? Defining Probe: requires the subject to explain the meaning of a particular term or concept. http://jthom.best.vwh.net/usability/surveys.htm

Focus Groups: What is it? A group of users is gathered to talk about the system. The expert acts as the moderator. You should conduct more than one focus group.

Focus Groups: How can I do it? Bring a group of users together, begin the discussion, and collect the data.

Logging Actual Use: What is it? The computer automatically collects usage data. You could ask the users to log their own usage, but that's not practical.

Logging Actual Use: How can I do it? Usability Logger (http://www.usabletools.com/): automatic capture of keyboard, mouse, etc. VideoCam and other products.
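
If no off-the-shelf tool fits, instrumenting the application yourself is straightforward. A minimal sketch of timestamped event logging (the event names, fields, and file path are hypothetical); note that it captures what/when/where, but, as the next slide points out, not why:

```python
import json
import time

LOG_PATH = "usage_log.jsonl"  # one JSON record per line

def log_event(event: str, **details) -> None:
    # Append one timestamped interaction event to the log file.
    record = {"t": time.time(), "event": event, **details}
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps(record) + "\n")

# Hypothetical instrumentation points in a VUI:
log_event("utterance", text="check my balance", confidence=0.87)
log_event("no_match", prompt="main_menu")   # recognizer failed; user must retry
log_event("task_complete", task="balance_inquiry", seconds=42.5)
```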

Logging Actual Use Facts On Logging Actual Use You know exactly what the user is doing. You don’t know why, but you do know what, when, where. You don’t know how the user feels.

Conclusions The data should support your conclusions. Use the method effectiveness measure. Make design changes based upon the data. Establish new hypotheses based upon the data.