Evaluation of usability tests

Why evaluate?
1. Choose the most suitable data-collection techniques
2. Identify the methodological strengths and weaknesses of a user test

Evaluation criteria for data-collection techniques
- Utility: How useful are the data?
- Costs: What resources are needed?
- Objectivity: How much subjective judgement is involved?
- Level of detail: Are the amount and resolution of the data suitable?
- Intrusiveness: Does the method interfere with the user’s performance?

Observations in real time
Strengths:
- Level of detail: Allows you to experience the context in which performance takes place
Weaknesses:
- Level of detail: Difficult to keep up with the pace of the user
- Objectivity: Based on your own subjective judgement as an observer

Observations from video
Strengths:
- Utility: Allows you to conduct detailed analysis of various usability attributes
- Utility: Can obtain data about the user’s reasoning (“think-aloud”)
Weaknesses:
- Costs: Time consuming
- Utility: Much of the data is never used
- Intrusiveness: “Think-aloud” may disturb the user

Observations: Real time or video?
- Real time: Context
- Video: Product, Level of detail

Event logs
Strengths:
- Objectivity: The data are collected automatically
- Costs: Automated data collection requires little effort from the test team
Weaknesses:
- Level of detail: Both the amount of data and the resolution can be too high
- Utility: It can be difficult to create useful measures
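
A minimal sketch of what automatic event logging might look like in a Python test harness, including the derivation of one simple measure (task time) from the raw log; the EventLog API and the event names are illustrative assumptions, not something prescribed by the slides:

```python
import time

class EventLog:
    """Collects timestamped events automatically during a test session."""

    def __init__(self):
        self.events = []  # (timestamp, event_name, detail)

    def record(self, name, detail=""):
        # Called from UI callbacks; no effort required from the observer
        self.events.append((time.time(), name, detail))

    def task_time(self, start_event, end_event):
        """Derive a simple measure (elapsed seconds) from the raw log."""
        start = next(t for t, n, _ in self.events if n == start_event)
        end = next(t for t, n, _ in self.events if n == end_event)
        return end - start


log = EventLog()
log.record("task_start", "task 1")
log.record("click", "search button")  # every low-level event is captured
log.record("task_end", "task 1")
print(f"Task time: {log.task_time('task_start', 'task_end'):.3f} s")
```

Note how the raw log captures far more events than the derived measure uses, which is exactly the level-of-detail weakness named above.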

Questionnaire, self-made
Strengths:
- Level of detail: Can be tailored to fit the purpose of the test
- Utility: Can be used in several settings with different products
- Costs: Does not take long to develop
Weaknesses:
- Objectivity: Based on subjective judgement
- Utility: Difficult to construct good items

Questionnaire, validated
Strengths:
- Utility: Can be used in several settings with different products
- Costs: The data are typically easy to transform into measures
Weaknesses:
- Level of detail: Validated questionnaires may not address the features of the interface you are interested in
- Objectivity: Based on subjective judgement
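
For example, the System Usability Scale (SUS) is a widely used validated questionnaire whose data are easy to transform into a measure. A small sketch of its standard scoring rule; the responses are invented:

```python
def sus_score(responses):
    """System Usability Scale score (0-100) from ten 1-5 Likert responses.
    Odd-numbered items are positively worded (score = response - 1),
    even-numbered items negatively worded (score = 5 - response);
    the sum is scaled by 2.5, per the standard SUS scoring rule."""
    if len(responses) != 10 or not all(1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses in the range 1-5")
    total = sum((r - 1) if i % 2 == 1 else (5 - r)
                for i, r in enumerate(responses, start=1))
    return total * 2.5


# One hypothetical participant
print(sus_score([4, 2, 5, 1, 4, 2, 4, 1, 5, 2]))  # 85.0
```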

Summary: data-collection techniques
The assessment concerns MEASURES, not use/problem descriptions.
(Table on slide; legend: ++ = very good; + = good; - = not so good; -- = poor)

Use/problem descriptions
Observation and interviews are the most suitable data-collection techniques for use/problem descriptions.

Evaluation of measures
In addition to the evaluation criteria for the data-collection techniques:
- Validity
- Reliability

Validity
Do you measure what you believe you measure?

Reliability
Do you obtain the same results when you measure the same thing under similar conditions at different points in time?

Relationship between validity & reliability
Evaluating the validity of a measure is primarily based on subjective judgement, while reliability is typically evaluated by means of statistics.
It is possible to obtain reliable results that are invalid, but not unreliable results that are valid!
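
As a sketch of the statistical side, test-retest reliability is often summarised by the correlation between two measurement sessions; the task times below are invented for illustration:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two series of measurements."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Task times (s) for the same five users in two similar sessions
session1 = [95, 120, 80, 140, 110]
session2 = [100, 115, 85, 150, 105]
print(f"test-retest r = {pearson_r(session1, session2):.2f}")  # ~0.96
# A high r suggests a reliable measure; it says nothing about validity
```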

How can you avoid invalid results?
- Use several measures!
- Triangulation
- Multiple operationalism

Ethical issues
- Be well prepared: act professionally!
- Create a script
  - Introduction
  - During the test
  - Debriefing
- Create a consent form

Ethical issues
- The product is being tested, not the user!
- Respectful treatment: preserve integrity
- Informed consent
  - Inform the user about what will happen, how the collected data will be used, etc.
  - Make sure the user understands and agrees
  - The user may leave whenever she/he wants
- Confidentiality

Types of measures
- Experience-attitude
- Performance
- Cognitive

Experience-attitude
Strengths:
- Utility: Can address most usability attributes
- Validity: User-centered; we ask for the user’s opinions
Weaknesses:
- Validity/Objectivity: Based on the user’s subjective judgement

Performance: completeness
Strengths:
- Utility: Can be used for most tasks and in different settings
- Costs: Quite easy to create a list of activities
Weaknesses:
- Validity/Reliability: The user may choose a solution path you didn’t think of, but that nevertheless is satisfactory
- Validity (sensitivity): Ceiling or floor effects: the task is too easy or too difficult
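
A sketch of how a completeness measure could be computed from such an activity list, with a crude check for ceiling and floor effects; the activities, the observations, and the 90%/10% thresholds are assumptions for illustration:

```python
def completeness(completed, activities):
    """Share of the predefined activities a user completed (0.0-1.0)."""
    return sum(1 for a in activities if a in completed) / len(activities)

activities = ["open search", "enter query", "filter results", "save item"]
completed_per_user = [  # hypothetical observations
    {"open search", "enter query", "filter results", "save item"},
    {"open search", "enter query", "save item"},
    {"open search", "enter query", "filter results", "save item"},
]

scores = [completeness(done, activities) for done in completed_per_user]
mean = sum(scores) / len(scores)
print(f"mean completeness: {mean:.0%}")
if mean > 0.90:
    print("possible ceiling effect: task may be too easy")
elif mean < 0.10:
    print("possible floor effect: task may be too difficult")
```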

Summary of measures
(Table on slide; legend: ++ = very good; + = good; - = not so good; -- = poor)

Relation between data-collection techniques and measures
(Table on slide; legend: ++ = very good; + = good; - = not so good; -- = poor)

Relation between data-collection techniques and measures
The choice of measure and data-collection technique depends on:
- Practical limitations
- The purpose of the test