“Retrospective vs. concurrent think-aloud protocols: usability testing of an online library catalogue.” Presented by: Aram Saponjyan & Elie Boutros.

Overview: The article discusses the think-aloud techniques used in usability tests. The two main think-aloud approaches, retrospective (RTA) and concurrent (CTA), are compared through the test of an online library catalogue. The three main points of comparison are the detected usability problems, the overall task performance, and the participants’ experiences.

Think Aloud Protocol: A method of usability evaluation that allows researchers to understand the thought process of test participants as they use a given product or device. It is a useful way for software designers to interact with potential users and to improve their designs based on user feedback.

RTA vs. CTA: In RTA, participants perform a set of tasks silently (while being videotaped) and then verbalize their experience at the end of the session while watching themselves on tape. In CTA, participants explain their thoughts as they test the product. A facilitator is always present to remind them to “think aloud” if they fall silent.

Both the retrospective and the concurrent techniques are used for usability tests of websites, GUIs, and database front ends. Both techniques are valid, useful, and widely adopted in usability tests. Both methods yield nearly unbiased software evaluations since participants do not have to recall their thoughts long after performing the tasks.

Advantages of CTA: CTA tends to involve less biased thoughts since users verbalize their thinking during task performance. “CTA is more faithful representative of a strictly task oriented usability test”. More problems are revealed through observation during task completion, whereas RTA depends heavily on the user’s verbalizations, which take place after task completion.

Disadvantages of CTA: Users might feel uncomfortable verbalizing their thoughts while performing the task at hand (especially if they are not doing so in their native language). Participants also carry the extra burden of speaking their thoughts while performing the tasks, whereas in RTA users have more time to verbalize problems after task completion.

Effects of CTA disadvantages on test results: This burden did not slow down task completion. However, the success rate was affected: CTA participants were less successful in completing their tasks than RTA participants.

Advantages of RTA: Participants are not burdened with the extra task of verbalizing their thoughts as they test. This makes the test easier for non-native English speakers, since they have more time to think and to translate their thoughts from their native language into English. Another benefit of RTA is the potential decrease in reactivity: participants can execute a task at their own pace and are not rushed in a way that affects their normal software usage, so they are more likely to perform neither better nor worse than usual.

Disadvantages of RTA: RTA might not describe the user experience as precisely as CTA, since users describe their experience after finishing their tasks. This delay might introduce biased judgment, since participants might forget specific things they experienced during task performance. Overall session time is also longer in RTA than in CTA, since RTA participants not only perform their tasks but also watch them in retrospect.

Test Object: The online library catalogue was chosen because it combines the characteristics of a search engine and a website, which makes it complex enough for novice users. The participants were a group of 40 university students recruited by means of announcements and printed forms. They were aged 18 to 24 and received a financial reward for participating.

Tasks: The tasks were all equally difficult and independent of one another in order to prevent participants from getting stuck. They were defined to cover the catalogue’s main search functions: simple search, advanced search, sorting results, and filtering results.

Questionnaires: Two questionnaires were given to the participants, one at the beginning of the test session and the other at the end. The first asked about demographic details such as age, gender and education. The second asked how participants felt about participating in the experiment.

Processing of the data: First, the total number of usability problems detected in each condition was examined. After that, a distinction was made according to the way the usability problems had surfaced in the data: (1) through observation of the behavioral data; (2) through verbalization by the participant; (3) through a combination of observation and verbalization.
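
The classification step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not say how the data were coded, so the `Detection` record, its field names, and the sample entries are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One detection of a usability problem (hypothetical record layout)."""
    problem_id: str
    observed: bool    # visible in the participant's behaviour
    verbalized: bool  # mentioned aloud by the participant


def detection_source(d: Detection) -> str:
    """Return the way a detection surfaced in the data."""
    if d.observed and d.verbalized:
        return "observed and verbalized"
    if d.observed:
        return "observed"
    if d.verbalized:
        return "verbalized"
    raise ValueError(f"{d.problem_id}: neither observed nor verbalized")


def tally_sources(detections):
    """Count detections per source category."""
    return Counter(detection_source(d) for d in detections)


# Made-up sample data, for illustration only (not from the study).
sample = [
    Detection("P1", observed=True, verbalized=False),
    Detection("P2", observed=True, verbalized=True),
    Detection("P3", observed=False, verbalized=True),
    Detection("P4", observed=True, verbalized=False),
]
counts = tally_sources(sample)
```

Running the tally once per condition (CTA and RTA) would give the per-condition breakdown that the comparison relies on.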

Problem Types:
Layout problems: The participant fails to spot a particular element within a screen of the catalogue.
Terminology problems: The participant does not comprehend part(s) of the terminology used in the catalogue.
Data entry problems: The participant does not know how to conduct a search (i.e. enter a search term, use dropdown windows, or start the actual searching).
Comprehensiveness problems: The catalogue lacks information necessary to use it effectively.
Feedback problems: The catalogue fails to give relevant feedback on searches conducted.

Results: 93% of all comments made by CTA participants corresponded to an observable problem in their task execution, compared to 54% of the comments made by RTA participants.

Of the 72 problems that were detected, 47% were reported in both conditions, 31% were detected exclusively in the CTA condition, and another 22% were detected exclusively in the RTA condition.
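
The overlap figures above amount to a set comparison between the problems found in each condition. The sketch below shows one way to compute such a breakdown. The function name and the problem identifiers are hypothetical, and the group sizes (34, 22 and 16) are not stated in the slides: they are an assumption, chosen because they are the whole-number counts that sum to 72 and round to the reported percentages.

```python
def overlap_breakdown(cta_problems, rta_problems):
    """Split the union of detected problems into shared and exclusive groups.

    Returns {group: (count, rounded percentage of all distinct problems)}.
    """
    cta, rta = set(cta_problems), set(rta_problems)
    total = len(cta | rta)
    if total == 0:
        raise ValueError("no problems detected in either condition")
    groups = {
        "both": cta & rta,
        "cta_only": cta - rta,
        "rta_only": rta - cta,
    }
    return {
        name: (len(ids), round(100 * len(ids) / total))
        for name, ids in groups.items()
    }


# Hypothetical identifiers; group sizes are assumed, not taken from the slides.
shared = {f"P{i}" for i in range(1, 35)}     # 34 problems found in both
cta_only = {f"P{i}" for i in range(35, 57)}  # 22 problems found only in CTA
rta_only = {f"P{i}" for i in range(57, 73)}  # 16 problems found only in RTA

breakdown = overlap_breakdown(shared | cta_only, shared | rta_only)
```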

This table shows that 89% of all the problem detections involved problems that were experienced by participants in both conditions.

What do these tables show? The CTA participants had to verbalize and work at the same time, which gave them less time to comment on problems that were not acute. While the CTA method reveals more problems that can be observed during task performance, the RTA method depends more on the participants’ verbalizations. Verbal protocols in this study do not so much serve to reveal problems but rather to verbally support the problems that are otherwise observable.

Task performance: Does the double workload in CTA have an effect on the participants’ task performance? Indicators: (1) the successful completion of the seven tasks; (2) the time it took the participants to complete them. Result: No significant difference was found in completion time, but CTA participants completed fewer tasks successfully.
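
A comparison like the one above needs a significance test on a per-participant indicator, such as the number of tasks completed. The slides do not say which test the study used, so the sketch below substitutes a simple two-sided permutation test on the difference in group means. The function name, the parameters, and the scores are all hypothetical.

```python
import random
from statistics import mean


def permutation_test(group_a, group_b, n_resamples=2000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Returns an estimated p-value: the share of random relabelings whose
    absolute mean difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (extreme + 1) / (n_resamples + 1)


# Made-up scores (tasks completed out of seven), for illustration only.
cta_scores = [2, 3, 2, 4, 3, 1, 2, 3, 4, 2]
rta_scores = [3, 4, 3, 5, 2, 4, 3, 4, 3, 4]
p_value = permutation_test(cta_scores, rta_scores)
```

The same function could be applied to completion times. A p-value below a chosen threshold (0.05 is a common choice) would be read as a significant difference between the conditions.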

Participant experiences: Questions covered (1) experiences with concurrent or retrospective thinking aloud; (2) the method of working; (3) the presence of the facilitator and the recording equipment. Result: No significant differences were found in how the participants in the two conditions experienced CTA and RTA.

CTA participants found the test situation less disturbing than the RTA participants. Possible explanations: (1) RTA participants are given more time to fill in the questionnaire. (2) The presence of the facilitator during the first part of the RTA test (silent task performance) is less functional than in a CTA design, and it may be confronting for participants to watch their own actions on video. (3) The CTA participants had to actively perform tasks and think aloud, which considerably reduced the amount of attention they could spare for noticing the facilitator and the recording equipment.

Conclusion: Although both methods are comparable in terms of quantitative output, they differed significantly as to how this output was established. The RTA method proved more effective in revealing problems that were not observable and could only be detected by means of verbalization. RTA participants tended to give explanations and suggestions, while CTA participants more often limited themselves to describing their actions. Overall, the participants’ verbalizations contributed very little to the outcome of the usability test (in terms of user problems detected).

Conclusion: The task of concurrently verbalizing thoughts caused the participants to make more errors during task performance and to be less successful in completing the seven tasks. The explanation for the less successful performance under CTA lies in the participants’ workload; the difficulty of the tasks may have been a crucial factor in this study. A strong and new argument in favor of RTA protocols is that they may be less susceptible to the influence of task difficulty, both in terms of reactivity and in terms of completeness of the verbalizations.