Published by Dominic Clarke. Modified over 9 years ago.
1
Evolving Evaluation: from Engineers to Experience Stanford University Human-Computer Interaction Seminar 27 April 2007 Joseph ‘Jofish’ Kaye Cornell University, Ithaca NY jofish @ cornell.edu
2
What is evaluation? Part of the design-build-evaluate iterative design cycle A comparison of ‘built’ to ‘planned’ A place to reflect on both this and the next design But also… A way of defining a field A way a discipline validates the knowledge it creates.
3
What is evaluation? Something you do at the end of a project to show it works… … so you can publish it. A reason papers get rejected Which are other ways of saying: A way of defining a field A way a discipline validates the knowledge it creates. A validation of a design and knowledge about that design
4
HCI Evaluation: Validity “Methods for establishing validity vary depending on the nature of the contribution. They may involve empirical work in the laboratory or the field, the description of rationales for design decisions and approaches, applications of analytical techniques, or ‘proof of concept’ system implementations” CHI 2007 Website
5
So… How and why did we end up with the system(s) we use for HCI evaluation today? How can our current approaches to evaluation deal with novel concepts of HCI, such as third-wave or experience-focused (rather than task focused) HCI? And in particular…
6
The Virtual Intimate Object (VIO) A device for couples in long distance relationships to communicate intimacy When one partner clicks, the other’s circle lights up, and then fades over time. www.intimateobjects.org Kaye & Goulding. Intimate Objects. Proc. DIS’04 Kaye, Levitt, Nevins, Golden & Schmidt. Communicating Intimacy One Bit at a Time. Ext. Abs. CHI 2005. Kaye. I just clicked to say I love you. alt.chi, Ext. Abs. CHI 2006.
7
Evaluation of the VIO It’s about the experience; it’s not about the task How can we measure intimacy and the transmission thereof? Kaye, Levitt, Nevins, Golden & Schmidt. Communicating Intimacy One Bit at a Time. Ext. Abs. CHI 2005. Kaye. I just clicked to say I love you. alt.chi, Ext. Abs. CHI 2006.
8
The 19 Hearts Problem
9
The 19 Hearts Problem: How can you evaluate the ineffable?
10
Understanding how we got to where we are today
1. Evaluation by Engineers
2. Evaluation by Computer Scientists
3. Evaluation by Experimental Psychologists & Cognitive Scientists
4. Evaluation by HCI Professionals
5. (Evaluation in CSCW)
6. Evaluation for Experience
11
(with case studies)
1. Evaluation by Engineers
2. Evaluation by Computer Scientists
3. Evaluation by Experimental Psychologists & Cognitive Scientists
   a. Evaluation of Text Editors
4. Evaluation by HCI Professionals
   a. Damaged Merchandise
5. Evaluation in CSCW
6. Evaluation for Experience
   a. The VIO
   b. Home Health Horoscopes
   c. The Whereabouts Clock
12
HCI History: 3 Questions to ask about an era Who are the users? Who are the evaluators? What are the limiting factors? P.S. And note the simplification
13
Evaluation by Engineers Users are engineers & mathematicians Evaluators are engineers The limiting factor is reliability
14
Evaluation by Computer Scientists Users are programmers Evaluators are programmers The speed of the machine is the limiting factor
15
Evaluation by Experimental Psychologists & Cognitive Scientists Users are users: the computer is a tool, not an end result Evaluators are cognitive scientists and experimental psychologists: they’re used to measuring things through experiment The limiting factor is what the human can do
16
Case Study of ExPsych / CogSci Evaluation: Text Editors Roberts & Moran, 1982, 1983. Their methodology for evaluating text editors had three criteria: objectivity thoroughness ease-of-use
17
Case Study: Text Editors objectivity “implies that the methodology not be biased in favor of any particular editor’s conceptual structure” thoroughness “implies that multiple aspects of editor use be considered” ease-of-use (of the method, not the editor itself) “the methodology should be usable by editor designers, managers of word processing centers, or other nonpsychologists who need this kind of evaluative information but who have limited time and equipment resources”
19
Case Study: Text Editors Text editors are the white rats of HCI Thomas Green, 1984, in Grudin, 1990.
20
Evaluation by Usability Professionals Evaluators are usability professionals (often with Exp.Psych/CogSci backgrounds) Users are white collar, with non-IT jobs, just using computers The limiting factor is the time of the worker accomplishing their job
21
Evaluation by HCI Professionals They believe in expertise over experiment (Nielsen 1984) They decided to focus on better results, whether or not those results were experimentally provable.
22
Case Study: The Damaged Merchandise Debate
23
Damaged Merchandise Setup Early eighties: usability evaluation methods (UEMs) - heuristics (Nielsen) - cognitive walkthrough - GOMS - …
24
Damaged Merchandise Comparison Studies Jeffries, Miller, Wharton and Uyeda (1991) Karat, Campbell and Fiegel (1992) Nielsen (1992) Desurvire, Kondziela, and Atwood (1992) Nielsen and Phillips (1993)
25
Damaged Merchandise Panel Wayne D. Gray, Panel at CHI’95 Discount or Disservice? Discount Usability Analysis at a Bargain Price or Simply Damaged Merchandise
26
Damaged Merchandise Paper Wayne D. Gray & Marilyn Salzman Special issue of HCI: Experimental Comparisons of Usability Evaluation Methods
27
Damaged Merchandise Response Commentary on Damaged Merchandise Karat: experiment in context Jeffries & Miller: real-world Lund & McClelland: practical John: case studies Monk: broad questions Oviatt: field-wide science MacKay: triangulate Newman: simulation & modelling
28
Damaged Merchandise Clash of Paradigms Experimental Psychologists & Cognitive Scientists (who believe in experimentation) vs. HCI Professionals (who believe in experience and expertise, even if ‘unprovable’) (and who were trying to present their work in the terms of the dominant paradigm of the field.)
29
Damaged Merchandise Clash of Paradigms It’s not about who’s right It’s about presenting work in the terms of the dominant paradigm of the field It’s about recognizing what paradigm clashes look like in HCI It’s thinking about how to recognize and re-think our own approaches to knowing and doing HCI
30
Experience Focused HCI A possibly emerging sub-field, drawing from traditions and disciplines outside the field Emphasis on the experience, not [just] the task Thinking about technology as more like… a car than a text editor Wright & McCarthy, Gaver, Blythe, Höök, Taylor & Swan, Bødker, Petersen, Isbister…
31
Experience Focused HCI For example… How can you evaluate a car? Why do you drive what you drive? –Grad-student-chic? –Eco-chic? –Machismo? Safety? Gay? Speed? For users, ‘HCI’ is cultural as well as technological We’ll fail if we evaluate purely on task
32
Experience Focused HCI The users are everybody, in everyday life. The evaluators – perhaps - are ethnographers and designers and documentary filmmakers and writers and playwrights The limiting factor might be how to express oneself, how to be and be seen (or not). (P.S. This stuff’s harder from the inside than the outside.)
33
Evaluating Experience focused HCI: The Virtual Intimate Object So how did we evaluate the VIO? Kaye, Levitt, Nevins, Golden & Schmidt. Communicating Intimacy One Bit at a Time. Ext. Abs. CHI 2005.
34
Evaluating Experience focused HCI: The Virtual Intimate Object Daily logbook entries: open ended, user interpreted defamiliarizing questions about the situation impacted by the technology as well as the technology itself that leverage both designers’ & users’ skills in lay cultural interpretation.
35
What color is your relationship? Has the VIO made you feel closer to your partner? Further away? What TV show represents your family? What song? What director would direct a story about your mobile use? Why? Kaye. I just clicked to say I love you. alt.chi, Ext. Abs. CHI 2006.
36
Evaluating Experience focused HCI: The Virtual Intimate Object The color that currently best represents my relationship is… Amber/yellow --> do I proceed w/ caution or speed up to beat the red or slow down anticipating a step Purple - we have a more matured, aged relationship rather than a new, boundless one which would best be described by red. Purple is the more aged, ripened form of red. Yellow! Like a sun, like a summer. I often laugh with Sven especially in those days. Using Vio is really funny and interesting. Kaye, J. ‘J.’ I just clicked to say I love you. alt.chi, Ext. Abs. CHI 2006.
37
Experience focused HCI: cultural commentators Gaver: cultural commentators with expertise in their own fields provide multiple levels of assessment. Gaver, W. (2007) Cultural Commentators for Polyphonic Assessment. To appear, IJHCI.
38
Experience focused HCI: Home Health Horoscope Domestic ubiquitous computing Privacy-preserving sensors Wellbeing in the home Output to encourage reflection Emphasis on the users’ interpretation Designing for serendipity Difficult problem Gaver, Sengers, Kerridge, Kaye & Bowers. Home Health Horoscope. To appear Proc. CHI’07
39
Evaluating the Whereabouts Clock Second round: 5 families, n=~30 3-10+ weeks per family Phones, WAC for all Open-ended diaries Weekly visits Some guided brainstorming Sellen, A., Eardley, R., Izadi, S., and Harper, R. The whereabouts clock: early testing of a situated awareness device. Ext. Abs. CHI’06
40
Evaluating the Whereabouts Clock Some insights: Location-in-interaction: the social and interactive affordances of location Mums’ situated knowledge of expected events: technological output is not blindly accepted Brown, Taylor, Izadi, Sellen, & Kaye. Locating Family Values: A Field Trial of the Whereabouts Clock. Under consideration for Ubicomp 2007.
41
An evolving discussion Shameless plugs for those attending CHI: alt.chi: Evaluating Evaluation (Monday 4:30p) paper: Home Health Horoscopes (Tuesday 2:30p) sig: Evaluating Experience-focused HCI (Thursday 9a) Special thanks to Terry Winograd and my very good personal friend Wandy Jo. Also thanks to Phoebe Sengers & the Culturally Embedded Computing Group, BostonCHI, Alex Taylor, Ken Wood, Richard Harper, Abi Sellen, Shahram Izadi, Lorna Brown & the CMLG, Microsoft Cambridge, Apala Lahiri Chavan & Eric Schaffer, HFI, CHI Bangalore, CHI Mumbai, BostonCHI, the Cornell S&TS Department, Maria Håkansson & IT University Göteborg, Louise Barkhuus, Barry Brown & University of Glasgow, Mark Blythe & University of York, Andy Warr & the Oxford E-Research Center, Susanne Bødker, Marianne Graves Petersen & The University of Aarhus, Jonathan Grudin, Liam Bannon, Gilbert Cockton, William Newman, Kirsten Boehner, Jeff Hancock, Bill Gaver, Janet Vertesi, Kia Höök, Jarmo Laaksolahti, Anna Ståhl, Helen Jeffries, Paul Dourish, Jen Rode, Peter Wright, Ryan Aipperspach, Bill Buxton, Michael Lynch, Seth ‘Beemer’ McGinnis & Katherine Isbister.