IAT 334 Experimental Evaluation
School of Interactive Arts + Technology [SIAT]


Evaluation
• Evaluation styles
• Subjective data
  – Questionnaires, interviews
• Objective data
  – Observing users: techniques, recording
• Usability specifications
  – Why, how

Our goal?

Evaluation
• Earlier:
  – Interpretive and predictive: heuristic evaluation, walkthroughs, ethnography…
• Now:
  – With users involved: usage observations, experiments, interviews…

Evaluation Forms
• Summative
  – After a system has been finished; makes judgments about the final item.
• Formative
  – As the project is forming, all through the lifecycle. Early, continuous.

Evaluation Data Gathering
• Design the experiment to collect the data to test the hypothesis to evaluate the interface to refine the design
• The information we gather about an interface can be subjective or objective
• The information can also be qualitative or quantitative
  – Which are tougher to measure?

Subjective Data
• Satisfaction is an important factor in performance over time
• Learning what people prefer is valuable data to gather

Methods
• Ways of gathering subjective data:
  – Questionnaires
  – Interviews
  – Booths (e.g., at a trade show)
  – Call-in product hot-line
  – Field support workers

Questionnaires
• Preparation is expensive, but administration is cheap
• Oral vs. written
  – Oral advantages: can ask follow-up questions
  – Oral disadvantages: costly, time-consuming
• Written forms can provide better quantitative data

Questionnaires
• Issues
  – A questionnaire is only as good as the questions you ask
  – Establish the purpose of the questionnaire
  – Don't ask things that you will not use
  – Who is your audience?
  – How do you deliver and collect the questionnaire?

Questionnaire Topic
• Can gather demographic data and data about the interface being studied
• Demographic data:
  – Age, gender
  – Task expertise
  – Motivation
  – Frequency of use
  – Education/literacy

Interface Data
• Can gather data about:
  – the screen
  – graphic design
  – terminology
  – capabilities
  – learning
  – overall impression
  – …

Question Format
• Closed format
  – Answers are restricted to a set of choices
  – Example: "Characters on screen," rated on a scale from "hard to read" to "easy to read"

Closed Format
• Likert scale
  – A typical scale uses 5, 7 or 9 choices
  – More choices than that are hard to discern
  – An odd number of choices gives a neutral choice in the middle
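Because Likert responses are ordinal rather than interval data, medians and modes are safer summaries than means. A minimal sketch in Python, using invented responses to a single 5-point item (all values here are hypothetical, not from the lecture):

```python
from collections import Counter
from statistics import median

# Hypothetical responses to one 5-point Likert item
# (1 = "hard to read" ... 5 = "easy to read").
responses = [4, 5, 3, 4, 2, 5, 4, 3, 4, 5]

counts = Counter(responses)
print("Distribution:", {k: counts.get(k, 0) for k in range(1, 6)})

# Ordinal data: report the median and mode, not the mean.
print("Median:", median(responses))
print("Mode:", counts.most_common(1)[0][0])
```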

Closed Format
• Advantages
  – Clarifies the alternatives
  – Easily quantifiable
  – Eliminates useless answers
• Disadvantages
  – Must cover the whole range of answers
  – All choices should be equally likely
  – You don't get interesting, "different" reactions

Issues
• Question specificity
  – "Do you have a computer?"
• Language
  – Beware of terminology and jargon
• Clarity
• Leading questions
  – Can be phrased either positively or negatively

Issues
• Prestige bias
  – People answer a certain way because they want you to think of them that way
• Embarrassing questions
• Hypothetical questions
• "Halo effect"
  – When the estimate of one feature affects the estimate of another (e.g., intelligence and looks)

Deployment
• Steps
  – Discuss the questions among the team
  – Administer the questionnaire verbally or in writing to a few people (a pilot), and verbally query them about their thoughts on the questions
  – Administer the final version

Open-ended Questions
• Ask for unprompted opinions
• Good for general, subjective information, but difficult to analyze rigorously
• May help with design ideas
  – "Can you suggest improvements to this interface?"

Ethics
• People can be sensitive about this process and these issues
• Make sure they know you are testing the software, not them
• Attribution theory
  – Studies why people believe they succeeded or failed: themselves, or outside factors (gender, age differences)
• Participants can quit at any time

Objective Data
• Users interact with the interface
  – You observe, monitor, calculate, examine, measure, …
• Objective, scientific data gathering
• Compare this with interpretive/predictive evaluation

Observing Users
• Not as easy as you think
• One of the best ways to gather feedback about your interface
• Watch, listen and learn as a person interacts with your system

Observation
• Direct
  – In the same room
  – Can be intrusive; users are aware of your presence
  – You only see the session once
  – May use a semitransparent mirror to reduce intrusiveness
• Indirect
  – Video recording
  – Reduces intrusiveness, but doesn't eliminate it
  – Cameras focused on the screen, face & keyboard
  – Gives an archival record, but you can spend a lot of time reviewing it

Location
• Observations may be:
  – In the lab, maybe a specially built usability lab
    · Easier to control
    · Can have the user complete a set of tasks
  – In the field
    · Watch users' everyday actions
    · More realistic
    · Harder to control other factors

Challenge
• In simple observation you see the user's actions, but you don't know what's going on in their head
• Often you use some form of verbal protocol, in which users describe their thoughts

Verbal Protocol
• One technique: think-aloud
  – The user describes verbally what s/he is thinking and doing:
    · What they believe is happening
    · Why they take an action
    · What they are trying to do

Think Aloud
• Very widely used, useful technique
• Allows you to better understand the user's thought processes
• Potential problems:
  – Can be awkward for the participant
  – Thinking aloud can modify the way the user performs the task

Teams
• Another technique: co-discovery learning
  – Join pairs of participants to work together
  – Use think-aloud
  – Perhaps have one person be a semi-expert (the coach) and one a novice
  – More natural (like a conversation), so it removes some of the awkwardness of individual think-aloud

Alternative
• What if thinking aloud during the session would be too disruptive?
• Can use a post-event protocol
  – The user performs the session, then watches the video afterwards and describes what s/he was thinking
  – Sometimes it is difficult to recall

Historical Record
• When observing users, how do you capture the events of the session for later analysis?

Capturing a Session
• 1. Paper & pencil
  – Can be slow
  – May miss things
  – Is definitely cheap and easy
  – [Sample logging sheet: times such as 10:00, 10:03, 10:08, 10:22 recorded against Task 1, Task 2, Task 3, …]

Capturing a Session
• 2. Audio tape
  – Good for talk-aloud
  – Hard to tie to the interface
• 3. Video tape
  – Multiple cameras probably needed
  – Gives a good record
  – Can be intrusive

Capturing a Session
• 4. Software logging
  – Modify the software to log user actions
  – Can give time-stamped key presses or mouse events
  – Two problems:
    · Too low-level; we want higher-level events
    · Massive amount of data; we need analysis tools
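As a sketch of what such instrumentation might look like, the snippet below logs events from a toy tkinter window (the widgets, file name, and event choices are illustrative assumptions, not from the lecture). It records both low-level events and one higher-level, task-level event, since raw key presses alone are too low-level to analyze easily:

```python
import csv
import time
import tkinter as tk

# Open a CSV log of time-stamped events.
log_file = open("session_log.csv", "w", newline="")
writer = csv.writer(log_file)
writer.writerow(["timestamp", "event", "detail"])

def log_event(kind, detail):
    writer.writerow([f"{time.time():.3f}", kind, detail])

root = tk.Tk()
root.title("Logged session")
entry = tk.Entry(root)
entry.pack(padx=20, pady=10)

# Higher-level, task-level event: one row per completed task.
tk.Button(root, text="Add appointment",
          command=lambda: log_event("task", "add_appointment")).pack(pady=10)

# Low-level events: every key press and mouse click.
root.bind_all("<Key>", lambda e: log_event("key", e.keysym))
root.bind_all("<Button>", lambda e: log_event("click", f"({e.x},{e.y})"))

root.mainloop()
log_file.close()
```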

Assessing Usability
• Usability specifications
  – Quantitative usability goals, used as a guide for knowing when an interface is "good enough"
  – Should be established as early as possible in the development process

Measurement Process
• "If you can't measure it, you can't manage it"
• You need to keep gathering data on each iterative refinement

What to Measure?
• Usability attributes
  – Initial performance
  – Long-term performance
  – Learnability
  – Retainability
  – Advanced feature usage
  – First impression
  – Long-term user satisfaction

How to Measure?
• Benchmark task
  – A specific, clearly stated task for users to carry out
• Example: calendar manager
  – "Schedule an appointment with Prof. Smith for next Thursday at 3pm."
• Users perform these tasks under a variety of conditions and you measure their performance

Assessment Technique

Usability attribute | Measuring instrument | Value to be measured | Current level | Worst acceptable level | Planned target level | Best possible level | Observed results
Initial performance | Benchmark task | Length of time to successfully add an appointment on first trial | 15 secs (manual) | 30 secs | 20 secs | 10 secs | (to be filled in)
First impression | Questionnaire | ?? | | | | |
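To show how such a specification can be used mechanically, here is a small Python sketch that classifies observed benchmark-task times against the levels in the table above (the observed times are invented for illustration):

```python
# Levels from the "initial performance" row of the table above.
spec = {
    "worst_acceptable": 30.0,  # seconds
    "planned_target": 20.0,
    "best_possible": 10.0,
}

def assess(observed_secs):
    """Classify one observed benchmark-task time against the spec."""
    if observed_secs > spec["worst_acceptable"]:
        return "fails the worst acceptable level"
    if observed_secs > spec["planned_target"]:
        return "acceptable, but misses the planned target"
    return "meets or beats the planned target"

# Hypothetical observed results from five participants:
for t in [22.4, 18.1, 31.0, 15.5, 19.8]:
    print(f"{t:5.1f} s -> {assess(t)}")
```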

Summary
• Measuring instruments
  – Questionnaires
  – Benchmark tasks

Summary
• Values to be measured
  – Time to complete a task
  – Number or percentage of errors
  – Percent of the task completed in a given time
  – Ratio of successes to failures
  – Number of commands used
  – Frequency of help usage
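Most of these values fall out of simple arithmetic over per-trial records. A sketch, using invented session data (field names and values are hypothetical):

```python
# Hypothetical per-trial records extracted from a session log.
trials = [
    {"secs": 22.4, "errors": 1, "success": True,  "help_uses": 0},
    {"secs": 31.0, "errors": 3, "success": False, "help_uses": 2},
    {"secs": 18.1, "errors": 0, "success": True,  "help_uses": 0},
    {"secs": 19.8, "errors": 2, "success": True,  "help_uses": 1},
]

n = len(trials)
successes = sum(t["success"] for t in trials)

print("Mean time to complete:", sum(t["secs"] for t in trials) / n, "secs")
print("Total errors:", sum(t["errors"] for t in trials))
print("Success:failure ratio:", f"{successes}:{n - successes}")
print("Mean help uses per trial:", sum(t["help_uses"] for t in trials) / n)
```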

Summary
• Target level
  – Often established by comparison with a competing system or a non-computer-based task

Ethics
• Testing can be arduous
• Each participant should consent to being in the experiment (informally or formally)
  – They should know what the experiment involves, what to expect, and what the potential risks are
• Participants must be able to stop without danger or penalty
• All participants are to be treated with respect

Consent
• Why is it important?
  – People can be sensitive about this process and these issues
  – Errors will likely be made, and the participant may feel inadequate
  – The session may be mentally or physically strenuous
• What are the potential risks (there are always risks)?
  – Examples?
• "Vulnerable" populations need special care & consideration (& IRB review)
  – Children; disabled people; pregnant women; students (why?)

Before Study
• Be well prepared so the participant's time is not wasted
• Make sure they know you are testing the software, not them
  – (Usability testing, not user testing)
• Maintain privacy
• Explain the procedures without compromising the results
• Participants can quit at any time
• Administer the signed consent form

During Study
• Make sure the participant is comfortable
• The session should not be too long
• Maintain a relaxed atmosphere
• Never indicate displeasure or anger

After Study
• State how the session will help you improve the system
• Show the participant how to perform the tasks they failed
• Don't compromise privacy (never identify people; only show videos with explicit permission)
• Data should be stored anonymously and securely, and/or destroyed

One Model
[Diagram slide; the figure is not preserved in the transcript]