Heuristic Evaluation John Kelleher
1 What do you want for your product? Good quality? Inexpensive? Quick to get to the market? Good, cheap, quick: pick any two. - Old engineer’s saying
2 Outline Discount usability engineering Heuristic evaluation Heuristics How to perform an HE HE vs. user testing How well does HE work
3 Discount Usability Engineering Cheap no special labs or equipment needed the more careful you are, the better it gets Fast on order of 1 day to apply standard usability testing may take weeks Easy to use can be taught in 2-4 hours
4 Expert Evaluation
Advantages: strongly diagnostic; overview of whole interface; few resources needed (except for experts); cheap; high potential return (detects significant problems)
Disadvantages: relies on role playing, which can be restricting; subject to bias; problems locating experts; cannot capture real user behaviour
5 Heuristic Evaluation Developed by Jakob Nielsen Helps find usability problems in a UI design Small set (3-5) of evaluators examine UI independently check for compliance with usability principles (“heuristics”) different evaluators will find different problems Can perform on working UI or on sketches
6 Heuristic Evaluation (cont.) Each evaluator goes through UI several times inspects various dialogue elements compares with list of usability principles considers any additional principles or results that come to mind Usability principles Nielsen’s “heuristics” supplementary list of category-specific heuristics competitive analysis & user testing of existing products Use violations to redesign/fix problems
7 Heuristics (original) H1-1: Simple and natural dialog H1-2: Speak the users’ language H1-3: Minimize users’ memory load H1-4: Consistency H1-5: Feedback H1-6: Clearly marked exits H1-7: Shortcuts H1-8: Precise and constructive error messages H1-9: Prevent errors H1-10: Help and documentation
8 Phases of Heuristic Evaluation 1) Pre-evaluation training give evaluators needed domain knowledge and information on the scenario 2) Evaluation individuals evaluate and then aggregate results 3) Severity rating determine how severe each problem is (priority) 4) Debriefing discuss the outcome with design team
9 How to Perform Evaluation Design may be a verbal description, paper mock-up, working prototype, or running system. [When evaluating paper mock-ups, pay special attention to missing dialogue elements!] Optionally provide evaluators with some domain-specific training. Each evaluator works alone (~1–2 hours). Interface examined in two passes: first pass focuses on general flow, second on individual dialogue elements. Notes taken either by evaluator or evaluation manager. Independent findings are aggregated. Severity ratings are assigned first individually and are then aggregated. Group debriefing session to suggest possible redesigns.
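The aggregation step can be sketched in a few lines of Python. This is only an illustration: the evaluator names and problem descriptions are hypothetical, and it assumes that duplicate reports have already been normalised to the same wording.

```python
from collections import defaultdict


def aggregate_findings(findings_by_evaluator):
    """Merge per-evaluator problem lists into a map:
    problem -> set of evaluators who reported it."""
    merged = defaultdict(set)
    for evaluator, problems in findings_by_evaluator.items():
        for problem in problems:
            merged[problem].add(evaluator)
    return dict(merged)


# Hypothetical findings from three evaluators working alone.
findings = {
    "evaluator_a": ["no cancel button", "jargon label"],
    "evaluator_b": ["jargon label", "unclear error message"],
    "evaluator_c": ["no cancel button", "inconsistent fonts"],
}

merged = aggregate_findings(findings)
for problem in sorted(merged):
    print(f"{problem}: reported by {len(merged[problem])} of {len(findings)}")
```

No single evaluator in this made-up data reports more than two problems, yet the merged list contains four, which is the point of having evaluators work independently.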
10 Severity Rating Used to allocate resources to fix problems Estimates of need for more usability efforts Combination of frequency impact number of affected users Should be calculated after all evals. are in Should be done independently by all judges
11 Severity Ratings (cont.) 0 - don’t agree that this is a usability problem 1 - cosmetic problem 2 - minor usability problem 3 - major usability problem; important to fix 4 - usability catastrophe; imperative to fix
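One simple way to combine independent ratings on this 0-4 scale is to average them per problem and sort by the result. The sketch below assumes averaging as the aggregation rule (the slides only say that ratings are aggregated), and the problems and ratings are hypothetical.

```python
from statistics import mean


def rank_by_mean_severity(ratings_by_problem):
    """Return (problem, mean rating) pairs, most severe first."""
    scored = [(problem, mean(ratings))
              for problem, ratings in ratings_by_problem.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical ratings: one 0-4 score per judge, assigned independently.
ratings = {
    "jargon label": [2, 1, 2],
    "no cancel button": [4, 3, 4],
    "inconsistent fonts": [1, 1, 0],
}

ranked = rank_by_mean_severity(ratings)
for problem, score in ranked:
    print(f"{score:.2f}  {problem}")
```

The sorted list gives a rough priority order for allocating resources to fixes.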
12 How Many Problems Found? Four heuristic evaluations were conducted by “usability novices” (Nielsen93, UE)
13 Aggregated Evaluations Individual evaluators found relatively few problems. Aggregating the evaluations of several individuals produced much better results:
14 Aggregated Evaluations Average proportion of usability problems found by aggregates of size 1 to 30.
15 Debriefing Conduct with evaluators, observers, and development team members Discuss general characteristics of UI Suggest potential improvements to address major usability problems Make it a brainstorming session little criticism until end of session
16 Examples Can’t copy info from one window to another violates “Minimize the users’ memory load” (H1-3) fix: allow copying Typography uses mix of upper/lower case formats and fonts violates “Consistency and standards” (H2-4) slows users down probably wouldn’t be found by user testing fix: pick a single format for entire interface
17 HE vs. User Testing HE is much faster 1-2 hours each evaluator vs. days-weeks HE doesn’t require interpreting user’s actions User testing is far more accurate (by def.) takes into account actual users and tasks HE may miss problems & find “false positives” Good to alternate between HE and user testing find different problems don’t waste participants
18 Results of Using HE Discount: benefit-cost ratio of 48 [Nielsen94] cost was $10,500 for benefit of $500,000 value of each problem ~15K (Nielsen & Landauer) how might we calculate this value? in-house productivity; open market sales Correlation between severity & finding w/ HE Single evaluator achieves poor results only finds 35% of usability problems 5 evaluators find ~ 75% of usability problems why not more evaluators???? 10? 20?
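The quoted benefit-cost ratio follows directly from the two dollar figures on the slide. A minimal check, using only those figures:

```python
cost = 10_500      # cost of the heuristic evaluation, in dollars (from the slide)
benefit = 500_000  # estimated benefit, in dollars (from the slide)

ratio = benefit / cost
print(f"benefit-cost ratio: {ratio:.1f}")  # 47.6, which rounds to the quoted 48
```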
19 # Evals vs Problems Found
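The diminishing return from adding evaluators can be illustrated with the simple model associated with Nielsen and Landauer, in which each evaluator finds any given problem independently with the same probability. The sketch below rests on that assumption; the detection rate is a hypothetical value chosen for illustration, not a figure from these slides, and real evaluators overlap in ways the model ignores.

```python
def proportion_found(evaluators, detection_rate):
    """Expected share of problems found by `evaluators` independent
    evaluators, each finding a given problem with `detection_rate`."""
    return 1 - (1 - detection_rate) ** evaluators


DETECTION_RATE = 0.25  # hypothetical per-evaluator rate, for illustration only

previous = 0.0
for n in (1, 3, 5, 10, 20):
    found = proportion_found(n, DETECTION_RATE)
    print(f"{n:>2} evaluators: {found:.0%} found (+{found - previous:.0%})")
    previous = found
```

Under this model each extra evaluator adds less than the one before, while the cost of each evaluator stays roughly constant, which is one way to see why a small group is usually recommended over 10 or 20.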
20 Cost vs. Benefit
21 Experience of Evaluators Experience of evaluators influences results. Study of one interface, the Banking System, a touch tone “voice response” telephone banking system, by 3 groups of evaluators: 31 “novice” evaluators: computer science students with no formal knowledge of UI or usability (no usability expertise). 19 “regular” specialists: people with UI and usability experience, but no expertise in voice-response systems (usability expertise). 14 “double” specialists: people with expertise both in usability and in telephone-operated interfaces (usability and domain expertise). Task: transfer $1000 from savings account to checking account.
22 Sample Banking System Dialogue
[First there is a short dialogue in which the user is identified by entering an identification number and access code; this is not part of the evaluation exercise.]
1) S: Enter one for account information, three for transfers between your own accounts,...
2) U: 3# {the user interrupts the system}
3) S: Enter account to transfer from.
4) U: # {savings account number}
5) S: Enter account to transfer to.
6) U: # {an abbreviation for the checking account}
7) S: Enter amount in cents.
8) U: #
9) S: From account number twelve thirtyfour fiftysix seventyeight ninety to account number primary account, a transfer of one thousand dollars is to be made. Press one to confirm, zero to cancel.
10) U: 1#
11) S: You do not have access to this function.
23 Major Usability Problems Proportion of novice, specialist, and double specialist usability evaluators finding problems in the Banking System. Results from Nielsen [1992].
24 Minor Usability Problems
25 Results Average proportion of usability problems found by aggregates of novice evaluators, regular specialists, and double specialists. Results from Nielsen [1992].
26 Heuristic Evaluation Test The following figure illustrates a checkout screen for an online store. We describe ten usability violations. Each violation is labelled with a number on the figure. For each problem, suggest a solution.
27 Heuristic Evaluation Test
29 10 Heuristic Violations 1. H2-1 Visibility of System Status Problem: The UI only says that you are in stage 3; it does not tell the user how many stages are left. Solution: Indicate that the user is in Stage 3 of 6, or provide a timeline along the top of the page that steps the user through the transaction as they progress. 2. H2-2 Match between system and the real world Problem: The term “Wagon” does not match the user’s conceptual model of shopping. Solution: Change the term “Wagon” to “Cart” or “Basket”.
30 10 Heuristic Violations (contd) 3. H2-8 Aesthetic and minimalist design Problem: The news from the net section has nothing to do with the user’s transaction. This information is distracting and can lead to the user leaving our site to explore a news story and not complete their transaction. Solution: Remove this section. Can provide this kind of information after the transaction is completed. 4. H2-9 Help users recognize, diagnose, and recover from errors Problem: The message tells the user that the form has errors, but it doesn’t tell them which fields have errors. Potentially the user could create more errors by changing fields that were originally correct. Solution: Mark the fields that need to be changed. Move the error message to the top of the page and highlight the fields that the user needs to fix.
31 10 Heuristic Violations (contd) 5. H2-4 Consistency and standards Problem: The ‘Modify’ and ‘Change’ buttons seem to have the same functionality, so they should be labeled the same. If they do have distinct functions, they should be labeled more clearly and moved so that they do not mislead the user. Solution: Change the labels on both buttons to ‘Change Item’. 6. H2-3 User control and freedom Problem: The user is given only one choice: to proceed to the next page. There is no option to cancel or go back. Solution: Add cancel and back buttons to give the user more control over the process.
32 10 Heuristic Violations (contd) 7. H2-2 Match between system and the real world Problem: ‘Transmit’ is not a common term; it is a technical term for sending a form to be processed. Solution: Change the term to something clearer, like ‘Submit’. 8. H2-6 Recognition rather than recall Problem: To insert an item the user has to recall the item number. This is too much for the user to remember, especially if there is no correlation between the code and the item. Solution: Provide a link for the user to continue shopping. This will allow the user to go back to the initial page and search and browse items they might want to add to their cart.
33 10 Heuristic Violations (contd) 9. H2-4 Consistency and standards Problem: The text is in blue and underlined, signaling to the user that the text is a hyperlink, which it probably isn’t. Solution: Change the color and remove the underlining. Ideally this section should not even be on this page. 10. H2-5 Error prevention Problem: The fields for phone numbers are not fixed in length. This is an area where users can enter invalid data. Solution: To prevent users from accidentally entering incorrect data, set widths for the text fields so that a format is provided, or provide an example of how the entry should be made.
34 Summary Heuristic evaluation is a discount method. A single evaluator finds only a small subset of potential problems. Have evaluators go through the UI twice. Ask them to check whether it complies with the heuristics, noting where it doesn’t and why. Combine the findings from 3 to 5 evaluators. Have evaluators independently rate severity. Discuss problems with the design team. Alternate with user testing. May miss domain-specific problems.