Heuristic Evaluation IS 485, Professor Matt Thatcher
2 Agenda
• Administrivia
• Heuristic evaluation
3 Heuristic Evaluation
• Helps find usability problems in a UI design
• Can be performed on a working UI or on sketches
• A small set (3-5) of evaluators examines the UI
  – each evaluator independently goes through the UI several times
    » inspects various dialogue/design elements
    » compares them with a list of usability principles (or heuristics of good interface design)
    » identifies any violations of these heuristics
  – evaluators communicate only afterwards (i.e., no interaction during the evaluation), and their findings are aggregated
• Usability principles --> Nielsen’s heuristics
• Use violations to redesign / fix problems
4 Heuristics
• H2-1: Visibility of system status
• H2-2: Match between system and real world
• H2-3: User control and freedom
• H2-4: Consistency and standards
• H2-5: Error prevention
• H2-6: Recognition over recall
• H2-7: Flexibility and efficiency of use
• H2-8: Aesthetic and minimalist design
• H2-9: Help users recognize, diagnose, and recover from errors
• H2-10: Help and documentation
5 Phases of Heuristic Evaluation
1) Pre-evaluation training
  – give evaluators list of principles with which to evaluate
  – give evaluators needed domain knowledge
  – give evaluators information on the scenario
2) Evaluation
  – individuals evaluate and then aggregate results
3) Severity rating
  – determine how severe each problem is (priority)
4) Debriefing
  – discuss the outcome with design team
6 How to Perform Evaluation
• At least two passes for each evaluator
  – first to get feel for flow and scope of system
  – second to focus on specific elements
• If system is walk-up-and-use or evaluators are domain experts, then no assistance needed
  – otherwise might supply evaluators with scenarios
• Each evaluator produces list of problems
  – explain why with reference to heuristic or other info.
  – be specific and list each problem separately
7 Examples
• Can’t copy info from one window to another
  – violates “Recognition over recall” (H2-6)
  – fix: allow copying
• Typography uses mix of upper/lower case formats and fonts
  – violates “Consistency and standards” (H2-4)
  – slows users down
  – probably wouldn’t be found by user testing
  – fix: pick a single format for entire interface
8 Aggregate the Results
• Take all the lists and aggregate the results into a single list of violations
• Eliminate redundancies and make clarifications
• You will end up with entries in the following format:
  Problem # [Heuristic Violated] Brief description of the problem found
9 An Example of Aggregated Results
Aggregated List of Violations
1. [H2-4 Consistency and Standards] The interface used the string “Save” on the first screen for saving the user’s file, but used the string “Write file” on the second screen. Users may be confused by this different terminology for the same function.
2. [H2-5 Error Prevention]...
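The aggregation step can be sketched in a few lines of Python. This is only an illustration, not part of the original method: the `aggregate` and `format_list` helpers, the evaluator names, and the hand-assigned problem keys (used to mark two findings as the same problem) are all assumptions. In practice, deciding which findings are redundant is a judgment call made by a person.

```python
def aggregate(reports):
    """Merge per-evaluator findings into one list, collapsing duplicates.

    `reports` maps an evaluator name to a list of
    (problem_key, heuristic, description) tuples. The problem_key is a
    hypothetical label assigned by hand to findings that describe the
    same problem.
    """
    merged = {}
    for evaluator, findings in reports.items():
        for key, heuristic, description in findings:
            if key not in merged:
                # Keep the first description seen for this problem.
                merged[key] = {
                    "heuristic": heuristic,
                    "description": description,
                    "found_by": [],
                }
            merged[key]["found_by"].append(evaluator)
    return list(merged.values())


def format_list(violations):
    """Render entries as: Problem # [Heuristic Violated] description."""
    return [
        f"{i}. [{v['heuristic']}] {v['description']}"
        for i, v in enumerate(violations, start=1)
    ]


# Hypothetical findings from two evaluators.
reports = {
    "evaluator_1": [
        ("save-vs-write", "H2-4 Consistency and Standards",
         "'Save' on the first screen but 'Write file' on the second."),
    ],
    "evaluator_2": [
        ("save-vs-write", "H2-4 Consistency and Standards",
         "Two different labels for the same save function."),
        ("mixed-fonts", "H2-4 Consistency and Standards",
         "Typography mixes upper/lower case formats and fonts."),
    ],
}

aggregated = aggregate(reports)
for line in format_list(aggregated):
    print(line)
```

Recording who found each problem (`found_by`) is optional for the aggregated list itself, but it is what later makes per-evaluator statistics possible.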
10 Severity Ratings
• Used to allocate resources to fix the most serious problems
• Provide an estimate of the need for more usability efforts
• Combination of
  – frequency, impact, persistence
• Should be calculated after all evaluations are in
• Should be done independently by all judges
11 Severity Ratings
0 - don’t agree that this is a usability problem
1 - cosmetic problem only
2 - minor usability problem; fixing this should be given low priority
3 - major usability problem; important to fix
4 - usability catastrophe; imperative to fix
12 Example of Severity Ratings
Evaluator # 1
1. [H2-4 Consistency and Standards] [Severity 3] The interface used the string “Save” on the first screen for saving the user’s file, but used the string “Write file” on the second screen. Users may be confused by this different terminology for the same function.
2. [H2-5 Error Prevention] [Severity 4]...
Format: Problem # [Heuristic violated] [Severity rating] Problem description
13 Summary Report
1. [H2-4 Consistency and Standards] [Severity 2.7] The interface used the string “Save” on the first screen for saving the user’s file, but used the string “Write file” on the second screen. Users may be confused by this different terminology for the same function.
2. [H2-5 Error Prevention] [Severity 3.3]...
Format: Problem # [Heuristic violated] [Average severity] Problem description
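The average severity in the summary report is just the mean of the independent ratings for each problem. The sketch below assumes three evaluators. Evaluator 1's ratings (3 and 4) come from the earlier example, while the other two evaluators' ratings are hypothetical values chosen so that the averages match the 2.7 and 3.3 shown in the report.

```python
from statistics import mean

# Problem number -> one independent severity rating (0-4) per evaluator.
# The first rating in each list is evaluator 1's; the other two are
# hypothetical ratings invented for this illustration.
ratings = {
    1: [3, 3, 2],
    2: [4, 3, 3],
}


def average_severity(ratings):
    """Return the mean severity per problem, rounded to one decimal."""
    return {
        problem: round(mean(scores), 1)
        for problem, scores in ratings.items()
    }


summary = average_severity(ratings)
print(summary)
```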
14 Debriefing
• Conduct with evaluators, observers, and development team members
• Discuss general characteristics of UI
• Suggest potential improvements to address major usability problems
• Add ratings on how hard things are to fix
  – e.g., technological feasibility, time issues, etc.
• Make it a brainstorming session
  – little criticism until end of session
15 Fix Ratings
• Together, the team should also identify a fix rating for each usability problem in the summary report
• How much time, resources, and effort would it take to fix each usability problem?
  – programmers and techies are crucial here
• Fix the important ones (see severity ratings)
• Fix the easy ones (see fix ratings)
• Make a decision about the rest
16 Fix Ratings
0 - Very easy to fix; only takes a few minutes
1 - Relatively simple to fix; takes a few hours
2 - Difficult to fix; takes a few days or more
3 - Impossible to fix
17 Final Adjustment
Final Report for the Heuristic Evaluation
1. [H2-4 Consistency and Standards] [Severity 2.7] [Fix 1] The interface used the string “Save” on the first screen for saving the user’s file, but used the string “Write file” on the second screen. Users may be confused by this different terminology for the same function.
2. [H2-5 Error Prevention] [Severity 3.3] [Fix 0] …
Format: Problem # [Heuristic violated] [Average severity rating] [Fix rating] Problem description
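One way to act on the final report is to sort problems into "fix the important ones", "fix the easy ones", and "decide about the rest". The sketch below is an assumption-laden illustration: the cutoffs (`severe_at=3.0`, `easy_at=1`) and problem 3 are hypothetical, and the slides do not prescribe any particular thresholds.

```python
def triage(problems, severe_at=3.0, easy_at=1):
    """Split problems into three buckets.

    severe_at and easy_at are hypothetical cutoffs: a problem counts as
    important when its average severity is at least severe_at, and as
    easy when its fix rating is at most easy_at.
    """
    buckets = {"important": [], "easy": [], "decide": []}
    for p in problems:
        if p["severity"] >= severe_at:
            buckets["important"].append(p["id"])
        elif p["fix"] <= easy_at:
            buckets["easy"].append(p["id"])
        else:
            buckets["decide"].append(p["id"])
    return buckets


# Problems 1 and 2 use the ratings from the final report example;
# problem 3 is invented to show the "decide" bucket.
final_report = [
    {"id": 1, "severity": 2.7, "fix": 1},
    {"id": 2, "severity": 3.3, "fix": 0},
    {"id": 3, "severity": 1.5, "fix": 2},
]

buckets = triage(final_report)
print(buckets)
```

A problem rated "impossible to fix" (fix rating 3) would still land in the important bucket if it is severe enough, which matches the idea that the team then has to make an explicit decision about it.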
18 Independent Evaluations --> Aggregated List of Violations --> Independent Severity Ratings --> Summary Report with Avg Severity Ratings (SR) --> Final HE Report with SR and Fix Ratings
19 Some Summary Statistics
• Number of violations for the entire interface
• For each heuristic, list the number of violations
• For each evaluator, list the % of violations found
• For each evaluator and severity rating, give the % of total violations of that rating found by that evaluator
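These statistics are straightforward to compute once each problem records its heuristic and severity, and each evaluator's findings are known. The data below is entirely hypothetical (four problems, three evaluators named A, B, and C, and integer severities for simplicity). It is a sketch of the bookkeeping, not a required tool.

```python
from collections import Counter

# Hypothetical aggregated problems: id -> heuristic and severity rating.
problems = {
    1: {"heuristic": "H2-4", "severity": 3},
    2: {"heuristic": "H2-5", "severity": 4},
    3: {"heuristic": "H2-4", "severity": 1},
    4: {"heuristic": "H2-1", "severity": 3},
}

# Hypothetical record of which problems each evaluator found.
found = {"A": {1, 2}, "B": {1, 3, 4}, "C": {2}}

# Number of violations for the entire interface.
total = len(problems)

# Number of violations per heuristic.
per_heuristic = Counter(p["heuristic"] for p in problems.values())

# Percentage of all violations found by each evaluator.
pct_found = {e: 100 * len(ids) / total for e, ids in found.items()}

# For each evaluator and severity rating: the percentage of all
# violations with that rating that the evaluator found.
by_severity = Counter(p["severity"] for p in problems.values())
pct_by_severity = {
    e: {
        s: 100 * sum(1 for i in ids if problems[i]["severity"] == s) / n
        for s, n in by_severity.items()
    }
    for e, ids in found.items()
}

print(total, dict(per_heuristic), pct_found, pct_by_severity)
```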
20 Summary
• Expert reviews are discount usability engineering methods
• Heuristic evaluation is very popular
  – have evaluators go through the UI at least twice
  – ask them to see if it complies with heuristics
    » note where it doesn’t and say why
  – combine the findings from 3 to 5 evaluators
  – have evaluators independently rate severity
  – discuss problems with design team
  – alternate with user testing
21 TRAVELweather Example