Evaluation
Howell Istance

Why Evaluate?
- testing whether criteria defining success have been met
- discovering user problems
- testing whether a usability-related specification has been met
- comparing designs

Overview of approaches to evaluation
- observation and monitoring usage
- structured expert reviewing
- usability engineering
- contextual enquiry
- experimental techniques

Things to test
- Correct for the target audience's ability?
- Meets the design concept?
- Human-computer dialogue and graphics design principles adhered to?
- Material / information correct?
- Installs and runs correctly on target machines? Links work, etc.?
- How easy is it for the target audience to use?
- Does it meet the client's expectations?
- User feedback positive?

Approaches to Evaluation in Human-Computer Interaction
- User-centred evaluation of software in general usually focuses on how well users can complete tasks
- The idea of a 'task' is often less well-defined in multimedia applications

Context for evaluation
- free use - users asked to use the application without specific instructions
  - difficult to ensure users visit all parts of the application
  - may be unclear to the user what they should do
  - best suited to small information presentations
- structured use - users guided in their use of the application
  - ensures that critical parts of the application are tested in limited time

Observation and monitoring usage
- direct and indirect observation
- verbal protocols
- user opinions
- software logging

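Software logging can be as simple as timestamping each user action and summarising the log afterwards. The following is a minimal sketch, not a reference to any particular toolkit: the class `InteractionLog`, its methods and the action names used with it are all hypothetical.

```python
import time
from collections import Counter


class InteractionLog:
    """Hypothetical in-memory log of timestamped user actions."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so the log can be tested
        self.events = []      # list of (timestamp, action) tuples

    def record(self, action):
        """Append the action together with the current clock reading."""
        self.events.append((self._clock(), action))

    def action_counts(self):
        """How often each action was used."""
        return Counter(action for _, action in self.events)

    def elapsed(self, start_action, end_action):
        """Seconds from the first start_action to the first end_action at or after it."""
        start = next(t for t, a in self.events if a == start_action)
        end = next(t for t, a in self.events if a == end_action and t >= start)
        return end - start
```

A session might call `log.record("task_start")`, log each menu selection, then call `log.record("task_end")`; `log.elapsed("task_start", "task_end")` then gives a task timing without any video analysis.
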
Structured Direct Observation
- give subjects a series of standard tasks to complete using a prototype
- observe subjects completing tasks under standardised conditions
- data collection aimed at ensuring that qualitative descriptions of problems during task completion are captured
- what problems are likely in data recording?

Standard tasks in structured direct observation
- structure tasks into incremental difficulty (easy ones first)
- have a clear policy on subjects becoming stuck and on providing help
- have a reason for including each task (avoid unnecessary duplication)
- ensure (all) functional areas of interface usage are covered
- ensure tasks of sufficient complexity are included

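The coverage point above can be checked mechanically once each task is annotated with the functional areas it exercises. A small sketch under that assumption; the function name and the area and task names are illustrative only.

```python
def uncovered_areas(functional_areas, tasks):
    """Return the functional areas that no task exercises.

    tasks maps a task name to the set of areas it touches.
    """
    covered = set().union(*tasks.values())
    return sorted(set(functional_areas) - covered)


# Illustrative data only: these area and task names are made up.
areas = ["navigation", "search", "printing", "help"]
tasks = {
    "find a named page": {"navigation", "search"},
    "print the current page": {"printing"},
}
print(uncovered_areas(areas, tasks))
```

Any area reported here needs a task added before the session is run.
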
Indirect observation - video
- enables post-session debriefing 'talk-through' (post-event protocols)
- enables quantitative data to be extracted - e.g. part task timings
- serves as a diary and visual record of problems
- usually very time consuming to analyse
- usability laboratories

Verbal protocols
- means of enhancing direct observations
- user articulates what they are thinking during task completion (think-aloud protocols)
- but…
  - doing this can alter normal behaviour
  - subject likely to stop when undertaking complex cognitive activities
  - user may rationalise behaviour in post-event protocols
- get subjects working in pairs - co-discovery can overcome some of these problems

Collecting users' opinions
- interview and questionnaire
- suited to both qualitative and quantitative data collection
- interviews
  - structured (fixed sequence of questions)
  - semi-structured (allows digressions, but all questions covered)
  - flexible (exploration of topic governed by users' views)
- what are the advantages and disadvantages of these?

Questionnaires
- contain closed questions (attitude scales) and open questions
- pre- and post-questionnaires obtain ratings on an issue before and after a design change
- can be used to standardise attitude measurement of single subjects following direct observation
- can be used to survey large user groups

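A pre-/post-questionnaire comparison reduces to a per-subject difference on the same rating scale. The sketch below assumes ratings are already coded as numbers; the function name and the subject ids in the usage note are hypothetical.

```python
from statistics import mean


def rating_change(pre, post):
    """Per-subject change (post minus pre) and the mean change.

    pre and post map a subject id to a rating on the same scale.
    Only subjects who answered both questionnaires are compared.
    """
    both = sorted(set(pre) & set(post))
    changes = {s: post[s] - pre[s] for s in both}
    return changes, mean(list(changes.values()))
```

For example, with `pre = {"s1": 2, "s2": 3, "s3": 4}` and `post = {"s1": 4, "s2": 3, "s3": 5, "s4": 5}`, subject `s4` is ignored because there is no pre-rating. Note that `mean` raises an error if no subject answered both questionnaires.
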
Types of rating scales
A simple checklist

Can you use the following edit commands?

             yes    no    don't know
duplicate    [ ]    [ ]   [ ]
paste        [ ]    [ ]   [ ]

Multipoint checklist

Rate the usefulness of the duplicate command on the following scale
(scale runs from 'very useful' at one end to 'of no use' at the other)

Likert Scale
- statement of opinion to which the subject expresses their level of agreement

Computers can simplify complex problems

very much agree | agree | slightly agree | neutral | slightly disagree | disagree | strongly disagree

Caution!

The help facility in system A is much better than the help facility in system B

very much agree | agree | slightly agree | neutral | slightly disagree | disagree | strongly disagree

- what does 'strongly disagree' mean?

Semantic differential Scale
- uses a series of bi-polar adjectives and obtains ratings with respect to each

Rate the Beauxarts drawing package on the following dimensions

         extremely  quite  slightly  neutral  slightly  quite  extremely
easy        [ ]      [ ]     [ ]       [ ]      [ ]      [ ]      [ ]     difficult
clear       [ ]      [ ]     [ ]       [ ]      [ ]      [ ]      [ ]     confusing
fun         [ ]      [ ]     [ ]       [ ]      [ ]      [ ]      [ ]     dreary

Rank Order

Place the following commands in order of usefulness (rank the most useful as 1, the least useful as 4)

[ ] paste
[ ] duplicate
[ ] group
[ ] clear

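Rank-order responses from several subjects can be combined by averaging the rank each command received. This is one simple aggregation among several possible ones; the function name is hypothetical and the data in the usage note is made up.

```python
from statistics import mean


def mean_ranks(rankings):
    """Average rank per item across subjects, best (lowest) first.

    rankings is a list of dicts, one per subject, mapping item -> rank
    (1 = most useful). Every subject is assumed to rank every item.
    """
    items = rankings[0].keys()
    averages = {item: mean(r[item] for r in rankings) for item in items}
    return sorted(averages.items(), key=lambda pair: pair[1])
```

Calling it with three subjects' rankings of paste, duplicate, group and clear returns the four commands ordered by mean rank, which also shows where subjects disagree (mean ranks that are close together).
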
Dos and Don'ts with Questionnaire evaluation
- have a clear idea of what specifically you want information about and ensure there are questions that directly address these issues
- don't risk subjects being demotivated
  - not interested in the questionnaire
  - questionnaire is too long
- don't be lazy
  - focus questions on the specific interface
  - avoid 'not applicable' responses
- provide specific task reference for questions

Possible procedures (free usage)...
- recruit participants representative of the intended user group
- prepare an introduction which you read
- ask the person to use/view the presentation until they have finished or up to a time limit

Possible procedures...
- observe
  - sequence of pages visited
  - behaviour - browse or read
  - pages which appear to interest or not interest the person
- note problems the person appears to have
- give the person a brief questionnaire
- discuss what the person liked/didn't like about the presentation

Structured Expert Reviews
- uses 'experts' in HCI and the task domain to review the design rather than subject-based testing
- methods vary according to how the review is structured
- two popular methods
  - heuristic evaluation
  - cognitive walkthrough

Heuristics in Heuristic Evaluation
- use simple and natural language
- speak the users' language
- minimise users' memory load
- be consistent
- provide feedback
- provide clearly marked exits
- provide shortcuts
- provide good error messages
- prevent errors

S: Enter 1 for account information, 3 for transfers between accounts..
U: 3# (interrupts)
S: Enter account to transfer from
U: #
S: Enter account to transfer to
U: # (default)
S: Enter amount in cents
U: #
S: From account number to account number primary account, transfer of 1000 dollars is to be made. Press 1 to confirm, 0 to cancel
U: 1#
S: You do not have access to use this function

Problems found in the dialogue above, with the heuristic each relates to
- number is read before the menu item description (minimise users' memory load)
- avoid the gap in menu numbers between 1 and 3 (simple and natural language)
- error message appears too late (simple and natural dialogue)
- error message is imprecise (precise and constructive error messages)
- do not require dollar amounts to be entered in cents (speak the users' language)
- use an alternative, proper English phrase (speak the users' language)

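When several evaluators review a design, it helps to record each problem alongside the heuristic it relates to and then group by heuristic. A minimal sketch using the six notes attached to the telephone banking dialogue; the heuristic labels are copied from those notes, and the function name is hypothetical.

```python
from collections import defaultdict

# (problem note, heuristic) pairs taken from the dialogue annotations.
findings = [
    ("number read before menu item description", "minimise users' memory load"),
    ("gap in menu numbers between 1 and 3", "simple and natural language"),
    ("error message appears too late", "simple and natural dialogue"),
    ("error message is imprecise", "precise and constructive error messages"),
    ("dollar amounts entered in cents", "speak the users' language"),
    ("phrase is not proper English", "speak the users' language"),
]


def group_by_heuristic(findings):
    """Map each heuristic to the list of problems recorded against it."""
    grouped = defaultdict(list)
    for problem, heuristic in findings:
        grouped[heuristic].append(problem)
    return dict(grouped)
```

Grouping this way makes it visible which principles a design breaks most often, here 'speak the users' language' with two problems.
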
Cognitive Walkthrough
- developed on the basis of Cognitive Theory of Initial Learning
- intended for systems where the user 'guesses their way' through an interaction sequence
- the task is decomposed into paths to successful task completion consisting of individual actions
- method provides
  - set of guidelines to support development
  - procedures and checklist for evaluation

Cognitive Walkthrough Checklist
- problems forming correct goals
  - failure to add goals
  - failure to drop goals
  - addition of spurious goals
  - premature loss of goals
- problems identifying action
  - correct action does not match goal
  - incorrect actions match goals
- problems performing action
  - physical difficulties
  - timeouts

Example of correct goal structure
- Program video for timed recording
  - Press 'timed recording' button
  - Set stream
    - type stream number
    - press 'timed recording' button
  - Set start time
    - type start time (24 hour clock)
    - press 'timed recording' button
  - …..
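A goal structure like this is a tree whose leaves are user actions, so a walkthrough can step through the leaves in order. The sketch below is one possible representation, assuming a hypothetical `Goal` class; the goals elided on the slide ("…..") are deliberately left out rather than guessed.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    name: str
    steps: list = field(default_factory=list)  # sub-goals (Goal) or action strings


def actions(goal):
    """Flatten a goal tree into the ordered list of user actions."""
    out = []
    for step in goal.steps:
        if isinstance(step, Goal):
            out.extend(actions(step))
        else:
            out.append(step)
    return out


PRESS = "press 'timed recording' button"
program_video = Goal("Program video for timed recording", [
    PRESS,
    Goal("Set stream", ["type stream number", PRESS]),
    Goal("Set start time", ["type start time (24 hour clock)", PRESS]),
    # remaining goals are elided in the source and not filled in here
])
```

Each action returned by `actions(program_video)` can then be checked against the walkthrough checklist: does the user form the right goal, identify the action, and manage to perform it?
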