Slide 1: Evaluation and metrics: Measuring the effectiveness of virtual environments
Doug Bowman, Virginia Tech (edited by C. Song)

Slide 2: 11.2.2 Types of evaluation
- Cognitive walkthrough
- Heuristic evaluation
- Formative evaluation
  - Observational user studies
  - Questionnaires, interviews
- Summative evaluation
  - Task-based usability evaluation
  - Formal experimentation
- Sequential evaluation
- Testbed evaluation

Slide 3: 11.5 Classifying evaluation techniques
Evaluation techniques fall into a 2x2 classification:
- Generic: qualitative or quantitative
- Application-specific: qualitative or quantitative

Slide 4: 11.4 How VE evaluation is different
- Physical issues
  - The user cannot see the physical world while wearing an HMD
  - Think-aloud protocols are incompatible with speech input
- Evaluator issues
  - The evaluator can break the user's sense of presence
  - Multiple evaluators are usually needed

Slide 5: 11.4 How VE evaluation is different (cont.)
- User issues
  - Very few expert users exist
  - Evaluations must include rest breaks to avoid possible sickness
- Evaluation type issues
  - Lack of heuristics/guidelines
  - Choosing independent variables is difficult

Slide 6: 11.4 How VE evaluation is different (cont.)
- Miscellaneous issues
  - Evaluations must focus on lower-level entities (interaction techniques, ITs) because of the lack of standards
  - Results are difficult to generalize because of differences between VE systems

Slide 7: 11.6.1 Testbed evaluation framework
- Main independent variables: interaction techniques (ITs)
- Other independent variables to consider:
  - Task (e.g., target known vs. target unknown)
  - Environment (e.g., number of obstacles)
  - System (e.g., use of collision detection)
  - User (e.g., VE experience)
- Performance metrics (dependent variables): speed, accuracy, user comfort, spatial awareness, ...
- Provides a generic evaluation context (see the factorial-design sketch below)
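
To make the framework concrete, the sketch below enumerates every cell of a small factorial design crossing ITs with the other independent variables. All factor names and levels are invented examples, not values prescribed by the slides.

```python
from itertools import product

# Hypothetical factors for a travel-technique testbed; each IT is crossed
# with the task and environment conditions.
factors = {
    "technique":   ["gaze-directed steering", "pointing", "teleport"],  # the ITs
    "task":        ["target known", "target unknown"],
    "environment": ["few obstacles", "many obstacles"],
}

# Every combination of levels is one experimental cell; participants run each
# cell while speed, accuracy, comfort, etc. are recorded as dependent variables.
cells = [dict(zip(factors, levels)) for levels in product(*factors.values())]
print(f"{len(cells)} cells in the full factorial design")
print(cells[0])
```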

Slide 8: Testbed evaluation

Slide 9: Taxonomy
- Establish a taxonomy of interaction techniques for the interaction task being evaluated (see the sketch below)
- Example task: changing an object's color
  - Three subtasks: selecting the object, choosing a color, applying the color
  - Two possible technique components (TCs) for choosing a color:
    - Changing the values of R, G, and B sliders
    - Touching a point within a 3D color space
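
One natural way to represent such a taxonomy in code is as a mapping from subtasks to candidate technique components, where a complete technique picks one component per subtask. The slide supplies only the color-choosing TCs; the selection and application components below are hypothetical placeholders.

```python
# Subtask -> candidate technique components (TCs). The color-choosing TCs come
# from the slide's example; the other components are invented placeholders.
taxonomy = {
    "select object": ["ray-casting", "virtual hand"],
    "choose color":  ["RGB sliders", "3D color-space touch"],
    "apply color":   ["button press", "gesture"],
}

def make_technique(choices):
    """Assemble a complete technique: exactly one known TC per subtask."""
    missing = set(taxonomy) - set(choices)
    if missing:
        raise ValueError(f"no component chosen for: {missing}")
    for subtask, tc in choices.items():
        if tc not in taxonomy.get(subtask, []):
            raise ValueError(f"{tc!r} is not a known TC for {subtask!r}")
    return choices

slider_technique = make_technique({
    "select object": "ray-casting",
    "choose color":  "RGB sliders",
    "apply color":   "button press",
})
```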

Slide 10: Outside factors
- A user's performance on an interaction task may depend on a variety of factors, in four categories:
  - Task: distance to be traveled, size of the object to be manipulated
  - Environment: number of obstacles, level of activity or motion
  - User: spatial awareness, physical attributes (arm length, etc.)
  - System: lighting model, mean frame rate, etc.

Slide 11: Performance metrics
- Quantitative measures of human performance: speed, accuracy
- More subjective performance values: ease of use, ease of learning, user comfort
- Metrics tied to the user's senses and body are user-centric performance measures

Slide 12: Testbed evaluation
- The final stage in the evaluation of interaction techniques for 3D interaction tasks
- Generic, generalizable, and reusable evaluation through the creation of testbeds
- Testbeds are environments and tasks that:
  - Involve all important aspects of a task
  - Evaluate each component of a technique
  - Consider outside influences on performance
  - Include multiple performance measures

Slide 13: Application and generalization of results
- Testbed evaluation produces models that characterize the usability of an interaction technique for the specified task
- Usability is expressed in terms of multiple performance metrics with respect to various levels of the outside factors, yielding a performance database (DB)
- More information is added to the DB each time a new technique is run through the testbed
- To choose interaction techniques for an application appropriately, one must understand the application's interaction requirements
- The performance results from testbed evaluation can then be used to recommend interaction techniques that meet those requirements (see the recommendation sketch below)
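
A minimal sketch of what querying such a performance DB might look like: testbed scores per technique and metric, plus an application profile that weights the metrics it cares about. The techniques, metric names, and all numbers are invented.

```python
# Hypothetical performance DB: (technique, metric) -> normalized testbed score.
perf_db = {
    ("ray-casting",  "speed"): 0.9, ("ray-casting",  "accuracy"): 0.6,
    ("virtual hand", "speed"): 0.7, ("virtual hand", "accuracy"): 0.9,
}

def recommend(requirements):
    """Rank techniques by the application's weighted metric requirements."""
    techniques = {t for t, _ in perf_db}
    score = lambda t: sum(w * perf_db[(t, m)] for m, w in requirements.items())
    return max(techniques, key=score)

# An accuracy-critical application weights accuracy over speed.
print(recommend({"speed": 0.3, "accuracy": 0.7}))  # -> "virtual hand"
```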

Slide 14: 11.6.2 Sequential evaluation
- Traditional usability engineering methods
- Iterative design/evaluation
- Relies on scenarios and guidelines
- Application-centric

Slide 15: 11.3 When is a VE effective?
- Users' goals are realized
- User tasks are done better, more easily, or faster
- Users are not frustrated
- Users are not uncomfortable

Slide 16: 11.3 How can we measure effectiveness?
- System performance
- Interface performance / user preference
- User (task) performance
- All three are interrelated

Slide 17: Effectiveness case studies
- Watson experiment: how system performance affects task performance
- Slater experiments: how presence is affected
- Design education: task effectiveness

Slide 18: 11.3.1 System performance metrics
- Average frame rate (fps)
- Average latency/lag (msec)
- Variability in frame rate/lag
- Network delay
- Distortion
(see the metric-computation sketch below)
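
As a sketch, the first three metrics can be derived from a log of per-frame timestamps. The timestamps below are invented, and true end-to-end latency would need external instrumentation rather than this frame-time proxy.

```python
import statistics

# Hypothetical per-frame render timestamps, in seconds.
frame_times = [0.000, 0.017, 0.034, 0.052, 0.090, 0.107]

# Inter-frame intervals drive all three metrics.
intervals = [b - a for a, b in zip(frame_times, frame_times[1:])]
avg_frame_ms   = statistics.mean(intervals) * 1000
variability_ms = statistics.stdev(intervals) * 1000
avg_fps        = 1000.0 / avg_frame_ms

# Note: lag is really motion-to-photon delay (tracker input to displayed
# image); frame time only gives a lower bound on it.
print(f"avg frame rate: {avg_fps:.1f} fps")
print(f"avg frame time: {avg_frame_ms:.1f} ms, variability: {variability_ms:.1f} ms")
```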

Slide 19: System performance
- Only important for its effects on user performance and preference:
  - Frame rate affects presence
  - Network delay affects collaboration
- Necessary, but not sufficient

Slide 20: Case studies: Watson
- How does system performance affect task performance?
- Vary average frame rate and variability in frame rate
- Measure performance on closed-loop and open-loop tasks
- See B. Watson et al., "Effects of variation in system responsiveness on user performance in virtual environments," Human Factors, 40(3), 403-414.

Slide 21: 11.3.3 User preference metrics
- Ease of use / ease of learning
- Presence
- User comfort
- Usually subjective (measured via questionnaires and interviews)

Slide 22: User preference in the interface
- UI goals: ease of use, ease of learning, affordances, unobtrusiveness, etc.
- Achieving these goals leads to usability
- Crucial for effective applications

Slide 23: Case studies: Slater
- Measures presence with questionnaires
- Assumes that presence is required for some applications
- Studies the effects of collision detection, physical walking, a virtual body, shadows, and movement
- See M. Slater et al., "Taking Steps: The influence of a walking metaphor on presence in virtual reality," ACM TOCHI, 2(3), 201-219.

Slide 24: User comfort
- Simulator sickness
- Aftereffects of VE exposure
- Arm/hand strain
- Eye strain

Slide 25: Measuring user comfort
- Rating scales and questionnaires (e.g., Kennedy's Simulator Sickness Questionnaire, SSQ; a scoring sketch follows)
- Objective measures (e.g., Stanney's work on measuring aftereffects)
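
For a sense of how SSQ-style scoring works: symptom items rated 0-3 are summed into three raw subscale scores, which standard weights convert into the reported values. The weights below follow Kennedy et al.'s published scoring as best I recall it, and the sample raw sums are invented; consult the original paper before using this with real data.

```python
# Sketch of Kennedy SSQ scoring. Inputs are raw subscale sums (each symptom
# rated 0-3, summed per subscale); weights are from Kennedy et al. (1993).
def ssq_scores(nausea_raw, oculomotor_raw, disorientation_raw):
    return {
        "nausea":         nausea_raw * 9.54,
        "oculomotor":     oculomotor_raw * 7.58,
        "disorientation": disorientation_raw * 13.92,
        "total": (nausea_raw + oculomotor_raw + disorientation_raw) * 3.74,
    }

# Invented raw sums for one participant after VE exposure.
print(ssq_scores(nausea_raw=3, oculomotor_raw=4, disorientation_raw=2))
```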

Slide 26: 11.3.2 Task performance metrics
- Speed / efficiency
- Accuracy
- Domain-specific metrics:
  - Education: learning
  - Training: spatial awareness
  - Design: expressiveness

Slide 27: Speed-accuracy tradeoff
- Subjects implicitly decide where on the speed-accuracy curve to operate
- Experimenters must explicitly look at particular points on the curve and manage the tradeoff (see the sketch below)
- (figure: tradeoff curve with speed and accuracy as its axes)
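
One common way to compare techniques at a single point on the curve is the inverse efficiency score (mean completion time divided by proportion correct). This is a general convention from HCI and psychophysics, not something the slides prescribe, and the trial data is invented.

```python
# Inverse efficiency score (IES) = mean time / proportion correct.
# Lower is better; it penalizes techniques that buy speed with errors.
trials = [
    {"time_s": 2.1, "correct": True},
    {"time_s": 1.8, "correct": True},
    {"time_s": 1.5, "correct": False},
    {"time_s": 2.4, "correct": True},
]

mean_time = sum(t["time_s"] for t in trials) / len(trials)
accuracy  = sum(t["correct"] for t in trials) / len(trials)
print(f"speed: {mean_time:.2f} s, accuracy: {accuracy:.0%}, IES: {mean_time / accuracy:.2f}")
```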

Slide 28: Case studies: learning
- Measure effectiveness by comparing learning against a control group
- Metric: a standard test
- Issue: time on task is not the same for all groups
- See D. Bowman et al., "The educational value of an information-rich virtual environment," Presence: Teleoperators and Virtual Environments, 8(3), June 1999, 317-331.

Slide 29: Aspects of performance
- System performance, interface performance, and task performance together determine effectiveness

Slide 30: 11.7 Guidelines for 3D UI evaluation
- Begin with informal evaluation
- Acknowledge and plan for the differences between traditional UI and 3D UI evaluation
- Choose an evaluation approach that meets your requirements
- Use a wide range of metrics, not just speed of task completion

Slide 31: Guidelines for formal experiments
- Design experiments with general applicability:
  - Generic tasks
  - Generic performance metrics
  - Easy mappings to applications
- Use pilot studies to determine which variables should be tested in the main experiment
- Look for interactions between variables; rarely will a single technique be the best in all situations (a sketch of an interaction test follows)
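
A sketch of testing for such an interaction with a two-way ANOVA, using statsmodels' formula API; the trial data, factor levels, and column names are all invented.

```python
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Invented completion times for a 2x2 (technique x environment) design.
df = pd.DataFrame({
    "technique":   ["pointing"] * 4 + ["teleport"] * 4,
    "environment": ["sparse", "sparse", "dense", "dense"] * 2,
    "time_s":      [2.0, 2.2, 4.5, 4.8, 2.5, 2.4, 2.9, 3.1],
})

# The C(technique):C(environment) row of the ANOVA table tests whether the
# best technique depends on the environment, i.e., an interaction.
model = ols("time_s ~ C(technique) * C(environment)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```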

Slide 32: Acknowledgments
- Deborah Hix
- Joseph Gabbard

