Evaluating VR Systems
Scenario
You determine that while looking around virtual worlds is natural and well supported in VR, moving about them is a difficult problem.
You address this problem by developing a new locomotion technique for virtual worlds.
What now?
– Prove that your design is better than the alternatives
– What’s “better”, and how do you “prove” it?
Better how?
Usability
– Intuitiveness, flexibility, functionality
Presence/Copresence
– Pi, Psi
Performance
– Accuracy, precision, efficiency
Effectiveness
– Training, therapy, distraction
Proof
Compare the system with alternative(s) by conducting human-subjects experiments (user studies)
How do we go about this “scientifically”?
– Design an experiment to identify, and potentially magnify, the differences you expect to exist between systems
– Is the experiment a good (valid) one?
Classic VR Experiment
Recruit participants from the local population (population sample)
Randomly divide that sample into two groups: control and intervention
The control group gets normal VR, and the intervention group gets the new VR
Compare observations of the two groups to determine whether they are significantly (non-randomly) different
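The random split into control and intervention groups can be sketched as follows. This is a minimal illustration, not a prescribed procedure: the function name `assign_groups`, the participant IDs, and the fixed seed are all hypothetical.

```python
import random


def assign_groups(participant_ids, seed=None):
    """Randomly split participants into control and intervention groups."""
    rng = random.Random(seed)        # a seed makes the assignment reproducible
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)            # random order removes selection by the experimenter
    half = len(shuffled) // 2
    return {"control": shuffled[:half], "intervention": shuffled[half:]}


# Hypothetical participant IDs P01..P20, for illustration only.
participants = [f"P{i:02d}" for i in range(1, 21)]
groups = assign_groups(participants, seed=42)
print(groups["control"])
print(groups["intervention"])
```

Shuffling before splitting means that neither the experimenter nor the participants choose who ends up in which group, which is one way to guard against selection bias.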
Experimental Design Validity
Internal Validity
– Does your design properly address possible bias factors that may lead to incorrect interpretation of observations? E.g., selection bias
– Can you definitively establish a cause-and-effect relationship? Correlation is not causation
External Validity
– Generalization of results to other settings
Observations (Measures)
Constructs must be operationalized as measures (metrics)
Data produced by the metric should be proportional to the hypothetical value of the construct
Construct Validity
– “Are you measuring what you think you’re measuring?”
– Very difficult to establish (think Pi, Psi)
– Requires the community to believe you
– Comes after establishing reliability and predictive validity
How do you compare the data?
Assume (hypothesize) that all of the data come from the same population
– Null Hypothesis: there is no real difference between the groups
– Alternative Hypothesis (your real belief): the groups differ
Find the probability of seeing a difference at least as large as the one you observed if the groups were just random samples from that one population
If that probability is less than some predetermined value (the significance level, commonly 0.05), REJECT the null hypothesis
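One direct way to act out this logic is a permutation test: pool the data as the null hypothesis says you may, re-split it every possible way, and count how often a split looks at least as extreme as the one observed. The sketch below is only an illustration of that idea. The function name and the two tiny samples are made up for the example, and the difference in means is assumed as the comparison statistic.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(group_a) + list(group_b)   # null hypothesis: one population
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of relabelling n_a of the pooled values as "group A".
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance so floating-point ties count as "at least as extreme".
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


ALPHA = 0.05                                  # predetermined significance level
# Hypothetical measurements, chosen only to keep the enumeration small.
p = permutation_p_value([5, 6, 7, 8], [1, 2, 3, 4])
reject_null = p < ALPHA
print(f"p = {p:.4f}, reject null hypothesis: {reject_null}")
```

With these made-up samples only 2 of the 70 possible splits are as extreme as the observed one, so the null hypothesis is rejected at the 0.05 level. Exhaustive enumeration is practical only for small samples. For larger ones a random subset of relabellings is typically used instead.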
Statistical Tests
Most common in VR, by far, is Student’s t-test (built into most spreadsheet software)
– Can be used to test whether a sample is consistent with a population that has a specific mean (one-sample test)
– Can be used to test whether two samples come from populations with the same mean (two-sample test)
Compute the t-value for your case
Determine the probability of seeing a t-value at least this extreme in the t distribution
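For reference, the t-value in the two uses listed above is usually computed as shown below. The slides do not say which two-sample variant is meant, so the pooled, equal-variance form of Student's test is assumed here. The symbols are introduced for this sketch: sample means \bar{x}, sample standard deviations s, sample sizes n, and the hypothesized mean \mu_0.

```latex
% One-sample test: is the sample mean consistent with a hypothesized mean \mu_0?
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}, \qquad \text{df} = n - 1

% Two-sample test (pooled, equal-variance form): do two samples share a mean?
t = \frac{\bar{x}_1 - \bar{x}_2}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}, \qquad
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}, \qquad
\text{df} = n_1 + n_2 - 2
```

The degrees of freedom (df) select which t distribution the computed value is compared against.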
Example
Suppose you have metric M, which yields the following values for group 1 and group 2
– Group 1: [1, 7, 5, 6, 4, 1, 8, 5, 6, 3, 7, 5, 6, 4, 6]
– Group 2: [2, 1, 6, 1, 3, 2, 1, 7, 3, 1, 5, 3, 8, 4, 1]
Is there a significant difference between the two groups?
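One way to answer the question is to run a two-sample t-test on the two lists. The sketch below assumes the pooled, equal-variance form of Student's t-test and a two-tailed test, since the slides do not specify either. It uses only the Python standard library, so the p-value is obtained by numerically integrating the t density rather than by calling a statistics package. The helper names are made up for this example.

```python
import math
from statistics import mean, variance

group1 = [1, 7, 5, 6, 4, 1, 8, 5, 6, 3, 7, 5, 6, 4, 6]
group2 = [2, 1, 6, 1, 3, 2, 1, 7, 3, 1, 5, 3, 8, 4, 1]


def pooled_t(a, b):
    """Return (t, df) for the equal-variance two-sample Student's t-test."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their df.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """P(|T| >= |t|), integrating the density from 0 to |t| (Simpson's rule)."""
    upper = abs(t)
    h = upper / steps                     # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_zero_to_t = total * h / 3
    return 1 - 2 * area_zero_to_t         # the density is symmetric about 0


t_value, df = pooled_t(group1, group2)
p_value = two_tailed_p(t_value, df)
print(f"t = {t_value:.3f}, df = {df}, two-tailed p = {p_value:.3f}")
```

Under these assumptions the t-value is about 2.16 with 28 degrees of freedom, and the two-tailed probability falls below 0.05, so the difference between the groups would be called significant at that level.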