1
Evaluation: Analyzing results
ECE 695 Alexander J. Quinn 4/20/2016
2
Today:
- Project questions (references, overall structure)
- Finish study design exercise from Monday
- Discuss questions from today’s reading
- Analyzing data
3
Q: “Factors for usability are Learnability, Efficiency, Memorability, Errors, Satisfaction. In this chapter, how to measure Memorability and Satisfaction was not discussed.” --CM
A: The types of performance metrics in this article are not directly connected to Jakob Nielsen’s metrics. Jakob Nielsen. Usability Metrics: Tracking Interface Improvements. IEEE Software 13, 6 (November 1996).
4
- Task success is perhaps the most widely used performance metric. It measures how effectively users are able to complete a given set of tasks. Two different types of task success will be reviewed: binary success and levels of success.
- Time-on-task is a common performance metric that measures how much time is required to complete a task.
- Errors reflect the mistakes made during a task. Errors can be useful in pointing out particularly confusing or misleading parts of an interface.
- Efficiency can be assessed by examining the amount of effort a user expends to complete a task, such as the number of clicks on a website or the number of button presses on a cell phone.
- Learnability is a way to measure how performance changes over time.
Credit: Measuring the User Experience; Jakob Nielsen, Usability Metrics: Tracking Interface Improvements, 1996.
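The five metrics above can all be computed from the same raw trial logs. A minimal sketch, assuming a hypothetical record format (participant, session, success, seconds, errors, clicks) with made-up numbers, not data from any real study:

```python
# Hypothetical trial logs: (participant, session, success, seconds, errors, clicks)
trials = [
    ("p1", 1, True,  42.0, 1, 12),
    ("p2", 1, False, 60.0, 3, 20),
    ("p1", 2, True,  30.0, 0,  9),
    ("p2", 2, True,  35.0, 1, 11),
]

# Binary task success: fraction of trials completed successfully
success_rate = sum(t[2] for t in trials) / len(trials)
# Time-on-task: mean seconds per trial
mean_time = sum(t[3] for t in trials) / len(trials)
# Errors: mean mistakes per trial
mean_errors = sum(t[4] for t in trials) / len(trials)
# Efficiency: mean effort (here, clicks) per trial
mean_clicks = sum(t[5] for t in trials) / len(trials)

# Learnability: how time-on-task changes across sessions
by_session = {}
for _, session, _, seconds, _, _ in trials:
    by_session.setdefault(session, []).append(seconds)
learning_curve = {s: sum(v) / len(v) for s, v in sorted(by_session.items())}

print(success_rate, mean_time, learning_curve)
```

With these numbers, mean time-on-task drops from session 1 to session 2, which is the kind of change a learnability analysis looks for.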
5
Q: “How to combine these five performance metrics?” --JZ A: Crisp user goals will help you form clear questions to guide your study. From there, the types of performance metrics described in this chapter are just tools for answering your questions.
6
Q: “Can you easily test for levels of success in an online test?” --SC A: Non-answer: It depends on your study design.
7
Q: “Is there a guideline for how long a certain type of task should take?” --SC
A: Non-answer: Use predictive models such as MHP (Model Human Processor), KLM (Keystroke-Level Model), etc. Real answer: Ideally, you are always comparing to a meaningful baseline.
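A KLM estimate is just a sum of per-operator times. A minimal sketch, assuming the commonly cited Card, Moran & Newell average operator times and a hypothetical task breakdown (neither comes from the reading):

```python
# Commonly cited KLM operator times (seconds) -- treat as assumptions
KLM_SECONDS = {
    "K": 0.20,   # keystroke (average skilled typist)
    "P": 1.10,   # point with mouse to a target
    "B": 0.10,   # mouse button press or release
    "H": 0.40,   # move hand between keyboard and mouse
    "M": 1.35,   # mental preparation
}

def klm_estimate(operators):
    """Sum operator times for a sequence like 'MPBBHMKKKK'."""
    return sum(KLM_SECONDS[op] for op in operators)

# Hypothetical task: think, point at a field, click (press + release),
# move hand to keyboard, think, type a 4-character code.
print(round(klm_estimate("MPBBHMKKKK"), 2))  # -> 5.2
```

Note this only gives an expert, error-free prediction; it is a rough baseline, not a replacement for comparing conditions against each other.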
8
Q: “This chapter said that it is not always easy to collect error data. It would be better if there were some examples showing how to collect error data in a generic task-performance evaluation.” --CM
A: This will be task-dependent. Measurable behaviors such as backtracking or stray clicks may be a good proxy.
9
Analysis
10
Key questions for any empirical evaluation
- What are the independent variables (IVs)? What type? How many?
- What are the dependent variables (DVs)?
- Do you need to control for other factors?
11
Credit: Anne Marenco, http://www.csun
12
The ANOVA F-test
The ANOVA F-statistic compares variation due to specific sources (levels of the factor) with variation among individuals who should be similar (individuals in the same sample). When the difference in means is large relative to overall variability, F tends to be large; when the difference in means is small relative to overall variability, F tends to be small. Larger F-values typically yield more significant results. How large depends on the degrees of freedom (I − 1 and N − I).
Credit: W. H. Freeman and Company, 2009.
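The F-statistic described above can be computed directly from its definition. A minimal sketch for the one-way case, assuming I groups of measurements (e.g., time-on-task under I interface conditions); the data below are hypothetical:

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    I = len(groups)                         # number of factor levels
    N = sum(len(g) for g in groups)         # total observations
    grand_mean = sum(x for g in groups for x in g) / N

    # Between-groups variation: distance of each group mean from the grand mean
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups)
    # Within-groups variation: spread of individuals around their own group mean
    ss_within = sum(sum((x - sum(g) / len(g)) ** 2 for x in g)
                    for g in groups)

    df_between = I - 1                      # I - 1
    df_within = N - I                       # N - I
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Well-separated group means relative to tight within-group spread -> large F
f_large, dfb, dfw = one_way_anova_f([[10, 11, 9], [20, 21, 19], [30, 29, 31]])
# Nearly identical group means relative to large within-group spread -> small F
f_small, _, _ = one_way_anova_f([[10, 20, 30], [11, 21, 29], [9, 19, 31]])
print(f_large, f_small, dfb, dfw)  # degrees of freedom: I-1 = 2, N-I = 6
```

The two calls use the same nine observations, regrouped; only the first grouping puts the variation between the levels of the factor, so only it yields a large F.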