Data Analysis of EnchantedLearning.com vs. Invent.org
Eric Lewine
Introduction
This presentation analyzes the data from a usability test comparing two sites that provide information on inventors and inventions:
- EnchantedLearning.com ("Enchanted")
- Invent.org ("Invent")

Method Overview
- Participants were given 6 tasks (out of a possible 10) to perform on a single site.
- Data from 79 participants were analyzed; all participants were in the usability field.
- 42 used Enchanted; 37 used Invent (38 responded, but 1 was eliminated; see Appendix B for the rationale).
Executive Summary
Users were more effective and efficient at Enchanted:
- Better task success rate
- Better task efficiency
- Tasks were rated as easier
- Information was rated as easier to find
Users found Invent more visually appealing:
- Invent was rated more visually appealing, the most statistically significant difference of any metric.
Even though users were more effective and efficient at Enchanted, the difference in the perceived usability of the two sites, as measured by SUS, was not statistically significant.
Metrics Analyzed
- Task Success Rate: the average of the participants' task success rates.
- Task Efficiency: the average efficiency of the participants, calculated as the percentage of tasks completed successfully per minute.
- Average Task Rating: the average of the mean task ratings given by the participants, on a scale from 1 (Very Difficult) to 5 (Very Easy).
- Ease of Finding: the average "ease of finding information" rating given by the participants, on a scale from 1 (Very Difficult) to 7 (Very Easy).
- Visual Appeal: the average visual appeal rating given by the participants, on a scale from 1 (Not at all Appealing) to 7 (Very Appealing).
- SUS (System Usability Scale): the average SUS score given by the participants, on a scale from 0 (Not at all Usable) to 100 (Perfectly Usable).
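For concreteness, here is a minimal sketch of how these per-participant metrics could be computed from a raw results table. This is not the study's actual workbook; the file name and column names ("site", "tasks_correct", "total_minutes") are assumptions, and the efficiency line reflects one plausible reading of "percentage of tasks completed successfully per minute."

```python
import pandas as pd

TASKS_PER_PARTICIPANT = 6

# Hypothetical raw export: one row per participant
df = pd.read_csv("usability_results.csv")

df["success_rate"] = df["tasks_correct"] / TASKS_PER_PARTICIPANT
df["mean_task_minutes"] = df["total_minutes"] / TASKS_PER_PARTICIPANT
# Assumed interpretation: success percentage divided by mean task time,
# which yields values above 100% when tasks average under a minute
df["efficiency"] = df["success_rate"] * 100 / df["mean_task_minutes"]

# Per-site averages, as reported on the following slides
print(df.groupby("site")[["success_rate", "efficiency"]].mean())
```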
Analysis Method
A difference between two values is deemed statistically significant if it can be stated with at least 90% confidence. Confidence intervals displayed on graphs are 90% intervals unless otherwise noted. Where a 90% level of confidence cannot be established by visual inspection of the graph, a t-test is used.
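This decision rule can be sketched in code: non-overlapping 90% confidence intervals establish significance by inspection, and a two-sample t-test settles the cases where they overlap. This is an illustrative sketch, not the original spreadsheet analysis; Welch's (unequal-variance) t-test is assumed.

```python
import numpy as np
from scipy import stats

def ci90(x):
    """Two-sided 90% t-based confidence interval for the mean of sample x."""
    x = np.asarray(x, dtype=float)
    half = stats.sem(x) * stats.t.ppf(0.95, len(x) - 1)
    return x.mean() - half, x.mean() + half

def compare(a, b):
    """Compare per-participant scores from the two sites."""
    (lo_a, hi_a), (lo_b, hi_b) = ci90(a), ci90(b)
    if hi_a < lo_b or hi_b < lo_a:  # intervals don't overlap
        return "significant at >= 90% confidence (visible by inspection)"
    t, p = stats.ttest_ind(a, b, equal_var=False)  # Welch's t-test
    # The report states results as a confidence level, e.g. p = 0.04 -> "96%"
    return f"t = {t:.2f}, p = {p:.3f} (~{(1 - p) * 100:.0f}% confidence)"
```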
Task Success Rate
The observed average task success rates for the participants were:
- Enchanted: 91.7%
- Invent: 84.7%
A t-test showed, with 96% confidence, that Enchanted users had a higher success rate than Invent users.
Task Efficiency
The observed average task efficiencies for the participants were:
- Enchanted: 122%
- Invent: 91%
These values were sufficiently different that I could determine visually, from the confidence intervals, that there is 95% confidence that Enchanted users were more efficient than Invent users.
Task Rating
The observed averages of the mean task ratings given by the participants were:
- Enchanted: 4.1 (out of 5)
- Invent: 3.7 (out of 5)
Visual inspection was inconclusive, so a t-test was used; it showed, with 98% confidence, that tasks have a higher mean rating on Enchanted than on Invent.
Ease of Finding
The observed average "ease of finding information" ratings given by the participants were:
- Enchanted: 5.7 (out of 7)
- Invent: 5.1 (out of 7)
Visual inspection was inconclusive, so a t-test was used; it showed, with 97% confidence, that tasks have a higher "ease of finding" rating on Enchanted than on Invent.
Visual Appeal
The observed average visual appeal ratings given by the participants were:
- Enchanted: 2.2 (out of 7)
- Invent: 4.5 (out of 7)
These values were so different that I could determine visually, from the confidence intervals, that there is 99% confidence that Invent was deemed more visually appealing than Enchanted.
Note that the difference in visual appeal was, by far, the most statistically significant difference. This will play an important role in the conclusions.
SUS (System Usability Scale)
The observed average SUS scores given by the participants were:
- Enchanted: 63.2 (out of 100)
- Invent: (out of 100)
These values were not statistically different at the 90% confidence level; a t-test could show the difference with only 75% confidence. So even though Enchanted users were more effective and efficient, the difference in perceived usability as measured by SUS was not statistically significant.
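For context, a SUS score is not a direct 0-100 rating: it is derived from ten 1-5 questionnaire items using Brooke's standard transform (odd items score as the response minus 1, even items as 5 minus the response, and the 0-40 raw sum is multiplied by 2.5). A reference implementation:

```python
def sus_score(responses):
    """Standard SUS scoring for one participant's ten 1-5 responses."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly 10 items")
    odd = sum(r - 1 for r in responses[0::2])   # positively worded items
    even = sum(5 - r for r in responses[1::2])  # negatively worded items
    return (odd + even) * 2.5                   # scales the 0-40 raw sum to 0-100

print(sus_score([4, 2, 4, 2, 5, 1, 4, 2, 4, 2]))  # -> 80.0
```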
Conclusions
Even though Enchanted users were clearly more effective and efficient than Invent users, the difference in SUS scores, which measure the perceived usability of a site, was not statistically significant. Why is that?
- Invent was rated much more visually appealing than Enchanted. In fact, the difference in visual appeal was much greater than any difference in effectiveness or efficiency.
- Though Invent's visual appeal score wasn't great either, none of its free-form comments were especially strong. Enchanted, however, received comments like "horrible," "pretty terrible," "looks like it was made in the 90s" (4 times), "cluttered and boring," "amateurish looking" (twice), "unfinished," and "too boxy."
- All participants were in the usability field and may have been strongly put off by Enchanted's poor visual appeal even though they were more successful there.
Looking through Invent's free-form comments for "How easy was it to find information" and "Did you find anything particularly challenging," the main problems with the site are clear:
- Search was hard to find.
- Search sometimes failed to find the answer with the search terms given.
Conclusions
Enchanted's poor visual appeal disproportionately affected users' perception of its usability.

Invent Recommendations
Invent clearly needs to work on its search mechanism. That should give Invent a big boost in effectiveness and efficiency.

Enchanted Recommendations
Even though Enchanted performed well in effectiveness and efficiency, its lack of visual appeal brought down its overall perceived usability. An effort to improve the visual appeal (making sure not to sacrifice function!) should make Enchanted much more satisfying to users.
Appendix A: Analysis Method
I created pivot tables with count, average, and stdev for each metric I analyzed (plus task time, which I didn't include in the report). From each pivot table, I created a comparison table with confidence intervals (usually 90%). When a statistical difference wasn't obvious, or when the difference was significant at 90% and I wanted to know how much more confidence it could be stated with, I performed a t-test. To run some of the t-tests, I had to create lists of a particular value from each site: I filtered the original table by site, copied the relevant column to a new sheet ("filtered table" and "filtered table 2"), and used those data sets as the t-test arguments.
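The same workflow translates directly to pandas. The sketch below is a rough equivalent, not the original analysis (which was done in a spreadsheet); the file and column names are assumptions.

```python
import pandas as pd
from scipy import stats

df = pd.read_csv("usability_results.csv")  # hypothetical raw data export

# Pivot table with count, average, and stdev for each metric, split by site
pivot = df.pivot_table(index="site",
                       values=["success_rate", "efficiency", "sus"],
                       aggfunc=["count", "mean", "std"])
print(pivot)

# The "filtered table" step: pull one metric's values per site as t-test inputs
enchanted_sus = df.loc[df["site"] == "Enchanted", "sus"]
invent_sus = df.loc[df["site"] == "Invent", "sus"]
print(stats.ttest_ind(enchanted_sus, invent_sus, equal_var=False))
```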
Appendix B: Removal of P13
P13 only answered one question correctly (the only participant with just 1 correct) and gave up on the rest.
- The mean task time was 0.29 minutes, the fastest of all Invent users.
- The mean task rating was almost certainly five 1's for the "give ups" and a 5 for the correct one.
- Most telling were the free-form comments:
  - Ease of finding: "I didn't know the answers to most of these questions, so I don't know how difficult it was to find the information. I wouldn't have know the correct answer anyway."
  - Anything particularly challenging: "No, but I could have used some hints."
  - Anything particularly effective: "the dropdown menu was easy to use."
It seems clear that the participant was trying to answer the questions without using the website and was evaluating the task questions, not the site itself.
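In code, the exclusion is a one-line filter. This is a hypothetical illustration, assuming the raw table has a "participant" column with IDs like "P13":

```python
import pandas as pd

df = pd.read_csv("usability_results.csv")   # hypothetical raw export
df = df[df["participant"] != "P13"].copy()  # drop P13, leaving 37 valid Invent users
```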