Data Analysis of EnchantedLearning.com vs. Invent.org

Data Analysis of EnchantedLearning.com vs. Invent.org Eric Lewine

Introduction
This presentation analyzes the data from a usability test comparing two sites that provide information on inventors and inventions:
- EnchantedLearning.com (Enchanted)
- Invent.org (Invent)

Method Overview
Participants were given 6 tasks (out of a possible 10) to perform on a single site. Data from 79 participants were analyzed; all participants were in the usability field.
- 42 used Enchanted
- 37 used Invent (38 responded, but 1 was eliminated; see Appendix B for the rationale)

Executive Summary
Users were more effective and efficient at Enchanted:
- Better task success rate
- Better task efficiency
- Tasks were rated as easier
- Information was rated as easier to find
Users found Invent more visually appealing:
- Rated more visually appealing (the most statistically significant difference of any metric)
Even though users were more effective and efficient at Enchanted, the difference in the perceived usability of each site, as measured by SUS, was not statistically significant.

Metrics Analyzed
Task Success Rate: the average of the participants' success rates.
Task Efficiency: the average efficiency of the participants, calculated as the percentage of tasks completed successfully per minute.
Average Task Rating: the average of the mean task rating given by the participants on a scale from 1 (Very Difficult) to 5 (Very Easy).
Ease of Finding: the average "ease of finding information" rating given by the participants on a scale from 1 (Very Difficult) to 7 (Very Easy).
Visual Appeal: the average visual appeal rating given by the participants on a scale from 1 (Not at all Appealing) to 7 (Very Appealing).
SUS (System Usability Scale): the average SUS score given by the participants on a scale from 0 (Not at all Usable) to 100 (Perfectly Usable).
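The per-participant metrics defined above can be sketched in a few lines. This is a minimal illustration with made-up numbers; the tuple layout (tasks correct, tasks given, total minutes, mean task rating) is an assumption for the sketch, not the study's actual data format.

```python
# Sketch of the per-participant metrics; sample values are hypothetical.
from statistics import mean

participants = [
    # (tasks_correct, tasks_given, total_minutes, mean_task_rating_1_to_5)
    (6, 6, 4.0, 4.5),
    (5, 6, 5.5, 3.8),
    (4, 6, 6.0, 3.2),
]

success_rates = [correct / given for correct, given, _, _ in participants]
# Efficiency: percentage of tasks completed successfully per minute,
# so values above 100% are possible for fast, accurate participants.
efficiencies = [100 * (correct / given) / minutes
                for correct, given, minutes, _ in participants]
ratings = [rating for *_, rating in participants]

avg_success = mean(success_rates)      # average task success rate
avg_efficiency = mean(efficiencies)    # average task efficiency (% per minute)
avg_rating = mean(ratings)             # average task rating (1 to 5)
```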

Analysis Method
A difference between two values is deemed statistically significant if it can be stated with a 90% level of confidence. Confidence intervals displayed on graphs are 90%, unless otherwise noted. Where a 90% level of confidence cannot be established by visual inspection of the graph, a t-test is used.

Task Success Rate
The observed average task success rates for the participants were:
- Enchanted: 91.7%
- Invent: 84.7%
T-testing revealed that there is a 96% confidence that Enchanted users had a higher success rate than Invent users.

Task Efficiency
The observed average task efficiencies for the participants were:
- Enchanted: 122%
- Invent: 91%
These values were sufficiently different that I was able to determine visually, from the confidence intervals, that there is a 95% confidence that Enchanted users were more efficient than Invent users.

Task Rating
The observed averages of the mean task ratings given by the participants were:
- Enchanted: 4.1 (out of 5)
- Invent: 3.7 (out of 5)
Visual inspection was inconclusive, so a t-test was used; it revealed a 98% confidence level that tasks have a higher mean rating on Enchanted than on Invent.

Ease of Finding
The observed average "ease of finding information" ratings given by the participants were:
- Enchanted: 5.7 (out of 7)
- Invent: 5.1 (out of 7)
Visual inspection was inconclusive, so a t-test was used; it revealed a 97% confidence level that tasks have a higher "ease of finding" rating on Enchanted than on Invent.

Visual Appeal
The observed average visual appeal ratings given by the participants were:
- Enchanted: 2.2 (out of 7)
- Invent: 4.5 (out of 7)
These values were so different that I was able to determine visually, from the confidence intervals, that there is a 99% confidence that Invent was deemed more visually appealing than Enchanted.
Note that the difference in visual appeal was, by far, the most statistically significant difference of any metric. This will play an important role in the conclusions drawn.

SUS (System Usability Scale)
The observed average SUS scores given by the participants were:
- Enchanted: 63.2 (out of 100)
- Invent: 58.2 (out of 100)
These values were not statistically different at the 90% confidence level: t-testing showed that the difference could be stated with only 75% confidence, short of the 90% bar. Even though Enchanted users were more effective and efficient, the difference in SUS scores was not statistically significant.
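For reference, a raw SUS questionnaire is conventionally scored as follows: ten statements rated 1 to 5, where odd-numbered items contribute (rating - 1), even-numbered items contribute (5 - rating), and the sum is scaled by 2.5 to give a 0-100 score. The example responses below are hypothetical.

```python
# Conventional SUS scoring (Brooke's scheme); example ratings are made up.

def sus_score(ratings):
    """ratings: list of 10 responses (1-5), item 1 first."""
    assert len(ratings) == 10
    total = 0
    for i, r in enumerate(ratings, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (r - 1) if i % 2 == 1 else (5 - r)
    return total * 2.5

# An all-neutral response sheet (all 3s) scores exactly 50.
print(sus_score([3] * 10))  # → 50.0
```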

Conclusions
Even though Enchanted users were clearly more effective and efficient than the Invent users, the difference in SUS scores, which measure the perceived usability of a site, was not statistically significant. Why is that?
Invent was rated much more visually appealing than Enchanted. In fact, the difference in visual appeal was much greater than any difference in effectiveness or efficiency. Though Invent's visual appeal score wasn't great either, none of its free-form comments were that strong. Enchanted, however, received comments like "horrible," "pretty terrible," "looks like it was made in the 90s" (4 times), "cluttered and boring," "amateurish looking" (twice), "unfinished," and "too boxy."
All participants were in the usability field and may have been extremely turned off by the poor visual appeal of Enchanted even though they were more successful there.
Looking through Invent's free-form comments on "How easy was it to find information" and "Did you find anything particularly challenging," it's clear that the main problems with the site were:
- Search was hard to find
- Search sometimes didn't find the answer for the search keys given

Conclusions
Enchanted's poor visual appeal disproportionately affected users' perception of its usability.

Invent Recommendations
Obviously, Invent needs to work on its search mechanism. That should give Invent a big boost in effectiveness and efficiency.

Enchanted Recommendations
Even though Enchanted performed well in efficiency and effectiveness, its lack of visual appeal brought down its overall perceived usability. An effort to improve the visual appeal (making sure not to sacrifice function!) should make Enchanted much more satisfying to users.

Appendix A: Analysis Method
I created pivot tables with count, average, and stdev for each metric I analyzed (plus task time, which I didn't include in the report). From each pivot table, I created a comparison table with confidence intervals (usually 90%). When the statistical difference wasn't obvious from the intervals, or when the difference at 90% was significant, I performed t-testing either to confirm the statistical difference or to determine how much more confidence the difference could be stated with. To run some of the t-tests, I had to build lists of a particular value for each site: I filtered the original table by site, copied the relevant column to a new sheet ("filtered table" and "filtered table 2"), and used those sets of data as the t-test arguments.
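Outside of Excel, the same group-summarize-test workflow might look like the sketch below. The rows and values are hypothetical; Welch's unequal-variance t statistic is a common choice when the two groups differ in size (as with 42 vs. 37 participants here), and a tool like Excel's T.TEST would convert such a statistic into the confidence levels quoted in the report.

```python
# Rough stdlib re-creation of the pivot-and-t-test workflow.
# Field names and values are hypothetical, not the study data.
from statistics import mean, stdev
from math import sqrt

rows = [
    ("Enchanted", 0.95), ("Enchanted", 0.88), ("Enchanted", 1.00),
    ("Invent", 0.80), ("Invent", 0.85), ("Invent", 0.78),
]

def summarize(site):
    """Count, mean, and sample stdev for one site (the 'pivot table')."""
    values = [v for s, v in rows if s == site]
    return len(values), mean(values), stdev(values)

n1, m1, s1 = summarize("Enchanted")
n2, m2, s2 = summarize("Invent")

# Welch's t statistic for two samples with unequal variances.
t_stat = (m1 - m2) / sqrt(s1**2 / n1 + s2**2 / n2)
```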

Appendix B: Removal of P13
P13 answered only one question correctly (the only participant with just 1 correct) and gave up on the rest. The mean task time was .29 minutes (the fastest of all Invent users). The mean task rating was 1.66, which was almost certainly five 1's for the give-ups and a 5 for the correct task.
But most telling were the free-form comments:
- Ease of finding: "I didn't know the answers to most of these questions, so I don't know how difficult it was to find the information. I wouldn't have know the correct answer anyway."
- Anything particularly challenging: "No, but I could have used some hints."
- Anything particularly effective: "the dropdown menu was easy to use."
It seems clear that the user was trying to answer the questions without using the website and was evaluating the task questions, not the site itself.