Matt York: Design, salaries, webmaster
Danny Swisher: Search, auto-complete, homepages
Patrick Healy: Leaderboards & metrics, schedules, reviews
Tim Crossley: Classification, compare, publications
December 14, 2009. CSE 454, Autumn 09
Problem: Instructor data is useful and juicy. But it’s also uber difficult to find: courses taught, salaries, reviews, course evaluations, publications, etc.
Solution: Aggregate data and make it accessible. Add super cool functionality.
Data Sets: UW Time Schedule. Instructors, courses (number, title, description, etc.), departments, course instances.
Data Sets: RateMyProfessors.com. Easiness, Helpfulness, Clarity, Rater Interest.
Data Sets: lbloom.net. This is public data!
Data Sets: Awards. Only 149 matches.
Data Sets: BehindTheName. Lookup nicknames.
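One way a nickname table could help when matching instructor names across data sets is to map first names to a canonical form before comparing. The following is a minimal sketch under stated assumptions, not the project's code: the `NICKNAMES` entries and the `canonical_name` and `same_person` helpers are hypothetical, and the table is hand-written rather than taken from BehindTheName.

```python
# Hypothetical nickname table; a real one would be built from a names data set.
NICKNAMES = {
    "bill": "william",
    "will": "william",
    "bob": "robert",
    "dan": "daniel",
}


def canonical_name(full_name):
    """Lowercase a 'First ... Last' name and expand a known nickname."""
    parts = full_name.lower().split()
    if not parts:
        return ""
    first = NICKNAMES.get(parts[0], parts[0])
    if len(parts) == 1:
        return first
    # Keep only the first and last tokens so middle initials do not block a match.
    return f"{first} {parts[-1]}"


def same_person(name_a, name_b):
    """Treat two names as the same person if their canonical forms agree."""
    return canonical_name(name_a) == canonical_name(name_b)
```

With this sketch, `same_person("Bill Smith", "William J. Smith")` is true. Dropping middle names makes matching more forgiving, but it can also merge two different people who share a first and last name.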
Data Sets: CiteSeer. Publications search engine.
Data Sets: Course Evaluations. This information is difficult to find.
Course Evaluations
Interface Structure
Find: Search For Courses, Search For Departments, Search For Instructors
Compare: Compare Multiple Attributes At Once, Compare All Between Types
Rank: Discover The Best And The Worst, Discover InterType Relationships
Find: AutoComplete throughout interface. Ranking. Snippets. Our data is structured, which facilitates autocomplete. Also, while large, our data set is still small enough to keep autocomplete pretty fast.
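Because the data is structured and modest in size, prefix autocomplete can be served from a sorted in-memory list. The following is a minimal sketch of that idea, not the project's actual implementation: the `build_index` and `complete` functions and the sample names are hypothetical.

```python
import bisect


def build_index(names):
    """Return (lowercase key, original name) pairs sorted by key."""
    return sorted((name.lower(), name) for name in names)


def complete(index, prefix, limit=5):
    """Return up to `limit` names that start with `prefix`, ignoring case."""
    key = prefix.lower()
    # Binary search for the first entry whose key is >= the prefix.
    start = bisect.bisect_left(index, (key,))
    matches = []
    for lowered, original in index[start:start + limit]:
        if not lowered.startswith(key):
            break  # Sorted order means no later entry can match either.
        matches.append(original)
    return matches


index = build_index(
    ["Computer Science", "Communication", "Chemistry", "Computer Engineering"]
)
suggestions = complete(index, "comp")
```

Here `suggestions` is `["Computer Engineering", "Computer Science"]`. Each lookup is a binary search plus a short scan, so it stays quick as long as the whole index fits in memory.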
Results: Separated areas for results of each type
Rankings: Interesting rankings
Usability Testing Results
High-level: Cool idea, features useful. Site navigation smooth. “Rankings” not clear; “Leaderboards” tested better.
Low-level: No option to remove item from compare list. No way to compare all teachers. From search results page, people tried to click compare (no “Add to Compare” button).
Early Development
Demo
Problems we ran into: Time schedule has ugly HTML structure. XPath queries difficult.
Problems & Surprises Name conflicts
Evaluation: Classifying home pages is difficult! Precision: 42.6%. Recall: 87.0%.
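For reference, precision and recall are computed from counts of true positives, false positives, and false negatives. The following is a small sketch with made-up counts; they are not the project's evaluation data, and the `precision_recall` helper is hypothetical.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from true/false positive and false negative counts."""
    # Precision: of the pages labeled as home pages, how many really were.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the real home pages, how many the classifier found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Illustrative counts only.
precision, recall = precision_recall(tp=3, fp=1, fn=2)
```

With these counts, precision is 0.75 and recall is 0.6. High recall with low precision, as in the figures above, means the classifier finds most real home pages but also labels many other pages as home pages.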
Time schedule vs. Course evaluations: Time schedule disagrees with evaluations: 0.7%
Experiments & Validation: Instructors with salaries: 37%. Courses with evaluations: 26%. Avg. instructor publications: 4.7.
RateMyProfessors evaluation: Entries matching our crawled data: 65.29%. Instructors with reviews: 12.24%.
Acquired Knowledge / Skills: Web crawling, recognizing relationships. Naïve Bayes shortcomings. Ruby on Rails, JS, etc. Consistent data between team members.
?