Published by Gary Rice. Modified over 9 years ago.
1
Vertical Search for Courses of UIUC by Jessica Bell, Alexander Loeb, Sharon Paradesi, Michael Paul, Jing Xia, Jie Zhang
2
Demo http://greedy.cs.uiuc.edu/dssi/course/search.php
3
Goals of the project
- Construct a database of UIUC courses across all departments, ultimately creating a centralized knowledge base about each course.
- Augment the database by drawing relations between courses, both within and between departments, and further by finding similarities with courses outside the University of Illinois.
4
Architecture
- Data source: Course Catalog, Book Store, Webpages, Other Universities
- Collection and processing: PHP script, JAVA script, AgentIDE, Heritrix, WEKA
- Database: Basic Course Info, Book Info, Course homepage, Keywords, Related Courses
- Query (PHP) by: Course Name, Instructor, Description, …
5
Tools used
- Web Crawling: Wget, AgentIDE and Heritrix
- Parsers: Python and Java
- Learning Tools: WEKA
- Website Design: PHP and MySQL
6
Tasks finished
- Data Mining
  - Basic course information
  - Similar course recommendation
  - Prerequisite course list
  - Recommended book information
- Learning
  - Clustering
  - Classification
7
Keywords
- Pull from course descriptions
- Remove uninformative/common words
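The two keyword steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual parser: the stop-word list, the tokenization rule, and the function name `extract_keywords` are all assumptions, since the slides do not say how uninformative words were identified.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the slides do not say which words were removed.
STOP_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "for", "is", "are",
    "with", "on", "this", "course", "students", "will",
}


def extract_keywords(description, stop_words=STOP_WORDS):
    """Count the words in a course description after dropping stop words."""
    # Lowercase the text and keep runs of letters and digits as words.
    words = re.findall(r"[a-z0-9]+", description.lower())
    return Counter(w for w in words if w not in stop_words)


if __name__ == "__main__":
    text = "Introduction to the design and analysis of algorithms and data structures."
    print(extract_keywords(text).most_common())
```

A fixed list is the simplest choice; a frequency-based filter over all course descriptions would be another way to find common words, but the slides do not indicate which approach was taken.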
8
Keywords (contd.)
9
Search
- Search by name, instructor, or content
- Clean up search string
  - “cs125” becomes “CS 125”
  - “real-time” becomes “real time realtime”
- Split search string into individual words and query database for word matches
- Score and rank results by match frequencies and keyword informativeness scores
- Look at distribution of scores and display the top results
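The clean-up and ranking steps above might look something like the sketch below. The two rewrite rules reproduce the examples on the slide; everything else is an assumption made for illustration: the regular expressions, the in-memory `index` standing in for the MySQL database, the `informativeness` weights, and the scoring formula (match frequency times weight, summed over query words). The final step of cutting off results based on the score distribution is not covered.

```python
import re


def clean_query(query):
    """Apply the two clean-up rules shown on the slide to a raw search string."""
    # "cs125" -> "CS 125": split a fused department code and course number.
    query = re.sub(
        r"\b([A-Za-z]{2,4})(\d{3})\b",
        lambda m: f"{m.group(1).upper()} {m.group(2)}",
        query,
    )

    # "real-time" -> "real time realtime": keep the parts and the joined form.
    def expand(match):
        parts = match.group(0).split("-")
        return " ".join(parts) + " " + "".join(parts)

    return re.sub(r"\b\w+(?:-\w+)+\b", expand, query)


def rank_courses(query, index, informativeness):
    """Rank courses by summed (match frequency x informativeness) per query word.

    `index` maps a course name to a dict of lowercase word counts and
    `informativeness` maps a word to a weight (default 1.0). Both are
    hypothetical stand-ins for the project's database tables.
    """
    words = clean_query(query).lower().split()
    scores = {}
    for course, counts in index.items():
        score = sum(counts.get(w, 0) * informativeness.get(w, 1.0) for w in words)
        if score > 0:
            scores[course] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up index entries, for illustration only.
    index = {
        "CS 125": {"cs": 1, "125": 1, "programming": 2},
        "ECE 486": {"real": 1, "time": 2, "control": 1},
    }
    print(clean_query("cs125"))
    print(clean_query("real-time"))
    print(rank_courses("real-time", index, {}))
```

Expanding "real-time" into three words lets the query match descriptions that spell the term with a space, with a hyphen stripped, or as a single word.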
10
Classification
- NBTree Classifier
- Training set: 34 instances
- Test set: 38 instances
- Attributes: 17
- Accuracy: 94.74%
- Precision: 0.947
- Recall: 0.947
- F-Measure: 0.947
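As a quick consistency check on these figures: an accuracy of 94.74% on 38 test instances corresponds to 36 correct predictions, and an F-measure of 0.947 follows from equal precision and recall of 0.947. The count of 36 is inferred from the reported percentage, not stated on the slide.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


test_size = 38
correct = 36  # Assumption: inferred from 94.74% of 38, not given on the slide.
accuracy = correct / test_size

print(f"{accuracy:.2%}")                  # 94.74%
print(round(f_measure(0.947, 0.947), 3))  # 0.947
```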
11
Clustering
- Cobweb Clustering Algorithm
- Instances: 20
- Attributes: 112
- Number of clusters: 17
- Incorrectly clustered instances: 7 (i.e. 35%)