Vertical Search for Courses of UIUC by Jessica Bell, Alexander Loeb, Sharon Paradesi, Michael Paul, Jing Xia, Jie Zhang.


1 Vertical Search for Courses of UIUC by Jessica Bell, Alexander Loeb, Sharon Paradesi, Michael Paul, Jing Xia, Jie Zhang

2 Demo http://greedy.cs.uiuc.edu/dssi/course/search.php

3 Goals of the project
- Construct a database of UIUC courses across all departments, ultimately creating a centralized knowledge base about each course.
- Augment the database by drawing relations between courses, both within and between departments, and by finding similar courses outside the University of Illinois.

4 Architecture
[Architecture diagram; recoverable labels follow]
- Data source: Course Catalog, Book Store, Webpages, Other Universities
- Tools: PHP script, JAVA script, AgentIDE, Heritrix, WEKA
- Database: Basic Course Info, Book Info, Course homepage, Keywords, Related Courses
- Query by: Course Name, Instructor, Description, …
- Front end: PHP

5 Tools used
- Web crawling: Wget, AgentIDE and Heritrix
- Parsers: Python and Java
- Learning tools: WEKA
- Website design: PHP and MySQL

6 Tasks finished
Data Mining
- Basic course information
- Similar course recommendation
- Prerequisite course list
- Recommended book information
Learning
- Clustering
- Classification

7 Keywords
- Pull from course descriptions
- Remove uninformative/common words

8 Keywords (contd.)
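The keyword step on the slides above (pull words from course descriptions, drop uninformative ones) might look roughly like the following Python sketch. The stop-word list and the function name `extract_keywords` are hypothetical; the slides do not say which words were removed or how the step was implemented.

```python
import re
from collections import Counter

# Hypothetical stop-word list: the slides do not specify the actual one.
STOP_WORDS = {
    "a", "an", "and", "the", "of", "to", "in", "for", "with",
    "course", "students", "topics", "introduction",
}

def extract_keywords(description, stop_words=STOP_WORDS):
    """Return a Counter of candidate keywords from one course description."""
    # Lowercase the text and keep alphabetic runs only.
    words = re.findall(r"[a-z]+", description.lower())
    # Drop common/uninformative words; count what remains.
    return Counter(w for w in words if w not in stop_words)

keywords = extract_keywords(
    "Introduction to the design and analysis of algorithms and data structures"
)
print(sorted(keywords))
```

A fixed stop-word set is the simplest choice; a frequency-based filter over all descriptions would be another reasonable way to find "common" words.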

9 Search
- Search by name, instructor, or content
- Clean up search string
  - “cs125” becomes “CS 125”
  - “real-time” becomes “real time realtime”
- Split search string into individual words and query database for word matches
- Score and rank results by match frequencies and keyword informativeness scores
- Look at distribution of scores and display the top results
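The search steps above can be sketched in Python as follows. This is an illustration under assumptions, not the project's code: the regular expressions, the function names `clean_query` and `rank`, and the IDF-style weight standing in for the "keyword informativeness score" are all hypothetical, and the score-distribution cutoff for the top results is not reproduced.

```python
import math
import re
from collections import Counter

def clean_query(query):
    """Normalize a raw search string as in the two examples on the slide."""
    # "cs125" -> "CS 125": split a department code from a 3-digit course number.
    query = re.sub(
        r"\b([A-Za-z]{2,4})(\d{3})\b",
        lambda m: f"{m.group(1).upper()} {m.group(2)}",
        query,
    )
    # "real-time" -> "real time realtime": add split and joined variants.
    return re.sub(r"\b(\w+)-(\w+)\b", r"\1 \2 \1\2", query)

def rank(query, docs):
    """Rank docs (course id -> text) by match frequency times term weight.

    The weight log(1 + N/df) is an assumed stand-in for informativeness.
    """
    terms = clean_query(query).lower().split()
    counts = {cid: Counter(re.findall(r"\w+", text.lower()))
              for cid, text in docs.items()}
    n_docs = len(docs)
    scores = {}
    for cid, words in counts.items():
        score = 0.0
        for term in terms:
            doc_freq = sum(1 for c in counts.values() if term in c)
            if doc_freq:
                score += words[term] * math.log(1 + n_docs / doc_freq)
        if score > 0:
            scores[cid] = score
    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Made-up course records, for illustration only.
docs = {
    "CS 125": "CS 125 introduction to computer science",
    "CS 431": "CS 431 embedded real time systems realtime scheduling",
    "ECE 110": "ECE 110 introduction to electronics",
}
print(clean_query("cs125"))
print(clean_query("real-time"))
print(rank("real-time", docs))
```

Rare terms such as a course number receive a larger weight than terms shared by many courses, so an exact course match ranks above courses that merely share the department code.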

10 Classification
- NBTree Classifier
- Training set: 34 instances
- Test set: 38 instances
- Attributes: 17
- Accuracy: 94.74%
- Precision: 0.947
- Recall: 0.947
- F-Measure: 0.947
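As a quick consistency check on the figures above, the short Python sketch below recovers the number of correctly classified test instances implied by the reported accuracy and recomputes the F-measure from precision and recall. The count of 36 is inferred from the reported numbers, not stated on the slide.

```python
test_instances = 38
reported_accuracy = 0.9474  # 94.74% from the slide

# Nearest whole number of correct predictions (inferred, not on the slide).
correct = round(reported_accuracy * test_instances)
accuracy = correct / test_instances

# F-measure is the harmonic mean of precision and recall.
precision = recall = 0.947
f_measure = 2 * precision * recall / (precision + recall)

print(correct, round(accuracy * 100, 2), round(f_measure, 3))
```

Because precision and recall are equal here, their harmonic mean equals the same value, which matches the reported F-measure.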

11 Clustering
- Cobweb Clustering Algorithm
- Instances: 20
- Attributes: 112
- Number of clusters: 17
- Incorrectly clustered instances: 7 (35%)

