CPS 196.03: Information Management and Mining
Third Programming Project
Third Programming Project
  Three options:
    Clustering project
    PageRank project
    Your own topic
  1-page project proposal due by Friday, April 10, 5:00 PM
  Project and report due on Tuesday, April 21
  Single demo for all three projects: April 21 and 22
    30 minutes per team (team from Project 3)
    Be prepared to run code on your laptop or by logging in to a CS department machine
    Time slots will be determined through email
Clustering Project
  Implement BFR algorithm
    Notes
    www.cs.cornell.edu/Courses/cs678/2002sp/papers/bradley98scaling.ps
  Evaluate on one or more datasets from UCI repository
    http://kdd.ics.uci.edu/
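As a starting point for the clustering option, here is a minimal sketch of one ingredient of BFR as it is commonly described: each cluster is summarized by the count, per-dimension sum, and per-dimension sum of squares of its points, and a new point is absorbed when its variance-normalized distance to the nearest centroid is small. The names `ClusterSummary` and `assign_chunk` and the `threshold` parameter are illustrative assumptions, not part of the assignment. The compression set, the merging of mini-clusters, and the choice of initial centroids are all left out.

```python
import math


class ClusterSummary:
    """Sufficient statistics (N, SUM, SUMSQ) for one cluster, kept per dimension."""

    def __init__(self, dim):
        self.n = 0
        self.sum = [0.0] * dim
        self.sumsq = [0.0] * dim

    def add(self, point):
        # Absorb a point by updating the statistics; the point itself is not stored.
        self.n += 1
        for i, x in enumerate(point):
            self.sum[i] += x
            self.sumsq[i] += x * x

    def centroid(self):
        return [s / self.n for s in self.sum]

    def variance(self):
        # Per-dimension variance: E[x^2] - (E[x])^2.
        return [sq / self.n - (s / self.n) ** 2
                for s, sq in zip(self.sum, self.sumsq)]

    def mahalanobis(self, point, eps=1e-12):
        # Distance normalized by per-dimension variance (axis-aligned clusters assumed).
        # eps guards against zero variance; its value is an arbitrary assumption.
        c, v = self.centroid(), self.variance()
        return math.sqrt(sum((x - ci) ** 2 / max(vi, eps)
                             for x, ci, vi in zip(point, c, v)))


def assign_chunk(chunk, summaries, threshold):
    """Absorb close points into their nearest summary; return the points held back."""
    retained = []
    for p in chunk:
        best = min(summaries, key=lambda s: s.mahalanobis(p))
        if best.mahalanobis(p) <= threshold:
            best.add(p)
        else:
            retained.append(p)  # kept for later processing
    return retained
```

A full implementation would read the dataset one memory-sized chunk at a time and call something like `assign_chunk` on each, then cluster the retained points separately.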
PageRank Project
  Implement PageRank computation algorithm for large Web graphs
    How is the Web graph represented?
  Study running time, convergence properties, and robustness of the algorithm to spam/fraud
    Generate different types of Web graphs
  Paper from Google: The PageRank Citation Ranking: Bringing Order to the Web
    For discussion on Thursday (see readings page)
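For the PageRank option, the following is a small in-memory sketch of the usual power-iteration formulation, meant only to make the representation and convergence questions concrete. It assumes the graph is a dictionary mapping each page to a list of pages it links to, a damping factor of 0.85, and an L1 stopping tolerance. These choices, and the function name `pagerank`, are assumptions for illustration. A project-scale implementation for large Web graphs would need an on-disk or compressed representation instead.

```python
def pagerank(graph, damping=0.85, tol=1e-10, max_iter=200):
    """Power iteration on an adjacency-list graph (assumed non-empty).

    graph: dict mapping node -> list of nodes it links to.
    Returns (rank dict, number of iterations performed).
    """
    # Include pages that appear only as link targets.
    nodes = sorted(set(graph) | {v for outs in graph.values() for v in outs})
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}

    for iteration in range(1, max_iter + 1):
        # Rank held by pages with no out-links is spread uniformly over all pages.
        dangling = sum(rank[u] for u in nodes if not graph.get(u))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {u: base for u in nodes}

        # Each page passes an equal share of its damped rank along each out-link.
        for u, outs in graph.items():
            if outs:
                share = damping * rank[u] / len(outs)
                for v in outs:
                    new_rank[v] += share

        # L1 change between successive iterates is the convergence measure.
        delta = sum(abs(new_rank[u] - rank[u]) for u in nodes)
        rank = new_rank
        if delta < tol:
            break

    return rank, iteration
```

The returned iteration count gives a simple handle on convergence: running the function on generated graphs of different shapes, or on a graph with an added cluster of pages that all link to one target, shows how the iteration count and the target's rank respond.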