Published by Gwenda McBride. Modified over 8 years ago.
1
Spring 2012
2
Staff
- Lecturer: Prof. Sara Cohen
- Graders: Igor Lifshits, Arbel Moshe
3
Topics (tentative)
- Textual Data (~7 weeks):
  - Index structures
  - Constructing the Index
  - Query Processing
  - Ranking
  - Crawling
  - Distributing Data
- Graph Data (~3 weeks):
  - Social Networks
  - Social Search
- Labeled Tree Data (~4 weeks):
  - XML Storage
  - XML Query Processing
4
Teaching Methods
- I will often use slides.
- I will try to put the slides on the Internet before class.
- NOTE: My slides DO NOT necessarily contain all the information taught in class.
  - In particular, discussions of how the course material applies to the final project will usually be oral.
5
Homework Assignments
- There will be one non-programming (easy) assignment:
  - March 27: Collaborative social network creation
- There will be 3 programming assignments:
  - April 17: Index Structure
  - May 8: Index Construction
  - May 29: Friend Recommendation
- Each of these assignments also has a written component
6
Administration
- All work is in pairs
  - Sign up by March 27 (link from homepage)
- Exercises must be implemented in Java
- (If, for some reason, you wish to implement the project alone and/or in another programming language, make your request by March 20)
- Read the exercise description overview on the homepage
  - Contains many administrative details
7
Grading
- All exercise descriptions are already available online
- I summarize the main points in the upcoming slides, but READ THE OVERVIEW AND DEFINITIONS (online)!

Percent of Final Grade   Task
0%                       Exercise 0
15%                      Exercise 1
15%                      Exercise 2
10%                      Exercise 3
60%                      Exam
8
Social Network
- Directed Graph
- Nodes are people, and are associated with a name and a short description
- Edges indicate “like” relationship

[Figure: directed graph over nodes 0, 1, 2, 3, 4, 5]

ID   Name      Description
0    Alice     I am tall and love cooking
1    Bob       Only my mother says I’m nice looking
2    Charlie   Hi ya!
3    Dave      I collect bottle caps
4    Alice     CS student – why?
5    Bob       Smart, silly
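The data model on this slide can be sketched in a few lines of Java. This is only an in-memory illustration, not the on-disk index the exercises ask for, and every name in it (`LikeGraph`, `Person`, `addLike`, `likesOf`) is an assumption made for the example. People are keyed by ID because the table shows that names are not unique.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

/** In-memory directed "like" graph. All class and method names are illustrative. */
class LikeGraph {
    /** A node: a person with a name and a short description. */
    static final class Person {
        final int id;
        final String name;
        final String description;

        Person(int id, String name, String description) {
            this.id = id;
            this.name = name;
            this.description = description;
        }
    }

    // Keyed by id, because names are not unique (the table has two Alices and two Bobs).
    private final Map<Integer, Person> people = new HashMap<>();
    // Out-edges: likes.get(a) holds every b such that a likes b.
    private final Map<Integer, SortedSet<Integer>> likes = new HashMap<>();

    void addPerson(int id, String name, String description) {
        people.put(id, new Person(id, name, description));
        likes.putIfAbsent(id, new TreeSet<>());
    }

    /** Adds the directed edge from -> to. The reverse edge is not implied. */
    void addLike(int from, int to) {
        if (!people.containsKey(from) || !people.containsKey(to)) {
            throw new IllegalArgumentException("unknown person id");
        }
        likes.get(from).add(to);
    }

    /** Returns the person with this id, or null if there is none. */
    Person person(int id) {
        return people.get(id);
    }

    /** Ids that the given person likes, in increasing order (read-only view). */
    SortedSet<Integer> likesOf(int id) {
        return Collections.unmodifiableSortedSet(likes.getOrDefault(id, new TreeSet<>()));
    }
}
```

Keeping each adjacency list sorted is a design choice for the sketch: sorted lists are easy to intersect and to compress later.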
9
Overview of System
- 2 Main Functionalities:
  - Get Large Raw Social Network Dataset, and create on disk Index
  - Recommend friends

[Diagram labels: Raw Data Files, Indexer, Index, User Id, Friend Recommender, Result]
10
We Provide
- A program for generating random social network data
- Example of how to use this program
- Will be available online soon
11
Index Structure
- Given raw input social network data, you will create an on-disk index to allow efficient data access
- We study this topic in class, so you should build on your class knowledge
- Some restrictions:
  - Cannot use a relational database system for implementation
  - Cannot use the default Java serialize and de-serialize implementations
- You should use some form of compression to get your index to be a reasonable size
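As one possible reading of "some form of compression", the sketch below stores a sorted list of node IDs as gaps between consecutive IDs, each gap written in variable-byte form. The slide does not prescribe this scheme; the choice of gap plus variable-byte coding, and the names `GapVByteCodec`, `encode` and `decode`, are assumptions for illustration. The result is a plain `byte[]`, so it could be written to disk with an ordinary `FileOutputStream` rather than with Java's default serialization.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

/** Gap + variable-byte coding for a sorted id list. Names are illustrative. */
class GapVByteCodec {
    /** Writes n in 7-bit groups, low-order group first; a set high bit means "more bytes follow". */
    static void writeVByte(ByteArrayOutputStream out, int n) {
        while ((n & ~0x7F) != 0) {
            out.write((n & 0x7F) | 0x80);
            n >>>= 7;
        }
        out.write(n);
    }

    /** Reads one value written by writeVByte. */
    static int readVByte(ByteArrayInputStream in) {
        int result = 0;
        int shift = 0;
        while (true) {
            int b = in.read();
            if (b < 0) {
                throw new IllegalArgumentException("truncated input");
            }
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    /** Encodes ids (assumed sorted ascending) as: count, then the gap to the previous id. */
    static byte[] encode(int[] sortedIds) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVByte(out, sortedIds.length);
        int previous = 0;
        for (int id : sortedIds) {
            writeVByte(out, id - previous); // small gaps fit in a single byte
            previous = id;
        }
        return out.toByteArray();
    }

    /** Inverse of encode: rebuilds the ids by summing the gaps. */
    static int[] decode(byte[] bytes) {
        ByteArrayInputStream in = new ByteArrayInputStream(bytes);
        int[] ids = new int[readVByte(in)];
        int previous = 0;
        for (int i = 0; i < ids.length; i++) {
            previous += readVByte(in);
            ids[i] = previous;
        }
        return ids;
    }
}
```

For example, the list `{3, 10, 11, 300, 70000}` encodes to 9 bytes (including the count), compared with 20 bytes as five raw 4-byte integers.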
12
Index Construction
- In the first exercise, you will implement an inefficient index construction method
- In the second exercise, you will improve on the method from the first exercise, in order to make construction scalable to large amounts of data
  - You cannot assume that all data fits into the internal memory buffer, so the indexing technique should be efficient even when the data is very large
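A common way to cope with data that does not fit in memory is run-based external sorting: read a bounded chunk, sort it, write it out as a run, and finally merge all runs. The sketch below applies that idea to edges; it is not necessarily the method taught in the course. The class name `ExternalEdgeSorter`, the packing of an edge into one `long`, the text format of the run files, and the `maxInMemory` parameter are all assumptions made for the example.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.function.LongConsumer;

/** Sorts edges with a bounded in-memory buffer by writing sorted runs to temp files. */
class ExternalEdgeSorter {
    /** One open run file plus the edge most recently read from it. */
    private static final class RunCursor implements Comparable<RunCursor> {
        final BufferedReader reader;
        long current;

        RunCursor(BufferedReader reader) { this.reader = reader; }

        boolean advance() throws IOException {
            String line = reader.readLine();
            if (line == null) return false;
            current = Long.parseLong(line);
            return true;
        }

        @Override
        public int compareTo(RunCursor other) { return Long.compare(current, other.current); }
    }

    /** Packs an edge (non-negative ids assumed) so that long order is (from, to) order. */
    static long pack(int from, int to) { return ((long) from << 32) | (to & 0xFFFFFFFFL); }
    static int from(long edge) { return (int) (edge >>> 32); }
    static int to(long edge) { return (int) edge; }

    /** Feeds all edges to sink in sorted order, buffering at most maxInMemory edges at once. */
    static void sort(Iterator<Long> edges, int maxInMemory, LongConsumer sink) {
        if (maxInMemory < 1) throw new IllegalArgumentException("maxInMemory must be >= 1");
        try {
            List<Path> runs = new ArrayList<>();
            List<Long> buffer = new ArrayList<>(maxInMemory);
            while (edges.hasNext()) {
                buffer.add(edges.next());
                if (buffer.size() == maxInMemory || !edges.hasNext()) {
                    runs.add(writeRun(buffer)); // buffer is full or input is done
                    buffer.clear();
                }
            }
            merge(runs, sink);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Sorts the buffer and writes it to a temp file, one edge per line. */
    private static Path writeRun(List<Long> buffer) throws IOException {
        Collections.sort(buffer);
        Path run = Files.createTempFile("run", ".txt");
        try (BufferedWriter writer = Files.newBufferedWriter(run)) {
            for (long edge : buffer) {
                writer.write(Long.toString(edge));
                writer.newLine();
            }
        }
        return run;
    }

    /** K-way merge: the heap always holds the smallest unread edge of each run. */
    private static void merge(List<Path> runs, LongConsumer sink) throws IOException {
        PriorityQueue<RunCursor> heap = new PriorityQueue<>();
        for (Path run : runs) {
            RunCursor cursor = new RunCursor(Files.newBufferedReader(run));
            if (cursor.advance()) heap.add(cursor); else cursor.reader.close();
        }
        while (!heap.isEmpty()) {
            RunCursor smallest = heap.poll();
            sink.accept(smallest.current);
            if (smallest.advance()) heap.add(smallest); else smallest.reader.close();
        }
        for (Path run : runs) Files.delete(run);
    }
}
```

During the merge, memory use depends on the number of runs rather than on the number of edges. A real implementation would probably use a binary run format and close the readers on failure; both are left out here for brevity.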
13
Friend Recommendation
- In the third exercise, you will implement several different techniques to recommend friends for a given node in the network
  1. Based on common neighbors
  2. Based on common description
  3. Based on your own creativity
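The first two techniques can be sketched as follows. The slide does not define either precisely, so the sketch makes its own assumptions: a "neighbor" is someone a person likes (an out-edge), candidates the user already likes are skipped, and "common description" is measured as Jaccard similarity over the words of the two descriptions. The names `FriendRecommender`, `byCommonNeighbors` and `descriptionSimilarity` are hypothetical; check the exercise definitions for the actual requirements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Two simple recommendation scores. Names and scoring choices are illustrative. */
class FriendRecommender {
    /**
     * Ranks candidates by how many liked people they share with the user.
     * likes maps each person id to the set of ids that person likes.
     * Returns at most k ids, best first; ties are broken by smaller id.
     */
    static List<Integer> byCommonNeighbors(Map<Integer, Set<Integer>> likes, int user, int k) {
        Set<Integer> mine = likes.getOrDefault(user, Collections.emptySet());
        Map<Integer, Integer> score = new HashMap<>();
        for (Map.Entry<Integer, Set<Integer>> entry : likes.entrySet()) {
            int candidate = entry.getKey();
            if (candidate == user || mine.contains(candidate)) {
                continue; // skip the user and people the user already likes
            }
            int common = 0;
            for (int neighbor : entry.getValue()) {
                if (mine.contains(neighbor)) common++;
            }
            if (common > 0) score.put(candidate, common);
        }
        List<Integer> ranked = new ArrayList<>(score.keySet());
        ranked.sort((a, b) -> {
            int byScore = Integer.compare(score.get(b), score.get(a));
            return byScore != 0 ? byScore : Integer.compare(a, b);
        });
        return new ArrayList<>(ranked.subList(0, Math.min(k, ranked.size())));
    }

    /** Jaccard similarity of the two descriptions' word sets, in [0, 1]. */
    static double descriptionSimilarity(String a, String b) {
        Set<String> wordsA = words(a);
        Set<String> wordsB = words(b);
        if (wordsA.isEmpty() && wordsB.isEmpty()) return 0.0;
        Set<String> shared = new HashSet<>(wordsA);
        shared.retainAll(wordsB);
        Set<String> all = new HashSet<>(wordsA);
        all.addAll(wordsB);
        return (double) shared.size() / all.size();
    }

    /** Lower-cases the text and splits it on anything that is not a letter or digit. */
    private static Set<String> words(String text) {
        Set<String> result = new HashSet<>();
        for (String word : text.toLowerCase().split("[^a-z0-9]+")) {
            if (!word.isEmpty()) result.add(word);
        }
        return result;
    }
}
```

This version scans every node in memory, which is fine for a small example; with an on-disk index the candidate set would more likely be gathered from the adjacency lists of the user's neighbors.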
14
Honor Code
- Read the exercise overview for our honor code and code reuse policy
- Violations of the course honor code and/or code reuse policy will have severe consequences