Published by Julianna Morris. Modified over 9 years ago.
1
CPS 216: Advanced Database Systems
Shivnath Babu
2
Minor Change to Course Logistics
Grading:
– Project: 35% (was 40%)
– Homework Assignments: 15%
– Midterm: 25% (was 20%)
– Final: 25%
3
Presentation & Report on “Big Data”
6 topics, 2 students per topic.
– Let us try to form groups in class. Otherwise, email your ranked preferences and Shivnath will form the groups.
Shivnath will give some initial pointers.
Get more information (use the Web, books, library, etc.).
Do a 10-minute in-class presentation on Thu 9/24.
Submit a detailed report that will be read by all students.
Presentation and report will be graded as part of the project.
4
“Big Data” Topics
1. MapReduce vs. Databases, Hive, Hybrid approaches
2. Parallel Databases: Old (Gamma) and New (Greenplum, Aster Data, HadoopDB)
3. HBase and databases over HDFS, Google File System, Google BigTable
4. Pig and other higher-level languages (Scope, Dryad)
5. Optimization of MapReduce programs: Hadoop Scheduling, Resource allocation
6. Key-Value stores (Amazon Dynamo, Cassandra)
5
The Duke CS Hadoop Cluster
See the project web page for access instructions. I will try to give an introduction in class.
Programming component of Homework 1 will be done on the Hadoop cluster
– Implement a MapReduce program to compute the average temperature per year over the NCDC data
– Submit sources (Java files and Jar file)
– Due on Tuesday 9/22
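To make the shape of the average-temperature computation concrete, here is a minimal sketch in plain Java that imitates the map, shuffle, and reduce steps in memory. It is not the Hadoop API and not the expected homework solution. The class name `AverageTemperature`, the method names, and the simplified `year,temperature` input format are all assumptions made for illustration; the real NCDC records use a different layout that a submission would have to parse.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical in-memory imitation of a MapReduce job (no Hadoop dependency).
public class AverageTemperature {

    // Map step: turn one "year,temperature" line (assumed format) into a
    // (year, temperature) pair. Returns null for lines that cannot be parsed.
    static Map.Entry<Integer, Double> map(String line) {
        String[] parts = line.split(",");
        if (parts.length != 2) {
            return null;
        }
        try {
            int year = Integer.parseInt(parts[0].trim());
            double temp = Double.parseDouble(parts[1].trim());
            return Map.entry(year, temp);
        } catch (NumberFormatException e) {
            return null;
        }
    }

    // Reduce step: average all temperatures collected for a single year.
    static double reduce(List<Double> temps) {
        double sum = 0.0;
        for (double t : temps) {
            sum += t;
        }
        return sum / temps.size();
    }

    // Driver: apply map to every line, group values by year (the "shuffle"),
    // then apply reduce to each group.
    static Map<Integer, Double> run(List<String> lines) {
        Map<Integer, List<Double>> grouped = new TreeMap<>();
        for (String line : lines) {
            Map.Entry<Integer, Double> kv = map(line);
            if (kv == null) {
                continue; // skip malformed records
            }
            grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                   .add(kv.getValue());
        }
        Map<Integer, Double> averages = new TreeMap<>();
        for (Map.Entry<Integer, List<Double>> e : grouped.entrySet()) {
            averages.put(e.getKey(), reduce(e.getValue()));
        }
        return averages;
    }

    public static void main(String[] args) {
        // Made-up sample records, not real NCDC data.
        List<String> sample = List.of(
            "1949,11.1", "1950,0.0", "1950,2.2", "1949,7.8", "bad line");
        run(sample).forEach((year, avg) -> System.out.println(year + "\t" + avg));
    }
}
```

On a real cluster the grouping by year is done by the framework between the map and reduce phases, so only the per-record parsing and the per-year averaging would need to be written by hand.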