1
Poly Hadoop
CSC 550
April 26, 2007
Scott Griffin, Daniel Jackson, Alexander Sideropoulos, Anton Snisarenko
2
Grid Computing
- High Performance Computing
- Cluster
- Network of Resources
  - CPUs, Applications, Data and Storage
- Common Interface
3
Map Reduce
- Google
- Map Function
- Reduce Function
- Examples
  - Word Frequencies
  - Hyperlink Source/Target Tree
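The word-frequency example named on this slide can be sketched in a few lines. This is a minimal single-process illustration of the map, group, and reduce steps, not Hadoop or Google code; the function names `map_words`, `reduce_counts`, and `run_word_frequency` and the sample sentences are hypothetical.

```python
from collections import defaultdict


def map_words(document):
    """Map step: emit a (word, 1) pair for every word in the document."""
    for word in document.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce step: sum all counts emitted for one word."""
    return word, sum(counts)


def run_word_frequency(documents):
    """Run map, group pairs by key, then reduce, all in one process."""
    grouped = defaultdict(list)
    for document in documents:
        for word, count in map_words(document):
            grouped[word].append(count)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())


# Hypothetical input: three tiny "documents".
frequencies = run_word_frequency(["the quick fox", "the lazy dog", "The end"])
print(frequencies)
```

In a real cluster the grouping step is done by the framework between the map and reduce phases; here a dictionary stands in for it.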
4
Hadoop
- Open Source Java Framework
- Map/Reduce Paradigm
- Cluster
- Commodity Hardware
- HDFS
5
Our Project
- Set up Hadoop
- BladeCenter
  - 10 Physical Nodes
- VMware: Grid of Virtual Nodes
  - 1, 2, 4, 8 Virtual Nodes per Physical Node
6
Experimental Goals
- Feasibility
- Performance
- Ease of Deployment
- Limits
7
Large Dataset
- Netflix Prize: Movie/User/Rating Database
- Calculate Average Rating per User
  - Map: For every rating, emit
  - Reduce: Average every rating for a given user, emit
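The average-rating job described on this slide might look roughly like the sketch below. The slide text does not preserve what each step emits, so the `(user_id, rating)` and `(user_id, average)` pairs are assumptions, as are the `(movie_id, user_id, rating)` record layout, the function names, and the sample data. The code runs in a single process and only imitates the grouping that Hadoop would perform between the two phases.

```python
from collections import defaultdict


def map_rating(record):
    """Map step: for every rating, emit (user_id, rating). Assumed pair."""
    movie_id, user_id, rating = record  # assumed record layout
    yield user_id, rating


def reduce_average(user_id, ratings):
    """Reduce step: average every rating seen for one user."""
    return user_id, sum(ratings) / len(ratings)


def average_rating_per_user(records):
    """Run map, group ratings by user, then reduce, all in one process."""
    grouped = defaultdict(list)
    for record in records:
        for user_id, rating in map_rating(record):
            grouped[user_id].append(rating)
    return dict(reduce_average(user, ratings) for user, ratings in grouped.items())


# Hypothetical sample: (movie_id, user_id, rating)
sample = [(1, "u1", 4), (2, "u1", 2), (1, "u2", 5)]
averages = average_rating_per_user(sample)
print(averages)
```

Keying on the user is what lets each reducer see all of one user's ratings together, regardless of which movie they belong to.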
8
Related Work
- UCSB
  - Hadoop on Xen
- University of Washington
  - CSE 490: Class projects in Hadoop
9
Timeline
- Week 5-6
  - Install/Configure Environment
  - Develop Code
- Week 7-8
  - Run Experiments
- Week 9-10
  - Analyze Data
  - Write Paper
  - Present Results
10
Questions?