Poly Hadoop CSC 550 April 26, 2007 Scott Griffin Daniel Jackson Alexander Sideropoulos Anton Snisarenko.


1 Poly Hadoop CSC 550 April 26, 2007 Scott Griffin Daniel Jackson Alexander Sideropoulos Anton Snisarenko

2 Grid Computing High Performance Computing Cluster Network of Resources CPUs, Applications, Data and Storage Common Interface

3 Map Reduce Google Map Function Reduce Function Examples Word Frequencies Hyperlink Source/Target Tree
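The word-frequency example named on this slide can be sketched in a few lines of plain Python. This is a single-process illustration of the map/reduce idea, not Hadoop code: the names `map_words`, `reduce_counts`, and `run_mapreduce` are hypothetical, and the input lines are made up for the example.

```python
from collections import defaultdict


def map_words(line):
    """Map: emit a (word, 1) pair for each word in one line of text."""
    for word in line.lower().split():
        yield word, 1


def reduce_counts(word, counts):
    """Reduce: sum all the counts that were emitted for one word."""
    return word, sum(counts)


def run_mapreduce(records, mapper, reducer):
    """Hypothetical in-memory driver: map every record, group by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in groups.items())


# Made-up input, standing in for lines of a much larger text corpus.
word_counts = run_mapreduce(["the cat sat", "the dog sat"], map_words, reduce_counts)
print(word_counts)
```

In a real framework the grouping step runs across machines; here a dictionary stands in for it so the two user-supplied functions are easy to see.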

4 Hadoop Open Source Java Framework Map/Reduce Paradigm Cluster Commodity Hardware HDFS

5 Our Project Setup Hadoop BladeCenter 10 Physical Nodes VMware: Grid of Virtual Nodes 1, 2, 4, 8 Virtual Nodes per Physical Node

6 Experimental Goals Feasibility Performance Ease of Deployment Limits

7 Large Dataset Netflix Prize: Movie/User/Rating Database Calculate Average Rating per User Map: For every rating, emit (user, rating) Reduce: Average every rating for a given user, emit (user, average rating)
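The average-rating job can be sketched as follows, assuming the key/value pairs are (user, rating) out of the map step and (user, average rating) out of the reduce step. The sample records, the field order, and the function names `map_rating` and `reduce_average` are hypothetical; sorting plus `itertools.groupby` stands in for the shuffle that a framework such as Hadoop would perform between the two phases.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample records in (movie_id, user_id, rating) order.
ratings = [
    (1, "u1", 4),
    (1, "u2", 2),
    (2, "u1", 5),
    (3, "u2", 3),
    (3, "u3", 1),
]


def map_rating(record):
    """Map: for one rating record, emit a (user, rating) pair."""
    _movie_id, user_id, rating = record
    return user_id, rating


def reduce_average(user_id, user_ratings):
    """Reduce: average all ratings for one user, emit (user, average)."""
    values = list(user_ratings)
    return user_id, sum(values) / len(values)


# Stand-in for the shuffle phase: sort the mapped pairs by user so that
# groupby sees each user's ratings as one contiguous run.
pairs = sorted((map_rating(r) for r in ratings), key=itemgetter(0))

averages = dict(
    reduce_average(user, (rating for _, rating in group))
    for user, group in groupby(pairs, key=itemgetter(0))
)
print(averages)
```

Because each user's ratings reduce independently, the work partitions naturally by user, which is the property the experiment relies on when spreading the job across nodes.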

8 Related Work UCSB Hadoop on XEN University of Washington CSE 490 – Class projects in Hadoop

9 Timeline Weeks 5-6: Install/Configure Environment, Develop Code. Weeks 7-8: Run Experiments. Weeks 9-10: Analyze Data, Write Paper, Present Results.

10 Questions?

