Hadoop-Harp Applications Performance Analysis on Big Red II

Slides:



Advertisements
Similar presentations
A MapReduce Workflow System for Architecting Scientific Data Intensive Applications By Phuong Nguyen and Milton Halem phuong3 or 1.
Advertisements

Chapter 4 Partition (3) Double Partition Ding-Zhu Du.
Graphs: basic definitions and notation Definition A (undirected, unweighted) graph is a pair G = (V, E), where V = {v 1, v 2,..., v n } is a set of vertices,
Simple Graph Warmup. Cycles in Simple Graphs A cycle in a simple graph is a sequence of vertices v 0, …, v n for some n>0, where v 0, ….v n-1 are distinct,
Breadth-First Search Seminar – Networking Algorithms CS and EE Dept. Lulea University of Technology 27 Jan Mohammad Reza Akhavan.
APACHE GIRAPH ON YARN Chuan Lei and Mohammad Islam.
DISC-Finder: A distributed algorithm for identifying galaxy clusters.
Applications of Depth-First Search
CS267 L15 Graph Partitioning II.1 Demmel Sp 1999 CS 267 Applications of Parallel Computers Lecture 15: Graph Partitioning - II James Demmel
Iterative computation is a kernel function to many data mining and data analysis algorithms. Missing in current MapReduce frameworks is collective communication,
Distance Approximating Trees in Graphs
Chapter 4 Performance. Times User CPU time – Time that the CPU is executing the program System CPU time – time the CPU is executing OS routines for the.
1 Min-cut for Undirected Graphs Given an undirected graph, a global min-cut is a cut (S,V-S) minimizing the number of crossing edges, where a crossing.
Ahsanul Haque *, Swarup Chandra *, Latifur Khan * and Charu Aggarwal + * Department of Computer Science, University of Texas at Dallas + IBM T. J. Watson.
Ahsanul Haque *, Swarup Chandra *, Latifur Khan * and Michael Baron + * Department of Computer Science, University of Texas at Dallas + Department of Mathematical.
H O R I Z O N T AL VERTICAL © 2007 by S - Squared, Inc. All Rights Reserved.
Drawing Straight Line Graphs
Algorithms and Running Time Algorithm: Well defined and finite sequence of steps to solve a well defined problem. Eg.,, Sequence of steps to multiply two.
CSE 326: Data Structures NP Completeness Ben Lerner Summer 2007.
Harp: Collective Communication on Hadoop Bingjing Zhang, Yang Ruan, Judy Qiu.
Daphne Koller Message Passing Belief Propagation Algorithm Probabilistic Graphical Models Inference.
1 12/2/2015 MATH 224 – Discrete Mathematics Formally a graph is just a collection of unordered or ordered pairs, where for example, if {a,b} G if a, b.
 Based on observed functioning of human brain.  (Artificial Neural Networks (ANN)  Our view of neural networks is very simplistic.  We view a neural.
Introduction to Graph Theory
A Visual and Statistical Benchmark for Graph Sampling Methods Fangyan Zhang 1 Song Zhang 1 Pak Chung Wong 2 J. Edward Swan II 1 T.J. Jankun-Kelly 1 1 Mississippi.
Motivation: Sorting is among the fundamental problems of computer science. Sorting of different datasets is present in most applications, ranging from.
Notes Over 2.1 Graphing a Linear Equation Graph the equation.
David Stotts Computer Science Department UNC Chapel Hill.
Source: CSE 214 – Computer Science II Graphs.
Dijkstra animation. Dijksta’s Algorithm (Shortest Path Between 2 Nodes) 2 Phases:initialization;iteration Initialization: 1. Included:(Boolean) 2. Distance:(Weight)
Uses some of the slides for chapters 3 and 5 accompanying “Introduction to Parallel Computing”, Addison Wesley, 2003.
Graph Concepts Illustrated Using The Leda Library Amanuel Lemma CS252 Algorithms.
Hw. 6: Algorithm for finding strongly connected components. Original digraph as drawn in our book and in class: Preorder label : Postorder label Nodes:
Date of download: 6/27/2016 Copyright © 2016 SPIE. All rights reserved. Execution overview of MapReduce. Figure Legend: From: Parallel optimization of.
PERFORMANCE EVALUATIONS
A simple parallel algorithm for the MIS problem
Parallel and Perpendicular Lines
Lines that point down-hill have a negative slope.
Graphing Linear Functions
Minimum Spanning Tree Chapter 13.6.
Case Study 2- Parallel Breadth-First Search Using OpenMP
Notes Over 4.2 Is a Solution Verifying Solutions of an Equation
I590 Data Science Curriculum August
Linear Equations in two variables
Data Science Curriculum March
CSE 421: Introduction to Algorithms
Hyunchul Park, Kevin Fan, Manjunath Kudlur,Scott Mahlke
Graphs Representation, BFS, DFS
CIS 700: “algorithms for Big Data”
CSCE569 Parallel Computing
كلية المجتمع الخرج البرمجة - المستوى الثاني
Transformations of graphs
Scalable Parallel Interoperable Data Analytics Library
Computing an Adjacency Matrix D. J. Foreman
Concept maps.
Parallel Applications And Tools For Cloud Computing Environments
Correctness of Edmonds-Karp
Planarity K4 (complete)
Name:______________________________
Graphing Linear Equations
Slope-Point Form of the Equation for a Linear Function
CS 584 Project Write up Poster session for final Due on day of final
Chapter 1 Graphs.
Computing an Adjacency Matrix D. J. Foreman
What is a constant function?
Graph Algorithms: Shortest Path
LINEAR & QUADRATIC GRAPHS
Graphing Linear Equations
A Visual Way to the World of Parallel Computations
Motivation Contemporary big data tools such as MapReduce and graph processing tools have fixed data abstraction and support a limited set of communication.
Presentation transcript:

Hadoop-Harp Applications Performance Analysis on Big Red II

Notes For K-Means Clustering and FR Graph Drawing Algorithm, I use execution time and speedup to explain the performance because They can be drawn in one picture. Due to the intensive computation, I find these applications are easy to achieve linear speedup when the number of nodes is small. Therefore the parallel efficiency is always around 1.

K-Means Clustering Execution Time and Speedup on Big Red II (10 Iterations Benchmark, Nodes: 8, 16, 32, 64, 128, JVM settings: -Xmx44000m –Xms44000m -XX:NewRatio=1 -XX:SurvivorRatio=98)

FR Graph Drawing Algorithm Execution Time and Speedup on Big Red II (477111 Vertices and 665599 Undirected Edges, 10 Iterations Benchmark, Nodes: 1, 8, 16, 32, 64, 128, JVM settings: -Xmx44000m –Xms44000m -XX:NewRatio=1 -XX:SurvivorRatio=98)

WDA-SMACOF Execution Time on Big Red II (Full Application Benchmark, Nodes: 8, 16, 32, 64, 128, JVM settings: -Xmx54000m -Xms54000m, -XX:NewRatio=1 -XX:SurvivorRatio=98)

WDA-SMACOF Parallel Efficiency on Big Red II (Full Application Benchmark, Nodes: 8, 16, 32, 64, 128, JVM settings: -Xmx54000m -Xms54000m -XX:NewRatio=1 -XX:SurvivorRatio=98)

WDA-SMACOF Speedup on Big Red II (Full Application Benchmark, Nodes: 8, 16, 32, 64, 128, JVM settings: -Xmx54000m -Xms54000m -XX:NewRatio=1 -XX:SurvivorRatio=98)