Hadoop-Harp Applications Performance Analysis on Big Red II

Slides:

Advertisements

Similar presentations

A MapReduce Workflow System for Architecting Scientific Data Intensive Applications By Phuong Nguyen and Milton Halem phuong3 or 1.

Advertisements

Chapter 4 Partition (3) Double Partition Ding-Zhu Du.

Graphs: basic definitions and notation Definition A (undirected, unweighted) graph is a pair G = (V, E), where V = {v 1, v 2,..., v n } is a set of vertices,

Simple Graph Warmup. Cycles in Simple Graphs A cycle in a simple graph is a sequence of vertices v 0, …, v n for some n>0, where v 0, ….v n-1 are distinct,

Breadth-First Search Seminar – Networking Algorithms CS and EE Dept. Lulea University of Technology 27 Jan Mohammad Reza Akhavan.

APACHE GIRAPH ON YARN Chuan Lei and Mohammad Islam.

DISC-Finder: A distributed algorithm for identifying galaxy clusters.

Applications of Depth-First Search

CS267 L15 Graph Partitioning II.1 Demmel Sp 1999 CS 267 Applications of Parallel Computers Lecture 15: Graph Partitioning - II James Demmel

Iterative computation is a kernel function to many data mining and data analysis algorithms. Missing in current MapReduce frameworks is collective communication,

Distance Approximating Trees in Graphs

Chapter 4 Performance. Times User CPU time – Time that the CPU is executing the program System CPU time – time the CPU is executing OS routines for the.

1 Min-cut for Undirected Graphs Given an undirected graph, a global min-cut is a cut (S,V-S) minimizing the number of crossing edges, where a crossing.

Ahsanul Haque *, Swarup Chandra *, Latifur Khan * and Charu Aggarwal + * Department of Computer Science, University of Texas at Dallas + IBM T. J. Watson.

Ahsanul Haque *, Swarup Chandra *, Latifur Khan * and Michael Baron + * Department of Computer Science, University of Texas at Dallas + Department of Mathematical.

H O R I Z O N T AL VERTICAL © 2007 by S - Squared, Inc. All Rights Reserved.

Drawing Straight Line Graphs

Algorithms and Running Time Algorithm: Well defined and finite sequence of steps to solve a well defined problem. Eg.,, Sequence of steps to multiply two.

CSE 326: Data Structures NP Completeness Ben Lerner Summer 2007.

Harp: Collective Communication on Hadoop Bingjing Zhang, Yang Ruan, Judy Qiu.

Daphne Koller Message Passing Belief Propagation Algorithm Probabilistic Graphical Models Inference.

1 12/2/2015 MATH 224 – Discrete Mathematics Formally a graph is just a collection of unordered or ordered pairs, where for example, if {a,b} G if a, b.

 Based on observed functioning of human brain.  (Artificial Neural Networks (ANN)  Our view of neural networks is very simplistic.  We view a neural.

Introduction to Graph Theory

A Visual and Statistical Benchmark for Graph Sampling Methods Fangyan Zhang 1 Song Zhang 1 Pak Chung Wong 2 J. Edward Swan II 1 T.J. Jankun-Kelly 1 1 Mississippi.

Motivation: Sorting is among the fundamental problems of computer science. Sorting of different datasets is present in most applications, ranging from.

Notes Over 2.1 Graphing a Linear Equation Graph the equation.

David Stotts Computer Science Department UNC Chapel Hill.

Source: CSE 214 – Computer Science II Graphs.

Dijkstra animation. Dijksta’s Algorithm (Shortest Path Between 2 Nodes) 2 Phases:initialization;iteration Initialization: 1. Included:(Boolean) 2. Distance:(Weight)

Uses some of the slides for chapters 3 and 5 accompanying “Introduction to Parallel Computing”, Addison Wesley, 2003.

Graph Concepts Illustrated Using The Leda Library Amanuel Lemma CS252 Algorithms.

Hw. 6: Algorithm for finding strongly connected components. Original digraph as drawn in our book and in class: Preorder label : Postorder label Nodes:

Date of download: 6/27/2016 Copyright © 2016 SPIE. All rights reserved. Execution overview of MapReduce. Figure Legend: From: Parallel optimization of.

PERFORMANCE EVALUATIONS

A simple parallel algorithm for the MIS problem

Parallel and Perpendicular Lines

Lines that point down-hill have a negative slope.

Graphing Linear Functions

Minimum Spanning Tree Chapter 13.6.

Case Study 2- Parallel Breadth-First Search Using OpenMP

Notes Over 4.2 Is a Solution Verifying Solutions of an Equation

I590 Data Science Curriculum August

Linear Equations in two variables

Data Science Curriculum March

CSE 421: Introduction to Algorithms

Hyunchul Park, Kevin Fan, Manjunath Kudlur,Scott Mahlke

Graphs Representation, BFS, DFS

CIS 700: “algorithms for Big Data”

CSCE569 Parallel Computing

كلية المجتمع الخرج البرمجة - المستوى الثاني

Transformations of graphs

Scalable Parallel Interoperable Data Analytics Library

Computing an Adjacency Matrix D. J. Foreman

Parallel Applications And Tools For Cloud Computing Environments

Correctness of Edmonds-Karp

Planarity K4 (complete)

Name:______________________________

Graphing Linear Equations

Slope-Point Form of the Equation for a Linear Function

CS 584 Project Write up Poster session for final Due on day of final

Chapter 1 Graphs.

Computing an Adjacency Matrix D. J. Foreman

What is a constant function?

Graph Algorithms: Shortest Path

LINEAR & QUADRATIC GRAPHS

Graphing Linear Equations

A Visual Way to the World of Parallel Computations

Motivation Contemporary big data tools such as MapReduce and graph processing tools have fixed data abstraction and support a limited set of communication.

Presentation transcript:

Hadoop-Harp Applications Performance Analysis on Big Red II

Notes For K-Means Clustering and FR Graph Drawing Algorithm, I use execution time and speedup to explain the performance because They can be drawn in one picture. Due to the intensive computation, I find these applications are easy to achieve linear speedup when the number of nodes is small. Therefore the parallel efficiency is always around 1.

K-Means Clustering Execution Time and Speedup on Big Red II (10 Iterations Benchmark, Nodes: 8, 16, 32, 64, 128, JVM settings: -Xmx44000m –Xms44000m -XX:NewRatio=1 -XX:SurvivorRatio=98)

FR Graph Drawing Algorithm Execution Time and Speedup on Big Red II (477111 Vertices and 665599 Undirected Edges, 10 Iterations Benchmark, Nodes: 1, 8, 16, 32, 64, 128, JVM settings: -Xmx44000m –Xms44000m -XX:NewRatio=1 -XX:SurvivorRatio=98)

WDA-SMACOF Execution Time on Big Red II (Full Application Benchmark, Nodes: 8, 16, 32, 64, 128, JVM settings: -Xmx54000m -Xms54000m, -XX:NewRatio=1 -XX:SurvivorRatio=98)

WDA-SMACOF Parallel Efficiency on Big Red II (Full Application Benchmark, Nodes: 8, 16, 32, 64, 128, JVM settings: -Xmx54000m -Xms54000m -XX:NewRatio=1 -XX:SurvivorRatio=98)

WDA-SMACOF Speedup on Big Red II (Full Application Benchmark, Nodes: 8, 16, 32, 64, 128, JVM settings: -Xmx54000m -Xms54000m -XX:NewRatio=1 -XX:SurvivorRatio=98)