Scaling Spark on HPC Systems

Scaling Spark on HPC Systems Presented by: Jerrod Dixon

Outline: HDFS vs Lustre, MapReduce vs Spark, Spark on HPC, Experimental Setup, Results, Conclusions

HDFS vs Lustre

Hadoop HDFS: a distributed filesystem with multi-node block replication; clients communicate directly with the NameNode for file metadata.

Lustre: a very popular filesystem on HPC systems. Built around a Management Server (MGS), a Metadata Server (MDS), and Object Storage Servers (OSS).

Lustre: full POSIX support. The Metadata Server tells clients where the parts of a file object are located, and the clients then connect directly.

MapReduce vs Spark

MapReduce: the typical method of processing data on HDFS. Maps the data in files to key-value pairs, then reduces each unique key to a single value.
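As a concrete illustration of the map-to-key-pairs / reduce-by-key pattern, here is a minimal word-count sketch. It is expressed with Spark's Scala RDD API rather than the Hadoop MapReduce Java API, and the HDFS input and output paths are placeholders.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("WordCount"))

    // Map phase: split each line into words and emit (word, 1) key-value pairs.
    val pairs = sc.textFile("hdfs:///data/input.txt")   // placeholder path
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))

    // Reduce phase: sum the values for each unique key.
    val counts = pairs.reduceByKey(_ + _)

    counts.saveAsTextFile("hdfs:///data/output")         // placeholder path
    sc.stop()
  }
}
```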

Spark: similar in overall methodology to MapReduce, but maintains its working data in memory and distributes data across global and local scopes.

Spark – Vertical Data: processes from disk only when final results are requested; pulls from the filesystem and works against the data in a batch methodology.

Spark – Horizontal Data: distributes work across nodes as the data is processed; a similar distribution to HDFS replication, but the data is force-kept in memory.

Spark: operates primarily on Resilient Distributed Datasets (RDDs). Map operations can be nested but are lazy; a reduce operation forces processing; a caching method forces the mapped data into memory. Here, 'lazy' means that Spark does not execute transformations until the data is actually needed.
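A minimal sketch of this lazy-evaluation and caching behaviour, assuming a placeholder input path on the shared filesystem:

```scala
import org.apache.spark.{SparkConf, SparkContext}

object LazyDemo {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("LazyDemo"))

    // Transformations are lazy: nothing is read or computed at this point.
    val mapped = sc.textFile("/scratch/lustre/data.txt")   // placeholder path
      .map(_.toUpperCase)
      .filter(_.nonEmpty)

    // cache() marks the RDD to be kept in memory once it is materialised.
    mapped.cache()

    // An action (count) forces the whole chain of transformations to run.
    val n = mapped.count()
    println(s"lines: $n")

    // A second action reuses the cached partitions instead of re-reading the file.
    mapped.take(5).foreach(println)

    sc.stop()
  }
}
```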

Spark on HPC

Spark on HPC: Spark is designed for HDFS. It works on data in batches, expects partial data on local disk, executes jobs only as results are requested, and relies on vertical data movement.
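Because HPC compute nodes have no local HDFS, one practical knob when running Spark on such systems is where intermediate (shuffle/spill) data is written. The sketch below uses Spark's real `spark.local.dir` property; the `/dev/shm/spark` and Lustre scratch paths are assumptions for illustration, not the paper's exact configuration.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object HpcSparkContext {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("SparkOnLustre")
      // Shuffle/spill directory: node-local memory-mapped storage (shm)
      // rather than the shared Lustre scratch space. Paths are examples only.
      .set("spark.local.dir", "/dev/shm/spark")
      // Alternative (shared, typically slower):
      // .set("spark.local.dir", "/scratch/lustre/spark-tmp")

    val sc = new SparkContext(conf)
    println(s"local dirs: ${sc.getConf.get("spark.local.dir")}")
    sc.stop()
  }
}
```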

Experimental Setup

Hardware: Edison and Cori, Cray XC supercomputers at NERSC. Edison has 5,576 compute nodes, each with two 2.4 GHz 12-core Intel "Ivy Bridge" processors. Cori has 1,630 compute nodes, each with two 2.3 GHz 16-core Intel "Haswell" processors.

Edison cluster: leverages a standard Lustre implementation with a single MDS and a single MDT.

Cori cluster: leverages Lustre plus a BurstBuffer layer that accelerates I/O performance.

BurstBuffer: sits between memory and Lustre, storing frequently accessed files to improve I/O.

Results

Single node: a clear bottleneck in communicating with disk.

Multi-node file I/O

BurstBuffer

GroupBy benchmark: 16 nodes (384 cores) on Edison, weak scaling. Partitions must be exchanged with other partitions (an all-to-all shuffle); shm = memory-mapped storage.
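For orientation, a rough sketch of what a GroupBy-style benchmark looks like: random key-value pairs are generated in every partition, and a groupByKey forces a full shuffle so each partition exchanges data with the others. The partition and record counts here are arbitrary, not the paper's parameters.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.util.Random

object GroupByBenchmark {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("GroupByBenchmark"))

    val numPartitions = 384          // e.g. one partition per core; arbitrary here
    val pairsPerPartition = 100000   // arbitrary sizing, not the paper's

    // Generate (key, value) pairs spread across all partitions.
    val pairs = sc.parallelize(0 until numPartitions, numPartitions).flatMap { _ =>
      val rng = new Random()
      (0 until pairsPerPartition).map(_ => (rng.nextInt(1000), rng.nextInt()))
    }

    // groupByKey forces a shuffle: every partition exchanges data with the others.
    val start = System.nanoTime()
    val groups = pairs.groupByKey().count()
    val secs = (System.nanoTime() - start) / 1e9
    println(f"groups: $groups, shuffle time: $secs%.2f s")

    sc.stop()
  }
}
```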

GroupBy benchmark: Cori-specific results.

Impact of BurstBuffer: an increase in mean operation time, but lower variability in access time.

Conclusions

Conclusions: no mention of .persist() or .cache(), Spark's memory-management calls for preserving processed partitions against eviction. .cache() is a mask over .persist() with the bare default parameters (MEMORY_ONLY mode).
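For reference, a short sketch of how the two calls relate: .persist() takes an explicit StorageLevel, while .cache() is simply .persist() with the default MEMORY_ONLY level. The input path is a placeholder.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

object PersistVsCache {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("PersistVsCache"))

    val rdd = sc.textFile("/scratch/lustre/data.txt")   // placeholder path
      .map(_.length)

    // Explicit storage level: keep partitions in memory, spilling to disk
    // if they do not fit.
    rdd.persist(StorageLevel.MEMORY_AND_DISK)

    // cache() is shorthand for persist(StorageLevel.MEMORY_ONLY):
    // partitions that do not fit in memory are recomputed, not spilled.
    val cached = rdd.map(_ * 2).cache()

    println(rdd.count())     // first action materialises and persists rdd
    println(cached.count())  // first action materialises and caches cached

    sc.stop()
  }
}
```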

Conclusions: clear limitations to using Lustre as the filesystem. Increases in access time and decreases in processing; the BurstBuffer helps, but only with a certain number of nodes. No discussion of Spark methods to overcome these issues.

Issues: weak scaling is covered extensively, but strong scaling is barely covered at all. There is no comparison to equivalent work on an HDFS system; since Spark is designed for HDFS, comparing the HPC results against a standard HDFS implementation would seem the natural baseline.

Questions?