ORNL is managed by UT-Battelle for the US Department of Energy Spark On Demand Deploying on Rhea Dale Stansberry John Harney Advanced Data and Workflows.

Slides:



Advertisements
Similar presentations
Making Fly Parviz Deyhim
Advertisements

© 2007 IBM Corporation IBM Global Engineering Solutions IBM Blue Gene/P Job Submission.
O’Reilly – Hadoop: The Definitive Guide Ch.5 Developing a MapReduce Application 2 July 2010 Taewhi Lee.
Setting up of condor scheduler on computing cluster Raman Sehgal NPD-BARC.
Spark Lightning-Fast Cluster Computing UC BERKELEY.
Introduction to Hadoop Programming Bryon Gill, Pittsburgh Supercomputing Center.
Spark: Cluster Computing with Working Sets
Introduction to Spark Shannon Quinn (with thanks to Paco Nathan and Databricks)
Hadoop(MapReduce) in the Wild —— Our current understandings & uses of Hadoop Le Zhao, Changkuk Yoo, Mark Hoy, Jamie Callan Presenter: Le Zhao
Mesos A Platform for Fine-Grained Resource Sharing in Data Centers Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy.
1.1 Installing Windows Server 2008 Windows Server 2008 Editions Windows Server 2008 Installation Requirements X64 Installation Considerations Preparing.
Hands-On Microsoft Windows Server 2003 Chapter 2 Installing Windows Server 2003, Standard Edition.
Patrick Wendell Databricks Deploying and Administering Spark.
Module 2: Planning to Install SQL Server. Overview Hardware Installation Considerations SQL Server 2000 Editions Software Installation Considerations.
Reproducible Environment for Scientific Applications (Lab session) Tak-Lon (Stephen) Wu.
Deployment Management The following screens demonstrate how to: 1. Access and view deployments 2. Create a new local deployment 3. Create and modify a.
Snippet Management The following screens demonstrate how to: 1. Access and view snippets 2. Create a local standard snippet, or a local class snippet 3.
JGI/NERSC New Hardware Training Kirsten Fagnan, Seung-Jin Sul January 10, 2013.
Applying Twister to Scientific Applications CloudCom 2010 Indianapolis, Indiana, USA Nov 30 – Dec 3, 2010.
6/1/2001 Supplementing Aleph Reports Using The Crystal Reports Web Component Server Presented by Bob Gerrity Head.
Welcome Thank you for taking our training. Collection 6421: Configure and Troubleshoot Windows Server® 2008 Network Course 6690 – 6709 at
KARMA with ProActive Parallel Suite 12/01/2009 Air France, Sophia Antipolis Solutions and Services for Accelerating your Applications.
Software Engineering for Business Information Systems (sebis) Department of Informatics Technische Universität München, Germany wwwmatthes.in.tum.de Data-Parallel.
Bigben Pittsburgh Supercomputing Center J. Ray Scott
Module 1: Installing and Configuring Servers. Module Overview Installing Windows Server 2008 Managing Server Roles and Features Overview of the Server.
Launch SpecE8 and React from GSS. You can use the chemical analyses in a GSS data sheet to set up and run SpecE8 and React calculations. Analysis → Launch…
Introduction to Hadoop Programming Bryon Gill, Pittsburgh Supercomputing Center.
Tool Integration with Data and Computation Grid GWE - “Grid Wizard Enterprise”
Grid Computing at Yahoo! Sameer Paranjpye Mahadev Konar Yahoo!
Data and SQL on Hadoop. Cloudera Image for hands-on Installation instruction – 2.
Software Overview Environment, libraries, debuggers, programming tools and applications Jonathan Carter NUG Training 3 Oct 2005.
Spark. Spark ideas expressive computing system, not limited to map-reduce model facilitate system memory – avoid saving intermediate results to disk –
Drush: The Drupal Shell Utility Trevor Mckeown Founder & Owner Sublime Technologies
Department of Computer Science Internet Performance Measurements using Firefox Extensions Scot L. DeDeo Professor Craig Wills.
Mark E. Fuller Senior Principal Instructor Oracle University Oracle Corporation.
SAN DIEGO SUPERCOMPUTER CENTER Inca Control Infrastructure Shava Smallen Inca Workshop September 4, 2008.
Cloud Computing Mapreduce (2) Keke Chen. Outline  Hadoop streaming example  Hadoop java API Framework important APIs  Mini-project.
Tool Integration with Data and Computation Grid “Grid Wizard 2”
Page 1 Cloud Study: Algorithm Team Mahout Introduction 박성찬 IDS Lab.
Advanced topics Cluster Training Center for Simulation and Modeling September 4, 2015.
​ TdBench 7.2 – tdb.sh Utility Script. 2 Created for TdBench 7.x release to consolidate tools Open architecture – looks for scripts in the./tools directory.
Scaling up R computation with high performance computing resources.
Centre de Calcul de l’Institut National de Physique Nucléaire et de Physique des Particules Apache Spark Osman AIDEL.
Planning Server Deployments Chapter 1. Server Deployment When planning a server deployment for a large enterprise network, the operating system edition.
Learn. Hadoop Online training course is designed to enhance your knowledge and skills to become a successful Hadoop developer and In-depth knowledge of.
Advanced Computing Facility Introduction
Outline Installing Gem5 SPEC2006 for Gem5 Configuring Gem5.
Big Data is a Big Deal!.
Sushant Ahuja, Cassio Cristovao, Sameep Mohta
Hadoop Architecture Mr. Sriram
ITCS-3190.
Speaker’s Name/Department (delete if not needed) Month 00, 2017
Working With Azure Batch AI
Hadoop Tutorials Spark
Spark Presentation.
Data Platform and Analytics Foundational Training
Pyspark 최 현 영 컴퓨터학부.
Containers in HPC By Raja.
Integration of Singularity With Makeflow
Short Read Sequencing Analysis Workshop
Introduction to Spark.
Applying Twister to Scientific Applications
CS110: Discussion about Spark
Overview of big data tools
Spark and Scala.
AlwaysOn Availability Groups
Spark and Scala.
Containerized Spark at RBC
Bryon Gill Pittsburgh Supercomputing Center
Short Read Sequencing Analysis Workshop
Presentation transcript:

ORNL is managed by UT-Battelle for the US Department of Energy Spark On Demand Deploying on Rhea Dale Stansberry John Harney Advanced Data and Workflows Group NCCS/OLCF

2 Presentation_name Spark Configuration Spark Hadoop 2.6 binary dist Will soon be available as module on Rhea Spark configured using “Standalone” mode Spark jobs run using PBS scripts generated with OLCF spark command-line utility (spark_setup) PBS script contains Spark dynamic deployment plus custom application setup/commands

3 Presentation_name Manual Spark Install Download pre-built Spark 1.6.X + Hadoop 2.6.X from Extract archive to accessible path on Atlas (scratch or project directory) Export SPARK_HOME and add SPARK_HOME/bin to path Clone spark_on_demand project from: Copy spark_setup.py to SPARK_HOME/bin Copy spark_deploy.py to SPARK_HOME/sbin

4 Presentation_name How Spark on Demand Works (Setup) spark_setup.py script generates PBS script and provisions deployment directory –Deployment directory includes template directory containing configuration files that may be edited –Includes Rhea-appropriate values (cores, memory, etc.) Must edit generated PBS script to add application commands and optional module dependencies –PBS environment is exported to allocated nodes –Spark application(s) are launched using $SPARK_SUBMIT command –Python, Scala, and Java are supported

5 Presentation_name How Spark on Demand Works (cont.) When job is run, PBS script uses mpirun to launch a deployment script on allocated nodes –Deployment script harvests node information and creates spark configuration files in deployment directory (from templates) –Deployment script then starts Spark daemon processes Spark application(s) are launched –Application output goes to stdout –Spark driver logs to stderr by default Spark daemons are killed when PBS script ends

6 Presentation_name Options for spark_setup -? : show usage -s : PBS script name -a : account to charge -n : number of nodes (must be >= 2) -w : wall time (PBS format) -d : deployment directory (full path) --driver-mem : 64GB default --exec-mem : 4GB default

7 Presentation_name Guidelines / Comments Number of nodes must be >= 2 Deployment directory should be in scratch area Be careful re-using PBS scripts! –Number of nodes in two places –Deployment directory initialized by setup script Languages: –Scala is Spark “native” language –Python is not as well supported by some Spark libraries (MLlib) –Using Java may require translation to/from Spark/Scala objects

ORNL is managed by UT-Battelle for the US Department of Energy Spark On Demand Deploying on Rhea Demo

9 Presentation_name Initial Experiences and Benchmarks Why Spark-on-demand on Rhea? –Elimination of data movement step –Utilization of OLCF resources –Do not need dedicated cluster or HDFS (instances are created and destroyed) –Client code/Driver programs “easier” to write –Embeddable (e.g. BEAM and ICE) –Includes out-of-the-box capabilities Spark-streaming GraphX MLLib

10 Presentation_name Initial Experiences and Benchmarks When should we use Spark-on-Rhea? When not to use Spark-on-Rhea? –Working with smaller datasets –Applications that require spark to be listening for requests –MPI applications Does it exhibit expected performance benefits? What are the optimal conditions for using it? –Number of nodes, size of datasets, number of threads, etc Examples: –Wordcount –Singular Value Decomposition (from MLLib)

11 Presentation_name Applications – WordCount and Singular Value Decomposition (SVD) Preliminary Findings (experiment setup) –Parameters Number of nodes – 2,4,8,16,32, (64) WordCount: Document sizes (1,2,4,8,100 GB) SVD: Matrix sizes –Small – 5000x500 (~46MB), 5000x1000 (~92MB), 5000x2500 (~231MB) –Medium (e.g. 256x256 images) – 1000x65536 (~1,2 GB), 10000x65536 (~12 GB) –Large (e.g higher res 10000x10000 images) – 10000x (~185 GB) –Assumptions Input are text files of random values Data preparation time not considered Uses out-of-box Spark task manager (YARN? Mesos?) I/O performance not considered in measurements

12 Presentation_name Application – Wordcount

13 Presentation_name Application – Singular Value Decomposition (SVD) SVD: U,V – orthogonal matricies containing singular vectors S – diagonal containing singular values Exists for all A Numerous calculation methods Many uses: –Dimensionality Reduction/Compression –PCA (Image/Face Recognition) –Recommender systems (locating important features)

14 Presentation_name Application – Singular Value Decomposition (SVD) SVD Driver Program (Scala) Declare Context Load Data into RDD From File Compute SVD (MLLib API) Output

15 Presentation_name Application – Singular Value Decomposition (SVD)

16 Presentation_name Application – Singular Value Decomposition (SVD) SVD Driver Program (Scala) - caching

17 Presentation_name Application – Singular Value Decomposition (SVD)

18 Presentation_name Application – Singular Value Decomposition (SVD)

19 Presentation_name Discussion/Observations Results –Improved computational performance for wordcount and SVD as node count increases (in some cases) –Caching makes a huge difference (what happens when required cache is greater than available memory?) –Structure of the data can be important (Large matrix 10000x ) caused timeout error –Investigate other tunable parameters including number of threads number of open sockets serialization methods number of threads mounting options

20 Presentation_name Discussion/Observations Other observations of using spark-on-demand on Rhea –Edit the pbs script at your own risk! –Cannot take advantage of the spark shell (REPL) –More support for driver programs in scala (Streaming, MLLib) –Future investigation: I/O (Darshan) TPC-DS and BigData Benchmarks (Kay Ousterhout - Comparison with traditional spark setup (CADES with HDFS) is needed