Introduction to Scalable Programming using Makeflow and Work Queue Dinesh Rajan and Mike Albrecht University of Notre Dame October 24 and November 7, 2012.

Slides:



Advertisements
Similar presentations
Three types of remote process invocation
Advertisements

1 Real-World Barriers to Scaling Up Scientific Applications Douglas Thain University of Notre Dame Trends in HPDC Workshop Vrije University, March 2012.
Lobster: Personalized Opportunistic Computing for CMS at Large Scale Douglas Thain (on behalf of the Lobster team) University of Notre Dame CVMFS Workshop,
Experience with Adopting Clouds at Notre Dame Douglas Thain University of Notre Dame IEEE CloudCom, November 2010.
Scaling Up Without Blowing Up Douglas Thain University of Notre Dame.
1 Condor Compatible Tools for Data Intensive Computing Douglas Thain University of Notre Dame Condor Week 2011.
1 High Throughput Scientific Computing with Condor: Computer Science Challenges in Large Scale Parallelism Douglas Thain University of Notre Dame UAB 27.
1 Opportunities and Dangers in Large Scale Data Intensive Computing Douglas Thain University of Notre Dame Large Scale Data Mining Workshop at SIGKDD August.
1 Scaling Up Data Intensive Science with Application Frameworks Douglas Thain University of Notre Dame Michigan State University September 2011.
1 Models and Frameworks for Data Intensive Cloud Computing Douglas Thain University of Notre Dame IDGA Cloud Computing 8 February 2011.
1 Scaling Up Data Intensive Science to Campus Grids Douglas Thain Clemson University 25 Septmber 2009.
Mesos A Platform for Fine-Grained Resource Sharing in Data Centers Benjamin Hindman, Andy Konwinski, Matei Zaharia, Ali Ghodsi, Anthony D. Joseph, Randy.
Getting Beyond the Filesystem: New Models for Data Intensive Scientific Computing Douglas Thain University of Notre Dame HEC FSIO Workshop 6 August 2009.
Introduction to Makeflow Li Yu University of Notre Dame 1.
Building Scalable Elastic Applications using Makeflow Dinesh Rajan and Douglas Thain University of Notre Dame Tutorial at CCGrid, May Delft, Netherlands.
Building Scalable Scientific Applications using Makeflow Dinesh Rajan and Peter Sempolinski University of Notre Dame.
Building Scalable Applications on the Cloud with Makeflow and Work Queue Douglas Thain and Patrick Donnelly University of Notre Dame Science Cloud Summer.
Introduction to Makeflow and Work Queue CSE – Cloud Computing – Fall 2014 Prof. Douglas Thain.
CONDOR DAGMan and Pegasus Selim Kalayci Florida International University 07/28/2009 Note: Slides are compiled from various TeraGrid Documentations.
Portable Resource Management for Data Intensive Workflows Douglas Thain University of Notre Dame.
Design of an Active Storage Cluster File System for DAG Workflows Patrick Donnelly and Douglas Thain University of Notre Dame 2013 November 18 th DISCS-2013.
Track 1: Cluster and Grid Computing NBCR Summer Institute Session 2.2: Cluster and Grid Computing: Case studies Condor introduction August 9, 2006 Nadya.
Elastic Applications in the Cloud Dinesh Rajan University of Notre Dame CCL Workshop, June 2012.
Toward a Common Model for Highly Concurrent Applications Douglas Thain University of Notre Dame MTAGS Workshop 17 November 2013.
GXP Make Kenjiro Taura. GXP make What is it? –Another parallel & distributed make What is it useful for? –Running jobs with dependencies in parallel –Running.
Integrating HPC into the ATLAS Distributed Computing environment Doug Benjamin Duke University.
Introduction to Work Queue Applications CSE – Cloud Computing – Fall 2014 Prof. Douglas Thain.
Large Scale Sky Computing Applications with Nimbus Pierre Riteau Université de Rennes 1, IRISA INRIA Rennes – Bretagne Atlantique Rennes, France
Building Scalable Scientific Applications with Makeflow Douglas Thain and Dinesh Rajan University of Notre Dame Applied Cyber Infrastructure Concepts University.
Building Scalable Scientific Applications using Makeflow Dinesh Rajan and Douglas Thain University of Notre Dame.
The Cooperative Computing Lab  We collaborate with people who have large scale computing problems in science, engineering, and other fields.  We operate.
Introduction to Scalable Programming using Work Queue Dinesh Rajan and Ben Tovar University of Notre Dame October 10, 2013.
Evaluation of Agent Teamwork High Performance Distributed Computing Middleware. Solomon Lane Agent Teamwork Research Assistant October 2006 – March 2007.
Distributed Framework for Automatic Facial Mark Detection Graduate Operating Systems-CSE60641 Nisha Srinivas and Tao Xu Department of Computer Science.
1 Computational Abstractions: Strategies for Scaling Up Applications Douglas Thain University of Notre Dame Institute for Computational Economics University.
Introduction to Work Queue Applications Applied Cyberinfrastructure Concepts Course University of Arizona 2 October 2014 Douglas Thain and Nicholas Hazekamp.
Grid Computing at Yahoo! Sameer Paranjpye Mahadev Konar Yahoo!
Building Scalable Scientific Applications with Work Queue Douglas Thain and Dinesh Rajan University of Notre Dame Applied Cyber Infrastructure Concepts.
Review of Condor,SGE,LSF,PBS
Scalable Systems Software for Terascale Computer Centers Coordinator: Al Geist Participating Organizations ORNL ANL LBNL.
Introduction to Makeflow and Work Queue Prof. Douglas Thain, University of Notre Dame
Introduction to Scalable Programming using Work Queue Dinesh Rajan and Mike Albrecht University of Notre Dame October 24 and November 7, 2012.
WebFlow High-Level Programming Environment and Visual Authoring Toolkit for HPDC (desktop access to remote resources) Tomasz Haupt Northeast Parallel Architectures.
Building Scalable Elastic Applications using Work Queue Dinesh Rajan and Douglas Thain University of Notre Dame Tutorial at CCGrid, May Delft,
Demonstration of Scalable Scientific Applications Peter Sempolinski and Dinesh Rajan University of Notre Dame.
Building Scalable Scientific Applications with Work Queue Douglas Thain and Dinesh Rajan University of Notre Dame Applied Cyber Infrastructure Concepts.
PARALLEL AND DISTRIBUTED PROGRAMMING MODELS U. Jhashuva 1 Asst. Prof Dept. of CSE om.
HTCondor’s Grid Universe Jaime Frey Center for High Throughput Computing Department of Computer Sciences University of Wisconsin-Madison.
Introduction to Makeflow and Work Queue Nicholas Hazekamp and Ben Tovar University of Notre Dame XSEDE 15.
Apache Hadoop on Windows Azure Avkash Chauhan
Instrumenting Badi Abdul-Wahid, RJ Nowling CSE Operating Systems Professor Striegel.
HPC In The Cloud Case Study: Proteomics Workflow
HPC In The Cloud Case Study: Proteomics Workflow
Introduction to Makeflow and Work Queue
Hadoop Aakash Kag What Why How 1.
Introduction to Distributed Platforms
Working With Azure Batch AI
Spark Presentation.
Scaling Up Scientific Workflows with Makeflow
Introduction to Makeflow and Work Queue
INDIGO – DataCloud PaaS
Integration of Singularity With Makeflow
Introduction to Makeflow and Work Queue
Haiyan Meng and Douglas Thain
Introduction to Makeflow and Work Queue
Introduction to Makeflow and Work Queue
Introduction to Makeflow and Work Queue with Containers
What’s New in Work Queue
Creating Custom Work Queue Applications
g++ features, better makefiles
Presentation transcript:

Introduction to Scalable Programming using Makeflow and Work Queue Dinesh Rajan and Mike Albrecht University of Notre Dame October 24 and November 7, 2012

Go to: Click “Tutorial: Introduction to Scalable Programming”

3 I have a standard, debugged, trusted application that runs on my laptop. One simulation runs in an hour. I have to run 100. Then I have to analyze the results, tweak the simulation, and run 100 more. Can I get a single result faster? Can I get more results in the same time?

Last year, I heard about this grid thing. This year, I heard about this cloud thing. What do I do next?

Should I port my program to MPI or Hadoop? Learn C / Java Learn MPI / Hadoop Re-architect Re-write Re-test Re-debug Re-certify

What if my application looks like this?

I can get as many machines on the cloud as I want! How do I organize my application to run on those machines?

Makeflow: A Portable Workflow System

An Old Idea: Makefiles 9 part1 part2 part3: input.data split.py./split.py input.data out1: part1 mysim.exe./mysim.exe part1 >out1 out2: part2 mysim.exe./mysim.exe part2 >out2 out3: part3 mysim.exe./mysim.exe part3 >out3 result: out1 out2 out3 join.py./join.py out1 out2 out3 > result

Makeflow = Make + Workflow Provides portability across batch systems. Enable parallelism (but not too much!) Fault tolerance at multiple scales. Data and resource management. 10 Makeflow LocalCondorSGE Work Queue

Makeflow Language - Rules Each rule specifies: – a set of target files to create; – a set of source files needed to create them; – a command that generates the target files from the source files. part1 part2 part3: input.data split.py./split.py input.data out1: part1 mysim.exe./mysim.exe part1 >out1 out2: part2 mysim.exe./mysim.exe part2 >out2 out3: part3 mysim.exe./mysim.exe part3 >out3 result: out1 out2 out3 join.py./join.py out1 out2 out3 > result out1 : part1 mysim.exe mysim.exe part1 > out1

You must state all the files needed by the command.

sims.mf out.10 : in.dat calib.dat sim.exe sim.exe –p 10 in.data > out.10 out.20 : in.dat calib.dat sim.exe sim.exe –p 20 in.data > out.20 out.30 : in.dat calib.dat sim.exe sim.exe –p 30 in.data > out.30

Private Cluster Campus Condor Pool Public Cloud Provider CRC SGE Cluster Makefile Makeflow Local Files and Programs Makeflow + Batch System makeflow –T sge makeflow –T condor Work Queue

Drivers Local Condor SGE Batch Hadoop WorkQueue Torque MPI-Queue XGrid Moab

How to run a Makeflow Run a workflow locally (multicore?) – makeflow -T local sims.mf Clean up the workflow outputs: – makeflow –c sims.mf Run the workflow on SGE: – makeflow –T sge sims.mf

Hands On

Practice Problems 1.Construct a makeflow to render a short movie featuring a Rubik’s cube 2.Launch the makeflow on both your laptop and SGE 3.Consider ways you might use Makeflow for your research