OAK RIDGE NATIONAL LABORATORY
U.S. DEPARTMENT OF ENERGY

The stagesub tool
Sudharshan S. Vazhkudai
Computer Science Research Group, CSMD
Oak Ridge National Laboratory

Problem Space
- Users' job workflow (stage, compute, offload) is mired in numerous issues
- Staging and offloading are large data jobs prone to failure
- Early staging and late offloading waste scratch space
- Delayed offloading leaves result data vulnerable to purging
- Compute time is wasted on staging/offloading performed as part of the job

Solution
- Recognition that job input and output data need to be managed more efficiently, specifically in the way they are scheduled
- Automatic scheduling of data staging/offloading activities so they better coincide with computation
- Results:
  - From a center standpoint: our techniques optimize resource usage and increase data and service availability
  - From a user job standpoint: our techniques reduce job turnaround time and optimize the usage of allocated time

Coordination between staging, offloading, and computation
- Motivation
  - Lack of global coordination between the storage hierarchy and the system software
  - Problems with manual and scripted staging: human operational cost, wasted compute time and storage, and increased wait time due to resubmissions (see the sketch after this slide)
- Our approach
  - Explicit specification of I/O activities alongside computation in the job script
  - Zero-charge data transfer queue
  - Planning and orchestration
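
The manual pattern being criticized typically looks like the sketch below: transfers run inside the compute allocation, so the allocated nodes idle while data moves, and a failed transfer forces a hand resubmission. Hostnames, paths, and the application name are hypothetical.

    #!/bin/bash
    #PBS -N manual-staging
    #PBS -l nodes=4
    #PBS -l walltime=02:00:00
    # Stage in by hand, inside the allocation: 4 nodes sit idle while this copy runs
    scp user@archive.example.gov:/archive/user/input.dat /ccs/scratch/user/input.dat || exit 1
    # Compute
    mpirun -np 16 ./simulation /ccs/scratch/user/input.dat /ccs/scratch/user/output.dat
    # Offload by hand: if this fails, or the walltime expires first, the output sits on scratch and risks being purged
    scp /ccs/scratch/user/output.dat user@archive.example.gov:/archive/user/output.dat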

Specification of I/O Activities
- Instrumented PBS job script:
    #!/bin/bash
    #PBS -N ldrdtest
    #PBS -l nodes=4
    #PBS -l walltime=00:15:00
    #PBS -q batch
    #PBS -A project-account
    #PBS -M
    #STAGEIN [-d HH:MM:SS] hpssdir://hpss.ccs.ornl.gov/home/vazhkuda/inp [-k /spin/sys/.keytab/vazhkuda.kt] [-p 1217] [-l vazhkuda] file:///ccs/home/vazhkuda/SVLDRD/test/scratch
  (Multiple stagein directives are allowed, to stage from more than one source; see the example after this slide.)
- Your compute job
    #STAGEOUT file:///ccs/home/vazhkuda/SVLDRD/test/scratch/out hpssdir:///home/vazhkuda/inp
  (Multiple stageout directives are allowed, to offload to more than one destination.)
- Protocols supported: scp://, hpss://, gsiftp://, file://
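
As a hedged illustration of multiple stagein directives drawing from different sources, the lines below follow the directive syntax shown above but use hypothetical hosts, paths, and login names; the exact URI form for scp:// sources is an assumption.

    #STAGEIN scp://dtn.example.gov/home/user/config.in [-l user] file:///ccs/scratch/user/run1
    #STAGEIN [-d 01:00:00] hpssdir://hpss.ccs.ornl.gov/home/vazhkuda/mesh.dat [-l vazhkuda] file:///ccs/scratch/user/run1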

Data Transfer Queue
- Motivation:
  - Queue up and schedule data transfers
  - Treat data transfers as "data jobs"
  - Enable efficient management of data transfers
- Specifics:
  - Zero-charge data transfer queue
  - Queue operated with the same scheduling policy as the compute queue
  - Queue managed using access controls to avoid misuse (a configuration sketch follows this slide)
- Added benefit:
  - Data transfer jobs can now be logged and charged, if need be!
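
A minimal sketch of how such a queue could be defined under TORQUE/PBS with qmgr, assuming a queue named "dtn" and user-level ACLs; the queue name, the limits, and the site-side hook that makes the queue zero-charge are assumptions, not details given in the slides.

    # Create a dedicated execution queue for data-transfer jobs (names and limits are illustrative)
    qmgr -c "create queue dtn queue_type=execution"
    qmgr -c "set queue dtn resources_max.walltime = 24:00:00"   # bound long-running transfers
    qmgr -c "set queue dtn acl_user_enable = true"              # restrict submission to avoid misuse
    qmgr -c "set queue dtn acl_users += vazhkuda"
    qmgr -c "set queue dtn enabled = true"
    qmgr -c "set queue dtn started = true"
    # The zero-charge behavior would live in the center's accounting/allocation system, not in qmgr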

Orchestration
- The script is parsed into individual stagein, compute, and stageout jobs
- Dependency setup and management using resource manager primitives: a wrapper around qsub (sketched below)
- [Diagram: the planner on the head node splits the job script into (1) stage data, (2) compute job, and (3) offload data; jobs 1 and 3 go to the data queue and run on the I/O nodes, job 2 goes to the job queue and runs on the compute nodes, with 2 scheduled after 1 and 3 after 2]
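
A minimal sketch of that kind of dependency chaining using standard PBS/TORQUE primitives, assuming the planner has already written out stagein.pbs, compute.pbs, and stageout.pbs and that the data queue is named "dtn"; the file and queue names are illustrative, not the tool's actual internals.

    #!/bin/bash
    # Submit the stage-in job to the (zero-charge) data transfer queue
    STAGEIN_ID=$(qsub -q dtn stagein.pbs)
    # The compute job runs only after the stage-in job finishes successfully
    COMPUTE_ID=$(qsub -q batch -W depend=afterok:$STAGEIN_ID compute.pbs)
    # The offload job runs only after the compute job finishes successfully
    qsub -q dtn -W depend=afterok:$COMPUTE_ID stageout.pbs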

Usage
- module load stagesub
- man stagesub
- stagesub
- References:
  - Z. Zhang, C. Wang, S. S. Vazhkudai, X. Ma, G. Pike, J. W. Cobb, F. Mueller, "Optimizing Center Performance through Coordinated Data Staging, Scheduling and Recovery", Proceedings of Supercomputing 2007 (SC07): Int'l Conference on High Performance Computing, Networking, Storage and Analysis, Reno, Nevada, November 2007.
  - H. Monti, A. R. Butt, S. S. Vazhkudai, "Just-in-time Staging of Large Input Data for Supercomputing Jobs", Proceedings of the Petascale Data Storage Workshop, Supercomputing 2008, Austin, Texas, November 2008.
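
An illustrative end-to-end session, assuming the instrumented script above is saved as job.pbs and that stagesub accepts the script path the way qsub does; the command-line form beyond the bare "stagesub" shown in the slide is an assumption.

    module load stagesub
    stagesub job.pbs     # parses #STAGEIN/#STAGEOUT and submits the chained stagein, compute, and stageout jobs
    qstat -u $USER       # the chained jobs should appear in the data and compute queues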