Grid Workflow
Midwest Grid Workshop, Module 6
Goals
Enhance scientific productivity through:
- Discovery and application of datasets and programs at petabyte scale
- Enabling use of a worldwide data grid as a scientific workstation

Goals of Using Grids Through Scripting
- Provide an easy on-ramp to the grid
- Utilize massive resources with simple scripts
- Leverage multiple grids like a workstation
- Empower script-writers to empower end users
- Track and leverage provenance in the science process
Classes of Workflow Systems
- Earlier-generation business workflow systems: document management, forms processing, etc.
- Scientific laboratory management systems: LIMS, "wet lab" workflow
- Application-oriented workflow: Kepler, DAGMan, P-Star, VisTrails, Karajan
- VDS, the first-generation Virtual Data System: Pegasus, Virtual Data Language
- Service-oriented workflow systems: BPEL, BPDL, Taverna/SCUFL, Triana
- Pegasus/Wings: Pegasus with OWL/RDF workflow specification
- Swift workflow system: Karajan with typed and mapped VDL (SwiftScript)
VDS – The Virtual Data System
- Introduced the Virtual Data Language (VDL), a location-independent parallel language
- Several planners:
  - Pegasus: main production planner
  - Euryale: experimental "just in time" planner
  - GADU/GNARE: user application planner (D. Sulakhe, Argonne)
- Provenance:
  - Kickstart: application launcher and tracker
  - VDC: virtual data catalog
Virtual Data and Workflows
- The challenge is managing and organizing the vast computing and storage capabilities provided by Grids
- Workflow expresses computations in a form that can be readily mapped to Grids
- Virtual data keeps accurate track of data derivation methods and provenance
- Grid tools virtualize the location and caching of data, and recovery from failures
Virtual Data Origins: The Grid Physics Network
Enhance scientific productivity through:
- Discovery, application, and management of data and processes at all scales
- Using a worldwide data grid as a scientific workstation
The key to this approach is Virtual Data: creating and managing datasets through workflow "recipes" and provenance recording.
Virtual Data workflow abstracts Grid details
Example Application: High Energy Physics Data Analysis
[Figure: a tree of virtual data derivations, each node parameterized by values such as mass = 200, decay = WW, stability = 1, event = 8, plot = 1]
Work and slide by Rick Cavanaugh and Dimitri Bourilkov, University of Florida
The Core Essence: Basic Data Analysis Programs
[Figure: a data analysis program reads raw data (e.g., CMS.ECal, identified by content IDs) plus parameters such as bins = 60, xmin = 40.5, ymin = .003; its formal inputs are bins, xmin, ymin, infile]
Expressing Workflow in VDL
- Define a "function" wrapper (TR, transformation) for an application, with "formal arguments"
- Define a "call" (DV, derivation) to invoke the application, providing "actual" argument values
- Connect applications via output-to-input dependencies: file1 -> grep -> file2 -> sort -> file3

    TR grep (in a1, out a2) {
        argument stdin = ${a1};
        argument stdout = ${a2};
    }
    TR sort (in a1, out a2) {
        argument stdin = ${a1};
        argument stdout = ${a2};
    }

    DV grep (a1=@{in:file1}, a2=@{out:file2});
    DV sort (a1=@{in:file2}, a2=@{out:file3});
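The output-to-input chaining idea can be sketched in plain Python (an illustrative analogue, not VDS code): infer the execution order by matching one call's output file to another call's input file.

```python
# Sketch: derive an execution order from output-to-input file dependencies,
# the way VDS links derivations such as grep(file1 -> file2) and
# sort(file2 -> file3). File and call names come from the slide.
from graphlib import TopologicalSorter

# Each "DV" call: (name, input files, output files).
calls = [
    ("sort", ["file2"], ["file3"]),
    ("grep", ["file1"], ["file2"]),
]

# Which call produces each file?
producers = {out: name for name, _, outs in calls for out in outs}

# A call depends on every call that produces one of its inputs.
graph = {
    name: {producers[f] for f in ins if f in producers}
    for name, ins, _ in calls
}

order = list(TopologicalSorter(graph).static_order())
print(order)  # ['grep', 'sort'] -- grep must run before sort
```

Note that the declaration order of the calls does not matter; only the file dependencies do, which is what makes VDL location- and order-independent.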
Executing VDL Workflows
- Workflow spec: the VDL program is stored in the Virtual Data Catalog
- Create execution plan: the Virtual Data Workflow Generator produces an abstract workflow; the Pegasus planner (job planner, job cleanup) converts it into a DAGman DAG
- Grid workflow execution: DAGman & Condor-G run the concrete workflow on the Grid
...and Collecting Provenance
- The same pipeline (VDL -> Virtual Data Catalog -> abstract workflow -> Pegasus planner -> DAGman & Condor-G) executes the workflow on the Grid
- On the worker nodes, a launcher wraps each application (grep, sort) as it reads and writes its files (file1, file2, file3)
- Each launcher sends provenance data to a provenance collector, which records it in the Virtual Data Catalog
What Must We "Virtualize" to Compute on the Grid?
- Location-independent computing: represent all workflow in abstract terms
- Declarations not tied to specific entities: sites, file systems, schedulers
- Failures: automated retry when data servers or execution sites are unavailable
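Automated retry for transient site unavailability can be illustrated with a minimal Python sketch (the retry wrapper and the "flaky site" are invented for illustration; this is not the VDS retry machinery):

```python
import time

def submit_with_retry(submit, max_retries=3, delay=1.0):
    """Retry a job submission when the data server or execution site
    is temporarily unavailable (simulated here as a raised OSError)."""
    for attempt in range(1, max_retries + 1):
        try:
            return submit()
        except OSError:
            if attempt == max_retries:
                raise
            time.sleep(delay * attempt)  # simple linear backoff

# Demo: a flaky "site" that fails twice, then succeeds.
failures = {"left": 2}

def flaky_submit():
    if failures["left"] > 0:
        failures["left"] -= 1
        raise OSError("site unavailable")
    return "job-ok"

print(submit_with_retry(flaky_submit, delay=0.01))  # job-ok
```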
Mapping the Science Process to Workflows
- Start with a single workflow
- Automate the generation of workflows for sets of files (datasets)
- Replicate workflows to explore many datasets
- Change parameters
- Change code: add new transformations
- Build new workflows
- Use provenance information
How Does Workflow Relate to Provenance?
- Workflow specifies what to do ("what I want to do")
- Provenance tracks what was done ("what I did")
- A task moves through states: waiting -> executable -> executing -> executed, scheduled into the execution environment and open to query and editing along the way ("what I am doing")
Having interface definitions also facilitates provenance tracking.
[Figure: the same data analysis program, with its inputs (bins, xmin, ymin, infile) and its raw data (CMS.ECal) identified by content IDs so that every derivation can be traced]
Functional MRI Analysis
Workflow courtesy of James Dobson, Dartmouth Brain Imaging Center
LIGO Inspiral Search Application
The Inspiral workflow application is the work of Duncan Brown, Caltech; Scott Koranda, UW Milwaukee; the ISI Pegasus team; and the LSC Inspiral group
Example: Montage Workflow
- ~1200-node workflow, 7 levels
- Mosaic of M42 created on the TeraGrid using Pegasus
BLASTing for Protein Knowledge
- BLAST comparison against the complete nr database for sequence similarity and function characterization
- Knowledge base: PUMA is an interface that lets researchers find information about a specific protein after it has been analyzed against the complete set of sequenced genomes (the nr file holds approximately 3 million sequences)
- Analysis on the Grid: the protein sequences are analyzed in the background in the grid environment. Millions of processes are started, since several tools are run on each sequence: protein similarity (BLAST), protein family domain searches (BLOCKS), and structural characterization of the protein
FOAM: Fast Ocean/Atmosphere Model
250-member ensemble run on TeraGrid under VDS. For each ensemble member 1..N:
- Remote directory creation
- FOAM run
- Atmosphere, ocean, and coupler postprocessing
- Results transferred to archival storage
Work of Rob Jacob (FOAM) and Veronika Nefedova (workflow design and execution)
TeraGrid and VDS Speed Up Modelling
[Figure: climate visualization comparing a dedicated climate supercomputer with TeraGrid runs using NMI and VDS]
FOAM application by Rob Jacob, Argonne; VDS workflow by Veronika Nefedova, Argonne. Visualization courtesy of Pat Behling and Yun Liu, UW Madison.
VDS: The Virtual Data System
- Virtual Data Language (VDL): a language to express workflows
- Pegasus planner: decides how the workflow will run
- Virtual Data Catalog (VDC): stores information about workflows, and the provenance of data
The Virtual Data Process
- Describe data derivation or analysis steps in a high-level workflow language (VDL)
- VDL is cataloged in a database for sharing by the community
- Grid workflows are generated from VDL
- Provenance of derived results is stored in the database for assessment or verification
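The "virtual data" idea — cataloged recipes, reused results, recorded provenance — can be sketched with a toy catalog in Python (the names and structure here are illustrative, not the real VDC schema):

```python
# Toy "virtual data catalog": a derivation is re-run only if its result
# is not already materialized, and provenance records what produced what.
catalog = {}      # logical file name -> materialized value
provenance = []   # (transformation, inputs, output) records

def derive(transform, name, inputs, fn):
    if name in catalog:                      # virtual data: reuse prior result
        return catalog[name]
    value = fn(*[catalog[i] for i in inputs])
    catalog[name] = value
    provenance.append((transform, tuple(inputs), name))
    return value

catalog["raw"] = [3, 1, 2]
derive("grep", "filtered", ["raw"], lambda xs: [x for x in xs if x > 1])
derive("sort", "sorted", ["filtered"], sorted)
derive("sort", "sorted", ["filtered"], sorted)   # cached: no new record

print(catalog["sorted"])   # [2, 3]
print(len(provenance))     # 2 -- the repeated derivation was not re-run
```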
Planning with Pegasus
The Pegasus planner takes a DAX (abstract workflow, generated from VDL) together with:
- High-level application knowledge
- Resource information and configuration
- Data location information
and produces a plan to be submitted to the grid (e.g., Condor submit files).
Abstract to Concrete, Step 1: Workflow Reduction
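Workflow reduction can be sketched in Python: drop jobs whose outputs are already registered in a replica catalog. This is a simplified, illustrative version of the idea, not Pegasus code (a real reduction also prunes ancestors transitively; a single pass suffices for the linear chain below).

```python
# Sketch of workflow reduction: prune jobs whose outputs already exist.
def reduce_workflow(jobs, existing):
    """jobs: {name: (inputs, outputs)}; existing: set of available files."""
    return {
        name: (ins, outs)
        for name, (ins, outs) in jobs.items()
        if not all(o in existing for o in outs)
    }

jobs = {
    "genA":    ([], ["a.dat"]),
    "genB":    (["a.dat"], ["b.dat"]),
    "analyze": (["b.dat"], ["result.dat"]),
}
# a.dat and b.dat were produced in an earlier run, so genA and genB
# can be pruned; only the final analysis needs to execute.
reduced = reduce_workflow(jobs, existing={"a.dat", "b.dat"})
print(sorted(reduced))  # ['analyze']
```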
Step 2: Site Selection & Addition of Data Stage-in Nodes
Step 3: Addition of Data Stage-out Nodes
Step 4: Addition of Replica Registration Jobs
Step 5: Addition of Job-Directory Creation
Final Result of Abstract-to-Concrete Process
Swift System Improves on VDS/VDL
- Clean separation of logical/physical concerns: XDTM specification of logical data structures
- Concise specification of parallel programs: SwiftScript, with iteration, etc.
- Efficient execution on distributed resources: lightweight threading, dynamic provisioning, Grid interfaces, pipelining, load balancing
- Rigorous provenance tracking and query (in design): virtual data schema & automated recording
- Improved usability and productivity, demonstrated in numerous applications
AIRSN: An Example Program

    (Run snr) functional ( Run r, NormAnat a, Air shrink ) {
        Run yroRun = reorientRun( r, "y" );
        Run roRun = reorientRun( yroRun, "x" );
        Volume std = roRun[0];
        Run rndr = random_select( roRun, 0.1 );
        AirVector rndAirVec = align_linearRun( rndr, std, 12, 1000, 1000, "81 3 3" );
        Run reslicedRndr = resliceRun( rndr, rndAirVec, "o", "k" );
        Volume meanRand = softmean( reslicedRndr, "y", "null" );
        Air mnQAAir = alignlinear( a.nHires, meanRand, 6, 1000, 4, "81 3 3" );
        Warp boldNormWarp = combinewarp( shrink, a.aWarp, mnQAAir );
        Run nr = reslice_warp_run( boldNormWarp, roRun );
        Volume meanAll = strictmean( nr, "y", "null" );
        Volume boldMask = binarize( meanAll, "y" );
        snr = gsmoothRun( nr, boldMask, "6 6 6" );
    }

    (Run or) reorientRun (Run ir, string direction) {
        foreach Volume iv, i in ir.v {
            or.v[i] = reorient(iv, direction);
        }
    }
VDL/VDS Limitations
- Missing VDL language features: data typing & data mapping; iterators and control-flow constructs
- Run-time complexity in VDS: state explosion for data-parallel applications; computation status hard to provide; debugging information complex & distributed
- Performance: still many runtime bottlenecks
The Messy Data Problem
- Scientific data is typically logically structured, e.g., hierarchically
- It is common to map functions over dataset members
- Nested map operations can scale to millions of objects
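The nested-map pattern over a hierarchical dataset can be illustrated in Python (the Study -> Group -> Subject -> Run nesting mirrors the fMRI types shown later; the data values are invented):

```python
# Nested map over a hierarchical dataset: apply a function to every
# volume in every run of every subject in every group.
study = {
    "group1": {"subjA": [[1, 2], [3]], "subjB": [[4]]},
    "group2": {"subjC": [[5, 6]]},
}

def map_volumes(f, study):
    """Return a new study with f applied to every volume."""
    return {
        g: {s: [[f(v) for v in run] for run in runs]
            for s, runs in subjects.items()}
        for g, subjects in study.items()
    }

doubled = map_volumes(lambda v: v * 2, study)
print(doubled["group1"]["subjA"])  # [[2, 4], [6]]
```

Because each application of `f` is independent, every level of this nesting is an opportunity for implicit parallelism, which is exactly what SwiftScript exploits.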
The Messy Data Problem (continued)
But the data is physically "messy":
- Heterogeneous storage formats and access protocols
- Logically identical datasets can be stored as text files (e.g., CSV), spreadsheets, databases, ...
- Data available from filesystem, DBMS, HTTP, WebDAV, ...
- Metadata encoded in directory and file names
This hinders program development, composition, and execution. For example:

    ./Group23:
    drwxr-xr-x 4 yongzh users 2048 Nov 12 14:15 AA
    drwxr-xr-x 4 yongzh users 2048 Nov 11 21:13 CH
    drwxr-xr-x 4 yongzh users 2048 Nov 11 16:32 EC

    ./Group23/AA:
    drwxr-xr-x 5 yongzh users 2048 Nov  5 12:41 04nov06aa
    drwxr-xr-x 4 yongzh users 2048 Dec  6 12:24 11nov06aa

    ./Group23/AA/04nov06aa:
    drwxr-xr-x 2 yongzh users 2048 Nov  5 12:52 ANATOMY
    drwxr-xr-x 2 yongzh users      Dec  5 11:40 FUNCTIONAL

    ./Group23/AA/04nov06aa/ANATOMY:
    -rw-r--r-- 1 yongzh users 348 Nov  5 12:29 coplanar.hdr
    -rw-r--r-- 1 yongzh users     Nov  5 12:29 coplanar.img

    ./Group23/AA/04nov06aa/FUNCTIONAL:
    -rw-r--r-- 1 yongzh users 348 Nov  5 12:32 bold1_0001.hdr
    -rw-r--r-- 1 yongzh users     Nov  5 12:32 bold1_0001.img
    -rw-r--r-- 1 yongzh users 348 Nov  5 12:32 bold1_0002.hdr
    -rw-r--r-- 1 yongzh users     Nov  5 12:32 bold1_0002.img
    -rw-r--r-- 1 yongzh users 496 Nov 15 20:44 bold1_0002.mat
    -rw-r--r-- 1 yongzh users 348 Nov  5 12:32 bold1_0003.hdr
    -rw-r--r-- 1 yongzh users     Nov  5 12:32 bold1_0003.img
SwiftScript
- Typed parallel programming notation: XDTM as data model and type system; typed dataset and procedure definitions
- Scripting language: implicit data parallelism; program composition from procedures; control constructs (foreach, if, while, ...)
- Benefits: clean application logic; type checking; dataset selection and iteration
"A Notation & System for Expressing and Executing Cleanly Typed Workflows on Messy Scientific Data" [SIGMOD Record, Sep 2005; Springer 2006]
fMRI Type Definitions in SwiftScript
Simplified declarations for fMRI AIRSN (spatial normalization):

    type Study { Group g[]; }
    type Group { Subject s[]; }
    type Subject { Volume anat; Run run[]; }
    type Run { Volume v[]; }
    type Volume { Image img; Header hdr; }
    type Image {};
    type Header {};
    type Warp {};
    type Air {};
    type AirVec { Air a[]; }
    type NormAnat { Volume anat; Warp aWarp; Volume nHires; }
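For readers more familiar with mainstream languages, the same nesting can be approximated with Python dataclasses (an illustration of the type structure only, not XDTM itself):

```python
# The SwiftScript fMRI types above, approximated as Python dataclasses.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Image:
    pass

@dataclass
class Header:
    pass

@dataclass
class Volume:
    img: Image
    hdr: Header

@dataclass
class Run:
    v: List[Volume] = field(default_factory=list)

@dataclass
class Subject:
    anat: Volume
    run: List[Run] = field(default_factory=list)

# Build a subject with one anatomical volume and one run of two volumes.
vol = Volume(Image(), Header())
subj = Subject(anat=vol, run=[Run(v=[vol, vol])])
print(len(subj.run[0].v))  # 2
```

The point of XDTM is that these logical types are declared independently of how the volumes are physically laid out on disk (the messy directory listing shown earlier).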
AIRSN Program Definition

    (Run snr) functional ( Run r, NormAnat a, Air shrink ) {
        Run yroRun = reorientRun( r, "y" );
        Run roRun = reorientRun( yroRun, "x" );
        Volume std = roRun[0];
        Run rndr = random_select( roRun, 0.1 );
        AirVector rndAirVec = align_linearRun( rndr, std, 12, 1000, 1000, "81 3 3" );
        Run reslicedRndr = resliceRun( rndr, rndAirVec, "o", "k" );
        Volume meanRand = softmean( reslicedRndr, "y", "null" );
        Air mnQAAir = alignlinear( a.nHires, meanRand, 6, 1000, 4, "81 3 3" );
        Warp boldNormWarp = combinewarp( shrink, a.aWarp, mnQAAir );
        Run nr = reslice_warp_run( boldNormWarp, roRun );
        Volume meanAll = strictmean( nr, "y", "null" );
        Volume boldMask = binarize( meanAll, "y" );
        snr = gsmoothRun( nr, boldMask, "6 6 6" );
    }

    (Run or) reorientRun (Run ir, string direction) {
        foreach Volume iv, i in ir.v {
            or.v[i] = reorient(iv, direction);
        }
    }
SwiftScript Expressiveness
Lines of code with different workflow encodings (collaboration with James Dobson, Dartmouth) [SIGMOD Record, Sep 2005]:

    fMRI Workflow | Shell Script | VDL  | Swift
    ATLAS         |              |      |
    ATLAS         |              |      |
    FILM          |              |      |
    FEAT          |              |      |
    AIRSN         | 215          | ~400 | 34

[Figures: the AIRSN workflow, and the AIRSN workflow expanded]
Swift Architecture
- Specification: a SwiftScript program describes the abstract computation; the SwiftScript compiler and the Virtual Data Catalog turn it into an executable form
- Scheduling: the execution engine (Karajan with the Swift runtime) dispatches work through Swift runtime callouts, with status reporting
- Execution: a launcher runs each application (App F1, App F2) on virtual nodes, reading and writing files (file1, file2, file3)
- Provenance: launchers emit provenance data to a provenance collector
- Provisioning: a dynamic resource provisioner can acquire resources, e.g., from Amazon EC2
Using Swift
The user provides a SwiftScript program (apps a1 and a2 over data f1, f2, f3), a site list, and an app list. The swift command runs the workflow: launchers execute the applications on worker nodes, while workflow status, logs, and provenance data are collected.
Swift Uses the Karajan Workflow Engine
- Fast, scalable threading model
- Suitable constructs for control flow
- Flexible task dependency model: "futures" enable pipelining
- Flexible provider model allows use of different runtime environments: job execution and data transfer
- Flow is controlled to avoid resource overload
- The workflow client runs from a Java container
"Java CoG Workflow", Gregor von Laszewski et al., in Workflows for Science, 2007
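The "futures enable pipelining" point can be illustrated with a small Python analogue (ThreadPoolExecutor futures standing in for Karajan futures; stage1/stage2 are made-up placeholder tasks): stage 2 for item i is chained off item i's own future, rather than waiting at a barrier for all of stage 1 to finish.

```python
# Futures-based pipelining sketch: each item flows to stage 2 as soon as
# its own stage-1 result is ready.
from concurrent.futures import ThreadPoolExecutor

def stage1(x):
    return x + 1       # placeholder for the first workflow stage

def stage2(x):
    return x * 10      # placeholder for the second workflow stage

with ThreadPoolExecutor(max_workers=4) as pool:
    stage1_futures = [pool.submit(stage1, i) for i in range(5)]
    # Chain stage 2 off each future individually (pipelined), instead of
    # joining the whole first stage before starting the second (barrier).
    stage2_futures = [pool.submit(stage2, f.result()) for f in stage1_futures]
    results = [f.result() for f in stage2_futures]

print(results)  # [10, 20, 30, 40, 50]
```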
Application Example: ACTIVAL, Neural Activation Validation
ACTIVAL identifies clusters of neural activity that are unlikely to be active by random chance: it switches the condition labels for one or more participants, recalculates the delta values in each voxel, recalculates the reliability of delta in each voxel, and evaluates the clusters found. If the clusters in the real data are greater than the majority of the clusters found in the permutations, the null hypothesis is rejected, indicating that the clusters of activity found in the experiment are unlikely to occur by chance.
Work by S. Small and U. Hasson, UChicago.
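The permutation-testing idea behind ACTIVAL can be sketched with a scalar statistic in Python (illustrative only: the real workflow permutes condition labels over voxel clusters with R and AFNI, and the data below are invented):

```python
# Minimal permutation test: shuffle condition labels, recompute the
# statistic, and see how often chance matches the observed difference.
import random

random.seed(0)
cond_a = [2.0, 2.2, 1.9, 2.4]   # measurements under condition A
cond_b = [1.0, 1.2, 0.9, 1.1]   # measurements under condition B
observed = sum(cond_a) / len(cond_a) - sum(cond_b) / len(cond_b)

pooled = cond_a + cond_b
count = 0
n_perm = 2000                    # the ACTIVAL script uses 2000 iterations
for _ in range(n_perm):
    random.shuffle(pooled)       # "switch the labels of the conditions"
    delta = sum(pooled[:4]) / 4 - sum(pooled[4:]) / 4
    if delta >= observed:
        count += 1

p_value = count / n_perm
print(p_value < 0.05)  # True: the difference is unlikely by chance
```

Each of the 2000 permutations is independent, which is why the real workflow fans them out as parallel grid jobs.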
SwiftScript Workflow ACTIVAL: Data Types and Utilities

    type script {}
    type fullBrainData {}
    type brainMeasurements {}
    type fullBrainSpecs {}
    type precomputedPermutations {}
    type brainDataset {}
    type brainClusterTable {}
    type brainDatasets { brainDataset b[]; }
    type brainClusters { brainClusterTable c[]; }

    // Procedure to run the "R" statistical package
    (brainDataset t) bricRInvoke (script permutationScript, int iterationNo,
        brainMeasurements dataAll, precomputedPermutations dataPerm)
    {
        app { ... }
    }

    // Procedure to run the AFNI clustering tool
    (brainClusterTable v, brainDataset t) bricCluster (script clusterScript,
        int iterationNo, brainDataset randBrain, fullBrainData brainFile,
        fullBrainSpecs specFile)
    {
        app { ... }
    }

    // Procedure to merge results based on statistical likelihoods
    (brainClusterTable t) bricCentralize (brainClusterTable bc[])
    {
        app { ... }
    }
ACTIVAL Workflow – Dataset Iteration Procedures

    // Procedure to iterate over the data collection
    (brainClusters randCluster, brainDatasets dsetReturn)
    brain_cluster (fullBrainData brainFile, fullBrainSpecs specFile)
    {
        int sequence[] = [1:2000];
        brainMeasurements dataAll;
        precomputedPermutations dataPerm;
        script randScript;
        script clusterScript;
        brainDatasets randBrains;

        foreach int i in sequence {
            randBrains.b[i] = bricRInvoke(randScript, i, dataAll, dataPerm);
            brainDataset rBrain = randBrains.b[i];
            (randCluster.c[i], dsetReturn.b[i]) =
                bricCluster(clusterScript, i, rBrain, brainFile, specFile);
        }
    }
ACTIVAL Workflow – Main Workflow Program

    // Declare datasets
    fullBrainData brainFile;
    fullBrainSpecs specFile;
    brainDatasets randBrain;
    brainClusters randCluster<simple_mapper;
        prefix="Tmean.4mm.perm", suffix="_ClstTable_r4.1_a2.0.1D">;
    brainDatasets dsetReturn<simple_mapper;
        prefix="Tmean.4mm.perm", suffix="_Clustered_r4.1_a2.0.niml.dset">;
    brainClusterTable clusterThresholdsTable;
    brainDataset brainResult;
    brainDataset origBrain;

    // Main program – executes the entire workflow
    (randCluster, dsetReturn) = brain_cluster(brainFile, specFile);
    clusterThresholdsTable = bricCentralize(randCluster.c);
    brainResult = makebrain(origBrain, clusterThresholdsTable, brainFile, specFile);
Performance Example: fMRI Workflow
- 4-stage workflow (subset of AIRSN): 476 jobs, <10 seconds CPU each, 119 jobs per stage
- No pipelining: 24 minutes (idle uc-teragrid cluster, via GRAM to Torque)
Example Performance Optimizations: Pipelining
Jobs pipelined between stages: 19 minutes
Example Performance Optimizations: Pipelining + Clustering
With pipelining and clustering (up to 6 jobs clustered into one GRAM job): 8 minutes
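Why clustering helps can be shown with simple arithmetic in Python (the per-submission overhead constant is invented for illustration; the job counts come from the slides):

```python
# Job clustering sketch: batch small jobs into one grid submission to
# amortize per-submission scheduling overhead.
def cluster(jobs, size):
    """Split a job list into clusters of at most `size` jobs."""
    return [jobs[i:i + size] for i in range(0, len(jobs), size)]

jobs = list(range(119))       # one workflow stage of 119 short jobs
clusters = cluster(jobs, 6)   # up to 6 jobs per GRAM submission

per_submit_overhead = 5.0     # hypothetical seconds of queue/GRAM latency
print(len(clusters))          # 20 submissions instead of 119
print(per_submit_overhead * (len(jobs) - len(clusters)))  # 495.0 s saved
```

For jobs that take under 10 seconds of CPU each, eliminating ~99 submissions' worth of latency per stage dominates the total runtime, which is consistent with the 19-minute to 8-minute improvement reported above.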
Example Performance Optimizations: Pipelining + Provisioning
With pipelining and CPU provisioning: 2.2 minutes
Load Balancing
Load balancing between UC TeraPort (OSG) and UC TeraGrid (IA32): 260 jobs ran on TeraPort, 216 on TeraGrid
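One simple way to get this kind of balance, sketched here in Python, is pull-based dispatch: sites draw jobs from a shared queue, so a faster site naturally takes more work. This is illustrative only (the site names and simulated per-job costs are made up), not the Karajan scheduler.

```python
# Pull-based load balancing sketch: two "sites" drain one shared queue.
import queue
import threading

q = queue.Queue()
for i in range(100):          # 100 jobs to distribute
    q.put(i)

done = {"fast": 0, "slow": 0}
lock = threading.Lock()

def worker(name, cost):
    while True:
        try:
            q.get_nowait()    # pull the next job, if any remain
        except queue.Empty:
            return
        for _ in range(cost): # simulate per-job work
            pass
        with lock:
            done[name] += 1

threads = [threading.Thread(target=worker, args=("fast", 10)),
           threading.Thread(target=worker, args=("slow", 100000))]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(done["fast"] + done["slow"])  # 100: every job ran exactly once
```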
Development Status
- Initial release is available for evaluation
- Performance measurement and tuning efforts are active
- Adapting to OSG Grid information and site conventions
- Many applications in progress and under evaluation: astrophysics, molecular dynamics, neuroscience, psychology, radiology
- Provisioning mechanism progressing
- Virtual data catalog re-integration starting ~April
- Collating language feedback; the focus is on mapping
- Web site for docs, downloads, and more info
Conclusion
- Swift is in the early stages of its development and of its transition from the VDS virtual data language
- Application testing is underway in neuroscience, molecular dynamics, astrophysics, radiology, and other domains, providing valuable feedback for language refinement and finalization
- SwiftScript is proving to be a productive language even while usage feedback is still shaping it, with positive comments from VDL users, in radiology in particular
- Ongoing performance evaluation and improvement is yielding exciting results
- The major initial focus is usability, with good progress on reducing time-to-get-started and on ease of debugging
Acknowledgements
- The Swift effort is supported by DOE (Argonne LDRD), NSF (i2u2, GriPhyN, iVDGL), NIH, and the UChicago Computation Institute
- Team: Ben Clifford, Ian Foster, Mihael Hategan, Veronika Nefedova, Tiberiu Stef-Praun, Mike Wilde, Yong Zhao
- Java CoG Kit: Mihael Hategan, Gregor von Laszewski, and many collaborators
- User-contributed workflows and Swift applications: ASCI Flash, I2U2, UC Human Neuroscience Lab, UCH Molecular Dynamics, UCH Radiology, caBIG (Ravi Madduri, Patrick McConnell, and the caGrid team)
Based on: "The Virtual Data System – a workflow toolkit for science applications"
OSG Summer Grid Workshop, Lecture 8, June 29, 2006
Based on: "Fast, Reliable, Loosely Coupled Parallel Computation"
Tiberiu Stef-Praun, Computation Institute, University of Chicago & Argonne National Laboratory
Acknowledgements
The technologies and applications described here were made possible by the following projects and support:
- GriPhyN, iVDGL, the Globus Alliance, and QuarkNet, supported by the National Science Foundation
- The Globus Alliance, PPDG, and QuarkNet, supported by the US Department of Energy, Office of Science
- Support was also provided by NVO, NIH, and SCEC