Distributed Pipeline Programming for Mosaics, or MARIO Tips’N’Tricks
NOAO Mosaic Pipeline
Major Features and Goals
- Data products for NOAO archive and NVO node
- Data products for observers
- Pipeline for NOAO and mosaic community
- Basic CCD mosaic calibrations
- Advanced time-domain data products
- Real-time data quality assessment and monitoring
- High performance, data parallel system
- LSST testbed
- Fairly generic pipeline infrastructure (NEWFIRM, …)
- Automated operation
- Thorough processing history and data documentation
MARIO: Mosaic Automatic Reduction Infrastructure and Operations (i.e., a pipeline)
Key Concepts (Tips’N’Tricks)
- sub-pipelines: “meta pipeline programming”
- indirect files
- load balancing using trigger files
- stay-alive module
- parallelization of algorithms over mosaic
- shared monitoring
- network filenames
- image processing language (CL)
What is a pipeline?
- collection of processing modules
- connected by dependency rules
- modules may run concurrently on different data objects
- infrastructure to manage processes
- infrastructure to manage dependencies
- infrastructure to monitor processes and processing
OPUS: Operations Pipeline Unified System
OPUS
- Triggers (dependency rules)
  - file, OSF, time
- Blackboard
- Polling
- Monitors and Managers
Distributed Pipeline Issues
- data vs. functional parallelism
- shared file system vs. local file system
- heterogeneous vs. homogeneous processors
- parasitic processing
- push vs. pull
- load balancing
- master-worker vs. peer-to-peer
MARIO Choices
- data parallelism
- local file system (w/ shared blackboard)
- heterogeneous processors
- push AND pull
- load balancing by number of data objects
- peer-to-peer
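The choice of load balancing by number of data objects can be illustrated with a minimal Python sketch. It assumes, hypothetically, that each node exposes a trigger directory and that every pending data object corresponds to one `.trig` file there; the function names, the suffix, and the tie-breaking rule are illustrative and not taken from MARIO.

```python
from pathlib import Path
import tempfile


def pending_objects(trigger_dir: Path, suffix: str = ".trig") -> int:
    """Count trigger files waiting in one node's trigger directory."""
    return sum(1 for p in trigger_dir.iterdir() if p.name.endswith(suffix))


def least_loaded(trigger_dirs: dict) -> str:
    """Return the node with the fewest pending data objects.

    Ties are broken alphabetically (an arbitrary choice for this sketch).
    """
    return min(sorted(trigger_dirs),
               key=lambda node: pending_objects(trigger_dirs[node]))


def demo() -> str:
    """Build three fake nodes with 3, 1, and 2 pending objects."""
    with tempfile.TemporaryDirectory() as tmp:
        dirs = {}
        for node, count in {"dhcp": 3, "archive2": 1, "vmware": 2}.items():
            d = Path(tmp) / node
            d.mkdir()
            for i in range(count):
                (d / f"obj{i}.trig").write_text("")
            dirs[node] = d
        return least_loaded(dirs)
```

Here `demo()` selects `archive2`, the node with a single pending trigger file. Counting files keeps the balancing decision independent of any master process, which fits the peer-to-peer choice above.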
MARIO Architecture Concept
- Multiple CPUs but no dependency on N
- Multiple types of sub-pipelines by function
  - One for operations over all mosaic elements
  - One for operations on individual elements
  - One for cataloging
  - One for image differencing
- All types on all CPUs: no master!
- Sub-pipelines triggered by files
“Meta Pipeline Programming”
- Build a pipeline out of sub-pipelines
- Form a distributed web of sub-pipelines
- Sub-pipelines play role of subroutines
- Need equivalents of:
  - objects
  - call and return
  - node assignment
  - library of standard modules: start, call, return, done, obs, run
What is a sub-pipeline?
- primarily operates on one type of object
- operates on one node
- data is maintained locally
- multiple stages but limited functionality
Example of Sub-pipelines
[Figure: sub-pipelines NGT, CAL, SCL, MEF, SIF, and DTS; labels: multiextension, single images]
Sub-pipelines
- NGT: Night's worth of data
  - Group, Zero, Dome Flat, Objects, Done
- CAL: Calibration sequence (MEF)
  - Setup, Split, Done
- SCL: Calibration sequence (SIF)
  - Setup, CCDPROC, Combine, Done
- MEF: Process objects (MEF)
  - Setup, Split, Done
- SIF: Process objects (SIF)
  - Setup, CCDPROC, Done
Network of Sub-pipelines and CPUs
[Figure: MEF and SIF sub-pipelines on each CPU]
- MEF: pipeline for operations over all mosaic extensions, e.g. crosstalk, global WCS correction
- SIF: pipeline for single CCD images, e.g. ccdproc, masking
Example Processing Status

OBJECT NAME        PIPELINE  NODE      STAGES
anight1            ngt       dhcp      cccw_
ct4m T183424S      cal       dhcp      cccd_
ct4m T183424S_01   scl       archive2  ccccd
ct4m T183424S_02   scl       dhcp      ccccd
ct4m T183424S_03   scl       archive2  ccccd
ct4m T183424S_04   scl       dhcp      ccccd
ct4m T191558S      cal       dhcp      cccd_
ct4m T191558S_01   scl       archive2  ccccd
ct4m T191558S_02   scl       vmware    ccccd
ct4m T191558S_03   scl       archive2  ccccd
ct4m T191558S_04   scl       dhcp      ccccd
ct4m T             mef       dhcp      ccw__
ct4m T084044_01    sif       archive2  ccd__
ct4m T084044_02    sif       archive2  cp___
ct4m T084044_03    sif       vmware    p____
ct4m T084044_04    sif       archive2  _____
ct4m T             mef       dhcp      cccd_
ct4m T084307_01    sif       archive2  ccd__
ct4m T084307_02    sif       vmware    ccd__
ct4m T084307_03    sif       archive2  ccd__
ct4m T084307_04    sif       archive2  ccd__
Calling a Sub-pipeline
- Data is set up either locally or on the target node
- File with path for returned result is written to the target pipeline
- File with paths of returned results is written in the calling pipeline
- Trigger file is written to the target pipeline
Returning Results
- Return module in target pipeline looks for return file
- Results are written to trigger file for calling pipeline specified in the return file
- Calling pipeline triggers on return file
Call/Return
- A -> H_N!B/data/abc_N (derived from abc)
- A -> H_N!B/return/abc_N [H!A/abc_N.btrig]
- A -> H!A/abc.b [abc_1.btrig, abc_2.btrig, …]
- A -> H_N!B/abc_N.btrig [H_N!B/data/abc_N]
- H_N!B -> H!A/abc_N.btrig [results]
- return checks H!A/abc.b for all done
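The call/return sequence can be simulated on a single machine. The sketch below is an illustration under stated assumptions, not the MARIO implementation: it reads `host!pipeline/path` as a network filename, treats the bracketed text as file contents, and maps each host onto a subdirectory of a temporary root. The helper names (`net_path`, `call`, `return_result`, `all_done`) are hypothetical.

```python
from pathlib import Path
import tempfile


def net_path(root: Path, name: str) -> Path:
    """Map a network filename 'host!pipeline/rest' onto a local tree."""
    host, rest = name.split("!", 1)
    return root / host / rest


def write(root: Path, name: str, text: str) -> None:
    path = net_path(root, name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text)


def call(root: Path, caller: str, obj: str, pieces: dict) -> None:
    """caller is 'H!A'; pieces maps piece name -> (target 'Hn!B', data)."""
    expected = []
    for piece, (target, data) in pieces.items():
        # Data is set up on the target node.
        write(root, f"{target}/data/{piece}", data)
        # Return file: tells the target where to send its result.
        write(root, f"{target}/return/{piece}", f"{caller}/{piece}.btrig")
        expected.append(f"{piece}.btrig")
    # List of expected return triggers, kept in the calling pipeline.
    write(root, f"{caller}/{obj}.b", "\n".join(expected))
    for piece, (target, _) in pieces.items():
        # Trigger file starts the target pipeline; it references the data.
        write(root, f"{target}/{piece}.btrig", f"{target}/data/{piece}")


def return_result(root: Path, target: str, piece: str, results: str) -> None:
    """Return module: read the return file, write results to the caller."""
    dest = net_path(root, f"{target}/return/{piece}").read_text()
    write(root, dest, results)


def all_done(root: Path, caller: str, obj: str) -> bool:
    """Check the caller's list file: has every expected return arrived?"""
    expected = net_path(root, f"{caller}/{obj}.b").read_text().split()
    return all(net_path(root, f"{caller}/{n}").exists() for n in expected)


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        call(root, "H!A", "abc",
             {"abc_1": ("H1!B", "piece 1"), "abc_2": ("H2!B", "piece 2")})
        before = all_done(root, "H!A", "abc")
        return_result(root, "H1!B", "abc_1", "result 1")
        partial = all_done(root, "H!A", "abc")
        return_result(root, "H2!B", "abc_2", "result 2")
        return before, partial, all_done(root, "H!A", "abc")
```

In `demo()` the caller only sees "all done" after both pieces have returned, which mirrors the final check against `H!A/abc.b`.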
Indirect Files
[Figure: example indirect files anight1.ngttrig and anight1.list]
- Distribute data files across a network
- Move references and only move data as needed
- Pipeline objects: standard form, variable content
- Act as triggers and meta-data containers
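A short Python sketch of the "move references, not data" idea follows. It assumes an indirect file is a plain list with one network filename of the form `host!path` per line; the actual on-disk format is not given in these slides, and `parse_ref`, `read_indirect`, and `to_fetch` are hypothetical helper names.

```python
def parse_ref(ref: str) -> tuple:
    """Split a network filename 'host!path' into (host, path)."""
    host, sep, path = ref.partition("!")
    if not sep:
        raise ValueError(f"not a network filename: {ref!r}")
    return host, path


def read_indirect(text: str) -> list:
    """Parse the contents of an indirect file into (host, path) pairs."""
    return [parse_ref(line.strip())
            for line in text.splitlines() if line.strip()]


def to_fetch(refs: list, local_host: str) -> list:
    """Only references that live on another host require moving data."""
    return [(host, path) for host, path in refs if host != local_host]


# Assumed example contents, reusing the Host!Obj naming from the slides.
LIST_FILE = "Host3!Obj123.1\nHost2!Obj123.2\n"
refs = read_indirect(LIST_FILE)
remote = to_fetch(refs, local_host="Host3")
```

With these inputs, a module running on `Host3` would need to fetch only `Obj123.2` from `Host2`; the other piece is already local.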
File Triggers
[Figure labels: Pipeline; Data Directory (obj123.fits); Trigger Directory (obj123.trig); Module; Process; GO; Data Trigger (DRA, user, or pipeline module); Tape, Disk, DTS. The trigger file contains a reference to the data.]
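A file trigger combined with polling can be sketched as below. This is a generic illustration and not the OPUS interface: the `.trig` suffix follows the `obj123.trig` example, while renaming a handled trigger to `.done` is an assumed convention chosen so that a later poll does not pick it up again.

```python
from pathlib import Path
import tempfile


def poll_once(trigger_dir: Path, module, suffix: str = ".trig") -> list:
    """Run `module` once for every trigger file currently present."""
    handled = []
    for trig in sorted(trigger_dir.glob(f"*{suffix}")):
        data_ref = trig.read_text().strip()  # trigger holds a data reference
        module(data_ref)
        # Hypothetical convention: mark the trigger as consumed.
        trig.rename(trig.with_name(trig.name + ".done"))
        handled.append(trig.name)
    return handled


def demo() -> tuple:
    with tempfile.TemporaryDirectory() as tmp:
        trigger_dir = Path(tmp)
        (trigger_dir / "obj123.trig").write_text("data/obj123.fits\n")
        seen = []
        first = poll_once(trigger_dir, seen.append)
        second = poll_once(trigger_dir, seen.append)  # nothing left to do
        return first, second, seen
```

A real pipeline would call something like `poll_once` repeatedly with a sleep between passes; the sketch keeps a single pass so that it terminates.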
Data Flow Networking: Example
[Figure labels: Host0: Crosstalk; Host1: Obj456.1, Obj321.2; Host2: Obj567.2; Host3: Obj123, Obj123.2, Obj123.1; network filenames Host3!Obj123.1, Host2!Obj123.2; Host4: DOWN]
Data Parallel Modules
Some algorithms may need to be re-implemented specifically for a data parallel pipeline. One type is where measurements are made across the mosaic for a global calibration. Rather than requiring all pieces to be in one pipeline, arrange for the measurements to be made in parallel and collected for the global calibration, and then apply the global calibration to the pieces in parallel.
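The measure, collect, and apply pattern described above can be sketched in Python. The example is a toy under stated assumptions: each mosaic piece is a list of pixel values, the per-piece measurement is a mean, the "global calibration" is the mean of those measurements, and threads stand in for separate pipeline nodes. None of these specifics come from MARIO.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def measure(piece: list) -> float:
    """Per-piece measurement, made independently on each piece."""
    return mean(piece)


def combine(measurements: list) -> float:
    """Collect the measurements into one global calibration value."""
    return mean(measurements)


def apply_calibration(piece: list, zero: float) -> list:
    """Apply the global calibration to a single piece."""
    return [value - zero for value in piece]


def calibrate(pieces: list) -> list:
    with ThreadPoolExecutor() as pool:
        # Step 1: measurements in parallel, one task per piece.
        measurements = list(pool.map(measure, pieces))
        # Step 2: the only serial step, on small summary values.
        zero = combine(measurements)
        # Step 3: apply the global result to every piece in parallel.
        return list(pool.map(lambda p: apply_calibration(p, zero), pieces))


result = calibrate([[1.0, 3.0], [5.0, 7.0]])
```

Only the small measurement values need to be gathered in one place; the pixel data never leaves the worker that holds it, which is the point of the re-implementation.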