LIGO-G W
Use of Condor by the LIGO Scientific Collaboration
Gregory Mendell, LIGO Hanford Observatory
On behalf of the LIGO Scientific Collaboration
The Laser Interferometer Gravitational-Wave Observatory
Supported by the United States National Science Foundation
Sources of Gravitational Waves
Black holes, dense stars, supernovae, stochastic background
Worldwide Interferometers
LIGO, GEO, Virgo, TAMA, AIGO
Worldwide Data Analysis using the LIGO Data Grid (LDG)
LDAS: CIT, LLO, LHO, MIT
LSC: PSU, UWM, Golm, SYR, Cardiff, Birmingham
CPUs with a typical clock speed of 2.6 GHz
The LIGO Data Grid
LDG Client/Server Distribution
packageName( 'Server' )
version( '4.5' )
pacmanVersionGE('3.18.5')
package( 'Server-Environment' )
package( 'VDT_CACHE:Globus' )
package( 'VDT_CACHE:CA-Certificates' )
package( 'VDT_CACHE:CA-Certificates-Updater' )
package( 'VDT_CACHE:Condor' )
package( 'VDT_CACHE:GSIOpenSSH' )
package( 'VDT_CACHE:KX509' )
package( 'VDT_CACHE:MyProxy' )
package( 'VDT_CACHE:UberFTP' )
package( 'VDT_CACHE:EDG-Make-Gridmap' )
package( 'VDT_CACHE:Globus-RLS')
package( 'VDT_CACHE:Globus-Core')
package( 'VDT_CACHE:Globus-Condor-Setup' )
package( 'VDT_CACHE:PyGlobus' )
package( 'VDT_CACHE:PyGlobusURLCopy' )
package( 'VDT_CACHE:Pegasus' )
package( 'VDT_CACHE:VOMS-Client' )
package( 'VDT_CACHE:Globus-WS' )
package( 'VDT_CACHE:Tomcat-5.5' )
package( 'VDT_CACHE:TclGlobus' )
package( 'Server-FixSSH' )
package( 'Server-RLS-Python-Client' )
package( 'Server-Cert-Util' )
package( 'Server-LSC-CA' )
The LIGO Data Grid
Grid middleware: LDG Client/Server
Virtual Data Toolkit (VDT) –Globus Toolkit –GSI and X.509 certificates –pyGlobus, tclGlobus, Pegasus, etc.
In-house packages
Glue: LSC Data Location & Pipeline Tools
LDR: LSC Lightweight Data Replication –GridFTP for moving data and files –Replica Location Service (RLS)
Onasys: LSC Online Analysis System
Users
500+ scientists in the LIGO Scientific Collaboration
200+ doing data analysis on the LIGO Data Grid
High throughput computing
Condor for most analyses
BOINC for LSC Analysis Software LAL, Matapps, DMT, etc.
Use of Condor by the LIGO Scientific Collaboration
Condor handles tens of millions of jobs per year running on the LDG, with up to 500k jobs per DAG.
Condor standard universe checkpointing is widely used, saving us from having to manage checkpointing ourselves.
At Caltech, 30 million jobs were processed using 22.8 million CPU hours on 1324 CPUs in the last 30 months.
For example, searching 1 year of data for GWs from the inspiral of binary neutron star and black hole systems takes ~2 million jobs and months to run on several thousand ~2.6 GHz nodes.
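As a rough cross-check of the Caltech figures, the sketch below computes the average job length and the cluster utilization they imply. It assumes an average month of 30.44 days and that all 1324 CPUs were available for the full 30 months; both are assumptions made here for illustration, not statements from the slide.

```python
# Back-of-the-envelope check of the Caltech numbers quoted on the slide.
JOBS = 30e6               # jobs processed
CPU_HOURS_USED = 22.8e6   # CPU hours consumed
CPUS = 1324               # CPUs in the cluster
MONTHS = 30               # reporting period

# Assumption (not from the slide): an average month is 30.44 days long.
HOURS_PER_MONTH = 30.44 * 24


def average_job_hours(cpu_hours, jobs):
    """Mean CPU time per job, in hours."""
    return cpu_hours / jobs


def utilization(cpu_hours, cpus, months):
    """Fraction of the available CPU hours that were actually used."""
    available = cpus * months * HOURS_PER_MONTH
    return cpu_hours / available


avg_minutes = average_job_hours(CPU_HOURS_USED, JOBS) * 60
util = utilization(CPU_HOURS_USED, CPUS, MONTHS)
print(f"average job length: {avg_minutes:.0f} minutes")
print(f"implied utilization: {util:.0%}")
```

Under these assumptions the average job runs for a bit under an hour and the cluster is busy roughly four fifths of the time, which is consistent with a high-throughput workload of many short jobs.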
Inspiral Analysis Pipeline
Multidetector pipeline
–Pipeline topology is the same for all inspiral searches: BNS, PBH, BBH, spinning BBH
–Different template/filtering code is used for different searches
–Can be used for LIGO-GEO and LIGO-Virgo analysis
Pipeline description
–Inspiral search is run on each IFO
–Look for triggers coincident in time and mass between detectors
–Follow up with signal-based vetoes
–Perform coherent analysis of surviving triggers
–Follow up candidate events
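The coincidence step (triggers coincident in time and mass between detectors) can be illustrated with a minimal sketch. The slide does not describe the actual test used by the LSC pipeline, so the `Trigger` fields, the use of chirp mass as the mass parameter, and the window sizes `dt` and `dm` below are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trigger:
    ifo: str         # detector name, e.g. "H1" or "L1" (hypothetical labels)
    end_time: float  # trigger time in seconds
    mchirp: float    # mass parameter; chirp mass is an assumption here


def coincident(a, b, dt=0.02, dm=0.05):
    """True if two triggers come from different detectors and agree
    within the (hypothetical) time window dt and mass window dm."""
    return (
        a.ifo != b.ifo
        and abs(a.end_time - b.end_time) <= dt
        and abs(a.mchirp - b.mchirp) <= dm
    )


def find_coincidences(triggers, dt=0.02, dm=0.05):
    """Return all coincident pairs, scanning triggers in time order."""
    ordered = sorted(triggers, key=lambda t: t.end_time)
    pairs = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            # Triggers are time-sorted, so later ones are even further away.
            if b.end_time - a.end_time > dt:
                break
            if coincident(a, b, dt, dm):
                pairs.append((a, b))
    return pairs
```

A simple fixed window keeps the sketch short; a production search would tune the windows per detector pair and per search, and the surviving pairs would then pass to the signal-based vetoes and coherent analysis stages listed above.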
Example of a LIGO Inspiral DAG
Example DAG Within DAG
DAG: finds data and generates Fourier transforms used by other DAGs
DAG: outputs spectra of power supply data
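A DAG-within-DAG layout like this one can be sketched as a small Python helper that writes an outer DAGMan input file. The node names and `.dag` file names are hypothetical, and the ordering of the two inner DAGs is an assumption based on the Fourier transforms being "used by other DAGs". The `SUBDAG EXTERNAL` and `PARENT ... CHILD` keywords are current HTCondor DAGMan syntax, which may differ from what was available when these slides were written.

```python
def outer_dag_text():
    """Build the text of an outer DAG whose nodes are themselves DAGs."""
    lines = [
        "# Outer DAG: each node is an inner DAG (file names are hypothetical).",
        "SUBDAG EXTERNAL fourier_transforms fourier_transforms.dag",
        "SUBDAG EXTERNAL power_spectra power_spectra.dag",
        "# Assumed ordering: the spectra DAG consumes the Fourier transforms.",
        "PARENT fourier_transforms CHILD power_spectra",
    ]
    return "\n".join(lines) + "\n"


def write_outer_dag(path):
    """Write the outer DAG file so it can be submitted with condor_submit_dag."""
    with open(path, "w") as handle:
        handle.write(outer_dag_text())
```

Generating the file from code rather than by hand is a design choice that helps when the number of inner DAGs depends on how much data was found.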
Example Test DAG for Condor Regression Testing
Makes fake data for each detector (the same code is used in Monte Carlo simulations).
Runs the fully coherent multi-detector continuous-wave search code, which is used to search for GWs from rotating neutron stars.
Compares the output with reference data.
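The final comparison step of such a regression test can be sketched as follows. The slide does not say how the comparison is done, so the function name `matches_reference`, the representation of the output as a list of floats, and the tolerance values are assumptions made for illustration.

```python
import math


def matches_reference(output, reference, rel_tol=1e-6, abs_tol=0.0):
    """Return True if every output value agrees with the reference value
    at the same position, within the given tolerances."""
    if len(output) != len(reference):
        return False
    return all(
        math.isclose(o, r, rel_tol=rel_tol, abs_tol=abs_tol)
        for o, r in zip(output, reference)
    )
```

Comparing within a tolerance rather than demanding bit-for-bit equality is a design choice: it lets the test pass across platforms and compilers whose floating-point results differ in the last digits, while still catching real changes in the search output.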
The LIGO/Condor Success Story
Condor handles most of our searches and is vital to the success of LIGO.
Condor and LIGO hold a biweekly telecon to discuss issues and enhancements.
In approximately the past year, Condor has successfully:
–enhanced scaling to support non-trivial O(1M)-node DAGs;
–implemented an option to prioritize nodes, e.g., for depth-first traversal of DAGs;
–added categories to limit the number of resource-intensive nodes in a DAG;
–added handling of priorities and staggered start of jobs.
Condor is working on a list of further enhancements, e.g.:
–speeding up the start of DAGs by O(100x);
–automating the finding of rescue DAGs, e.g., when there are DAGs within DAGs;
–merging of sub-DAGs;
–adding standard universe support on RHEL/CentOS and Debian.
Condor is compatible with BOINC and can run backfill jobs on the LDG clusters when there are idle cycles.
For the future: our offline/online high throughput computing needs will continue to grow. Online jobs are moving towards low latency; we need to think about computing needs for real-time detection when Advanced LIGO comes online.
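The category and priority features mentioned on this slide can be sketched with a helper that generates a throttled DAGMan input file. The node names, the `heavy` category name, and the simple linear chain of dependencies are hypothetical. The `JOB`, `CATEGORY`, `MAXJOBS`, `PRIORITY` and `PARENT ... CHILD` keywords are current HTCondor DAGMan syntax, in which a larger priority value is favored.

```python
def throttled_dag_text(node_names, category="heavy", max_running=10):
    """Build DAG text for a linear chain of nodes, all placed in one
    category whose concurrently running jobs are capped by MAXJOBS."""
    lines = []
    for depth, name in enumerate(node_names):
        lines.append(f"JOB {name} {name}.sub")
        lines.append(f"CATEGORY {name} {category}")
        # Deeper nodes get a larger priority value, which nudges DAGMan
        # toward finishing started branches first (roughly depth-first).
        lines.append(f"PRIORITY {name} {depth}")
    for parent, child in zip(node_names, node_names[1:]):
        lines.append(f"PARENT {parent} CHILD {child}")
    lines.append(f"MAXJOBS {category} {max_running}")
    return "\n".join(lines) + "\n"
```

For example, `throttled_dag_text(["a", "b", "c"], max_running=2)` yields a three-node chain in which at most two `heavy` jobs may run at once. In a real DAG with many branches, only the resource-intensive nodes would be placed in the capped category.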