LIGO Data Analysis (G060539-00-M) Computing Breakout Session 1
1. LIGO Data Analysis Systems: Data Management and Analysis
2. LIGO/LSC Analysis Software: Case Study of Inspiral Analysis
3. Open Science Grid (OSG) Activities by LIGO and LSC
LIGO NSF Review, October 23-25, 2006
Stuart Anderson, Kent Blackburn, Duncan Brown
LIGO Laboratory, California Institute of Technology

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 2
Outline
LIGO Data Management and Distribution (S. Anderson)
  » Cluster access to data
  » Storage technology and dataflow
  » Size of problem (rates and volume)
  » Data and metadata replication
  » Hardware and software updates during S5
LIGO Data Analysis Software (D. Brown)
  » Software working groups
  » Example analysis pipeline
LIGO and the OSG (K. Blackburn)
  » Overview
  » Participation
  » Applications
  » Challenges

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 3
Data Management and Distribution (LHO, LLO, CIT, MIT)
  » Stuart Anderson (Group Leader)
  » Greg Mendell (LHO Scientist)
  » Ben Johnson (LHO Systems Admin.)
  » Igor Yakushin (LLO Scientist)
  » Dwayne Giardina (LLO Systems Admin.)
  » Dan Kozak (CIT Systems Admin.)
  » Erik Espinoza (CIT Systems Admin.)
  » Phil Ehrens (CIT Systems Admin.)
  » Fred Donovan (MIT Systems Admin.)

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 4
LIGO Data Analysis Systems
LIGO Laboratory operated systems:
4 production systems
  » 2 on-site: observatory archive and real-time processing (LHO/LLO)
    – 1100 GHz P4-Xeon / 100 TB disk / 6 STK9940B tape drives / 700-slot robot each
  » 2 off-site: central archive and multi-IFO processing (CIT/MIT)
    – 2900 GHz Opteron-core / 230 TB disk / 10 STK9940B tape drives / 6000-slot robot (CIT)
    – 450 GHz P4-Xeon / 40 TB disk (MIT)
1 development system
  » Runs daily builds of LDAS from CVS
1 test system
  » Pre-release testing
  » Integration testing of LSC/LIGO software

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 5
Mass Storage Software
SAM-QFS is used to manage the IFO data and intermediate data products from cluster analysis:
  » Shared filesystem with metadata on shared disk for the initial frame filesystem, to minimize subsystem interface issues (security and reliability)
  » Scalable to the size of LIGO with multiple read/write servers; however, a single tape drive server per filesystem
  » Free for research use
  » Presents a fully POSIX-compliant filesystem for ease and portability of application development (see the sketch after this slide)
  » Portable data format (GNU tar)
  » Disaster recovery
  » Future migration of data if the software product is terminated
  » Seamless migration of physical tapes between the Observatories and the central archive at Caltech
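Because the archive is exposed as an ordinary POSIX filesystem, analysis code can locate frame files with nothing more than standard file I/O. A minimal Python sketch, assuming a hypothetical SAM-QFS mount point and file pattern (the real archive paths and naming conventions differ):

```python
import os
import glob

# Hypothetical SAM-QFS mount point; real archive paths differ.
FRAME_ROOT = "/archive/frames/S5/L0/LHO"

# Frame files are plain files on a POSIX filesystem, so ordinary
# directory listing and stat() calls are all that is needed.
frames = sorted(glob.glob(os.path.join(FRAME_ROOT, "H-R-*.gwf")))

total_bytes = sum(os.path.getsize(f) for f in frames)
print("found %d frame files, %.1f GB total" % (len(frames), total_bytes / 1e9))
```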

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 6
LIGO Data Management
  » IFO data is stored in an international GW self-describing binary format (Frame).
  » Duplicate servers write independent copies to 2 different QFS filesystems in different buildings for redundancy and quality checks.
  » These filesystems are the interface between the data acquisition and archiving/analysis systems (no TCP/IP).
  » Reduced data sets are created in near real-time and distributed over the WAN using GridFTP.
  » Two copies of the full IFO data are written to SAM-QFS tape and one read-only copy is sent to the central archive at Caltech.
  » Local cluster analysis jobs have access to IFO data with a latency of ~1 minute, reduced data is available off-site with a latency of ~1 hour, and off-site raw data in ~1 week.

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 7
LIGO Data Products (dual-copy tape utilization, STK9940B)
For each data product the table gives the data rate (MByte/s), one-year volume (TByte), compression ratio, one-year tape count, and on-site look-back period.
LHO:
  » Level 0 raw on tape (look-back ~1 month*)
  » Level 1 RDS in files (look-back ~most of S5*)
  » Level 3 RDS in files (look-back all of S5)
  » Level 4 RDS in files (look-back all of S5)
  » h(t) (look-back all of S5)
  » SFTs (look-back all of S5)
LLO:
  » Level 0 raw on tape (look-back ~2 months*)
  » Level 1 RDS in files (look-back ~most of S5*)
  » Level 3 RDS in files (look-back all of S5)
  » Level 4 RDS in files (look-back all of S5)
  » h(t) (look-back all of S5)
  » SFTs (look-back all of S5)
Totals: all science runs held at CIT.
* The remainder of these data are off-line in tape cabinets on-site and available on demand, as well as being on-line at CIT.

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 8
LIGO Data Archive
                                  CIT         LHO        LLO
  Frame files                     29 M        14 M       25 M
  Frame size                      950 TB      610 TB     270 TB
  Cluster home files              39 M        25 M       11 M
  Cluster home size               9.4 TB      6.2 TB     2.8 TB
  Tape volumes (online/offline)   4990/370    610/       /550

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 9
LIGO Data Distribution
Reduced data products are distributed over the WAN using the LIGO Data Replicator (LDR), a grid tool built on top of the Globus Toolkit to manage both data and metadata transfers:
  » Parallel data transfers are accomplished using GridFTP (see the sketch below).
  » Metadata uses the Replica Location Service (RLS) to hold mappings of filenames to paths/URLs at a given cluster location.
GEO calibrated strain data are replicated via LDR from Hannover, Germany to Caltech.
Data are re-distributed from CIT to Tier-2 and Tier-3 centers via LDR.
Discussions have started with VIRGO regarding joint data exchange and possible interoperability problems between US and EU grid middleware.
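Under the hood, LDR's bulk transfers go through GridFTP. Below is a minimal sketch of a single parallel-stream transfer, driven from Python with the stock Globus Toolkit globus-url-copy client; the host names and paths are placeholders, not real LIGO endpoints:

```python
import subprocess

# Placeholder endpoints; a real LDR deployment resolves these from the
# Replica Location Service rather than hard-coding them.
SRC = "gsiftp://ldr.example.org/frames/S5/L1/H-RDS_R_L1-815155213-64.gwf"
DST = "file:///data/frames/S5/L1/H-RDS_R_L1-815155213-64.gwf"

# globus-url-copy is the standard Globus GridFTP client:
#   -p 8   use 8 parallel TCP streams
#   -vb    report transfer performance
subprocess.check_call(["globus-url-copy", "-vb", "-p", "8", SRC, DST])
```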

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 10
Data Transfer Performance
Observatories to Caltech:
  » LHO upgraded its Esnet connection to GigE
  » Esnet upgraded its backbone (SEA/SNV) to OC-192
  » Transferring frame data at MByte/s
  » LLO upgraded to a GigE connection to LSU
  » LSU's connection to I2 is OC-3
  » 5-10 MByte/s
CIT to Tier-2 centers:
  » MIT 7 MByte/s
  » PSU 8 MByte/s
  » UWM 20 MByte/s

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 11
Hardware Updates During S5
  » Approximately doubled the power and HVAC available for clusters.
  » Upgraded the LLO/LHO clusters from 70/140 to 210/210 dual-Xeon nodes by relocating the CIT cluster.
  » Installed a new CIT cluster (329 dual-socket dual-core Opteron nodes).
  » Adding additional grid-related servers (Linux) as the grid model develops, e.g., OSG.
  » Increasing SAM-QFS cluster home directory space as needed.
  » Replacing the T3 RAID system (5.5 years old).
  » Using shelf storage for tapes at the Observatories and Caltech.
  » Planned upgrade of SPARC/Solaris servers to Opteron/Solaris.

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 12
Software Updates During S5
Continuing with Solaris 10 / Fedora Core 4 for S5.
  » Still struggling to find the right compromise between the latest support and stability in Linux for post-S5.
New CIT and UWM clusters are Opteron based, running 64-bit FC x86_64:
  » VDT 64-bit release on 2 Aug 2006 (1.3.11)
  » Condor 64-bit released on 16 Oct 2006 (6.8.2)
  » Current LDAS development branch is running on 64-bit
Porting LDAS software to Solaris 10 x86 to allow upgrading of floating-point-intensive servers from SPARC to Opteron.

LIGO Data Analysis (G060539-00-M)
LIGO/LSC Analysis Software: Case Study of Inspiral Analysis

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 14
Software Working Groups
The LIGO software is managed by the following teams:
  » Control and Data Systems (CDS)
    – Real-time computer systems (VxWorks) and data distribution (Reflective Memory)
    – Writing IFO data to Frames (QFS)
  » LIGO Data Analysis Software (LDAS)
    – Frame I/O library (FrameCPP)
    – Generating and distributing reduced data sets (RDS)
    – Managing cluster jobs (Condor)
    – Managing metadata catalogues (segment database and Q-replication)
  » Data Analysis Software Working Group (DASWG)
    – LSC-wide algorithm libraries (LAL, LALapps, Matapps, Root)
    – LIGO Data Replicator (LDR)
    – Online Analysis System (Onasys)
    – Detector Monitoring Tool (DMT)
    – LIGOtools/LDG software bundles
  » OSG migration team
    – Working to port example LIGO pipelines to OSG using VDS
    – Uses BOINC to harness home computers (currently ~50k machines)

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 15
The Goal
There are 4 main search groups analyzing LIGO data using these tools for astrophysical publications:
  » Inspiral (coalescing stellar binary systems, e.g., binary neutron stars)
  » Stochastic (GW equivalent of the microwave background)
  » Periodic (pulsars and other nearly periodic oscillators, e.g., LMXBs)
  » Burst (unmodeled events, e.g., supernova explosions)
To highlight the various software components necessary to extract astrophysical statements from LIGO IFO data, consider the example of an inspiral analysis.

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 16
Overview of Inspiral Analysis
LSC search for compact binary coalescence:
  » Binary neutron stars
  » Binary black holes
  » Spinning binary black holes
  » Primordial black holes
Two components: online and offline analysis
  » Online single-detector analysis for detector characterization
  » Offline analysis for multi-detector analysis, pipeline tuning, Monte Carlo simulations, etc.
  » More of the pipeline is migrating to online running as we gain a better understanding of analysis tuning

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 17
Elements of Inspiral Analysis
Programs which perform the individual aspects of the analysis:
  » e.g., lalapps_tmpltbank to generate the matched-filter template bank
  » lalapps_inspiral to filter the detector data through the bank, find triggers, etc. (see the sketch below)
  » Stand-alone C code written by scientists, which can be run either on desktops or on clusters as part of a large analysis pipeline
Infrastructure which allows the construction of pipelines:
  » Grid/LSC User Environment (GLUE): describes elements of workflows and allows construction of complex analysis pipelines
  » Can generate pipelines for the LSC Data Grid or the Open Science Grid
Pipeline construction scripts:
  » Written by scientists to implement the desired analysis using the various components
  » Use GLUE to create the files needed to execute the analysis
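Because each stage is just a command-line program, a single analysis step can be exercised by hand before it is wired into a pipeline. A minimal Python sketch of invoking the template-bank stage; the executable name is real, but the specific option names and values are illustrative assumptions about the lalapps interface of that era, not a verified argument list:

```python
import subprocess

# Illustrative invocation of the template-bank stage. The options below are
# assumed for this sketch and would need to match the installed lalapps version.
cmd = [
    "lalapps_tmpltbank",
    "--gps-start-time", "815155213",   # hypothetical S5 segment
    "--gps-end-time",   "815157261",
    "--frame-cache",    "H1-CACHE-815155213-2048.lcf",
    "--channel-name",   "H1:LSC-DARM_ERR",
    "--minimum-mass",   "1.0",
    "--maximum-mass",   "3.0",
]
subprocess.check_call(cmd)
```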

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 18
Inspiral Analysis Pipeline
Multi-detector pipeline:
  » The pipeline topology is the same for all inspiral searches
  » BNS, PBH, BBH, spinning BBH
  » Different template/filtering code is used for the different searches
  » Can be used for LIGO-GEO and LIGO-VIRGO analysis
Pipeline description:
  » Inspiral search run on each IFO
  » Look for triggers coincident in time and mass between detectors
  » Follow up with signal-based vetoes
  » Perform a coherent analysis of surviving triggers
  » Follow up candidate events

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 19
Building Pipelines with GLUE
The scientist writes a pipeline generation script to implement the desired topology (see the sketch below).
Input to the pipeline script:
  » Metadata describing when the detectors were in science mode
  » Parameters of the analysis (e.g., mass range, thresholds, coincidence windows, etc.)
Output is a workflow that can be run on LSC clusters or the OSG.
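A minimal sketch of such a generation script, using the glue.pipeline module to chain a template-bank job into an inspiral job. The class and method names reflect my understanding of the GLUE API of that period and the file names are placeholders:

```python
from glue import pipeline

# The DAG object collects jobs and writes the Condor DAGMan input file.
dag = pipeline.CondorDAG("inspiral_pipeline.log")
dag.set_dag_file("inspiral_pipeline")

# One CondorDAGJob per executable; "standard" is the Condor universe.
bank_job = pipeline.CondorDAGJob("standard", "lalapps_tmpltbank")
bank_job.set_sub_file("tmpltbank.sub")
bank_job.set_stdout_file("logs/tmpltbank-$(cluster).out")
bank_job.set_stderr_file("logs/tmpltbank-$(cluster).err")

insp_job = pipeline.CondorDAGJob("standard", "lalapps_inspiral")
insp_job.set_sub_file("inspiral.sub")
insp_job.set_stdout_file("logs/inspiral-$(cluster).out")
insp_job.set_stderr_file("logs/inspiral-$(cluster).err")

# Nodes are instances of the jobs; the inspiral node depends on the bank node.
bank_node = pipeline.CondorDAGNode(bank_job)
insp_node = pipeline.CondorDAGNode(insp_job)
insp_node.add_parent(bank_node)

dag.add_node(bank_node)
dag.add_node(insp_node)

# Write the Condor submit files and the DAG file for condor_submit_dag.
dag.write_sub_files()
dag.write_dag()
```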

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 20
Executing Analysis Pipeline
The workflow that is written out contains the instructions Condor needs to execute the analysis on a cluster (see the sketch below).
The figure shows a small subset of an inspiral Directed Acyclic Graph (DAG).
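For orientation, here is a sketch of what such a DAGMan input file looks like and how it is handed to Condor; the node names and submit files are hypothetical, and in practice the DAG text is generated by GLUE rather than written by hand:

```python
# Hypothetical fragment of the DAGMan file that GLUE writes out.
# Each JOB line names a node and its Condor submit file; PARENT/CHILD
# lines encode the dependency edges of the workflow.
DAG_FRAGMENT = """\
JOB tmpltbank_H1_0 tmpltbank.sub
JOB inspiral_H1_0  inspiral.sub
JOB thinca_0       thinca.sub
PARENT tmpltbank_H1_0 CHILD inspiral_H1_0
PARENT inspiral_H1_0  CHILD thinca_0
"""

with open("example.dag", "w") as f:
    f.write(DAG_FRAGMENT)

# The DAG is then submitted with:  condor_submit_dag example.dag
```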

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 21
Detail of Analysis Pipeline
Close-up of a section of the pipeline:
  » "thinca" is the coincidence code
  » Time-slide analysis to estimate the background is built in
Condor handles all job scheduling, management, etc.:
  » Reports success or failure of the DAG to the user
  » Creates a "rescue DAG" if any jobs failed

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 22
Online Inspiral Analysis
The inspiral analysis runs online at the observatories for detector characterization:
  » Provides rapid feedback to operators and scientists in the control room
  » Inspiral "figure of merit," analysis of hardware injections, "loudest glitch" pages
Executes the analysis in near-real time:
  » The ONline Analysis SYStem (Onasys) launches inspiral DAGs when data is available
  » A subset of the inspiral analysis pipeline runs on the observatory cluster
  » The inspiral code analyzes the data and produces triggers in XML format (see the sketch below)
  » ~256 second latency for production of triggers
  » Triggers are summarized, displayed in the control room, and made available to the LSC for additional analysis
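The trigger files are LIGO_LW XML tables, so downstream tools can read them with the glue.ligolw library. A brief sketch, assuming the glue API of that era; the file name is a placeholder following no particular convention:

```python
from glue.ligolw import table, lsctables, utils

# Placeholder trigger file name; online triggers follow a site-specific
# naming convention.
xmldoc = utils.load_filename("H1-INSPIRAL-815155213-2048.xml")

# Pull out the single-detector inspiral trigger table and inspect a few rows.
triggers = table.get_table(xmldoc, lsctables.SnglInspiralTable.tableName)
for trig in triggers[:10]:
    print(trig.end_time, trig.mass1, trig.mass2, trig.snr)
```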

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 23
Analysis View of Online Inspiral Search

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 24
Infrastructure View of Online Inspiral Search
  » The framebuilder (Solaris), running the CDS real-time interface to the FrameCPP I/O library, writes detector data to disk
  » Frame file locations are discovered using the LDAS diskCache API
  » Detector state metadata is extracted from the frames, stored in a relational database, and replicated to the other observatory and CIT
  » Data locations are published into the LIGO Data Replicator (LDR)
  » Onasys periodically launches DAGs to analyze the data
  » Condor DAGMan manages the execution of the workflows on the cluster
  » Results are made available to users

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 25
Offline Inspiral Analysis
Scientists use GLUE and the pipeline scripts described previously to create Condor DAGs.
GLUE writes the necessary DAGMan files to launch jobs (the language of Condor).
The scientist runs the DAG under Condor on an LSC Data Grid cluster:
  » and comes back a few (sometimes many) days later to look at progress and/or results.
This is how our current scientific results for publications are obtained.
GLUE can also write "abstract DAG" (DAX) files which describe the workflow (the language of the OSG Virtual Data System).
The grid middleware "Pegasus" (Planning for Execution in Grids), developed by USC ISI, is used to turn the DAX into Condor-G DAGs for the OSG (see the sketch below).
Pegasus writes a Condor-G DAG with the necessary data staging and retrieval steps for running on non-LSC computing clusters.
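Planning a DAX for execution on grid resources is done with the pegasus-plan command. A sketch of driving it from Python, where the site handles, directories, and even the exact option spellings are assumptions for illustration rather than a verified invocation:

```python
import subprocess

# Assumed pegasus-plan invocation: the option names (--dax, --dir, --sites,
# --output, --submit) follow Pegasus conventions of that era but are not
# guaranteed to match a specific release; site handles are placeholders.
cmd = [
    "pegasus-plan",
    "--dax",    "inspiral_pipeline.dax",   # abstract workflow written by GLUE
    "--dir",    "planned/",                # where the concrete DAG is written
    "--sites",  "OSG_PSU,OSG_UCSD",        # hypothetical OSG execution sites
    "--output", "local",                   # bring results back to the local site
    "--submit",                            # hand the planned DAG to Condor-G
]
subprocess.check_call(cmd)
```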

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 26
Running Inspiral Analysis on the OSG
Using Pegasus directly poses some problems for LSC scientists:
  » Knowledge of VDS and/or OSG issues is required
Solution: LIGO has developed an interface around Pegasus that takes the user's pipeline DAX, discovers the necessary OSG information, uses Pegasus to plan the DAG, and executes it using Condor-G.
The LIGO/OSG team is currently working with the Pegasus team and OSG sites to run the inspiral analysis on the OSG:
  » Currently dealing with issues such as available space for data, etc.
The current application used for OSG development is an inspiral software-injection Monte Carlo:
  » The same analysis used by LSC scientists on the LIGO Data Grid
  » Has been run successfully on several OSG sites

LIGO Data Analysis (G060539-00-M)
Open Science Grid (OSG) Activities by LIGO and LSC

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 28
OSG Organization
  » Kent Blackburn (Caltech) was recently appointed to the OSG Executive Board as co-Resources Manager with Paul Avery
  » Kent Blackburn is the LIGO liaison to OSG and participates in many OSG activity groups
  » Warren Anderson (UWM) was recently appointed LIGO Virtual Organization (VO) Council Member to the OSG
  » Murali Ramsunder (PSU) has provided the LIGO VO Support Center role
  » David Meyers was recently asked to take a lead role in the Integration Testbed activities and to provide leadership for the new Validation Testbed activity

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 29
OSG Production Grid
  » Evolved out of Grid3
  » Roughly 70 sites on the OSG Production Grid
    – UWM & PSU from LIGO
  » Roughly CPUs
    – UWM & PSU: CPUs
  » Used to support VO-specific application runs for simulations and data analysis
  » Current OSG release version
    – Preceded by and
    – Improving patch technology
    – 0.6.0 expected by March 2007

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 30
OSG Integration Testbed
  » Roughly 20 sites currently in the OSG Integration Testbed
    – LIGO-CIT & PSU from the LIGO Scientific Collaboration
  » Almost 5000 CPUs (some shared with production sites)
  » Used as a testbed to validate new OSG software stacks and develop new VO applications
  » Adding a Validation Testbed to validate the software stack at the level of the grid itself
    – Initially will be composed of three sites
    – David Meyers (LIGO/Caltech) has been asked to lead this effort

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 31
Virtual Organization
  » PSU hosts LIGO's VOMS server
  » As of October 16th, 2006 there were 147 OSG:LIGO VO certificates registered
  » Two production sites supported:
    – UW-Milwaukee
    – PSU
    – LIGO Lab is considering setting up another site at MIT
  » Two LIGO Integration Testbed (ITB) sites:
    – Caltech
    – PSU

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 32
LIGO's ITB Cluster
  » Recently moved into the main computer room at Caltech
  » Eight dual-CPU nodes
    – All are 32-bit hardware
    – Plan to add two more nodes
  » 1.6 TB of disk space
  » Used to test new releases of the OSG software stack (and to propose future changes supporting LIGO's OSG application development)
  » Being used to develop LIGO applications before moving them to the OSG Production Grid
  » Will also be part of the new Validation Testbed (VTB)

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 33
LIGO Applications on OSG
Focusing on binary inspiral searches:
  » Some support from two Caltech software developers, a Caltech post-doc, and the Pegasus developers
First successful test run on October 1st, 2005 on LIGO's Integration Testbed cluster using PIPE, a test DAG of 940 DAG nodes using 5.1 GB of data:
  » Two weeks later, succeeded in running on LIGO's OSG production sites
  » Demonstrated the ability to run this workflow at 10 different OSG production sites
This summer we began running a much larger binary inspiral workflow (HIPE) on a few larger OSG sites (PSU and UCSD):
  » Close to 700 gigabytes of data involved
  » Over 79,000 DAG nodes (two orders of magnitude larger; 40m-lab -> LIGO)
  » Run time is typically a week on these sites
  » Improvements to the Pegasus workflow planner are needed
    – Implemented a scaled-down test version of the workflow called Nano-HIPE with 356 DAG nodes
  » Approaching the scale necessary for conducting science
    – Need to expand the template space severalfold to reach science
Began prototype activities to understand burst analysis workflows on the OSG:
  » The primary difference is that this group is not working within the abstract DAG framework
  » Based on compiled Matlab, so the Matlab runtime libraries need to be managed as part of the workflows
  » Have only run this workflow on the Caltech Integration Testbed site

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 34
LIGO Jobs on the OSG

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 35
OSG Challenges
  » Ease of porting LIGO data analysis applications from the LDG to the OSG
  » Efficiency of utilization of other OSG VO sites
  » More frequent releases (or updates) of the software stack to better support VO and application needs
  » Down-the-road issues:
    – Now that it is funded...
      * Organizational challenges
      * Staffing challenges (LIGO staff in place!)
    – Storage management
    – Interoperability with TeraGrid and EGEE as well as the LDG

LIGO Data Analysis (G060539-00-M) Computing Breakout Session 36
OSG Summary
LIGO and the LSC have contributed significantly to the OSG process, providing personnel at all levels, from oversight to administrators, developers, and users.
OSG has only released two (plus patched) production software stacks, 0.2.0 and 0.4.0:
  » LIGO has worked with individual sites (mostly within the CMS VO) to get the LIGO VO supported during the release interim
LIGO has now developed an application based on the binary inspiral search code for running on the OSG:
  » Requires patches beyond the official OSG software stack
  » Data transmission times significantly mask the CPU benefits for the LIGO application
  » Storage requirements greatly constrain the sites which can support the LIGO binary inspiral application until Storage Elements (SE) are an integral part of OSG sites