1
April 19, 2005 Scott Koranda 1 The LIGO Scientific Collaboration Data Grid Scott Koranda UW-Milwaukee skoranda@uwm.edu On behalf of the LIGO Scientific Collaboration GridChem Workshop April 19, 2005 Urbana
2
April 19, 2005 Scott Koranda 2 Laser Interferometer Gravitational-wave Observatory LIGO is opening a new frontier in observational astrophysics Detect & use gravitational waves (GW) to observe the Universe, provide a more complete picture of the Cosmos. Complementary to radio/infrared/optical/X-ray/γ-ray astronomy EM emitters not likely to be strong GW emitters & vice versa Detect & observe cataclysmic events leading to death of stars, birth of neutron stars & black holes Study Einstein’s theory of general relativity in the strong-field regime near massive compact objects, where GW are produced LIGO is now observing, acquiring science data and in full analysis production
3
April 19, 2005 Scott Koranda 3 Computation & Data Intensive Revealing the full science content of LIGO data is a computationally and data intensive challenge LIGO interferometers generate ~10 MB/s or almost 1 TB/day Several classes of data analysis challenges require large-scale computational resources In general for analysis:
1. FFT data segment
2. Choose template (based on physical parameters) and filter
3. Repeat again and again and again…
(Slide images: Sun E450; LIGO data file “frame” builder)
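The loop above (FFT a data segment, correlate it against a template chosen from a bank of physical parameters, repeat) is the core of a matched-filter search. The sketch below is a minimal NumPy illustration of that idea only; it is not the actual LSC analysis code, it skips noise whitening entirely, and the template bank and function name are made-up stand-ins.

import numpy as np

def matched_filter_snr(strain, template):
    """Correlate one data segment against one template in the frequency domain."""
    n = len(strain)
    data_f = np.fft.rfft(strain)                        # 1. FFT the data segment
    tmpl_f = np.fft.rfft(template, n)                   #    FFT of the (zero-padded) template
    corr = np.fft.irfft(data_f * np.conj(tmpl_f), n)    # 2. filter: correlate, back to time domain
    norm = np.sqrt(np.sum(np.abs(tmpl_f) ** 2))
    return np.abs(corr) / max(norm, 1e-30)

# 3. Repeat over a bank of templates covering the physical parameter space.
rate = 16384                                            # Hz, LIGO strain sample rate
segment = np.random.normal(size=16 * rate)              # stand-in for 16 s of strain data
bank = [np.sin(2 * np.pi * f * np.arange(rate) / rate) for f in (100.0, 150.0, 200.0)]
for i, tmpl in enumerate(bank):
    print(f"template {i}: peak statistic {matched_filter_snr(segment, tmpl).max():.2f}")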
4
April 19, 2005 Scott Koranda 4 Realizing the full science potential of LIGO Search for gravitational wave (GW) analogs of electromagnetic (EM) pulsars GW sources not likely to have EM counterparts Fast (ms) EM pulsars are stable, old neutron stars GW emission likely to come shortly after birth of a rapidly rotating (deformed, hot) NS GW sky is unknown Searches will need to survey a large parameter space All-sky blind search for periodic sources requires > 10^15 FLOPS
5
April 19, 2005 Scott Koranda 5 LIGO Laboratory Hardware for Analysis

Site    CPU (GHz)   Disk (TB)   Tape (TB)   Network
LHO     763         20          140         OC3
LLO     391         10          140         OC3
CIT     1150        30          1200        GigE
MIT     244         20          -           FastE
TOTAL   2548        80          1480
6
April 19, 2005 Scott Koranda 6 LIGO Scientific Collaboration (LSC) Tier 2 Sites 936 GHz CPU 34 TB RAID 5 storage OC-12 (622 Mbps) to Abilene
7
April 19, 2005 Scott Koranda 7 LIGO Scientific Collaboration (LSC) Tier 2 Sites 296 GHz CPU 64 TB storage (commodity IDE) for data OC-12 (622 Mbps) to Abilene NEWSFLASH: awarded $1.4M (NSF) + $0.3M (UW) for 3 TFlop cluster with 365 TB storage
8
April 19, 2005 Scott Koranda 8 GEO 600 Analysis Sites
272 GHz CPU, 18 TB storage (RAID 5 and commodity IDE), GigE to SuperJANET (to GEANT to Abilene)
670 GHz CPU, 40 TB storage (commodity IDE), Fast Ethernet to G-WiN (to GEANT to Abilene)
508 GHz CPU, 18 TB storage (RAID 5 and commodity IDE), 10 Gbps to SuperJANET (to GEANT to Abilene)
9
April 19, 2005 Scott Koranda 9 LSC DataGrid LSC Data Grid: 6 US sites + 3 EU sites (Birmingham, Cardiff, AEI/Golm) ~5400 GHz CPU
10
April 19, 2005 Scott Koranda 10 LSC: Partner in GriPhyN GriPhyN: Grid Physics Network “GriPhyN research will enable the development of Petascale Virtual Data Grids (PVDGs)” US CMS US ATLAS SDSS LIGO Scientific Collaboration Computer Scientists (Globus, Condor) Virtual Data Toolkit (VDT) “one stop shopping for DataGrid software” Virtual Data System (VDS)
11
April 19, 2005 Scott Koranda 11 LSC: Partner in iVDGL/Grid3+ Persistent Grid! 4184 CPUs (and growing…) Virtual Organizations (VOs) and applications US CMS US ATLAS SDSS LSC BTeV Other applications SnB: Bio-molecular analysis GADU/Gnare: Genome analysis
12
April 19, 2005 Scott Koranda 12 LSC: Partner in Open Science Grid
13
April 19, 2005 Scott Koranda 13 LSC: Partner in Open Science Grid
14
April 19, 2005 Scott Koranda 14 LSC Data Grid Focus in 3 areas 1. Development of Data Grid itself 2. LIGO & GEO interferometer data replication 3. Data analysis infrastructure
15
April 19, 2005 Scott Koranda 15 LSC Data Grid Focus in 3 areas 1. Development of Data Grid itself 2. LIGO & GEO interferometer data replication 3. Data analysis infrastructure
16
April 19, 2005 Scott Koranda 16 Building the LSC DataGrid Deploy server and client LSC DataGrid packages
17
April 19, 2005 Scott Koranda 17 Building the LSC DataGrid LSC DataGrid Client/Server packages Built on top of the VDT Packaged using Pacman from BU/ATLAS/iVDGL
cross platform packaging solution: Fedora Core 3,4 / Debian 3.1 (Sarge) / Solaris 9,10
well suited for “Grid” middleware
more flexible post- and pre-installation configuration
VO level “control” over software stack
18
April 19, 2005 Scott Koranda 18 Building the LSC DataGrid
description = 'LSC DataGrid Client 3.0.0'
url = 'http://www.lsc-group.phys.uwm.edu/lscdatagrid/'
depends = [ 'LSC-DataGrid-Client-Environment',
            'http://www.cs.wisc.edu/vdt/vdt_121_cache:Globus',
            'http://www.cs.wisc.edu/vdt/vdt_121_cache:DOE-EDG-Certificates',
            'http://www.cs.wisc.edu/vdt/vdt_121_cache:Condor',
            'http://www.cs.wisc.edu/vdt/vdt_121_cache:GSIOpenSSH',
            'http://www.cs.wisc.edu/vdt/vdt_121_cache:MyProxy',
            'http://www.cs.wisc.edu/vdt/vdt_121_cache:PyGlobus',
            'http://www.cs.wisc.edu/vdt/vdt_121_cache:UberFTP',
            'http://www.cs.wisc.edu/vdt/vdt_121_cache:Globus-RLS',
            'LSCdataFind',
            'LSCcertUtil',
            'LSC-VDS' ]
19
April 19, 2005 Scott Koranda 19 DOEGrids Certificate Authority LSC, through its partnership in large Grid projects, uses the DOEGrids Certificate Authority to handle the details of issuing digital certificates
20
April 19, 2005 Scott Koranda 20 LSC Data Grid Focus in 3 areas 1. Development of Data Grid itself 2. LIGO & GEO interferometer data replication 3. Data analysis infrastructure
21
April 19, 2005 Scott Koranda 21 LIGO/GEO Data Replication Replicate or copy data sets to analysis sites Quickly Securely Robustly But also provide an infrastructure for answering What data exists? Where does it exist? How do I get it? Answers needed for people and for analysis codes
22
April 19, 2005 Scott Koranda 22 Lightweight Data Replicator (LDR) Tie together 3 basic Grid services
1. Metadata Service: info about files such as size, md5, GPS time, ...; sets of interesting metadata propagate; answers the question “What files or data are available?”
2. Globus Replica Location Service (RLS): catalog service that maps filenames to URLs and also maps filenames to sites; answers the question “Where are the files?”
3. GridFTP Service: server and a customized, tightly integrated client, used to actually replicate files from site to site; answers the question “How do we move the files?”
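Pulled together, the three services answer those three questions in sequence. The sketch below shows only the shape of that composition; the three functions are hypothetical stand-ins for the metadata catalog, the Globus RLS and a GridFTP client, not the actual LDR API.

def query_metadata_catalog(constraint):
    """'What files are available?': logical filenames matching metadata attributes."""
    raise NotImplementedError("stand-in for the LDR metadata service")

def query_rls(logical_filename):
    """'Where are the files?': map a logical filename to gsiftp:// URLs via RLS."""
    raise NotImplementedError("stand-in for a Globus RLS lookup")

def gridftp_get(url, local_path):
    """'How do we move the files?': fetch one replica with GridFTP."""
    raise NotImplementedError("stand-in for the GridFTP client")

def replicate(constraint, local_dir):
    # For every file matching the metadata query, find a site holding a copy and pull it.
    for lfn in query_metadata_catalog(constraint):
        urls = query_rls(lfn)
        if urls:
            gridftp_get(urls[0], f"{local_dir}/{lfn}")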
23
TO30269-00-Z LDR data flow across sites (LHO, CIT, MIT, LLO, UWM, PSU, AEI):
“Publish” data into the Metadata Catalog at LHO: H-RDS_R_L3-752653616-16.gwf, 1950187 bytes, frame type RDS_R_L3, run tag S3, Locked, …
“Publish” data into the Metadata Catalog at LLO: L-RDS_R_L3-752653616-16.gwf, 983971 bytes, frame type RDS_R_L3, run tag S3, Locked, …
What data do we want? Ask the metadata catalog. Collection: Instrument = ‘H’ AND frameType = ‘RDS_R_L3’ AND runTag = ‘S3’
Where can we get it? Ask the URL catalog: H-RDS_R_L3-752653616-16.gwf is available at LHO
Local Replica Catalog at LHO: H-RDS_R_L3-752653616-16.gwf → gsiftp://ldas.ligo-wa.caltech.edu:15000/samrds/S3/L3/LHO/H-RDS_R_L3-7526/H-RDS_R_L3-752653616-16.gwf
Local Replica Catalog at LLO: L-RDS_R_L3-752653616-16.gwf → gsiftp://ldas.ligo-la.caltech.edu:15000/samrds/S3/L3/LLO/L-RDS_R_L3-7526/L-RDS_R_L3-752653616-16.gwf
The Local Replica Catalogs announce “I have URLs for files…” to the URL Catalog, which can then answer: What is the URL for H-RDS_R_L3-752653616-16.gwf? gsiftp://ldas.ligo-wa.caltech.edu:15000/samrds/S3/L3/LHO/H-RDS_R_L3-7526/H-RDS_R_L3-752653616-16.gwf
24
April 19, 2005 Scott Koranda 24 Lightweight Data Replicator (LDR) Python: glue to hold it all together and make it robust PyGlobus: Python API for the Globus toolkit (3.2.x), from Keith Jackson’s group at LBL LIGO/GEO S4 science run completed in March: replicated over 30 TB in 30 days, mean time between failure now one month (outside of CIT), over 30 million files in the LDR network, now pushing into Python performance limits
25
April 19, 2005 Scott Koranda 25 Lightweight Data Replicator (LDR) Also provide infrastructure for data discovery Clients query for URLs based on metadata
LSCdataFind --observatory H --type R --gps-start-time 71402420 --gps-end-time 714024340 --url-type gsiftp
gsiftp://dataserver.phys.uwm.edu:15000/data/gsiftp_root/cluster_storage/data/s001/S1/R/H/714023808-714029599/H-R-714024224-16.gwf
gsiftp://dataserver.phys.uwm.edu:15000/data/gsiftp_root/cluster_storage/data/s001/S1/R/H/714023808-714029599/H-R-714024240-16.gwf
http://www.lsc-group.phys.uwm.edu/LDR
26
April 19, 2005 Scott Koranda 26 LSC Data Grid Focus in 3 areas 1. Development of Data Grid itself 2. LIGO & GEO interferometer data replication 3. Data analysis infrastructure
27
April 19, 2005 Scott Koranda 27 Growing LSC Grid Users We have a “recipe” for growing LSC DataGrid users and applications Start with developing applications on Linux desktops all current algorithms “pleasantly parallel” most applications written in ANSI C some Matlab (compiled)
28
April 19, 2005 Scott Koranda 28 Growing LSC Grid Users String together smaller, simple tools to create complex pipelines or workflows a la the UNIX small tools model output of one application is input of the next Eventually manage these complex workflows using Condor DAGman DAG = Directed Acyclic Graph describe parent-child relationships DAGman runs child jobs after successful parent jobs
29
April 19, 2005 Scott Koranda 29 Growing LSC Grid Users Workflow for inspiral search (neutron star, black hole, other compact objects) for a single 2048 sec stretch of data
30
April 19, 2005 Scott Koranda 30 Growing LSC Grid Users Develop tools to allow LSC scientists to easily create these workflows pipeline.py python module with useful classes and object-oriented data structures no GUI tools yet... applications evolve too quickly the “application” becomes a DAG generator that ties together many smaller applications
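As a rough picture of what “the application becomes a DAG generator” means, the sketch below writes a Condor DAGMan file out of JOB and PARENT/CHILD lines. It is not the actual pipeline.py module; the class, job names and submit files are invented for illustration.

class DAG:
    """Collect jobs and parent-child edges, then write a DAGman .dag file."""
    def __init__(self):
        self.jobs = []     # (job_name, submit_file)
        self.edges = []    # (parent_name, child_name)

    def add_job(self, name, submit_file):
        self.jobs.append((name, submit_file))

    def add_edge(self, parent, child):
        self.edges.append((parent, child))

    def write(self, path):
        with open(path, "w") as f:
            for name, sub in self.jobs:
                f.write(f"JOB {name} {sub}\n")
            for parent, child in self.edges:
                f.write(f"PARENT {parent} CHILD {child}\n")

# A toy two-step pipeline: DAGman runs veto_0 only after inspiral_0 succeeds.
dag = DAG()
dag.add_job("inspiral_0", "inspiral.sub")
dag.add_job("veto_0", "veto.sub")
dag.add_edge("inspiral_0", "veto_0")
dag.write("pipeline.dag")              # submit with: condor_submit_dag pipeline.dag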
31
April 19, 2005 Scott Koranda 31 Growing LSC Grid Users Users graduate to running on N of 8 Linux clusters start on a single Linux cluster debug any scaling issues (usually I/O) Condor DAGman still used to manage workflows Still...too much effort lost managing jobs at N different sites
32
April 19, 2005 Scott Koranda 32 Growing LSC Grid Users Users complain to Scott and Duncan, “Why can’t we run across multiple clusters/sites all at once?” Scott and Duncan reply with a wry smile... “Add the flag --dax to your DAG generator script”
33
April 19, 2005 Scott Koranda 33 Growing LSC Grid Users DAX is a more generic description of workflow only applications and necessary files no hard-coded paths no job manager or service addresses XML (ready for machine processing) No change to applications or DAG/DAX generation script! Just add --dax flag
34
April 19, 2005 Scott Koranda 34 Growing LSC Grid Users Pegasus planner converts DAX to workflow for Grid search catalogs for locations of data search catalogs for locations of executables search catalogs for locations of resources plan the workflow for the LSC DataGrid stage data if necessary stage executables if necessary output is a “concrete DAG” run under DAGman as usual
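Conceptually, the planning pass walks the abstract workflow, resolves each job against those catalogs, and wraps it with any needed staging jobs. The sketch below is only that concept, with hypothetical catalogs and a deliberately trivial site-selection rule; it is not how Pegasus is implemented.

def plan(abstract_jobs, replica_catalog, transformation_catalog, sites):
    """Turn abstract (DAX-style) jobs into a site-bound, 'concrete' job list."""
    concrete = []
    for job in abstract_jobs:
        site = sites[0]                                      # trivial resource selection
        exe = transformation_catalog[(job["name"], site)]    # where is the executable?
        for f in job["inputs"]:
            if site not in replica_catalog.get(f, []):       # data not already at the site?
                concrete.append({"type": "stage-in", "file": f, "to": site})
        concrete.append({"type": "run", "exe": exe, "site": site, "job": job["name"]})
        for f in job["outputs"]:
            concrete.append({"type": "stage-out", "file": f, "from": site})
    return concrete    # plays the role of the "concrete DAG" handed to DAGman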
35
April 19, 2005 Scott Koranda 35 Growing LSC Grid Users Linux desktop to the TeraFLOP...
36
April 19, 2005 Scott Koranda 36 Not that we don’t have growing pains... DataGrid infrastructure issues: Globus GRAM 2.x, 3.x limitations; are they gone in GT 4.0? Data replication problems: scaling issues for metadata service; stability issues for Globus RLS (recently solved a major stability problem) Pegasus still a research product: only just realized application staging; “smart” scheduling of data transfers
37
April 19, 2005 Scott Koranda 37 Einstein@Home