Software Implementation
Junwei Cao (曹军威) and Junwei Li (李俊伟)
Tsinghua University
Gravitational Wave Summer School, Kunming, China, July 2009

Computing Challenges
Large amounts of data call for parallel and distributed processing on clusters and grids.
Clusters: collections of workstations.
Grids: collections of clusters and supercomputers.
Software becomes the key, coordinated by the LIGO Data Analysis Software Working Group (DASWG) and shaping future computing infrastructures.

The LSC Data Grid (LDG)
[Site diagram of LDG institutions, including European partners at Cardiff, Birmingham, and AEI/Golm.]

The LDG Software Stack
The LSC Data Grid Client/Server Environment Version 4.0 (based on VDT 1.3.9), from applications down to infrastructure:

Applications (end users and applications): LDAS, DMT, LALApps, Matlab
Application enabling: LDM, LDGreport, Glue, Onasys
LSC job management and data management: LDR, LSCdataFind, LSCsegFind
LSC security management: LSCcertUtils, LSC CA
Middleware / services: workflow management (Condor DAGMan), Condor-G, VDS, VOMS, catalog service (Globus), resource location service (Globus), metadata service, resource management (Globus GRAM), data transfer (GridFTP), job scheduling (Condor), grid security (Globus GSI)
Operating systems and tools: FC4, GCC, Python, Autotools, MySQL

The LDG 4.0 Client/Server
Written in Pacman 3, based on VDT 1.3.9.
Supports LDG:Client and LDG:ClientPro.
Supports multiple platforms: FC4, Solaris and Darwin on the client side, FC4 only on the server side.
Supports both 32-bit and 64-bit machines.
The server package includes the client.
Online documentation gives step-by-step installation instructions.

Linux client package definition:

platformGE( 'Linux' );
package( 'Client-Environment' );
cd( 'vdt' );
package( 'VDT_CACHE:Globus-Client' );
package( 'VDT_CACHE:CA-Certificates' );
package( 'VDT_CACHE:Condor' );
package( 'VDT_CACHE:Fault-Tolerant-Shell' );
package( 'VDT_CACHE:GSIOpenSSH' );
package( 'VDT_CACHE:KX509' );
package( 'VDT_CACHE:MyProxy' );
package( 'VDT_CACHE:PyGlobus' );
package( 'VDT_CACHE:PyGlobusURLCopy' );
package( 'VDT_CACHE:UberFTP' );
package( 'VDT_CACHE:EDG-Make-Gridmap' );
package( 'VDT_CACHE:Globus-RLS-Client' );
package( 'VDT_CACHE:VDS' );
package( 'VDT_CACHE:VOMS-Client' );
cd();
package( 'Client-FixSSH' );
package( 'Client-RLS-Python-Client' );
package( 'Client-Cert-Util' );
package( 'Client-LSC-CA' );

or, for the other client platforms:

platformGE( 'Sun' );
package( 'SolarisPro' );

platformGE( 'MacOS' );
package( 'Mac' );
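For users, the Pacman definition above means installation reduces to a single fetch against the LDG package cache, presumably of the form pacman -get LDG:Client. That command form is an assumption here; the exact cache URL and per-platform details are what the step-by-step online documentation covers.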

Data Discovery (LSCdataFind)
A metadata query:

observatory = L
dataType = RDS_R_L3
startGPS = 751658016
endGPS = 751658095

The Metadata Service resolves it to logical file names (LFNs):

L-RDS_R_L3-751658016-16.gwf
L-RDS_R_L3-751658032-16.gwf
L-RDS_R_L3-751658048-16.gwf
L-RDS_R_L3-751658064-16.gwf
L-RDS_R_L3-751658080-16.gwf

The Resource Location Service at ldas.mit.edu then maps each LFN to physical file names (PFNs), both local and remote:

file://localhost/data/node10/frame/S3/L3/LLO/L-RDS_R_L3-751658016-16.gwf
gsiftp://ldas.mit.edu/data/node10/frame/S3/L3/LLO/L-RDS_R_L3-751658016-16.gwf
file://localhost/data/node11/frame/S3/L3/LLO/L-RDS_R_L3-751658032-16.gwf
gsiftp://ldas.mit.edu/data/node11/frame/S3/L3/LLO/L-RDS_R_L3-751658032-16.gwf
file://localhost/data/node12/frame/S3/L3/LLO/L-RDS_R_L3-751658048-16.gwf
gsiftp://ldas.mit.edu/data/node12/frame/S3/L3/LLO/L-RDS_R_L3-751658048-16.gwf
file://localhost/data/node13/frame/S3/L3/LLO/L-RDS_R_L3-751658064-16.gwf
gsiftp://ldas.mit.edu/data/node13/frame/S3/L3/LLO/L-RDS_R_L3-751658064-16.gwf
file://localhost/data/node14/frame/S3/L3/LLO/L-RDS_R_L3-751658080-16.gwf
gsiftp://ldas.mit.edu/data/node14/frame/S3/L3/LLO/L-RDS_R_L3-751658080-16.gwf
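The returned LFNs follow the standard frame-file naming convention, observatory-type-gpsStart-duration.gwf, and the S3 RDS data above is cut into 16-second frames aligned on 16-second GPS boundaries. A minimal Python sketch of the enumeration the query implies (an illustration of the naming convention, not the actual LSCdataFind implementation):

def enumerate_lfns(observatory, data_type, start_gps, end_gps, duration=16):
    """Enumerate the logical file names (LFNs) covering [start_gps, end_gps].

    Frame files are named {obs}-{type}-{gpsStart}-{duration}.gwf and start
    on multiples of the frame duration (16 s for the S3 RDS data above).
    """
    first = start_gps - start_gps % duration          # align to frame boundary
    return ["%s-%s-%d-%d.gwf" % (observatory, data_type, t, duration)
            for t in range(first, end_gps + 1, duration)]

# Reproduces exactly the five LFNs listed on the slide:
print(enumerate_lfns("L", "RDS_R_L3", 751658016, 751658095))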

Data Replication (LDR)
LDR operates 24/7, replicating data from the Tier 0 observatories (LHO and LLO) through the Tier 1 center at Caltech to Tier 2 centers such as MIT.
Each site runs the same stack: LDRMaster, LDRMetadataService, LDRSchedule and LDRTransfer (GridFTP), backed by an RLS catalog, a local storage module and MySQL.
Replicating one file from Caltech to MIT proceeds in five steps:

1. Update metadata with the new LFN.
2. Get the LFN's remote PFN at Caltech.
3. Generate the LFN's local PFN at MIT.
4. GridFTP the remote PFN at Caltech to the local PFN at MIT.
5. rlsadd the LFN and its new PFN at MIT.
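Read as code, the five steps form a polling loop run at each downstream site. A minimal sketch, assuming hypothetical metadata and rls client objects standing in for the LDRMetadataService and the RLS (the real LDR daemons are considerably more elaborate); globus-url-copy is the standard GridFTP client from the Globus Toolkit:

import subprocess

def replication_pass(metadata, rls, local_root="/data"):
    """One pass of the five-step LDR flow above, at a Tier 2 site like MIT.

    `metadata` and `rls` are hypothetical client objects for the
    LDRMetadataService and RLS; their methods are illustrative only.
    """
    # 1. the metadata service knows about LFNs this site does not yet hold
    for lfn in metadata.missing_lfns():
        # 2. resolve a remote PFN at the upstream site (e.g. Caltech)
        remote_pfn = rls.lookup(lfn)           # gsiftp://...caltech.../x.gwf
        # 3. generate the PFN the file will have locally
        local_path = "%s/%s" % (local_root, lfn)
        # 4. move the bytes with GridFTP
        subprocess.check_call(["globus-url-copy",
                               remote_pfn, "file://" + local_path])
        # 5. register the new local copy, as in "rlsadd lfn pfn@mit"
        rls.add(lfn, "file://localhost" + local_path)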

Data Monitoring Toolkit (DMT)
DMT online use scenario (control-room type): DMT monitors run on an SMP machine and read a single data stream through lsmp; a name server, MonServer and the lmsg protocol connect the monitors to client GUIs.
DMT offline use scenario (standalone or grid-enabled): monitors read multiple data streams from lists of frame files, for example

/data/node10/frame/S3/L3/LHO/H-RDS_R_L3-751658016-16.gwf
……
/data/node16/frame/S3/L3/LHO/H-RDS_R_L3-751658112-16.gwf

/data/node10/frame/S3/L3/LLO/L-RDS_R_L3-751658016-16.gwf
……
/data/node16/frame/S3/L3/LLO/L-RDS_R_L3-751658112-16.gwf

In either scenario, monitor results go to stdout, trigger files, alarm files, trend files, ……
DMT libraries: base, container, xml, sigp, html, ezcalib, xsil, dmtenv, event, trig, frameio, ……

An Example DMT Monitor
Standalone run of the rmon DMT offline monitor.

multilist.txt names two frame file lists:

filelist1.txt
filelist2.txt

One list holds the LHO frames:

/data/node10/frame/S3/L3/LHO/H-RDS_R_L3-751658016-16.gwf
/data/node11/frame/S3/L3/LHO/H-RDS_R_L3-751658032-16.gwf
/data/node12/frame/S3/L3/LHO/H-RDS_R_L3-751658048-16.gwf
/data/node13/frame/S3/L3/LHO/H-RDS_R_L3-751658064-16.gwf
/data/node14/frame/S3/L3/LHO/H-RDS_R_L3-751658080-16.gwf
/data/node15/frame/S3/L3/LHO/H-RDS_R_L3-751658096-16.gwf
/data/node16/frame/S3/L3/LHO/H-RDS_R_L3-751658112-16.gwf

and the other the LLO frames:

/data/node10/frame/S3/L3/LLO/L-RDS_R_L3-751658016-16.gwf
/data/node11/frame/S3/L3/LLO/L-RDS_R_L3-751658032-16.gwf
/data/node12/frame/S3/L3/LLO/L-RDS_R_L3-751658048-16.gwf
/data/node13/frame/S3/L3/LLO/L-RDS_R_L3-751658064-16.gwf
/data/node14/frame/S3/L3/LLO/L-RDS_R_L3-751658080-16.gwf
/data/node15/frame/S3/L3/LLO/L-RDS_R_L3-751658096-16.gwf
/data/node16/frame/S3/L3/LLO/L-RDS_R_L3-751658112-16.gwf

The opt file sets the analysis parameters:

stride 16.0
channel_1 H1:LSC-AS_Q
channel_2 L1:LSC-AS_Q

Running the monitor:

[jcao@ldaspc1 rmon]$ export LD_LIBRARY_PATH=/opt/lscsoft/dol/lib
[jcao@ldaspc1 rmon]$ ./rmon -opt opt -inlists multilist.txt
Processing multi list file: multilist.txt
Number of lists added: 2
Total data streams: 2
Processing frame list file: /home/jcao/rmon/filelist1.txt
Number of files added: 1188
Total frame files: 1188
Processing frame list file: /home/jcao/rmon/filelist2.txt
channel[1]=H1:LSC-AS_Q
channel[2]=L1:LSC-AS_Q
startgps=751658000 stride=16 r-statistic=-0.00251782
startgps=751658016 stride=16 r-statistic=-0.0122699
startgps=751658032 stride=16 r-statistic=0.0168868
……
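The slide does not define the r-statistic, but the natural reading is the linear (Pearson) correlation coefficient between the two channels, computed independently over each 16-second stride. A sketch of that computation under this assumption, taking the two channel time series as equal-rate NumPy arrays already read out of the frame files:

import numpy as np

def r_statistic(x, y):
    """Pearson correlation coefficient between two equal-length arrays."""
    x = x - x.mean()
    y = y - y.mean()
    return float(np.dot(x, y) / np.sqrt(np.dot(x, x) * np.dot(y, y)))

def rmon_like(h1, l1, sample_rate, stride=16.0, start_gps=751658000):
    """Print one r-statistic per stride, mimicking the rmon output above."""
    n = int(stride * sample_rate)              # samples per stride
    for k in range(min(len(h1), len(l1)) // n):
        seg = slice(k * n, (k + 1) * n)
        print("startgps=%d stride=%g r-statistic=%g"
              % (start_gps + int(k * stride), stride,
                 r_statistic(h1[seg], l1[seg])))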

The LDM Modules and Flowchart
On the client side, ldm_submit queues a job, ldm_q queries it and ldm_rm removes it; ldm_agent drives the job through its states, calling ldm_locate_script (which queries the site's LSCdataFind server) and ldm_exec_script (which runs condor_submit to hand the job to Condor-G, the remote Globus job manager and condor_master).
Job states: QUEUED, LOCATING, LOCATED, SCHEDULED, RUNNING, FINISHED, plus REJECTED and RELEASED.
Some modules are developed or deployed; others are designed and still under development.

LDM_SITES:

[MIT]
lscdatafindserver = ldas-gridmon.mit.edu
globusscheduler = ldas-grid.mit.edu/jobmanager-condor
environment = LD_LIBRARY_PATH=/dso-test/home/jcao/dol/lib

[CIT]
lscdatafindserver = ldas-gridmon.ligo.caltech.edu
globusscheduler = ldas-grid.ligo.caltech.edu/jobmanager-condor
environment = LD_LIBRARY_PATH=/dso-test/jcao/dol/lib

[LHO]
lscdatafindserver = ldas-gridmon.ligo-wa.caltech.edu
globusscheduler = ldas-grid.ligo-wa.caltech.edu/jobmanager-condor

[LLO]
lscdatafindserver = ldas-gridmon.ligo-la.caltech.edu
globusscheduler = ldas-grid.ligo-la.caltech.edu/jobmanager-condor
environment = LD_LIBRARY_PATH=/data2/jcao/dol/lib

LDM_CONFIG:

[AGENT]
RESOURCES = @MIT@CIT@LHO@LLO
SITES = /home/jcao/ldm/etc/LDM_SITES
EXEC = /home/jcao/ldm/bin/ldm_exec_script
LOCATE = /home/jcao/ldm/bin/ldm_locate_script
PID = /home/jcao/ldm/var/ldm.pid
LOG = /home/jcao/ldm/var/ldm.log
LDG = /home/jcao/ldg-3.0/
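LDM_SITES is a plain INI file, so the lookup that ldm_locate_script and ldm_exec_script need (which LSCdataFind server to query and which Globus jobmanager to submit to, for each resource named in LDM_CONFIG) can be sketched with Python's standard configparser. This illustrates the file format only, not the actual LDM code:

from configparser import ConfigParser

def site_services(path="LDM_SITES"):
    """Map each site section to its (lscdatafindserver, globusscheduler) pair."""
    cfg = ConfigParser()
    cfg.read(path)
    return {site: (cfg.get(site, "lscdatafindserver"),
                   cfg.get(site, "globusscheduler"))
            for site in cfg.sections()}

# site_services()["CIT"] ->
#   ("ldas-gridmon.ligo.caltech.edu",
#    "ldas-grid.ligo.caltech.edu/jobmanager-condor")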

Data Monitoring Environment (LDM)
Grid-enabled run of the rmon DMT offline monitor using LDM. Users are interfaced with a LIGO-friendly job description language and do not need the technical details of LSC Data Grid services: data are located and file lists are generated automatically.

ldm.sub:

[job]
id = test
monitor = rmon
args = -opt opt
input = opt
[data]
observatory = @H@L
type = @RDS_R_L3@RDS_R_L3
start = 751658000
end = 751676993

Submitting and checking the job:

[jcao@ldaspc1 ~]$ cd ldm
[jcao@ldaspc1 ldm]$ source setup.sh
[jcao@ldaspc1 ldm]$ cd ../rmon
[jcao@ldaspc1 rmon]$ ldm_agent
[jcao@ldaspc1 rmon]$ ldm_submit ldm.sub
Job test has been submitted.
[jcao@ldaspc1 rmon]$ more ldm_test_condor.out
Processing multi list file: ldm_test_CIT_multilist.txt
Number of lists added: 2
Total data streams: 2
……
startgps=751658000 stride=16 r-statistic=-0.00251782

Automatically generated Condor-G submission file:

universe = globus
globusscheduler = ldas-grid.ligo.caltech.edu/jobmanager-condor
log = ldm_test_condor.log
output = ldm_test_condor.out
error = ldm_test_condor.err
should_transfer_files = YES
when_to_transfer_output = ON_EXIT
transfer_input_files = ldm_test_CIT_multilist.txt, ldm_test_CIT_filelist1.txt, ldm_test_CIT_filelist2.txt, /home/jcao/rmon/opt
arguments = -inlists ldm_test_CIT_multilist.txt -opt opt
environment = LD_LIBRARY_PATH=/dso-test/jcao/dol/lib
executable = /home/jcao/rmon/rmon
Queue
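The translation from ldm.sub to the Condor-G file is mechanical enough to sketch. Assuming the site's scheduler, the environment string and the generated file lists are already known from LDM_SITES and the LOCATING step, something like the following renders a submit description of the shape shown above (illustrative only, not the real ldm_submit):

from configparser import ConfigParser

def ldm_to_condor(sub_path, scheduler, environment, multilist, extra_files):
    """Render a Condor-G submit description from an ldm.sub job description."""
    job = ConfigParser()
    job.read(sub_path)
    jid = job.get("job", "id")
    # input files: the multilist, the per-site file lists, and the opt file
    files = [multilist] + extra_files + [job.get("job", "input")]
    return "\n".join([
        "universe = globus",
        "globusscheduler = %s" % scheduler,
        "log = ldm_%s_condor.log" % jid,
        "output = ldm_%s_condor.out" % jid,
        "error = ldm_%s_condor.err" % jid,
        "should_transfer_files = YES",
        "when_to_transfer_output = ON_EXIT",
        "transfer_input_files = %s" % ", ".join(files),
        "arguments = -inlists %s %s" % (multilist, job.get("job", "args")),
        "environment = %s" % environment,
        "executable = %s" % job.get("job", "monitor"),
        "Queue",
    ])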

Open Science Grid (OSG)
A "grid of grids" serving high energy physics, bioinformatics, nanotechnology and more, with some 20,000 CPUs and petabytes of data storage.

Thank You!
Junwei Cao
jcao@tsinghua.edu.cn
http://ligo.org.cn