Experiment Applications: applying the power of the grid to real science Rick Cavanaugh University of Florida GriPhyN/iVDGL External Advisory Committee 13 January, 2002
GriPhyN/iVDGL and ATLAS Argonne, Boston, Brookhaven, Chicago, Indiana, Berkeley, Texas
EAC Review
ATLAS at SC2002
- Grappa: manages the overall grid experience
- Magda: distributed data management and replication
- Pacman: defines and produces software environments
- DC1 production with GRAT: data challenge simulations for ATLAS
- Instrumented Athena: grid monitoring of ATLAS analysis applications
- vo-gridmap: virtual organization management
- Gridview: monitoring U.S. ATLAS resources
- WorldGrid: world-wide US/EU grid infrastructure
Pacman at SC2002
- How did we install our software for this demo?
    % pacman -get iVDGL:WorldGrid ScienceGrid
- Pacman lets you define how a mixed tarball/rpm/gpt/native software environment is
  - Fetched
  - Installed
  - Set up
  - Updated
- This can be figured out once and exported to the rest of the world via caches
    % pacman -get atlas_testbed
Pacman at SC2002 (continued)
- The caches you have decided to trust
- Installed software, pointer to local documentation
- Dependencies are automatically resolved
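The automatic dependency resolution mentioned on this slide can be sketched as follows. This is a minimal illustration and not Pacman's actual implementation: the cache layout, the `resolve` function, and every dependency relationship below are hypothetical. Only the names `iVDGL` and `atlas_testbed` come from the slide.

```python
# Hypothetical trusted caches: cache name -> {package -> list of dependencies}.
# The packages "grid_middleware" and "atlas_release" are invented for illustration.
CACHES = {
    "iVDGL": {
        "atlas_testbed": ["atlas_release", "grid_middleware"],
        "atlas_release": ["grid_middleware"],
        "grid_middleware": [],
    },
}


def resolve(cache, package, installed=None, visiting=None):
    """Return packages in install order, dependencies first (depth-first)."""
    installed = [] if installed is None else installed
    visiting = set() if visiting is None else visiting
    if package in installed:
        return installed
    if package in visiting:
        raise ValueError(f"dependency cycle at {package}")
    if package not in CACHES.get(cache, {}):
        raise KeyError(f"{cache}:{package} not found in trusted caches")
    visiting.add(package)
    for dep in CACHES[cache][package]:
        resolve(cache, dep, installed, visiting)
    visiting.discard(package)
    installed.append(package)
    return installed


order = resolve("iVDGL", "atlas_testbed")
print(order)
```

Each package is installed once, after everything it depends on. A request for a package that is not in a trusted cache fails instead of being fetched from somewhere else.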
Grappa at SC2002
- Web-based interface for Athena job submission to Grid resources
- Based on XCAT Science Portal technology developed at Indiana
- EDG JDL backend to Grappa
- Common submission to US gatekeepers and EDG resource broker (through EDG "user interface" machine)
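To illustrate common submission to two backends, the sketch below renders one job description both as a Globus-style RSL string (for a US gatekeeper) and as EDG-style JDL text (for the resource broker). This is not Grappa's code. The `Job` class, the executable path, and the file names are hypothetical, and only a few basic attributes of each language are shown.

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    """Backend-neutral job description (hypothetical structure)."""
    executable: str
    arguments: list = field(default_factory=list)
    stdout: str = "job.out"


def to_rsl(job):
    """Render a minimal Globus RSL string."""
    parts = [f"(executable={job.executable})"]
    if job.arguments:
        parts.append("(arguments=" + " ".join(job.arguments) + ")")
    parts.append(f"(stdout={job.stdout})")
    return "&" + "".join(parts)


def to_jdl(job):
    """Render a minimal EDG JDL description."""
    lines = [f'Executable = "{job.executable}";']
    if job.arguments:
        lines.append('Arguments = "' + " ".join(job.arguments) + '";')
    lines.append(f'StdOutput = "{job.stdout}";')
    lines.append(f'OutputSandbox = {{"{job.stdout}"}};')
    return "\n".join(lines)


# Hypothetical wrapper script and job options file.
job = Job("/usr/local/bin/athena_wrapper.sh", ["jobOptions.txt"])
print(to_rsl(job))
print(to_jdl(job))
```

Keeping a single neutral description and translating it at the last moment is one way a portal could target both kinds of resources without duplicating the user's input.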
Grappa Communications Flow
[Figure: communications flow diagram. Recoverable labels:]
- Web Browsing Machine (JavaScript): Netscape/Mozilla/Int.Expl/PalmScape
- Links: https - JavaScript; http: JavaScript
- Grappa Portal Machine: XCAT tomcat server
- Cactus framework
- Script-Based Submission: interactive or cron-job
- CoG: Submission, Monitoring
- CoG: Data Copy
- Compute Resources: Resource A ... Resource Z
- Data Storage: Data Disk, HPSS
- Magda (spider); Input files
- MAGDA: registers file/location, registers file metadata; browse catalogue
Instrumented Athena at SC2002
- Part of SuperComputing 2002 ATLAS demo
- Prophesy
  - An Infrastructure for Analyzing & Modeling the Performance of Parallel & Distributed Applications
  - Normally a parse & auto-instrument approach (C & FORTRAN)
- NetLogger (didc.lbl.gov/NetLogger/)
  - End-to-End Monitoring & Analysis of Distributed Systems
  - C, C++, Java, Python, Perl, Tcl APIs
  - Web Service Activation
GriPhyN/iVDGL and CMS Caltech, Fermilab, Florida, San Diego, Wisconsin
Bandwidth Gluttony at SC2002
- "Grid-Enabled" particle physics analysis application
- Issued remote database selection queries; prepared data object collections
- Moved collections across the WAN using specially enhanced TCP/IP stacks
- Rendered the results in real time on the analysis client workstation in Baltimore
MonaLisa at SC2002
- MonaLisa (Caltech)
  - Deployed on the US-CMS Test-bed
  - Dynamic information/resource discovery mechanism using agents
  - Implemented in
    - Java / Jini with interfaces to SNMP, MDS, and Ganglia
    - WSDL / SOAP with UDDI
  - Proved critical during live CMS production runs
Pictures taken from Iosif Legrand
MOP and Clarens at SC2002
- Simple, robust grid planner integrated with CMS production software
- 1.5 million simulated CMS events produced over 2 months (~30 CPU years)
[Figure: MOP and Clarens architecture. Recoverable labels:]
- VDT Client
- MCRunJob: Linker, ScriptGen, Config, Req., Self Des, Master
- mop-submitter
- DAGMan/Condor-G
- VDT Server 1 ... VDT Server N, each with Condor and GridFTP
- Clarens Client; Clarens Server (two)
Chimera Production at SC2002
- Used VDL to describe virtual data products and their dependencies
- Used the Chimera Planners to map abstract workflows onto concrete grid resources
- Implemented a WorkRunner to continuously schedule jobs across all grid sites
[Figure: example CMS concrete DAG. Stages: Generator, Simulator, Formator, Reconstructor, Ntuple Production, Analysis. Legend: params, exec., data. Concrete steps: Stage File In, Execute Job, Stage File Out, Register File.]
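The mapping from an abstract chain of stages to a concrete DAG can be sketched as below. This is a simplified illustration, not the Chimera planner. It assumes that the stages form a linear chain in the order shown on the slide and that every stage expands into the same four concrete steps. It uses `graphlib` from the Python standard library (3.9 or later).

```python
from graphlib import TopologicalSorter

# Stage and step names are taken from the slide; the linear ordering is an assumption.
STAGES = ["Generator", "Simulator", "Formator", "Reconstructor",
          "Ntuple Production", "Analysis"]
STEPS = ["Stage File In", "Execute Job", "Stage File Out", "Register File"]


def concrete_dag(stages):
    """Expand each abstract stage into concrete steps.

    Returns a mapping of node -> set of predecessor nodes.
    """
    dag = {}
    previous = None
    for stage in stages:
        for step in STEPS:
            node = f"{stage}: {step}"
            dag[node] = {previous} if previous else set()
            previous = node
    return dag


dag = concrete_dag(STAGES)
# A valid execution order: every node appears after all of its predecessors.
schedule = list(TopologicalSorter(dag).static_order())
for node in schedule[:4]:
    print(node)
```

A real concrete DAG would also carry site assignments and would let independent jobs run in parallel; the chain here only shows the expansion and ordering.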
Data Provenance at SC2002
A virtual space of simulated data is created for future use by scientists...
[Figure: derived data sets labelled by their parameters:]
- mass = 200, decay = WW, stability = 1, event = 8
- mass = 200, decay = WW, stability = 1, plot = 1
- mass = 200, decay = WW, plot = 1
- mass = 200, decay = WW, event = 8
- mass = 200, decay = WW, stability = 1
- mass = 200, decay = WW, stability = 3
- mass = 200, decay = WW
- mass = 200, decay = ZZ
- mass = 200, plot = 1
- mass = 200, event = 8
Data Provenance at SC2002
Search for WW decays of the Higgs Boson where only stable, final state particles are recorded: mass = 200; decay = WW; stability = 1
[Figure: derived data sets labelled by their parameters:]
- mass = 200, decay = WW, stability = 1, event = 5
- mass = 200, decay = WW, stability = 1, plot = 1
- mass = 200, decay = WW, plot = 1
- mass = 200, decay = WW, event = 8
- mass = 200, decay = WW, stability = 1
- mass = 200, decay = WW, stability = 3
- mass = 200, decay = WW
- mass = 200, decay = ZZ
- mass = 200, plot = 1
- mass = 200, event = 8
Data Provenance at SC2002
...The scientist adds a new derived data branch... and continues to investigate!
[Figure: derived data sets labelled by their parameters:]
- mass = 200, decay = WW, stability = 1, LowPt = 20, HighPt =
- mass = 200, decay = WW, stability = 1, event = 8
- mass = 200, decay = WW, stability = 1, plot = 1
- mass = 200, decay = WW, plot = 1
- mass = 200, decay = WW, event = 8
- mass = 200, decay = WW, stability = 1
- mass = 200, decay = WW, stability = 3
- mass = 200, decay = WW
- mass = 200, decay = ZZ
- mass = 200, plot = 1
- mass = 200, event = 8
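The attribute-based search over the virtual data space can be sketched as a lookup over parameter dictionaries. This is a minimal illustration, assuming that each derived data set is identified only by the parameters shown in the figure. The `CATALOG` list and `search` function are hypothetical and do not reflect the Chimera catalog schema. The HighPt value is not legible on the slide, so it is left out.

```python
# Derived data sets from the first Data Provenance slide, one dict per node.
CATALOG = [
    {"mass": 200, "decay": "WW", "stability": 1, "event": 8},
    {"mass": 200, "decay": "WW", "stability": 1, "plot": 1},
    {"mass": 200, "decay": "WW", "plot": 1},
    {"mass": 200, "decay": "WW", "event": 8},
    {"mass": 200, "decay": "WW", "stability": 1},
    {"mass": 200, "decay": "WW", "stability": 3},
    {"mass": 200, "decay": "WW"},
    {"mass": 200, "decay": "ZZ"},
    {"mass": 200, "plot": 1},
    {"mass": 200, "event": 8},
]


def search(catalog, **query):
    """Return every entry whose parameters include all query values."""
    return [entry for entry in catalog
            if all(entry.get(key) == value for key, value in query.items())]


# The search from the second slide: WW decays with stable final-state particles.
hits = search(CATALOG, mass=200, decay="WW", stability=1)
print(len(hits))

# Adding a new derived branch, as on the third slide (HighPt omitted, see above).
CATALOG.append({"mass": 200, "decay": "WW", "stability": 1, "LowPt": 20})
hits_after = search(CATALOG, mass=200, decay="WW", stability=1)
print(len(hits_after))
```

Because existing entries are found by their parameters, a scientist can discover that a requested product already exists before asking for it to be recomputed.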
GriPhyN and LIGO (Laser Interferometer Gravitational-wave Observatory) ISI, Caltech, Milwaukee
LIGO’s Pulsar Search
[Figure: pulsar search data flow. Recoverable labels: Interferometer; raw channels; Long time frames; Store; archive; Extract channel; Short time frames; transpose; Single Frame; Short Fourier Transform; Extract frequency range; Construct image; Time-frequency Image (axes: Hz, Time); Find Candidate; event DB; 30 minutes.]
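The middle of this data flow (short Fourier transforms, frequency range extraction, image construction, candidate finding) can be sketched in a few lines. This is a toy illustration, not the LIGO analysis code. It uses a naive DFT from the standard library, a synthetic pure tone as input, and hypothetical function names and parameters.

```python
import cmath
import math


def dft_power(segment):
    """Power in each non-negative frequency bin of a naive DFT."""
    n = len(segment)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(segment))) ** 2
        for k in range(n // 2 + 1)
    ]


def time_frequency_image(series, seg_len, band):
    """Rows are short time segments, columns are the frequency bins in `band`."""
    lo, hi = band
    image = []
    for start in range(0, len(series) - seg_len + 1, seg_len):
        power = dft_power(series[start:start + seg_len])
        image.append(power[lo:hi])
    return image


def brightest_pixel(image):
    """Return (row, column) of the largest power value in the image."""
    return max(
        ((r, c) for r, row in enumerate(image) for c in range(len(row))),
        key=lambda rc: image[rc[0]][rc[1]],
    )


# Synthetic signal: a pure tone in frequency bin 5 of each 32-sample segment.
SEG = 32
series = [math.sin(2 * math.pi * 5 * t / SEG) for t in range(4 * SEG)]
image = time_frequency_image(series, SEG, band=(3, 8))
row, col = brightest_pixel(image)
print(len(image), len(image[0]), col)
```

The band starts at bin 3, so the tone in bin 5 shows up in column 2 of the image. A real search would look for a track across many rows rather than a single bright pixel.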
Pegasus: Planning for Execution in Grids
- Developed at ISI as part of the GriPhyN project
- Configurable system that can map and execute complex workflows on the Grid
- Integrated with the GriPhyN Chimera system
  - Receives an abstract workflow (AW) description from Chimera and produces a concrete workflow (CW)
  - Submits the CW to DAGMan for execution
  - Optimizations of the CW are done from the point of view of Virtual Data
- Can perform AW planning based on application-level metadata attributes
- Given attributes such as time interval, frequency of interest, location in the sky, etc., Pegasus is currently able to produce any virtual data products present in the LIGO pulsar search
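The step from abstract to concrete workflow, including the virtual data optimization, can be sketched as follows. This is a simplified illustration and not the Pegasus planner. The `plan` function, the job and file names, and the `gsiftp://` locations are all hypothetical. The assumed rule is that a file already listed in a replica catalog is transferred rather than recomputed, so the jobs upstream of it are pruned.

```python
def plan(abstract_jobs, replica_catalog, requested):
    """Turn an abstract workflow into an ordered list of concrete steps.

    abstract_jobs: {job name: {"inputs": [...], "outputs": [...]}}
    replica_catalog: {logical file name: physical location}
    """
    producer = {lfn: name
                for name, job in abstract_jobs.items()
                for lfn in job["outputs"]}
    steps, done = [], set()

    def need(lfn):
        if lfn in replica_catalog:
            # The file already exists somewhere: reuse it instead of recomputing.
            step = ("transfer", lfn, replica_catalog[lfn])
            if step not in steps:
                steps.append(step)
            return
        if lfn not in producer:
            raise ValueError(f"no replica and no producer for {lfn}")
        name = producer[lfn]
        if name in done:
            return
        for inp in abstract_jobs[name]["inputs"]:
            need(inp)
        steps.append(("run", name))
        for out in abstract_jobs[name]["outputs"]:
            steps.append(("register", out))
        done.add(name)

    need(requested)
    return steps


# Hypothetical three-job workflow and replica catalog.
JOBS = {
    "extract": {"inputs": ["raw.gwf"], "outputs": ["channel.dat"]},
    "sft": {"inputs": ["channel.dat"], "outputs": ["sft.dat"]},
    "search": {"inputs": ["sft.dat"], "outputs": ["candidates.dat"]},
}
REPLICAS = {
    "raw.gwf": "gsiftp://site-a.example.org/raw.gwf",
    "sft.dat": "gsiftp://site-b.example.org/sft.dat",
}
concrete = plan(JOBS, REPLICAS, "candidates.dat")
for step in concrete:
    print(step)
```

Because `sft.dat` already has a replica, only the `search` job runs; `extract` and `sft` never appear in the concrete workflow. In the system described on the slide, the resulting workflow would then be handed to DAGMan for execution.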
Metadata Driven Configuration
LIGO’s pulsar search at SC2002
- The pulsar search conducted at SC 2002
  - Used LIGO’s data collected during the first scientific run of the instrument
  - Targeted a set of 1000 locations of known pulsars as well as random locations in the sky
  - Published the results of the analysis via LDAS (LIGO Data Analysis System) to the LIGO Scientific Collaboration
  - Was performed using LDAS and compute and storage resources at Caltech, University of Southern California, and University of Wisconsin Milwaukee
Results
SC 2002 demo
- Over 58 pulsar searches
- Total of
  - 330 tasks
  - 469 data transfers
  - 330 output files
- The total runtime was 11:24:35
To date
- 185 pulsar searches
- Total of
  - 975 tasks
  - 1365 data transfers
  - 975 output files
- Total runtime 96:49:47
Virtual Galaxy Cluster System: An Application of the GriPhyN Virtual Data Toolkit to Sloan Digital Sky Survey Data Chicago, Argonne, Fermilab
The Brightest Cluster Galaxy Pipeline
maxBcg is a series of transformations:
- 1: extracts galaxies from the full tsObj data set.
- 2: filters the field for Bright Red Galaxies.
- 3: calculates the weighted BCG likelihood for each galaxy (the most expensive step).
- 4: is this galaxy the most likely galaxy in the neighborhood?
- 5: removes extraneous data and stores the result in a compact format.
Interesting intermediate data reuse made possible by Chimera: cluster finding works well with 1 Mpc radius apertures. If one were instead looking for the sites of gravitational lensing, one would rather use a 1/4 Mpc radius. This would start at transformation 3.
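The intermediate data reuse described on this slide can be sketched with a small cache keyed by transformation number and parameters. This is an illustration only, not Chimera or the maxBcg code. The `VirtualDataCache` class is hypothetical, the strings stand in for real data products, and the sketch assumes that the aperture radius first enters at transformation 3, as the slide states.

```python
class VirtualDataCache:
    """Remember derived products by (transformation, parameters)."""

    def __init__(self):
        self.store = {}
        self.executed = []  # log of transformations that actually ran

    def derive(self, stage, params, compute):
        key = (stage, tuple(sorted(params.items())))
        if key not in self.store:
            self.executed.append(stage)
            self.store[key] = compute()
        return self.store[key]


def max_bcg(cache, field, radius_mpc):
    """Run the five transformations, reusing any product already derived."""
    by_field = {"field": field}
    by_radius = {"field": field, "radius_mpc": radius_mpc}
    galaxies = cache.derive(1, by_field, lambda: f"galaxies({field})")
    brg = cache.derive(2, by_field, lambda: f"brg({galaxies})")
    like = cache.derive(3, by_radius, lambda: f"likelihood({brg}, r={radius_mpc})")
    best = cache.derive(4, by_radius, lambda: f"most_likely({like})")
    return cache.derive(5, by_radius, lambda: f"catalog({best})")


cache = VirtualDataCache()
max_bcg(cache, "field-001", 1.0)    # cluster finding, 1 Mpc aperture
max_bcg(cache, "field-001", 0.25)   # lensing sites, 1/4 Mpc aperture
print(cache.executed)
```

The second call reruns only transformations 3 to 5, because the outputs of 1 and 2 do not depend on the radius and are already stored.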
The DAG
[Figure: DAG diagram. Recoverable labels: BRG, Core, Cluster, Catalog.]
A DAG for 50 Fields
- 744 files, 387 nodes, 40 minutes
DAG Example: Sloan Galaxy Cluster Analysis
With Jim Annis & Steve Kent, FNAL
[Figure: Sloan Data; DAG; Galaxy cluster size distribution.]
Conclusion
- Built a virtual cluster system based on Chimera and SDSS cluster finding.
- Described the five stages and data dependencies in VDL.
- Tested the system on a virtual data grid.
- Conducting performance analysis.
- Helped improve Chimera.
Some CMS Issues/Challenges
- How to generate more buy-in from the experiments?
  - Sociological trust problem, not technical.
- More exploitation of (virtual) collections of objects and further use of web services (work already well underway).
- What is required to store the complete provenance of data generated in a grid environment?
- Creation of collaborative peer-to-peer environments.
- Data Challenge: generate and analyze 5% of the expected data at startup (~1/2 year of continuous production).
- What is the relationship between WorldGRID and the LCG?
- Robust, portable applications!
- Virtual Organization Management and Policy Enforcement.
Some ATLAS Issues/Challenges
- How to generate more buy-in from the experiments?
  - Sociological trust problem, not technical.
- Fleshing out the notion of Pacman "Projects" and prototyping them.
- What is the best integration path for Chimera infrastructure with international ATLAS catalog systems?
  - Need standardized Virtual Data API?
- Packaging and distribution of ATLAS SW releases for each step in the production/analysis chain: gen, sim, reco, analysis.
- LCG SW application development env. is now SCRAM: ATLAS evaluating possible migration from CMT to SCRAM.
SDSS Challenges
- Cluster Finding
  - Distribution of clusters in the universe
  - Evolution of the mass function
  - Balanced I/O and compute
- Power Spectrum
  - Distribution of galaxies in the universe
  - Direct constraints on cosmological parameters
  - Compute intensive, prefer MPI systems
  - Premium on discovering similar results
- Analyses based on pixel data
  - Weak lensing analysis of the SDSS coadded southern survey data
  - Near Earth asteroid searches
  - Galaxy morphological properties: NVO Galaxy Morphology Demo
  - All involve moving around terabytes of data
  - Or choosing not to
LIGO Challenges