Jefferson Lab: Experimental and Theoretical Physics Grids
  Andy Kowalski
  SURA Cyberinfrastructure Workshop, Georgia State University, January 5–7, 2005
  Thomas Jefferson National Accelerator Facility
  Operated by the Southeastern Universities Research Association for the U.S. Department of Energy
Jefferson Lab
  Who are we?
    Thomas Jefferson National Accelerator Facility
    Department of Energy Research Laboratory
    Southeastern Universities Research Association
  What do we do?
    High Energy Nuclear Physics: quarks and gluons
    Operate a 6.07 GeV continuous electron beam accelerator
    Free-Electron Laser (10 kW)
Jefferson Lab
Data and Storage
  Three experimental halls
    Hall A and Hall C: 100s of GB/day each
    Hall B (CLAS): TB/day (currently up to 30 MB/sec)
  Currently store and manage 1 PB of data on tape
  Users around the world want access to the data
Computing
  Batch Farm
    200 dual-CPU nodes, ~358,060 SPECint2000
    Moves 4-7 TB/day
    Reconstruction
    Analysis
    Simulations (CLAS: large)
  Lattice QCD Machine
    3 clusters: 128, 256, 384 nodes
Need for Grids
  12 GeV Upgrade
    Hall B (CLAS)
      Data rates increase to MB/sec
      Will export 50% or more of the data
      Import data from simulations done at universities
        This can be a rather large amount
    Hall D (GlueX)
      Same scale as the LHC experiments
      100 MB/sec, 3 PB of data per year
        1 PB of raw data at JLab
        1 PB for analysis (JLab and offsite)
        1 PB for simulations (offsite)
  Lattice QCD
    10 TF machine
    A significant amount of data
  Users around the world want access to the data
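The data-rate figures on the slides can be cross-checked with simple arithmetic. The sketch below assumes decimal units (1 MB = 10^6 bytes, 1 PB = 10^15 bytes); the slides do not state the assumed running time per year, so the live-time figure is only what the quoted numbers imply, not a stated plan.

```python
# Back-of-the-envelope check of the quoted rates, assuming decimal units.
SECONDS_PER_DAY = 86_400
MB = 10**6
TB = 10**12
PB = 10**15


def volume_bytes(rate_mb_per_s, seconds):
    """Bytes accumulated at a constant rate over the given time."""
    return rate_mb_per_s * MB * seconds


def seconds_to_accumulate(total_bytes, rate_mb_per_s):
    """Seconds of running needed to reach total_bytes at a constant rate."""
    return total_bytes / (rate_mb_per_s * MB)


# 100 MB/sec sustained for a full 365-day year.
year_continuous_pb = volume_bytes(100, 365 * SECONDS_PER_DAY) / PB

# Days of data taking at 100 MB/sec needed to write 1 PB of raw data.
days_for_1pb_raw = seconds_to_accumulate(1 * PB, 100) / SECONDS_PER_DAY

# Hall B today: 30 MB/sec sustained for one day.
hall_b_tb_per_day = volume_bytes(30, SECONDS_PER_DAY) / TB

print(f"100 MB/sec for a full year: {year_continuous_pb:.2f} PB")
print(f"1 PB raw at 100 MB/sec:     {days_for_1pb_raw:.0f} days of running")
print(f"30 MB/sec for one day:      {hall_b_tb_per_day:.2f} TB")
```

Under these assumptions, 1 PB of raw data at 100 MB/sec corresponds to roughly 10^7 seconds of data taking, and the 3 PB per year total is the sum of the raw, analysis, and simulation shares listed on the slide.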
JLab: Theory and Experimental Grid Efforts
  Similarities
    Focus on Data Grids
    Desire interface definitions for interoperability
    Chose web services for implementation
      WSDL defines the interface
  Theory
    ILDG and PPDG
    SRM
    Replica Catalog
  Experimental
    PPDG and pursuing OSG
    SRM
    Job submission interface
ILDG: Data Grid Services
  Web Services Architecture
  [Diagram labels: File Client; Web Services at a Single Site: Meta Data Catalog, Replica Catalog, SRM Service, Replication Service; Storage (disk, silo); File Server(s). The site components repeat for several sites, one of them labeled with a Consistency Agent.]
  * Slide from Chip Watson, ILDG Middleware Project Status
ILDG: A Three Tier Web Services Architecture
  [Diagram labels: Web Browser; XML to HTML servlet; Web Service; Application; Local Backend Services (batch, file, etc.); Web Server (Portal); Authenticated connections; Remote Web Server; Storage system; Catalogs]
  Web services provide a standard API for clients, and intermediary servlets allow use from a browser (as in a portal)
  * Slide from Chip Watson, ILDG Middleware Project Status
Components: Meta Data Catalog
  Hold metadata for files
  Hold metadata for a set of files (data set)
  Process query lookup
    Queries return (sets of) GFN (Global File Name = key), and optionally full metadata for each match
  [Diagram labels: File Client; Web Services at a Single Site: Meta Data Catalog, Replica Catalog, SRM Service, Replication Service, SRM Listener; Storage Resource; File Server(s)]
  * Slide from Chip Watson, ILDG Middleware Project Status
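The query behavior described on this slide can be illustrated with a small in-memory sketch. This is not the ILDG web service interface: the class, method names, GFN strings, and metadata keys below are all hypothetical, chosen only to show a lookup that returns a set of GFNs and, optionally, the full metadata for each match.

```python
class MetaDataCatalog:
    """Toy in-memory catalog keyed by Global File Name (GFN). Hypothetical API."""

    def __init__(self):
        self._entries = {}  # GFN -> metadata dict

    def register(self, gfn, metadata):
        """Store a copy of the metadata for one file."""
        self._entries[gfn] = dict(metadata)

    def query(self, full_metadata=False, **criteria):
        """Return matching GFNs; with full_metadata=True, return GFN -> metadata."""
        matches = {
            gfn: meta
            for gfn, meta in self._entries.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        }
        if full_metadata:
            return {gfn: dict(meta) for gfn, meta in matches.items()}
        return set(matches)


# Hypothetical GFNs and metadata keys, for illustration only.
mdc = MetaDataCatalog()
mdc.register("gfn-0001", {"ensemble": "A", "beta": 5.7})
mdc.register("gfn-0002", {"ensemble": "A", "beta": 6.0})
mdc.register("gfn-0003", {"ensemble": "B", "beta": 6.0})

gfns_only = mdc.query(ensemble="A")
with_metadata = mdc.query(full_metadata=True, beta=6.0)
print(sorted(gfns_only))
print(with_metadata)
```

The GFN is the key that the other components (replica catalog, SRM) would then use, so the default result is kept as a plain set of names.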
Components: Replica Catalog
  Track all copies of a file / data set
    Get replicas
    Create replica
    Remove replica
  Prototypes exist at
    Jefferson Lab
    Fermilab
  [Diagram labels: File Client; Web Services at a Single Site: Meta Data Catalog, Replica Catalog, SRM Service, Replication Service, SRM Listener; Storage Resource; File Server(s)]
  * Slide from Chip Watson, ILDG Middleware Project Status
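The three replica catalog operations listed on this slide (get, create, remove) can be sketched as a mapping from a GFN to the set of locations holding a copy. The class, method names, and site URLs below are hypothetical and do not reflect the Jefferson Lab or Fermilab prototypes.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy in-memory replica catalog: GFN -> set of replica locations."""

    def __init__(self):
        self._replicas = defaultdict(set)

    def create_replica(self, gfn, location):
        """Record that a copy of gfn exists at location."""
        self._replicas[gfn].add(location)

    def remove_replica(self, gfn, location):
        """Forget one copy; raise KeyError if it was never registered."""
        locations = self._replicas.get(gfn)
        if not locations or location not in locations:
            raise KeyError(f"no replica of {gfn} at {location}")
        locations.remove(location)
        if not locations:
            del self._replicas[gfn]

    def get_replicas(self, gfn):
        """Return all known locations of gfn, sorted for stable output."""
        return sorted(self._replicas.get(gfn, ()))


# Hypothetical GFN and site URLs, for illustration only.
rc = ReplicaCatalog()
rc.create_replica("gfn-0001", "srm://site-a.example.org/data/0001")
rc.create_replica("gfn-0001", "srm://site-b.example.org/data/0001")
rc.remove_replica("gfn-0001", "srm://site-a.example.org/data/0001")
print(rc.get_replicas("gfn-0001"))
```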
Components: Storage Resource Manager
  Manage storage system
    Disk only
    Disk plus tape
  Third-party file transfers
  Negotiate protocols for file retrieval (select a file server)
  Auto stage a file on get (asynchronous operation)
  Version 2.1 defined (collaboration)
  [Diagram labels: File Client; Web Services at a Single Site: Meta Data Catalog, Replica Catalog, SRM Service, Replication Service, SRM Listener; Storage Resource; File Server(s)]
  * Slide from Chip Watson, ILDG Middleware Project Status
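Protocol negotiation for file retrieval can be pictured as follows: the client offers the transfer protocols it supports in order of preference, and the service answers with the first one for which the site has a file server. The sketch below is a simplified illustration under that assumption, not the SRM v2 interface; the function name, host names, and file path are hypothetical.

```python
def negotiate_transfer(client_protocols, site_file_servers, path):
    """Pick the first client-preferred protocol the site can serve.

    Returns (protocol, transfer_url), or None if there is no common protocol.
    """
    for protocol in client_protocols:
        server = site_file_servers.get(protocol)
        if server is not None:
            return protocol, f"{server}/{path.lstrip('/')}"
    return None


# Hypothetical file servers at one site, keyed by protocol name.
SITE_FILE_SERVERS = {
    "gridftp": "gsiftp://fs1.example.org:2811",
    "http": "http://fs2.example.org",
}

# The client prefers bbftp, then gridftp, then http.
choice = negotiate_transfer(
    ["bbftp", "gridftp", "http"], SITE_FILE_SERVERS, "/cache/run1234.dat"
)
no_match = negotiate_transfer(["jparss"], SITE_FILE_SERVERS, "/cache/run1234.dat")
print(choice)
print(no_match)
```

In a real service the answer would arrive asynchronously, since the file may first have to be staged from tape to disk before a transfer URL is usable.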
ILDG Components
  MetaData Catalog (MDC)
    Each collaboration deploys one
    A mechanism for searching all of them (a virtual MDC) is under discussion but not yet defined
  Replica Catalog (RC)
    (same comments)
  Storage Resource Manager (SRM)
    Each collaboration deploys one or more
    At each SRM site, there are one or more file servers: http, ftp, gridftp, jparss, bbftp, …
  * Slide from Chip Watson, ILDG Middleware Project Status
JLab: Experimental Effort
  PPDG (Particle Physics Data Grid)
    Collaboration of computer scientists and physicists
    Developing and deploying production Grid systems for experiment-specific applications
    Now supporting OSG (Open Science Grid)
  SRM (Storage Resource Manager)
    A common/standard interface to mass storage systems
    In 2003 FSU used SRM v1 to process Monte Carlo for 30 million events
    In 2004 deployed a v2 implementation for testing
      Required for production in February 2005
    Already working with LBL, Fermi, CERN to define v3
  Job Submission
    PKI-based authentication to Auger (JLab job submission system)
    Investigated uJDL (a user-level job description language)
      BNL leading this effort
Envisioned Architecture
SRM v2
  Implemented SRM version
  Interface to Jasmine via the HPC Disk/Cache Manager
  JLab SRM is a Java Web Service
    Uses Apache Axis as SOAP engine
    Uses Apache Tomcat as servlet engine
    Uses GridFTP for file movement
  Testing with CMU
  Production service required by February
  Had a hard time using GT3
    Cannot just take components that one wants (it is all or nothing)
SRM v2 Server Deployment
  Requires Tomcat, MySQL, SRM worker daemon
  Firewall configuration:
    SRM port 8443
    GRIDFTP ports 2811,
  Currently only installed at JLab
    Testing client access with CMU
    Next step: install an SRM server at CMU
SRM v2 Client Deployment
  Installed at JLab and CMU
  Implements only srmGet and srmPut (permission problem to fix)
  Requires specific ant and java versions
  Proper grid certificate request and installation a challenge (?)
    Use OpenSSL for cert request instead
    Globus requires a full installation simply to request a cert and run the client
    Just need grid-proxy-init
  Note: Curtis' notes are at
  Currently the only SRM v2 server and client
Long-Term SRM Work
  We are considering how the next SRM version could become the primary interface to Jasmine and the primary farm file mover.
    Use for local and remote access
  Goal: 25 TB/day from tape through SRM.
  Balancing classes of requests/prioritizing types of data transfers becomes essential.
  Farm interaction use cases must be modeled: farm input, farm output, scheduling.
  We are already looking at what SRM v3 will look like.
    SRM Core Features and Feature Sets (ideas from the last SRM meeting)
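For scale, 25 TB/day is about 290 MB/sec averaged over a full day (decimal units). One simple way to picture prioritizing classes of requests is a strict-priority queue that keeps first-in, first-out order within each class. The sketch below is an illustration only, not a description of Jasmine or of any SRM implementation; the class names and their priority ordering are assumptions.

```python
import heapq
import itertools

# Hypothetical request classes; a lower number is served first.
PRIORITY = {"farm_input": 0, "farm_output": 1, "remote_export": 2}


class TransferQueue:
    """Strict-priority queue with FIFO order inside each request class."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker preserving arrival order

    def submit(self, request_class, path):
        """Queue a transfer request under one of the known classes."""
        entry = (PRIORITY[request_class], next(self._counter), request_class, path)
        heapq.heappush(self._heap, entry)

    def next_transfer(self):
        """Pop the highest-priority request, or return None if the queue is empty."""
        if not self._heap:
            return None
        _, _, request_class, path = heapq.heappop(self._heap)
        return request_class, path


queue = TransferQueue()
queue.submit("remote_export", "/mss/a.dat")
queue.submit("farm_input", "/mss/b.dat")
queue.submit("farm_input", "/mss/c.dat")
queue.submit("farm_output", "/mss/d.dat")

order = []
while (item := queue.next_transfer()) is not None:
    order.append(item)
print(order)
```

Strict priority can starve the lowest class under sustained load, so a production scheduler would more likely use weighted shares or aging. That is the kind of trade-off the farm input, farm output, and scheduling use cases would need to settle.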
Job Submission
  uJDL
    Is this really needed?
    Is a standard job submission interface what is really needed?
    Is that Condor-G?
  Auger interface
    Uses Java web services
    Uses PKI for authentication
      Not GSI
Grid3dev - OSG
  JLab development effort is limited
  Grid3 proved successful
    ATLAS and CMS were the major users
  JLab plans to join Grid3dev as a step toward OSG-INT/OSG
    We cannot develop everything we need
      VO management tools, monitoring, etc.
    Testing and evaluation
    Integration with facility infrastructure
    Determine what we need and can use for others