US Grid Efforts Lee Lueking D0 Remote Analysis Workshop February 12, 2002

“All of these projects are working towards the common goal of providing transparent access to the massively distributed computing infrastructure that is needed to meet the challenges of modern experiments …” (From the EU DataTAG proposal)

Grid Projects Timeline
Time axis: Q3 00, Q4 00, Q1 01, Q2 01, Q3 01, Q4 01, Q1 02
- GriPhyN: $11.9M + $1.6M
- PPDG: $9.5M
- iVDGL: $13.65M
- EU DataGrid: $9.3M
- EU DataTAG: 4M Euros
- GridPP:

PPDG
- Develop, acquire and deliver vitally needed Grid-enabled tools for data-intensive requirements of particle and nuclear physics.
- Collaboration of computer scientists with a strong record in distributed computing and Grid technology, and physicists with leading roles in the software and network infrastructures for major high-energy and nuclear experiments.
- Goals and plans are ultimately guided by the immediate, medium-term and longer-term needs and perspectives of the physics experiments.

GriPhyN: Grid Physics Network
- Virtual data technologies. Advances are required in information models and in new methods of cataloging, characterizing, validating, and archiving software components to implement virtual data manipulations.
- Policy-driven request planning and scheduling of networked data and computational resources. We require mechanisms for representing and enforcing both local and global policy constraints and new policy-aware resource discovery techniques.
- Management of transactions and task-execution across national-scale and worldwide virtual organizations. New mechanisms are needed to meet user requirements for performance, reliability, and cost.

iVDGL: International Virtual Data Grid Laboratory
- The iVDGL will provide a global computing resource for several leading international experiments in physics and astronomy.
- Global services and centralized monitoring, management, and support functions will be coordinated by the Grid Operations Center (GOC) located at Indiana University, with technical effort provided by GOC staff, iVDGL site staff, and the CS support teams.
- GriPhyN and Particle Physics Data Grid will provide the basic R&D and software toolkits needed for the laboratory.
- The European Union DataGrid is also a major participant and will contribute basic technologies and tools.
- The iVDGL will be based on the open Grid infrastructure provided by the Globus Toolkit and will also build on other technologies such as Condor resource management tools.

Comparison of PPDG and iVDGL
Funding
- PPDG: US DOE approved, 1/1/3/3/3 $M, 99 – 03
- iVDGL: US NSF proposed, 3/3/3/3/3 $M, 02 – 06
Computer Science
- PPDG: Globus (Foster), Condor (Livny), SDM (Shoshani), SRB (Moore)
- iVDGL: Globus (Foster, Kesselman), Condor (Livny)
Physics
- PPDG: BaBar, Dzero, STAR, JLAB, ATLAS, CMS
- iVDGL: ATLAS, CMS, LIGO, SDSS, NVO
National Laboratories
- PPDG: BNL, Fermilab, JLAB, SLAC, ANL, LBNL
- iVDGL: ANL, BNL, Fermilab (all unfunded collaborators)
Universities
- PPDG: Caltech, SDSS, UCSD, Wisconsin
- iVDGL: Florida, Chicago, Caltech, UCSD, Indiana, Boston, Wisconsin at Milwaukee, Pennsylvania State, Johns Hopkins, Wisconsin at Madison, Northwestern, USC, UT Brownsville, Hampton, Salish Kootenai College
Hardware
- PPDG: None
- iVDGL: ~20% of funding (Tier-2 Centers)
Network
- PPDG: No funding requested
- iVDGL: No funding requested
DataTAG complementary

PPDG Collaborators

PPDG Computer Science Groups
- Condor: develop, implement, deploy, and evaluate mechanisms and policies that support High Throughput Computing on large collections of computing resources with distributed ownership.
- Globus: developing fundamental technologies needed to build persistent environments that enable software applications to integrate instruments, displays, computational and information resources that are managed by diverse organizations in widespread locations.
- SDM (Scientific Data Management Research Group): optimized and standardized access to storage systems.
- Storage Resource Broker: client-server middleware that provides a uniform interface for connecting to heterogeneous data resources over a network and cataloging/accessing replicated data sets.
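
As a rough illustration of what "cataloging/accessing replicated data sets" can mean in practice, the sketch below maps one logical file name to several physical copies and picks a copy by site preference. It is not the Storage Resource Broker interface or any Grid toolkit API: the class `ReplicaCatalog`, its methods, the site names and the URLs are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class ReplicaCatalog:
    """Toy replica catalog (hypothetical; not the SRB API).

    Maps a logical file name to the physical copies held at each site.
    """

    def __init__(self):
        # logical name -> {site name -> physical URL}
        self._replicas = defaultdict(dict)

    def register(self, logical_name, site, physical_url):
        """Record that `site` holds a copy of `logical_name`."""
        self._replicas[logical_name][site] = physical_url

    def locate(self, logical_name, preferred_sites=()):
        """Return one physical URL, honoring site preference when possible."""
        replicas = self._replicas.get(logical_name)
        if not replicas:
            raise KeyError(f"no replica registered for {logical_name!r}")
        for site in preferred_sites:
            if site in replicas:
                return replicas[site]
        # No preferred site has a copy: fall back to the first registered one.
        return next(iter(replicas.values()))


# Hypothetical sites and URLs, for illustration only.
catalog = ReplicaCatalog()
catalog.register("run42/events.dat", "site_a",
                 "gsiftp://site-a.example.org/data/run42/events.dat")
catalog.register("run42/events.dat", "site_b",
                 "gsiftp://site-b.example.org/store/run42/events.dat")

nearest = catalog.locate("run42/events.dat", preferred_sites=["site_b"])
fallback = catalog.locate("run42/events.dat", preferred_sites=["site_c"])
```

The point of the indirection is that an analysis job refers only to the logical name, so copies can be added or moved without changing the job.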

Delivery of End-to-End Applications & Integrated Production Systems
to allow thousands of physicists to share data & computing resources for scientific processing and analyses
Operators & Users
Resources: Computers, Storage, Networks
PPDG Focus:
- Robust Data Replication
- Intelligent Job Placement and Scheduling
- Management of Storage Resources
- Monitoring and Information of Global Services
Relies on Grid infrastructure:
- Security & Policy
- High Speed Data Transfer
- Network management

Common Services
- Job Description Language
- Scheduling and Management of Processing and Data Placement Activities
- Monitoring and Status Reporting
- Storage Resource Management
- Reliable Replica Management Services
- File Transfer Services
- Collect and Document Current Experimental Practices
- R & D, Evaluation
- Authentication, Authorization, and Security
- End-to-End Applications and Testbeds

Project Activities, End-to-End Applications and Cross-Cut Pilots
- Project Activities are focused on Experiment – Computer Science collaborative developments.
  - Replicated data sets for science analysis: BaBar, CMS, STAR
  - Distributed Monte Carlo production services: ATLAS, D0, CMS
  - Common storage management and interfaces: STAR, JLAB
- End-to-End Applications are used in Experiment data handling systems to give real-world requirements, testing and feedback.
  - Error reporting and response
  - Fault tolerant integration of complex components
- Cross-Cut Pilots for common services and policies
  - Certificate Authority policy and authentication
  - File transfer standards and protocols
  - Resource Monitoring: networks, computers, storage

Security, Privacy, Legal Super Computing 2001 in Denver

PPDG activities as part of the Global Grid Community
- Coordination with other Grid Projects in our field:
  - GriPhyN: Grid Physics Network
  - European DataGrid
  - Storage Resource Management collaboratory
  - HENP Data Grid Coordination Committee
- Participation in Experiment and Grid deployments in our field:
  - ATLAS, BaBar, CMS, D0, STAR, JLAB experiment data handling systems
  - iVDGL/DataTAG: International Virtual Data Grid Laboratory
  - Use DTF computational facilities?
- Active in Standards Committees:
  - Internet2 HENP Working Group
  - Global Grid Forum

PPDG and GridPP Projects
- Use of Standard Middleware to Promote Interoperability
  - Move to Globus infrastructure: GSI, GridFTP
  - Use of Condor as a supported system for job submission
  - Publish availability of resources and file catalog
- Additional Grid Functionality for Job Specification, Submission, and Tracking
  - Use Condor for migration and checkpointing
  - Enhanced job specification language and services
- Enhanced Monitoring and Diagnostic Capabilities
- Fabric Management

PPDG Management and Coordination
- PIs: Livny, Newman, Mount
- Steering Committee
  - Ruth Pordes, Chair
  - Doug Olson, Physics Deputy Chair
  - Miron Livny, Computer Science Deputy Chair
  - Computer Science Group Representatives
  - Physics Experiment Representatives
  - PIs (ex officio)
- Groups represented: STAR, SDM, BaBar, SRB, JLAB, ATLAS, Globus, CMS, Condor, DZero
- Executive Team (>1.0 FTE on PPDG)
  - Steering Committee Chair
  - Steering Committee Physics and CS Deputy Chairs

iVDGL: International Virtual-Data Grid Laboratory
- A global Grid laboratory with participation from US, EU, Asia, etc.
- A place to conduct Data Grid tests “at scale”
- A mechanism to create common Grid infrastructure
- A facility to perform production exercises for LHC experiments
- A laboratory for other disciplines to perform Data Grid tests
“We propose to create, operate and evaluate, over a sustained period of time, an international research laboratory for data-intensive science.” (From NSF proposal, 2001)

iVDGL Summary Information
- Principal components (as seen by USA)
  - Tier1 sites (laboratories)
  - Tier2 sites (universities and other institutes)
  - Selected Tier3 sites (universities)
  - Fast networks: US, Europe, transatlantic
  - International Grid Operations Center (iGOC)
  - Computer Science support teams
  - Coordination, management
- Proposed international partners
  - Initially US, EU, Japan, Australia
  - Other world regions later
  - Discussions w/ Russia, China, Pakistan, India, South America
- Complementary EU project: DataTAG
  - Transatlantic network from CERN to STAR-TAP (+ people)
  - Initially 2.5 Gb/s

US Proposal to NSF
- US proposal approved by NSF Sept. 25, 2001
  - “Part 2” of GriPhyN project
  - Much more application oriented than first GriPhyN proposal
  - $15M, 5 years at $3M per year (huge constraint)
  - CMS + ATLAS + LIGO + SDSS/NVO + Computer Science
- Scope of US proposal
  - Deploy Grid laboratory with international partners
  - Acquire Tier2 hardware, Tier2 support personnel
  - Integration of Grid software into applications
  - CS support teams (+ 6 UK Fellows) to harden tools
  - Establish International Grid Operations Center (iGOC)
  - Deploy hardware at 3 minority institutions (Tier3)

US iVDGL Proposal Participants
- T2/Software
  - U Florida: CMS
  - Caltech: CMS, LIGO
  - UC San Diego: CMS, CS
  - Indiana U: ATLAS, iGOC
  - Boston U: ATLAS
  - U Wisconsin, Milwaukee: LIGO
  - Penn State: LIGO
  - Johns Hopkins: SDSS, NVO
- CS support
  - U Chicago: CS
  - U Southern California: CS
  - U Wisconsin, Madison: CS
- T3/Outreach
  - Salish Kootenai: Outreach, LIGO
  - Hampton U: Outreach, ATLAS
  - U Texas, Brownsville: Outreach, LIGO
- T1/Labs
  - Fermilab: CMS, SDSS, NVO
  - Brookhaven: ATLAS
  - Argonne Lab: ATLAS, CS

iVDGL Partners
- National partners
  - PPDG (Particle Physics Data Grid)
  - DTF: Distributed Terascale Facility
  - CAL-IT2 (new California Grid initiative)
- Current international partners
  - EU-DataGrid
  - UK PPARC funding agency
  - UK Core e-Science Program
  - 6 UK Fellowships
  - INFN (Italy)
  - 2 Japanese institutes
  - 1 Australian institute (APAC)

iVDGL Map Circa
Map legend: Tier0/1 facility, Tier2 facility, Tier3 facility, 10 Gbps link, 2.5 Gbps link, 622 Mbps link, Other link

iVDGL Requirements
- Realistic scale
  - In number, diversity, distribution, network connectivity
- Delegated management and local autonomy
  - Management needed to operate as large, single facility
  - Autonomy needed for sites and experiments
- Support large-scale experimentation
  - To provide useful information for building real Data Grids
- Robust operation
  - For long running applications in complex environment
- Instrumentation and monitoring
  - Required for an experimental facility
- Integration with international “cyberinfrastructure”
- Extensibility

Approach
- Define a laboratory architecture
  - Define expected laboratory functions
  - Build in scalability, extensibility, reproducibility
  - Define instrumentation, monitoring
  - Establish CS support teams (develop/harden tools, support users)
  - Define working relationship, coordination with partners
- Create and operate global-scale laboratory
  - Deploy hardware, software, personnel at Tier2, Tier3 sites
  - Establish iGOC, single point of contact for monitoring, support, …
  - Help international partners establish sites
- Evaluate and improve iVDGL through experimentation
  - CS support teams will work with experiments
  - Extend results to partners
- Engage underrepresented groups
  - Integrate minority institutions as Tier3 sites

iVDGL as a Laboratory
- Grid Exercises
  - “Easy”, intra-experiment tests first (10-30%, national, transatlantic)
  - “Harder” wide-scale tests later (30-100% of all resources)
  - CMS is already conducting transcontinental simulation productions
- Operation as a facility
  - Common software, central installation to ensure compatibility
  - CS teams to “harden” tools, support applications
  - iGOC to monitor performance, handle problems

Emphasize Simple Operation
- “Local” control of resources vitally important (site level or national level)
  - Experiments, politics demand it
- Operate mostly as a “partitioned” testbed (experiment, nation, etc.)
  - Avoids excessive coordination
  - Allows software tests in different partitions
- Hierarchy of operation must be defined
  - E.g., (1) national + experiment, (2) inter-experiment, (3) global tests
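
One way to picture the “partitioned” testbed and its hierarchy of operation is as the same set of sites grouped at three levels of coarseness. The sketch below is only an illustration under stated assumptions: the `Site` record, the `partition` helper and the three example sites are hypothetical, and reading level (2) as "all experiments within one nation" is an assumption, since the slide does not define it.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Site:
    """Hypothetical description of one testbed site."""
    name: str
    nation: str
    experiment: str


def partition(sites, *keys):
    """Group site names by the given attributes.

    With no keys, every site lands in one global partition labeled ().
    """
    groups = defaultdict(list)
    for site in sites:
        label = tuple(getattr(site, key) for key in keys)
        groups[label].append(site.name)
    return dict(groups)


# Illustrative sites only; not the actual iVDGL site list.
SITES = [
    Site("site_a", "US", "CMS"),
    Site("site_b", "US", "ATLAS"),
    Site("site_c", "EU", "CMS"),
]

level1 = partition(SITES, "nation", "experiment")  # (1) national + experiment
level2 = partition(SITES, "nation")                # (2) inter-experiment (assumed: per nation)
level3 = partition(SITES)                          # (3) global tests
```

Tests that stay inside a level-1 partition need agreement from only one experiment in one nation, which is the sense in which partitioning avoids excessive coordination.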

Other Disciplines
- Use by other disciplines
  - Expected to be at the 10% level
  - Other HENP experiments
  - Virtual Observatory (VO) community in Europe/US
  - Gravity wave community in Europe/US/Australia/Japan
  - Earthquake engineering
  - Bioinformatics
  - Our CS colleagues (wide scale tests)

US iVDGL Management and Coordination
- Project Directors: Avery, Foster
- Project Coordination Group
  - Project Coordinator
  - Project Directors
  - Coordinators of Systems Integration, Education/Outreach
  - Physics Experiment Representatives
  - University Research Center or Group Representatives
  - PACI Representatives
- iVDGL Design and Deployment
- Integration with Applications
- University Research Centers / Groups
- International Grid Operations Center
- Collaboration Board (Advisory)
- External Advisory Board

Conclusion
- PPDG and iVDGL are complementary in their approach and deliverables.
- These efforts, along with our European partners, will provide exciting new ways to share data and computing resources.
- Dzero Grid involvement offers many challenges, but even more opportunities.
Acknowledgements: Richard Mount (SLAC), Paul Avery (University of Florida), Ruth Pordes (FNAL).