Slide 1: Physics Grids and Open Science Grid
Paul Avery, University of Florida (avery@phys.ufl.edu)
CASC Meeting, Washington, DC, July 14, 2004
Slide 2: U.S. "Trillium" Grid Consortium
- Trillium = PPDG + GriPhyN + iVDGL
  - Particle Physics Data Grid (PPDG): $12M (DOE), 1999-2004+
  - GriPhyN: $12M (NSF), 2000-2005
  - iVDGL: $14M (NSF), 2001-2006
- Basic composition (~150 people)
  - PPDG: 4 universities, 6 labs
  - GriPhyN: 12 universities, SDSC, 3 labs
  - iVDGL: 18 universities, SDSC, 4 labs, foreign partners
  - Experiments: BaBar, D0, STAR, JLab, CMS, ATLAS, LIGO, SDSS/NVO
- Complementarity of projects
  - GriPhyN: CS research, Virtual Data Toolkit (VDT) development
  - PPDG: "end-to-end" Grid services, monitoring, analysis
  - iVDGL: Grid laboratory deployment using the VDT
- Experiments provide frontier challenges
- Unified entity when collaborating internationally
Slide 3: Trillium Science Drivers
- Experiments at the Large Hadron Collider: 100s of petabytes, 2007-?
- High-energy & nuclear physics experiments: ~1 petabyte (1000 TB), 1997-present
- LIGO (gravity-wave search): 100s of terabytes, 2002-present
- Sloan Digital Sky Survey: 10s of terabytes, 2001-present
- [Timeline figure: data growth and community growth, 2001-2009]
- Future Grid resources: massive CPU (PetaOps), large distributed datasets (>100 PB), global communities (1000s)
Slide 4: Large Hadron Collider (LHC) @ CERN
- Search for the origin of mass & supersymmetry (2007-?)
- 27 km tunnel spanning Switzerland & France
- Experiments: ATLAS, CMS, ALICE, LHCb, TOTEM
Slide 5: Trillium: Collaborative Program of Work
- Common experiments, leadership, participants
- CS research: workflow, scheduling, virtual data
- Common Grid toolkits and packaging: Virtual Data Toolkit (VDT) + Pacman packaging
- Common Grid infrastructure: Grid3, a national Grid for testing, development, and production
- Advanced networking: UltraNet, UltraLight, etc.
- Integrated education and outreach effort, plus collaboration with outside projects
- Unified entity in working with international projects: LCG, EGEE, South America
Slide 6: Goal: Peta-Scale Virtual-Data Grids for Global Science (GriPhyN)
[Architecture figure: production teams, workgroups, and single researchers use interactive user tools that drive virtual data tools, request planning & scheduling tools, and request execution & management tools; these sit on resource management services, security and policy services, and other Grid services, over distributed resources (code, storage, CPUs, networks) and raw data sources, with transforms linking the layers, at PetaOps / petabytes performance scales.]
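The request planning and execution layer in this picture is realized in practice with workflow tools such as Condor DAGMan, which ships in the Virtual Data Toolkit listed later. A minimal sketch of a two-stage workflow; the DAG, job names, and submit files are hypothetical:

    # a two-step workflow: generate simulated events, then reconstruct them;
    # DAGMan releases 'reco' only after 'gen' completes successfully
    cat > analysis.dag <<'EOF'
    JOB gen  gen.sub
    JOB reco reco.sub
    PARENT gen CHILD reco
    EOF
    # DAGMan submits and babysits the underlying Condor jobs in order
    condor_submit_dag analysis.dag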
Slide 7: LHC: Petascale Global Science
- Complexity: millions of individual detector channels
- Scale: PetaOps (CPU), 100s of petabytes (data)
- Distribution: global distribution of people & resources
- CMS example (2007): 5000+ physicists, 250+ institutes, 60+ countries
- BaBar/D0 example (2004): 700+ physicists, 100+ institutes, 35+ countries
Slide 8: CMS Experiment: Global LHC Data Grid Hierarchy
[Hierarchy figure: the online system feeds the Tier 0 CERN computer center; data flow out to Tier 1 national centers (USA, Korea, Russia, UK), Tier 2 centers (e.g. Caltech, UCSD, U Florida), and Tier 3/4 sites (e.g. Iowa, Maryland, FIU; physics caches and PCs), over links of 0.1-1.5 GB/s, >10 Gb/s, 10-40 Gb/s, and 2.5-10 Gb/s at successive levels.]
- ~10s of petabytes/yr by 2007-08; ~1000 petabytes in < 10 yrs?
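Bulk data movement between the tiers of such a hierarchy is handled by Grid data-transfer tools such as GridFTP, part of the VDT listed on a later slide. A minimal sketch of a single transfer; the hostnames and file paths are hypothetical, and exact options depend on the Globus release:

    # obtain a short-lived GSI proxy from the user's grid certificate
    grid-proxy-init
    # pull one file from a (hypothetical) Tier-1 GridFTP server to local disk,
    # using 4 parallel TCP streams to make better use of a long, fat network path
    globus-url-copy -p 4 \
        gsiftp://tier1.example.org/store/cms/run1234/reco.root \
        file:///data/cms/run1234/reco.root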
Slide 9: HEP Bandwidth Roadmap (Gb/s)
- HEP: leading role in network development/deployment
- [Bandwidth roadmap table]
Slide 10: Tier2 Centers
- Tier2 facility: 20-40% of a Tier1?
  - "1 FTE support": commodity CPU & disk, no hierarchical storage
- Essential university role in extended computing infrastructure
  - Validated by 3 years of experience with proto-Tier2 sites
- Functions: physics analysis, simulations; support experiment software; support smaller institutions
- Official role in the Grid hierarchy (U.S.)
  - Sanctioned by MOU with parent organization (ATLAS, CMS, LIGO)
  - Local P.I. with reporting responsibilities
  - Selection by the collaboration via a careful process
Slide 11: Analysis by Globally Distributed Teams
- Non-hierarchical: chaotic analyses + productions
- Superimpose significant random data flows
Slide 12: Trillium Grid Tools: Virtual Data Toolkit
[Build-and-test figure: sources from CVS and contributors (VDS, etc.) are patched, built as GPT source bundles, and built and tested on the NMI Condor pool (37 computers); the resulting RPMs and binaries are packaged into the VDT build and a Pacman cache, then tested; more NMI processes to be used later.]
Slide 13: Virtual Data Toolkit: Tools in VDT 1.1.12
- Globus Alliance: Grid Security Infrastructure (GSI), job submission (GRAM), information service (MDS), data transfer (GridFTP), Replica Location Service (RLS)
- Condor Group: Condor/Condor-G, DAGMan, Fault Tolerant Shell, ClassAds
- EDG & LCG: make-gridmap, certificate revocation list updater, GLUE schema / information provider
- ISI & UC: Chimera & related tools, Pegasus
- NCSA: MyProxy, GSI-OpenSSH
- LBL: PyGlobus, NetLogger
- Caltech: MonALISA
- VDT: VDT System Profiler, configuration software
- Others: KX509 (U. Michigan)
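As an illustration of how a user drives these components, the sketch below authenticates with GSI and submits one job through Condor-G to a remote GRAM gatekeeper. The gatekeeper hostname is hypothetical, and grid-universe submit keywords vary somewhat between Condor releases of this period:

    # Condor-G submit description for a job routed to a remote GRAM gatekeeper
    cat > hello.sub <<'EOF'
    universe      = grid
    grid_resource = gt2 gatekeeper.example.edu/jobmanager-condor
    executable    = /bin/hostname
    output        = hello.out
    error         = hello.err
    log           = hello.log
    queue
    EOF
    # create a GSI proxy, then hand the job to Condor-G for submission and tracking
    grid-proxy-init
    condor_submit hello.sub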
Slide 14: VDT Growth (currently 1.1.14)
- VDT 1.0: Globus 2.0b, Condor 6.3.1
- VDT 1.1.3, 1.1.4 & 1.1.5: pre-SC2002
- VDT 1.1.7: switch to Globus 2.2
- VDT 1.1.8: first real use by LCG
- VDT 1.1.11: Grid3
- VDT 1.1.14: May 10
Slide 15: Packaging of Grid Software: Pacman
- Language: define software environments
- Interpreter: create, install, configure, update, verify environments
- Combines and manages software from arbitrary sources: LCG/SCRAM, ATLAS/CMT, CMS DPE/tar/make, LIGO/tar/make, OpenSource/tar/make, Globus/GPT, NPACI/TeraGrid/tar/make, D0/UPS-UPD, commercial/tar/make
- "1-button install" reduces the burden on administrators: remote experts define installation, configuration, and updating for everyone at once
- Example: % pacman -get iVDGL:Grid3
[Figure: a single % pacman command pulls packages from caches such as VDT, ATLAS, NPACI, D-Zero, iVDGL, UCHEP, CMS/DPE, and LIGO.]
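A minimal sketch of the "1-button install" in practice. The cache and package name (iVDGL:Grid3) appear on the slide; the target directory is arbitrary, and the name of the generated setup script can differ between package sets:

    # install into an empty directory so the whole Grid3 stack stays self-contained
    mkdir grid3 && cd grid3
    # fetch, install, and configure the Grid3 package set from the iVDGL Pacman cache
    pacman -get iVDGL:Grid3
    # pick up the environment (paths, variables) defined by the installation
    source setup.sh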
Slide 16: Trillium Collaborative Relationships
[Relationship figure: internal and external computer-science research feeds the Virtual Data Toolkit through techniques & software, requirements, and prototyping & experiments; the VDT flows out via production deployment and tech transfer to partner physics projects, partner outreach projects, and the larger science community (Globus, Condor, NMI, iVDGL, PPDG, EU DataGrid, LHC experiments, QuarkNet, CHEPREO; U.S. and international Grids, outreach); other linkages include the work force, CS researchers, and industry.]
Slide 17: Grid3: An Operational National Grid
- 28 sites: universities + national labs
- 2800 CPUs, 400-1300 jobs
- Running since October 2003
- Applications in HEP, LIGO, SDSS, genomics
- http://www.ivdgl.org/grid3
Slide 18: Grid3: Three Months Usage
[Usage chart]
Slide 19: Production Simulations on Grid3
- US-CMS Monte Carlo simulation
- Grid3 resources used = 1.5 x the dedicated US-CMS resources
- [Chart: USCMS vs. non-USCMS contributions]
Slide 20: Grid3 to Open Science Grid
- Build on the Grid3 experience: a persistent, production-quality Grid with national + international scope
- Ensure a U.S. leading role in international science: Grid infrastructure for large-scale collaborative scientific research
- Create a large computing infrastructure: combine resources at DOE labs and universities to effectively become a single national computing infrastructure for science
- Provide opportunities for educators and students: participate in building and exploiting this Grid infrastructure; develop and train the scientific and technical workforce; transform the integration of education and research at all levels
- http://www.opensciencegrid.org
Slide 21: Roadmap towards Open Science Grid
- Iteratively build & extend Grid3 into a national infrastructure (Grid3 to Open Science Grid)
  - Shared resources, benefiting a broad set of disciplines
  - Strong focus on operations
- Contribute US-LHC resources to provide the initial federation
  - US-LHC Tier-1 and Tier-2 centers provide significant resources
- Build OSG from laboratories, universities, campus grids, etc.
  - Argonne, Fermilab, SLAC, Brookhaven, Berkeley Lab, Jefferson Lab
  - UW Madison, U Florida, Purdue, Chicago, Caltech, Harvard, etc.
  - Corporations?
- Further develop OSG
  - Partnerships and contributions from other sciences and universities
  - Incorporation of advanced networking
  - Focus on general services, operations, end-to-end performance
Slide 22: Possible OSG Collaborative Framework
[Organization figure: a Consortium Board (1) and joint committees (0...N, small) coordinate Technical Groups (0...n, small) and activities (0...N, large); participants — research Grid projects, virtual organizations, researchers, sites, service providers, campuses, labs, and enterprise — provide resources, management, and project steering groups.]
Slide 23: Open Science Grid Consortium
- Join U.S. forces into the Open Science Grid consortium: application scientists, technology providers, resource owners, ...
- Provide a lightweight framework for joint activities:
  - Coordinate activities that cannot be done within one project
  - Achieve technical coherence and strategic vision (roadmap)
  - Enable communication and provide liaising
  - Reach common decisions when necessary
- Create Technical Groups: propose, organize, and oversee activities; peer; liaise
  - Now: security and storage services
  - Later: support centers, policies
- Define activities: contributing tasks provided by participants joining the activity
Slide 24: Applications, Infrastructure, Facilities
Slide 25: Open Science Grid Partnerships
Slide 26: Partnerships for Open Science Grid
- Complex matrix of partnerships brings challenges & opportunities:
  - NSF and DOE; universities and labs; physics and computing departments
  - U.S. LHC and the Trillium Grid projects
  - Application sciences, computer sciences, information technologies
  - Science and education
  - U.S., Europe, Asia, North and South America
  - Grid middleware and infrastructure projects
  - CERN and U.S. HEP labs; LCG and the experiments; EGEE and OSG
- ~85 participants on the OSG stakeholder mail list
- ~50 authors on the OSG abstract for the CHEP conference (Sep. 2004)
- Meetings
  - Jan. 12: initial stakeholders meeting
  - May 20-21: joint Trillium Steering meeting to define the program
  - Sep. 9-10: OSG workshop, Boston
Slide 27: Education and Outreach
Slide 28: Grids and the Digital Divide (Rio de Janeiro, Feb. 16-20, 2004)
- Background: World Summit on the Information Society; HEP Standing Committee on Inter-Regional Connectivity (SCIC)
- Themes: global collaborations, Grids, and addressing the digital divide
- Next meeting: 2005 (Korea)
- http://www.uerj.br/lishep2004
Slide 29: iVDGL, GriPhyN Education / Outreach
- Basics: $200K/yr; led by UT Brownsville
- Workshops, portals
- Partnerships with CHEPREO, QuarkNet, ...
Slide 30: Grid Summer School, June 21-25
- First of its kind in the U.S. (South Padre Island, Texas)
  - 36 students of diverse origins and backgrounds (men and women, students from minority-serving institutions, etc.)
- Marks a new direction for Trillium
  - First attempt to systematically train people in Grid technologies
  - First attempt to gather the relevant materials in one place
  - Today: students in CS and physics; later: students, postdocs, junior & senior scientists
- Reaching a wider audience
  - Put lectures, exercises, and video on the web
  - More tutorials, perhaps 3-4/year
  - Dedicated resources for remote tutorials
  - Create a "Grid book" (e.g. Georgia Tech)
- New funding opportunities
  - NSF: new training & education programs
Slide 32: CHEPREO: Center for High Energy Physics Research and Educational Outreach (Florida International University)
- Physics Learning Center
- CMS research
- iVDGL Grid activities
- AMPATH network (S. America)
- Funded September 2003: $4M initially (3 years), spanning 4 NSF directorates!
Slide 33: UUEO: A New Initiative
- Meeting April 8 in Washington, DC
- Brought together ~40 outreach leaders (including NSF)
- Proposed a Grid-based framework for a common E/O effort
Slide 34: Grid Project References
- Grid3: www.ivdgl.org/grid3
- PPDG: www.ppdg.net
- GriPhyN: www.griphyn.org
- iVDGL: www.ivdgl.org
- Globus: www.globus.org
- LCG: www.cern.ch/lcg
- EU DataGrid: www.eu-datagrid.org
- EGEE: egee-ei.web.cern.ch
- "The Grid" (2nd edition): www.mkp.com/grid2
Slide 35: Extra Slides
Slide 36: Outreach: QuarkNet-Trillium Virtual Data Portal
- More than a web site:
  - Organize datasets
  - Perform simple computations
  - Create new computations & analyses
  - View & share results
  - Annotate & enquire (metadata)
  - Communicate and collaborate
- Easy to use, ubiquitous, no tools to install
- Open to the community: grow & extend
- Initial prototype implemented by graduate student Yong Zhao and M. Wilde (U. of Chicago)
Slide 37: GriPhyN Achievements
- Virtual Data paradigm to express science processes
  - Unified language (VDL) to express general data transformations
  - Advanced planners, executors, monitors, predictors, and fault recovery to make the Grid "like a workstation"
- Virtual Data Toolkit (VDT)
  - Tremendously simplified installation & configuration of Grids
- Close partnership with, and adoption by, multiple sciences: ATLAS, CMS, LIGO, SDSS, bioinformatics, EU projects
- Broad education & outreach program (UT Brownsville)
  - 25 graduate students, 2 undergraduates; 3 CS PhDs by end of 2004
  - Virtual Data for the QuarkNet Cosmic Ray project
  - Grid Summer School 2004, with 3 MSIs participating
Slide 38: Grid3 Broad Lessons
- Careful planning and coordination are essential to build Grids
  - Community investment of time/resources
- An operations team is needed to run the Grid as a facility
  - Tools, services, procedures, documentation, organization
  - Security, account management, multiple organizations
- Strategies are needed to cope with increasingly large scale
  - "Interesting" failure modes as scale increases
- Delegation of responsibilities to conserve human resources
  - Project, virtual organization, Grid service, site, application
  - Better services, documentation, packaging
- Grid3 experience is critical for building "useful" Grids
  - Frank discussion in the "Grid3 Project Lessons" document
Slide 39: Grid2003 Lessons (1)
- Need for financial investment
  - Grid projects: PPDG + GriPhyN + iVDGL: $35M
  - National labs: Fermilab, Brookhaven, Argonne, LBL
  - Other experiments: LIGO, SDSS
- Critical role of collaboration
  - CS + 5 HEP experiments + LIGO + SDSS + biology
  - Makes possible a large common effort
  - Collaboration sociology is ~50% of the issues
- Building "stuff": prototypes, testbeds, production Grids
  - Builds expertise, debugs software, develops tools & services
- Need for an "operations team"
  - Point of contact for problems, questions, outside inquiries
  - Resolution of problems by direct action or consultation with experts
Slide 40: Grid2003 Lessons (2): Packaging
- VDT + Pacman: installation and configuration
  - Simple installation and configuration of Grid tools + applications
  - Major advances over 14 VDT releases
- Why this is critical for us
  - Uniformity and validation across sites
  - Lowered barriers to participation, hence more sites!
  - Reduced FTE overhead & communication traffic
- The next frontier: remote installation
Slide 41: Grid2003 and Beyond
- Further evolution of Grid3 (Grid3+, etc.)
  - Contribute resources to a persistent Grid
  - Maintain a development Grid, test new software releases
  - Integrate software into the persistent Grid
  - Participate in LHC data challenges
- Involvement of new sites
  - New institutions and experiments
  - New international partners (Brazil, Taiwan, Pakistan?, ...)
- Improvements in Grid middleware and services
  - Storage services, integration of multiple VOs, monitoring, troubleshooting, accounting
Slide 42: International Virtual Data Grid Laboratory
[Site map: Tier1, Tier2, and other sites including UF, UW Madison, BNL, Indiana, Boston U, SKC, Brownsville, Hampton, PSU, J. Hopkins, Caltech, FIU, Austin, Michigan, LBL, Argonne, Vanderbilt, UCSD, Fermilab, Iowa, Chicago, UW Milwaukee, ISI, and Buffalo, with partners in the EU, Brazil, and Korea.]
Slide 43: Roles of iVDGL Institutions
- U Florida: CMS (Tier2), management
- Caltech: CMS (Tier2), LIGO (management)
- UC San Diego: CMS (Tier2), CS
- Indiana U: ATLAS (Tier2), iGOC (operations)
- Boston U: ATLAS (Tier2)
- Harvard: ATLAS (management)
- Wisconsin, Milwaukee: LIGO (Tier2)
- Penn State: LIGO (Tier2)
- Johns Hopkins: SDSS (Tier2), NVO
- Chicago: CS, coordination/management, ATLAS (Tier2)
- Vanderbilt*: BTeV (Tier2, unfunded)
- Southern California: CS
- Wisconsin, Madison: CS
- Texas, Austin: CS
- Salish Kootenai: LIGO (outreach, Tier3)
- Hampton U: ATLAS (outreach, Tier3)
- Texas, Brownsville: LIGO (outreach, Tier3)
- Fermilab: CMS (Tier1), SDSS, NVO
- Brookhaven: ATLAS (Tier1)
- Argonne: ATLAS (management), coordination
Slide 44: "Virtual Data": Derivation & Provenance
- Most scientific data are not simple "measurements"
  - They are computationally corrected/reconstructed
  - They can be produced by numerical simulation
- Science & engineering projects are increasingly CPU- and data-intensive
  - Programs are significant community resources (transformations)
  - So are the executions of those programs (derivations)
  - Management of dataset dependencies is critical!
- Derivation: instantiation of a potential data product
- Provenance: complete history of any existing data product
- Previously: manual methods; GriPhyN: automated, robust tools
Slide 45: Virtual Data Motivations
[Diagram: transformations, derivations, and data products, linked by "execution-of" (a derivation executes a transformation), "product-of", and "consumed-by / generated-by" (data consumed by or generated by an execution) relationships.]
- "I've detected a muon calibration error and want to know which derived data products need to be recomputed."
- "I've found some interesting data, but I need to know exactly what corrections were applied before I can trust it."
- "I want to search a database for 3-muon SUSY events. If a program that does this analysis exists, I won't have to write one from scratch."
- "I want to apply a forward-jet analysis to 100M events. If the results already exist, I'll save weeks of computation."
Slide 46: Virtual Data Example: LHC Analysis
[Diagram: a tree of derived datasets branching on selections such as decay = WW, decay = ZZ, decay = bb, WW decays to electrons or to leptons, Pt > 20, mass = 160, and other cuts.]
- A scientist adds a new derived-data branch and continues the analysis
Slide 47: Sloan Digital Sky Survey (SDSS): Using Virtual Data in GriPhyN
- Example analysis: galaxy cluster size distribution from Sloan data
Slide 48: The LIGO International Grid
- LIGO Grid: 6 US sites + 3 EU sites (Birmingham and Cardiff in the UK, AEI/Golm in Germany)
- LHO, LLO: observatory sites
- LSC: LIGO Scientific Collaboration
Slide 49: Grids: Enhancing Research & Learning
- Fundamentally alters the conduct of scientific research
  - "Lab-centric": activities center around a large facility
  - "Team-centric": resources shared by distributed teams
  - "Knowledge-centric": knowledge generated and used by a community
- Strengthens the role of universities in research
  - Couples universities to data-intensive science
  - Couples universities to national & international labs
  - Brings front-line research and resources to students
  - Exploits the intellectual resources of formerly isolated schools
  - Opens new opportunities for minority and women researchers
- Builds partnerships to drive advances in IT, science, and engineering
  - HEP, physics, astronomy, biology, CS, etc.
  - "Application" sciences and computer science; universities and laboratories; scientists and students; the research community and the IT industry