Slide 1: The Promise of Computational Grids in the LHC Era
Paul Avery, University of Florida, Gainesville, Florida, USA
CHEP 2000, Padova, Italy, Feb. 7-11, 2000

Slide 2: LHC Computing Challenges
- Complexity of the LHC environment and of the resulting data
- Scale: petabytes of data per year
- Geographical distribution of people and resources
- Example: CMS has 1800 physicists in 150 institutes across 32 countries

Slide 3: Dimensioning / Deploying IT Resources
- The LHC computing scale is "something new"
- The solution requires a directed effort and new initiatives
- The solution must build on existing foundations
  - Robust computing at national centers is essential
  - Universities must have the resources to maintain intellectual strength, foster training, and engage fresh minds
- Scarce resources are, and will remain, a fact of life: plan for it
- Goal: obtain new resources and optimize the deployment of all resources to maximize their effectiveness
  - CPU: CERN / national lab / region / institution / desktop
  - Data: CERN / national lab / region / institution / desktop
  - Networks: international / national / regional / local

Slide 4: Deployment Considerations
- Proximity of datasets to appropriate IT resources (a simple placement rule, sketched below)
  - Massive datasets → CERN and the national labs
  - Data caches → regional centers
  - Mini-summaries → institutions
  - Micro-summaries → desktops
- Efficient use of network bandwidth: local > regional > national > international
- Utilize all intellectual resources
  - CERN, national labs, universities, remote sites
  - Scientists and students
- Leverage training and education at the universities
- Follow the lead of the commercial world: distributed data and web servers
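The proximity bullets above amount to a simple placement rule: the finer-grained the data product, the farther down the hierarchy it can live. The snippet below is a minimal sketch of that rule in Python; the dictionary, tier labels, and function name are illustrative choices rather than anything specified in the talk.

```python
# Illustrative sketch (not from the talk): map dataset granularity to the tier
# that hosts it, following the proximity rule on the slide above.

PLACEMENT_POLICY = {
    "massive": "Tier 0/1 (CERN and national labs)",
    "data cache": "Tier 2 (regional center)",
    "mini-summary": "Tier 3 (institution)",
    "micro-summary": "Tier 4 (desktop)",
}

def place_dataset(granularity: str) -> str:
    """Return the tier where a dataset of the given granularity is hosted."""
    try:
        return PLACEMENT_POLICY[granularity]
    except KeyError:
        raise ValueError(f"unknown dataset granularity: {granularity!r}")

if __name__ == "__main__":
    for kind in PLACEMENT_POLICY:
        print(f"{kind:>13} -> {place_dataset(kind)}")
```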

Slide 5: Solution: A Data Grid
- A hierarchical grid is the best deployment option
  - Hierarchy → optimal resource layout (MONARC studies)
  - Grid → a unified system
- Arrangement of resources (represented as a tree in the sketch below)
  - Tier 0: central laboratory computing resources (CERN)
  - Tier 1: national center (Fermilab / BNL)
  - Tier 2: regional computing center (university)
  - Tier 3: university group computing resources
  - Tier 4: individual workstation / CPU
- We call this arrangement a "Data Grid" to reflect the overwhelming role that data plays in deployment
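A minimal sketch of the five-tier arrangement as a nested data structure, assuming a single representative site per tier; the Site class and any site names not on the slide are placeholders, not part of the proposal.

```python
# Illustrative sketch (not from the talk): the five-tier Data Grid arrangement
# expressed as a simple tree of sites.
from dataclasses import dataclass, field

@dataclass
class Site:
    name: str
    tier: int
    children: list["Site"] = field(default_factory=list)

    def walk(self):
        """Yield this site and every site below it in the hierarchy."""
        yield self
        for child in self.children:
            yield from child.walk()

grid = Site("CERN", tier=0, children=[
    Site("Fermilab / BNL", tier=1, children=[
        Site("Regional center (university)", tier=2, children=[
            Site("University group", tier=3, children=[
                Site("Individual workstation", tier=4),
            ]),
        ]),
    ]),
])

if __name__ == "__main__":
    for site in grid.walk():
        print(f"Tier {site.tier}: {site.name}")
```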

Slide 6: Layout of Resources
- We want a good "impedance match" between tiers
  - Tier N-1 serves Tier N
  - Tier N must be big enough to exert influence on Tier N-1
  - Tier N-1 must be small enough not to duplicate Tier N
- Resources should be roughly balanced across the tiers. Is the proposed balance reasonable?

Slide 7: Data Grid Hierarchy (Schematic)
[Schematic diagram: Tier 0 (CERN) at the top, feeding the Tier 1 centers below it.]

Slide 8: US Model Circa 2005
[Diagram of the proposed US deployment, circa 2005:
- CERN (CMS/ATLAS): 350k SI95, disk and tape robot
- Tier 1 (FNAL/BNL): 70k SI95, 70 TBytes of disk, tape robot
- Tier 2 centers: 20k SI95, 25 TBytes of disk, tape robot
- Tier 3: university working groups (WG 1 ... WG M)
- Network links of 2.4 Gbps, N × 622 Mbits/s, and 622 Mbits/s between the levels]
(A rough capacity comparison across the tiers is sketched below.)
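The diagram quotes per-site CPU capacities but not the number of sites per tier. The sketch below is therefore an assumption-laden comparison: the site counts are hypothetical and chosen only to illustrate the kind of balance check raised on slide 6.

```python
# Rough arithmetic: per-site capacities are taken from the slide; the site counts
# are NOT stated there and are placeholders chosen for illustration only.

capacity_si95 = {"Tier 0 (CERN)": 350_000, "Tier 1": 70_000, "Tier 2": 20_000}
assumed_sites = {"Tier 0 (CERN)": 1, "Tier 1": 5, "Tier 2": 20}   # hypothetical

for tier, per_site in capacity_si95.items():
    total = per_site * assumed_sites[tier]
    print(f"{tier}: {assumed_sites[tier]} site(s) x {per_site:,} SI95 = {total:,} SI95")
```

With these assumed counts the aggregate capacity of each tier comes out comparable, which is the sense in which the layout could be called balanced.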

Slide 9: Data Grid Hierarchy (CMS)
[Diagram of the CMS data grid hierarchy:
- Online system → offline farm (~20 TIPS) at the CERN computer center (Tier 0), fed at ~100 MBytes/sec
- Tier 1 regional centers: Fermilab (~4 TIPS) plus regional centers in France, Italy, and Germany
- Tier 2 centers (~1 TIPS each)
- Tier 3: institute servers (~0.25 TIPS) holding the physics data caches
- Tier 4: physicist workstations
- Other link speeds shown: ~PBytes/sec, ~2.4 Gbits/sec, ~622 Mbits/sec, and 1-10 Gbits/sec
Annotations: one bunch crossing every 25 nsec; 100 triggers per second; each event is ~1 MByte; physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels, and the data for those channels is cached on the institute server; 1 TIPS = 25,000 SpecInt95]
(The trigger rate and event size imply the data rates quoted above; see the check below.)
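The trigger rate and event size quoted in the diagram fix the data rate out of the online system and, under a conventional (assumed) figure of about 10^7 live seconds per year, the yearly raw data volume. A quick back-of-envelope check:

```python
# Back-of-envelope check. Assumption: ~1e7 seconds of data taking per year,
# a conventional figure that is not stated on the slide.
trigger_rate_hz = 100          # triggers per second (slide 9)
event_size_bytes = 1e6         # ~1 MByte per event (slide 9)
live_seconds_per_year = 1e7    # assumed

rate = trigger_rate_hz * event_size_bytes   # bytes/sec out of the online system
yearly = rate * live_seconds_per_year       # raw data volume per year

print(f"data rate  : {rate / 1e6:.0f} MBytes/sec")   # ~100 MBytes/sec, as in the diagram
print(f"yearly raw : {yearly / 1e15:.1f} PBytes")    # ~1 PByte/year, cf. slide 2
```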

Slide 10: Why a Data Grid: Physical
- A unified system: all computing resources are part of the grid
  - Efficient use of resources (manage scarcity)
  - Spikes in usage average out
  - Resource discovery, scheduling, and coordination become truly possible
  - "The whole is greater than the sum of its parts"
- Optimal data distribution and proximity (a replica-selection sketch follows below)
  - Labs are close to the data they need
  - Users are close to the data they need
  - No data or network bottlenecks
- Scalable growth
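One way to read "users are close to the data they need" is as a replica-selection policy: send each request to the nearest tier that holds a copy. The sketch below is purely illustrative; the dataset name, site names, and tier distances are invented.

```python
# Illustrative sketch (not from the talk): choose the "closest" replica of a
# dataset, ranking candidate sites by tier distance from the requesting user.

REPLICAS = {
    # dataset name -> {site: tier at which the replica lives}  (all invented)
    "higgs-candidates-v1": {"CERN": 0, "FNAL": 1, "Florida Tier 2": 2},
}

def nearest_replica(dataset: str, user_tier: int) -> str:
    """Return the replica site whose tier is closest to the user's tier."""
    sites = REPLICAS.get(dataset)
    if not sites:
        raise LookupError(f"no replica registered for {dataset!r}")
    return min(sites, key=lambda site: abs(sites[site] - user_tier))

if __name__ == "__main__":
    # A physicist at a Tier 3 institute is steered to the Tier 2 cache, not to CERN.
    print(nearest_replica("higgs-candidates-v1", user_tier=3))
```

A real grid scheduler would of course also weigh network cost and current load, not just tier distance; the point here is only the proximity principle.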

Slide 11: Why a Data Grid: Political
- A central lab cannot manage or help thousands of users
  - It is easier to leverage resources, maintain control, and assert priorities regionally
- Cleanly separates functionality
  - Different resource types in different tiers
  - Funding complementarity (NSF vs. DOE)
  - Targeted initiatives
- New IT resources can be added "naturally"
  - Additional matching resources at Tier 2 universities
  - Larger institutes can join, bringing their own resources
  - Tap into the new resources opened up by the IT "revolution"
- Broadens the community of scientists and students
  - Training and education
  - The vitality of the field depends on the university / lab partnership

Slide 12: Tier 2 Regional Centers
- Possible model: CERN : national : Tier 2 resources split 1/3 : 1/3 : 1/3
- Complementary role to the Tier 1 lab-based centers
  - Less need for 24 × 7 operation → lower component costs
  - Less production-oriented → can respond to analysis priorities
  - Flexible organization, e.g. by physics goals or subdetectors
  - A variable fraction of resources is available to outside users
- The range of activities includes
  - Reconstruction, simulation, physics analyses
  - Data caches / mirrors to support analyses
  - Production in support of the parent Tier 1
  - Grid R&D ...
[Side graphic: the Tier 0 through Tier 4 spectrum, labeled "More Organization" at one end and "More Flexibility" at the other.]

Slide 13: Distribution of Tier 2 Centers
- Tier 2 centers are arranged regionally in the US model
  - Good networking connections to move data (caches)
  - Location independence of users is always maintained, which increases collaborative possibilities
  - Emphasis on training and on the involvement of students
  - A high quality desktop environment for remote collaboration, e.g. the next-generation VRVS system

Slide 14: Strawman Tier 2 Architecture
  Linux farm of 128 nodes                                        $0.30 M
  Sun data server with RAID array                                $0.10 M
  Tape library                                                   $0.04 M
  LAN switch                                                     $0.06 M
  Collaborative infrastructure                                   $0.05 M
  Installation and infrastructure                                $0.05 M
  Network connection to the Abilene network                      $0.14 M
  Tape media and consumables                                     $0.04 M
  Staff (operations and system support)                          $0.20 M *
  Total estimated cost (first year)                              $0.98 M
  Cost in succeeding years (evolution, upgrades, and operations) $0.68 M
* 1.5-2 FTE of support required per Tier 2 center; physicists from the institute also aid in support.
(The line items are summed as a quick check below.)
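A quick check that the line items on the slide add up to the quoted first-year total; the figures are exactly those listed above.

```python
# Check the arithmetic on the slide: the line items should sum to the quoted total.
items_musd = {
    "Linux farm (128 nodes)": 0.30,
    "Sun data server with RAID array": 0.10,
    "Tape library": 0.04,
    "LAN switch": 0.06,
    "Collaborative infrastructure": 0.05,
    "Installation and infrastructure": 0.05,
    "Net connect to Abilene": 0.14,
    "Tape media and consumables": 0.04,
    "Staff (ops and system support)": 0.20,
}

total = sum(items_musd.values())
print(f"first-year total: ${total:.2f} M")   # $0.98 M, as quoted on the slide
assert round(total, 2) == 0.98
```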

Slide 15: Strawman Tier 2 Evolution
  Linux farm:                    1,500 SI95          →  20,000 SI95 *
  Disks on CPUs:                 4 TB                →  20 TB
  RAID array:                    1 TB                →  20 TB
  Tape library:                  1 TB                →  … TB
  LAN speed:                     … Gbps              →  … Gbps
  WAN speed:                     … Mbps              →  … Gbps
  Collaborative infrastructure:  MPEG2 VGA (… Mbps)  →  realtime HDTV (… Mbps)
- RAID disk is used for "higher availability" data
* Reflects lower Tier 2 component costs due to less demanding usage, e.g. simulation.

Slide 16: The GriPhyN Project
- A joint project involving
  - US-CMS and US-ATLAS
  - LIGO (gravitational wave experiment)
  - SDSS (Sloan Digital Sky Survey)
- Requesting funds from the NSF to build the world's first production-scale grid(s)
  - Sub-implementations for each experiment
  - NSF pays for the Tier 2 centers, some R&D, and some networking
- Realizing a unified Grid system requires research
  - Many problems are common across the different implementations
  - Requires a partnership with CS professionals

Slide 17: R & D Foundations I
- Globus (Grid middleware)
  - Grid-wide services
  - Security
- Condor (see the M. Livny paper)
  - A general language for service seekers and service providers (a toy matchmaking sketch follows below)
  - Resource discovery
  - Resource scheduling, coordination, and (co)allocation
- GIOD (networked object databases)
- Nile (fault-tolerant distributed computing)
  - Java-based toolkit, running on CLEO
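The "service seekers / service providers" language of Condor pairs job requirements with machine offers. The sketch below imitates that idea in plain Python; it is not Condor's ClassAd syntax or API, and every name and number in it is invented.

```python
# Toy matchmaking sketch in the spirit of the requirements/offers language
# mentioned on the slide. Not Condor code; all attributes are invented.

job = {
    "requirements": lambda m: m["arch"] == "x86" and m["memory_mb"] >= 512,
    "rank": lambda m: m["mips"],            # prefer the fastest matching machine
}

machines = [
    {"name": "tier2-node-01", "arch": "x86", "memory_mb": 1024, "mips": 400},
    {"name": "tier2-node-02", "arch": "x86", "memory_mb": 256,  "mips": 800},
    {"name": "tier3-desktop", "arch": "ppc", "memory_mb": 2048, "mips": 300},
]

def match(job, machines):
    """Return the best machine satisfying the job's requirements, or None."""
    candidates = [m for m in machines if job["requirements"](m)]
    return max(candidates, key=job["rank"], default=None)

print(match(job, machines)["name"])   # tier2-node-01
```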

Slide 18: R & D Foundations II
- MONARC
  - Construct and validate architectures
  - Identify the important design parameters
  - Simulate an extremely complex, dynamic system
- PPDG (Particle Physics Data Grid)
  - DOE / NGI funded for one year
  - Testbed systems
  - Its later program of work is incorporated into GriPhyN

Slide 19: The NSF ITR Initiative
- Information Technology Research program
  - Aimed at funding innovative research in IT
  - $90M in funds authorized
  - A maximum of $12.5M for a single proposal (over 5 years)
  - Requires extensive student support
- GriPhyN submitted a preproposal on Dec. 30, 1999
  - We intend that ITR fund most of our Grid research program
  - Major costs are for people, especially students and postdocs
  - Minimal equipment
  - Some networking
- The full proposal is due April 17, 2000

Slide 20: Summary of Data Grids and the LHC
- Develop an integrated distributed system while meeting the LHC goals
  - ATLAS/CMS: production and data-handling oriented
  - (LIGO/SDSS: computation and "commodity component" oriented)
- Build and test the regional center hierarchy
  - Tier 2 / Tier 1 partnership
  - Commission and test the software, the data handling systems, and the data analysis strategies
- Build and test the enabling collaborative infrastructure
  - Focal points for student-faculty interaction in each region
  - Realtime high-resolution video as part of the collaborative environment
- Involve students at universities in building the data analysis and in the physics discoveries at the LHC