The Promise of Computational Grids in the LHC Era
Paul Avery
University of Florida, Gainesville, Florida, USA
avery@phys.ufl.edu
http://www.phys.ufl.edu/~avery/
CHEP 2000, Padova, Italy, Feb. 7-11, 2000
LHC Computing Challenges
• Complexity of the LHC environment and the resulting data
• Scale: petabytes of data per year
• Geographical distribution of people and resources
Example: CMS, with 1800 physicists from 150 institutes in 32 countries
Dimensioning / Deploying IT Resources
• The LHC computing scale is "something new"
• Solution requires directed effort, new initiatives
• Solution must build on existing foundations
  - Robust computing at national centers is essential
  - Universities must have resources to maintain intellectual strength, foster training, engage fresh minds
• Scarce resources are/will be a fact of life → plan for it
• Goal: get new resources, and optimize deployment of all resources to maximize effectiveness
  - CPU: CERN / national lab / region / institution / desktop
  - Data: CERN / national lab / region / institution / desktop
  - Networks: international / national / regional / local
Deployment Considerations
• Proximity of datasets to appropriate IT resources
  - Massive datasets → CERN & national labs
  - Data caches → regional centers
  - Mini-summaries → institutional servers
  - Micro-summaries → desktops
• Efficient use of network bandwidth
  - Local > regional > national > international
• Utilizing all intellectual resources
  - CERN, national labs, universities, remote sites
  - Scientists, students
• Leverage training and education at universities
• Follow the lead of the commercial world
  - Distributed data, web servers
Solution: A Data Grid
• A hierarchical grid is the best deployment option
  - Hierarchy → optimal resource layout (MONARC studies)
  - Grid → unified system
• Arrangement of resources (sketched in code below):
  - Tier 0: central laboratory computing resources (CERN)
  - Tier 1: national center (Fermilab / BNL)
  - Tier 2: regional computing center (university)
  - Tier 3: university group computing resources
  - Tier 4: individual workstation/CPU
• We call this arrangement a "Data Grid" to reflect the overwhelming role that data plays in deployment
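The tier arrangement above is essentially a tree in which every Tier N site is served by a Tier N-1 parent. As a minimal sketch only (the class and site names are hypothetical, not part of any Grid toolkit), the following Python snippet models that structure and shows the chain a request climbs from a Tier 4 workstation toward Tier 0:

    # Minimal sketch: the Tier 0-4 Data Grid as a tree of sites,
    # each served by its Tier N-1 parent. Names are illustrative only.
    class Site:
        def __init__(self, name, tier, parent=None):
            self.name = name        # e.g. "CERN", "FNAL", a university
            self.tier = tier        # 0 = central lab ... 4 = workstation
            self.parent = parent    # the Tier N-1 site that serves this one
            self.children = []
            if parent is not None:
                parent.children.append(self)

        def path_to_tier0(self):
            """Chain of sites a request climbs when data is not held locally."""
            site, chain = self, []
            while site is not None:
                chain.append(site)
                site = site.parent
            return chain

    cern   = Site("CERN (Tier 0)", 0)
    fnal   = Site("FNAL (Tier 1)", 1, parent=cern)
    region = Site("Regional center (Tier 2)", 2, parent=fnal)
    group  = Site("University group (Tier 3)", 3, parent=region)
    desk   = Site("Workstation (Tier 4)", 4, parent=group)

    print(" -> ".join(s.name for s in desk.path_to_tier0()))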
Layout of Resources
• Want a good "impedance match" between Tiers
  - Tier N-1 serves Tier N
  - Tier N is big enough to exert influence on Tier N-1
  - Tier N-1 is small enough not to duplicate Tier N
• Resources roughly balanced across Tiers
  - Reasonable balance?
Data Grid Hierarchy (Schematic)
(Diagram: Tier 0 at CERN at the top, fanning out to Tier 1, then Tier 2 centers, numerous Tier 3 sites, and Tier 4 nodes below them.)
US Model Circa 2005
(Diagram of the US tier model, circa 2005:)
• CERN (CMS/ATLAS): 350k SI95, 350 TB disk, tape robot
• Tier 1 (FNAL/BNL): 70k SI95, 70 TB disk, tape robot
• N Tier 2 centers: 20k SI95, 25 TB disk, tape robot each
• Tier 3: university working groups WG 1, WG 2, ..., WG M
• Wide-area links of 622 Mbits/s to 2.4 Gbps connect the tiers
(A rough transfer-time estimate for these link speeds follows below.)
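To get a feel for what these capacities and link speeds imply, here is a back-of-the-envelope estimate (assuming, unrealistically, that a 622 Mbits/s path can be driven continuously at full rate with no protocol overhead) of the time needed to replicate a 25 TB Tier 2 disk cache across the wide area:

    # Back-of-the-envelope only; assumes 100% sustained link utilization.
    cache_bytes = 25e12          # 25 TB Tier 2 disk cache
    link_bps    = 622e6          # 622 Mbits/s wide-area path
    seconds = cache_bytes * 8 / link_bps
    print(f"~{seconds / 86400:.1f} days")   # roughly 3.7 days

In other words, bulk replication of even a Tier 2 cache is a multi-day operation at these speeds, which is one reason the hierarchy keeps data close to where it is used.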
Data Grid Hierarchy (CMS)
(Diagram of the CMS data grid hierarchy:)
• Online system: one bunch crossing every 25 ns; 100 triggers per second; each event is ~1 MByte in size
• Tier 0: CERN Computer Center, offline farm of ~20 TIPS
• Tier 1: Fermilab (~4 TIPS) and the France, Italy, and Germany regional centers
• Tier 2: Tier 2 centers of ~1 TIPS each
• Tier 3: institute servers of ~0.25 TIPS, each holding a physics data cache
• Tier 4: workstations
• Link bandwidths noted on the diagram: ~PBytes/sec off the detector, ~100 MBytes/sec into the offline farm, ~2.4 Gbits/sec to Tier 1, ~622 Mbits/sec to Tier 2, and 1-10 Gbits/sec at the lower tiers
• Physicists work on analysis "channels"; each institute has ~10 physicists working on one or more channels, and the data for these channels is cached by the institute server
• 1 TIPS = 25,000 SpecInt95; a PC today is 10-20 SpecInt95
(The short calculation below checks the implied data rate.)
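The trigger rate and event size above fix the raw data rate, and with the conventional assumption of roughly 10^7 live seconds per year (an assumption, not stated on the slide) they reproduce the "petabytes per year" scale quoted earlier:

    # Quick consistency check of the quoted CMS numbers.
    trigger_rate_hz = 100      # events recorded per second
    event_size_mb   = 1.0      # ~1 MByte per event
    live_seconds    = 1e7      # assumed ~10^7 live seconds per year

    rate_mb_s = trigger_rate_hz * event_size_mb
    yearly_pb = rate_mb_s * live_seconds / 1e9        # MB -> PB
    print(f"~{rate_mb_s:.0f} MB/s into Tier 0, ~{yearly_pb:.0f} PB of raw data per year")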
Why a Data Grid: Physical
• Unified system: all computing resources are part of the grid
  - Efficient resource use (manage scarcity)
  - Averages out spikes in usage
  - Resource discovery / scheduling / coordination become truly possible (a toy illustration follows below)
  - "The whole is greater than the sum of its parts"
• Optimal data distribution and proximity
  - Labs are close to the data they need
  - Users are close to the data they need
  - No data or network bottlenecks
• Scalable growth
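As a toy illustration of "users are close to the data they need" (this is not the Globus or Condor API; the catalogue, dataset name, and sites are invented for the example), a grid scheduler can consult a replica catalogue and direct work to the copy nearest the requesting tier:

    # Toy replica selection: prefer the registered copy whose tier is
    # closest to the user's tier. Catalogue contents are invented.
    replica_catalogue = {
        "example_dataset": [
            {"site": "CERN (Tier 0)", "tier": 0},
            {"site": "FNAL (Tier 1)", "tier": 1},
            {"site": "Regional Tier 2", "tier": 2},
        ],
    }

    def nearest_replica(dataset, user_tier=3):
        """Pick the replica whose tier number is closest to the user's tier."""
        replicas = replica_catalogue.get(dataset, [])
        if not replicas:
            raise LookupError(f"no replica of {dataset} registered")
        return min(replicas, key=lambda r: abs(user_tier - r["tier"]))

    print(nearest_replica("example_dataset")["site"])   # -> "Regional Tier 2"

A real grid scheduler would also fold in load, network cost, and policy, but the point is the same: with a unified catalogue, discovery and placement decisions can be made globally rather than by hand.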
Why a Data Grid: Political
• A central lab cannot manage / help 1000s of users
  - Easier to leverage resources, maintain control, and assert priorities regionally
• Cleanly separates functionality
  - Different resource types in different Tiers
  - Funding complementarity (NSF vs DOE)
  - Targeted initiatives
• New IT resources can be added "naturally"
  - Additional matching resources at Tier 2 universities
  - Larger institutes can join, bringing their own resources
  - Tap into new resources opened up by the IT "revolution"
• Broaden the community of scientists and students
  - Training and education
  - Vitality of the field depends on the University / Lab partnership
Tier 2 Regional Centers
• Possible model: CERN : National : Tier 2 = 1/3 : 1/3 : 1/3
• Complementary role to Tier 1 lab-based centers
  - Less need for 24×7 operation → lower component costs
  - Less production-oriented → can respond to analysis priorities
  - Flexible organization, e.g., by physics goals or subdetectors
  - Variable fraction of resources available to outside users
• Range of activities includes
  - Reconstruction, simulation, physics analyses
  - Data caches / mirrors to support analyses
  - Production in support of the parent Tier 1
  - Grid R&D ...
(Diagram: the Tier 0 - Tier 4 spectrum runs from more organization at the top to more flexibility at the bottom.)
Distribution of Tier 2 Centers
• Tier 2 centers arranged regionally in the US model
  - Good networking connections to move data (caches)
  - Location independence of users is always maintained
  - Increases collaborative possibilities
  - Emphasis on training and involvement of students
  - High-quality desktop environment for remote collaboration, e.g., a next-generation VRVS system
Strawman Tier 2 Architecture
• Linux farm of 128 nodes: $0.30M
• Sun data server with RAID array: $0.10M
• Tape library: $0.04M
• LAN switch: $0.06M
• Collaborative infrastructure: $0.05M
• Installation and infrastructure: $0.05M
• Net connect to Abilene network: $0.14M
• Tape media and consumables: $0.04M
• Staff (ops and system support)*: $0.20M
• Total estimated cost (first year): $0.98M
• Cost in succeeding years, for evolution, upgrade, and ops: $0.68M
* 1.5-2 FTE of support is required per Tier 2; physicists from the institute also aid in support.
(A quick check that the line items sum to the quoted total appears below.)
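As a small sanity check (a sketch only; the labels simply restate the line items above), the quoted first-year total can be verified directly:

    # Check that the strawman line items add up to the quoted $0.98M.
    items_musd = {
        "Linux farm (128 nodes)":          0.30,
        "Sun data server + RAID array":    0.10,
        "Tape library":                    0.04,
        "LAN switch":                      0.06,
        "Collaborative infrastructure":    0.05,
        "Installation and infrastructure": 0.05,
        "Net connect to Abilene":          0.14,
        "Tape media and consumables":      0.04,
        "Staff (ops and system support)":  0.20,
    }
    print(f"First-year total: ${sum(items_musd.values()):.2f}M")   # $0.98M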
Strawman Tier 2 Evolution (2000 → 2005)
• Linux farm: 1,500 SI95 → 20,000 SI95*
• Disks on CPUs: 4 TB → 20 TB
• RAID array: 1 TB → 20 TB
• Tape library: 1 TB → 50-100 TB
• LAN speed: 0.1-1 Gbps → 10-100 Gbps
• WAN speed: 155-622 Mbps → 2.5-10 Gbps
• Collaborative infrastructure: MPEG2 VGA (1.5-3 Mbps) → realtime HDTV (10-20 Mbps)
RAID disk is used for "higher availability" data.
* Reflects lower Tier 2 component costs due to less demanding usage, e.g. simulation.
The GriPhyN Project
• Joint project involving US-CMS, US-ATLAS, LIGO (gravity-wave experiment), and SDSS (Sloan Digital Sky Survey)
  - http://www.phys.ufl.edu/~avery/mre/
• Requesting funds from NSF to build the world's first production-scale grid(s)
  - Sub-implementations for each experiment
  - NSF pays for Tier 2 centers, some R&D, some networking
• Realization of a unified Grid system requires research
  - Many common problems across the different implementations
  - Requires partnership with CS professionals
R&D Foundations I
• Globus (Grid middleware)
  - Grid-wide services
  - Security
• Condor (see M. Livny's paper)
  - General language for service seekers / service providers
  - Resource discovery
  - Resource scheduling, coordination, (co)allocation
• GIOD (networked object databases)
• Nile (fault-tolerant distributed computing)
  - Java-based toolkit, running on CLEO
R&D Foundations II
• MONARC
  - Construct and validate architectures
  - Identify important design parameters
  - Simulate an extremely complex, dynamic system
• PPDG (Particle Physics Data Grid)
  - DOE / NGI funded for 1 year
  - Testbed systems
  - Later program of work to be incorporated into GriPhyN
The NSF ITR Initiative
• Information Technology Research program
  - Aimed at funding innovative research in IT
  - $90M in funds authorized
  - Maximum of $12.5M for a single proposal (5 years)
  - Requires extensive student support
• GriPhyN submitted a preproposal Dec. 30, 1999
  - Intend that ITR fund most of our Grid research program
  - Major costs are for people, especially students / postdocs
  - Minimal equipment
  - Some networking
• Full proposal due April 17, 2000
Summary of Data Grids and the LHC
• Develop an integrated distributed system while meeting LHC goals
  - ATLAS/CMS: production and data-handling oriented
  - (LIGO/SDSS: computation and "commodity component" oriented)
• Build and test the regional center hierarchy
  - Tier 2 / Tier 1 partnership
  - Commission and test software, data handling systems, and data analysis strategies
• Build and test the enabling collaborative infrastructure
  - Focal points for student-faculty interaction in each region
  - Realtime high-resolution video as part of the collaborative environment
• Involve students at universities in building the data analysis and in the physics discoveries at the LHC