Slide 1: Building a Regional Centre – a few ideas & a personal view
CHEP 2000 – Padova, 10 February 2000
Les Robertson, CERN/IT (les.robertson@cern.ch)

Slide 2: Summary
- LHC regional computing centre topology
- Some capacity and performance parameters
- From components to computing fabrics
- Remarks about regional centres
- Policies & sociology
- Conclusions

Slide 3: Why Regional Centres?
- Bring computing facilities closer to home
  - final analysis on a compact cluster in the physics department
- Exploit established computing expertise & infrastructure
- Reduce dependence on links to CERN
  - full ESD available nearby, through a fat, fast, reliable network link
- Tap funding sources not otherwise available to HEP
- Devolve control over resource allocation
  - national interests? regional interests? at the expense of physics interests?

Slide 4: The MONARC RC Topology
MONARC report: http://home.cern.ch/~barone/monarc/RCArchitecture.html
[Topology diagram: CERN (Tier 0) at the top, Tier 1 centres (FNAL, RAL, IN2P3), Tier 2 centres (Lab a, Uni b, Lab c, ... Uni n), then departments and desktops, with link bandwidths of 2.5 Gbps, 622 Mbps and 155 Mbps shown]

Tier 0 – CERN
- Data recording, reconstruction, 20% analysis
- Full data sets on permanent mass storage – raw, ESD, simulated data
- Hefty WAN capability
- Range of export/import media
- 24 x 7 availability

Tier 1 – established data centre or new facility hosted by a lab
- Major subset of data – all/most of the ESD, selected raw data
- Mass storage, managed data operation
- ESD analysis, AOD generation, major analysis capacity
- Fat pipe to CERN
- High availability
- User consultancy – library & collaboration software support

Tier 2 – smaller labs, smaller countries, probably hosted by an existing data centre
- Mainly AOD analysis
- Data cached from Tier 1 / Tier 0 centres
- No mass storage management
- Minimal staffing costs

University physics department
- Final analysis
- Dedicated to local users
- Limited data capacity – cached only via the network
- Zero administration costs (fully automated)
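
As an illustration only (not on the slide), the hierarchy can be written down as a small data structure; the bandwidth assignments to particular levels are assumptions read off the diagram figures, and the roles are as quoted above.

```python
# Illustrative sketch: the MONARC tier hierarchy as a plain data structure.
# The per-level bandwidth assignments are assumptions based on the
# 2.5 Gbps / 622 Mbps / 155 Mbps figures shown in the diagram.
MONARC_TIERS = {
    "Tier 0 (CERN)": {
        "roles": ["data recording", "reconstruction", "20% analysis",
                  "full raw/ESD/simulated data on mass storage"],
        "downlink_mbps": 2500,   # assumed: fat pipe towards the Tier 1 centres
    },
    "Tier 1 (FNAL, RAL, IN2P3, ...)": {
        "roles": ["all/most ESD, selected raw data", "mass storage",
                  "ESD analysis, AOD generation", "user consultancy"],
        "downlink_mbps": 622,    # assumed: links towards the Tier 2 centres
    },
    "Tier 2 (smaller labs / countries)": {
        "roles": ["mainly AOD analysis", "data cached from Tier 1/0",
                  "no mass storage management"],
        "downlink_mbps": 155,    # assumed: links towards departments
    },
    "University physics department": {
        "roles": ["final analysis", "local users only",
                  "data cached via the network"],
        "downlink_mbps": None,
    },
}

for tier, info in MONARC_TIERS.items():
    print(f"{tier}: {'; '.join(info['roles'])}")
```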

Slide 5: The MONARC RC Topology
MONARC report: http://home.cern.ch/~barone/monarc/RCArchitecture.html
[Same topology diagram as slide 4: CERN (Tier 0), Tier 1 centres (FNAL, RAL, IN2P3), Tier 2 centres, departments and desktops, linked at 2.5 Gbps, 622 Mbps and 155 Mbps]

Slide 6: More realistically – a Grid Topology
[The same tiered diagram as slide 5, with a "DHL" label added alongside the network links – bulk data can also move by courier]

Slide 7: Capacity / Performance
Based on CMS/MONARC estimates (early 1999), rounded, extended and adapted by LMR.

Capacity in 2006                  CERN (CMS or ATLAS)  Annual increase  Tier 1, 1 expt.  Tier 1, 2 expts.
CPU (K SPECint95) **              600                  200              120              240
Disk (TB)                         550                  200              110              220
Tape (PB, incl. copies at CERN)   3.4                  2                0.4              <1
I/O rates – disk (GB/sec)         50                                    10               20
I/O rates – tape (MB/sec)         400                                   50               100
WAN bandwidth (Gbps)              2.5 (annual increase 20%)

All CERN today: ~15K SI95, ~25 TB, ~100 MB/sec
** 1 SPECint95 = 10 CERN units = 40 MIPS
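
To make the units and the implied growth concrete, here is a small worked check (a sketch, not from the slide itself); it uses only the footnote's conversion factors and the "CERN today" figure quoted above.

```python
# Sketch: unit conversions and growth factor implied by the table above.
SI95_TO_CERN_UNITS = 10    # footnote: 1 SPECint95 = 10 CERN units
SI95_TO_MIPS = 40          # footnote: 1 SPECint95 = 40 MIPS

cern_2006_ksi95 = 600      # CERN CPU capacity estimate for 2006 (K SPECint95)
cern_today_ksi95 = 15      # "All CERN today: ~15K SI95"

print(f"CERN 2006: {cern_2006_ksi95}K SI95 "
      f"= {cern_2006_ksi95 * SI95_TO_CERN_UNITS}K CERN units "
      f"= {cern_2006_ksi95 * SI95_TO_MIPS}K MIPS")
print(f"Growth over today: ~{cern_2006_ksi95 / cern_today_ksi95:.0f}x")
```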

Slide 8: Capacity / Performance – Tier 1, 2 expts. (capacity in 2006)
Based on CMS/MONARC estimates (early 1999), rounded, extended and adapted by LMR.

- CPU (K SPECint95): 240 → ~1200 cpus, ~600 boxes (approximately the number of farm PCs at CERN today)
- Disk (TB): 220 → at least 2400 disks at ~100 GB/disk (only!). We may not find disks as small as that, but we need a high disk count for access, performance, RAID/mirroring, etc., so we will probably have to buy more disks, larger disks, and use the disks that come with the PCs → much more disk space
- Tape (PB, incl. copies at CERN): <1
- I/O rates – disk: 20 GB/sec (the effective throughput of the LAN backbone; ~40 MB/sec/cpu, ~20 MB/sec/disk); tape: 100 MB/sec
- WAN bandwidth: 2.5 Gbps → ~300 MB/sec (~1.5% of the LAN)
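
The derived figures in this column follow from straightforward arithmetic. The sketch below reproduces them, assuming the ~200 SI95 per cpu of the 2005 building block (slide 14) and the ~100 GB/disk quoted here.

```python
# Sketch: how the annotated counts follow from the Tier 1 (2 expts.) requirements.
cpu_ksi95 = 240
si95_per_cpu = 200                      # assumed per-cpu rating (slide 14)
cpus = cpu_ksi95 * 1000 / si95_per_cpu
boxes = cpus / 2                        # dual-cpu boxes

disk_tb = 220
gb_per_disk = 100                       # "~100 GB/disk (only!)"
disks = disk_tb * 1000 / gb_per_disk    # the slide rounds up to "at least 2400"

wan_gbps = 2.5
wan_mb_per_sec = wan_gbps * 1000 / 8

print(f"{cpus:.0f} cpus in {boxes:.0f} dual-cpu boxes")   # ~1200 cpus, ~600 boxes
print(f"{disks:.0f} disks of {gb_per_disk} GB")           # ~2200 disks
print(f"WAN: ~{wan_mb_per_sec:.0f} MB/sec")               # ~300 MB/sec
```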

Slide 9: Building a Regional Centre
Commodity components are just fine for HEP:
- Masses of experience with inexpensive farms
- LAN technology is going the right way
  - inexpensive, high-performance PC attachments
  - compatible with hefty backbone switches
- Good ideas for improving automated operation and management

Slide 10: Evolution of today's analysis farms
Computing & storage fabric built up from commodity components:
- simple PCs
- inexpensive network-attached disk
- standard network interface (whatever Ethernet happens to be in 2006)
with a minimum of high(er)-end components:
- LAN backbone
- WAN connection

Slide 11: Standard components
(Same commodity-fabric list as slide 10.)

Slide 12: HEP's not special, just more cost conscious
(Same commodity-fabric list as slide 10.)

Slide 13: Limit the role of high-end equipment
(Same commodity-fabric list as slide 10, the high(er)-end components being the LAN backbone and the WAN connection.)

Slide 14: Components → building blocks
2000 – standard office equipment (one rack):
- 36 dual-cpu boxes, ~900 SI95
- 120 x 72 GB disks, ~9 TB
2005 – standard, cost-optimised, Internet-warehouse equipment (one rack):
- 36 dual 200-SI95 cpus = 14K SI95, ~$100K
- 224 x 3.5" disks, 25-100 TB, $50K-$200K
For capacity & cost estimates see the 1999 Pasta Report:
http://nicewww.cern.ch/~les/pasta/welcome.html
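
The per-rack numbers can be checked with simple arithmetic; in the sketch below everything is quoted from the slide except the implied per-disk sizes, which are derived from the 25-100 TB range.

```python
# Sketch: per-rack capacity implied by the 2005 building-block figures above.
boxes, cpus_per_box, si95_per_cpu = 36, 2, 200
print(f"CPU rack: {boxes * cpus_per_box * si95_per_cpu} SI95")   # 14,400, i.e. ~14K SI95

disks = 224
for total_tb in (25, 100):                                       # quoted range per disk rack
    print(f"{total_tb} TB over {disks} disks -> ~{total_tb * 1000 / disks:.0f} GB/disk")
```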

Slide 15: The Physics Department System
- Two 19" racks & $200K
  - CPU: 14K SI95 (~10% of a Tier 1 centre)
  - Disk: 50 TB (~50% of a Tier 1 centre)
- A rather comfortable analysis machine
- → small Regional Centres are not going to be competitive
- Need to rethink the storage capacity at the Tier 1 centres
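
For scale, the 10% and 50% figures line up with the single-experiment Tier 1 column of slide 7 (120K SI95, 110 TB) rather than the two-experiment one; a quick check, assuming that is the comparison intended:

```python
# Sketch: the department system relative to a 1-experiment Tier 1 (slide 7 figures).
dept_ksi95, dept_tb = 14, 50
tier1_ksi95, tier1_tb = 120, 110
print(f"CPU:  {dept_ksi95 / tier1_ksi95:.0%} of a Tier 1")   # ~12%, quoted as "10%"
print(f"Disk: {dept_tb / tier1_tb:.0%} of a Tier 1")         # ~45%, quoted as "50%"
```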

Slide 16: Tier 1, Tier 2 RCs, CERN
A few general remarks:
- A major motivation for the RCs is that we are hard pressed to finance the scale of computing needed for LHC
- We need to start now to work together towards minimising costs
- Standardisation among experiments, regional centres and CERN, so that we can use the same tools and practices to ...
- Automate everything
  - operation & monitoring
  - disk & data management
  - work scheduling
  - data export/import (prefer the network to mail)
... in order to ...
- Minimise operation and staffing
- Trade off mass storage for disk + network bandwidth
- Acquire contingency capacity rather than fighting bottlenecks
- Outsource what you can (at a sensible price)
- .......
Keep it simple – work together

Slide 17: The middleware
The issues are:
- integration of this amorphous collection of Regional Centres
  - data
  - workload
  - network performance
  - application monitoring
- quality of the data analysis service
Leverage the "Grid" developments:
- extending meta-computing to mass-computing
- emphasis on data management & caching
- ... and production reliability & quality
Keep it simple – work together

Slide 18: A 2-experiment Tier 1 Centre
Requirement: 240K SI95, 220 TB
- Processors: 20 "standard" racks = 1,440 cpus → 280K SI95
- Disks: 12 "standard" racks = 2,688 disks → 300 TB (with low-capacity disks)
- Basic equipment (cpus/disks): ~$3M
[Floor-plan sketch: cpu/disk racks, tape/DVD area and network equipment in roughly 200 m2]
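
Putting slide 14's building blocks against the requirement reproduces the rack counts quoted here; in the sketch below, the ~110 GB per-disk figure is an assumption consistent with "low-capacity disks" and the 25 TB low end of the disk rack.

```python
# Sketch: what the quoted rack counts deliver, using the slide-14 per-rack figures.
cpu_racks, disk_racks = 20, 12
cpus_per_rack, si95_per_cpu = 72, 200        # 36 dual-cpu boxes per rack
disks_per_rack, gb_per_disk = 224, 110       # low-capacity disks assumed (~110 GB)

cpus = cpu_racks * cpus_per_rack
disks = disk_racks * disks_per_rack
print(f"{cpu_racks} cpu racks: {cpus} cpus, ~{cpus * si95_per_cpu / 1000:.0f}K SI95 "
      "(requirement: 240K)")                 # ~288K, quoted as 280K
print(f"{disk_racks} disk racks: {disks} disks, ~{disks * gb_per_disk / 1000:.0f} TB "
      "(requirement: 220 TB)")               # ~296 TB, quoted as 300 TB
```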

Slide 19: The full costs?
- Space
- Power, cooling
- Software
- LAN
- Replacement/expansion – 30% per year
- Mass storage
- People
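
One way to read the 30% per year replacement/expansion figure: over a five-year period it roughly adds another 1.5 times the initial hardware spend. A toy calculation, using slide 18's ~$3M basic-equipment figure; the five-year horizon is an assumption, and the other items on the list are left unquantified.

```python
# Toy sketch: effect of 30%/year replacement/expansion on the hardware budget.
initial_equipment_musd = 3.0    # slide 18: basic cpus/disks ~ $3M
replacement_rate = 0.30         # per year, from this slide
years = 5                       # assumed planning horizon

replacement = initial_equipment_musd * replacement_rate * years
print(f"Initial equipment:         ${initial_equipment_musd:.1f}M")
print(f"Replacement over {years} years: ${replacement:.1f}M")
print(f"Hardware total:            ${initial_equipment_musd + replacement:.1f}M"
      "  (space, power, software, LAN, mass storage and people are extra)")
```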

Slide 20: Mass storage?
Do all Tier 1 centres really need a full mass storage operation?
- Tapes, robots, storage management software?
- Need support for export/import media
- But think hard before getting into mass storage
- Rather:
  - more disks, bigger disks, mirrored disks
  - cache data across the network from another centre (that is willing to tolerate the stresses of mass storage management)
Mass storage is person-power intensive → long-term costs
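
A minimal sketch (not from the talk) of the "cache across the network instead of running mass storage" idea; the paths, host name and the use of scp as the transfer tool are all hypothetical placeholders.

```python
import os
import subprocess

LOCAL_CACHE = "/data/cache"                  # hypothetical local disk pool
REMOTE_CENTRE = "tier1.example.org:/data"    # hypothetical centre that runs mass storage

def get_file(name: str) -> str:
    """Return a local path for `name`, fetching it from the remote centre on a miss."""
    os.makedirs(LOCAL_CACHE, exist_ok=True)
    local_path = os.path.join(LOCAL_CACHE, name)
    if os.path.exists(local_path):           # cache hit: serve from local (mirrored) disk
        return local_path
    # Cache miss: pull the file over the WAN from the centre that manages the tape store.
    subprocess.run(["scp", f"{REMOTE_CENTRE}/{name}", local_path], check=True)
    return local_path
```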

Slide 21: Consider outsourcing
- Massive growth in co-location centres, ISP warehouses, ASPs, storage renters, etc.
  - Level 3, Intel, Hot Office, Network Storage Inc, PSI, ...
  - There will probably be one near you
- Check it out – compare costs & prices
- Maybe personnel savings can be made

Slide 22: Policies & sociology
- Access policy?
  - Collaboration-wide, or restricted access (regional, national, ...)?
  - A rich source of unnecessary complexity
- Data distribution policies
- Analysis models
  - MONARC work will help to plan the centres
  - But the real analysis models will evolve when the data arrives
Keep everything flexible – simple architecture, simple policies, minimal politics

Slide 23: Concluding remarks I
- Lots of experience with farms of inexpensive components
- We need to scale them up – lots of work, but we think we understand it
- But we have to learn how to integrate distributed farms into a coherent analysis facility
  - leverage other developments
  - but we need to learn through practice and experience
  - retain a healthy scepticism for scalability theories
  - check it all out on a realistically sized testbed

Slide 24: Concluding remarks II
- Don't get hung up on optimising component costs
- Do be very careful with head-count – personnel costs will probably dominate
- Define clear objectives for the centre: efficiency, capacity, quality
- Think hard about whether you really need mass storage
- Discourage empires & egos
- Encourage collaboration & outsourcing
- In fact, maybe we can just buy all this as an Internet service

