Computer Hardware and Procurement at CERN Helge Meinhard (at) cern ch HEPiX fall 2005 @ SLAC
Outline
- Procedures
- Hardware (being) procured
- Power measurements
- Observations
Procedures
Constraints (1)
- CERN is an international organisation with strict administrative rules
- Competitive tendering required, covering (at least) member states
  - No way to avoid for commodity equipment
- Lowest compliant bid wins
  - No negotiations about added value of higher offers
Constraints (2)
Different procedures depending on expected volume
- < 10’000 CHF: IT seeks 3 offers
- < 200’000 CHF: Formal price enquiry by purchasing service. Four weeks response time
- < 750’000 CHF: Formal call for tender preceded by market survey. Six weeks response time
- > 750’000 CHF: As < 750’000 CHF, plus approval by CERN’s Finance Committee (5 sessions/year, papers ready two months in advance)
(1 CHF = 0.78 USD = 0.65 EUR)
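The volume thresholds above can be read as a simple lookup. The following is only a sketch: the function name and the returned strings are made up for illustration, and the slide does not say how a volume of exactly 750’000 CHF is treated, so the code assumes it falls into the top tier.

```python
# Exchange rates as quoted on the slide (1 CHF = 0.78 USD = 0.65 EUR).
CHF_TO_USD = 0.78
CHF_TO_EUR = 0.65


def procurement_procedure(volume_chf: float) -> str:
    """Return the procedure tier for an expected purchase volume in CHF."""
    if volume_chf < 0:
        raise ValueError("volume must be non-negative")
    if volume_chf < 10_000:
        return "IT seeks 3 offers"
    if volume_chf < 200_000:
        return "formal price enquiry (4 weeks response time)"
    if volume_chf < 750_000:
        return "call for tender after market survey (6 weeks response time)"
    # Assumption: exactly 750'000 CHF is handled like the top tier.
    return "call for tender after market survey, plus Finance Committee approval"


def chf_to_usd(amount_chf: float) -> float:
    """Convert CHF to USD with the rate quoted on the slide."""
    return amount_chf * CHF_TO_USD
```

For example, `procurement_procedure(150_000)` falls into the formal price enquiry tier.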
Our problems
- Procedures badly adapted to quickly evolving computing market
- Difficult to give preference to “good”, reliable equipment
Our choices (1)
- For significant purchases (> 100 kCHF) we require (a) sample system(s)
  - with the tender for big tenders
  - on CERN’s request for small tenders
- Tenders include 3 years on-site warranty for hardware
  - Typical requirements:
    - 4 working hours response / 12 working hours repair for critical machines
    - 3 working days response / 5 working days repair for farm nodes
  - Supplier can subcontract on-site warranty
Our choices (2)
- Payment within 30 days after provisional acceptance, on receipt of a bank guarantee of 5% of purchase sum, valid until end of warranty period
- Delivery within 6 weeks; penalty for late delivery: 2% of purchase sum per complete week, max. 10%
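As a worked example of the penalty and guarantee terms, here is a small sketch. The function names are hypothetical, and counting lateness in calendar days is an assumption; the slide only states 2% per complete week with a 10% cap, and a 5% bank guarantee.

```python
def late_delivery_penalty(purchase_sum_chf: float, days_late: int) -> float:
    """Penalty of 2% of the purchase sum per complete week late, capped at 10%."""
    complete_weeks = max(days_late, 0) // 7   # partial weeks do not count
    percent = min(2 * complete_weeks, 10)     # cap at 10%
    return purchase_sum_chf * percent / 100


def bank_guarantee(purchase_sum_chf: float) -> float:
    """Bank guarantee of 5% of the purchase sum."""
    return purchase_sum_chf * 5 / 100
```

With a purchase sum of 500’000 CHF, a delivery three complete weeks late costs the supplier 30’000 CHF, and the cap of 50’000 CHF is reached after five complete weeks.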
Our choices (3)
- If more than 10% of systems fail during acceptance or during the first month after: right to return the whole batch
- If a system fails 3 or more times during any 6 months’ period: right to request complete replacement of the system
- If more than 20% of any component fail during any 6 months’ period: right to request complete replacement of this component across the batch
- If CERN adds third-party devices: no impact on warranty obligations for the system as delivered
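The three failure thresholds translate into simple checks. This is an illustrative sketch with hypothetical function names; how failures are counted within a 6-month window is left to the caller.

```python
def may_return_batch(batch_size: int, failed_systems: int) -> bool:
    """More than 10% of systems failed during acceptance or the first month after."""
    return failed_systems * 10 > batch_size


def may_replace_system(failures_in_6_months: int) -> bool:
    """A single system failed 3 or more times within a 6-month period."""
    return failures_in_6_months >= 3


def may_replace_component(installed: int, failed_in_6_months: int) -> bool:
    """More than 20% of one component type failed within a 6-month period."""
    return failed_in_6_months * 5 > installed
```

Integer comparisons are used so that a count of exactly 10% or 20% does not trigger the clause, matching the "more than" wording.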
Our choices (4)
- If justified by volume, procure from two suppliers (lowest and second-lowest compliant)
  - Better protection if one delivers crap or nothing at all
  - Better chance for companies to win an order
  - Increased workload on our part
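The award rule (lowest compliant bid wins, optionally split with the second-lowest compliant bid) can be sketched as follows. The `Bid` class, its fields and `select_winners` are hypothetical names for illustration, not part of any CERN tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Bid:
    supplier: str
    price_chf: float
    compliant: bool  # meets every point of the technical specification


def select_winners(bids: List[Bid], two_suppliers: bool = False) -> List[Bid]:
    """Return the lowest compliant bid, plus the second-lowest if the order is split."""
    ranked = sorted((b for b in bids if b.compliant), key=lambda b: b.price_chf)
    return ranked[: 2 if two_suppliers else 1]


bids = [
    Bid("A", 480_000, compliant=True),
    Bid("B", 450_000, compliant=False),  # cheapest, but non-compliant
    Bid("C", 470_000, compliant=True),
    Bid("D", 520_000, compliant=True),
]
winners = [b.supplier for b in select_winners(bids, two_suppliers=True)]
```

Note that price is the only ranking criterion among compliant bids, which reflects the "no negotiations about added value" constraint.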
Example of a procurement
- Procurement of equipment worth < 750 kCHF
  - Approval by Finance Committee not needed
- Market survey already done
  - Market survey can cover different types of equipment
  - Valid for 1 year
  - If not done yet, add ~ 16 weeks
Steps (1)
Step                                   Typical case
Fix scope                              2 w
Write technical, commercial docs       3 w
IT-internal review
Revise technical, commercial docs      2 w
Specification meeting
Revise technical, commercial docs      1 w
Tender out
Deadline for replies                   6 w
Opening of replies                     1 w
(Total so far: 15 weeks, at best compressible to 12 weeks)
Steps (2)
(Total from previous slide: 15 w, min. 12 w)
Step                                   Typical case
Technical analysis of replies          1 w
Visual inspection, mounting            1 w
Benchmarks, reports                    3 w
Technical clarifications               1 w
Purchase request, order                2 w
Delivery                               7 w
Preliminary acceptance                 6 w
Total: 36 weeks, compressible to 30 weeks
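The totals on the two "Steps" slides can be checked by adding up the typical durations. A minimal sketch; the variable and function names are illustrative, and the milestones without a stated duration (IT-internal review, specification meeting, tender out) are folded into the step that follows them.

```python
# (step, typical duration in weeks) as listed on the slides.
STEPS_UNTIL_OPENING = [
    ("Fix scope", 2),
    ("Write technical, commercial docs", 3),
    ("Revise docs after IT-internal review", 2),
    ("Revise docs after specification meeting", 1),
    ("Deadline for replies", 6),
    ("Opening of replies", 1),
]
STEPS_AFTER_OPENING = [
    ("Technical analysis of replies", 1),
    ("Visual inspection, mounting", 1),
    ("Benchmarks, reports", 3),
    ("Technical clarifications", 1),
    ("Purchase request, order", 2),
    ("Delivery", 7),
    ("Preliminary acceptance", 6),
]
MARKET_SURVEY_WEEKS = 16  # "~ 16 weeks" if the market survey is not done yet


def total_weeks(steps) -> int:
    """Sum the durations of a list of (name, weeks) pairs."""
    return sum(weeks for _, weeks in steps)


def typical_lead_time(market_survey_done: bool = True) -> int:
    """Typical weeks from fixing the scope to preliminary acceptance."""
    weeks = total_weeks(STEPS_UNTIL_OPENING) + total_weeks(STEPS_AFTER_OPENING)
    if not market_survey_done:
        weeks += MARKET_SURVEY_WEEKS
    return weeks
```

The sums reproduce the 15 and 36 weeks quoted on the slides; without a valid market survey the typical case grows to roughly 52 weeks.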
Hardware (being) procured
Objectives
- Cover existing needs with as few different models and as few procurement procedures as possible
- Closely follow technology and market evolution and satisfy requirements with modern hardware at low cost
(These two objectives contradict each other.)
Fabric Infrastructure and Operations (1)
- RedHat 7.3 phased out on public services
  - Campaign on storage nodes far advanced
- New in machine room since Karlsruhe:
  - 200 farm PCs (dual Nocona): in production
  - 116 disk servers (> 5 TB usable each, total of 900 TB gross capacity): part in production, part under acceptance test
  - 112 “midrange servers”: under acceptance test
  - 32-node Infiniband-based cluster for Theory
- Refurbishment of machine room proceeding
  - LHS being populated, but power remains limited
  - Talk
(From CERN site report 2005/10/11)
Hardware being procured (1)
- Large volumes: several times < 750 kCHF per year
  - “Farm PCs”: non-redundant, cheap dual-processor work horses
  - “Disk servers”: storage-in-a-box systems with many SATA disks for streaming applications
Hardware being procured (2)
- Medium-size volumes: once < 750 kCHF per year, or once or several times < 200 kCHF per year
  - “Midrange servers”: redundant building blocks for specific applications
  - “Tape servers”: midrange servers with an FC interface
  - “Disk arrays”: autonomous RAID units with FC uplinks
  - SAN infrastructure (most notably FC switches)
  - Head nodes for serial console infrastructure
  - “Small disk servers”, somewhere between disk servers and midrange servers
  - Miscellaneous
Specifications: Farm PCs (1)
- 2 boxed Intel Noconas of 2.8 GHz
- Mainboard:
  - BMC (IPMI 1.5 or higher)
  - PXE, USB boot
  - BBS menu
  - Console redirection
  - Configurable to stay off on AC power loss
- 2 GB ECC memory
  - From mainboard manuf. approved list
  - Upgradable to 4 GB without removing modules
Specifications: Farm PCs (2)
- 1 disk > 140 GB, IDE not permitted
  - Certified for 24/7, 3 y warranty by disk manuf.
- 1 GigE providing PXE and IPMI access
- 19” chassis max. 4 U, with rails
  - Power, reset button
  - Power, disk activity LED
- Power supply supporting machine + 50 W
  - Active PFC
  - C13 to C14 LSZH power cord
- Guaranteed to run under RHEL 3 (i386 and x86_64)
- Delivery within 6 weeks from dispatch of order
Specifications: Disk server (1)
- 1 or 2 boxed Intel Xeon with EM64T
- Mainboard as for Farm PCs
  - Now adding support for memory mirroring
- Memory as for Farm PCs
- General requirements for disks etc.
  - ≥ 7200 rpm, no EIDE, 3 y warranty, certified for 24/7 by manufacturer
  - Metallic hot-swap trays certified by chassis manuf.
  - Indicators for power and activity for each tray
  - PCB backplanes for disks, multilane cabling
  - “Intelligent” RAID controllers
Specifications: Disk server (2)
- System disks: 2 x ≥ 140 GB mirrored
- Data disks: all identical
  - Redundant RAIDs with hot spares (min. 1/15)
  - Total usable capacity per system above 5 TB
  - Battery buffer if controller with active cache
- 1 GigE providing required performance, PXE, IPMI access
- 19” chassis rack-mountable with rails
  - Min. 40 TB usable in 42 U high rack
- Power supply: N+1 redundant, active PFC
- Guaranteed to run under RHEL 3 (i386 and x86_64)
- Delivery within 6 weeks from dispatch of order
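The rack-density requirement (min. 40 TB usable in a 42 U rack) constrains the chassis height for a given usable capacity per server. A rough sketch, assuming the whole rack is filled with identical disk servers and nothing else (no switches or gaps); the function names are hypothetical.

```python
RACK_HEIGHT_U = 42
MIN_USABLE_TB_PER_RACK = 40


def usable_tb_per_rack(chassis_height_u: int, usable_tb_per_server: float) -> float:
    """Usable capacity of a rack filled only with identical disk servers (assumption)."""
    if chassis_height_u <= 0:
        raise ValueError("chassis height must be positive")
    servers = RACK_HEIGHT_U // chassis_height_u
    return servers * usable_tb_per_server


def meets_rack_density(chassis_height_u: int, usable_tb_per_server: float) -> bool:
    """Check an offer against the 40 TB per 42 U rack requirement."""
    return usable_tb_per_rack(chassis_height_u, usable_tb_per_server) >= MIN_USABLE_TB_PER_RACK
```

Under this assumption, a 4 U server with 5.5 TB usable gives 10 servers and 55 TB per rack, whereas a 6 U server with the same capacity gives only 7 servers and 38.5 TB, which would miss the requirement.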
Specifications: Disk server (3)
- Performance, memory to disk: iozone with 16 GB files and 256 kB record size
  - Single stream: 40 MB/s write, 40 MB/s read
  - Multi-stream (at least 10): 115 MB/s write, 170 MB/s read (*)
- Memory to network: iperf
  - Single stream: 100 MB/s write, 100 MB/s read
  - Two streams: 110 MB/s write, 110 MB/s read
  - Two streams in, two streams out: 145 MB/s
Specifications: Disk server (4)
- Global (disk to network) performance:
  - At least 10 clients transferring 2 GB files via rfio
  - Reading from system: 95 MB/s (*)
  - Writing to system: 90 MB/s (*)
(*): Requirements scale linearly with usable capacity, numbers for 5000 GB usable
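The linear scaling of the starred (*) requirements can be written out as below. This is a sketch only: the dictionary keys and function name are made up, and it assumes the (*) on the multi-stream iozone line applies to both the write and the read figure.

```python
REFERENCE_CAPACITY_GB = 5000  # the slide's numbers are quoted for 5000 GB usable

# Starred requirements in MB/s at the reference capacity.
BASE_REQUIREMENTS_MB_S = {
    "iozone multi-stream write": 115,
    "iozone multi-stream read": 170,
    "rfio read": 95,
    "rfio write": 90,
}


def scaled_requirements(usable_gb: float) -> dict:
    """Scale the starred requirements linearly with usable capacity."""
    factor = usable_gb / REFERENCE_CAPACITY_GB
    return {name: base * factor for name, base in BASE_REQUIREMENTS_MB_S.items()}
```

A system offering 10’000 GB usable would therefore have to sustain twice the reference figures, for instance 190 MB/s when reading via rfio.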
Power measurements
Done by Andras Horvath, CERN
Power measurements
http://ahorvath.home.cern.ch/ahorvath/power
Observations
Observations (1)
- Profile of winning companies
  - Tier-1 suppliers competing with large integrators
  - Small ‘round the corner companies eliminated at Market Survey stage
- Almost always the integrators win
  - Specially tailored solutions responding to our specifications
  - Prices of Tier-1s rather high in Europe
Observations (2)
- Stress test as (important) part of the acceptance test
  - Introduced ~ 2 years ago (triggered by presentations from SLAC and FNAL at HEPiX)
  - Very useful
- Based on va-ctcs
  - No longer sufficiently actively maintained
  - Large number of false positives
  - Looking for a replacement
Observations (3)
- Pushing these procedures through requires dedicated (and knowledgeable) person power
- Not obvious to run multiple procedures in parallel
  - In particular, if things go wrong, e.g. stress test fails
Summary
- Computer hardware procurement is an excellent experimental confirmation of two fundamental laws of human nature
  - Murphy: “Everything that can go wrong will go wrong.”
  - Hofstadter: “Things always take longer than you think, even if you take into account Hofstadter’s law.”