SJ – Nov CERN’s openlab Project Sverre Jarp, Wolfgang von Rüden IT Division CERN 29 November 2002
Our ties to IA-64 (IPF)
A long history already…
- Nov. 1992: Visit to HP Labs (Bill Worley): “We shall soon launch PA-Wide Word!”
- : CERN becomes one of the few external definition partners for IA-64
  - Now a joint effort between Intel and HP
- : Creation of a vector math library for IA-64
  - Full prototype to demonstrate the precision, versatility, and unbeatable speed of execution (with HP Labs)
- : Port of Linux onto IA-64
  - “Trillian” project: glibc
  - Real applications
  - Demonstrated already at Intel’s “Exchange” exhibition in Oct. 2000
openlab Status
- Industrial Collaboration
  - Enterasys, HP, and Intel are our partners
  - Technology aimed at the LHC era
- Network switch at 10 Gigabits
  - Connect via both 1 Gbit and 10 Gbit
- Rack-mounted HP servers
  - Itanium processors
- Storage subsystem may be coming from a 4th partner
- Cluster evolution:
  - 2002: Cluster of 32 systems (64 processors)
  - 2003: 64 systems (“Madison” processors)
  - 2004: 64 systems (“Montecito” processors)
The compute nodes
HP rx2600
- Rack-mounted (2U) systems
- Two Itanium-2 processors
  - 900 or 1000 MHz
  - Field upgradable to next generation
- 4 GB memory (max 12 GB)
- 3 hot-pluggable SCSI discs (36 or 73 GB)
- On-board 100 and 1000 Mbit Ethernet
- 4 full-size 133 MHz/64-bit PCI-X slots
- Built-in management processor
  - Accessible via serial port or Ethernet interface
openlab SW strategy
- Exploit existing CERN infrastructure, which is based on:
  - RedHat Linux, GNU compilers
  - OpenAFS
  - SUE (Standard Unix Env.) systems maintenance tools
- Native 64-bit port
  - Key LHC applications: CLHEP, GEANT4, ROOT, etc.
  - Important subsystems: Castor, Oracle, MySQL, LSF, etc.
- Intel compiler where it is sensible
  - Performance
- 32-bit emulation mode
  - Wherever it makes sense
  - Low usage, no special performance need
  - Non-strategic areas
openlab - phase 1
- Integrate the openCluster
  - 32 nodes + development nodes
  - Rack-mounted DP Itanium-2 systems
  - RedHat 7.3 (AW2.1 beta) – kernel at
  - OpenAFS 1.2.7, LSF 4
  - GNU, Intel Compilers (+ ORC?)
  - Database software (MySQL, Oracle?)
  - CERN middleware: Castor data mgmt
  - GRID middleware: Globus, Condor, etc.
- CERN Applications
  - Porting, Benchmarking, Performance improvements
  - CLHEP, GEANT4, ROOT, CERNLIB
- Cluster benchmarks
  - 1 and 10 Gigabit interfaces
- Estimated time scale: 6 months
- Awaiting recruitment of: 1 system programmer
- Also: Prepare porting strategy for phase 2
openlab - phase 2
- European Data Grid
  - Integrate openCluster alongside EDG testbed
- Porting, Verification
  - Relevant software packages
  - Large number of RPMs
  - Document prerequisites
  - Understand dependency chain
  - Decide when to use 32-bit emulation mode
- Interoperability with WP6
  - Integration into existing authentication scheme
- Interoperability with other partners
- GRID benchmarks (as available)
- Estimated time scale: 9 months (may be subject to change!)
- Awaiting recruitment of: 1 GRID programmer
- Also: Prepare porting strategy for phase 3
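The “understand dependency chain” step can be pictured as ordering packages so that each one is ported only after everything it needs. The sketch below uses Python’s standard `graphlib` module (Python 3.9+); the dependency map and the helper name `porting_order` are hypothetical illustrations, not the actual EDG package list or tooling.

```python
from graphlib import TopologicalSorter

# Hypothetical example data: package -> set of packages it depends on.
EXAMPLE_DEPS = {
    "app": {"libphysics", "libio"},
    "libphysics": {"libmath"},
    "libio": set(),
    "libmath": set(),
}


def porting_order(deps):
    """Return the packages ordered so that dependencies come first."""
    return list(TopologicalSorter(deps).static_order())


if __name__ == "__main__":
    print(porting_order(EXAMPLE_DEPS))
```

A circular dependency makes `static_order()` raise `graphlib.CycleError`, which is one way such a problem in the chain would surface early.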
openlab - phase 3
- LHC Computing Grid
  - Need to understand software architectural choices
    - To be made between now and mid-2003
  - Need new integration process of selected software
  - Time scales
- Disadvantage: Possible porting of new packages
- Advantage: Aligned with key choices for LHC deployment
- Impossible at this stage to give firm estimates for timescale and required manpower
openlab time line
Time axis: End-02, End-03, End-04, End-05
Tracks: openCluster, EDG, LCG
Milestones:
- Order/Install 32 nodes
- Systems experts in place – Start phase 1
- Complete phase 1
- Start phase 2
- Order/Install Madison upgrades + 32 more nodes
- Complete phase 2
- Order/Install Montecito upgrades
- Start phase 3
IA-64 wish list
For IA-64 (IPF) to establish itself solidly in the market-place:
- Better compiler technology
  - Offering better system performance
- Wider range of systems and processors
  - For instance: Really low-cost entry models, low power systems
- State-of-the-art process technology
- Similar “commoditization” as for IA-32
openlab starts with
- CPU Servers
- Multi-gigabit LAN
… and will be extended …
- CPU Servers
- Multi-gigabit LAN
- Gigabit long-haul link
- WAN
- Remote Fabric
… step by step
- CPU Servers
- Multi-gigabit LAN
- Storage system
- Gigabit long-haul link
- WAN
- Remote Fabric
Annexes
- The potential of openlab
- The openlab “advantage”
- The LHC
- Expected LHC needs
- The LHC Computing Grid Project – LCG
The openlab “advantage”
openlab will be able to build on the following strong points:
1) CERN/IT’s technical talent
2) CERN’s existing computing environment
3) The size and complexity of the LHC computing needs
4) CERN’s strong role in the development of GRID “middleware”
5) CERN’s ability to embrace emerging technologies
The potential of openlab
- Leverage CERN’s strengths
  - Integrates perfectly into our environment
    - OS, Compilers, Middleware, Applications
  - Integration alongside EDG testbed
  - Integration into LCG deployment strategy
- Show with success that the new technologies can be solid building blocks for the LHC computing environment
The Large Hadron Collider - 4 detectors
Detectors: ALICE, ATLAS, CMS, LHCb
- Huge requirements for data analysis
  - Storage
    - Raw recording rate 0.1 – 1 GByte/sec
    - Accumulating data at 5-8 PetaBytes/year (plus copies)
    - 10 PetaBytes of disk
  - Processing
    - 100,000 of today’s fastest PCs
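As a rough cross-check of the storage figures above, the sketch below relates the raw recording rate to the yearly data volume. It assumes decimal units (1 PetaByte = 10^6 GByte) and introduces a hypothetical `live_fraction` parameter for the share of the year spent recording; neither assumption comes from the slides.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600   # 31,536,000 s, ignoring leap years
GB_PER_PB = 1_000_000                # assumption: decimal units


def petabytes_per_year(rate_gb_per_s, live_fraction=1.0):
    """Yearly volume in PB for a given raw recording rate in GByte/sec."""
    return rate_gb_per_s * SECONDS_PER_YEAR * live_fraction / GB_PER_PB


def recording_seconds_needed(target_pb, rate_gb_per_s):
    """Seconds of recording needed to accumulate target_pb at the given rate."""
    return target_pb * GB_PER_PB / rate_gb_per_s


if __name__ == "__main__":
    for rate in (0.1, 1.0):
        print(f"{rate} GByte/sec nonstop -> {petabytes_per_year(rate):.2f} PB/year")
    for target in (5, 8):
        secs = recording_seconds_needed(target, 1.0)
        print(f"{target} PB at 1 GByte/sec -> {secs:.0f} s of recording")
```

Under these assumptions, 5-8 PetaBytes/year at the upper rate of 1 GByte/sec corresponds to 5-8 million seconds of recording, a fraction of the roughly 31.5 million seconds in a year.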
Expected LHC needs
[Plot; curve label: Moore’s law (based on 2000)]
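The Moore’s-law reference can be made concrete with a small growth-factor calculation relative to the year 2000 baseline. This is only a sketch: the 18-month doubling period is a commonly quoted assumption, not a number taken from the slide, and `capacity_factor` is a hypothetical helper name.

```python
def capacity_factor(year, base_year=2000, doubling_months=18.0):
    """Growth factor relative to base_year, assuming steady doubling."""
    months_elapsed = (year - base_year) * 12
    return 2.0 ** (months_elapsed / doubling_months)


if __name__ == "__main__":
    for year in (2000, 2003, 2006):
        print(year, capacity_factor(year))
```

Changing `doubling_months` shows how sensitive any multi-year projection is to the assumed doubling period.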
The LHC Computing Grid Project – LCG
Goal – Prepare and deploy the LHC computing environment
1) Applications support: develop and support the common tools, frameworks, and environment needed by the physics applications
2) Computing system: build and operate a global data analysis environment integrating large local computing fabrics and high bandwidth networks to provide a service for ~6K researchers in over 40 countries
This is not “yet another grid technology project” – it is a grid deployment project