1 SJ – Mar 2003 1 The “opencluster” in “openlab”: a technical overview. Sverre Jarp, IT Division, CERN

2 SJ – Mar 2003 2 Definitions
The “CERN openlab for DataGrid applications” is a framework for evaluating and integrating cutting-edge technologies or services in partnership with industry, focusing on potential solutions for the LCG. The openlab invites members of industry to join and contribute systems, resources or services, and to carry out with CERN large-scale, high-performance evaluations of their solutions in an advanced integrated environment.
The “opencluster” project: the openlab is constructing a pilot ‘compute and storage farm’ called the opencluster, based on HP's dual-processor servers, Intel's Itanium Processor Family (IPF) processors, Enterasys's 10-Gbps switches and, at a later stage, a high-capacity storage system.

3 SJ – Mar 2003 3 LHC Computing Grid
[Timeline chart, 2002-2009: the LCG Project moves through selection, integration (technology) and deployment (full scale), followed by a full production service with worldwide scope.]

4 SJ – Mar 2003 4 CERN openlab
[Timeline chart, 2002-2008: the CERN openlab shown alongside the LCG – a framework for collaboration; evaluation, integration and validation of cutting-edge technologies; 3-year lifetime.]

5 SJ – Mar 2003 5 Technology onslaught
Large amounts of new technology will become available between now and LHC start-up. A few hardware examples:
- Processors: SMT (Simultaneous Multi-Threading), CMP (Chip Multiprocessor), ubiquitous 64-bit computing (even in laptops)
- Memory: DDR II-400 (fast), servers with 1 TB (large)
- Interconnect: PCI-X → PCI-X2 → PCI-Express (serial), InfiniBand
- Computer architecture: chipsets on steroids, modular computers (see the ISC2003 keynote “Building Efficient HPC Systems from Catalog Components”, Justin Rattner, Intel Corp., Santa Clara, USA)
- Disks: Serial-ATA
- Ethernet: 10 GbE (NICs and switches), 1 Terabit backplanes
Not all, but some of this will definitely be used by LHC.

6 SJ – Mar 2003 6 Vision: A fully functional GRID cluster node
[Diagram: CPU servers and a storage system on a multi-gigabit LAN, connected to a remote fabric over a gigabit long-haul WAN link.]

7 SJ – Mar 2003 7 opencluster strategy
Demonstrate promising technologies: LCG and LHC on-line
Deploy the technologies well beyond the opencluster itself:
- 10 GbE interconnect in the LHC test-bed
- Act as a 64-bit porting centre: CMS and ALICE already active, ATLAS is interested; CASTOR 64-bit reference platform
- Storage subsystem as a CERN-wide pilot
Focal point for vendor collaborations: for instance, in the “10 GbE Challenge” everybody must collaborate in order to be successful
Channel for providing information to vendors: thematic workshops

8 SJ – Mar 2003 8 The opencluster today
Three industrial partners: Enterasys, HP, and Intel
A fourth partner to join: a data storage subsystem, which will “fulfill the vision”
Technology aimed at the LHC era:
- Network switches at 10 Gigabits
- Rack-mounted HP servers
- 64-bit Itanium processors
Cluster evolution:
- 2002: cluster of 32 systems (64 processors)
- 2003: 64 systems (“Madison” processors)
- 2004/05: possibly 128 systems (“Montecito” processors)

9 SJ – Mar 2003 9 Activity overview
Over the last few months:
- Cluster installation, middleware
- Application porting, compiler installations, benchmarking
- Initialization of “Challenges”
- Planned first thematic workshop
Future:
- Porting of grid middleware
- Grid integration and benchmarking
- Storage partnership
- Cluster upgrades/expansion
- New generation network switches

10 SJ – Mar 2003 10 opencluster in detail
Integration of the cluster:
- Fully automated network installations
- 32 nodes + development nodes
- RedHat Advanced Workstation 2.1
- OpenAFS, LSF
- GNU, Intel and ORC compilers (64-bit); ORC is the Open Research Compiler, which originated at SGI
CERN middleware: CASTOR data management
CERN applications: porting, benchmarking and performance improvements for CLHEP, GEANT4, ROOT, Sixtrack, CERNLIB, etc. (a small compiler-comparison sketch follows this slide)
Database software (MySQL, Oracle?)
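As a purely illustrative aid (not the actual openlab benchmark suite, which consisted of full applications such as ROOT, GEANT4 and Sixtrack), the minimal C++ timing kernel below shows the kind of micro-benchmark one can compile with each of the 64-bit compilers and compare. The compiler driver names in the comments are assumptions; they varied by release.

```cpp
// Minimal sketch of a compiler-comparison micro-benchmark (illustrative only).
// Build with each compiler and compare timings, e.g. (driver names assumed):
//   g++ -O3 -o bench bench.cpp      (GNU)
//   ecpc -O3 -o bench bench.cpp     (Intel, Itanium-era driver; later icpc)
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

int main() {
    const std::size_t n = 1 << 24;                 // ~16M doubles per array
    std::vector<double> a(n, 1.0), b(n, 2.0);
    double sum = 0.0;

    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < n; ++i)            // simple streaming kernel
        sum += a[i] * b[i];
    auto t1 = std::chrono::steady_clock::now();

    double sec = std::chrono::duration<double>(t1 - t0).count();
    std::printf("dot product = %.1f, %.3f s, %.2f MFLOP/s\n",
                sum, sec, 2.0 * n / sec / 1e6);
    return 0;
}
```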

11 SJ – Mar 2003 11 The compute nodes
HP rx2600:
- Rack-mounted (2U) systems
- Two Itanium 2 processors at 900 or 1000 MHz, field-upgradeable to the next generation
- 2 or 4 GB memory (max 12 GB)
- 3 hot-pluggable SCSI discs (36 or 73 GB)
- On-board 100 Mbit and 1 Gbit Ethernet
- 4 PCI-X slots: full-size 133 MHz/64-bit slot(s)
- Built-in management processor, accessible via serial port or Ethernet interface

12 SJ – Mar 2003 12 Opencluster - phase 1
Perform cluster benchmarks:
- Parallel ROOT queries (via PROOF); excellent scaling observed from 2 to 4, 8, 16, 32 and 64 CPUs; to be reported at CHEP2003 (a PROOF usage sketch follows this slide)
“1 GB/s to tape” challenge:
- Network interconnect via 10 GbE switches
- opencluster may act as CPU servers
- 50 StorageTek tape drives in parallel
“10 Gbit/s network challenge”:
- Groups together all openlab partners: Enterasys switch, HP servers, Intel processors and n/w cards, CERN Linux and n/w expertise
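For readers unfamiliar with PROOF, here is a hedged sketch of what a parallel ROOT query looks like, written against the ROOT API as documented in later releases (the 2003 interface differed in detail). The master host name, data file and selector are hypothetical names used only for illustration.

```cpp
// Hedged sketch of a parallel ROOT query with PROOF (illustrative only).
#include "TChain.h"
#include "TProof.h"

void run_proof_query() {
    // Connect to a PROOF master, which farms the query out to its workers
    TProof::Open("proofmaster.cern.ch");

    // Build a chain of event files and let PROOF process it in parallel
    TChain chain("EventTree");
    chain.Add("events.root");
    chain.SetProof();                    // route Process() through PROOF

    // Each worker runs the selector over its share of the events;
    // the outputs produced in the selector are merged on the master
    chain.Process("EventSelector.C+");
}
```

The independence of the events is what makes this kind of query scale so well across CPUs, which is the effect reported on the slide.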

13 SJ – Mar 2003 13 Enterasys extension 1Q2003
[Network diagram: Enterasys “E1 OAS” switch units with numbered port groups linking the 32-node Itanium cluster and a 200+ node Pentium cluster; Fast Ethernet ports serve the disk servers, with Gigabit copper, Gigabit fiber and 10 Gigabit links shown in the legend.]

14 SJ – Mar 2003 14 Why a 10 GbE Challenge?
Demonstrate LHC-era technology: all necessary components are available inside the opencluster
Identify bottlenecks, and see if we can improve (a minimal throughput-test sketch follows this slide)
We know that Ethernet is here to stay:
- 4 years from now, 10 Gbit/s should be commonly available
- Backbone technology, cluster interconnect, possibly also for iSCSI and RDMA traffic
We want to advance the state-of-the-art!
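As an illustration of how such bottlenecks are hunted (real measurements typically rely on established tools such as netperf or iperf rather than custom code), a minimal memory-to-memory TCP throughput tester might look like the sketch below. The port, transfer size and usage are arbitrary assumptions, and error handling and partial writes are ignored for brevity.

```cpp
// Hedged sketch of a memory-to-memory TCP throughput test (illustrative only).
// Hypothetical usage:  ./tput server 5001        on one host
//                      ./tput client <ip> 5001   on the other
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>
#include <chrono>
#include <cstdio>
#include <cstdlib>
#include <string>
#include <vector>

int main(int argc, char** argv) {
    if (argc < 3) { std::fprintf(stderr, "usage: %s server|client [ip] port\n", argv[0]); return 1; }
    const bool server = std::string(argv[1]) == "server";
    const int port = std::atoi(argv[argc - 1]);
    std::vector<char> buf(1 << 20);                     // 1 MB transfer unit

    int s = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);

    if (server) {                                       // sink: accept and discard
        addr.sin_addr.s_addr = INADDR_ANY;
        bind(s, reinterpret_cast<sockaddr*>(&addr), sizeof addr);
        listen(s, 1);
        int c = accept(s, nullptr, nullptr);
        while (read(c, buf.data(), buf.size()) > 0) {}
        close(c);
    } else {                                            // source: stream and time
        inet_pton(AF_INET, argv[2], &addr.sin_addr);
        connect(s, reinterpret_cast<sockaddr*>(&addr), sizeof addr);
        const long long total = 8LL << 30;              // send 8 GB
        auto t0 = std::chrono::steady_clock::now();
        for (long long sent = 0; sent < total; sent += buf.size())
            write(s, buf.data(), buf.size());           // partial writes ignored
        auto t1 = std::chrono::steady_clock::now();
        double sec = std::chrono::duration<double>(t1 - t0).count();
        std::printf("%.2f Gbit/s\n", 8.0 * total / sec / 1e9);
    }
    close(s);
    return 0;
}
```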

15 SJ – Mar 2003 15 Demonstration of openlab partnership
Everybody contributes:
- Enterasys: 10 Gbit switches
- Hewlett-Packard: servers with their PCI-X slots and memory bus
- Intel: 10 Gbit NICs plus driver; processors (i.e. code optimization)
- CERN: Linux kernel expertise, network expertise, project management, IA-32 expertise, CPU clusters and disk servers on multi-Gbit infrastructure
Stop press: we are up and running with back-to-back connections

16 SJ – Mar 2003 16 Opencluster time line
[Timeline, Jan 2003 - Jan 2006: install 32 nodes and start phase 1 with systems expertise in place; openCluster integration; EDG and LCG interoperability; complete phase 1; order/install G-2 upgrades and 32 more nodes; start phase 2; order/install G-3 upgrades and add nodes; complete phase 2; start phase 3.]

17 SJ – Mar 2003 17 Opencluster - future
Port and validation of EDG 2.0 software:
- Joint project with CMS
- Integrate the opencluster alongside the EDG testbed
- Porting and verification of the relevant software packages (hundreds of RPMs); understand the chain of prerequisites
- Exploit the possibility of leaving the control node as IA-32
Interoperability with EDG testbeds and later with LCG-1:
- Integration into the existing authentication scheme
GRID benchmarks: to be defined later

18 SJ – Mar 2003 18 Recap: opencluster strategy
Demonstrate promising IT technologies (file system technology to come)
Deploy the technologies well beyond the opencluster itself
Focal point for vendor collaborations
Channel for providing information to vendors

19 SJ – Mar 2003 19 The Workshop

20 SJ – Mar 2003 20 IT Division
250 people; about 200 are at engineering level
10 groups:
- Advanced Projects’ Group (DI)
- (Farm) Architecture and Data Challenges (ADC)
- Communications Services (CS)
- Fabric Infrastructure and Operations (FIO)
- Grid Deployment (GD)
- Databases (DB)
- Internet (and Windows) Services (IS)
- User Services (US)
- Product Support (PS)
- Controls (CO)
Groups have both a development and a service responsibility
Most of today’s speakers are from ADC and DB

21 SJ – Mar 2003 21 High Energy Physics Computing Characteristics
Independent events (collisions), so processing is trivially parallel (see the sketch after this slide)
Bulk of the data is read-only: new versions are written rather than existing data updated
Meta-data in databases linking to files
Chaotic workload: in a research environment, physics is extracted by iterative analysis by collaborating groups of physicists, so demand is unpredictable and effectively unlimited
Very large aggregate requirements: computation, data, I/O
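To make the “trivially parallel” point concrete, here is a hedged, self-contained sketch (not CERN code; production work used batch systems and frameworks such as LSF and ROOT) showing independent events split across worker threads with no shared state. The Event type and process() function are hypothetical stand-ins.

```cpp
// Minimal sketch of trivially parallel event processing (illustrative only).
#include <algorithm>
#include <cstdio>
#include <numeric>
#include <thread>
#include <vector>

struct Event { double energy; };

// Hypothetical per-event computation; in reality this would be
// reconstruction or analysis code acting on one collision.
double process(const Event& e) { return e.energy * 0.5; }

int main() {
    std::vector<Event> events(1000000, Event{1.0});     // stand-in data
    const unsigned nworkers =
        std::max(1u, std::thread::hardware_concurrency());
    std::vector<double> partial(nworkers, 0.0);
    std::vector<std::thread> workers;

    for (unsigned w = 0; w < nworkers; ++w) {
        workers.emplace_back([&, w] {
            // Each worker owns a disjoint slice of events: no sharing,
            // no locking, which is why the scaling is near-perfect.
            for (std::size_t i = w; i < events.size(); i += nworkers)
                partial[w] += process(events[i]);
        });
    }
    for (auto& t : workers) t.join();

    double total = std::accumulate(partial.begin(), partial.end(), 0.0);
    std::printf("processed %zu events, result %.1f\n", events.size(), total);
    return 0;
}
```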

22 SJ – Mar 2003 22 SHIFT architecture
Three tiers, interconnected via Ethernet:
- CPU servers (no permanent data)
- Disk servers (cached data)
- Tape robots: StorageTek Powderhorn with 6,000 half-inch tape cartridges

23 SJ – Mar 2003 Data Handling and Computation for Physics Analysis (les.robertson@cern.ch, CERN)
[Data-flow diagram: the detector produces raw data, which passes through the event filter (selection & reconstruction) to yield event summary data; event reprocessing and event simulation feed back into this processed-data store; batch physics analysis extracts analysis objects (by physics topic), which are then used in interactive physics analysis.]

24 SJ – Mar 2003 24 Backup

