The Gelato Federation: What is it exactly? Sverre Jarp, March 2003


Gelato is a collaboration

- Goal: Promote Linux on Itanium-based systems
- Sponsor: Hewlett-Packard (others coming)
- Members: 13 right now, mainly from the High-Performance/High-Throughput community; expected to grow rapidly

Current members

North America:
- NCAR (National Center for Atmospheric Research)
- NCSA (National Center for Supercomputing Applications)
- PNNL (Pacific Northwest National Laboratory)
- PSC (Pittsburgh Supercomputing Center)
- University of Illinois at Urbana-Champaign
- University of Waterloo

Europe:
- CERN
- DIKU (Datalogisk Institut, University of Copenhagen)
- ESIEE (École Supérieure d'Ingénieurs, near Paris)
- INRIA (Institut National de Recherche en Informatique et en Automatique)

Far East/Australia:
- Bioinformatics Institute (Singapore)
- Tsinghua University (Beijing)
- University of New South Wales (Sydney)

Center of gravity: the web portal

Rich content:
- Pointers to open-source IA-64 applications, for example ROOT (from CERN), OSCAR (cluster management software from NCSA) and the OpenImpact compiler (UIUC)
- News
- Information, advice and hints related to IPF, the Linux kernel, etc.
- Member overview: who is who, etc.

Current development focus

Six "performance" areas:
- Single-system scalability: from 2-way to 16-way (HP, Fort Collins)
- Cluster scalability and performance management: up to 128 nodes (NCSA)
- Parallel file system (BII)
- Compilers (UIUC)
- Performance tools and management (HP Labs)

CERN requirement #1

Better C++ performance, through:
- Better compilers (a Gelato focus)
- Faster systems (1.5 GHz)
- Both!

Further Gelato research and development

Linux memory management:
- Superpages (see the sketch after this list)
- TLB sharing between processes
- IA-64 pre-emption support

Compilers/debuggers:
- OpenImpact C compiler (UIUC)
- Open Research Compiler enhancements (Tsinghua): Fortran, C, C++
- Parallel debugger (Tsinghua)
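The superpage item is easiest to picture with a small example. The sketch below is my own illustration, using today's Linux MAP_HUGETLB mmap flag rather than the Gelato-era IA-64 patches (an assumption: the 2003 interface went through hugetlbfs). By backing a buffer with one 2 MB page instead of 512 pages of 4 KB, a single TLB entry covers the whole region, which is exactly the TLB pressure the memory-management work above aims to reduce.

// Minimal sketch (my own illustration, not Gelato code): ask the kernel for a
// 2 MB "superpage" so that a single TLB entry covers the whole buffer.
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <sys/mman.h>

int main() {
    const std::size_t len = 2 * 1024 * 1024;            // one 2 MB huge page
    void *buf = mmap(nullptr, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
    if (buf == MAP_FAILED) {                            // needs reserved huge pages
        std::perror("mmap(MAP_HUGETLB)");
        return 1;
    }
    std::memset(buf, 0, len);                           // touch it: one TLB entry instead of 512
    std::printf("superpage mapped at %p\n", buf);
    munmap(buf, len);
    return 0;
}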

The “opencluster” and the “openlab”
Sverre Jarp, IT Division, CERN

Definitions

The “CERN openlab for DataGrid applications” is a framework for evaluating and integrating cutting-edge technologies or services in partnership with industry, focusing on potential solutions for the LCG. The openlab invites members of industry to join and contribute systems, resources or services, and to carry out with CERN large-scale, high-performance evaluation of their solutions in an advanced integrated environment.

The “opencluster” project: the openlab is constructing a pilot compute and storage farm called the opencluster, based on HP's dual-processor servers, Intel's Itanium Processor Family (IPF) processors, Enterasys's 10-Gbps switches and, at a later stage, a high-capacity storage system.

Technology onslaught

Large amounts of new technology will become available between now and LHC start-up. A few hardware examples:
- Processors: SMT (Simultaneous Multi-Threading), CMP (Chip Multiprocessing), ubiquitous 64-bit computing (even in laptops)
- Memory: DDR II-400 (fast), servers with 1 TB (large)
- Interconnect: PCI-X -> PCI-X 2.0 -> PCI-Express (serial); InfiniBand
- Computer architecture: chipsets on steroids, modular computers (see the ISC2003 keynote "Building Efficient HPC Systems from Catalog Components", Justin Rattner, Intel Corp., Santa Clara, USA)
- Disks: Serial ATA
- Ethernet: 10 GbE (NICs and switches), 1 Terabit backplanes

Not all, but some of this will definitely be used by LHC.

Vision: a fully functional Grid cluster node

[Diagram: CPU servers and a storage system on a multi-gigabit LAN, connected over a gigabit long-haul link to the WAN and the remote fabric.]

opencluster strategy

- Demonstrate promising technologies for LCG and the LHC on-line systems
- Deploy the technologies well beyond the opencluster itself:
  - 10 GbE interconnect in the LHC testbed
  - Act as a 64-bit porting centre (CMS and ALICE already active; ATLAS is interested)
  - CASTOR 64-bit reference platform
  - Storage subsystem as a CERN-wide pilot
- Focal point for vendor collaborations; for instance, in the “10 GbE Challenge” everybody must collaborate in order to be successful
- Channel for providing information to vendors (thematic workshops)

The opencluster today

- Three industrial partners: Enterasys, HP and Intel
- A fourth partner joining soon with a data storage subsystem, which would “fulfill the vision”
- Technology aimed at the LHC era: network switches at 10 Gigabit, rack-mounted HP servers, 64-bit Itanium processors
- Cluster evolution:
  - 2002: cluster of 32 systems (64 processors)
  - 2003: 64 systems (“Madison” processors)
  - 2004/05: possibly 128 systems (“Montecito” processors)

Activity overview

Over the last few months:
- Cluster installation, middleware
- Application porting, compiler installations, benchmarking
- Initialization of “challenges”
- Planning of the first thematic workshop

Future:
- Porting of Grid middleware
- Grid integration and benchmarking
- Storage partnership
- Cluster upgrades/expansion
- New-generation network switches

opencluster in detail

Integration of the cluster:
- Fully automated network installations
- 32 nodes + development nodes
- Red Hat Advanced Workstation 2.1
- OpenAFS, LSF
- GNU, Intel and ORC compilers (64-bit); ORC is the Open Research Compiler, which used to belong to SGI
- CERN middleware: CASTOR data management

CERN applications:
- Porting, benchmarking, performance improvements
- CLHEP, GEANT4, ROOT, Sixtrack, CERNLIB, etc.
- Database software (MySQL, Oracle?)

Many thanks to my colleagues in ADC, FIO and CS.

The compute nodes: HP rx2600

- Rack-mounted (2U) systems
- Two Itanium 2 processors at 900 or 1000 MHz, field-upgradeable to the next generation
- 2 or 4 GB memory (max 12 GB)
- 3 hot-pluggable SCSI discs (36 or 73 GB)
- On-board 100 Mbit and 1 Gbit Ethernet
- 4 PCI-X slots, including full-size 133 MHz/64-bit slot(s)
- Built-in management processor, accessible via serial port or Ethernet interface

rx2600 block diagram

[Block diagram: two Intel Itanium 2 processors and 12 DIMMs attached to the zx1 memory and I/O controller (6.4 GB/s processor bus; 1 GB/s links to the zx1 I/O adapters). The I/O adapters serve the PCI-X 133/64 slots, Gbit LAN, dual-channel Ultra160 SCSI for the 3 internal drives, IDE CD/DVD, USB 2.0, 10/100 LAN and 3 serial ports. A separate management-processor card carries the service processor, VGA and its own 10/100 LAN.]

Benchmarks

Comment: note that 64-bit benchmarks pay a performance penalty for LP64, i.e. 64-bit pointers (and longs). We need to wait for AMD systems that can natively run either a 32-bit or a 64-bit OS before we can understand the exact cost for our benchmarks.
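As a concrete illustration of where the LP64 penalty comes from, the hypothetical structure below (my own example, not one of the CERN benchmarks) shows how pointer members double in size when the same code is compiled for a 64-bit data model, inflating memory footprint and cache pressure:

// Hypothetical example: compile once as a 32-bit (ILP32) binary and once as a
// 64-bit (LP64) binary and compare the printed sizes.  Pointer-rich event
// structures grow under LP64, which is the penalty referred to above.
#include <cstdio>

struct TrackNode {
    double     momentum[3];   // same size in both data models
    TrackNode *next;          // 4 bytes under ILP32, 8 bytes under LP64
};

int main() {
    std::printf("sizeof(void*)     = %zu bytes\n", sizeof(void *));
    std::printf("sizeof(long)      = %zu bytes\n", sizeof(long));
    std::printf("sizeof(TrackNode) = %zu bytes\n", sizeof(TrackNode));
    return 0;
}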

Benchmark 1: Sixtrack (SPEC)

What we would have liked to see for all CERN benchmarks (small is best!):

            Itanium 1000 MHz (efc7)    Pentium 3.06 GHz (ifl7)    IBM 1300 MHz
  Sixtrack  122 s                      195 s                      202 s

Projection: 1.5 GHz: ~ 81 s
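The 1.5 GHz figure appears to be a straight clock-frequency scaling of the measured 1000 MHz result (my reading, not stated on the slide): 122 s × (1000 MHz / 1500 MHz) ≈ 81 s.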

Benchmark 2: CRN jobs / Fortran

Big is best! Measured in CERN Units (geometric mean) and CU/MHz on:
- 800 MHz (efc O3, prof_use)
- Itanium 1000 MHz (efc O3, ipo, prof_use)
- Pentium 4 2 GHz, 512 KB cache (ifc O3, ipo, prof_use)

Projections: 1.5 GHz: ~ 585 CU; P4 3.0 GHz: ~ 620 CU

Benchmark 3: ROOTmarks / C++

All jobs run in “batch” mode, with ROOT built for Itanium 1000 MHz with gcc 3.2 (O3) and with ecc7 prod (O2):
- stress -b -q
- bench -b -q
- root -b benchmarks.C -q
- Geometric mean

Projections: 1.5 GHz: ~ 660 RM; Pentium 3.0 GHz/512 KB: ~ 750 RM. René's own 2.4 GHz P4 is normalized to 600 RM.

Stop press: we have just agreed on a compiler improvement project with Intel.

opencluster - phase 1

Perform cluster benchmarks:
- Parallel ROOT queries (via PROOF); observed excellent scaling from 2 -> 4 -> 8 -> 16 -> 32 -> 64 CPUs, to be reported at CHEP2003 (a usage sketch follows this list)
- “1 GB/s to tape” challenge: network interconnect via 10 GbE switches; the opencluster may act as CPU servers, with 50 StorageTek tape drives in parallel
- “10 Gbit/s network challenge”: groups together all openlab partners: Enterasys switch, HP servers, Intel processors and network cards, CERN Linux and network expertise
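As an aside, the fragment below sketches what a parallel ROOT query via PROOF looks like in outline. It is a generic illustration rather than the actual CHEP2003 benchmark code; it is written against the later TProof::Open interface, and the dataset path and the MySelector selector are hypothetical.

// proof_query.C -- minimal PROOF sketch (hypothetical dataset and selector).
// Run inside ROOT with:  root -b -q proof_query.C
void proof_query()
{
   // Connect to a PROOF cluster; "proofmaster" is a placeholder host name.
   TProof *proof = TProof::Open("proofmaster");

   // Build a chain over the event files and hand it to PROOF.
   TChain chain("EventTree");
   chain.Add("/data/run*/events_*.root");   // hypothetical file layout
   chain.SetProof();

   // The selector is compiled and shipped to the workers, which process
   // disjoint parts of the chain in parallel and merge the results.
   chain.Process("MySelector.C+");

   proof->Print();                          // summarize worker activity
}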

The 10 GbE Challenge

Network topology today

[Diagram: disk servers connected over Fast Ethernet and Gigabit (copper and fiber) links to the Enterasys E1 OAS backbone, with 10 Gigabit uplinks.]

Enterasys extension (1Q)

[Diagram: the existing Fast Ethernet / Gigabit infrastructure (disk servers, E1 OAS backbone) extended with 10 Gigabit links to the Itanium cluster and the 200+ node Pentium cluster.]

Why a 10 GbE challenge?

- Demonstrate LHC-era technology: all necessary components are available inside the opencluster
- Identify bottlenecks, and see if we can improve
- We know that Ethernet is here to stay; 4 years from now 10 Gbit/s should be commonly available as backbone technology and as cluster interconnect, possibly also for iSCSI and RDMA traffic
- We want to advance the state of the art!

Demonstration of the openlab partnership

Everybody contributes:
- Enterasys: 10 Gbit switches
- Hewlett-Packard: the server, with its PCI-X slots and memory bus
- Intel: 10 Gbit NICs plus driver; processors (i.e. code optimization)
- CERN: Linux kernel expertise, network expertise, project management, IA-32 expertise, CPU clusters and disk servers on a multi-Gbit infrastructure

“Can we reach 400-600 MB/s throughput?”

Bottlenecks could be:
- Linux CPU consumption: kernel and driver optimization; number of interrupts, TCP checksumming, IP packet handling, etc.; we definitely need TCP offload capabilities
- Server hardware: memory banks and speeds, PCI-X slot and overall speed
- Switch: single-transfer throughput

Aim: identify the bottleneck(s), measure peak throughput and the corresponding cost (processor, memory, switch, etc.).
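To put the 400-600 MB/s target in perspective, here is a back-of-envelope check (my own arithmetic, not from the slide) against two of the hard limits mentioned above, the PCI-X 133/64 slot and the 10 GbE wire rate:

// Back-of-envelope sketch: compare the 400-600 MB/s target with the
// theoretical peaks of a PCI-X 133 MHz / 64-bit slot and a 10 GbE link.
// These are nominal bus limits; real transfers lose further bandwidth to
// protocol overhead, interrupts and checksumming.
#include <cstdio>

int main() {
    const double pcix_peak = 133e6 * 8.0;   // 133 MHz x 8 bytes ~ 1.06 GB/s
    const double wire_rate = 10e9 / 8.0;    // 10 Gbit/s         = 1.25 GB/s

    std::printf("PCI-X 133/64 peak : %6.0f MB/s\n", pcix_peak / 1e6);
    std::printf("10 GbE wire rate  : %6.0f MB/s\n", wire_rate / 1e6);
    std::printf("400-600 MB/s is %.0f%%-%.0f%% of the PCI-X peak\n",
                100.0 * 400e6 / pcix_peak, 100.0 * 600e6 / pcix_peak);
    return 0;
}

Even at these nominal limits, the upper end of the target already consumes more than half of the PCI-X slot's bandwidth, which is why TCP offload and careful driver tuning matter.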

Gridification

opencluster - future

Port and validation of EDG 2.0 software (joint project with CMS): integrate the opencluster alongside the EDG testbed.
- Porting and verification of the relevant software packages (hundreds of RPMs); understand the chain of prerequisites; exploit the possibility to leave the control node as IA-32
- Interoperability with the EDG testbeds and later with LCG-1; integration into the existing authentication scheme
- Grid benchmarks: to be defined later

Fact sheet: HP joined the openlab mainly because of their interest in Grids.

opencluster time line

[Timeline from Jan 03 to Jan 06 with the following milestones: install 32 nodes; start phase 1 (systems expertise in place); complete phase 1; order/install G-2 upgrades and 32 more nodes; start phase 2 (opencluster integration, EDG and LCG interoperability); complete phase 2; order/install G-3 upgrades and add nodes; start phase 3.]

Recap: opencluster strategy

- Demonstrate promising IT technologies (file system technology to come)
- Deploy the technologies well beyond the opencluster itself
- Focal point for vendor collaborations
- Channel for providing information to vendors

Storage Workshop

Data and Storage Management Workshop (draft agenda), March 17th-18th 2003 (Sverre Jarp). Organized by the CERN openlab for DataGrid applications and the LCG.

Aim: understand how to create synergy between our industrial partners and LHC Computing in the area of storage management and data access.

Day 1 (IT Amphitheatre)

Introductory talks:
- 09:00 - 09:15  Welcome (von Rueden)
- 09:15 - 09:35  Openlab technical overview (Jarp)
- 09:35 - 10:15  Gridifying the LHC data: challenges and current shortcomings (Kunszt)
- 10:15 - 10:45  Coffee break

The current situation:
- 10:45 - 11:15  Physics data structures and access patterns (Wildish)
- 11:15 - 11:35  The Andrew File System usage in CERN and HEP (Többicke)
- 11:35 - 12:05  CASTOR: CERN's data management system (Durand)
- 12:05 - 12:25  IDE disk servers: a cost-effective cache for physics data (NN)
- 12:25 - 14:00  Lunch

Preparing for the future:
- 14:00 - 14:30  ALICE Data Challenges: on the way to 1 GB/s (Divià)
- 14:30 - 15:00  Lessons learnt from managing data in the European Data Grid (Kunszt)
- 15:00 - 15:30  Could Oracle become a player in physics data management? (Shiers)
- 15:30 - 16:00  CASTOR: possible evolution into the LHC era (Barring)
- 16:00 - 16:30  POOL: LHC data persistency (Duellmann)
- 16:30 - 17:00  Coffee break
- 17:00 -        Discussions and conclusion of day 1 (All)

Day 2 (IT Amphitheatre)
- Vendor interventions; one-on-one discussions with CERN

THANK YOU