SURA Presentation for IBM HW for the SURA GRID for 2007


1 SURA Presentation for IBM HW for the SURA GRID for 2007
Janis Landry-Lane, IBM World Wide Deep Computing

2 AGENDA for the presentation
IBM pSeries offering for SURA for 2007
IBM e1350 HS21 blade offering for SURA for 2007
Intel Clovertown performance benchmarks
SURA services/collaboration
Intel/IBM partnership

3 Power5+ p575 High Performance Computing Solutions for SURA
Dileep Bhattacharya, Product Manager, High-End System p Servers

4 Power5+, p575 Server and Rack General Description

5 P5 575 System Solution Characteristics
Robust hardware with high-reliability components
16-CPU scalability within a node
Low-latency, high-performance switch technology
Industrial-strength OS and HPC software subsystems
High compute-density packaging
Ability to scale to very large configurations

6 0.97 TFlop Solution For SURA
Eight 16-way nodes at 1.9 GHz POWER5+ (128 processors), with a Federation switch connecting the 8 nodes
Peak memory bandwidth: 1.64 TB/s
Peak node interconnect bandwidth: 32 GB/s
Peak network bandwidth (Ethernet): 32 Gbit/s
Total system memory: 256 GB or 128 GB
Total storage capacity: 2.35 TB
Cumulative SPECfp_rate: 4576

7 1.7 TFlop Solution for SURA
Fourteen 16-way nodes at 1.9 GHz POWER5+ (224 processors), with a Federation switch connecting the 14 nodes
Peak memory bandwidth: 2.86 TB/s
Peak node interconnect bandwidth: 56 GB/s
Peak network bandwidth (Ethernet): 56 Gbit/s
Total system memory: 224 GB or 448 GB
Total storage capacity: 4.11 TB
Cumulative SPECfp_rate: 8008
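Neither slide shows the peak-TFLOP arithmetic; as a quick check, assuming the usual two fused multiply-add units per POWER5+ processor (4 floating-point operations per clock):

```latex
\text{Peak} = N_{\text{proc}} \times f_{\text{clock}} \times 4~\tfrac{\text{flop}}{\text{cycle}}:
\quad 128 \times 1.9\,\text{GHz} \times 4 \approx 0.97~\text{TFLOP/s},
\quad 224 \times 1.9\,\text{GHz} \times 4 \approx 1.70~\text{TFLOP/s}
```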

8 p5 575 Software
AIX 5.3
General Parallel File System (GPFS) with WAN support
LoadLeveler
Cluster Systems Management (CSM)
Compilers (XL Fortran, XL C)
Engineering and Scientific Subroutine Library (ESSL)
IBM Parallel Environment (PE)
Simultaneous Multi-Threading (SMT) support
Virtualization, Micro-Partitioning, DLPAR

9 Growing the SURAgrid IBM e1350 Cluster Offerings
John Pappas, IBM Southeast Linux Cluster Sales, March 12, 2007

10 Outline
IBM HPC Offerings
e1350 Overview
SURA Offerings Approach
SURA e1350 Configurations

11 IBM HPC Foundation Offerings
Scale-out (high-value): tightly coupled clusters; RISC; optional high-performance interconnect (industry-standard, OEM, or custom); industry-standard or custom packaging; vendor integrated. IBM offerings: p5-575+, p5-590, p5-595, p655, p690
Purpose-built: specifically designed for HPC capability workloads; usually custom microprocessors, usually employing vectors and streaming; custom interconnect; custom packaging. IBM offerings: Blue Gene
Scale-out (commodity): clusters of 1-, 2-, and 4-way blade or rack-optimized servers; based on "merchant" or low-cost technology; standard or OEM high-performance interconnects and graphics adapters; standard packaging; broad ISV support with concomitant availability (Linux and Windows); often vendor integrated. IBM offerings: BladeCenter HS21/LS21/JS21/Cell, Deep Computing Visualization, OpenPower 720, e1350 clusters
(Standalone) SMP: 2-way to 64-CPU (or bigger) SMP servers; single-system simplicity, uniform memory programming model, and high SMP scalability for a broad range of problem sizes and throughput objectives; broad ISV support (UNIX and Linux). IBM offerings: p5-570, p5-595, OpenPower 710/720, 3755 (Ridgeback), x366 (Intel)
(Slide callouts: "SURA Offering in 2006 and 2007", "SURA Offering in 2007")

12 What is an IBM Cluster 1350

13 IBM eServer Cluster 1350 Hardware Core Technologies
An IBM portfolio of components that have been cluster-configured, tested, and work with a defined supporting software stack. Factory assembled; onsite installation; one phone number for support. Selection of options to customize your configuration, including the Linux operating system (RHEL or SUSE), CSM, xCAT, GPFS, and Deep Computing Visualization (DCV).
IBM servers: management nodes, compute nodes, storage nodes; 8-way servers (3755); 4-way servers (3550, 3650, 3950); 3455, 3655, 3755; p5 505, p5 520, OpenPower 710; blade servers (HS21, LS21, JS21, QS20)
Processors: Intel®, AMD®, Cell BE, PowerPC®, IBM POWER5™
Networks: Ethernet (10/100 MbE, 1000 MbE, 10 GbE), InfiniBand (1X, 4X), Myrinet®, Fiber, iSCSI
Storage: IBM TotalStorage®, disk storage (SCSI, SATA, Fiber), ServeRAID, storage software, storage networking, specialty

14 BladeCenter Efficiency
BladeCenter helps clients simplify their IT infrastructure and gain operational efficiency: fewer outages, fewer fabric connections, fewer cables, less power, less cooling, smarter management. One reason we have both the BCH and the BCHT is that the telco environment and the traditional datacenter are merging.

15 Introducing the IBM BladeCenter HS21 XM
Delivers leadership performance and efficiency for most applications
Double the memory of the HS21 30mm blade
Integer, floating-point, and many finance-specific applications will benefit greatly; a new core architecture, dual-core processors, and fully buffered DIMMs all lead to the gains
Supports new solid-state Modular Flash Devices: starting with the IBM 4GB Modular Flash Drive, solid-state drives are optimized for durability, reliability, and power efficiency; RAID arrays can potentially be simplified; flash drives virtually eliminate yet another potential point of failure inside the BladeCenter
HS21 XM is designed to extract the most from multi-core processors: more memory, NICs, and I/O, designed for diskless operation and low power
64-bit matched with up to 32GB of memory supports even the most demanding solutions
10Gb enabled for BladeCenter H
Can support both PCI-X and PCI-Express I/O cards; supports all the traditional I/O cards in the BladeCenter family

16 IBM BladeCenter HS21 XM A Closer Look
8 standard FB DIMMs, up to 32GB of memory per blade
2.5” SAS HDD (36, 73, 146GB)
2 NICs: Broadcom 5708S (TOE enabled)
Diskless ready: iSCSI and SAN boot for all OS; support for IBM Modular Flash Device 4GB
Dual- and quad-core processors: 65W and 80W Woodcrest, 80W Clovertown
Supports Concurrent KVM mezzanine card (cKVM)
Supports PEU2 and SIO expansion units
Support for the new MSIM Combo Form Factor (CFF) card to double the port count per blade (CFF-V and CFF-H cards)
MCH nested between the CPUs
HS21 XM availability: early March 2007

17 Complete Systems Management for BladeCenter
Integrated management for resource efficiency; automation for productivity
Application management: LoadLeveler, compilers, libraries
AMM with BladeCenter Address Manager: MAC address and WWN address (I/O) virtualization; manage, control, and install from a single point; Concurrent KVM; RAS; PowerExecutive
xCAT or CSM: manage across chassis and other platforms; virtualize and optimize; stateless computing; maintain and update; manage/report policy

18 CoolBlue™ from IBM: Energy management innovation
Power Configurator, PowerExecutive™, Rear Door Heat Exchanger: plan & manage, budget, save
Power has become a difficult limitation for many clients to manage; it takes a holistic approach to power to make a difference
IBM mainframe-inspired thinking put to work inside BladeCenter
Steps to extract the most from your data center: save power with smartly designed servers; plan better with accurate planning tools; monitor and manage power with PowerExecutive; use room-level solutions as needed; plan for the future
Really interested in power? Download the Power and Cooling Customer Presentation: x&docID=bcleadershipPowCool

19 SURA e1350 Offerings Approach
Surveyed the SURA members for useful cluster configurations
Keep it simple
Leverage the latest hardware and cluster technology
Minimize cost of ownership
Provide a complete, integrated, ready-to-use cluster
Leverage the existing MOU with SURA for the pSeries

20 SURA e1350 Offerings Architecture
Configuration basis: new IBM BladeCenter H, new HS21 XM blades, and Intel quad-core processors
Create a 3 TFLOP and a 6 TFLOP cluster configuration
3 TFLOP – cost-conscious solution for HPC: one-rack solution utilizing a GigE interconnect; 1 GB/core; combination management/user node with storage
6 TFLOP – performance-focused solution for HPC: two-rack solution utilizing DDR InfiniBand; 2 GB/core; optional SAN supporting 4 Gbps storage at 4.6 TB

21 Common e1350 Features BladeCenter-H Based Chassis
Redundant power supplies and fan units; Advanced Management Module; dual 10 Gbps backplanes
Quad-core Intel processors (8 cores/node)
Fully integrated, tested, and installed e1350 cluster; onsite configuration, setup, and skills transfer from our Cluster Enablement Team
Single point of support for the cluster
Terminal server connection to every node
IBM 42U Enterprise racks; pull-out console monitor, keyboard, mouse
Redundant power and fans on all nodes
3-year onsite warranty: 9x5 next-day on-site on compute nodes; 24x7 4-hour on-site on management node, switches, and racks (optional storage)

22 3 TFLOP e1350 Cluster
34 HS21 XM blade servers in 3 BladeCenter H chassis: dual quad-core 2.67 GHz Clovertown processors; 1 GB memory per core; 73 GB SAS disk per blade; GigE Ethernet to blade with 10 Gbit uplink; serial terminal server connection to every blade; redundant power/fans
x3650 2U management/user node: dual quad-core 2.67 GHz Clovertown processors; 1 GB memory per core; Myricom 10Gb NIC card; RAID controller with (6) 300GB 10K hot-swap SAS drives; redundant power/fans
Force10 48-port GigE switch with two 10Gb uplinks
SMC 8-port 10Gb Ethernet switch
(2) 32-port Cyclades terminal servers
Red Hat ES 4 license and media kit (3 years of update support)
Console manager, pull-out console, keyboard, mouse
One 42U Enterprise rack, all cables, PDUs
Shipping and installation; 5 days of onsite consulting for configuration and skills transfer
3-year onsite warranty

23 6 TFLOP e1350 Cluster
70 HS21 XM blade servers in 5 BladeCenter H chassis: dual quad-core 2.67 GHz Clovertown processors; 2 GB memory per core; 73 GB SAS disk per blade; GigE Ethernet to blade; DDR non-blocking Voltaire InfiniBand low-latency network; serial terminal server connection to every blade; redundant power/fans
x3650 2U management/user node: dual quad-core 2.67 GHz Clovertown processors; 1 GB memory per core; Myricom 10Gb NIC card; RAID controller with (6) 300GB 10K hot-swap SAS drives; redundant power/fans
DDR non-blocking InfiniBand network
Force10 48-port GigE switch
(3) 32-port Cyclades terminal servers
Red Hat ES 4 license and media kit (3 years of update support)
Console manager, pull-out console, keyboard, mouse
One 42U Enterprise rack, all cables, PDUs
Shipping and installation; 10 days of onsite consulting for configuration and skills transfer
3-year onsite warranty

24 6 TFLOP e1350 Cluster Storage Option
x3650 storage node: dual quad-core 2.67 GHz Clovertown processors; 1 GB memory per core; Myricom 10Gb NIC card; (2) 3.5" 73GB 10K hot-swap SAS drives; (2) IBM 4-Gbps FC dual-port PCI-E HBAs; redundant power/fans; 3-year 24x7 4-hour on-site warranty
DS4700 storage subsystem: 4 Gbps performance (Fibre Channel)
EXP810 expansion system: (32) 4 Gbps FC, GB/15K Enhanced Disk Drive Modules (E-DDM)
Total 4.6 TB storage capacity

25 Why IBM SURA Cluster Offerings
Outstanding performance at 2.67 GHz; industry-leading quad-core
3 TFLOP – 2.89 peak TFLOPS, 1.46 TFLOPS estimated actual, 50% efficiency
6 TFLOP – 5.96 peak TFLOPS, 4.29 TFLOPS estimated actual, 72% efficiency
Switches are chassis-based, so modular growth is simpler
Redundant power and fans
38% less power and cooling required for the BladeCenter solution than for a 1U rack-mount cluster
Smaller footprint
Stateless computing with xCAT or CSM
Academic Initiative – CSM, LoadLeveler, compilers
Complete, integrated, installed cluster; onsite skills transfer
Application migration support from IBM
Single point of support for 3 years
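A rough check of the peak and efficiency figures above (my arithmetic, not from the slide, assuming 4 double-precision floating-point operations per clock per core, as in the Intel Core microarchitecture, and the 2.66 GHz nominal clock of the X5355 that the slides round to 2.67 GHz):

```latex
34 \times 8~\text{cores} \times 2.66\,\text{GHz} \times 4 \approx 2.89~\text{TFLOPS}, \qquad
70 \times 8~\text{cores} \times 2.66\,\text{GHz} \times 4 \approx 5.96~\text{TFLOPS}
```

The quoted efficiencies are simply estimated actual over peak: 1.46/2.89 ≈ 50% and 4.29/5.96 ≈ 72%.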

26 HPC performance slides for SURA comparing Clovertown and other offerings
Michael Greenfield, Principal Software Engineer, Enterprise System Software Division, Intel

27–30 Benchmark charts comparing Clovertown with other offerings [chart content not transcribed; each of these four slides carries the same Intel performance disclaimer]: Performance tests and ratings are measured using specific computer systems and/or components and reflect the approximate performance of Intel® products as measured by those tests. Any difference in system hardware or software design or configuration may affect actual performance. Buyers should consult other sources of information to evaluate the performance of systems or components they are considering purchasing. For more information on performance tests and on the performance of Intel products, reference or call (U.S.) or * Other brands and names may be claimed as the property of others.

31 Performance comparison of Intel® Xeon® Processor 5160 and Intel® Xeon® Processor X5355 for Life Sciences Applications
Omar G. Stradella, PhD, Life Sciences Applications Engineer, Intel Corporation

32 HPC Life Sciences Applications
Application areas: Computational Chemistry; Sequence Analysis and Biological Databases; Docking; De Novo Design; Secondary Structure Prediction; QSAR, QSPR; Pharmacophore Modeling, Shape Matching; Homology Modeling; Pathway Analysis; X-Ray and NMR
Focus for today's presentation: 7 applications in Computational Chemistry and Bioinformatics

33 Summary of Comparison Platforms
Codename: Woodcrest vs. Clovertown
Processor: Intel® Xeon® Processor 5160 vs. Intel® Xeon® Processor X5355
Processor frequency: 3.0 GHz vs. 2.67 GHz
L2 cache: 4 MB vs. 2x4 MB
FSB frequency: 1333 MHz (both)
Cores / sockets / cores per socket: 4/2/2 vs. 8/2/4
Disk: 4-disk RAID0, 270GB (both)
RAM: 16GB (both)

34 Intel relative performance for Life Sciences apps
Clovertown relative performance compared to Woodcrest (one thread per core): Clovertown is 34–70% better than Woodcrest across the Computational Chemistry and Bioinformatics applications. Higher is better. Source: Intel internal measurement. (Standard Intel performance disclaimer as on slides 27–30; other brands and names may be claimed as the property of others.)
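To put the 34–70% range in context (my arithmetic from the slide 33 platform table, not from this slide): at one thread per core, the Clovertown system runs twice as many threads at a slightly lower clock, so its throughput ceiling relative to Woodcrest is roughly

```latex
\frac{8 \times 2.67\,\text{GHz}}{4 \times 3.0\,\text{GHz}} \approx 1.78
```

i.e. up to about 78% more throughput if the applications were perfectly core-bound and scaled linearly; the measured 34–70% gains sit below that ceiling, as expected for codes that also stress memory and the shared front-side bus.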

35 Gaussian* and GAMESS* relative performance
Clovertown relative performance compared to Woodcrest (one thread per core). Gaussian* (Gaussian, Inc.), version 03-D.01; GAMESS* (Iowa State University), version 12 Apr 2006. Higher is better; numbers on bars are elapsed times in seconds. Source: Intel internal measurement. (Standard Intel performance disclaimer as on slides 27–30.)

36 Amber* and GROMACS* relative performance
Clovertown relative performance compared to Woodcrest (one thread per core). Amber* (UCSF), version 9; GROMACS* (Groningen University), version 3.3. Higher is better; numbers on bars are elapsed times in seconds. Source: Intel internal measurement. (Standard Intel performance disclaimer as on slides 27–30.)

37 Intel scalability assessment for Life Sciences apps
Parallel speedups vs. 1 thread for the Computational Chemistry and Bioinformatics applications. Higher is better. Source: Intel internal measurement. (Standard Intel performance disclaimer as on slides 27–30.)

38 Summary
On fully subscribed systems, Clovertown shows 34–70% better performance than Woodcrest
Clovertown and Woodcrest scalability on 2 and 4 cores is the same
Clovertown parallel speedups on 8 cores range from 5x to 7.8x (relative to 1 core)
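The speedup figures follow the standard definitions (not spelled out on the slide), with parallel efficiency defined as speedup divided by core count:

```latex
S(n) = \frac{T(1)}{T(n)}, \qquad E(n) = \frac{S(n)}{n};
\qquad S(8) = 7.8 \Rightarrow E(8) \approx 0.98, \qquad S(8) = 5 \Rightarrow E(8) \approx 0.63
```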

39 SURA Services – Our Track Record (Presentation for the SURA GRID for 2007)
Frank N Li (frankli@us.ibm.com), IBM Deep Computing, Americas

40 IBM SURA Install Team
IBM Architect and Solution Team: cross-brand architecture, solutioning, and support – Brent Kranendonk, Frank Li
IBM Cluster Enablement Team (CET): implementation of complex HPC systems, both pSeries and Linux x86 clusters, covering hardware (server, cluster, storage, tape) and software (OS, cluster management tools, scheduler, GPFS) – Glen Corneau, Steve Cary, Larry Cluck
IBM Advanced Technology Services (ATS) Team: system testing (HPL) and performance tuning – Joanna Wong
IBM Deep Computing Technical Team: assistance with migrating mission-critical applications and benchmarking – Carlos Sosa, et al.
IBM Grid Enablement for SURAgrid – Martin Maldonado and Chris McMahon
Successful SURA installations at Georgia State University (GSU) and Texas A&M (TAMU)

41 Cluster Enablement Team
The xSeries Linux Cluster Enablement Team (CET) is a full-service enablement team providing customers with direct access to IBM experts skilled in the implementation of Linux clustering hardware and software technologies. The following types of clustering engagements are provided by CET:
Pre-configuration and cluster burn-in at our manufacturing site or the customer's location
Integration with existing clusters and cluster software upgrades
Software installation, including OS, cluster management, file system, compilers, schedulers, or customer applications
Executing customer acceptance testing
Installing storage and GPFS front-ends
On-site project management
Customer training/education
Each CET project is professionally managed by a dedicated project manager who ensures efficient delivery and deployment of your cluster. Each offering includes rigorous burn-in testing. The staging facility is conveniently located near IBM's manufacturing site. CET offerings can be added to your next e1350 order using part number 26K7785 – 1 Day of CET Consult.

42 IBM SURA Collaborations
IBM CET installations
LSU/IBM performance work with the ADCIRC application
U. Miami/IBM commitment to work on a parallel version of ADCIRC
U. Miami/IBM commitment to optimize HYCOM and WAM
TAMU request for help on MM5 and WRF
Phil Bogden presented in the IBM booth at SC06

43 SURA Partnership with Intel and IBM (Presentation for the SURA GRID for 2007)
Mark Spargo (mark.e.spargo@intel.com), Intel/IBM Relationship Executive

44 Intel & IBM: Delivering Together
Proven track record of delivering innovative solutions
IBM is the fastest-growing Intel server vendor
IBM/Intel BladeCenter collaboration
Enterprise X-Architecture platform validation and development collaboration
Jointly delivering superior server solutions with exceptional price/performance
Collaboration spans design, development, manufacturing, and marketing & sales

45 IBM & Intel: Industry Collaboration
IDF San Francisco & Taiwan
Commitment to 4th-generation technology – supporting quad-core Xeon® MP
Geneseo: co-inventors
SMASH: founders
BladeCenter: products & openness
Virtualization: vConsolidate, others

