Published by Donald McLaughlin. Modified over 9 years ago.
1
Overview of Hitachi’s Super Technical Server SR8000
Yoshiro Aihara, Enterprise Server Division, Hitachi, Ltd.
March 2001
The Third International Workshop on Next Generation Climate Models
All Rights Reserved, Copyright (C) 2001, Hitachi, Ltd.
2
HITACHI Supercomputers
Chart: peak performance (FLOPS, 0.01G to 10T) against announcement year ('77 to '01).
Vector type: M-200H IAP, M-280H IAP, M-680 IAP, S-810 Series, S-820 Series, S-3000 Series
RISC parallel: SR2201 Series (Advanced RISC Parallel), SR8000
IAP: Integrated Array Processor
Annotations on the chart:
- Integrated Array Processor system
- First Japanese vector supercomputer
- Single CPU peak performance 3 GFlops
- First commercially available distributed memory parallel processor
- Single CPU peak performance 8 GFlops (fastest in the world)
- New concept machine for advanced HPC users (combination of parallel and vector)
3
Design Concept of SR8000
SR8000: New concept combining the advantages of a vector processor and a RISC parallel processor
Target of Design
- High single node performance
- High scalability
- Short development cycle
- Easy enhancement
Hitachi’s Solution
- RISC based processor (Hitachi developed)
- Multi-dimensional Crossbar Network (high-speed inter-node network)
Vector processor
- Vector processing
- Element parallel processing
SR8000 New Feature
- PVP feature
- COMPAS feature
- High memory throughput
PVP: Pseudo Vector Processing
COMPAS: Co-operative Micro-Processors in single Address Space
4
Basic Configuration of SR8000
Node
- High performance RISC microprocessors (Hitachi developed) with Pseudo-Vector Processing
- COMPAS: CO-operative Micro-Processors in single Address Space
- Main memory
- Network control
- SP
- PCI I/O adapter (Ethernet, ATM, HIPPI), RAID disk, I/O device
High-speed inter-node network: Multi-dimensional Crossbar Network
System control: SVP, MCD
SVP: SerVice Processor
MCD: Maintenance Console Device
SP: System Processor
5
Multi-dimensional Crossbar Network
Example: 8 x 8 x 2 (128 nodes) configuration
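A minimal sketch of how the 8 x 8 x 2 example could be enumerated, assuming nodes sit on a three-dimensional grid and that one crossbar links all nodes differing in a single coordinate. The function names and the hop-count rule are illustrative assumptions, not taken from the slides.

```python
from itertools import product

def node_coords(dims):
    """Enumerate grid coordinates for a crossbar of the given dimensions."""
    return list(product(*(range(d) for d in dims)))

def crossbar_hops(a, b):
    """Count crossbar traversals between nodes a and b.

    Assumption: one crossbar connects all nodes that differ in exactly one
    coordinate, so each differing coordinate costs one traversal.
    """
    return sum(1 for x, y in zip(a, b) if x != y)

dims = (8, 8, 2)
nodes = node_coords(dims)
print(len(nodes))                           # 128
print(crossbar_hops((0, 0, 0), (5, 0, 0)))  # 1
print(crossbar_hops((0, 0, 0), (7, 7, 1)))  # 3
```

Under this assumption the worst case between any two nodes equals the number of dimensions, regardless of how many nodes each dimension holds.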
6
SR8000 Hardware Specification
7
Pseudo-Vector Processing (PVP)
Problems of conventional RISC processors
- Performance drops in large-scale simulations because the data overflows the cache
- Sustained performance: under 10% of peak
Prefetch
- Reads data from main memory into the cache before calculation
- Accelerates sequential data access
Preload
- Reads data from main memory into the extended floating-point registers (160) before calculation
- Accelerates strided and indirectly addressed memory access
Data paths in the diagram: main memory to cache memory (prefetch), main memory to extended floating-point registers (preload), cache memory to registers (load), registers to FPU
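The three access patterns named on this slide can be sketched as plain loops. This is only an illustration of the loop shapes: Python exposes neither prefetch nor preload, and the function names are hypothetical.

```python
def sum_sequential(a):
    """Contiguous access: a[0], a[1], a[2], ... (the case prefetch targets)."""
    total = 0.0
    for i in range(len(a)):
        total += a[i]
    return total

def sum_strided(a, stride):
    """Strided access: a[0], a[stride], a[2*stride], ... (a case preload targets)."""
    total = 0.0
    for i in range(0, len(a), stride):
        total += a[i]
    return total

def sum_indirect(a, index):
    """Indirect access: a[index[0]], a[index[1]], ... (a case preload targets)."""
    total = 0.0
    for j in index:
        total += a[j]
    return total

a = [float(i) for i in range(16)]
print(sum_sequential(a))            # 120.0
print(sum_strided(a, 4))            # 0 + 4 + 8 + 12 = 24.0
print(sum_indirect(a, [3, 1, 15]))  # 19.0
```

In the strided and indirect loops, consecutive iterations touch addresses that are far apart, which is why bringing whole cache lines in ahead of time helps less there than in the sequential loop.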
8
COMPAS Feature of SR8000
COMPAS: CO-operative Micro-Processors in single Address Space
Program behavior: scalar part, start parallel instruction, loop part, end parallel instruction, scalar part
Hardware feature (COMPAS): one IP runs the scalar part while the other IPs wait for startup; in the loop part, every IP runs its share of the loop. The IPs reach MS through the SC.
IP: Instruction Processor
SC: Storage Controller
MS: Main Storage
- High-speed communication mechanism: a hardware communication mechanism enables high-speed processing across multiple processors
- Element-wise parallel processing of DO loops, as used in vector supercomputers, is carried out by multiple processors in a node (automatic element-wise parallelization within a node by the compiler)
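A rough sketch of the scalar part / loop part / scalar part pattern, using Python threads as stand-ins for the IPs of one node. The contiguous chunking, the default of 8 workers, and all function names are assumptions made for illustration; the slides do not say how the compiler distributes iterations, and Python threads show the structure only, not the speed.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_bounds(n, workers):
    """Split range(n) into `workers` contiguous chunks of near-equal size."""
    base, extra = divmod(n, workers)
    bounds, start = [], 0
    for w in range(workers):
        stop = start + base + (1 if w < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def axpy_chunk(alpha, x, y, start, stop):
    """Loop part: each worker updates its own slice of y in shared memory."""
    for i in range(start, stop):
        y[i] = alpha * x[i] + y[i]

def parallel_axpy(alpha, x, y, workers=8):
    """Start the workers, run the loop part in chunks, then join."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(axpy_chunk, alpha, x, y, s, e)
                   for s, e in chunk_bounds(len(x), workers)]
        for f in futures:
            f.result()  # join: wait for every chunk, re-raise any error
    return y

x = [float(i) for i in range(10)]
y = [1.0] * 10
print(chunk_bounds(10, 4))  # [(0, 3), (3, 6), (6, 8), (8, 10)]
print(parallel_axpy(2.0, x, y, workers=4))
```

Because every worker writes a disjoint slice of the same list, no locking is needed, which mirrors the single-address-space idea in the slide.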
9
Programming Models
10
Physical Data of SR8000
Example: 128 node configuration (G1 model)
- Power consumption: approx. 370 kVA
- Heat dissipation: approx. 330 kW
- Cooling air inlet temperature: 16 to 22 deg C
- Weight: approx. 15,000 kg
- Floor space: approx. 50 sq. meters (incl. service area)
- Footprint (128 nodes): approx. 3.3 m x approx. 8.0 m, height approx. 1.8 m
11
Overview of Software Products
OS: HI-UX/MPP (OSF/1 Microkernel based OS); NQS, BGT, DIFF, SFF, PFF
Language Processor: Optimizing FORTRAN77/90, HPF, Optimizing C, C++, OpenMP (Ver1)
Program Development
- Parallel Library: MPI-2, PVM, PARALLELWARE
- Numerical Calculation: MATRIX/MPP, MATRIX/MPP/SSS, MSL2
- Development Support: Symbolic Debugger (Optimizing C / FORTRAN90), Performance Monitor (for HP-UX)
Graphics
- GUI: X11R6, Motif1.2
- Graphic Library: GKS, PEX, PHIGS
Network: Ethernet / Fast Ethernet, GbE, HIPPI, ATM; TCP/IP, NFS V3, telnet, rlogin
12
Single UNIX System
- Single UNIX system: single system operation (file system, process control, network)
- Open system (standardized OS, compiler, network)
- Flexible system operation (partitioning operation, automatic operation)
- Scalable system (4 to 512 nodes)
Node: IPs, SP and main storage, with the COMPAS feature (CO-operative Micro-Processors in single Address Space)
- Micro-kernel (control of all IPs)
- UNIX (OSF/1) server (functional co-operation with other nodes)
SR8000 network: nodes are linked by the 3D-XB; HIPPI and Ethernet connect to SR2000 Series, 3500 Series, H-9000V Series and other vendors' systems (SGI, etc.), and to WS, PC, X terminal, console, graphic, disk and RAID
IP: Instruction Processor
3D-XB: 3-dimensional Cross-bar Network
13
Remote DMA Transfer
- Direct memory copy between user programs on different nodes, which minimizes OS overhead
Normal transfer: data moves from the program's send buffer through the OS, across the crossbar network, and through the OS into the receive buffer. Each node performs protocol processing, a context switch, interrupt handling, and a memory copy.
Remote DMA transfer: data moves from the send buffer across the crossbar network straight into the receive buffer.
- No buffering in kernel
- No OS system call
14
Examples of ISV Package
Structural Analysis: MSC.Nastran, MSC.Marc, LS-DYNA, PAM_CRASH, ABAQUS/Standard, ABAQUS/Explicit
Computational Fluid Dynamics: STAR-CD, PHOENICS, SCRYU, STREAM, FLUENT
Chemical Analysis: GAUSSIAN98, AMBER
Libraries: IMSL, NAG
Tools: AVS/EXPRESS, TotalView, Vampir
15
SR8000 Installation Sites (Example)
- Leibniz Rechenzentrum (Germany)
- High Energy Accelerator Research Organization
- University of Tokyo
- Japan Meteorological Agency
- University of Tokyo / Institute for Solid State Physics
- Tsukuba Advanced Computing Center - TACC / AIST
- Meteorological Research Institute
- Hokkaido University
- Institute of Statistical Mathematics
- HWW / Universität Stuttgart & DLR (Germany)
16
TOP500 Supercomputing Sites - November 3rd, 2000
Rmax/Rpeak > 75%: the Hitachi SR8000 sustains a high fraction of its peak performance.
17
TOP500 Supercomputing Sites - November 3rd, 2000
Rmax/Rpeak = 85.3% on SR8000/128
Rmax/Rpeak = 90.0% on SR8000-E1/80
Hitachi SR8000 works efficiently.
18
SR8000 F1 & G1 LINPACK Performance
- 313.30 Gflop/s on SR8000F1/32 with Nmax=65000
- 6% speed up: 331.50 Gflop/s on SR8000F1/32 with Nmax=84800
- 20% speed up: 398.50 Gflop/s on SR8000G1/32 with Nmax=84800
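The quoted speed-up percentages follow directly from the three LINPACK figures on this slide. A small check of that arithmetic, with a helper name (`speedup_percent`) chosen here for illustration:

```python
def speedup_percent(before, after):
    """Relative gain of `after` over `before`, in percent."""
    return (after / before - 1.0) * 100.0

f1_small = 313.30  # Gflop/s, SR8000F1/32, Nmax=65000
f1_large = 331.50  # Gflop/s, SR8000F1/32, Nmax=84800
g1_large = 398.50  # Gflop/s, SR8000G1/32, Nmax=84800

print(f"{speedup_percent(f1_small, f1_large):.1f}%")  # 5.8%, quoted as 6%
print(f"{speedup_percent(f1_large, g1_large):.1f}%")  # 20.2%, quoted as 20%
print(f"{speedup_percent(f1_small, g1_large):.1f}%")  # 27.2% combined
```

The first step comes from the larger problem size on the same model, the second from moving from model F1 to model G1 at the same problem size.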
19
NAS Parallel Benchmark (FT)
FT: a 3-D fast Fourier transform partial differential equation benchmark
Model G1 is 1.28 to 1.30 times faster than Model F1.
20
NAS Parallel Benchmark (MG)
MG: a simple 3-D multigrid benchmark
Model G1 is 1.22 to 1.24 times faster than Model F1.
21
MPI Ping-Pong Performance
Remote DMA (Direct Memory Access) is sender-driven and copies data directly from memory to memory. It provides high-speed inter-processor communication without redundant copying.
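A ping-pong test times a message bounced back and forth between two processes. As a hedged sketch, the timing can be turned into one-way time and bandwidth as follows; the function name and the sample numbers are hypothetical and are not SR8000 measurements.

```python
def pingpong_metrics(message_bytes, round_trips, elapsed_seconds):
    """Derive one-way time and bandwidth from a ping-pong timing.

    Each round trip carries the message once in each direction, so the
    one-way time is the elapsed time divided by 2 * round_trips.
    """
    one_way = elapsed_seconds / (2 * round_trips)
    bandwidth = message_bytes / one_way  # bytes per second
    return one_way, bandwidth

# Hypothetical timing, not an SR8000 measurement.
one_way, bandwidth = pingpong_metrics(1_000_000, 100, 0.2)
print(f"{one_way * 1e6:.0f} us one-way, {bandwidth / 1e6:.0f} MB/s")
```

For very small messages the one-way time approximates the latency; for large messages the bandwidth figure dominates, which is where avoiding redundant copies matters most.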