Presentation is loading. Please wait.

Presentation is loading. Please wait.

All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. SR8000 Concept Tim Lanfear Hitachi Europe GmbH.

Similar presentations


Presentation on theme: "All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. SR8000 Concept Tim Lanfear Hitachi Europe GmbH."— Presentation transcript:

1 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. SR8000 Concept Tim Lanfear Hitachi Europe GmbH. t-lanfear@hpcc.hitachi-eu.co.uk

2 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 2 SR8000 Model Range

3 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 3 SR8000 Appearance

4 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 4 Compact Model

5 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 5 Vector vs SMP vs MPP FeatureVectorSMPMPP Single Node Performance Scalability Programming Effort Development Cycle Energy Requirements

6 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 6 System Architecture Cross-bar Inter-node Network Node (PRN) Node (ION) CPU System Control Main Memory Network Control Ether, ATM, HIPPI RAID Disk PCI Service Processor Console

7 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 7 Programming Models

8 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 8 CPU Architecture 16 bytes/cycle memory BW 128 Kbyte L1 cache Pre-fetch and pre-load instructions 160 f.p. registers 2 f.p. pipelines 4 flops/cycle Main Memory Pre-fetch Pre-load Cache Floating Point Registers Load Arithmetic Unit Memory Switch

9 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 9 Slide Window Registers Registers for all instructions Registers for extended instructions only Fixed registers: 4, 8, 16, 32 (16 illustrated) Fixed + sliding = 128 PhysicalSliding part: 0 to 127 Global part: 128 to 159 Logical 0 to 1532 to 125 16 to 31 126-7 0 to 1532 to 123 16 to 31 124-7 Base=2 Base=4

10 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 10 Instruction Set Extensions Load and store with extended registers Floating point arithmetic with extended registers Slide window control Pre-fetch and pre-load Thread start-up and finish Predicate instructions

11 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. SR8000 Programming Instruction Level Parallelism (Pseudo-vector Processing: PVP)

12 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 12 Pre-fetch and Pre-load Pre-fetch: load cache line from memory to cache Pre-load: load one word from memory to register 16 streams Main Memory Pre-fetch Pre-load Cache Floating Point Registers Load Arithmetic Unit Memory Switch

13 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 13 Pre-fetch Iteration PFLatencyUse dataLD 1 Use dataLD Use dataLD Use dataLD 2 3 4 PFLatencyUse dataLD Use dataLD 5 6 Pre-fetch 128 bytes to cache Follow by LD to register

14 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 14 Pre-load PLLatencyUse data 1 LatencyUse data LatencyUse data LatencyUse data LatencyUse data LatencyUse data 2 3 4 5 6 PL Pre-load 8 bytes to register LD not required Iteration

15 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 15 Software Pipelining I=1I=2I=3 No SWPL I=1 I=2 I=3 Infinite resource Recurrence a= =aa= =aa= =aI=1 I=2 I=3 Resources: registers, f.p. units, instruction issue, memory bandwidth etc I=1 I=2 I=3 Finite resource Initiation interval

16 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 16 VST Pseudo-vector Processing A(:) = A(:) + N Vector Pseudo-Vector PFLatLD+ST LD+ST LD+ST LD+ST PFLatLD+ST LD+ST LD+ST VADD VLD

17 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 17 Effect of PVP Dot product: S = A(1:N)*B(1:N)

18 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. SR8000 Programming Multi-thread Parallelism (Cooperative Microprocessors in a Single Address Space: COMPAS)

19 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 19 COMPAS Multi-dimensional Crossbar Network Node thread process.. IP Pre-fetch Load Arithmetic Store Branch Pre-fetch Load Arithmetic Store Branch IP Automatic Parallel Processing Node COMPAS (Start Inst.) IP: Instruction Processor Node Main memory (shared) IP COMPAS ( End Inst.) COMPAS: Co-operative Micro-Processors in single Address Space

20 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 20 Loop Part IP (waiting for startup) Hardware Support Software IP Scalar Part Start Parallel Inst. Loop Part End Parallel Inst. Hardware Support SC MS IP:Instruction Processor SC:Storage Controller MS:Main Storage Barrier Synchronization Mechanism Scalar Part Loop Part IP (waiting for startup) Loop Part IP (waiting for startup) IP

21 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 21 Loop Parallelisation DO i =1,N A(i)=B(i)+C(i) ENDDO [fork] DO i =start,end A(i)=B(i)+C(i) ENDDO [join] i loop parallelisation DO j=1,M W(j)=C(j)+D(j) DO i=1,N A(i,j)=B(i,j)+W(j) ENDDO [fork] DO j=start,end W(j)=C(j)+D(j) DO i=1,N A(i,j)=B(i,j)+W(j) ENDDO [join] j loop parallelisation

22 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 22 Loop Parallelisation DO j=2,M DO i=1,N A(i,j) = A(i,j-1)+A(i,j) ENDDO [fork] DO j=2,M DO i=start,end A(i,j) = A(i,j-1)+A(i,j) ENDDO [join] i loop parallelisation DO i=1,N A(i) = B(i)+C(i) ENDDO DO j=1,M D(j) = E(j)*F(j) ENDDO [fork] DO i=start,end A(i) = B(i)+C(i) ENDDO DO j=start,end D(j) = E(j)*F(j) ENDDO [join] i loop parallelisation j loop parallelisation

23 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 23 Loop Parallelisation DO i = 1,N CALL sub(a,b,i) ENDDO *poption parallelforce parallelisation *poption tlocal(a,b,i)thread local variables [fork] DO i = 1,N CALL sub(a,b,i) ENDDO [join]

24 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 24 Section Parallelisation *poption parallel_sections *poption section CALL SUB1 *poption section CALL SUB2 *poption end_parallel_sections Execution of independent blocks of code in different threads (sections are always single threaded)

25 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 25 Effect of COMPAS Dot product: S = A(1:N)*B(1:N)

26 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. SR8000 Programming Message Passing (MPI)

27 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 27 Remote DMA Receive Buffer Program OS memory copy Send Buffer Crossbar Network Normal Transfer Protocol Processing Context Switch Interrupt Handling Node memory copy Remote DMA Transfer No Buffering in Kernel No OS System Call data

28 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 28 Inter-node MPI Cross-bar Inter-node Network One MPI process per node; RDMA transfer possible MPI

29 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 29 Intra-node MPI Cross-bar Inter-node Network One MPI process per IP; RDMA transfer not possible MPI Shared memory MPI Shared memory MPI Shared memory

30 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 30 MPI Ping-pong

31 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 31 SR8000 Parallelism Instruction level (PVP) Multi-thread (COMPAS) Node 1 Node 2 Message passing (MPI)

32 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. SR8000 Programming Memory Architecture

33 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 33 Memory Hierarchy fp registers (128+32) L1 cache (128 Kb 4-way) Store buffer (16 entries) Switch Memory (2 to 16 Gb, 512 banks) 16 b/cyc 32 b/cyc Other IPs

34 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 34 Address Translation Virtual page number Page offset Page table Main memory Virtual address Cache recently used entries of page table in TLB

35 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 35 Large TLB Virtual page number Page offset Large page table Main memory Virtual address Large TLB covers whole address space with 256 entries. Page size 16Mb to 128 Mb

36 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 36 Memory Address Hashing 32 33 34 3536 37 38 3940 41 42 43 44 45 46 4748 49 50 5152 53 54 5556 57 58 5960 61 62 63 xor memory controller data path xor storage controller data path

37 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 37 Key Features of SR8000 High performance RISC CPU with PVP High performance node with COMPAS High sustained memory bandwidth High scalability with fast network Low energy and space requirements

38 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. SR8000 Programming Performance

39 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 39 Top 500 – June 2000 ManufacturerComputerR max Installation Site 1IntelASCI Red2379Sandia National Lab 2IBMASCI Blue Pacific2144Lawrence Livermore National Lab 3SGIASCI Blue Mountain1608Los Alamos National Lab 4IBMSP Power3 375 MHz1417NAVOCEANO 5HitachiSR8000-F1/1121035LRZ Munich 6HitachiSR8000-F1/100917KEK Tsukuba 7Cray IncT3E/1200891US Government 8Cray IncT3E/1200891US Army HPC Research Center 9HitachiSR8000/128873University of Tokyo 10Cray IncT3E/1200815US Government

40 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 40 Linpack Performance 80.25 (8 nodes) 159.51 (16 nodes) 313.32 (32nodes) 917.15 (100 nodes) 605.30 (64 nodes) 577.49 (60 nodes) 0 100 200 300 400 500 600 700 800 900 1000 020406080100120 Number of nodes GFlops 10.88 Gflops on 1 node 20.50 Gflops on 2 nodes 40.76 Gflops on 4 nodes 10.88 Gflops on 1 node 20.50 Gflops on 2 nodes 40.76 Gflops on 4 nodes

41 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 41 NAS Parallel FT 5.14 7.92 26.16 14.01 8.31 27.95 14.84 5.39 28.78 15.10 8.37 0 5 10 15 20 25 30 35 1248 Number of Nodes GFlops ClassA ClassB ClassC

42 All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. 42 NAS Parallel CG


Download ppt "All Rights Reserved. Copyright © 2000 Hitachi Europe Ltd. SR8000 Concept Tim Lanfear Hitachi Europe GmbH."

Similar presentations


Ads by Google