High Performance Computing using Portals over TNet
by Adrian Riedo, Summer 2000
Scalable Systems Lab / The University of New Mexico

Slide 1 - Scalable Systems Lab / The University of New Mexico, © Summer 2000 by Adrian Riedo
High Performance Computing using Portals over TNet
by Adrian Riedo, Summer 2000

Slide 2 - PoT Project
The Portals over TNet Project
  Introduction
  Analysis: Portals 3, TNet
  Design: Case study, Concepts
  Implementation: Development System, TNAL
  Conclusion

Slide 3 - PoT Introduction
About High Performance Computing (HPC)
  Supercomputers -> Superclusters
  Message passing (e.g. MPI)
  Data movement layer
  OS bypass (avoid kernel calls)
  Zero-copy (network bandwidth -> memory bandwidth)
  Application bypass (large transfers without intervention by the application)
  High-performance network
Design rules on all levels
  Scalability
  Low latency, high bandwidth
  Portability, platform independence (host & network)
Goal of the PoT project
  Evaluation of a first implementation of Portals on TNet

Slide 4 - PoT Analysis Portals 3
CPlant environment at Sandia National Laboratories, Albuquerque
Diagram labels (module stack):
  Application (MPI) on Portals
  IO (temp)
  IP: myrip.mod
  Portals 2: portals.mod
  Portals 3: p3.mod
  RTS / CTS: rtscts.mod
  Firmware (Myrinet): rtsmcp

Slide 5 - PoT Analysis Portals 3
Portals 3 Architecture, Network Abstraction Layer
Diagram labels:
  Layers: Application (USER), Driver (OS), Firmware (NIC)
  API: api-p30/*
  Library: lib-p30/*
  nal.c, lib_nal.c...
  to NIC / wire

Slide 6 - PoT Analysis Portals 3
Portals 3 Structures, Addressing
Diagram labels:
  Application, Library
  Portal Table
  Match List (me)
  Memory Descriptor List (md)
  Event Queue
  Memory Region
  Access Control Lists
  Network interfaces
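The structures named on this slide can be illustrated with a small C sketch. It is only a simplified model, not the lib-p30 code: all type and function names (`mem_desc_t`, `match_entry_t`, `portal_table_t`, `portal_lookup`) and the table size are hypothetical, and access control lists, event queues, and process-ID matching are left out. It assumes the usual Portals 3 rule that an incoming message matches an entry when its match bits agree with the entry's match bits on every bit not masked by the ignore bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified types; NOT the real lib-p30 definitions. */
typedef struct mem_desc {
    void            *start;   /* start of the memory region */
    size_t           length;  /* size of the region in bytes */
    struct mem_desc *next;    /* next descriptor on the same match entry */
} mem_desc_t;

typedef struct match_entry {
    uint64_t            match_bits;   /* bits an incoming message must carry */
    uint64_t            ignore_bits;  /* bits excluded from the comparison */
    mem_desc_t         *md_list;      /* memory descriptor list */
    struct match_entry *next;         /* next entry in the match list */
} match_entry_t;

#define PORTAL_TABLE_SIZE 8           /* assumed size, for illustration only */

typedef struct {
    match_entry_t *match_list[PORTAL_TABLE_SIZE];
} portal_table_t;

/* Walk the match list at one portal index and return the first memory
 * descriptor that is large enough for the message, or NULL if none fits. */
mem_desc_t *portal_lookup(const portal_table_t *tbl, unsigned index,
                          uint64_t mbits, size_t len)
{
    assert(tbl != NULL);
    if (index >= PORTAL_TABLE_SIZE)
        return NULL;

    for (match_entry_t *me = tbl->match_list[index]; me != NULL; me = me->next) {
        /* Match only if all non-ignored bits are equal. */
        if (((mbits ^ me->match_bits) & ~me->ignore_bits) != 0)
            continue;
        for (mem_desc_t *md = me->md_list; md != NULL; md = md->next) {
            if (len <= md->length)
                return md;
        }
    }
    return NULL;
}
```

A caller would fill one slot of the table with a match entry and then call `portal_lookup(&tbl, index, match_bits, length)` for each incoming header; a NULL result means the message would be dropped in this simplified model.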

Slide 7 - PoT Analysis TNet
TNet environment, Swiss-Tx
Diagram labels (module stack):
  Application (MPI) on FCI
  FCI: tnet.mod
  Firmware (TNet): cc_b35_lc_c35
  irq handler
  kernel thread
  tnet.c...

Slide 8 - PoT Analysis TNet
TNet OSI Specification
Layer specific Communication Types (table not recoverable from the transcript)
  * corresponds to MPI Broadcast in 1 MPI Group
  ** specially for SMP Nodes (2, 4 Processors)
  BC is a particular case of MC
TM Example (4 Processes)
  Diagram labels: Sender Process, Receiving Node, Process 1, Process 2, Process 3, Process 4, TM, DM

Slide 9 - PoT Analysis TNet
TNet Address Translation
Legend:
  CMB: Contiguous Memory Block
  VCA: Virtual Communication Address
  pg: Page
Diagram labels:
  Network virtual communication address space (VCA)
  Host memory address space (CMB, pg)
  offset
  pagetable
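The diagram labels (VCA, pagetable, offset, host memory) suggest a conventional page-table lookup. The following C sketch shows that general idea only; it is not taken from the TNet firmware or driver. The names `pagetable_t` and `vca_to_host`, the 32-bit VCA width, and the 8 KB page size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size of 8 KB; the slide does not state the real value. */
#define PAGE_SHIFT 13u
#define PAGE_SIZE  (1u << PAGE_SHIFT)
#define PAGE_MASK  (PAGE_SIZE - 1u)

/* Hypothetical page table: one host base address per page of VCA space. */
typedef struct {
    const uint64_t *page_base;  /* host memory address of each page */
    size_t          n_pages;    /* number of valid entries */
} pagetable_t;

/* Translate a virtual communication address (VCA) into a host memory
 * address. Returns 0 on success, -1 if the VCA lies outside the table. */
int vca_to_host(const pagetable_t *pt, uint32_t vca, uint64_t *host)
{
    assert(pt != NULL && host != NULL);

    size_t   page   = vca >> PAGE_SHIFT;  /* upper bits select the page */
    uint32_t offset = vca & PAGE_MASK;    /* lower bits are the offset  */

    if (page >= pt->n_pages)
        return -1;

    *host = pt->page_base[page] + offset;
    return 0;
}
```

With a table whose entries are consecutive host pages, the VCA range maps onto one contiguous memory block; with arbitrary entries, the same lookup can present scattered host pages as a contiguous range in VCA space.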

Slide 10 - PoT Analysis TNet
TNet PCI Adapter Specification
Block diagram labels: CC, PLX, LC, GBE, SDRAM, SRAM, TX-FIFO, RX-FIFO, PCI
CC: Communication Controller
  Lucent Orca 3T80-5 @ 31.25 MHz
  Tx, Rx Unit / CRC / DMW / Flags
  S(D)RAM Controllers
  Rx, Tx FIFO Interfaces
  PLX Controller
LC: Link Controller
  Lucent Orca 3T30-6 @ 62.5 MHz
  Handshake Process
  TNet retransmission protocol
  Buffer: Out 3 Packets, In 1 Packet
  CRC Check
GBE: Gigabit Ethernet
  Vitesse VSC 7211, 62.5 MHz
SDRAM
  Page Table
  Index to Address Translation Table (for TM mode)
  Std 16 MB or more
SRAM
  ID Validation Table, 128 x 18 Bit

Slide 11 - PoT Design case study
Portals over TNet case study
Hardware Solution
  Library in hardware on NIC
  Big FPGA and fast RAM required for optimal solution
  Special design tools
  Long implementation time
  In-depth knowledge of Portals 3 and TNet required
Software Solution
  Library still in OS
  Uses TNet firmware & driver
  Requires knowledge of the Portals NAL and the TNet driver
  Performance workaround: pagetable as "memory descriptor"
-> First learn on the software level, then approach step by step

Slide 12 - PoT Design Concepts
Driver architecture & modules (Portals -> Myrinet / FCI -> TNet)
Diagram labels:
  Application (MPI) on Portals
  Application (MPI) on FCI
  IO (temp)
  IP: myrip.mod
  Portals 2: portals.mod
  Portals 3: p3.mod
  RTS / CTS: rtscts.mod
  Firmware (Myrinet): rtsmcp
  FCI: tnet.mod
  Firmware (TNet): cc_b35_lc_c35
  myrnal -> forward, PTL_IFACE_MYR, lib-p30, lib_myrnal
  irq handler, kernel thread, tnet.c...

Slide 13 - PoT Design Concepts
Driver architecture & modules (Portals, FCI -> TNet)
Diagram labels:
  Application (MPI) on Portals
  Application (MPI) on FCI
  IO (temp)
  IP: myrip.mod
  P3oT: p3ot.mod
  Portals 2: portals.mod
  RTS / CTS: rtscts.mod
  Firmware (Myrinet): rtsmcp
  FCI
  Firmware (TNet): cc_b35_lc_c35
  tnal -> forward, PTL_IFACE_T, lib-p30, lib_tnal
  tnet.c...

Slide 14 - PoT Design Concepts
Dataflow in the P3oT module (CMB & IRQ - Large Msgs using DMA)
Diagram labels:
  Application (MPI) on Portals
  P3oT: p3ot.mod
  tnal -> forward, PTL_IFACE_T, lib-p30, lib_tnal
  tnet.c...
  CMB
  Firmware (TNet): cc_b35_lc_c35 - Pagetable set up for virtual CMB
Properties:
  No OS bypass
  No zero-copy
  DMA

Slide 15 - PoT Implementation Development System
System
  2 Alpha workstations 164LX
  21164 Alpha processor, 320 MB RAM
  100Base-T Ethernet
Mini CPlant
  64 Bit / 33 MHz Myrinet
  Myrinet 8 port switch
TNet
  32 Bit TNet NIC, 16 MB RAM
  No switch
OS
  Tru64, Red Hat Linux (dual boot)

Slide 16 - PoT Implementation TNAL
TNAL (network abstraction layer for Portals over TNet)
Diagram labels: lib-p30/*, lib_tnal.c..., tnet.c, FCI, ioctl, CMB, gcw

TNET_ioctl in tnet.c
    case TNET_PTL_DISPATCH:
        copy_from_user(..);
        lib_dispatch(..);
        copy_to_user(..);
        ..
        break;

From lib_dispatch: Do_PtlPut in wrap.c
    ..   for the PtlPut

tnal_send in lib_tnal.c
    ..
    memcpy(..);          //for header
    copy_from_user(..);  //for data
    ..                   //send data using CMB, DMA
    ..                   // remote IRQ on last packet
    lib_finalize(..);

Incoming message: TNET_Interrupt in tnet.c
    ..
    memcpy(..);          //for header
    lib_parse(.."header"..)

From TNET_Interrupt: lib_parse in ~.c, parse_put in ~.c

From parse_put: tnal_rcv in tnal.c
    copy_to_user(..);
    lib_finalize(..);
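The send path outlined on this slide (copy the header, copy the user data, send through the CMB, raise a remote IRQ on the last packet) can be mimicked in user space. The sketch below is a simulation under stated assumptions, not the P3oT module: `sim_nic_t`, `sim_tnal_send`, the header size, and the packet payload size are all hypothetical, and plain `memcpy` stands in for the kernel's `copy_from_user`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HDR_SIZE    16  /* assumed header size in bytes */
#define PKT_PAYLOAD 64  /* assumed payload bytes per packet */

/* Hypothetical stand-in for the NIC: a contiguous memory block (CMB)
 * plus counters that record what the simulated hardware did. */
typedef struct {
    unsigned char *cmb;          /* staging buffer for outgoing data */
    size_t         cmb_size;     /* capacity of the CMB in bytes */
    size_t         packets_sent; /* packets handed to the wire */
    int            irq_raised;   /* set when the last packet goes out */
} sim_nic_t;

/* Simulated send: stage header and data in the CMB, then emit packets.
 * Returns 0 on success, -1 if the message does not fit in the CMB. */
int sim_tnal_send(sim_nic_t *nic, const void *hdr,
                  const void *user_data, size_t len)
{
    assert(nic != NULL && hdr != NULL);
    if (len > nic->cmb_size || HDR_SIZE > nic->cmb_size - len)
        return -1;

    memcpy(nic->cmb, hdr, HDR_SIZE);                   /* header copy */
    if (len > 0)
        memcpy(nic->cmb + HDR_SIZE, user_data, len);   /* stands in for copy_from_user */

    size_t total   = HDR_SIZE + len;
    size_t packets = (total + PKT_PAYLOAD - 1) / PKT_PAYLOAD;

    for (size_t i = 0; i < packets; i++) {
        nic->packets_sent++;          /* a real driver would start DMA here */
        if (i == packets - 1)
            nic->irq_raised = 1;      /* remote IRQ requested on last packet */
    }
    return 0;
}
```

The two copies in this path (header and data into the CMB) are what the deck refers to as "no zero-copy": the data is staged in kernel-owned memory before the hardware moves it.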

Slide 17 - PoT Implementation Milestones
  Setting up PC / Development System
  TNet Documentation / VHDL & C Sources -> Presentation
  Getting familiar with Portals
  Experimenting with mpich
  Install Myrinet, TNet, FCI, Portals on Tru64 / Linux
  Experimenting with modules & test programs for TNet
  Presentations / Website
  PoT Design
  Writing hybrid module (P3oT)
  Debugging
  Benchmarking
  Report

Slide 18 - PoT Prospects
Dataflow in the P3oT module (Pagetable - Small Msgs using PIO)
Diagram labels:
  Application (MPI) on Portals
  P3oT: p3ot.mod
  tnal -> forward, PTL_IFACE_T, lib-p30, lib_tnal
  tnet.c...
  CMB
  LIB
  Firmware (TNet): cc_b35_lc_c35 - Pagetable points to Appl. Space
Properties:
  OS bypass
  Zero-copy
  PIO
  Dynamic pagetable

Slide 19 - PoT Conclusion
Conclusions
  CMB Software Solution: approx. 80 µs latency (first version)
  Not the best solution, but learned a lot
  Software solution profits from CRC and retransmit on card
  TNAL lays the basis for further research
  -> Implementation using Pagetable & PIO for OS Bypass
Experience
  Analysis and design take a lot of time (important)
  Wide knowledge needed
  Kernel programming is not trivial
  Long debugging time compared to applications

Slide 20 - PoT Project
The PoT website at http://hpc.fribyte.ch

