Reduced Communication Protocol for Clusters Clunix Inc. Donghyun Kim 2000.9.


1 Reduced Communication Protocol for Clusters Clunix Inc. Donghyun Kim 2000.9

2 Clunix Inc. Introduction
- Communication subsystem performance is determined by the following:
  - Transmission speed of the physical network
  - I/O handling capability
  - Overheads of the communication protocol
- Communication using traditional protocols is the bottleneck of parallel systems
  - Myrinet with TCP/IP is not fast.
  - Small-granularity or communication-dense apps show poor performance

3 Clunix Inc. Introduction – cont’d
- A high proportion of apps don’t need very complicated communication functions
  - Shown by practice and theoretical analysis

4 Clunix Inc. Overheads analysis of traditional protocols
- Overheads of traditional protocols
  - Time of context switching
  - Time of data copying
    - User space to system space, and between adjacent protocol layers
  - Time of data partitioning, reconstruction, and data analysis
  - Time of transmitting packet headers
  - Time of routing, connection maintenance, traffic control, error detection, recovery, and buffer management

5 Clunix Inc. Overheads analysis of traditional protocols - cont’d
- End-to-end latency L, bandwidth W modeling
  - Assumptions: homogeneous, low network traffic
  - T(n): n-byte transmission time
  - n_max: comm. subsystem max packet length
  - m: # of protocol layers
  - T_i(n): i-th protocol layer processing time (T_0(n): physical network transmission time)

6 Clunix Inc. Overheads analysis of traditional protocols - cont’d
- Model parameters
  - Context switching time
  - Memory bandwidth
  - Physical network transmission bandwidth (subscript 0)
  - Max packet length of the i-th layer
  - Packet header length of the i-th layer
  - n_i: data length of the i-th layer
  - Calling expense of the i-th layer (routing, traffic control, error detection, buffer management, connection maintenance)
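The equations for T(n) did not survive in this transcript, so the following is only a rough sketch of how a layered cost model could combine the parameters listed on this slide. The formula, the names `Layer` and `transfer_time`, and the units (bytes, seconds, bytes per second) are assumptions made for illustration, not the author's model.

```python
import math
from dataclasses import dataclass


@dataclass
class Layer:
    max_packet: int    # max payload bytes per packet at this layer
    header: int        # header bytes added to every packet
    call_cost: float   # per-packet processing cost in seconds (routing, checks, ...)


def transfer_time(n, layers, ctx_switch, mem_bw, net_bw):
    """Estimate the seconds needed to send n bytes down through `layers`.

    Assumed model: one context switch, then at every layer the data is
    split into packets, each packet pays call_cost, and the layer's
    output is copied once at memory bandwidth. Finally all bytes are
    sent on the wire at net_bw.
    """
    total = ctx_switch
    size = n
    for layer in layers:
        packets = max(1, math.ceil(size / layer.max_packet))
        size += packets * layer.header       # headers enlarge the data
        total += packets * layer.call_cost   # per-packet protocol work
        total += size / mem_bw               # copy to the next layer
    return total + size / net_bw
```

Calling `transfer_time` with a three-layer list and then with a single-layer list, keeping the other arguments fixed, shows how each extra layer adds copy time, header bytes, and per-packet work on top of the raw wire time.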

7 Clunix Inc. Overheads analysis of traditional protocols - cont’d
- Analytical & testing results

  Protocol layer | Analytical L (µs) | Analytical W (Mbps) | Testing L (µs) | Testing W (Mbps)
  TCP            | 1350              | 8.5                 | 1450           | 8.6
  UDP            | 1110              | 9.5                 | 1150           | 9.5
  DLPI           | 450               | 10.0                | 650            | 10.0

- Testing conclusions
  - Very large overhead when using the protocol layers above IP
  - Memory-to-memory copying cannot be neglected
    - If the transmission bandwidth is the same as the memory bandwidth, the data copying term (n_{i+1} / memory bandwidth) becomes a bigger problem
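The point about data copying can be checked with simple arithmetic: copying n bytes takes n divided by the memory bandwidth, and sending them takes n divided by the network bandwidth, so when the two bandwidths are equal every extra copy costs as much as the transmission itself. A small sketch, where the function name and the bandwidth figures are chosen purely for illustration:

```python
def copy_overhead_ratio(copies, mem_bw, net_bw):
    """Time spent copying divided by time spent on the wire."""
    # copy time = copies * n / mem_bw and wire time = n / net_bw,
    # so the message size n cancels out of the ratio
    return copies * net_bw / mem_bw


# Network much slower than memory: two copies are almost free
slow_net = copy_overhead_ratio(copies=2, mem_bw=100.0, net_bw=1.0)

# Network as fast as memory: two copies cost twice the wire time
fast_net = copy_overhead_ratio(copies=2, mem_bw=100.0, net_bw=100.0)
```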

8 Clunix Inc. Design Strategies for RCP
- Support reliable, synchronous, and asynchronous communications
- Implement reliable broadcast and multicast based directly on the physical layer
- Place the protocol below the IP layer
  - Above the physical or data link layer
- Avoid data copying as far as possible
- If possible, avoid buffer management by using hardware buffering
- Run the protocol entirely in user space
  - In the form of libraries
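One way to picture the "avoid data copying" strategy in a user-space library is to hand out views into the caller's buffer instead of copies. The sketch below uses Python's built-in `memoryview` for this. It is an illustration only, not how RCP itself is implemented, and the `fragments` helper and the 1024-byte fragment size are hypothetical.

```python
def fragments(buf, max_len):
    """Yield zero-copy slices of buf, each at most max_len bytes long."""
    view = memoryview(buf)
    for start in range(0, len(view), max_len):
        # Slicing a memoryview creates a new view, not a copy of the bytes
        yield view[start:start + max_len]


payload = bytearray(b"x" * 2500)
parts = list(fragments(payload, 1024))

payload[0] = ord("y")      # change the original buffer...
first_byte = parts[0][0]   # ...and the first fragment sees the change
```

Because every fragment shares memory with `payload`, splitting a message into packets costs no extra copy regardless of the message size.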

9 Clunix Inc. Implementation of RCP
- OSI-DLPI version
  - Standard, physical-device-independent data link layer interface
    - Uniform programs can be written for different machines and network devices
- Myrinet version
- Provides a user interface like the TCP socket

10 Clunix Inc. Implementation of RCP – cont’d
- RCP supports unicast, broadcast, and multicast
- RCP addressing
  - Unique source/destination identified by hostname + port number
  - Static address configuration
- Supports heterogeneous machines
- No connection maintenance or error detection
  - Assumes that the underlying network is reliable
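Static address configuration with hostname + port endpoints could look like a fixed lookup table that is loaded once at start-up. The following is a minimal sketch under that assumption. The host names, link-layer addresses, and the `Endpoint` and `resolve` names are invented for illustration and do not come from the slides.

```python
from typing import NamedTuple


class Endpoint(NamedTuple):
    host: str
    port: int


# Hypothetical static table: host name -> link-layer address
STATIC_HOSTS = {
    "node01": "00:60:dd:00:00:01",
    "node02": "00:60:dd:00:00:02",
}


def resolve(ep):
    """Map an endpoint to (link address, port) without any run-time lookup protocol."""
    try:
        return STATIC_HOSTS[ep.host], ep.port
    except KeyError:
        raise LookupError(f"host {ep.host!r} is not in the static table") from None
```

With a table like this, a sender needs no name service or address resolution traffic: `resolve(Endpoint("node02", 7000))` is a plain dictionary lookup.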

11 Clunix Inc. Implementation of RCP – cont’d
- Sequencing control, traffic control
  - Sliding-window algorithm + selective retransmission
  - Window size is adjusted according to retransmission frequency
    - Fast-Adapt and Slow-Recover algorithm
  - Very efficient traffic control
- Data partitioning and packaging algorithm
  - Almost no data copying; works in user space
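The slides do not spell out the Fast-Adapt and Slow-Recover algorithm. One plausible reading, sketched below, is that the window shrinks quickly when retransmissions become frequent and grows back slowly while transmission is clean. The class name, the halving and add-one rules, and the 5% threshold are all assumptions for illustration, not the documented RCP behavior.

```python
class WindowController:
    """Adjust a sliding-window size from the observed retransmission rate."""

    def __init__(self, size=8, min_size=1, max_size=64, threshold=0.05):
        self.size = size
        self.min_size = min_size
        self.max_size = max_size
        self.threshold = threshold   # tolerated fraction of retransmitted packets

    def update(self, sent, retransmitted):
        """Call once per round with packet counts; returns the new window size."""
        rate = retransmitted / sent if sent else 0.0
        if rate > self.threshold:
            # Assumed "fast adapt": halve the window when losses are frequent
            self.size = max(self.min_size, self.size // 2)
        else:
            # Assumed "slow recover": grow by one packet per clean round
            self.size = min(self.max_size, self.size + 1)
        return self.size
```

A sender would call `update` after each round of acknowledgements and then keep at most `size` unacknowledged packets in flight, retransmitting only the packets reported missing.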

12 Clunix Inc. RCP testing results
[Figures: bandwidth (W) and latency (L)]

13 Clunix Inc. Conclusions and future issues
- RCP design considerations
  - How to reduce the overheads
    - Over-complicated protocol processing
    - Context switching
    - Overhead of data copying
  - How to use the transmission control functions supported by hardware
    - To reduce the protocol processing
- Future work
  - To guarantee the quality of the communication.

