
1  100 Gb/s InfiniBand Transport over up to 100 km
Klaus Grobe and Uli Schlegel, ADVA Optical Networking, and David Southwell, Obsidian Strategics
TNC2009, Málaga, June 2009

2  Agenda
- InfiniBand in Data Centers
- InfiniBand Distance Transport

3  InfiniBand in Data Centers

4  Connectivity performance
- Bandwidth requirements follow Moore's Law (the number of transistors on a chip, doubling roughly every 18 months)
- So far, both Ethernet and InfiniBand have outperformed Moore's growth rate
[Chart: fiber link capacity (100 Mb/s to 10 Tb/s) vs. time (1990 to 2010) for WDM, Fibre Channel, Ethernet and InfiniBand against Moore's Law; InfiniBand roadmap 2008 to 2011 showing per-direction bandwidth from 10 to 640 Gb/s for QDR, EDR and HDR in x1/x4/x12 configurations. Adapted from: Ishida, O., "Toward Terabit LAN/WAN" Panel, iGRID2005.]

5  InfiniBand Data Rates

                            IBx1        IBx4       IBx12
  Single Data Rate, SDR     2.5 Gb/s    10 Gb/s    30 Gb/s
  Double Data Rate, DDR     5 Gb/s      20 Gb/s    60 Gb/s
  Quad Data Rate, QDR       10 Gb/s     40 Gb/s    120 Gb/s

IB uses 8B/10B coding; e.g., IBx1 DDR has 4 Gb/s throughput (see the sketch below).
Copper:
- Serial (x1, not much seen on the market)
- Parallel copper cables (x4, x12)
Fiber optic:
- Serial for x1 and SDR x4 LX (serialized interface)
- Parallel for x4, x12
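To make the coding overhead explicit, here is a minimal sketch (not part of the original slides) that derives the payload rates above from the 2.5 Gb/s SDR lane rate and the 8/10 efficiency of 8B/10B coding; the function name and structure are illustrative only.

```python
# Minimal sketch: effective InfiniBand throughput after 8B/10B line coding,
# which carries 8 data bits in every 10 transmitted bits.
SDR_LANE_GBPS = 2.5  # signaling rate per lane in Gb/s

def effective_gbps(rate_multiplier: int, lanes: int) -> float:
    """Raw signaling rate x number of lanes x 8/10 coding efficiency."""
    return SDR_LANE_GBPS * rate_multiplier * lanes * 8 / 10

# IBx1 DDR: 2.5 Gb/s x 2 (DDR) x 1 lane x 0.8 = 4 Gb/s, matching the slide.
print(effective_gbps(rate_multiplier=2, lanes=1))   # 4.0
# IBx4 QDR: 2.5 x 4 x 4 x 0.8 = 32 Gb/s of payload on a 40 Gb/s link.
print(effective_gbps(rate_multiplier=4, lanes=4))   # 32.0
```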

6  Converged Architectures
SRP = SCSI RDMA Protocol
[Diagram: SCSI transport options beneath the operating system / application layer, ordered by latency and performance: iSCSI over TCP/IP (lossy), FCP carried via FCIP and iFCP (lossy), FCoE over DCB Ethernet (lossless), and SRP over InfiniBand (lossless).]

7  HPC Networks today
Typical HPC data center today:
- Dedicated networks / technologies for LAN, SAN and CPU (server) interconnect
- Consolidation required (management complexity, cables, cost, power)
Relevant parameters:
- LAN HBAs based on GbE/10GbE
- SAN HBAs based on 4G/8G-FC
- HCAs based on IBx4 DDR/QDR
[Diagram: server cluster with FC and GbE HBAs and IB HCAs attached to separate Ethernet LAN, FC SAN and InfiniBand fabrics.]

8  InfiniBand Distance Transport

9  Generic NREN
[Diagram: a large, dispersed metro campus or cluster of campuses with data centers; core (backbone) routers, Layer-2 switches and OXC / ROADM nodes provide the connection to the backbone (NREN) and dedicated point-to-point connections to large data centers.]

10  InfiniBand-over-Distance: difficulties and solution considerations
Technical difficulties:
- IB over copper: limited distance (<15 m)
- IB-to-XYZ conversion: high latency
- No IB buffer credits in today's switches for distance transport
- High-speed serialization and E-O conversion needed
Requirements:
- Lowest latency, and hence highest throughput, is a must
- Interworking must be demonstrated

11  InfiniBand Flow Control
- InfiniBand is credit-based per virtual lane (16 lanes)
- On initialization, each fabric end-point declares its capacity to receive data
- This capacity is described as its buffer credit
- As buffers are freed up, end-points post messages updating their credit status
- InfiniBand flow control happens before transmission, not after it: lossless transport
- Optimized for short signal flight time; small buffers are used inside the ICs, which limits the effective range to ~300 m
[Diagram: HCA A sends (1) from system memory, (2) across the IB link, (3) into system memory at HCA B, which (4) updates the credit.]
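To make the credit mechanism concrete, here is a minimal, hypothetical Python sketch (not from the presentation) of a credit-gated sender on one virtual lane: it may only transmit while it holds receive credits, and a credit is only restored after the receiver frees a buffer and its credit update has travelled back over the link. Class and method names are illustrative, not InfiniBand terminology.

```python
# Hypothetical sketch of credit-based (lossless) flow control on one virtual lane.
from collections import deque

class CreditedLink:
    def __init__(self, initial_credits: int):
        self.credits = initial_credits      # receive buffers advertised at initialization
        self.in_flight = deque()            # packets sent but not yet credited back

    def try_send(self, packet) -> bool:
        """Send only if a credit is available; otherwise the sender must wait."""
        if self.credits == 0:
            return False                    # no buffer at the far end: stall, never drop
        self.credits -= 1
        self.in_flight.append(packet)
        return True

    def receiver_freed_buffer(self):
        """Called when the far end drains a buffer and its credit update arrives."""
        self.in_flight.popleft()
        self.credits += 1

link = CreditedLink(initial_credits=4)
sent = sum(link.try_send(f"pkt{i}") for i in range(10))
print(sent)  # 4 -- the sender stalls until credit updates return over the link
```

The stall in the last line is exactly what turns small on-chip buffers into a range limit: the longer the link, the longer each credit update takes to come back.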

12  InfiniBand Throughput vs. Distance
- Only sufficient buffer-to-buffer credits (B2B credits), in conjunction with error-free optical transport, can ensure maximum InfiniBand performance over distance
- Without additional B2B credits, throughput drops significantly after several tens of meters; this is caused by an inability to keep the pipe full by restoring receive credits fast enough
- Buffer credit size depends directly on the desired distance (see the sketch below)
[Chart: throughput vs. distance, flat with B2B credits, falling off rapidly without them.]
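The required buffering is essentially the bandwidth-delay product of the link. A minimal sketch, assuming ~5 µs/km one-way propagation delay in fiber and a credit granularity of one 4 kB buffer (both our assumptions, not figures from the slides):

```python
# Estimate the receive buffering needed to keep an IB link full over distance.
# Assumptions: ~5 us/km one-way propagation in fiber, credits granted as 4 kB buffers.
def required_buffer_bytes(distance_km: float, line_rate_gbps: float) -> float:
    rtt_s = 2 * distance_km * 5e-6           # credit updates must make the round trip
    return line_rate_gbps * 1e9 / 8 * rtt_s  # bytes in flight = rate x RTT

for km in (0.3, 10, 50, 100):
    buf = required_buffer_bytes(km, line_rate_gbps=8.0)  # 4x SDR payload rate
    print(f"{km:6.1f} km -> {buf/1e6:6.2f} MB (~{buf/4096:7.0f} credits of 4 kB)")
# ~300 m needs only a few kB (on-chip buffers suffice); 100 km needs roughly 1 MB.
```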

13  DC InfiniBand-over-Distance Transport
- Point-to-point
- Typically <100 km, but can be extended to arbitrary distances
- Low latency (distance!)
- Transparent infrastructure (should support other protocols)
[Diagram: two data centers, each with a CPU/server cluster, InfiniBand switch fabric (IB SF), LAN and FC SAN, interconnected over a redundant 80 x 10G DWDM link, with a gateway to the NREN at 10GbE...100GbE.]

14  IB Transport Demonstrator Results
N x 10G InfiniBand transport over >50 km distance demonstrated.
ADVA FSP 3000 DWDM:
- Up to 80 x 10 Gb/s transponders
- <100 ns latency per transponder
- Max. reach 200/2000 km
Obsidian Campus C100:
- 4x SDR copper to serial 10G optical
- 840 ns port-to-port latency
- Buffer credits for up to 100 km (test equipment ready for 50 km)
[Chart: SendRecv throughput vs. distance (0 to 100 km) for 32 kB, 128 kB, 512 kB and 4096 kB message sizes, up to ~1 GB/s. Setup: B2B credits and SerDes at each end, 80 x 10G DWDM in between.]
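A simple first-order model (our own illustration, not the authors' analysis) reproduces the shape of such curves: with only a fixed amount of data allowed in flight, throughput is capped at min(line rate, window / RTT). The ~5 µs/km fiber delay and ~1 GB/s line rate below are assumptions.

```python
# Hypothetical first-order model: throughput vs. distance with a fixed in-flight window.
def throughput_gbytes_per_s(distance_km: float, window_bytes: float,
                            line_rate_gbytes: float = 1.0) -> float:
    rtt_s = 2 * distance_km * 5e-6               # assumed ~5 us/km fiber delay
    if rtt_s == 0:
        return line_rate_gbytes
    return min(line_rate_gbytes, window_bytes / rtt_s / 1e9)

for window_kb in (32, 512, 4096):
    curve = [round(throughput_gbytes_per_s(d, window_kb * 1024), 2)
             for d in (1, 10, 50, 100)]
    print(f"{window_kb:5d} kB window: {curve} GB/s at 1/10/50/100 km")
# Small windows collapse within a few km; a window of roughly 1 MB or more sustains
# ~1 GB/s out to 100 km, which is what the extender's B2B credits are there to provide.
```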

15  Solution Components
WCA-PC-10G WDM Transponder:
- Bit rates: 4.25 / 5.0 / 8.5 / 10.0 / 10.3 / 9.95 / 10.5 Gb/s
- Applications: IBx1 DDR/QDR, IBx4 SDR, 10GbE WAN/LAN PHY, 4G-/8G-/10G-FC
- Dispersion tolerance: up to 100 km without compensation
- Wavelengths: DWDM (80 channels) and CWDM (4 channels)
- Client port: 1 x XFP (850 nm MM, or 1310/1550 nm SM)
- Latency: <100 ns
Campus C100 InfiniBand Reach Extender:
- Optical bit rate: 10.3 Gb/s (850 nm MM, 1310/1550 nm SM)
- InfiniBand bit rate: 8 Gb/s (4x SDR v1.2-compliant port)
- Buffer credit range: up to 100 km (depending on model)
- InfiniBand node type: 2-port switch
- Small-packet port-to-port latency: 840 ns
- Packet forwarding rate: 20 Mp/s

16  Solution: 8x10G InfiniBand Transport (budgetary cost, ~100 km, dual-ended)

FSP 3000 DWDM system
  Chassis, PSUs, controllers        ~€10.000,-
  10G DWDM modules                  ~€100.000,-
  Optics (filters, amplifiers)      ~€10.000,-
  Sum (budgetary)                   ~€120.000,-

16 x Campus C100 (100 km)           ~€300.000,-

System total (budgetary)            ~€420.000,-

17  An Example...
NASA's largest supercomputer uses 16 Longbow C102 devices to span two buildings, 1.5 km apart, at a link speed of 80 Gb/s and a memory-to-memory latency of just 10 µs.
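As a rough plausibility check (our own back-of-the-envelope arithmetic, not from the presentation), most of that 10 µs is simple fiber flight time. Assuming ~5 µs/km propagation delay and the ~840 ns per-device latency quoted earlier for the Campus C100 (assumed similar for the Longbow C102):

```python
# Back-of-the-envelope latency budget for the 1.5 km NASA link (assumed figures).
distance_km = 1.5
fiber_delay_us_per_km = 5.0      # typical propagation delay in single-mode fiber
extender_latency_us = 2 * 0.84   # one reach extender per end, ~840 ns each (C100 figure)

propagation_us = distance_km * fiber_delay_us_per_km
print(f"fiber flight time : {propagation_us:.1f} us")      # ~7.5 us one way
print(f"reach extenders   : {extender_latency_us:.2f} us")
print(f"budget remaining  : {10 - propagation_us - extender_latency_us:.1f} us "
      "for HCAs, switches and software")                    # under 1 us
```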

18  Thank you
Contact: KGrobe@advaoptical.com

