1
High-end Routers & Modern Supercomputers
Bob Newhall & Dan Lenoski
Cisco Systems, Routing Technology Group
NORDUnet 2003, Reykjavik – August 2003
© 2003, Cisco Systems, Inc. All rights reserved.
2
Agenda
- Traditional Routers and Supercomputers
- Modern Routers and Supercomputers
- Comparison of Subsystems
- Conclusions
3
What's a Router? Traditionally…
[Block diagram of a traditional single-CPU router: MIPS CPU with secondary cache SRAM and system controller, SDRAM (256 MB), ROM, Flash, NVRAM, console/aux, PCMCIA slots, and port adapters PA-1 through PA-6 attached via PCI and I/O buses]
Architecturally, routers have been like normal computers except:
- Mechanical form factors, especially for IO
- Embedded forwarding and routing SW
4
What's a Supercomputer? Traditionally…
[Photos: Cray Y-MP and Cray Y-MP C90]
- 250 Gbyte/sec of interconnect bandwidth
5
Evolution of High-End Routers
- Increasing bandwidth of external connections:
  T1 -> DS3 -> OC3 -> OC12 -> OC48 -> OC192 -> OC768
  (~1.5 Mbit/sec -> 40 Gbit/sec; see the sketch below)
- Line speed increases require changes in router architecture: removing the central memory bottleneck and replacing it with distributed memories and a central interconnect fabric
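As a rough illustration of the growth described above, here is a small Python sketch. The rates are the standard T-carrier/SONET line rates (they are not stated on the slide, only the interface names are); the sketch just computes the cumulative speed-up from T1 to OC-768.

```python
# Standard T-carrier / SONET line rates in bit/s (approximate; not from the slide).
LINE_RATES = {
    "T1": 1.544e6,
    "DS3": 44.736e6,
    "OC-3": 155.52e6,
    "OC-12": 622.08e6,
    "OC-48": 2.48832e9,
    "OC-192": 9.95328e9,
    "OC-768": 39.81312e9,
}

def print_growth(rates):
    """Print each interface, its rate, and the cumulative speed-up over T1."""
    base = rates["T1"]
    for name, bps in rates.items():
        print(f"{name:>7}: {bps / 1e9:8.4f} Gbit/s  ({bps / base:10.0f}x T1)")

if __name__ == "__main__":
    print_growth(LINE_RATES)   # T1 -> OC-768 is a roughly 25,000x increase
```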
6
Evolution of High-End Routers
- Increased computational power for routing, forwarding and feature processing
- Larger systems (more line cards) desired by end customers to exploit DWDM capabilities and simplify operation of POPs
7
What's a High-End Router today?
[Diagram: Route Processor(s) and Linecards (8-16) attached to a Switch Fabric; T1 to OC-192 interfaces]
- Distributed architecture with crossbar switch fabric
- Multi-gigabit switching capacity
8
The next generation of High-End Routers
[Diagram: Route Processor(s) and Linecards (100's to 1000's) attached to a Switch Fabric; T1 to OC-768 interfaces]
- Multi-chassis, distributed architecture with multi-stage switch fabric
- Multi-terabit switching capacity
9
Evolution of Supercomputers
- Move from globally clocked, ECL vector processors to distributed-memory, microprocessor-based multiprocessors
  - 250 MHz C90 to 1-2 GHz Pentium 4, Alpha, Power3
- This architecture change was driven by:
  - Complexity and economics of building the highest-performance processors
  - Commoditization of smaller-scale computers
  - Not driven by programming desires of end-users
- Note that state-of-the-art processors can generate less than 10 Gbit/sec of communication data
10
What's a Supercomputer today?
- ASCI White at LLNL: 8K processors in 512 nodes, 12 TFLOPS
- Interconnect has connection BW of 1 TByte/sec
Diagram and photo from the LLNL ASCI White webpage
11
Major components of a Router
- Distributed Control Plane
  - Used to run routing protocols (= distributed computer)
- Distributed Data Plane
  - Packet Processing: examine L2-L7 protocol information (determine QoS, VPN ID, policy, etc.)
  - Packet Forwarding: make appropriate routing, switching, and queuing decisions
- System Interconnect
  - Control plane – can be combined with data plane or dedicated
  - Data interconnect – at least the sum of external BW required
12
Major components of a Supercomputer
- Distributed Control / Computational Nodes
  - Nodes built from a small number of processors (4-16) with local memory
- Distributed IO Subsystem
  - Typically tied to a subset of nodes, but if fully distributed these can be viewed as the sink/source of external bandwidth, similar to router external connections
- System Interconnect
  - BW driven primarily by data-sharing requirements and often limited by the CPU's ability to generate data
13
Router – Supercomputer Analogy

  High-End Router      Supercomputer
  Route Processors     CPU Nodes
  Line Cards           I/O Nodes
  Switch Fabric        Interconnection Network
14
Route Processors ~ CPU Nodes
- Route Processors execute routing protocols and maintain routing and forwarding information bases
  - Large networks dictate gigabytes of memory to hold the routing and interface databases
  - Also require high peak computation rates to reconverge the network topology and download table updates to line cards
  - ~1000 MIPS per eight 40 Gbit/sec interfaces for the control plane (see the sketch below)
- CPU nodes in a supercomputer run applications and source and sink processor communication traffic
  - 1-2 GFLOPS and 1000 MIPS per processor
  - 1-2 Gbytes of memory per processor
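A quick check of how the control-plane rule of thumb above scales to the 512-linecard system used in the comparison slide later. The 1000-MIPS-per-eight-interfaces ratio is from this slide; the 512-linecard count is from the comparison table; everything else is arithmetic.

```python
# Control-plane compute sizing, using the rule of thumb from the slide:
# roughly 1000 MIPS of control-plane processing per eight 40 Gbit/s interfaces.
CONTROL_MIPS_PER_8_INTERFACES = 1000  # MIPS

def control_plane_gips(num_linecards: int) -> float:
    """Control-plane compute (in GIPS) for a router with the given 40G linecard count."""
    groups_of_eight = num_linecards / 8
    total_mips = groups_of_eight * CONTROL_MIPS_PER_8_INTERFACES
    return total_mips / 1000  # MIPS -> GIPS

print(control_plane_gips(512))  # -> 64.0 GIPS, matching the comparison table later
```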
15
Router Line Card ~ SC I/O Node
- Packet forwarding, classification and feature processing require complex look-ups and queuing decisions to be made on a per-packet basis
  - Even with HW assist (TCAMs, etc.), approximately 500 instructions per packet
  - At 40 Gbps and minimum-size packets => ~100 Mpps
  - Total of 50,000 MIPS per 40 Gbps of line rate (worked out, along with the buffering numbers, in the sketch below)
- Queuing and TCP/IP congestion semantics imply 200 millisec of buffering on ingress and egress
  - 0.2 sec x 40 Gbps x 2 = 16 Gbits = 2 Gbytes per 40 Gbps of line rate
- Fragmentation typically requires 4x BW queuing
  - 40 Gbps => 160 Gbps per queue x 2 (ingress & egress) => 320 Gbps
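The per-linecard arithmetic above, written out as a small Python sketch; all inputs are the figures quoted on the slide.

```python
# Per-linecard sizing arithmetic from the slide (40 Gbit/s line rate).
LINE_RATE_BPS = 40e9            # 40 Gbit/s
INSNS_PER_PACKET = 500          # with HW assist (TCAMs, etc.)
MIN_PACKET_RATE_PPS = 100e6     # ~100 Mpps at minimum packet size
BUFFER_TIME_S = 0.2             # 200 ms of buffering, ingress and egress

# Forwarding compute: 500 instructions x 100 Mpps = 50,000 MIPS per 40G of line rate.
forwarding_mips = INSNS_PER_PACKET * MIN_PACKET_RATE_PPS / 1e6
print(f"Forwarding compute: {forwarding_mips:,.0f} MIPS")         # 50,000 MIPS

# Buffering: 0.2 s x 40 Gbit/s x 2 directions = 16 Gbit = 2 Gbytes per 40G.
buffer_bits = BUFFER_TIME_S * LINE_RATE_BPS * 2
print(f"Buffer memory: {buffer_bits / 8 / 1e9:.0f} Gbytes")       # 2 Gbytes

# Fragmentation: 4x bandwidth through the queues, ingress and egress.
queue_bw_bps = LINE_RATE_BPS * 4 * 2
print(f"Queue memory bandwidth: {queue_bw_bps / 1e9:.0f} Gbit/s") # 320 Gbit/s
```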
16
Distributed Memory Router Line Card
[Block diagram: optics and framer feed a receive forwarding engine (table SRAM, fwd/class TCAMs) and input queuing with RTT buffer memory (1 GB+) and pointer SRAM toward the fabric; from the fabric, re-assembly, a transmit forwarding engine, output queuing with its own RTT buffer memory (1 GB+) and pointer SRAM, and L2 buffering drive the transmit side; a linecard control CPU with 512+ MB DRAM and memory controller manages the card]
17
Supercomputer I/O Nodes
- Disk and network attachment dominate requirements
- Computational requirements on the data typically limit effective throughput
- 52 nodes of 512 on ASCI White, each with approx. 1-2 Gbyte/sec per node of IO BW
- Data must be moved from IO to local node memory and then IPC'd to other computational nodes
  - Limited by node-to-interconnect BW limits
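A rough aggregate of the I/O capacity above, assuming the per-node range quoted on the slide applies to all 52 I/O-attached nodes (an assumption; the slide only gives the per-node figure).

```python
# Aggregate I/O bandwidth estimate for ASCI White's I/O-attached nodes.
IO_NODES = 52                      # of 512 nodes total (from the slide)
PER_NODE_IO_GBYTES_S = (1.0, 2.0)  # approx. 1-2 Gbyte/s per node (from the slide)

low, high = (IO_NODES * bw for bw in PER_NODE_IO_GBYTES_S)
print(f"Aggregate I/O bandwidth: ~{low:.0f}-{high:.0f} Gbyte/s")  # ~52-104 Gbyte/s
```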
18
Router Switch Fabric ~ SC Interconnect Network
- Critical design parameters are:
  - Throughput
  - Traffic isolation
  - Fault tolerance
- Router switch fabric must have over-speed of fabric BW relative to line BW to provide traffic isolation and deal with packet fragmentation
  - Minimum 1.5x, with at least 2x line rate desirable
  - 60-100 Gbps per 40 Gbps of line rate
- Depending on the size of the system, topology varies from:
  - Crossbar
  - Multistage network (e.g., Benes, Clos)
- Must be symmetric – all-to-all (like old-style supercomputers)
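The over-speed figures work out as simple multiples of the 40 Gbps line rate; the 1.5x floor and the 60-100 Gbps range are from the slide, and the intermediate factors here are just the corresponding multiples.

```python
# Fabric bandwidth per 40 Gbit/s linecard for a range of over-speed factors.
LINE_RATE_GBPS = 40
for overspeed in (1.5, 2.0, 2.5):   # 1.5x minimum; the slide's 60-100 Gbps spans 1.5x-2.5x
    print(f"{overspeed}x overspeed -> {overspeed * LINE_RATE_GBPS:.0f} Gbit/s of fabric BW")
```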
19
Supercomputer Interconnect Network
- Critical parameters are:
  - Throughput
  - Latency (end-to-end)
- Actual supercomputer interconnects vary substantially, but are usually <1 Gbyte/sec per processor
- Topology varies, but generally exploits locality:
  - Hypercube
  - Torus or mesh
  - Multi-stage networks
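As a small illustrative sketch of how locality shows up in the topologies listed above: the worst-case hop count (network diameter) grows very differently with node count for a hypercube and a 2D torus. These are standard textbook formulas, not figures from the slides, and are shown only to make the topology trade-off concrete.

```python
# Illustrative worst-case hop counts (network diameter) for two of the
# topologies the slide lists, as a function of node count N.
import math

def hypercube_diameter(n: int) -> int:
    """Hypercube of n = 2^d nodes: diameter is d = log2(n) hops."""
    return int(math.log2(n))

def torus_2d_diameter(n: int) -> int:
    """Square 2D torus of n = k*k nodes: diameter is k//2 hops per dimension."""
    k = math.isqrt(n)
    return 2 * (k // 2)

for n in (64, 256, 1024):
    print(n, "nodes:",
          "hypercube", hypercube_diameter(n), "hops,",
          "2D torus", torus_2d_diameter(n), "hops")
```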
20
Overall Comparison

  Feature                   512-Linecard, 40 Gbps/LC Router   512-node, 8K-CPU ASCI White Supercomputer
  Control MIPS              64 GIPS                           8000 GIPS
  Data MIPS                 25,600 GIPS                       N/A
  Total Memory Storage      1024 Gbytes                       4096 Gbytes
  Total Memory Bandwidth    20 Tbyte/sec                      8 Tbyte/sec
  Interconnect Bandwidth    4 Tbyte/sec                       2 Tbyte/sec
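A sanity check: the router column above follows from the per-linecard figures on the earlier slides (50,000 MIPS, 2 Gbytes of buffering, 320 Gbit/s of queue bandwidth, and 60-100 Gbit/s of fabric bandwidth per 40 Gbps linecard, plus the 1000-MIPS-per-eight-interfaces control-plane rule). The 64 Gbit/s fabric figure used below is an assumed point within the 60-100 Gbit/s range, chosen to show how the ~4 Tbyte/sec total arises.

```python
# Deriving the router column of the comparison table from the per-linecard
# figures on the earlier slides (512 linecards, 40 Gbit/s each).
NUM_LINECARDS = 512

data_gips     = NUM_LINECARDS * 50_000 / 1000        # 50,000 MIPS per linecard
control_gips  = NUM_LINECARDS / 8 * 1000 / 1000      # 1000 MIPS per 8 interfaces
memory_gbytes = NUM_LINECARDS * 2                    # 2 Gbytes of buffering per linecard
mem_bw_tbytes = NUM_LINECARDS * 320e9 / 8 / 1e12     # 320 Gbit/s of queue BW per linecard
fabric_tbytes = NUM_LINECARDS * 64e9 / 8 / 1e12      # assumed 64 Gbit/s within the 60-100 range

print(f"Data MIPS:        {data_gips:,.0f} GIPS")       # 25,600 GIPS
print(f"Control MIPS:     {control_gips:,.0f} GIPS")    # 64 GIPS
print(f"Total memory:     {memory_gbytes:,} Gbytes")    # 1,024 Gbytes
print(f"Memory bandwidth: {mem_bw_tbytes:.1f} Tbyte/s") # ~20.5 Tbyte/s
print(f"Interconnect BW:  {fabric_tbytes:.1f} Tbyte/s") # ~4.1 Tbyte/s
```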
21
Overall Technology Required
- Traditionally, networking equipment exploited off-the-shelf silicon, FPGA, and standard ASIC technology
- High-end routers with OC-192 support are approaching supercomputers
  - 0.25u and 0.18u ASICs shipped in early 2001
- High-end routers with OC-768 support require the leading edge of technology
  - ASICs using 0.13u technology and >1500-pin packages
  - Latest memory technology: Rambus, FCRAM and RLDRAM, QDR SRAM
  - Power per rack comparable to the 9.5 kW for IBM's SP2
22
Conclusions
- Explosive data rates and optics capabilities have pushed router technology tremendously in the last decade
  - From embedded single-board computers in the 80's
  - To distributed-memory computers with specialized forwarding, queuing and feature processing capabilities
- In nearly every metric of system technology, today's high-end routers match or exceed the capability of an equivalent supercomputer
- In addition, high-end routers have a critical requirement of system fault-tolerance
- Going forward, advances in high-end routers and supercomputers are technology-limited
23
Thank you!
Bob Newhall, newhall@cisco.com