B96611024 謝宗廷 B96b02016 張煥基 1. Outline Introduction Bcube structure Bcube source routing OTHER DESIGN ISSUES GRACEFUL DEGRADATION Implementation Architecture.


B96611024 謝宗廷 B96b02016 張煥基

Outline: Introduction; BCube structure; BCube source routing; Other design issues; Graceful degradation; Implementation architecture; Conclusion

Introduction: Organizations now use modular data centers (MDCs), which offer shorter deployment time, higher system and power density, and lower cooling and manufacturing cost. BCube is a high-performance and robust network architecture for an MDC. BCube is designed to support all of these traffic patterns well: one-to-one, one-to-several, one-to-all, and all-to-all.

Bandwidth-intensive application support. One-to-one: one server moves data to another server (disk backup). One-to-several: one server transfers the same copy of data to several receivers (distributed file systems). One-to-all: a server transfers the same copy of data to all the other servers in the cluster (broadcast). All-to-all: every server transmits data to all the other servers (MapReduce).

BCUBE STRUCTURE

BCube construction (BCube_k, n)

BCube_1

BCube_2 (n = 4), level 2

Each server in a BCube_k has k + 1 ports, which are numbered from level-0 to level-k. A BCube_k has $N = n^{k+1}$ servers and k + 1 levels of switches, with each level having $n^k$ n-port switches. A server in a BCube_k is identified by an address array $a_k a_{k-1} \ldots a_0$.
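The counting rules on this slide can be checked with a few lines of Python. This is a minimal sketch, not code from the BCube authors; the function names `bcube_size` and `server_addresses` are made up for illustration, and address digits are assumed to range over 0 to n-1.

```python
from itertools import product

def bcube_size(n, k):
    """Return (servers, switches_per_level, total_switches) for a BCube_k
    built from n-port switches."""
    servers = n ** (k + 1)           # N = n^(k+1)
    switches_per_level = n ** k      # n^k switches on each of the k+1 levels
    return servers, switches_per_level, (k + 1) * switches_per_level

def server_addresses(n, k):
    """Enumerate all address arrays (a_k, ..., a_0); digits assumed in [0, n-1]."""
    return list(product(range(n), repeat=k + 1))

print(bcube_size(8, 3))
print(len(server_addresses(4, 1)))
```

For n = 8 and k = 3 this gives 4096 servers and 512 switches per level, matching the partial-BCube example later in the deck.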

Single-path Routing in BCube: We use h(A, B) to denote the Hamming distance of two servers A and B, i.e., the number of digits in which their address arrays differ. Two servers are neighbors if they connect to the same switch. The Hamming distance of two neighboring servers is one. More specifically, two neighboring servers that connect to the same level-i switch differ only at the i-th digit of their address arrays.
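A single path can be built by correcting one differing digit per hop. The sketch below assumes addresses are tuples written as (a_k, ..., a_0) and corrects digits from level k down to level 0; that order is one possible choice, and the names `hamming` and `bcube_routing` are illustrative rather than taken from the slides.

```python
def hamming(a, b):
    """Number of digits in which two address arrays differ."""
    return sum(x != y for x, y in zip(a, b))

def bcube_routing(src, dst):
    """Return the list of servers visited from src to dst.

    Each hop goes through one switch and fixes exactly one digit, so
    consecutive servers on the path are neighbors (Hamming distance 1).
    """
    path, cur = [src], list(src)
    for idx in range(len(src)):      # idx 0 holds a_k, the last index holds a_0
        if cur[idx] != dst[idx]:
            cur[idx] = dst[idx]
            path.append(tuple(cur))
    return path

print(bcube_routing((0, 0, 0, 1), (1, 0, 1, 1)))
```

The number of hops equals h(A, B), which is why the diameter of a BCube_k is k + 1.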

The diameter of a BCube_k, which is the longest shortest path among all the server pairs, is k + 1. Since k is a small integer, typically at most 3, BCube is a low-diameter network.

Multi-paths for One-to-one Traffic: Two paths between a source server and a destination server are parallel if they are node-disjoint, i.e., the intermediate servers and switches on one path do not appear on the other. The number of parallel paths between two servers is upper bounded by k + 1, since each server has only k + 1 links.

There are k + 1 parallel paths between any two servers in a BCube_k.

The k + 1 paths fall into two categories: there are h(A, B) paths in the first category and k + 1 - h(A, B) paths in the second. The maximum length of the paths constructed by BuildPathSet is k + 2. It is easy to see that BCube should also support several-to-one and all-to-one traffic patterns well.
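The two categories can be illustrated with a small path-set builder. The slides name BuildPathSet but do not show it, so the code below is a reconstruction under assumptions: a first-category path starts by correcting a digit where A and B differ, a second-category path first detours to a level-i neighbor C of A where the digits agree, and C is chosen as (a_i + 1) mod n purely for illustration.

```python
def route(src, dst, order):
    """Correct digits in the given order of levels; addresses are (a_k, ..., a_0)."""
    k = len(src) - 1
    path, cur = [src], list(src)
    for level in order:
        idx = k - level
        if cur[idx] != dst[idx]:
            cur[idx] = dst[idx]
            path.append(tuple(cur))
    return path

def build_path_set(src, dst, n):
    """Return k+1 paths from src to dst, one starting at each level."""
    k = len(src) - 1
    paths = []
    for i in range(k, -1, -1):
        idx = k - i
        if src[idx] != dst[idx]:
            # First category: start with level i, then i-1, ..., wrapping around.
            order = [(i - j) % (k + 1) for j in range(k + 1)]
            paths.append(route(src, dst, order))
        else:
            # Second category: detour via a level-i neighbor C (assumed choice),
            # then correct levels i-1, ..., wrapping around, with level i last.
            c = list(src)
            c[idx] = (src[idx] + 1) % n
            order = [(i - 1 - j) % (k + 1) for j in range(k + 1)]
            paths.append([src] + route(tuple(c), dst, order))
    return paths

for p in build_path_set((0, 0, 0, 1), (1, 0, 1, 1), n=8):
    print(p)
```

For A = 0001 and B = 1011 (h = 2, k = 3) this yields two 2-hop paths and two 4-hop paths whose intermediate servers do not overlap.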

Speedup for One-to-several Traffic: A source src and k + 1 servers d_0, ..., d_k, where d_i is a level-i neighbor of src, can form a complete graph. These complete graphs can speed up data replication in distributed file systems. src has n - 1 choices for each d_i; therefore, src can build $(n-1)^{k+1}$ such complete graphs. When a client writes a chunk to r chunk servers, it sends 1/r of the chunk to each chunk server. This is r times faster than the pipeline model.
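The count of complete graphs, and the relay server that joins two of the chosen neighbors, can be sketched as follows. The relay rule (combine the two changed digits) is an assumption consistent with the 00000 example in this deck, not a rule stated on the slide, and both function names are hypothetical.

```python
def complete_graph_count(n, k):
    """src has n-1 choices for each of its k+1 level neighbors d_i."""
    return (n - 1) ** (k + 1)

def relay(src, d_i, d_j):
    """Assumed intermediate server on the two-hop path d_i -> relay -> d_j.

    The relay keeps d_i's changed digit and also takes d_j's changed digit,
    so it is a neighbor of both d_i and d_j.
    """
    return tuple(y if y != s else x for s, x, y in zip(src, d_i, d_j))

print(complete_graph_count(8, 3))
print(relay((0, 0, 0, 0, 0), (0, 1, 0, 0, 0), (0, 0, 0, 0, 1)))
```

With src = 00000, d_i = 01000 and d_j = 00001, the relay is 01001.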

Source: 00000. Want to build a complete graph with 00001, 00010, 00100, 01000, 10000. Complete graph: (00000, 00001, 00010, 00100), with two-hop paths ... -> 01001 -> ..., ... -> 01010 -> ..., ... -> 01100 -> ...

Speedup for One-to-all Traffic: In one-to-all, a source server delivers a file to all the other servers. Under tree and fat-tree, the time for all the receivers to receive a file of size L is at least L. In a BCube_k, a source can deliver a file of size L to all the other servers in L/(k + 1) time by constructing k + 1 edge-disjoint server spanning trees from the k + 1 neighbors of the source.

When a source distributes a file to all the other servers, it can split the file into k + 1 parts and simultaneously deliver all the parts via different spanning trees.
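The speedup follows from sending the k + 1 parts in parallel over the source's k + 1 links. A small worked calculation, assuming every link has the same rate and ignoring protocol overhead (both are simplifying assumptions, and `one_to_all_time` is an illustrative name):

```python
def one_to_all_time(file_size, k, link_rate=1.0):
    """Return (tree_lower_bound, bcube_time) for delivering one file to all servers.

    In a tree or fat-tree the source has a single link, so the transfer takes
    at least file_size / link_rate. In a BCube_k the file is split into k+1
    parts, one per edge-disjoint spanning tree.
    """
    tree_time = file_size / link_rate
    bcube_time = file_size / ((k + 1) * link_rate)
    return tree_time, bcube_time

print(one_to_all_time(8.0, k=3))
```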

Aggregate Bottleneck Throughput for All-to-all Traffic: The flows that receive the smallest throughput are called the bottleneck flows. The aggregate bottleneck throughput (ABT) is defined as the number of flows times the throughput of the bottleneck flow. For a BCube network under the all-to-all traffic model, the ABT is $\frac{n}{n-1}(N-1)$, where n is the switch port number and N is the number of servers.
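As a quick numeric check of the formula, the sketch below evaluates it for a complete BCube. The result is expressed in units of one link's bandwidth, which is an assumption about how the formula is normalized; `bcube_abt` is an illustrative name.

```python
def bcube_abt(n, k):
    """ABT of a complete BCube_k under all-to-all traffic: n/(n-1) * (N-1)."""
    N = n ** (k + 1)                 # number of servers
    return n * (N - 1) / (n - 1)

print(bcube_abt(8, 3))
```

For n = 8 and k = 3 (N = 4096) the value is 4680 link-bandwidth units.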

BCUBE SOURCE ROUTING

[Figure: the source sends probe packets along k + 1 paths through intermediate servers to the destination.]

Source: The source obtains k + 1 parallel paths and then probes these paths. If one path is found to be unavailable, the source uses the Breadth First Search (BFS) algorithm to find another parallel path: it removes the existing parallel paths and the failed links from the BCube_k graph, and then uses BFS to search for a path. The number of parallel paths must be smaller than k
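The replacement-path search can be sketched as a plain BFS over a pruned graph. This is an illustrative sketch, not the BSR implementation: the graph is a generic adjacency dict, links are treated as undirected, and `find_replacement_path` is a hypothetical name.

```python
from collections import deque

def find_replacement_path(graph, src, dst, existing_paths, failed_links):
    """BFS from src to dst that avoids existing parallel paths and failed links."""
    # Intermediate nodes of existing paths may not be reused (node-disjointness).
    blocked = {node for p in existing_paths for node in p[1:-1]}
    # Remove failed links and the links already used by existing paths.
    removed = {frozenset(link) for link in failed_links}
    removed |= {frozenset(e) for p in existing_paths for e in zip(p, p[1:])}

    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in graph[u]:
            if v in parent or v in blocked or frozenset((u, v)) in removed:
                continue
            parent[v] = u
            queue.append(v)
    return None  # no further parallel path exists

graph = {
    "S": ["a", "b", "c"], "a": ["S", "D"], "b": ["S", "D"],
    "c": ["S", "e"], "e": ["c", "D"], "D": ["a", "b", "e"],
}
print(find_replacement_path(graph, "S", "D", [["S", "a", "D"]], [("b", "D")]))
```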

Intermediate: When an intermediate server receives a probe packet, there are two cases. Case 1: if its next hop is not available, it returns a path failure message (which includes the failed link) to the source. Case 2: otherwise, it updates the available bandwidth field of the probe packet if its available bandwidth is smaller than the existing value.

Destination: When a destination server receives a probe packet, it first updates the available bandwidth field of the probe packet if the available bandwidth of the incoming link is smaller than the value carried in the probe packet. It then sends the value back to the source in a probe response message.
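Taken together, the intermediate and destination rules compute the minimum available bandwidth along a path, or report the first failed link. A minimal sketch, assuming a hypothetical `link_bw` dict that maps each directed link to its available bandwidth and omits failed links:

```python
def probe(path, link_bw):
    """Simulate one probe packet travelling along path.

    Returns ("failure", link) if a next hop is unavailable, otherwise
    ("ok", bandwidth) with the smallest available bandwidth seen on the way.
    """
    available = float("inf")
    for u, v in zip(path, path[1:]):
        bw = link_bw.get((u, v))
        if bw is None:                     # Case 1: next hop not available
            return ("failure", (u, v))
        available = min(available, bw)     # Case 2: keep the smaller value
    return ("ok", available)

links = {("S", "a"): 0.6, ("a", "D"): 0.3}
print(probe(["S", "a", "D"], links))
```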

5.1 Partial BCube

Why partial BCube? In some cases, it may be difficult or unnecessary to build a complete BCube structure. For example, when n = 8 and k = 3, a BCube_3 has $8^4 = 4096$ servers. However, due to space constraints, we may only be able to pack 2048 servers.

How to build a partial BCube_k: (1) build the BCube_{k-1}s; (2) use partial layer-k switches to interconnect the BCube_{k-1}s.

Example

Challenge

Solution: When building a partial BCube_k, we first build the needed BCube_{k-1}s and then connect them using a full set of layer-k switches.
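The cost of the full layer-k approach can be worked out from the counting rules for a complete BCube. The sketch below is a back-of-the-envelope calculation; `partial_bcube` and its dictionary keys are illustrative names, not part of the original design.

```python
def partial_bcube(n, k, num_sub):
    """Size of a partial BCube_k made of num_sub full BCube_{k-1}s
    interconnected by a full set of layer-k switches."""
    servers = num_sub * n ** k
    lower_switches = num_sub * k * n ** (k - 1)   # k levels inside each BCube_{k-1}
    top_switches = n ** k                         # full layer-k
    return {
        "servers": servers,
        "switches": lower_switches + top_switches,
        # each layer-k switch connects one server from every BCube_{k-1}
        "layer_k_port_utilization": num_sub / n,
    }

print(partial_bcube(8, 3, 4))
```

With n = 8, k = 3 and four BCube_2s, only half of the ports on each layer-3 switch are in use, which is the drawback noted on the next slide.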

Pros and cons of full layer-k switches. Pro: BCubeRouting performs just as in a complete BCube, and BSR works as before. Con: the switches in layer-k are not fully utilized.

5.2 Packaging and Wiring

Condition: We show how packaging and wiring can be addressed for a container with 2048 servers and 8-port switches (a partial BCube with n = 8 and k = 3).

40-foot container

[Figure: one rack, showing layer-1 switches, 8 layer-2 switches, and 16 layer-3 switches.]

One rack = one BCube_1: 64 servers and 16 8-port switches.

One super-rack = one BCube_2. The level-2 wires stay within a super-rack, and the level-3 wires run between super-racks.
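The rack and super-rack counts follow directly from the BCube sizes. A short worked calculation, assuming n = 8 and 2048 servers as in this section (`packaging` is a hypothetical helper):

```python
def packaging(n=8, total_servers=2048):
    """Derive rack and super-rack counts for a container."""
    servers_per_rack = n ** 2            # one rack = one BCube_1
    servers_per_super_rack = n ** 3      # one super-rack = one BCube_2
    return {
        "servers_per_rack": servers_per_rack,
        "racks_per_super_rack": servers_per_super_rack // servers_per_rack,
        "super_racks": total_servers // servers_per_super_rack,
        "racks": total_servers // servers_per_rack,
    }

print(packaging())
```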

5.3 Routing to External Networks

We assume that both internal and external computers use TCP/IP. We propose using aggregators and gateways for external communication.

We can use a 48×1G + 1×10G aggregator to replace several mini-switches and use the 10G link to connect to the external network. The servers that connect to the aggregator become gateways.

When an internal server sends a packet to an external IP address: (1) it chooses one of the gateways; (2) the packet is then routed to the gateway using BSR (BCube Source Routing); (3) after the gateway receives the packet, it strips the BCube protocol header and forwards the packet to the external network via the 10G uplink.
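The three steps can be sketched in a few lines. Everything here is hypothetical scaffolding: the slides do not say how a gateway is chosen, so the sketch hashes the destination IP, and the `Packet` class stands in for a real packet with an optional BCube header.

```python
import zlib
from dataclasses import dataclass
from typing import Optional

@dataclass
class Packet:
    dst_ip: str
    payload: bytes
    bcube_header: Optional[dict] = None   # present only inside the BCube network

def choose_gateway(dst_ip, gateways):
    """Step 1 (assumed policy): pick a gateway by hashing the destination IP."""
    return gateways[zlib.crc32(dst_ip.encode()) % len(gateways)]

def send_external(pkt, src_addr, gateways):
    """Step 2: address the packet to the chosen gateway; BSR would carry it there."""
    gw = choose_gateway(pkt.dst_ip, gateways)
    pkt.bcube_header = {"src": src_addr, "dst": gw}
    return gw

def gateway_forward(pkt):
    """Step 3: strip the BCube header before sending on the 10G uplink."""
    pkt.bcube_header = None
    return pkt

gateways = [(0, 0, 0, 0), (0, 0, 0, 1)]
pkt = Packet("203.0.113.7", b"hello")
gw = send_external(pkt, (1, 2, 3, 4), gateways)
out = gateway_forward(pkt)
```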

Terminology: aggregate bottleneck throughput (ABT). ABT reflects the all-to-all network capacity. ABT = (throughput of the bottleneck flow) × (total number of flows in the all-to-all traffic model). Graceful degradation means that as server or switch failures increase, ABT decreases slowly and there are no dramatic performance drops.
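The definition translates directly into code. A minimal sketch, where `flow_throughputs` is an assumed list holding one throughput value per flow:

```python
def aggregate_bottleneck_throughput(flow_throughputs):
    """ABT = throughput of the slowest flow times the number of flows."""
    return min(flow_throughputs) * len(flow_throughputs)

print(aggregate_bottleneck_throughput([1.0, 0.5, 0.8]))
```

Because the slowest flow sets the value, one badly congested link lowers ABT for the whole network.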

Purpose of the experiment: In this section, we use simulations to compare the aggregate bottleneck throughput (ABT) of BCube, fat-tree [1], and DCell [9] under random server and switch failures.

THE FAT TREE

DCell

Assumptions: all the links are 1 Gb/s and there are 2048 servers. Switches: we use 8-port switches to construct the network structures.

Materials and methods. BCube: the network we use is a partial BCube_3 with n = 8 that uses 4 full BCube_2s. Fat-tree: the structure has five layers of switches, with layers 0 to 3 having 512 switches per layer and layer 4 having 256 switches. DCell: a partial DCell_2 that contains 28 full DCell_1s and one partial DCell_1 with 32 servers.
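The three setups can be sanity-checked against the 2048-server target. The sketch relies on one fact not stated on the slide: a DCell_1 built from n-port switches holds n(n + 1) servers (n + 1 DCell_0s of n servers each). Variable names are illustrative.

```python
n = 8

# Partial BCube_3: four full BCube_2s, each with n^3 servers.
bcube_servers = 4 * n ** 3

# Partial DCell_2: 28 full DCell_1s plus one partial DCell_1 with 32 servers.
dcell1_servers = n * (n + 1)
dcell_servers = 28 * dcell1_servers + 32

# Fat-tree: layers 0 to 3 have 512 switches each, layer 4 has 256.
fat_tree_switches = 4 * 512 + 256

print(bcube_servers, dcell_servers, fat_tree_switches)
```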

Results. BCube: only BCube provides both high ABT and graceful degradation. Fat-tree: when there is no failure, both BCube and fat-tree provide high ABT values, 2006 Gb/s for BCube and 1895 Gb/s for fat-tree. DCell: the ABT is 298 Gb/s. Reasons: first, the traffic is imbalanced at different levels of links in DCell; second, the partial DCell makes the traffic imbalanced even for links at the same level. There is no load balancing.

ABT under server failure

ABT under switch failure

Where BCube excels: BCube performs well under both server and switch failures, and the degradation is graceful. When the switch failure ratio reaches 20%, the ABT of fat-tree is 267 Gb/s and the ABT of BCube is 765 Gb/s.
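One way to read these numbers is as the fraction of the no-failure ABT that survives a 20% switch failure ratio. The calculation below uses only the figures quoted in this deck (2006 and 1895 Gb/s without failures); the dictionary layout is illustrative.

```python
abt_no_failure = {"BCube": 2006.0, "fat-tree": 1895.0}   # Gb/s
abt_20pct_switch_failure = {"BCube": 765.0, "fat-tree": 267.0}

retained = {
    name: abt_20pct_switch_failure[name] / abt_no_failure[name]
    for name in abt_no_failure
}
for name, frac in retained.items():
    print(f"{name}: {frac:.1%} of no-failure ABT retained")
```

BCube keeps roughly 38% of its no-failure ABT, fat-tree roughly 14%.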

BCube stack We have prototyped the BCube architecture by designing and implementing a BCube protocol stack.

BCube stack: core components. BSR protocol: routing. Neighbor maintenance protocol: maintains a neighbor status table. Packet sending/receiving part: interacts with the TCP/IP stack. Packet forwarding engine: relays packets for other servers.

BCube packet

BCube header

Fields of the BCube header: source and destination BCube addresses, packet id, protocol type, payload length, and header checksum. BCube stores the complete path and a next hop index (NHI) in the header of every BCube packet.
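The header can be modeled as a small record. This is a sketch for illustration only: field widths and the on-wire layout are not given on the slide, so the class stores plain Python values, and the `nha` list (one entry per hop of the stored path) and the method names are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class BCubeHeader:
    src: Tuple[int, ...]            # source BCube address
    dst: Tuple[int, ...]            # destination BCube address
    packet_id: int
    protocol: int
    payload_len: int
    checksum: int
    nha: List[int] = field(default_factory=list)  # complete path, one value per hop
    nhi: int = 0                    # next hop index into the stored path

    def next_hop(self) -> int:
        """Value describing the next hop on the stored path."""
        return self.nha[self.nhi]

    def advance(self) -> None:
        """Move the index forward after the packet has been relayed one hop."""
        self.nhi += 1

hdr = BCubeHeader((0, 0, 0, 1), (1, 0, 1, 1), packet_id=1, protocol=6,
                  payload_len=100, checksum=0, nha=[0xC1, 0x41])
first = hdr.next_hop()
hdr.advance()
second = hdr.next_hop()
```

Carrying the whole path in the header means a relay only has to read one entry and bump the index.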

NHA

relays packets for other servers


Packet forwarding engine: We have designed an efficient packet forwarding engine that decides the next hop of a packet with only one table lookup. The neighbor status table has three fields: (1) NeighborMAC: the MAC address of the neighbor; (2) OutPort: the port that connects to the neighbor; (3) StatusFlag: whether the neighbor is available. [Figure: packet forwarding procedure.]

Sending packets to the next hop: The forwarding engine extracts the status and the MAC address of the next hop, using the NHA value as the index.
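The single-lookup idea can be sketched with a table keyed by the NHA value. The three fields come from the slide; the dict-based table, the sample values, and the `forward` function are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass
class NeighborEntry:
    neighbor_mac: str    # NeighborMAC
    out_port: int        # OutPort
    status_flag: bool    # StatusFlag: True if the neighbor is available

def forward(table: Dict[int, NeighborEntry], nha_value: int) -> Optional[Tuple[int, str]]:
    """One lookup decides the next hop: return (out_port, mac) or None if unavailable."""
    entry = table.get(nha_value)
    if entry is None or not entry.status_flag:
        return None
    return entry.out_port, entry.neighbor_mac

table = {
    0xC1: NeighborEntry("02:00:00:00:00:01", out_port=3, status_flag=True),
    0x41: NeighborEntry("02:00:00:00:00:02", out_port=1, status_flag=False),
}
print(forward(table, 0xC1))
```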

CONCLUSION: BCube is a novel network architecture for shipping-container-based modular data centers (MDCs). Features: it accelerates one-to-x traffic patterns and provides high network capacity for all-to-all traffic.

Future work: how to scale our server-centric design from a single container to multiple containers.