

c-Through: Part-time Optics in Data Centers
David G. Andersen (CMU)
Guohui Wang, T. S. Eugene Ng (Rice)
Michael Kaminsky, Dina Papagiannaki, Michael A. Kozuch, Michael Ryan (Intel Labs Pittsburgh)

Current solutions for increasing data center network bandwidth
FatTree:
1. Hard to construct
2. Hard to expand

An alternative: hybrid packet/circuit switched data center network
Goal of this work:
– Feasibility: software design that enables efficient use of optical circuits
– Applicability: application performance over a hybrid network

Optical circuit switching vs. electrical packet switching
                     | Electrical packet switching             | Optical circuit switching
Switching technology | Store and forward                       | Circuit switching
Switching capacity   | 16x40Gbps at high end, e.g. Cisco CRS-1 | 320x100Gbps on market, e.g. Calient FiberConnect
Switching time       | Packet granularity                      | Less than 10ms, e.g. MEMS optical switch

Optical circuit switching is promising despite slow switching time
Full bisection bandwidth at packet granularity may not be necessary:
– [WREN09]: “…we find that traffic at the five edge switches exhibit an ON/OFF pattern… ”
– [IMC09][HotNets09]: “Only a few ToRs are hot and most their traffic goes to a few other ToRs. …”

Hybrid packet/circuit switched network architecture
Optical circuit-switched network for high-capacity transfer
Electrical packet-switched network for low-latency delivery
Optical paths are provisioned rack-to-rack:
– A simple and cost-effective choice
– Aggregate traffic on a per-rack basis to better utilize optical circuits

Design requirements
Control plane:
– Traffic demand estimation
– Optical circuit configuration
Data plane:
– Dynamic traffic de-multiplexing
– Optimizing circuit utilization (optional)

c-Through (a specific design)
No modification to applications and switches
Leverage end hosts for traffic management
Centralized control for circuit configuration

c-Through: traffic demand estimation and traffic batching
Socket buffers accomplish two requirements:
– Traffic demand estimation
– Pre-batch data to improve optical circuit utilization
1. Transparent to applications.
2. Packets are buffered per flow to avoid head-of-line (HOL) blocking.
Diagram labels: applications, socket buffers, per-rack traffic demand vector
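
The per-rack traffic demand vector can be pictured as a sum of socket-buffer occupancy grouped by destination rack. The sketch below is only an illustration under assumptions: the function name, the dictionary layout, and the host and rack names are hypothetical and do not come from the c-Through implementation.

```python
from collections import defaultdict


def per_rack_demand(buffered_bytes, host_to_rack, src_rack):
    """Aggregate per-flow socket-buffer occupancy into a per-rack demand vector.

    buffered_bytes: dict mapping (src_host, dst_host) -> bytes waiting in the socket buffer
    host_to_rack:   dict mapping host name -> rack id
    src_rack:       rack whose demand vector is being built
    """
    demand = defaultdict(int)
    for (src, dst), nbytes in buffered_bytes.items():
        if host_to_rack[src] != src_rack:
            continue  # flow originates in another rack
        dst_rack = host_to_rack[dst]
        if dst_rack == src_rack:
            continue  # intra-rack traffic is not part of rack-to-rack demand
        demand[dst_rack] += nbytes
    return dict(demand)


# Hypothetical hosts and buffer occupancy, for illustration only.
hosts = {"h1": "A", "h2": "A", "h3": "B", "h4": "C"}
occupancy = {
    ("h1", "h3"): 40_000_000,
    ("h2", "h3"): 10_000_000,
    ("h1", "h4"): 2_000_000,
    ("h1", "h2"): 5_000_000,   # intra-rack, ignored
    ("h3", "h1"): 7_000_000,   # belongs to rack B's vector
}
demand_A = per_rack_demand(occupancy, hosts, "A")
print(demand_A)
```

Because occupancy is kept per flow, a large transfer to one rack raises that rack's entry without hiding the small flows headed elsewhere.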

c-Through: optical circuit configuration
Use Edmonds’ algorithm to compute the optimal configuration
Many ways to reduce the control traffic overhead
Diagram labels: traffic demand, controller, configuration
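
Choosing which racks to connect can be framed as a maximum weight matching over the rack-to-rack demand, which is the problem Edmonds’ algorithm solves in polynomial time. The sketch below deliberately swaps in an exhaustive bitmask search instead of Edmonds’ algorithm; it reaches the same optimum but is only practical for a handful of racks. The function name and the symmetric demand matrix are assumptions made for illustration.

```python
from functools import lru_cache


def best_circuit_config(demand):
    """Return (total_weight, pairs) of a maximum weight matching over racks.

    demand: symmetric n x n list of lists; demand[i][j] is the traffic
            between racks i and j (an assumed input format).
    Exhaustive search, NOT Edmonds' algorithm: exponential in n.
    """
    n = len(demand)

    @lru_cache(maxsize=None)
    def solve(free):
        # free is a bitmask of racks that have no circuit yet
        if free == 0:
            return 0, ()
        i = (free & -free).bit_length() - 1  # lowest-numbered free rack
        rest = free & ~(1 << i)
        best = solve(rest)  # option: leave rack i without a circuit
        for j in range(i + 1, n):
            if rest & (1 << j):
                weight, pairs = solve(rest & ~(1 << j))
                candidate = (weight + demand[i][j], ((i, j),) + pairs)
                if candidate[0] > best[0]:
                    best = candidate
        return best

    return solve((1 << n) - 1)


# Hypothetical demand between four racks, for illustration only.
example_demand = [
    [0, 9, 2, 1],
    [9, 0, 3, 8],
    [2, 3, 0, 7],
    [1, 8, 7, 0],
]
total, circuits = best_circuit_config(example_demand)
print(total, circuits)
```

Each rack appears in at most one pair, which mirrors the constraint that a rack has a single optical path at a time.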

c-Through: traffic de-multiplexing
VLAN-based network isolation:
– No need to modify switches
– Avoid the instability caused by circuit reconfiguration
Traffic control on hosts:
– Controller informs hosts about the circuit configuration
– End hosts tag packets accordingly
Diagram labels: traffic de-multiplexer, VLAN #1, VLAN #2, circuit configuration, traffic
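
The host-side tagging decision can be sketched as a lookup against the latest circuit configuration. This is a minimal illustration under assumptions: the class and method names are hypothetical, and the slide does not say which VLAN number maps to which network, so the assignment below is arbitrary.

```python
ELECTRICAL_VLAN = 1  # assumed: VLAN carrying the packet-switched network
OPTICAL_VLAN = 2     # assumed: VLAN carrying the optical circuits


class Demultiplexer:
    """Per-host traffic de-multiplexer (hypothetical sketch)."""

    def __init__(self, my_rack):
        self.my_rack = my_rack
        self.circuit_peer = None  # rack currently reachable over an optical circuit

    def on_config_update(self, circuits):
        """Record the new configuration; circuits is an iterable of (rack_a, rack_b)."""
        self.circuit_peer = None
        for a, b in circuits:
            if a == self.my_rack:
                self.circuit_peer = b
            elif b == self.my_rack:
                self.circuit_peer = a

    def vlan_for(self, dst_rack):
        """Pick the VLAN tag for a packet headed to dst_rack."""
        if self.circuit_peer is not None and dst_rack == self.circuit_peer:
            return OPTICAL_VLAN
        return ELECTRICAL_VLAN


demux = Demultiplexer("A")
demux.on_config_update([("A", "C"), ("B", "D")])
print(demux.vlan_for("C"), demux.vlan_for("B"))
```

Keeping the decision on the host means the switches only see ordinary VLAN-tagged frames, which is consistent with the "no need to modify switches" point above.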

Testbed setup
Emulate a hybrid network on a 48-port Ethernet switch
16 servers with 1Gbps NICs
Diagram labels: Ethernet switch, emulated optical circuit switch, 4Gbps links, 100Mbps links
Optical circuit emulation:
– Optical paths are available only when hosts are notified
– During reconfiguration, no host can use optical paths
– 10 ms reconfiguration delay
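
The three emulation rules can be read as a small state machine, and the 10 ms delay gives a quick estimate of how much of each interval the optical path is usable. The sketch below is an assumption-laden illustration, not the testbed code: the class, method, and function names are hypothetical, and time is passed in explicitly rather than read from a clock.

```python
class EmulatedCircuit:
    """Hypothetical model of one emulated optical path."""

    RECONFIG_DELAY_S = 0.010  # 10 ms reconfiguration delay from the testbed description

    def __init__(self):
        self.notified = False
        self.reconfiguring_until = 0.0

    def start_reconfiguration(self, now):
        # During reconfiguration no host may use the optical path.
        self.notified = False
        self.reconfiguring_until = now + self.RECONFIG_DELAY_S

    def notify_hosts(self, now):
        # Hosts can only be told the path is up once the delay has elapsed.
        if now >= self.reconfiguring_until:
            self.notified = True
        return self.notified

    def usable(self, now):
        # The path is available only after hosts have been notified.
        return self.notified and now >= self.reconfiguring_until


def usable_fraction(interval_s, delay_s=0.010):
    """Fraction of a reconfiguration interval in which the path can carry traffic."""
    return max(0.0, 1.0 - delay_s / interval_s)


circuit = EmulatedCircuit()
circuit.start_reconfiguration(now=0.0)
print(circuit.usable(0.005))          # still reconfiguring
circuit.notify_hosts(now=0.010)
print(circuit.usable(0.020))          # notified and past the delay
print(usable_fraction(1.0))           # 1 sec interval, as used in the MapReduce sort runs
```

With a 1 sec reconfiguration interval the delay costs about 1% of the interval, which is why a slow switching time can still be tolerable for batched traffic.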

Evaluation
Basic system performance:
– Can TCP exploit dynamic bandwidth quickly?
– Does traffic control on servers bring significant overhead?
– Does buffering unfairly increase delay of small flows?
Application performance:
– Bulk transfer (VM migration)?
– Loosely synchronized all-to-all communication (MapReduce)?
– Tightly synchronized all-to-all communication (MPI-FFT)?
Answers shown on the slide: Yes, No, Yes

TCP can exploit dynamic bandwidth quickly
– Throughput ramps up within 10 ms
– Throughput stabilizes within 100 ms

MapReduce overview
Diagram labels: input file (split 0, split 1, split 2), load, mapper, local write, data shuffling, reducer, write, output file
– Concentrated traffic in 64MB blocks
– Independent transfers: amenable to batching

MapReduce sort, 10GB random data
c-Through with varying socket buffer size limit (reconfiguration interval: 1 sec)
[Plot: completion time (s) for the electrical network, full bisection bandwidth, and c-Through; annotated values 153s and 135s]

MapReduce sort, 10GB random data
c-Through with varying reconfiguration interval (socket buffer size limit: 100MB)
[Plot: completion time for the electrical network, full bisection bandwidth, and c-Through; annotated values 168s and 135s]

Yahoo Gridmix benchmark
– 3 runs of 100 mixed jobs such as web query, web scan and sorting
– 200GB of uncompressed data, 50GB of compressed data

Summary
Hybrid packet/circuit switched data center network:
– c-Through demonstrates its feasibility
– Good performance even for applications with all-to-all traffic
Future directions to explore:
– The scaling property of hybrid data center networks
– Making applications circuit aware
– Power-efficient data centers with optical circuits
Picture from Internet websites.