(MRC) 2: Resilient high-dimensional datacenter. Control Plane: Controllers and Switches. (These slides are not approved for public release.)

Presentation transcript:

Resilient high-dimensional datacenter switching
Approach: high-dimensional structure with a switchlet for every compute node
- Multi-path redundant topology for performance, resilience and security
- Components are functionally interchangeable
- High(er) performance through closer processor-network affinity
- Remap network topology to match program data flow
Non-traditional world view!

R2D2: Resilient Realtime Data Delivery
- Datacenter applications suffer from insufficient network isolation.
- R2D2 implements prioritized, distributed admission control using unmodified, commodity hardware.
- R2D2 reduces in-network interference latencies by over 300× and performs on par with, or better than, more complex congestion control schemes including DCTCP, ECN, 802.3x, PDQ, and pFabric.

The quest for scalable network experimentation
How to evaluate exciting new ideas on future networking?
Challenges
o Increasing network sizes
o Complex systems (OS + network functions virtualization, SDN)
o 10GbE and beyond (100 GbE?) link speeds
o Complex application-level behaviours
Key properties for a modern network experimentation platform
- Fidelity: replicate real-system and application behavior with accuracy
- Scalability: ability to reproduce larger scale experiments, maintaining fidelity
- Reproducibility: replicate experiment + results across different platforms, over multiple runs

Simulation vs Emulation vs Hybrid
[Figure; axis labels: Reproducibility, Fidelity, Scalability]

Selena Design
[Figure labels: Experiment description, Python API, Selena compiler]
Reproducibility
- XAPI-based Python API, automated experiments
Scalability
- Time dilation for unmodified guests
- Scalability tuning knobs (trade time for fidelity)
Fidelity
- Link emulation
- Realistic OpenFlow switch models
- Unmodified code execution: real stacks and OS, full POSIX support
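The "trade time for fidelity" idea behind time dilation can be illustrated with a little arithmetic. The sketch below is not Selena's API or implementation. It assumes the usual definition of a time dilation factor (TDF): guest clocks advance TDF times slower than wall-clock time, so a physical link appears TDF times faster and TDF times lower in latency to the guest, while the experiment takes TDF times longer to run. The names `DilatedLink`, `tdf` and `wall_clock_seconds` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DilatedLink:
    """A physical link as perceived by a guest under time dilation (illustrative)."""

    physical_bandwidth_bps: float
    physical_latency_s: float
    tdf: float  # hypothetical knob: time dilation factor, >= 1

    @property
    def perceived_bandwidth_bps(self) -> float:
        # One guest second spans tdf wall-clock seconds, so the guest
        # observes tdf times more bits per (guest) second.
        return self.physical_bandwidth_bps * self.tdf

    @property
    def perceived_latency_s(self) -> float:
        # A fixed wall-clock delay covers fewer guest seconds.
        return self.physical_latency_s / self.tdf


def wall_clock_seconds(virtual_seconds: float, tdf: float) -> float:
    """Wall-clock cost of running an experiment for virtual_seconds of guest time."""
    return virtual_seconds * tdf


if __name__ == "__main__":
    link = DilatedLink(physical_bandwidth_bps=1e9, physical_latency_s=1e-3, tdf=10)
    print(f"perceived bandwidth: {link.perceived_bandwidth_bps / 1e9:.0f} Gb/s")
    print(f"perceived latency:   {link.perceived_latency_s * 1e6:.0f} us")
    print(f"60 s of guest time costs {wall_clock_seconds(60, link.tdf):.0f} s of wall-clock time")
```

Under these assumptions a 1 Gb/s physical link with TDF 10 looks like a 10 Gb/s link to the guest, at the cost of a tenfold longer run.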

Fidelity evaluation
[Figure panels: Star topology; Fat-tree topology]
Execution time: Ns3: 175min 24sec, Selena: 20min
Execution time: Ns3: 172min 51sec, Selena: 40min
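As a quick worked check, the execution times quoted on this slide can be turned into ratios. The snippet only does the arithmetic on the figures given above, taken in the order they appear. The helper names `to_seconds` and `ratio` are illustrative.

```python
def to_seconds(minutes: int, seconds: int = 0) -> int:
    """Convert a minutes/seconds pair into seconds."""
    return minutes * 60 + seconds


def ratio(ns3_seconds: int, selena_seconds: int) -> float:
    """How many times longer the ns-3 run took than the Selena run."""
    return ns3_seconds / selena_seconds


# Execution times as quoted on the slide, in order of appearance.
runs = [
    ("first result", to_seconds(175, 24), to_seconds(20)),
    ("second result", to_seconds(172, 51), to_seconds(40)),
]

if __name__ == "__main__":
    for label, ns3_s, selena_s in runs:
        print(f"{label}: ns-3 took {ratio(ns3_s, selena_s):.2f}x as long as Selena")
```

This gives roughly 8.77x for the first pair of times and 4.32x for the second.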

Demo Description
- Fat-Tree topology (K=4)
- Multi-Layered OpenFlow controller architecture
- L-2 routing controlled by SDN
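To give a sense of the demo's scale, the sketch below sizes a fat-tree for a given K. It assumes the standard k-ary fat-tree construction (k pods, k/2 edge and k/2 aggregation switches per pod, (k/2)^2 core switches, k/2 hosts per edge switch). The slide does not say whether the demo follows this layout exactly, and the function name `fat_tree_sizes` is hypothetical.

```python
def fat_tree_sizes(k: int) -> dict:
    """Element counts for a standard k-ary fat-tree (k must be even)."""
    if k < 2 or k % 2 != 0:
        raise ValueError("k must be an even integer >= 2")
    half = k // 2
    return {
        "pods": k,
        "core_switches": half * half,       # (k/2)^2
        "aggregation_switches": k * half,   # k/2 per pod
        "edge_switches": k * half,          # k/2 per pod
        "hosts": k * half * half,           # k/2 hosts per edge switch
    }


if __name__ == "__main__":
    for name, count in fat_tree_sizes(4).items():
        print(f"{name}: {count}")
```

Under that assumption, K=4 means 4 pods, 4 core switches, 8 aggregation switches, 8 edge switches and 16 hosts.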

Current work
Make Selena publicly available (open source code):
- Improve scalability
- Multi-machine emulation
- Optimize guest-2-guest Xen communications
Use cases
- Evaluation of scenarios with layered OpenFlow controllers
- SDN coupling with workload consolidation
Publications
- Dimosthenis Pediaditakis, Charalampos Rotsos, Andrew W. Moore (University of Cambridge). Faithful Reproduction of Network Experiments. ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS), 2014.

Directions and Research Questions
- Can we exploit the NetFPGA10G platform and Bluespec OpenFlow switch to integrate R2D2 with resilient switchlets?
- Does datacenter mesh networking improve resilience, performance & power use?
- Can we architect a hierarchical control system for it?
- Can we use capabilities to identify communication flow?