The Synchronous Data Center


The Synchronous Data Center
Tian Yang, Robert Gifford, Andreas Haeberlen, Linh Thi Xuan Phan
Department of Computer and Information Science, University of Pennsylvania
HotOS XVII (May 14, 2019)

If trains were asynchronous…
- Station clocks would be at most loosely synchronized
- Congestion would appear at unpredictable times
- Station stops could take an arbitrary amount of time
- Trains would have arbitrary delays, and would often be lost entirely

The asynchrony assumption
System designers typically assume that:
- Clocks are at most loosely synchronized
- Network latencies are unpredictable
- Packets are often dropped in the network
- We don’t know much about node speeds
This is often a good idea!
- Sometimes we really don’t know (example: a system that spans multiple administrative domains)
- It is a nicely conservative assumption: if the system works in this model, it will almost certainly work on the actual hardware
- This is the “default”, and it is rarely questioned

Asynchrony can be expensive
But no time bounds can be given on anything, and this makes many things very difficult! Examples:
- Congestion control
- Fault detection
- Consistency
- Fighting tail latencies

It doesn’t have to be that way!
The train network is not asynchronous:
- Single administrative domain (like a data center!)
- Carefully scheduled; speeds and timings are (mostly) known
Not all distributed systems are, either! Example: cyber-physical systems (CPS)
- Clocks are closely in sync
- Network traffic is scheduled; hard latency bounds are known
- No congestion losses! (And transmission losses are rare)
- Node speeds and execution times are known exactly
CPS are mostly synchronous (out of necessity)!

So what?
Synchrony helps in two ways:
- Hard latency bounds -> we know how long we need to wait!
- The absence of a message at a particular time means something
How does that help us?
- No (surprising) congestion anymore
- Fault detection would be much easier (see the sketch after this list)
- Consistency would be easier to get
- Long latency tails would disappear
- Many algorithms become simpler, or even trivial (“boring”)
- Workloads with timing requirements can be supported (example: inflate an airbag when sensors detect a collision)
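As a concrete illustration (not from the talk itself): under synchrony, a missed heartbeat is proof of a crash rather than mere suspicion, so a perfect failure detector fits in a few lines. A minimal sketch in Python; all bound values and names here are hypothetical assumptions:

```python
import time

# Hypothetical system-wide bounds, known in a synchronous model:
HEARTBEAT_PERIOD = 0.010   # each node sends a heartbeat every 10 ms
MAX_NET_DELAY    = 0.002   # hard upper bound on one-way network latency
MAX_CLOCK_SKEW   = 0.0001  # hard upper bound on clock disagreement

# If a heartbeat is older than this, the sender *must* have crashed:
TIMEOUT = HEARTBEAT_PERIOD + MAX_NET_DELAY + MAX_CLOCK_SKEW

class PerfectFailureDetector:
    """In a synchronous system, a timeout is proof of a crash,
    not just a suspicion (as it would be under asynchrony)."""

    def __init__(self, nodes):
        now = time.monotonic()
        self.last_heartbeat = {n: now for n in nodes}

    def on_heartbeat(self, node):
        self.last_heartbeat[node] = time.monotonic()

    def crashed_nodes(self):
        now = time.monotonic()
        return [n for n, t in self.last_heartbeat.items()
                if now - t > TIMEOUT]
```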

Could a data center be synchronous?
At first glance, absolutely not! Some objections:
- The network is shared, so packet delays are unpredictable! (But: Fastpass, SIGCOMM’14)
- Who knows how long anything takes under Linux? (But: real-time operating systems)
- Clocks can’t be synchronized closely enough! (But: Spanner, OSDI’12)
Our claim: Most of the asynchrony in today’s data centers is avoidable!

Outline
- Goal: A synchronous data center
- How could it be done?
  - Network layer
  - Synchronized clocks
  - Building blocks
  - Hardware
  - Software
  - Scheduling

The How: Network layer
Why is latency so unpredictable? Cross-traffic and queueing!
Inspiration: Fastpass (SIGCOMM’14)
- Machines must ask an ‘arbiter’ for permission before sending
- The arbiter schedules packets (at >2 Tbit/s on eight cores!)
- Result: (almost) no queueing in the network!
Fastpass makes no attempt to control end-to-end timing, but we see no reason why this couldn’t be added! (A toy sketch of the arbiter idea follows.)
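To make the arbiter idea concrete, here is a toy sketch, not Fastpass’s actual algorithm: the arbiter tracks, per source and per destination, the earliest free timeslot, and grants each request the first slot at which both endpoints are idle. With at most one packet per endpoint per slot, queues cannot build up:

```python
from collections import defaultdict

class ToyArbiter:
    """Grants each packet a timeslot such that no source sends,
    and no destination receives, two packets in the same slot."""

    def __init__(self):
        self.src_free = defaultdict(int)  # earliest free slot per source
        self.dst_free = defaultdict(int)  # earliest free slot per destination

    def request(self, src, dst):
        # First slot at which both sender and receiver are idle.
        slot = max(self.src_free[src], self.dst_free[dst])
        self.src_free[src] = slot + 1
        self.dst_free[dst] = slot + 1
        return slot  # the sender transmits exactly in this slot

arb = ToyArbiter()
print(arb.request("A", "C"))  # slot 0
print(arb.request("B", "C"))  # slot 1: C is busy in slot 0
print(arb.request("B", "D"))  # slot 2: B is busy in slot 1
```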

The How: Synchronized clocks
Why are clocks so hard to synchronize?
- Hard to do in the wide area, or via NTP (with cross-traffic)
But it can be done:
- DTP (SIGCOMM’16) achieves nanosecond precision… with some help from the hardware (see Figure 6(a) of the DTP paper)
- Google Spanner (OSDI’12) keeps different data centers to within ~4 ms… with some help from atomic clocks (see Figure 6 of the Spanner paper)
Having predictable network latencies should help, too!
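Spanner’s approach can be summarized as exposing clock uncertainty explicitly: a TrueTime-style call returns an interval guaranteed to contain the true time, and the caller waits out the uncertainty before acting. A minimal sketch; the interface and the error bound below are assumptions for illustration, not Spanner’s code:

```python
import time
from dataclasses import dataclass

EPSILON = 0.004  # assumed bound on clock error (~4 ms, as on the slide)

@dataclass
class TTInterval:
    earliest: float
    latest: float

def tt_now():
    """TrueTime-style reading: the true time is guaranteed
    to lie somewhere inside the returned interval."""
    t = time.time()
    return TTInterval(t - EPSILON, t + EPSILON)

def commit_wait(commit_ts):
    """Wait until commit_ts is definitely in the past everywhere,
    so that timestamp order matches real-time order."""
    while tt_now().earliest <= commit_ts:
        time.sleep(0.0005)
```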

The How: Building blocks
[Figure: asynchronous building blocks plus a hard bound Tmax yield synchronous ones; examples: ordering, fault detection]
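One plausible reading of this slide, sketched here as an assumption rather than the authors’ construction: with a known delivery bound Tmax and synchronized clocks, total ordering needs no agreement protocol. A receiver holds each message until Tmax has elapsed since its send timestamp; by then, nothing with a smaller timestamp can still be in flight, so delivering in timestamp order is safe. (Clock skew is ignored here; it could be folded into Tmax.)

```python
import heapq
import time

T_MAX = 0.005  # assumed hard bound: a message sent at t arrives by t + T_MAX

class SynchronousOrderer:
    """Delivers messages in global timestamp order without any
    agreement protocol, relying only on the bound T_MAX."""

    def __init__(self):
        self.pending = []  # min-heap of (send_timestamp, seq, message)
        self.seq = 0       # tie-breaker for equal timestamps

    def on_receive(self, send_ts, msg):
        heapq.heappush(self.pending, (send_ts, self.seq, msg))
        self.seq += 1

    def deliverable(self):
        """Messages stamped earlier than now - T_MAX are safe to deliver:
        nothing with a smaller timestamp can still be in flight."""
        now = time.time()
        out = []
        while self.pending and self.pending[0][0] < now - T_MAX:
            out.append(heapq.heappop(self.pending)[2])
        return out
```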

The How: Software
Why is software timing so unpredictable?
Reason #1: Hardware features (caches, etc.)
- Not as bad as it seems: +/- 2% is possible (TDR, OSDI’14)
- Emerging features, such as Intel’s CAT, should help
- Meltdown/Spectre will probably accelerate this trend
Reason #2: OS structure
- Linux & friends are not designed for timing stability
- Idea from CPS: use elements from RT-OSes
- … but it will require deep structural changes! No small “synchrony patch” for Linux!
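For a sense of what “elements from RT-OSes” looks like today: stock Linux already exposes a few of them, such as CPU pinning and fixed-priority FIFO scheduling, which reduce timing jitter but fall far short of the structural changes the slide calls for. A sketch (Linux-only; typically requires root or CAP_SYS_NICE; the core number and priority are arbitrary):

```python
import os

pid = 0  # 0 means "the calling process"

# Pin the process to a single core to avoid migration-induced jitter.
os.sched_setaffinity(pid, {2})

# Fixed-priority FIFO scheduling: the process runs until it blocks,
# instead of being time-sliced by the default CFS scheduler.
os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(50))
```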

The How: Fault tolerance
What if things (inevitably) break? That could disrupt the careful synchronous “choreography”!
Challenge #1: Telling when something breaks
- Actually easier with synchrony!
Challenge #2: Doing something about it
- How do we reconfigure while maintaining timing guarantees?
- Idea from CPS: use mode-change protocols!
  - The system can operate in different “modes”, based on observed faults
  - It transitions from one mode to another via precomputed protocols
  - Result: timing is maintained during the transition
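A minimal sketch of the mode-change idea; all mode names, fault types, and schedule files below are hypothetical. The point is that the modes and legal transitions are computed and verified offline, so at runtime a fault triggers only a table lookup, never online re-scheduling:

```python
# Precomputed offline: the available modes and, for each (mode, fault)
# pair, the next mode. All names here are hypothetical.
MODES = {
    "normal":   {"schedule": "full_workload.sched"},
    "degraded": {"schedule": "shed_batch_jobs.sched"},
    "safe":     {"schedule": "critical_only.sched"},
}
TRANSITIONS = {
    ("normal",   "node_crash"): "degraded",
    ("normal",   "link_down"):  "degraded",
    ("degraded", "node_crash"): "safe",
}

class ModeManager:
    def __init__(self):
        self.mode = "normal"

    def on_fault(self, fault):
        # No online computation: the reaction was verified offline,
        # so timing guarantees hold *during* the transition.
        self.mode = TRANSITIONS.get((self.mode, fault), "safe")
        return MODES[self.mode]["schedule"]
```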

The How: Scheduling
Can you schedule an entire data center? Surprisingly, we are getting pretty good at it!
- Sparrow (SOSP’13) can schedule 100 ms tasks on 10,000s of cores
Idea from CPS: compositional scheduling
- Schedule smaller entities (nodes? pods?) in detail
- Abstract and aggregate, then schedule the next-larger entity
- Repeat until the entire system is scheduled
- Dispatching can be done locally; so can most updates
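A toy sketch of the compositional step, with interfaces and numbers made up for illustration: each node summarizes its task set as a single resource interface (here, just a CPU utilization), and the next level up admits those interfaces instead of reasoning about individual tasks. Real compositional frameworks use richer interfaces (e.g., periodic resource models), but the aggregation step has the same shape:

```python
def node_interface(tasks):
    """Abstract a node's task set into one number: its CPU utilization,
    where each task is a (worst-case execution time, period) pair."""
    return sum(wcet / period for wcet, period in tasks)

def schedulable(children_utils, capacity=1.0):
    """A parent entity admits its children if their aggregated
    interfaces fit within its own capacity."""
    return sum(children_utils) <= capacity

# Two nodes, each with (WCET, period) tasks:
node_a = node_interface([(2, 10), (1, 5)])   # utilization 0.4
node_b = node_interface([(3, 10), (2, 20)])  # utilization 0.4
# The pod schedules the two node interfaces, not the four tasks:
print(schedulable([node_a, node_b]))  # True: 0.8 <= 1.0
```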

Summary
Synchronous data centers seem possible!
- Reasons to be optimistic: Fastpass, DTP, RTOSes, …
There are interesting benefits to be had!
- Asynchrony creates or amplifies challenges like fault detection, congestion control, consistency, tail latencies, load balancing, performance debugging, algorithmic complexity, …
- These problems could become simpler, or go away entirely!
But much work remains to be done!
- There is not much existing work on DC-scale synchronous systems
- Can we adapt some ideas from cyber-physical systems?
Questions?