SEDA: An Architecture for Scalable, Well-Conditioned Internet Services

Slides:

Advertisements

Similar presentations

CS 443 Advanced OS Fabián E. Bustamante, Spring 2005 Resource Containers: A new Facility for Resource Management in Server Systems G. Banga, P. Druschel,

Advertisements

Study of Hurricane and Tornado Operating Systems By Shubhanan Bakre.

Flash: An efficient and portable Web server Authors: Vivek S. Pai, Peter Druschel, Willy Zwaenepoel Presented at the Usenix Technical Conference, June.

1 SEDA: An Architecture for Well- Conditioned, Scalable Internet Services Matt Welsh, David Culler, and Eric Brewer Computer Science Division University.

CS533 Concepts of Operating Systems Jonathan Walpole.

SEDA: An Architecture for Well- Conditioned, Scalable Internet Services Matt Welsh, David Culler, and Eric Brewer Computer Science Division University.

1 Web Server Performance in a WAN Environment Vincent W. Freeh Computer Science North Carolina State Vsevolod V. Panteleenko Computer Science & Engineering.

Concurrency, Thread and Event CS6410 Sept 6, 2011 Ji-Yong Shin.

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services Matt Welsh, David Culler, and Eric Brewer Computer Science Division University of.

Capriccio: Scalable Threads for Internet Services ( by Behren, Condit, Zhou, Necula, Brewer ) Presented by Alex Sherman and Sarita Bafna.

CS 3013 & CS 502 Summer 2006 Scheduling1 The art and science of allocating the CPU and other resources to processes.

Precept 3 COS 461. Concurrency is Useful Multi Processor/Core Multiple Inputs Don’t wait on slow devices.

Capriccio: Scalable Threads for Internet Services (von Behren) Kenneth Chiu.

Chapter 10: Stream-based Data Management Title: Design, Implementation, and Evaluation of the Linear Road Benchmark on the Stream Processing Core Authors:

Capriccio: Scalable Threads for Internet Services Rob von Behren, Jeremy Condit, Feng Zhou, Geroge Necula and Eric Brewer University of California at Berkeley.

“Why Events are a Bad Idea (For high-concurrency servers)” Paper by Rob von Behren, Jeremy Condit and Eric Brewer, May 2003 Presentation by Loren Davis,

CS533 Concepts of Operating Systems Class 2 Thread vs Event-Based Programming.

Adaptive Content Delivery for Scalable Web Servers Authors: Rahul Pradhan and Mark Claypool Presented by: David Finkel Computer Science Department Worcester.

Wk 2 – Scheduling 1 CS502 Spring 2006 Scheduling The art and science of allocating the CPU and other resources to processes.

Module 8: Monitoring SQL Server for Performance. Overview Why to Monitor SQL Server Performance Monitoring and Tuning Tools for Monitoring SQL Server.

Computer Science Cataclysm: Policing Extreme Overloads in Internet Applications Bhuvan Urgaonkar and Prashant Shenoy University of Massachusetts.

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services

SEDA – Staged Event-Driven Architecture

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services by, Matt Welsh, David Culler, and Eric Brewer Computer Science Division University.

Cloud Computing for the Enterprise November 18th, This work is licensed under a Creative Commons.

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services

Introduction and Overview Questions answered in this lecture: What is an operating system? How have operating systems evolved? Why study operating systems?

LiNK: An Operating System Architecture for Network Processors Steve Muir, Jonathan Smith Princeton University, University of Pennsylvania

Adaptive Overload Control for Busy Internet Servers Matt Welsh and David Culler USENIX Symposium on Internet Technologies and Systems (USITS) 2003 Alex.

1 Scheduling Processes. 2 Processes Each process has state, that includes its text and data, procedure call stack, etc. This state resides in memory.

LOGO Service and network administration Storage Virtualization.

©Ian Sommerville 2000 Software Engineering, 6th edition. Chapter 10Slide 1 Architectural Design l Establishing the overall structure of a software system.

1 Geospatial and Business Intelligence Jean-Sébastien Turcotte Executive VP San Francisco - April 2007 Streamlining web mapping applications.

Increasing Web Server Throughput with Network Interface Data Caching October 9, 2002 Hyong-youb Kim, Vijay S. Pai, and Scott Rixner Rice Computer Architecture.

Computer Science 1 Adaptive Overload Control for Busy Internet Servers Matt Welsh and David Culler USITS 2003 Presented by: Bhuvan Urgaonkar.

1 Admission Control and Request Scheduling in E-Commerce Web Sites Sameh Elnikety, EPFL Erich Nahum, IBM Watson John Tracey, IBM Watson Willy Zwaenepoel,

Empirical Quantification of Opportunities for Content Adaptation in Web Servers Michael Gopshtein and Dror Feitelson School of Engineering and Computer.

5204 – Operating Systems Threads vs. Events. 2 CS 5204 – Operating Systems Forms of task management serial preemptivecooperative (yield) (interrupt)

Measuring the Capacity of a Web Server USENIX Sympo. on Internet Tech. and Sys. ‘ Koo-Min Ahn.

Authors: Danhua Guo 、 Guangdeng Liao 、 Laxmi N. Bhuyan 、 Bin Liu 、 Jianxun Jason Ding Conf. : The 4th ACM/IEEE Symposium on Architectures for Networking.

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services Matt Welsh, David Culler, and Eric Brewer Computer Science Division University of.

Capriccio: Scalable Threads for Internet Service

By: Rob von Behren, Jeremy Condit and Eric Brewer 2003 Presenter: Farnoosh MoshirFatemi Jan

SEDA An architecture for Well-Conditioned, scalable Internet Services Matt Welsh, David Culler, and Eric Brewer University of California, Berkeley Symposium.

Threads versus Events CSE451 Andrew Whitaker. This Class Threads vs. events is an ongoing debate  So, neat-and-tidy answers aren’t necessarily available.

Threads. Readings r Silberschatz et al : Chapter 4.

An Efficient Threading Model to Boost Server Performance Anupam Chanda.

Paper Review of Why Events Are A Bad Idea (for high-concurrency servers) Rob von Behren, Jeremy Condit and Eric Brewer By Anandhi Sundaram.

SEDA. How We Got Here On Tuesday we were talking about Multics and Unix. Fast forward years. How has the OS (e.g., Linux) changed? Some of Multics.

Spark on Entropy : A Reliable & Efficient Scheduler for Low-latency Parallel Jobs in Heterogeneous Cloud Huankai Chen PhD Student at University of Kent.

Introduction to Operating Systems Concepts

Capriccio:Scalable Threads for Internet Services

Optimizing Distributed Actor Systems for Dynamic Interactive Services

SEDA: An Architecture for Scalable, Well-Conditioned Internet Services

Threads vs. Events SEDA – An Event Model 5204 – Operating Systems.

Operating Systems : Overview

Processes and Threads Processes and their scheduling

Capriccio – A Thread Model

Streaming Sensor Data Fjord / Sensor Proxy Multiquery Eddy

Operating Systems : Overview

Admission Control and Request Scheduling in E-Commerce Web Sites

Operating Systems : Overview

Operating Systems : Overview

CS533 Concepts of Operating Systems Class 7

Why Threads Are A Bad Idea (for most purposes)

Why Events Are a Bad Idea (for high concurrency servers)

Database System Architectures

Why Threads Are A Bad Idea (for most purposes)

Why Threads Are A Bad Idea (for most purposes)

SEDA: An Architecture for Well-Conditioned, Scalable Internet Services

Presentation transcript:

SEDA: An Architecture for Scalable, Well-Conditioned Internet Services Authors: Matt Welsh, David Culler, and Eric Brewer UC Berkeley Presented by: Yang Liu, University of Michigan EECS 582 – W16

About the Authors David E. Culler Professor at UC Berkeley Matt Welsh Tech Lead of Chrome Cloud team Previous professor in Harvard University Alumni including Mark Zuckerberg Eric Brewer Professor at UC Berkeley Author of CAP theorem Founder of USA.gov David E. Culler Professor at UC Berkeley Chair of UC Berkeley EECS EECS 582 – W16

Internet Services Characteristic Massive Concurrent Access 2001(paper published) Yahoo: 1.2 Billion page view/day AOL: Web service: 10 billions hits/day Now Service suffers from peak load (“Slashdot Effect”) Peak load is orders of magnitude greater than average Justin Bieber Problem on Instagram Every time Justin Bieber(56 Million Followers) post on Instagram, it brings the whole service slow News about 911overloaded many news sites Peak Load occurs when the service is most valuable Increasing Dynamic Majority services based on dynamic content E-commerce, Social Network, Stream Video, Google Map, Service Oriented Websites/application Service logic changes rapidly Facebook push changes twice a day on their website (2012) Services are hosted on general purpose facilities PaaS, IaaS EECS 582 – W16

Problem Identification Supporting Massive Concurrency is hard Threads/Process designed for timesharing High overhead and memory footprint Don’t scale to thousands of tasks Existing OS design do not provide graceful management of load Standard OS focus on providing maximum transparency Transparency prevent application from making informed decision Dynamic of services exaggerate these problems As services become more dynamic, this engineering burden is excessive Replication is not solving the problem Brings extra complication: Consistency Cannot gracefully handle peak load on Next, let’s take a further look on what people have tried to solve the grace full degreadation problem EECS 582 – W16

Saturation Point Saturation Point in the view of thread Thus, there are two regions of operation: Before saturation, adding more threads increase processor utilization linearly After saturation, processor utilization does not improve with more threads, but is limited by the switching overhead Next, let’s take a further look on what people have tried to solve the grace full degreadation problem EECS 582 – W16

Thread-Based Concurrency Pros: Easy multiprogramming paradigm, brings more concurrency Cons High resource usage, context switch overhead, Cache misses, contended locks Too many threads -> throughput meltdown Traditional solution: Thread pool Problem: How to decide thread pool size, especially upon different load Transparency hide resource contention Transparency hide resource contention: first transparency is from thread itself, which hide many details behind the scene, scheduling policy is based on general purpose may be falsely used, as we saw in EXOkenel and dataplane paper Secondly, the requests stream is also hided in each thread basis, so we have no control and no knowledge of how many requests are there throughout the system EECS 582 – W16

Event-driven Concurrency Pros: Yields efficient and scalable concurrency with explicit flow from the event queue Many examples: click router, Flash web server, etc (Still used in many service-oriented architecture today) Cons Little OS and tool support (when the paper is published) No isolation between FSMs Assume Non blocking I/O which may not be supported Hard to implement, hard to modulation, event scheduling is bounded with application logic Little OS and tool support Now: Signal/Slot in Qt and boost: signals2 Akka and Actor model in Scala/Java: Reactive programming OS: Not found, but Kafka and Golang channel can view as a support EECS 582 – W16

Call for new architecture Support Massive Concurrency Enable Introspection to handle Peak Load Dynamic control for self-tuning resource management Simplify task of building highly concurrent services New Design? EECS 582 – W16

SEDA: Staged Event Driven Architecture Decompose services into stages Each stage is a subset of request processing (Micro service) Each Stage internally contains an incoming queue, an event handler for application logic, a thread pool for stage execution and a controller for resource allocation and scheduling policy Queue and thread pool are controlled by controller Dynamic control grows/shrinks can be applied to the service without changing application logic Best of threads and events Programmability of threads with explicit flow of event The introduction of a queue between stages decouples their execution by introducing an explicit control boundary. This model constrains the execution of a thread to a given stage, as a thread may only pass data across the control boundary by enqueuing an event EECS 582 – W16

Queues for Control and Composition Queues are finite An enqueue behavior may fail Block on full queue -> backpressure Drop rejected events -> load shedding May also do alternative actions, e.g., degraded service Queue introduces explicit execution boundary Threads may only execute within a single stage Performance isolation, modularity, independent load management Explicit event delivery support inspection Trace flow of events through application Monitor queue lengths to detect bottleneck EECS 582 – W16

SEDA thread pool controller Goal: Determine ideal degree of concurrency for a stage Dynamically adjust number of threads allocated to each stage Avoid wasting threads when unneeded Controller operation Observes input queue length, adds threads if over threshold Idle threads removed from pool EECS 582 – W16

SEDA thread pool controller Goal: Schedule for low response time and high throughput Batching factor: number of events consumed by each thread Large batching factor → more locality, higher throughput Small batching factor → lower response time Attempt to find smallest batching factor with stable throughput Reduces batching factor when throughput high, increases when low EECS 582 – W16

 Conclusion on SEDA SEDA Staged Event Driven Architecture Support Massive Concurrency Enable Introspection to handle Peak Load Dynamic control for self-tuning resource management Simplify task of building highly concurrent services  A combination of Event Driven and multi thread Event queue allow inspection of request streams Can perform filter, aggregate during peak load Feed-back control based on event queue Decouple load management from service complexity Stages for modulation SEDA Staged Event Driven Architecture EECS 582 – W16

Apply SEDA for Asynchronous Socket I/O: Sandstorm Asynchronous sockets layer performance Read stage Read network packets Responds to user requests Write stage Write packets to network Established new outgoing connections Listen stage Accept new TCP connections Response to user requests EECS 582 – W16

Apply SEDA for web server design: Haboob Measured static file load from SpecWEB99 benchmark Realistic, industry-standard benchmark 1 to 1024 clients making repeated requests, think time 20ms Total fileset size is 3.31 GB ; page sizes range from 102 Bytes to 940 KB Maintains memory cache of recently accessed pages (200 MB) Significant fraction of page accesses require disk I/O Comparison with Apache and Flash Apache: Process-based concurrency, 150 processes Does not accept new TCP connections when all processes busy Flash (Vivek Pai, Princeton): Event-driven w/ 4 processes . Accepts only 506 simultaneous connections due to fd limits EECS 582 – W16

Haboob: measurement 4-way Pentium III 500 MHz, Gigabit Ethernet, 2 GB RAM, Linux 2.2.14, IBM JDK 1.3 SEDA throughput 10% higher than Apache and Flash (which are in C!) Some degradation due to Linux socket inefficiencies Apache accepts only 150 clients at once - no overload despite thread model But as we will see, this penalizes many clients EECS 582 – W16

Measure Fairness: Jain Fairness Measurement 𝑓(𝑥): fairness in term of clients and their associated requests 𝑁 : number of clients 𝑋 𝑖 : number of requests for each of N clients EECS 582 – W16

Haboob: measurement SEDA yields predicated performance, whereas apache and Flash are very unfair EECS 582 – W16

Discussion: Graceful degradation The DQ Principle from Eric A. Brewer D*Q == constant D: data per second Q: queries per second Can we apply DQ principle with SEDA? data_per_query * queries_per_second == constant EECS 582 – W16

Discussion: Review from the author A Retrospective on SEDA by Matt Welsh, 2010 on his blog Historical limitation Linux threads were suffering a lot of scalability problems What SEDA got Wrong Most critical is the idea of connecting stages through event queues, with each stage having its own separate thread pool. Solve? Decouple queues and thread pools with stages yet still organized the code into stages Never satisfied with Non-blocking I/O, spent a lot of time tuning parameters What SEDA got Right using Java, instead of C The most important contribution of SEDA, I think, was the fact that we made load and resource bottlenecks explicit in the application programming model. Linux threads were suffering a lot of scalability problems, so it was best to avoid using too many of them The most critical is the idea of connecting stages through event queues, with each stage having its own separate thread pool. As a request passes through the stage graph, it experiences multiple context switches, and potentially long queueing at busy stages. This can lead to poor cache behavior and greatly increase response time. Note that under reasonably heavy load, the context switch overhead is amortized across a batch of requests processed at each stage, but on a lightly (or moderately) loaded server, the worst case context switching overhead can dominate. I probably spent more time tuning the sockets library than any other part of the design. (It did not surprise me to learn that people trying to run Sandstorm on different JVMs and threading libraries had trouble getting the same performance: I found those parameters through trial-and-error.) The fact that SEDA never included proper nonblocking disk I/O was disappointing, but this just wasn't available at the time (and I decided, wisely, I think, not to take it on as part of my PhD.) EECS 582 – W16

Discussions My thoughts: SEDA is a useful server design model, can be viewed as a combination of Feedback Controller + staged modulation + event(message) flow + application metrics on load(event queue) Some aspect of it have been Adapted by message-driven Architecture like AKKA, Distributed Message Queue like Kafka, stream processing like Storm, and dataflow model like MapReduce/Tez/Spark SEDA + Scale out distributed system design+ load balancer can perform elastic load handling Controller model with metrics on load is adapted as industrial standard to improve single machine throughput Tech talk from Facebook, Tencent, Alibaba and Baidu shows they all use PID control for high throughput (citation needed) EECS 582 – W16

Summary Server Design paradigm Concepts Lessons learnt Multi threaded: thread per socket or thread pool Event Driven Architecture Staged Event Driven Architecture Concepts Saturation point Jain Fairness Index DQ principle / Graceful Degradation Lessons learnt Modulation is not only ease for programming but can bring flexibility through decoupling There is always a trade off between Transparency and Performance: Transparency means hiding information, but this can be avoid through system design EECS 582 – W16

Q & A Thanks EECS 582 – W16

References SEDA paper & slides: http://www.eecs.harvard.edu/~mdw/proj/seda/ Instagram Justin Bieber Problem: http://www.wired.com/2015/11/how-instagram-solved- its-justin-bieber-problem/ Facebook push code frequency: http://commencement.umich.edu/spring- commencement/spring-commencement/ Event Driven and SOA: https://msdn.microsoft.com/en-us/library/dd129913.aspx Event Driven with Akka: https://blog.openshift.com/building-distributed-and-event-driven- applications-in-java-or-scala-with-akka-on-openshift/ Saturation point: http://www.inf.ed.ac.uk/teaching/courses/pa/Notes/lecture09- multithreading.pdf DQ principle: http://www.cs.berkeley.edu/~brewer/papers/GiantScale-IEEE.pdf DQ principle Picture: https://www.youtube.com/watch?v=nSwraiJSQj8 EECS 582 – W16

References Blog about SEDA: http://muratbuffalo.blogspot.com/2011/02/seda-architecture-for-well- conditioned.html SEDA thesis from Matt: http://www.eecs.harvard.edu/~mdw/papers/mdw-phdthesis.pdf AKKA: http://akka.io/ Kafka: http://kafka.apache.org/ Storm: http://storm.apache.org/ Dataflow: http://googlecloudplatform.blogspot.co.uk/2016/01/Dataflow-and-open-source- proposal-to-join-the-Apache-Incubator.html Blogs About SEDA: http://www.infoq.com/articles/SEDA-Mule Blogs About SEDA: http://www.theserverside.com/news/1363672/Building-a-Scalable- Enterprise-Applications-Using-Asynchronous-IO-and-SEDA-Model EECS 582 – W16