SEDA: An Architecture for Well-Conditioned, Scalable Internet Services Matt Welsh, David Culler, and Eric Brewer Computer Science Division University of California, Berkeley
Scalable, Fair Services Internet servers handle extremely high loads Yahoo! receives 1.2 billion requests daily AOL web cache services 10 billion hits daily Desire efficiency, consistency, and fairness Scalability through replication Load may be extremely variable “Slashdot effect” == 100-fold increase Replication not enough
Goal General framework for highly concurrent and well-conditioned services, that handles wide variations of load gracefully factors out details of concurrency and resource management Framework should be useful in many contexts Servers, routers, etc.
SEDA Staged, Event-Driven Architecture Building block is the Stage Handler processes queued, incoming events Outputs placed on next stages’ queues Multiple threads may service a stage Dynamic Resource Controllers Observe operating conditions Respond by altering resource usage Useful for responding to high load
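The stage structure described on this slide can be illustrated with a minimal sketch. It assumes a stage is an incoming event queue, a handler function, and a small pool of threads; the names `Stage`, `run_pipeline`, `enqueue`, and `drain` are hypothetical and are not taken from the Sandstorm API.

```python
import queue
import threading


class Stage:
    """Hypothetical SEDA-style stage: event queue + handler + thread pool."""

    _STOP = object()  # sentinel that tells a worker thread to exit

    def __init__(self, name, handler, num_threads=2):
        self.name = name
        # handler(event) returns a list of (next_stage, output_event) pairs
        self.handler = handler
        self.events = queue.Queue()  # queued, incoming events
        self.threads = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(num_threads)
        ]

    def start(self):
        for t in self.threads:
            t.start()

    def enqueue(self, event):
        self.events.put(event)

    def _run(self):
        while True:
            event = self.events.get()
            if event is Stage._STOP:
                self.events.task_done()
                return
            try:
                # Outputs are placed on the queues of the next stages.
                for next_stage, out in self.handler(event):
                    next_stage.enqueue(out)
            finally:
                self.events.task_done()

    def drain(self):
        """Block until every event enqueued so far has been handled."""
        self.events.join()

    def stop(self):
        for _ in self.threads:
            self.events.put(Stage._STOP)
        for t in self.threads:
            t.join()


def run_pipeline(requests):
    """Two-stage example: an upper-casing stage feeding a collecting stage."""
    results = []
    lock = threading.Lock()

    def collect(event):
        with lock:
            results.append(event)
        return []  # last stage: nothing to forward

    sink = Stage("sink", collect, num_threads=1)
    front = Stage("upper", lambda e: [(sink, e.upper())], num_threads=2)
    sink.start()
    front.start()
    for r in requests:
        front.enqueue(r)
    front.drain()  # all outputs are now on the sink's queue
    sink.drain()
    front.stop()
    sink.stop()
    return sorted(results)
```

Because stages communicate only through queues, the number of threads per stage can be changed without touching the handlers, which is what gives a resource controller something to adjust.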
Background Well-conditioned service: Behaves like a simple pipeline load > capacity: throughput saturates at its maximum and response time increases linearly Two common models: Thread-based concurrency Event-driven server
Thread-based Concurrency
Thread-based Concurrency Benefits Simple, well-supported programming model Drawbacks High overheads as # of threads increases Cache & TLB misses, scheduling, lock contention Virtualizes resources for multiprogramming
Threaded Server Throughput Degradation
Event-driven Server
Event-driven Server Benefits Robust to high loads More direct control of resources Drawbacks Harder to program: concerned with scheduling and event partitioning
Event-driven Server Throughput
SEDA Support massive concurrency Simplify construction of well-conditioned services Enable introspection Support self-tuning resource management Key mechanisms are event-driven stages and resource controllers
A SEDA Stage
SEDA HTTP Server: Haboob
SEDA Resource Controllers
SEDA Thread Pool Controller
SEDA Batching Controller
Sandstorm Implementation of SEDA in Java Benchmark performance is good (IBM JDK 1.3.0) Advanced features (e.g. GC) helpful Many general-purpose features (queue and thread management, profiling, etc.) Relies on async I/O package Better performance than blocking ops + threads Roughly 20k LOC
Async Sockets Layer
Async File Layer Bounded thread pool to perform standard I/O ops read, write, seek, stat, close One thread per file (ensures serial access) Number of threads adjusted by resource controller
SEDA HTTP Server: Haboob
Haboob Performance Compare with Apache and Flash Measure throughput and response time Calculate fairness: $f(x_1, x_2, \ldots, x_N) = \left(\sum_{i=1}^{N} x_i\right)^2 / \left(N \sum_{i=1}^{N} x_i^2\right)$ where $x_i$ = # of requests completed for each of N clients over some period of time Measure behavior under high load
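The fairness metric on this slide can be computed in a few lines. The sketch below reads the formula as the squared sum of per-client completions divided by N times the sum of their squares (the form known as Jain's fairness index). The function name `jain_fairness` and the convention of returning 1.0 when no client completed any request are assumptions made for illustration.

```python
def jain_fairness(completed):
    """Fairness of per-client completion counts; 1.0 means perfectly even."""
    n = len(completed)
    if n == 0:
        raise ValueError("need at least one client")
    total = sum(completed)
    sum_sq = sum(x * x for x in completed)
    if sum_sq == 0:
        # Assumption: if nobody completed anything, all clients were
        # treated identically, so report perfect fairness.
        return 1.0
    return (total * total) / (n * sum_sq)
```

With this definition, an even split across all clients gives 1.0, and if only k of N clients receive an equal share while the rest receive nothing, the result is k/N.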
Configuration 4-way SMP 500 MHz Pentium III 2 GB RAM, Linux 2.2.14, Gigabit Ethernet One server and 32 load generators Flash & Haboob run with 200 MB page cache Offered load derived from SPECweb99 3.31 GB file set size
Haboob Throughput
Haboob Response Times 1024 clients
Haboob Performance
Response Time Controller 1024 clients
Gnutella Packet Router 3-stage architecture GnutellaServer accepts TCP connections GnutellaRouter performs routing GnutellaCatcher manages connections to network Max throughput 20,000 packets/s Real-network run of 37 hours: Processed 24.8 million packets (avg. 179 packets/s) Received 72,000 connections (avg. 12 simultaneous)
Performance Challenges Protection from slow sockets Need to threshold queue size of slow connections Close connection on exceeding threshold Load conditioning Introduced bottleneck in query packets (15% of traffic mix) Add threads to GnutellaRouter as offered load increases (queue size > 1000 packets)
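The two policies on this slide can be sketched as simple threshold checks. The 1000-packet queue threshold comes from the slide; the class name `ThreadPoolController`, the thread cap `max_threads`, the function `should_close`, and its default limit of 500 pending packets are hypothetical values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ThreadPoolController:
    """Hypothetical controller: grow a stage's pool while its queue is long."""

    queue_threshold: int = 1000  # from the slide: queue size > 1000 packets
    max_threads: int = 20        # assumed cap, not stated in the source
    threads: int = 1

    def observe(self, queue_length):
        """Call periodically with the stage's current queue length."""
        if queue_length > self.queue_threshold and self.threads < self.max_threads:
            self.threads += 1  # add one thread per observation while overloaded
        return self.threads


def should_close(pending_packets, max_pending=500):
    """Slow-socket protection: close once the outgoing queue exceeds the limit."""
    return pending_packets > max_pending
```

Adding one thread per observation, rather than jumping straight to the cap, is a design choice in this sketch: it lets the controller stop growing the pool as soon as the queue falls back under the threshold.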
Discussion Measurement and control vs. reservation Mechanisms for detecting overload Policies to deal with overload SEDA ease of programming Less worry about synchronization & race conditions than thread-per-request Less complex and “soft” than events Directions for OS design?
Contributions SEDA: Staged, Event-Driven Architecture Applications consist of connected stages each serviced by one or more threads Dynamic Resource Controllers examine and react to high load conditions and control thread usage SEDA implementation and applications Sandstorm implementation written in Java Haboob webserver Gnutella packet router