Stela: Enabling Stream Processing Systems to Scale-in and Scale-out On-demand
Boyang Peng, Le Xu, Indranil Gupta
University of Illinois at Urbana-Champaign
Distributed Protocol Research Group (DPRG)
Contributions
First work to describe and implement on-demand elasticity within Storm
Development of a novel metric, Expected Throughput Percentage (ETP)
Evaluation of our system on micro-benchmark applications as well as on applications used in production by Yahoo!
Data Processing Model
Directed acyclic graph (DAG) of operators
Operators are stateless
An instance (of an operator) is an instantiation of the operator’s processing logic and the physical entity that executes the operator’s logic
Expected Throughput Percentage (ETP)
The impact (as a percentage) that each operator has on the application throughput
Component 1 has an ETP of 0
Component 3 has an ETP of 1000/4500 = 2/9 (note that component 6 is congested)
Find ETP for each operator
function FINDETP(o)
    if o.child = null then
        return ProcessingRateMap.get(o) // o is a sink
    end if
    SubtreeSum ← 0
    for each descendant child ∈ o do
        if child.congested = true then
            continue // if the child is congested, give up the subtree rooted at that child
        else
            SubtreeSum += FINDETP(child)
        end if
    end for
    return SubtreeSum
end function
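The traversal above can be sketched in Python. This is a minimal illustration, not Stela's implementation: the `Operator` class, the `find_etp` name, the topology, and the rates are all hypothetical. Dividing the returned sum by the total sink throughput is an assumption inferred from the 1000/4500 example on the ETP slide.

```python
from dataclasses import dataclass, field
from fractions import Fraction


@dataclass
class Operator:
    # Hypothetical node type for this sketch; not Stela's actual data structure.
    name: str
    congested: bool = False
    children: list = field(default_factory=list)


def find_etp(op, processing_rate):
    """Sum the processing rates of the sinks reachable from `op`,
    skipping every subtree rooted at a congested child."""
    if not op.children:  # op is a sink
        return processing_rate[op.name]
    subtree_sum = 0
    for child in op.children:
        if child.congested:
            continue  # give up the subtree rooted at a congested child
        subtree_sum += find_etp(child, processing_rate)
    return subtree_sum


# Made-up topology: spout -> a -> sink_a, and spout -> b (congested) -> sink_b.
sink_a = Operator("sink_a")
sink_b = Operator("sink_b")
a = Operator("a", children=[sink_a])
b = Operator("b", congested=True, children=[sink_b])
spout = Operator("spout", children=[a, b])

# Illustrative sink processing rates in tuples per second.
processing_rate = {"sink_a": 1000, "sink_b": 3500}
total = sum(processing_rate.values())

# Assumed normalization: reachable sink throughput over total sink throughput.
etp_a = Fraction(find_etp(a, processing_rate), total)
etp_spout = Fraction(find_etp(spout, processing_rate), total)
```

In this made-up topology the spout only gets credit for `sink_a`, because the path to `sink_b` passes through the congested operator `b`.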
Stela Scale-out
Compute N, the number of instances to add on the new machines:
    N = (# of new machines) * (current instance count) / (current machine count)
For each of the N instance slots:
    Pick the component C with the highest ETP
    Update C’s execution rate, assuming the new instance is assigned to C
    Update all components’ ETPs
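The scale-out loop can be sketched as follows. Several details are assumptions made only for illustration and are not stated on the slide: an operator counts as congested when its input rate exceeds its execution rate, only congested operators are candidates (falling back to all operators if none is congested), execution rate grows linearly with instance count, input rates stay fixed, and N is rounded down. All function names, dictionaries, and numbers are hypothetical.

```python
def find_etp(op, children, congested, sink_rate):
    """Sink throughput reachable from `op` without crossing a congested child."""
    if not children[op]:  # op is a sink
        return sink_rate[op]
    return sum(find_etp(c, children, congested, sink_rate)
               for c in children[op] if c not in congested)


def plan_scale_out(children, input_rate, exec_rate, instances, sink_rate,
                   new_machines, current_machines):
    """Return the operators that receive the N new instances, in order."""
    exec_rate = dict(exec_rate)  # work on copies; the caller's dicts are untouched
    instances = dict(instances)
    # N = new machines * current instance count / current machine count (floored here).
    n_slots = new_machines * sum(instances.values()) // current_machines
    plan = []
    for _ in range(n_slots):
        # Assumed congestion test: input arrives faster than it is processed.
        congested = {o for o in children if input_rate[o] > exec_rate[o]}
        candidates = congested or set(children)
        etp = {o: find_etp(o, children, congested, sink_rate) for o in candidates}
        target = max(sorted(etp), key=etp.get)  # sorted() makes ties deterministic
        plan.append(target)
        # Assumed projection: execution rate scales linearly with instance count.
        exec_rate[target] *= (instances[target] + 1) / instances[target]
        instances[target] += 1
    return plan


# Made-up topology and rates (tuples per second).
children = {"spout": ["a", "b"], "a": ["sink_a"], "b": ["sink_b"],
            "sink_a": [], "sink_b": []}
input_rate = {"spout": 0, "a": 2000, "b": 3000, "sink_a": 1000, "sink_b": 1000}
exec_rate = {"spout": 5000, "a": 1000, "b": 2500, "sink_a": 2000, "sink_b": 5000}
instances = {name: 1 for name in children}
sink_rate = {"sink_a": 1000, "sink_b": 2500}

plan = plan_scale_out(children, input_rate, exec_rate, instances, sink_rate,
                      new_machines=2, current_machines=5)
```

With five instances on five machines and two new machines, N is 2. The first slot goes to `b`, whose reachable sink throughput is highest; once `b` is no longer congested under the assumed projection, the second slot goes to `a`.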
Stela Scale-in
Find the ETPSum for every machine
Remove the machine with the lowest ETPSum
Schedule the instances from the removed machine round-robin over the remaining machines, starting with the machine with the lowest ETPSum
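A minimal sketch of this scale-in step, under two assumptions: each instance inherits the ETP of its operator, and a machine's ETPSum is the sum over the instances it hosts. The `plan_scale_in` function, the placement, and the ETP values are hypothetical.

```python
def plan_scale_in(placement, etp):
    """placement: machine -> list of operator names, one entry per hosted instance.
    etp: operator name -> ETP.
    Returns the removed machine and the placement after rescheduling."""
    if len(placement) < 2:
        raise ValueError("need at least two machines to remove one")
    etp_sum = {m: sum(etp[op] for op in ops) for m, ops in placement.items()}
    # Ascending ETPSum; the machine name breaks ties deterministically.
    order = sorted(etp_sum, key=lambda m: (etp_sum[m], m))
    victim, targets = order[0], order[1:]
    new_placement = {m: list(placement[m]) for m in targets}
    # Round-robin over the remaining machines, lowest ETPSum first.
    for i, op in enumerate(placement[victim]):
        new_placement[targets[i % len(targets)]].append(op)
    return victim, new_placement


# Made-up placement and ETP values.
placement = {"m1": ["a", "b"], "m2": ["c"], "m3": ["d", "e", "f"]}
etp = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.0, "e": 0.0, "f": 0.0625}

victim, new_placement = plan_scale_in(placement, etp)
```

Here `m3` has the lowest ETPSum (0.0625) and is removed; its three instances are dealt out to `m2`, then `m1`, then `m2` again.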
Overview of Storm
Nimbus node (master node, similar to the Hadoop JobTracker):
    Uploads computations for execution
    Distributes code across the cluster
    Launches workers across the cluster
    Monitors computation and reallocates workers as needed
ZooKeeper nodes – coordinate the Storm cluster
Supervisor nodes – communicate with Nimbus through ZooKeeper; start and stop workers according to signals from Nimbus
Overview of Storm
Five key abstractions help to understand how Storm processes data:
Tuples – an ordered list of elements. For example, a “4-tuple” might be (7, 1, 3, 7)
Streams – an unbounded sequence of tuples
Spouts – sources of streams in a computation (e.g. a Twitter API)
Bolts – process input streams and produce output streams. They can run functions; filter, aggregate, or join data; or talk to databases
Topologies – the overall calculation, represented visually as a network of spouts and bolts (as in the following diagram)
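The five abstractions can be mimicked with plain Python generators. This is only a toy model to make the vocabulary concrete; it does not use Storm's API, and every name in it is hypothetical.

```python
from itertools import islice


def number_spout():
    """Spout: the source of an unbounded stream of 1-tuples."""
    n = 0
    while True:
        yield (n,)
        n += 1


def square_bolt(stream):
    """Bolt that runs a function: emits (n, n squared) for each input tuple."""
    for (n,) in stream:
        yield (n, n * n)


def even_filter_bolt(stream):
    """Bolt that filters: keeps only tuples whose first element is even."""
    for tup in stream:
        if tup[0] % 2 == 0:
            yield tup


def topology():
    """Topology: the network spout -> square_bolt -> even_filter_bolt."""
    return even_filter_bolt(square_bolt(number_spout()))


# The stream is unbounded, so take only the first three output tuples.
first_three = list(islice(topology(), 3))
```

Each generator stands in for one component, and chaining them plays the role of wiring a topology; a real Storm topology would run these components as parallel instances across workers.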
Implementation
Evaluation
Experimental Setup (Emulab)
Ubuntu 12.04
100 Mbps VLAN connecting all machines
PC 3000: 3 GHz dual-core processor, 2 GB of memory, 10,000 RPM 146 GB SCSI disks
D710: 2.4 GHz 64-bit quad-core processor, 12 GB of memory, 750 GB SATA disks
Micro-benchmark Experiments
Micro-benchmark Experiments
Stela 65% better
Stela 45% better
Stela 120% better
Yahoo Topologies
Convergence Time
Scale-in Experiments
Stela achieves 87.5% and 75% less downtime
Scale-in Experiments: Yahoo PageLoad Topology
Summary
For scale-out, Stela achieves throughput that is 45-120% higher than Storm’s and reduces interruption to 12.5%
For scale-in, Stela performs 40-500% better
Questions?