Online parameter optimization for elastic data stream processing

Presentation transcript:

Online parameter optimization for elastic data stream processing
Heinze et al.
Scribed by Tao Feng, 02/18/2016

Recap
- Introduces an elastic scaling stream processing system that achieves an efficient balance between monetary cost and quality of service (latency)
- Threshold-based elastic scaling
- Runtime and automated operator placement
- Online parameter optimization
  - Parameter search scheme
  - Cost function
  - Online profiler
  - When to trigger the optimization
  - Size of utilization history
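The recap items on threshold-based scaling, the cost function, and the parameter search can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `Thresholds`, `decide`, `simulate_cost`, and `best_thresholds`, the one-host-per-step scaling rule, the per-host capacity, and the overload penalty are hypothetical and are not taken from the paper, whose actual cost function and search scheme are not reproduced in these slides.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Thresholds:
    lower: float  # scale in when utilization falls below this value
    upper: float  # scale out when utilization rises above this value


def decide(utilization: float, t: Thresholds) -> str:
    """Threshold-based scaling decision for one utilization measurement."""
    if utilization > t.upper:
        return "scale_out"
    if utilization < t.lower:
        return "scale_in"
    return "hold"


def simulate_cost(history, t, capacity_per_host=100.0, overload_penalty=5.0):
    """Replay a load history and return a hypothetical cost:
    hosts used per step (monetary cost) plus a penalty per overloaded step
    (a stand-in for a latency violation)."""
    hosts = 1
    cost = 0.0
    for load in history:
        utilization = load / (hosts * capacity_per_host)
        action = decide(utilization, t)
        if action == "scale_out":
            hosts += 1
        elif action == "scale_in" and hosts > 1:
            hosts -= 1
        cost += hosts
        if load > hosts * capacity_per_host:
            cost += overload_penalty
    return cost


def best_thresholds(history, lowers, uppers):
    """Exhaustive search: pick the threshold pair with the lowest replayed cost."""
    candidates = [Thresholds(l, u) for l, u in product(lowers, uppers) if l < u]
    return min(candidates, key=lambda t: simulate_cost(history, t))


if __name__ == "__main__":
    recent_load = [40, 60, 90, 150, 170, 120, 60, 30]  # made-up utilization history
    print(best_thresholds(recent_load, [0.2, 0.3, 0.4], [0.7, 0.8, 0.9]))
```

Replaying a recent utilization history against each candidate configuration is one simple way to score parameters online; the length of that history and the moment the search is triggered correspond to the last two recap items.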

Evaluation
- Trade-off between monetary cost and quality of service
- Manually tuned parameters
- Optimized parameters (contribution of this paper)
- Parameters learned by reinforcement learning

Pros & Cons: Pros
- Careful and extensive comparison of the trade-offs of the three baselines under different parameter values
- Simple architecture and intuitive algorithms for elastic scaling and parameter search
- The framework can be applied to other types of elastic scaling stream processing systems
- The prototype scales linearly with the number of queries and the window size
- …

Pros & Cons: Cons
- Cost function seems self-provable
- MAX is predefined
- The overload handling could be improved or explained in more detail
- The evaluation is based on the authors' own system
  - What would the situation be on a system used in industry (Storm)?
- Does not discuss load bursts that are much larger than the regular workload
- Users still need to define the latency threshold
- Reinforcement learning is a black box

Comments & Questions
- Why is monetary cost the only cost considered? We could also focus on energy as a cost, etc.
- What advantage did we get from making the upper and lower threshold granularities so small (which led to several thousand parameter configurations)?
- Besides CPU, network, and memory consumption, are there any other resources that should also be considered?
- Why can't memory and network overload be parametrized?
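The granularity question can be made concrete with a small count of how fast the search space grows as the step size shrinks. The threshold ranges and step sizes below are assumptions chosen for illustration (percentages, lower threshold between 10 and 50, upper threshold between 60 and 100); they are not the ranges used in the paper.

```python
def count_configurations(lowers, uppers):
    """Number of (lower, upper) threshold pairs with lower < upper."""
    return sum(1 for low in lowers for up in uppers if low < up)


# Thresholds expressed as integer percentages to avoid floating-point steps.
coarse = count_configurations(range(10, 51, 10), range(60, 101, 10))  # 10% steps
fine = count_configurations(range(10, 51, 1), range(60, 101, 1))      # 1% steps

if __name__ == "__main__":
    print(coarse, fine)
```

Under these assumed ranges, moving from 10% to 1% steps grows the space from 25 to 1681 pairs, and each further tunable parameter multiplies the count again, which is how a fine grid reaches several thousand configurations.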