E-Store: Fine-Grained Elastic Partitioning for Distributed Transaction Processing Systems Jihui Yang CS525 Advanced Distributed System March 1, 2016.

Online transaction processing systems: OLTP systems are a popular class of data processing systems in today's enterprises. Examples include order entry, retail sales, and financial transaction systems. In large applications, efficient OLTP may depend on sophisticated transaction management software and/or database optimization tactics to handle large numbers of concurrent updates to an OLTP-oriented database.

Challenges OLTP faces: Many online transaction processing (OLTP) applications are subject to unpredictable variations in demand, so it is important for the back-end database management system (DBMS) to be resilient to load spikes. There are several kinds of workload skew, each requiring a different solution: hot spots, time-varying skew, load spikes, and the hockey stick effect. The system needs to adapt to unanticipated changes in an application’s workload while continuing to guarantee ACID properties.

E-Store: Previous work has only been able to migrate large chunks of data. E-Store is a comprehensive framework that, instead of monitoring and migrating data at the granularity of pre-defined chunks, dynamically alters chunks by extracting their hot tuples, which are treated as separate “singleton” chunks. Its components are a monitoring tool that identifies tuple-level access patterns, a two-tier data placement scheme, and integration with the H-Store DBMS.
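The two-tier placement idea above can be sketched as a lookup structure. This is a minimal illustration, not E-Store's actual code: the class name `TwoTierPlacement`, its methods, and the range-based layout of cold blocks are all assumptions made for the example. Hot tuples are mapped individually as "singleton" chunks, and every other key falls back to the coarse block that contains it.

```python
import bisect


class TwoTierPlacement:
    """Hypothetical two-tier lookup: hot tuples individually, cold tuples by block."""

    def __init__(self, block_starts, block_partitions):
        # block_starts[i] is the first key of cold block i (sorted ascending);
        # block_partitions[i] is the partition that holds that block.
        self.block_starts = list(block_starts)
        self.block_partitions = list(block_partitions)
        # Tier for hot tuples: key -> partition, one entry per "singleton" chunk.
        self.hot = {}

    def promote(self, key, partition):
        """Extract a hot tuple from its block and place it on its own."""
        self.hot[key] = partition

    def lookup(self, key):
        """Return the partition for a key, checking the hot tier first."""
        if key in self.hot:
            return self.hot[key]
        # Find the last block whose start key is <= key.
        i = bisect.bisect_right(self.block_starts, key) - 1
        if i < 0:
            raise KeyError(key)
        return self.block_partitions[i]
```

For instance, after `promote(1500, 2)` only key 1500 is redirected; its neighbors in the same block keep resolving through the block tier.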

E-Store Framework

E-Monitor: Adaptive partition monitoring. Problem: 1. Examining transactions’ access patterns based on recent log activity introduces delay. 2. Monitoring the usage of individual tuples in every transaction is expensive. Solution: a two-phase monitoring strategy.

E-Monitor: Adaptive partition monitoring. Phase 1: Collecting system-level metrics. 1. Retrieve node utilization from the last 60 sec. 2. A pre-set watermark triggers tuple-level monitoring. Phase 2: Tuple-level monitoring. 1. Find the top-k most frequently accessed tuples within time t. 2. Generate a global top-k list. This list is used to build a reconfiguration plan.
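The two phases above can be sketched roughly as follows. This is an illustrative sketch only: the function names, the single `high_watermark` threshold of 0.9, and the use of CPU utilization as the node metric are assumptions, not details taken from the slides. Note that merging per-partition top-k lists is approximate, since a tuple that is moderately hot everywhere may miss every local list.

```python
from collections import Counter


def needs_tuple_monitoring(utilization_by_node, high_watermark=0.9):
    """Phase 1: trigger detailed monitoring if any node exceeds the watermark."""
    return any(u > high_watermark for u in utilization_by_node.values())


def local_top_k(accessed_keys, k):
    """Phase 2, per partition: count accesses seen during the monitoring window."""
    return Counter(accessed_keys).most_common(k)


def global_top_k(per_partition_lists, k):
    """Phase 2, cluster-wide: merge the local lists into one global top-k list."""
    merged = Counter()
    for local_list in per_partition_lists:
        for key, count in local_list:
            merged[key] += count
    return merged.most_common(k)
```

Keeping phase 1 cheap and running phase 2 only after the watermark is crossed is what lets tuple-level tracking stay off during normal operation.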

E-Planner: Provisioning algorithms. Scaling cluster size up/down? Two-Tiered Bin Packing and One-Tiered Bin Packing both use an integer programming model to generate an optimal assignment of tuples to partitions. They are time consuming and not practical for real-world deployments.

E-Planner: Provisioning algorithms. Scaling cluster size up/down? Greedy, Greedy Extended, and First Fit produce reasonably high-quality partition plans in a much shorter timeframe.
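To give a feel for why a heuristic planner is fast, here is a simplified greedy sketch in the spirit of the Greedy planner named above. The slide does not spell out E-Planner's exact rules, so everything here is an assumption: the function name `greedy_plan`, the 5% `tolerance` around the average load, and the choice to always move a hot tuple to the currently least-loaded partition.

```python
def greedy_plan(hot_tuples, location, load, tolerance=0.05):
    """Return a list of (key, source, destination) moves.

    hot_tuples: list of (key, access_count) pairs
    location:   dict mapping key -> partition currently holding it
    load:       dict mapping partition -> current load (same unit as access_count)
    """
    load = dict(load)  # work on a copy so the caller's view is unchanged
    target = sum(load.values()) / len(load) * (1 + tolerance)
    moves = []
    # Consider the hottest tuples first.
    for key, count in sorted(hot_tuples, key=lambda kc: kc[1], reverse=True):
        src = location[key]
        if load[src] <= target:
            continue  # the source partition is already within bounds
        dst = min(load, key=load.get)  # least-loaded partition
        if dst == src or load[dst] + count >= load[src]:
            continue  # moving this tuple would not reduce the imbalance
        load[src] -= count
        load[dst] += count
        moves.append((key, src, dst))
    return moves
```

Each hot tuple is examined once, so the plan is produced in a single pass rather than by solving an integer program, at the cost of no optimality guarantee.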

Throughput and Latency

Squall—Data Migration

H-Store: a distributed, main-memory DBMS

H-Store: 1. Tables are horizontally partitioned, with the tuples of each table allocated without redundancy to the various nodes managed by H-Store. 2. A client application initiates a transaction by sending a request to any node. Each transaction is routed to the one or more partitions, and their corresponding servers, that contain the data accessed by the transaction. 3. E-Store assumes that all non-replicated tables of an OLTP database form a tree schema based on foreign key relationships. Although this rules out graph-structured schemas and m-n relationships, it applies to many real-world OLTP applications.

Conclusion: E-Store is designed to maintain system performance over a highly variable and diverse load. E-Monitor identifies load imbalances and tracks, for a short time window, the most-read or most-written “hot” tuples. E-Planner chooses which data to move and where to place it, and generates the reconfiguration plan. The E-Store authors claim that its elasticity techniques are generic and adaptable to other shared-nothing DBMSs.

Future thoughts: 1. The E-Store designers consider only CPU utilization. The reprovisioning algorithms currently do not take the amount of main memory into account when producing reconfiguration plans; this is left as future work. 2. The authors claim E-Store is adaptable to other DBMSs. This claim will be convincing only once E-Store has been tested on other DBMSs to see whether it can still achieve high throughput and low latency.

Thank you! Any questions?