Predicting Replicated Database Scalability
Sameh Elnikety, Microsoft Research
Steven Dropsho, Google Inc.
Emmanuel Cecchet, Univ. of Mass.
Willy Zwaenepoel, EPFL
Motivation
Environment:
– E-commerce website
– A single DBMS delivers 500 tps
Is 5000 tps achievable?
– Yes: use 10 replicas
– Yes: use 16 replicas
– No: faster machines needed
Question: how does the transaction workload scale on a replicated database?
Multi-Master vs. Single-Master
[Figure: multi-master with Replica 1, Replica 2, Replica 3; single-master with a Master and Slave 1, Slave 2]
Background: Multi-Master
[Figure: a load balancer in front of Replica 1, Replica 2, Replica 3, each running a standalone DBMS]
Read Tx
[Figure: the load balancer routes read-only transaction T to one replica]
Read tx does not change DB state
Update Tx
[Figure: transaction T executes on one replica; its writeset (ws) passes certification (Cert) and is propagated to the other replicas]
Update tx changes DB state
Additional Replica
[Figure: Replica 4 is added behind the load balancer; certified writesets (ws) are still propagated to all replicas]
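The figures above sketch the read and update paths; below is a loose Python rendering of the update path (the objects and method names are invented for illustration, and the certification details are in the paper):

    # Loose sketch of the update-transaction path shown in the figures; the objects
    # and method names are invented, and certification details are in the paper.
    def commit_update_tx(tx, local_replica, certifier, replicas):
        writeset = local_replica.execute(tx)      # run the update tx on one replica, collect its writeset (ws)
        if not certifier.certify(writeset):       # certification may abort a conflicting tx
            return "abort"
        for replica in replicas:
            if replica is not local_replica:
                replica.apply(writeset)           # every other replica applies the writeset
        return "commit"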
Coming Up …
Standalone DBMS
– Service demands
Multi-master system
– Service demands
– Queuing model
Experimental validation
Standalone DBMS
Required:
– readonly tx: R
– update tx: W
Transaction load:
– readonly tx: R
– update tx: W / (1 - A_1)
Abort probability is A_1:
– Submit W / (1 - A_1) update tx
– Committed tx: W
– Aborted tx: W · A_1 / (1 - A_1)
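A quick sanity check of this arithmetic (a minimal sketch; the W and A_1 values are made up for illustration):

    # Standalone load: to commit W update transactions when the abort
    # probability is A_1, W / (1 - A_1) update transactions must be submitted.
    W, A_1 = 200.0, 0.05                     # hypothetical workload and abort probability

    submitted_updates = W / (1.0 - A_1)      # update txs actually executed: ~210.5
    committed_updates = W                    # the work the application needs: 200
    aborted_updates = W * A_1 / (1.0 - A_1)  # wasted work: ~10.5

    assert abs(submitted_updates - (committed_updates + aborted_updates)) < 1e-9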
Service Demand
[Formula on slide: the standalone service demand, built from the transaction load and the profiled per-transaction costs]
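The formula itself is an image on the slide; a plausible reconstruction from the transaction load above and the per-transaction costs rc (readonly) and wc (update) profiled later in the talk, not copied verbatim from the deck:

$$ S_{standalone} \;=\; R \cdot rc \;+\; \frac{W}{1 - A_1} \cdot wc $$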
Multi-Master with N Replicas
Required (whole system of N replicas):
– Readonly tx: N · R
– Update tx: N · W
Transaction load per replica:
– Readonly tx: R
– Update tx: W / (1 - A_N)
– Writesets: W · (N - 1)
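As a concrete illustration of the per-replica load (a sketch; N, R, W and A_N are made-up values):

    # Per-replica load in an N-replica multi-master system (hypothetical numbers).
    N, R, W = 8, 1000.0, 200.0     # whole system requires N*R readonly and N*W update txs
    A_N = 0.08                     # assumed abort probability with N replicas

    readonly_per_replica = R                  # local readonly transactions
    updates_per_replica = W / (1.0 - A_N)     # local update transactions, including aborted ones
    writesets_per_replica = W * (N - 1)       # writesets received from the other N-1 replicas

    # The writeset term grows linearly with N: the "explosive cost" on the next slides.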
MM Service Demand
Explosive cost!
[Formula on slide: the per-replica service demand, including the writeset-application term]
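Again the formula is an image; a plausible per-replica reconstruction from the loads above, adding the writeset-application cost ws (an assumption consistent with the deck, not copied from it):

$$ S_{MM} \;=\; R \cdot rc \;+\; \frac{W}{1 - A_N} \cdot wc \;+\; W \cdot (N - 1) \cdot ws $$

The last term is what the deck calls the explosive cost: each replica applies the writesets of the other N - 1 replicas, so the term grows with N.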
Compare: Standalone vs MM
Explosive cost!
[Slide shows the standalone and multi-master service demand formulas side by side]
Readonly Workload
[Slide compares the standalone and multi-master service demands for a readonly workload (the update terms drop out)]
Update Workload
Explosive cost!
[Slide compares the standalone and multi-master service demands for an update-only workload]
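Putting the comparison in executable form (a sketch; the cost constants rc, wc, ws and the abort probabilities are made-up values, and the demand expressions are the hedged reconstructions above):

    # Hypothetical per-transaction costs from offline profiling (seconds of CPU).
    rc, wc, ws = 0.010, 0.030, 0.005
    A_1, A_N = 0.02, 0.05                      # assumed abort probabilities

    def standalone_demand(R, W):
        return R * rc + W / (1.0 - A_1) * wc

    def multimaster_demand_per_replica(R, W, N):
        return R * rc + W / (1.0 - A_N) * wc + W * (N - 1) * ws

    # Readonly workload (W = 0): the per-replica demand does not grow with N.
    # Update workload (R = 0): the W * (N - 1) * ws term grows with N, the explosive cost.
    for N in (1, 4, 8, 16):
        print(N, round(multimaster_demand_per_replica(R=0.0, W=100.0, N=N), 3))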
Closed-Loop Queuing Model
[Figure: the replicated database modeled as a closed queuing network driven by a fixed set of clients with think time]
Mean Value Analysis (MVA)
Standard algorithm; iterates over the number of clients
Inputs:
– Number of clients
– Service demand at service centers
– Delay time at delay centers
Outputs:
– Response time
– Throughput
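A minimal sketch of exact single-class MVA as described above (the function name and the example parameters are illustrative, not from the deck):

    def mva(num_clients, demands, think_time):
        # Exact Mean Value Analysis for a closed, single-class queuing network.
        # demands: service demand (seconds) at each queuing center;
        # think_time: delay at the client delay center (seconds).
        queue_len = [0.0] * len(demands)              # Q_k(0) = 0
        throughput = response_time = 0.0
        for n in range(1, num_clients + 1):           # iterate over the number of clients
            residence = [d * (1.0 + q) for d, q in zip(demands, queue_len)]
            response_time = sum(residence)
            throughput = n / (think_time + response_time)
            queue_len = [throughput * r for r in residence]
        return throughput, response_time

    # e.g. 100 clients, two queuing centers with hypothetical demands, 7 s think time:
    print(mva(100, [0.012, 0.003], 7.0))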
Using the Model
[Figure: workflow for applying the model]
Standalone Profiling (Offline)
Copy of the database
Log all txs, with read/write ratio (Pr : Pw)
Python script replays txs to measure:
– Readonly cost (rc)
– Update cost (wc)
Writesets:
– Instrument the db with triggers
– Replay txs to log writesets
– Replay writesets to measure ws
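A toy sketch of this profiling step (the log names and the 'run' callable are hypothetical; the real replay script is the authors' own):

    import time

    def measure_cost(statements, run):
        # Replay a list of logged statements and return the mean seconds per item.
        # 'run' is a hypothetical callable that executes one statement on the database copy.
        start = time.perf_counter()
        for stmt in statements:
            run(stmt)
        return (time.perf_counter() - start) / len(statements)

    # rc = measure_cost(readonly_log, run)    # cost of a readonly tx
    # wc = measure_cost(update_log, run)      # cost of an update tx
    # ws = measure_cost(writeset_log, run)    # cost of applying one writeset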
MM Service Demand (recap)
Explosive cost!
[Slide repeats the per-replica service demand formula from earlier]
Abort Probability
Predicting the abort probability is hard
Single-master:
– No prediction needed
– Measure offline on the master
Multi-master:
– Approximate using the expression given in the paper
– Sensitivity analysis in the paper
Using the Model
[Figure: the queuing model annotated with its inputs: number of clients and think time, a 1.5 · fsync() service demand, and a 1 ms delay]
Experimental Validation
Compare: measured performance vs. model predictions
Environment: Linux cluster running PostgreSQL
TPC-W workload:
– Browsing (5% update txs)
– Shopping (20% update txs)
– Ordering (50% update txs)
RUBiS workload:
– Browsing (0% update txs)
– Bidding (20% update txs)
Multi-Master TPC-W Performance
[Plots: throughput and response time, measured vs. model predictions]
TPC-W results: Browsing (5% update txs) scales 15.7x; Ordering (50% update txs) scales 6.7x; predictions within 15%
Multi-Master RUBiS Performance
[Plots: throughput and response time, measured vs. model predictions]
RUBiS results: Browsing (0% update txs) scales 16x; Bidding (20% update txs) scales 3.4x
Model Assumptions
Database system:
– Snapshot isolation
– No hotspots
– Low abort rates
Server system:
– Scalable server (no thrashing)
Queuing model & MVA:
– Exponential distribution for service demands
Check Out the Paper
Models:
– Single-master
– Multi-master
Experimental results:
– TPC-W
– RUBiS
Sensitivity analysis:
– Abort rates
– Certifier delay
Related Work
Urgaonkar, Pacifici, Shenoy, Spreitzer, Tantawi. "An Analytical Model for Multi-tier Internet Services and Its Applications." SIGMETRICS 2005.
Conclusions
Derived an analytical model that predicts workload scalability
Implemented replicated systems:
– Multi-master
– Single-master
Experimental validation:
– TPC-W
– RUBiS
– Throughput predictions match within 15%
Questions?
Thank you!
Predicting Replicated Database Scalability