Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon Boosting STM Replication via Speculation Roberto Palmieri,

Presentation transcript:

Boosting STM Replication via Speculation
Roberto Palmieri, Paolo Romano, Francesco Quaglia, Luis Rodrigues
Roberto Palmieri – Workshop on Distributed Transactional Memory (WDTM 2012) - 22/02/2012 Lisbon

Replication
Replication is a typical way to achieve fault tolerance
Several techniques:
– Primary Backup approach
– Active Replication approach
– …

Why Active Replication
Each replica keeps all data and executes the same transactions in the same order
PROS (+)
– Full failure masking
– No coordination for processing read-only transactions
– Prone to target performance issues
CONS (-)
– Agreement on common execution order
– Deterministic business logic

Literature solutions for actively replicated transactional systems

State Machine Approach (SM)
Implements the Active Replication paradigm
Based on Atomic Broadcast as GCS
Does not exploit any kind of optimism
Requires deterministic local CC
[Figure: timeline with to-broadcast(m), coordination phase, to-delivery(m), processing of m]

Optimistic Approach (OPT)
Based on Optimistic Atomic Broadcast as GCS
It processes in an optimistic manner:
– At most one conflicting transaction
– Any non-conflicting transactions
[Figure: timeline with to-broadcast(m), opt-delivery(m), optimistic processing of m, coordination phase, to-delivery(m), processing of m]
Outcome for m:
– if (opt-delivery order == to-delivery order) Commit(m)
– if (opt-delivery order <> to-delivery order) Abort & Restart(m)
– if (m is a non-conflicting transaction) Commit(m)
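The commit rule on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `resolve`, the use of lists of transaction ids for the two delivery orders, and the interpretation of "same order" as "the same conflicting transactions precede m in both deliveries" are all assumptions made for the example.

```python
def resolve(opt_order, to_order, conflicting):
    """Decide the fate of each optimistically processed transaction.

    opt_order / to_order: transaction ids in optimistic and final
    (total order) delivery order; both must contain the same ids.
    conflicting: ids that conflict with some other transaction.
    All names are hypothetical.
    """
    def conflicting_prefix(order, m):
        # Conflicting transactions delivered before m in the given order.
        return [t for t in order[:order.index(m)] if t in conflicting]

    outcome = {}
    for m in to_order:
        if m not in conflicting:
            # Non-conflicting transactions commit regardless of the order.
            outcome[m] = "commit"
        elif conflicting_prefix(opt_order, m) == conflicting_prefix(to_order, m):
            # The optimistic guess matched the final order.
            outcome[m] = "commit"
        else:
            outcome[m] = "abort+restart"
    return outcome


same = resolve(["a", "b", "c"], ["a", "b", "c"], {"a", "b"})
swapped = resolve(["a", "b", "c"], ["b", "a", "c"], {"a", "b"})
```

In the `swapped` case the two conflicting transactions are restarted, while the non-conflicting `c` still commits.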

Critiques of OPT (1)
A-priori knowledge of transaction read/write sets (needed to detect conflicts) might be infeasible (non-determinism)
Unpredictability of transactions' data access patterns imposes a safe (over-)estimation of the accessed data sets (reduced concurrency)

Critiques of OPT (2)
Limited overlapping in the case of fine-grain transactions

Critiques of OPT (3)
Ineffective if deployed on less predictable or unpredictable networks
– Networks without spontaneous ordering of optimistic deliveries

Unbalancing Ratio
Scenario        Coordination delay    Local transaction execution time
Traditional     ≈ 2 msec              ≈ 1/10 msec
Modern (STM)    ≈ 2 msec              ≈ 10/100 μsec
Long stall periods -> resource underutilization
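As a rough worked example, the ratio between coordination delay and local execution time can be computed from the figures on this slide. The sketch assumes that "1/10 msec" and "10/100 μsec" denote ranges (1 to 10 msec and 10 to 100 μsec); that reading, and all names in the code, are assumptions.

```python
# Coordination delay from the slide, in microseconds.
COORDINATION_US = 2000.0

# Assumed (fastest, slowest) local execution times per scenario, in microseconds.
EXECUTION_RANGES_US = {
    "traditional": (1000.0, 10000.0),
    "modern_stm": (10.0, 100.0),
}


def unbalancing_ratio(coordination_us, execution_us):
    """How many local executions fit into one coordination phase."""
    return coordination_us / execution_us


# For each scenario: (ratio for the slowest transaction, ratio for the fastest).
ratios = {
    name: (
        unbalancing_ratio(COORDINATION_US, slowest),
        unbalancing_ratio(COORDINATION_US, fastest),
    )
    for name, (fastest, slowest) in EXECUTION_RANGES_US.items()
}
```

Under these assumptions the ratio is between 0.2 and 2 in the traditional scenario and between 20 and 200 in the STM scenario, which is where the long stall periods come from.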

Target: Maximize the overlap
[Figure: timeline of three overlapping coordination phases (m1, m2, m3): to-broadcast(m1), to-broadcast(m2), to-broadcast(m3), then Opt-delivery(m1), Opt-delivery(m2), Opt-delivery(m3), …, then to-delivery(m1), to-delivery(m2), to-delivery(m3), …]
Overlaps to exploit:
– Coordination delay vs. local transaction processing
– Local transaction processing vs. local transaction processing
Multiple or same serialization orders

How: Speculative Processing
Basic ideas:
– Activate all transactions as soon as they are optimistically delivered
– Explore (in depth and/or in breadth) multiple serialization orders
[Figure: timeline for two transactions T_A and T_B. Events: opt-del(T_B), opt-del(T_A), to-del(T_A), to-del(T_B). Speculative executions: T_B and T'_A under the order T_B -> T_A; T_A and T'_B under the order T_A -> T_B. Outcome: abort(T'_A), abort(T_B), commit(T_A), commit(T'_B)]
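The breadth-wise exploration of serialization orders can be illustrated with a toy example. This is only a sketch under strong assumptions: transactions are modelled as Python functions over a dictionary, every permutation is executed eagerly (factorial cost), and the names `speculate`, `TA` and `TB` are hypothetical.

```python
from itertools import permutations


def speculate(initial, txns):
    """Run every serialization order of txns on a private copy of the state."""
    results = {}
    for order in permutations(sorted(txns)):
        state = dict(initial)          # private snapshot for this order
        for name in order:
            txns[name](state)          # speculative execution
        results[order] = state
    return results


# Two conflicting toy transactions on the same data item.
txns = {
    "TA": lambda s: s.update(x=s["x"] + 1),
    "TB": lambda s: s.update(x=s["x"] * 2),
}

speculative = speculate({"x": 1}, txns)

# When the final delivery order becomes known, the matching speculative
# result is committed and the others are discarded.
final_order = ("TA", "TB")
committed = speculative[final_order]
```

Because a result already exists for whichever order the total order broadcast finally delivers, no re-execution is needed after the final delivery in this toy setting.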

Speculative Processing: desirable properties
Correctness
– The history of committed transactions generated is 1-copy serializable
Non-Redundancy
– No two speculative instances of the same transaction observe the same snapshot
Completeness
– Eventually, every permutation of optimistically delivered transactions (not yet final delivered) that produces a distinct snapshot is explored
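One way to picture the non-redundancy property is to key every speculative instance by the transaction and the snapshot it observes, and to reuse the result when the same pair shows up again. The sketch below takes the whole state as "the snapshot", which is a simplifying assumption, and all names are hypothetical.

```python
from itertools import permutations


def explore(initial, txns):
    """Explore all orders, executing each (transaction, snapshot) pair once."""
    cache = {}  # (txn name, observed snapshot) -> resulting state
    for order in permutations(sorted(txns)):
        state = dict(initial)
        for name in order:
            key = (name, frozenset(state.items()))
            if key not in cache:
                result = dict(state)
                txns[name](result)     # the only place a transaction runs
                cache[key] = result
            state = cache[key]
    return cache


def tc(s):
    s["z"] = s["x"] + s["y"]


txns = {
    "TA": lambda s: s.update(x=s["x"] + 1),
    "TB": lambda s: s.update(y=s["y"] + 1),
    "TC": tc,
}

instances = explore({"x": 0, "y": 0, "z": 0}, txns)
naive_runs = 6 * 3  # 3! orders, 3 executions each
```

With three transactions, running every order independently would take 18 executions, while only the distinct (transaction, snapshot) pairs are executed here.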

Relaxing completeness
The relevance of the completeness property depends on the likelihood of mismatches between final and optimistic delivery orders
[Figure: completeness axis, labelled ✔ Completeness and ✗ Completeness, with the options: at most one speculative order; every distinct speculative order; some "opportunistic" speculative orders. Cited works: [SPAA2010] [ISPA2010] [SRDS 2011] [NCA 2011]]

Aggressively Optimistic Transaction Processing (AGGRO)
Tailored for networks with spontaneous order
Lock-based concurrency control
Optimized lock semantics to reduce the lock wait time of conflicting transactions
No a priori knowledge of transactions' read/write sets
No sibling transactions (reduced overhead)

AGGRO: key ideas
1. Uncommitted data item versions are aggressively made visible to other transactions (upon transaction completion), independently of whether the creating transactions will eventually be committed
2. Speculative transactions are guided to follow the optimistic delivery order
The ordering relations are based on T_i →OPT T_j

Data item version states
Each version of a shared object in the transactional memory can be:
– Committed
– Uncommitted
Committed
– The creator transaction has already committed
Uncommitted
– Work-In-Progress (WiP): the creator transaction has not reached the complete stage yet
– Complete: the creator transaction has reached the complete stage, but is not yet finalized as committed or aborted
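The version states above can be written down as a small state machine. This is a sketch only: the names `VersionState`, `ALLOWED`, `advance`, `is_uncommitted` and `is_visible` are hypothetical, and treating an abort as "the version is simply discarded" is an assumption not stated on the slide.

```python
from enum import Enum


class VersionState(Enum):
    WIP = "work-in-progress"   # creator has not completed yet
    COMPLETE = "complete"      # creator completed, outcome still unknown
    COMMITTED = "committed"    # creator has committed


# Forward transitions; an aborted version is assumed to be discarded.
ALLOWED = {
    VersionState.WIP: {VersionState.COMPLETE},
    VersionState.COMPLETE: {VersionState.COMMITTED},
    VersionState.COMMITTED: set(),
}


def advance(current, target):
    """Move a version to the next state, rejecting illegal jumps."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


def is_uncommitted(state):
    return state in (VersionState.WIP, VersionState.COMPLETE)


def is_visible(state):
    # Versions become visible to other transactions once the creator completes.
    return state is not VersionState.WIP
```

A version in the Complete state is therefore both uncommitted and visible, which is exactly the window AGGRO exploits for speculation.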

Transactions' ordering
Two transaction lists:
– Opt-Delivered transactions (ODL)
– TO (final)-Delivered transactions (FDL)
T_i →OPT T_j if
– T_i and T_j are both currently recorded within ODL, with T_i ordered before T_j
– T_i is currently recorded within FDL, while T_j is currently recorded within ODL
– T_i and T_j are both currently recorded within FDL, with T_i ordered before T_j
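The three cases of the ordering relation translate directly into a predicate over the two lists. The sketch assumes that a transaction is recorded in at most one of ODL and FDL at any time (it moves from ODL to FDL on final delivery); the function name `precedes` and the list representation are hypothetical.

```python
def precedes(ti, tj, odl, fdl):
    """Return True if ti ->OPT tj according to the three rules on the slide."""
    if ti in fdl and tj in fdl:
        # Both final delivered: the final order decides.
        return fdl.index(ti) < fdl.index(tj)
    if ti in fdl and tj in odl:
        # Final-delivered transactions precede optimistically delivered ones.
        return True
    if ti in odl and tj in odl:
        # Both only optimistically delivered: the optimistic order decides.
        return odl.index(ti) < odl.index(tj)
    return False


fdl = ["T1", "T2"]   # final delivered, in total order
odl = ["T3", "T4"]   # optimistically delivered, not yet final
```

For instance, `precedes("T2", "T3", odl, fdl)` holds because T2 is already final delivered, while `precedes("T4", "T3", odl, fdl)` does not.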

Algorithm sketch: Read
Read of data item X by T_i:
1. Get version V according to →OPT
2. Is V marked as WiP?
– Yes: wait until the mark is removed
– No: read V and update the read-from set of T_i
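A single-threaded sketch of the read step follows. Several things here are assumptions rather than details from the talk: versions are plain dictionaries, "the version according to →OPT" is taken to be the one written by the closest predecessor of T_i, an initial committed version by a predecessor always exists, and waiting is modelled by returning `None` so the caller can retry.

```python
def read(ti, x, versions, order, read_from):
    """Try to read data item x on behalf of transaction ti.

    versions[x]: list of {"writer", "value", "wip"} dictionaries.
    order: transaction ids sorted according to ->OPT.
    Returns the value read, or None if ti has to wait on a WiP version.
    """
    predecessors = order[:order.index(ti)]
    candidates = [v for v in versions[x] if v["writer"] in predecessors]
    # Pick the version created by the closest predecessor of ti.
    version = max(candidates, key=lambda v: predecessors.index(v["writer"]))
    if version["wip"]:
        return None                      # wait until the mark is removed
    read_from.setdefault(ti, {})[x] = version["writer"]
    return version["value"]


order = ["T0", "T1", "T2"]
versions = {"x": [
    {"writer": "T0", "value": 10, "wip": False},
    {"writer": "T1", "value": 11, "wip": True},
]}
read_from = {}

first_try = read("T2", "x", versions, order, read_from)   # T1 is still WiP
versions["x"][1]["wip"] = False                            # T1 completes
second_try = read("T2", "x", versions, order, read_from)
```

The first attempt has to wait because T1's version is still marked WiP; once the mark is removed, T2 reads T1's uncommitted value and records T1 in its read-from set.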

Algorithm sketch: Write
Write on data item X by T_i:
– The data item X is marked as WiP
– Abort (and restart) all transactions that follow T_i (according to →OPT) and read X from a different transaction
– Early abort mechanism
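The early-abort rule of the write step can be sketched as follows. The data layout (dictionaries for versions and read-from sets, a list for the →OPT order) and the function name `write` are assumptions for illustration; restarting the aborted transactions is left to the caller.

```python
def write(ti, x, value, versions, order, read_from):
    """Install a WiP version of x for ti and return the transactions to abort."""
    versions.setdefault(x, []).append({"writer": ti, "value": value, "wip": True})
    followers = order[order.index(ti) + 1:]
    # Followers that already read x from someone other than ti saw a stale value.
    to_abort = [
        tj for tj in followers
        if x in read_from.get(tj, {}) and read_from[tj][x] != ti
    ]
    for tj in to_abort:
        read_from.pop(tj)                # early abort: forget its reads
    return to_abort


order = ["T0", "T1", "T2", "T3"]
versions = {"x": [{"writer": "T0", "value": 10, "wip": False}]}
read_from = {"T2": {"x": "T0"}, "T3": {"y": "T0"}}

aborted = write("T1", "x", 11, versions, order, read_from)
```

Here T2 had read `x` from T0 although T1 precedes it in the order, so it is aborted as soon as T1 writes `x`; T3 never read `x` and is unaffected.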

Algorithm sketch: Complete
T_i completes its execution:
– Remove all the WiP marks
– Make written data items available to other speculative transactions
– Allows the next conflicting transactions to be activated
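The completion step can be sketched in the same toy model. The `waiting` map from data items to blocked transactions, and the function name `complete`, are hypothetical; they stand in for whatever wake-up mechanism a real implementation would use.

```python
def complete(ti, versions, waiting):
    """Remove ti's WiP marks and return the transactions that can now proceed."""
    to_activate = []
    for x, item_versions in versions.items():
        for version in item_versions:
            if version["writer"] == ti and version["wip"]:
                version["wip"] = False                   # version becomes visible
                to_activate.extend(waiting.pop(x, []))   # wake up blocked readers
    return to_activate


versions = {
    "x": [{"writer": "T0", "value": 10, "wip": False},
          {"writer": "T1", "value": 11, "wip": True}],
    "y": [{"writer": "T0", "value": 5, "wip": False}],
}
waiting = {"x": ["T2"]}

activated = complete("T1", versions, waiting)
```

After the call, T1's version of `x` is visible to other speculative transactions and T2, which was blocked on it, can be activated.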

Trace-based Simulator
A discrete event simulator based on the JavaSim framework
Realistic timing and data access patterns (coming from executions of JVSTM)
The (Optimistic) Atomic Broadcast service can batch messages to improve performance
Atomic Broadcast average delays:
– Optimistic delivery: 500 μsec
– Final (total ordered) delivery: 2 msec
LAN environment with no mismatch between optimistic and final deliveries (spontaneous order)

Test bed
Benchmarks:
Name        Type               Mean Transaction Execution Time
RB-Tree     Micro Benchmark    77 μsec
SkipList    Micro Benchmark    281 μsec
List        Micro Benchmark    324 μsec
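A back-of-envelope calculation, not taken from the talk, shows how much room for speculation these numbers leave. It combines the mean execution times above with the average Atomic Broadcast delays given for the simulator (500 μsec optimistic, 2 msec final) and assumes transactions run back to back on a single core; all names are hypothetical.

```python
OPT_DELIVERY_US = 500      # average optimistic delivery delay
FINAL_DELIVERY_US = 2000   # average final (total order) delivery delay

MEAN_EXEC_US = {"RB-Tree": 77, "SkipList": 281, "List": 324}

# Time between optimistic and final delivery that speculation can fill.
window_us = FINAL_DELIVERY_US - OPT_DELIVERY_US

# Whole transactions that fit into that window, per benchmark.
runs_in_window = {name: window_us // t for name, t in MEAN_EXEC_US.items()}
```

Under these assumptions the 1500 μsec window fits roughly 19 RB-Tree, 5 SkipList, or 4 List transactions, so a replica that only processes one conflicting transaction at a time leaves most of the window idle.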

Simulation results
Baseline (OPT) vs AGGRO
[Figure: plot with legend entries "no speculation" and "speculation"]

CPU Utilization

Simulation results
Baseline (OPT) vs AGGRO
[Figure: plot with legend entries "no speculation" and "speculation"]

CPU Utilization

Thank you for your attention