Database replication policies for dynamic content applications Gokul Soundararajan, Cristiana Amza, Ashvin Goel University of Toronto Presented by Ahmed.


Database Replication Policies for Dynamic Content Applications
Gokul Soundararajan, Cristiana Amza, Ashvin Goel, University of Toronto
Presented by Ahmed Ataullah, Wednesday, November 22nd, 2006

2 The Plan
- Motivation and Introduction
- Background
- Suggested Technique
- Optimization/Feedback 'loop'
- Summary of Results
- Discussion

3 The Problem
- 3-Tier Framework
  – Recall: the benefits of partial/full database replication in dynamic content (web) applications
  – We need to address some issues in the framework as presented in 'Ganymed'
- Problem:
  – How many replicas do we allocate?
  – How do we handle overload situations while maximizing utilization and minimizing the number of replicas?
- Solution:
  – Dynamically (de)allocate replicas as needed

4 Assumptions
- Query Load:
  – Write queries are short, sweet, and simple
  – Read queries are complex, costly, and more frequent
- Infrastructure:
  – Replica addition is time-consuming and 'expensive'
  – Machines are flexible in nature
- Replica Allocation vs. Replica Mapping
  – Assume an intelligent middleware is present

5 Replica Allocation
- Full overlapping allocation
  – All databases replicated across all machines in the cluster
- Disjoint allocation (no overlapping)
[Diagram: a two-database scenario (databases A and B); global replica pool not shown. RS = read sets, WS = write sets; the ratio of WS to RS may be misleading in the diagram.]

6 Partial Overlapping Allocation
- Only write sets are shared; read sets do not overlap
[Diagram: a two-database scenario (databases A and B). RS = read sets, WS = write sets.]
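The three allocation schemes of slides 5 and 6 — full overlapping, disjoint, and partial overlapping — can be sketched as simple set assignments. This is an illustrative sketch only; the function names and the round-robin partitioning are assumptions, not the paper's actual mechanism.

```python
# Illustrative sketch of the three replica-allocation schemes.
# Names and the round-robin split are hypothetical, for exposition only.

def full_overlap(apps, machines):
    """Every application reads and writes on every machine."""
    return {app: {"reads": set(machines), "writes": set(machines)}
            for app in apps}

def disjoint(apps, machines):
    """Each application gets a private partition; nothing is shared."""
    n = len(apps)
    parts = [set(machines[i::n]) for i in range(n)]  # round-robin split
    return {app: {"reads": parts[i], "writes": parts[i]}
            for i, app in enumerate(apps)}

def partial_overlap(apps, machines):
    """Writes go everywhere (keeps all replicas warm, writes are cheap);
    reads stay on disjoint partitions (avoids memory interference)."""
    alloc = disjoint(apps, machines)
    for app in apps:
        alloc[app]["writes"] = set(machines)
    return alloc
```

Under partial overlap, the read sets of two applications never intersect, while every write set spans the whole cluster.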

7 Dynamic Replication (EuroSys 2006 slides)
- Assume a cluster hosts 2 applications
  – App1 (red) using 2 machines
  – App2 (blue) using 2 machines
- Assume App1 has a load spike

8 Dynamic Replication
- Choose # of replicas to allocate to App1
  – Say, we adapt by allocating one more replica
- Then, two options:
  – App2 still uses two replicas (overlapping replica sets)
  – App2 loses one replica (disjoint replica sets)


11 Challenges
- Adding a replica can take time
  – Bring the replica up to date
  – Warm up memory
- Can avoid adaptation with fully overlapped replica sets


13 Challenges
- However, overlapping applications compete for memory, causing interference
- Can avoid interference with disjoint replica sets
- Tradeoff between adaptation delay and interference

14 Our Solution – Partial Overlap
- Reads of applications are sent to disjoint replica sets
  – Avoids interference
- Read-Set: the set of replicas where reads are sent

15 Our Solution – Partial Overlap
- Writes of applications are also sent to overlapping replica sets
  – Reduces replica addition time
- Write-Set: the set of replicas where writes are sent
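The read-set/write-set routing described on slides 14 and 15 can be sketched as a tiny scheduler. The `Scheduler` class and its query-prefix check are illustrative assumptions, not the paper's actual middleware: a read goes to one replica in the application's read-set, a write goes to every replica in its (possibly larger) write-set.

```python
import random

class Scheduler:
    """Minimal routing sketch (hypothetical API, for illustration only)."""

    def __init__(self, read_set, write_set):
        # Under partial overlap, write_set may be a superset of read_set.
        self.read_set = read_set
        self.write_set = write_set

    def route(self, query):
        """Return the list of replicas this query is sent to."""
        if query.strip().upper().startswith("SELECT"):
            # Read-only query: load-balance across the read-set.
            return [random.choice(sorted(self.read_set))]
        # Write query: broadcast to the entire write-set.
        return sorted(self.write_set)

s = Scheduler(read_set={"r1", "r2"}, write_set={"r1", "r2", "r3", "r4"})
```

A real scheduler would also enforce ordering across conflicting writes to preserve one-copy serializability (slide 17); this sketch shows only the fan-out.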

16 Optimization
For a given application:
  – Replicas in the Write-Set: fully up to date
  – Other replicas: periodic batch updates
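The eager/lazy split above can be sketched as follows; `Replica`, `propagate`, and `batch_refresh` are hypothetical names chosen for illustration. Write-set replicas apply each write immediately, while replicas outside the write-set merely queue writes and catch up in periodic batches.

```python
class Replica:
    """Toy replica state: applied writes plus a queue of pending ones."""
    def __init__(self):
        self.applied = []
        self.pending = []

def propagate(write, write_set, others):
    """Write-set replicas apply eagerly; other replicas only queue."""
    for r in write_set:
        r.applied.append(write)
    for r in others:
        r.pending.append(write)

def batch_refresh(replica):
    """Periodic batch update: flush the pending queue in order."""
    replica.applied.extend(replica.pending)
    replica.pending.clear()
```

The payoff is that a lagging replica promoted into an application's allocation only needs one batch refresh (plus cache warm-up) rather than a full rebuild.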

17 Secondary Implementation Details
- Scheduler(s):
  – Separate read-only from read/write queries
  – One-copy serializability is guaranteed
- Optimization:
  – The scheduler also caches some information (queries, write sets, etc.) to reduce warm-up/ready time
  – Conflict awareness at the scheduler layer

18 Replica Allocation Logic
Measure the average query latency by computing:
  WL = α · L + (1 − α) · WL
where L is the current query latency and α is a smoothing constant.
Note: both responsiveness and stability depend on α (a stability vs. delay tradeoff).
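The update rule on this slide is an exponentially weighted moving average of query latency. A minimal sketch in Python, where the `ALPHA` value, the `HIGH`/`LOW` watermarks, and the `decide` policy are assumptions for illustration, not values from the paper:

```python
ALPHA = 0.3            # smoothing constant (assumed value)
HIGH, LOW = 0.8, 0.3   # latency watermarks in seconds (assumed values)

def update_wl(wl, latency, alpha=ALPHA):
    """Exponentially weighted latency: WL = alpha * L + (1 - alpha) * WL."""
    return alpha * latency + (1 - alpha) * wl

def decide(wl):
    """Grow on sustained overload, shrink on sustained idleness."""
    if wl > HIGH:
        return "add replica"
    if wl < LOW:
        return "remove replica"
    return "steady"
```

A small α smooths out transient spikes (stability) but reacts slowly to a real load change (delay); a large α does the opposite, which is exactly the tradeoff the slide flags.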

19 Results
It works…

20 One last interesting issue
  WL = α · L + (1 − α) · WL
  – L is the current query latency and α is a constant

21 Discussion
- Questionable assumptions:
  – Are write requests really (always) simple?
  – Scalability beyond 60 replicas (is it an issue?)
- How closely does this represent a real data center?
  – Load contention issues
  – Overlap assignment
  – Determination of alpha(s)
- Actual cost savings vs. implied cost savings:
  – Depends on SLAs, etc.
  – Depends on hardware leasing agreements
- The issue of readiness in secondary replicas:
  – What level of 'warmth' is good enough for each application? Can some machines be turned off?
  – What about contention from many databases trying to stay warm?
- Management concerns:
  – Can we truly provide strong guarantees for keeping our end of the promised SLA?