1 Moshe Shadmon ScaleDB Scaling MySQL in the Cloud.

Presentation transcript:

1 Moshe Shadmon ScaleDB Scaling MySQL in the Cloud

2 Shared Disk vs. Shared Nothing
[Diagram labels: Shared Nothing, Shared Disk, Masters, Slaves]

3 Shared Disk Advantages
– Start small, grow incrementally
– Scalable AND highly available
– Add capacity on demand with zero downtime
– Simplicity
– No need to partition data
– No need for master-slave

4 The Virtualized Cloud Database
[Diagram labels: Server 1, Server 2, VM, OSS DBMS, ScaleDB, MySQL Server, Storage Engine, Local Disk, Shared Storage, Shared Nothing, Shared Disk]

5 ScaleDB As the Storage Engine
[Diagram labels: MySQL Server, Database Management Level (MySQL), Storage Engine Level (ScaleDB Storage Engine)]

6 ScaleDB’s Internal Architecture
[Diagram labels: ScaleDB Node, ScaleDB API, Transaction Manager, Index Manager, Data Manager, Lock Manager, Local Lock Manager, Log Manager, Recovery Manager, Storage Manager, Buffer Manager, Local Sync Coordinator, Threads Manager, ScaleDB Cluster Manager, Global Recovery Manager, Global Sync Manager, Global Lock Manager, ScaleDB Storage System, Cache & Storage Devices]

7 Deploying ScaleDB
[Diagram: Application Layer (Application); Database Layer (physical or VM nodes) with Node 1 … Node N, each running DBMS + ScaleDB; ScaleDB Cluster Manager; Storage Layer with Shared Storage]

8 The Storage Engine
Pluggable Storage Engine
– Transactional storage engine
– Supports MySQL Storage Engine API
– Reads/Writes done via network to a shared storage
– Maintains a local cache
– Local Lock Manager – manages locking at the node level
– Connector to Cluster Manager – synchronizes operations at the cluster level

9 The Cluster Manager
Distributed Lock Manager – manages cluster-level locks
– Locks can be held on any type of resource: DBMS, Table, Partition, File, Block, Row, etc.
– Supports multiple lock modes: Read, Read/Write, Exclusive, etc.
– Synchronizes state using messaging
Local Lock Manager – manages locks at the node level
– Maintains locks at the node level
– Synchronizes state using shared memory
Identifies node failures and manages recovery

10 The Cluster Manager
Distributed Lock Manager
– Synchronizes conflicting processes between nodes in the cluster
  Example: 2 nodes need to update the same resource at the same time.
– The challenge: requests are made via the network, which can be expensive
  Internal operations may take nanoseconds; network operations take milliseconds
– The solution: requests are sent only when conflicts occur
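A minimal sketch of the "requests are sent only when conflicts occur" idea, assuming each node caches the grants it already holds and contacts the cluster manager only when it lacks one. The `ClusterManager` and `Node` classes, the single exclusive lock mode, and the `round_trips` counter are hypothetical illustrations, not ScaleDB's actual interfaces.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterManager:
    """Hypothetical stand-in for the cluster-level lock manager."""
    owners: dict = field(default_factory=dict)  # resource -> node holding the grant
    round_trips: int = 0                        # counts simulated network requests

    def request(self, node, resource):
        self.round_trips += 1                   # each call models one network round trip
        previous = self.owners.get(resource)
        if previous is not None and previous is not node:
            previous.revoke(resource)           # conflict: the other node gives up its grant
        self.owners[resource] = node


@dataclass(eq=False)
class Node:
    """Hypothetical database node with a local lock manager."""
    name: str
    manager: ClusterManager
    grants: set = field(default_factory=set)    # grants cached locally by this node

    def lock(self, resource):
        if resource in self.grants:             # grant already held: resolved locally
            return "local"
        self.manager.request(self, resource)    # grant not held: ask the cluster manager
        self.grants.add(resource)
        return "network"

    def revoke(self, resource):
        self.grants.discard(resource)
```

In this sketch, repeated locks on the same resource by one node cost a single round trip; a second node asking for that resource triggers another round trip and revokes the first node's cached grant.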

11 The Storage
Independent storage nodes
– Accessible via network
– Each node has a Cache Layer and a Persistent Layer
– Database nodes can force the write to disk based on transactional requirements
– Data can be distributed over multiple storage nodes
– Each Storage Node can be mirrored
– Each Storage Node may have a Hot Backup Node

12 The Storage Node
[Diagram labels: Interface to Storage, Cache Based On LRU, Disks]
Storage Node
– Manages the data in cache and flushes to disk when required.
– Supports the storage engine calls for Read, Write, etc.
– Supports pushed calls from the storage engine such as Count Rows, Search, etc.
– Each node is a Linux machine. No need for Network File System (NFS).
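The cache behavior described for the storage node (LRU replacement, flush to disk when required) can be sketched as a small write-back cache. This is an illustration under assumptions: the `StorageNodeCache` class, its method names, the `force` flag, and the dict standing in for the disk are all hypothetical.

```python
from collections import OrderedDict


class StorageNodeCache:
    """Hypothetical LRU write-back cache in front of a dict that plays the disk."""

    def __init__(self, capacity, disk=None):
        self.capacity = capacity
        self.disk = {} if disk is None else disk
        self.blocks = OrderedDict()  # block_id -> data, least recently used first
        self.dirty = set()           # blocks changed in cache but not yet on disk

    def read(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)  # mark as most recently used
            return self.blocks[block_id]
        data = self.disk[block_id]             # cache miss: read from disk
        self._install(block_id, data)
        return data

    def write(self, block_id, data, force=False):
        self._install(block_id, data)
        self.dirty.add(block_id)
        if force:                              # caller requires durability now
            self.flush(block_id)

    def flush(self, block_id):
        if block_id in self.dirty:
            self.disk[block_id] = self.blocks[block_id]
            self.dirty.discard(block_id)

    def _install(self, block_id, data):
        self.blocks[block_id] = data
        self.blocks.move_to_end(block_id)
        while len(self.blocks) > self.capacity:
            victim, victim_data = self.blocks.popitem(last=False)  # evict LRU block
            if victim in self.dirty:           # write back before dropping it
                self.disk[victim] = victim_data
                self.dirty.discard(victim)
```

The `force=True` path stands in for a database node forcing a write to disk for transactional reasons; otherwise dirty blocks reach the disk only on eviction or an explicit flush.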

13 Scaling the Storage Tier
[Diagram: ScaleDB Cluster Manager; Database Layer (physical or VM nodes) with Node 1 … Node N, each running DBMS + ScaleDB; Storage Layer with multiple Shared Storage nodes, each with a Cache and reached over TCP/UDP; labels Local Cache and Global Cache]

14 Global Cache
Guarantees cache coherency
Manages caching of shared data
Minimizes access time to data which is not in local cache and would otherwise be read from disk
Implements fast direct memory access over high-speed interconnects for all data blocks and types
Uses an efficient and scalable messaging protocol

15 HA of the Storage Tier
[Diagram: ScaleDB Cluster Manager; Database Layer (physical or VM nodes) with Node 1 … Node N, each running DBMS + ScaleDB; Storage Layer with Shared Storage, Mirrored Storage, and ScaleDB Hot Backup]

16 Scaling the Storage Tier
[Diagram: ScaleDB Cluster Manager; Database Layer (physical or VM nodes) with Node 1 … Node N, each running DBMS + ScaleDB; storage split into Partition 1, Partition 2, … Partition Q, each with Partitioned Storage, Partitioned Mirrored, and Partitioned Hot Backup]

17 Scaling the Storage Tier
[Diagram labels: ScaleDB Cluster Manager; Database Layer (physical or VM nodes); Node N with MySQL, ScaleDB, and Local Cache; Main and Mirror storage nodes, each with Cache and Storage]
Read
– From Local Cache
– From Main or Mirror
  Get from Cache
  Get from Storage
Write
– To local cache
– At end of transaction, multicast to main and mirror
  Optional acknowledgement:
  after receive
  after write
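The read and write paths on this slide can be sketched roughly as follows. The class names, the `ack_after` parameter, and the use of plain dicts for cache and storage are assumptions made for illustration; a real multicast and a real acknowledgement protocol are replaced here by simple method calls.

```python
class StorageNode:
    """Hypothetical main or mirror storage node with a cache in front of storage."""

    def __init__(self):
        self.cache = {}
        self.storage = {}

    def get(self, key):
        if key in self.cache:           # get from cache
            return self.cache[key]
        value = self.storage[key]       # get from storage
        self.cache[key] = value
        return value

    def receive(self, updates, ack_after="receive"):
        self.cache.update(updates)      # data has been received
        if ack_after == "write":        # acknowledge only once it is persisted
            self.storage.update(updates)
        return True                     # the acknowledgement


class DatabaseNode:
    """Hypothetical database node holding a local cache."""

    def __init__(self, main, mirror):
        self.main, self.mirror = main, mirror
        self.local_cache = {}
        self.pending = {}               # writes of the open transaction

    def read(self, key, prefer_mirror=False):
        if key in self.local_cache:     # from local cache
            return self.local_cache[key]
        source = self.mirror if prefer_mirror else self.main
        value = source.get(key)         # from main or mirror
        self.local_cache[key] = value
        return value

    def write(self, key, value):
        self.local_cache[key] = value   # writes go to the local cache first
        self.pending[key] = value

    def commit(self, ack_after="receive"):
        # At end of transaction, send the changes to both main and mirror.
        acks = [node.receive(dict(self.pending), ack_after)
                for node in (self.main, self.mirror)]
        self.pending.clear()
        return all(acks)
```

Acknowledging "after receive" returns sooner but leaves the data only in the storage nodes' caches at commit time; "after write" waits until both copies are persisted.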

18 Traditional Query Processing
[Diagram: the question “What were yesterday's sales?” reaches the DBMS Server, which issues “Get the Sales table” to the Storage Array; the entire Sales table is retrieved and the DBMS Server processes the table data]

19 ScaleDB Query Processing
[Diagram: the question “What were yesterday's sales?” reaches the DBMS Server, which issues “Get October 15 sales” to the Storage Nodes]

20 Scaling the Storage Tier
Advantages
– Parallel processing: I/O calls are executed simultaneously on multiple Storage Nodes.
– Logic pushed to storage layer: “SELECT customer_name FROM calls WHERE amount > 200”
  Traditional approach – return all rows to the database
  ScaleDB storage – return selected rows to the database
– Leverage cache on multiple storage nodes
– Storage layer can be expanded without downtime
– Data is mirrored
– Support for Hot Backup
– Low cost
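To make the pushed-down filter concrete, here is a small sketch in which each storage node applies the WHERE condition to its own rows and returns only the selected column, with all nodes scanned in parallel. The function names and the representation of a storage node as a list of row dicts are assumptions for illustration only.

```python
from concurrent.futures import ThreadPoolExecutor


def scan_node(rows, predicate, columns):
    """Runs on a (hypothetical) storage node: filter and project locally."""
    return [tuple(row[c] for c in columns) for row in rows if predicate(row)]


def pushed_down_query(storage_nodes, predicate, columns):
    """Send the filter to every storage node in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(storage_nodes)) as pool:
        partials = pool.map(
            lambda rows: scan_node(rows, predicate, columns), storage_nodes
        )
        return [row for part in partials for row in part]


# Rows of a "calls" table spread over two storage nodes (made-up sample data).
nodes = [
    [{"customer_name": "Ann", "amount": 250}, {"customer_name": "Bob", "amount": 100}],
    [{"customer_name": "Cy", "amount": 300}],
]

# SELECT customer_name FROM calls WHERE amount > 200
selected = pushed_down_query(nodes, lambda r: r["amount"] > 200, ["customer_name"])
```

With the sample data, only two of the three rows travel back to the database node, whereas the traditional approach would ship all three and filter afterwards.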

21 High Availability
Failure of a node
– Detected by the Cluster Manager
  A surviving node is requested to undo uncommitted transactions
Failure of the Cluster Manager
– Detected by the Standby Cluster Manager
  Requests all nodes to undo uncommitted transactions
Failure of a Storage Node
– Continue with a mirrored storage node, or
– Use the Storage Node Log to recover
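"Undo uncommitted transactions" can be illustrated with a generic undo pass over a log. The log record layout below (`txn`, `op`, `key`, `before`) is an assumption made for this sketch; the slides do not describe ScaleDB's log format.

```python
def undo_uncommitted(data, log):
    """Roll back every update whose transaction has no commit record.

    `data` maps keys to current values. `log` is a list of records in
    execution order; update records carry the value before the change
    (None means the key did not exist).
    """
    committed = {rec["txn"] for rec in log if rec["op"] == "commit"}
    for rec in reversed(log):                  # undo newest changes first
        if rec["op"] == "update" and rec["txn"] not in committed:
            if rec["before"] is None:
                data.pop(rec["key"], None)     # the key was created by the txn
            else:
                data[rec["key"]] = rec["before"]
    return data
```

For example, if transaction T1 committed a change of `x` from 1 to 2 and a failed node's transaction T2 then changed `x` to 5 and created `y`, the pass restores `x` to 2 and removes `y`.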

22 Performance / Tuning
Contention occurs when 2 or more nodes want the same resource at the same time
Types of contention:
– Read/Read contention – is never a problem because of the shared disk system
– Read/Write contention – the reader is requested to release the block and the grant is given to the writer
– Write/Read or Write/Write – the writer sends the block to the global cache layer, a buffer invalidate message is sent to the other nodes, and the requestor receives the grant
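The write case above (block goes to the global cache, other nodes get a buffer invalidate message) can be sketched as a simple invalidation protocol. `GlobalCache` and `DbNode` are hypothetical names, lock grants are left out, and messages are modeled as direct method calls.

```python
class GlobalCache:
    """Hypothetical global cache layer that tracks which nodes hold a block."""

    def __init__(self):
        self.blocks = {}    # block_id -> latest data
        self.readers = {}   # block_id -> set of nodes with a local copy

    def read(self, node, block_id):
        self.readers.setdefault(block_id, set()).add(node)
        return self.blocks.get(block_id)

    def write(self, writer, block_id, data):
        self.blocks[block_id] = data              # writer ships the block here
        for node in self.readers.get(block_id, set()) - {writer}:
            node.invalidate(block_id)             # buffer invalidate message
        self.readers[block_id] = {writer}


class DbNode:
    """Hypothetical database node with a local block cache."""

    def __init__(self, name, global_cache):
        self.name = name
        self.global_cache = global_cache
        self.local = {}

    def read(self, block_id):
        if block_id not in self.local:            # miss: fetch from the global cache
            self.local[block_id] = self.global_cache.read(self, block_id)
        return self.local[block_id]

    def write(self, block_id, data):
        self.local[block_id] = data
        self.global_cache.write(self, block_id, data)

    def invalidate(self, block_id):
        self.local.pop(block_id, None)            # drop the stale local copy
```

After an invalidation, the next read on the other node misses locally and picks up the new version from the global cache instead of returning stale data.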

23 Performance / Tuning
Fast network between the nodes
– 2 logical networks:
  Between the database nodes and the Cluster Manager
  Between the database nodes and the storage
– Optimize socket receive buffers (256 KB – 1 MB)
Partition requests to maintain locality of data
– Send requests that update/query the same data to the same node
  By Database
  By Table
  By Table with PK
– Logic can change dynamically to adapt to changes
  Changes in data distribution
  Changes in user behavior
  Additional DBMS nodes
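One way to send requests for the same data to the same node is to hash a routing key built at the chosen granularity (database, table, or table with primary key). The `route` function, its `level` parameter, and the use of CRC32 as the hash are assumptions for this sketch, not a description of how ScaleDB or any particular load balancer does it.

```python
import zlib


def route(nodes, database, table=None, pk=None, level="table"):
    """Pick the DBMS node for a request so that the same data hits the same node."""
    if level == "database":
        key = database
    elif level == "table":
        key = f"{database}.{table}"
    elif level == "pk":
        key = f"{database}.{table}#{pk}"
    else:
        raise ValueError(f"unknown routing level: {level}")
    # CRC32 is stable across processes, unlike Python's built-in hash() for str.
    return nodes[zlib.crc32(key.encode("utf-8")) % len(nodes)]


dbms_nodes = ["node1", "node2", "node3"]
target = route(dbms_nodes, "billing", "calls", pk=42, level="pk")
```

Plain modulo hashing remaps many keys when a DBMS node is added, so a scheme that must adapt dynamically would likely need something more gradual, such as consistent hashing or an explicit routing table.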

24 ScaleDB: Elastic/Enterprise Database
Function | SimpleDB | RDS | ScaleDB
Transactions | No | Yes | Yes
Joins | No | Yes | Yes
Data Consistency | No (Eventual) | Yes | Yes
SQL Support | No | Yes | Yes
ACID Compliant | No | Yes | Yes
Supports MySQL applications without modification | No | Yes | Yes
Dynamic Elasticity (w/o interruption) | Yes | No | Yes
High-Availability | Yes | No | Yes
Eliminates Partitioning | Yes | No | Yes
Eliminates possible 5-minute data loss upon failure | Yes | No | Yes

25 Value Proposition
Runs on low-cost cloud infrastructures (e.g. Amazon)
High availability, no single point of failure
Dramatically easier set-up & maintenance
– No partitioning/repartitioning
– No slave and replication headaches
– Simplified tuning
Scales up/down without interrupting your application
Lower TCO