The ScaleDB Storage Engine Enabling high performance and scalability, using a Multi-Table Index, and a Shared-Disk Clustering Architecture Moshe Shadmon.

Presentation transcript:

The ScaleDB Storage Engine
Enabling high performance and scalability, using a Multi-Table Index and a Shared-Disk Clustering Architecture
Moshe Shadmon

Agenda
Overview
ScaleDB's Clustering Architecture
 o Shared-Disk vs. Shared-Nothing
 o MySQL and a Shared-Disk Storage Engine
 o ScaleDB Installation
 o Demo
ScaleDB's Indexing Technology
 o Multi-Table Index
 o Enabling Multi-Table Index in MySQL
 o Demo
Summary
ScaleDB Status & Product Availability

Overview
Plug-in Storage Engine for MySQL
Main Features:
 o Shared-Disk Architecture
 o Innovative Multi-Table Indexing
 o Transactional
 o Row-Level Locking
 o ACID Compliant
    o Atomicity: All tasks of a transaction are performed, or none of them are.
    o Consistency: The database is in a consistent state before and after the transaction.
    o Isolation: Data is not available in an intermediate state during a transaction.
    o Durability: When a transaction completes, the transaction's data will persist.
 o Disk-Based Storage Engine
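To make the atomicity and consistency bullets concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in, not ScaleDB or MySQL; the account table, the transfer helper and the amounts are hypothetical.

```python
import sqlite3

# Stand-in database: SQLite in memory, NOT ScaleDB. Table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"  # consistency rule
)
conn.execute("INSERT INTO account VALUES ('a', 100), ('b', 0)")
conn.commit()


def balances():
    """Return the current balances as a dict."""
    return dict(conn.execute("SELECT name, balance FROM account").fetchall())


def transfer(amount):
    """Move `amount` from account a to account b as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = 'b'", (amount,)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = 'a'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False


failed = transfer(150)       # second UPDATE violates the CHECK constraint
after_failed = balances()    # the credit to b was rolled back as well
ok = transfer(40)
after_ok = balances()
print(failed, after_failed, ok, after_ok)
```

The failed transfer leaves both rows untouched even though its first statement had already run, which is the "all or none" behaviour the Atomicity bullet describes.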

Shared-Disk vs. Shared-Nothing
 o Manageability
 o Adaptability
 o Availability/Fault-Tolerance
 o Scalability
 o Performance
 o Total Cost of Ownership (TCO)

Shared-Nothing: Vertical Partitioning
[Diagram: Database Instance 1 holding Table A, Table B and Table C, shown next to Database Instance 1, Database Instance 2 and Database Instance 3 with Table A, Table B and Table C divided among them.]

Shared-Nothing: Partitioning Your Data… How?
 o Predict usage patterns, application evolution, data growth patterns… all are moving targets
 o Avoid data skew: bottlenecks caused by frequently accessed data on just a few nodes
 o Avoid data shipping between nodes
 o Avoid delays from distributed 2-phase commit
 o Searches outside the partition column require participation by all nodes
 o Scaling becomes an exercise in fire fighting

Shared-Nothing: Horizontal Partitioning

Logical View

name    age  salary
Bob     20   10K
Shideh  18   35K
Ted     50   60K
Kevin   62   120K
Angela  55   140K
Mike    45   90K

Physical View (Partitioned by Salary; Horizontal Partitioning: Salary % 3)

name    age  salary
Ted     50   60K
Kevin   62   120K
Mike    46   90K

name    age  salary
Bob     20   10K

name    age  salary
Shideh  18   35K
Angela  55   140K

Shared-Nothing: Horizontal Partitioning Pitfalls
Selections with equality predicates referencing the partitioning attribute are directed to a single node:
 o Retrieve Emp where salary = 60K
   SELECT * FROM Emp WHERE salary = 60K
Equality predicates referencing a non-partitioning attribute and range predicates are directed to all nodes:
 o Retrieve Emp where age = 20
 o Retrieve Emp where salary < 20K
   SELECT * FROM Emp WHERE salary < 20K
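The routing rule can be sketched in a few lines of Python. This is an illustration only: the function names are hypothetical, and the placement rule (salary in thousands modulo 3) is taken from the "Salary % 3" label on the partitioning slide.

```python
NUM_NODES = 3  # from the "Salary % 3" label on the slide

# Logical Emp table from the slide: (name, age, salary in thousands).
EMP = [
    ("Bob", 20, 10), ("Shideh", 18, 35), ("Ted", 50, 60),
    ("Kevin", 62, 120), ("Angela", 55, 140), ("Mike", 45, 90),
]


def node_for(salary_k):
    """Placement rule: salary (in K) modulo the number of nodes."""
    return salary_k % NUM_NODES


def build_partitions(rows):
    """Distribute rows to nodes according to the placement rule."""
    partitions = {node: [] for node in range(NUM_NODES)}
    for row in rows:
        partitions[node_for(row[2])].append(row)
    return partitions


def nodes_to_query(column, op, value):
    """Return the nodes that must take part in a one-predicate selection."""
    if column == "salary" and op == "=":
        # Equality on the partitioning attribute pins the row to one node.
        return [node_for(value)]
    # Any other column, or a range predicate, could match rows anywhere.
    return list(range(NUM_NODES))


partitions = build_partitions(EMP)
print(partitions)
print(nodes_to_query("salary", "=", 60))
print(nodes_to_query("age", "=", 20))
print(nodes_to_query("salary", "<", 20))
```

Only the first query touches a single node; the other two fan out to every node even though each matches a single row (Bob), which is the pitfall the slide names.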

Shared-Disk: No Partitioning, Full Access to Data
[Diagram: Database Instance 1 holding Table A, Table B and Table C, shown next to DB Cluster Node 1, DB Cluster Node 2 and DB Cluster Node 3, which connect over a High-Speed Interconnect to a Shared Disk Subsystem holding Table A, Table B and Table C.]

Scalability & Availability: Shared Nothing
[Diagram: Node A, Node B and Node C, with Slave A, Slave B and Slave C.]

Scalability & Availability: Shared Disk
[Diagram: MySQL Servers with ScaleDB Engine (Node A, Node B, Node C, Node D, Node E) attached to a single Data store.]

Shared-Disk: Summarizing Shared-Disk Benefits
Grow by simply adding nodes to the cluster
 o Servers can be added and removed dynamically according to your needs
 o No interruption to your application
High-Availability with dynamic failover
 o Existing nodes automatically take over
Significantly reduced maintenance costs
 o Can be built on low-cost commodity hardware
 o No data partitioning
 o No need for slaves
Low Total Cost of Ownership (TCO)

Shared-Disk: Making it work with MySQL
[Diagram: Node 1 runs Server Instance A with ScaleDB Engine Instance A; Node 2 runs Server Instance B with ScaleDB Engine Instance B. Each engine instance contains a Cluster Manager, a Buffer Manager and a Comm. Layer. The nodes are joined by the Cluster Interconnect and attach to the Shared Disk Sub-system.]

Shared-Disk: Insert New Row
[Diagram: the same two-node layout, with the Cluster Interconnect and the Shared Disk Sub-system.]

Shared-Disk: Select
[Diagram: the same two-node layout, with the Cluster Interconnect and the Shared Disk Sub-system.]

Shared-Disk: Create Table
[Diagram: the same two-node layout, with the Cluster Interconnect and the Shared Disk Sub-system; Table A Meta-Data appears twice.]

ScaleDB Installation
Define cluster = true in the ScaleDB config file. ScaleDB.cnf is in the same directory as my.cnf.
Cluster params:
 o cluster = true
 o nodes_in_cluster = 2
 o node_id = 1
 o this_machine_port = 100
 o next_machine_ip_address =
 o next_machine_port = 100
 o log_directory = /share/logs/

Demo - Sysbench
 o ScaleDB cluster, one node: show throughput
 o ScaleDB cluster, 2nd node: show throughput

ScaleDB: Multi-Table Indexing
B-tree: Only indexes the data in tables
[Diagram: tables #1 to #5, each with its own index, Index #1 to Index #5.]
ScaleDB: Indexes the data and relationships
[Diagram: one ScaleDB Index over tables #1 to #5.]
Advantages:
 o Faster
 o Smaller
 o Referential integrity

Example Scenario
Select information that is spread across 3 tables: Colleges, Students and Enrollment
Relationships: Students are enrolled in courses within departments of colleges

SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
JOIN Student AS s
JOIN Enrollment AS e
JOIN Course AS c2
  ON ( c1.CollNo = s.CollNo
   AND s.CollNo = e.CollNo
   AND s.StudentNo = e.StudentNo
   AND e.CollNo = c2.CollNo
   AND e.DeptNo = c2.DeptNo
   AND e.CourseNum = c2.CourseNum )
WHERE c1.CollNo = X AND s.StudentNo = Y ;
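For readers who want to run the scenario, the sketch below loads a tiny data set into an in-memory SQLite database (a stand-in, not MySQL or ScaleDB) and runs the same join with explicit ON clauses. The column types and all sample rows are hypothetical; only the table names, column names and join conditions come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in engine, not ScaleDB
conn.executescript("""
CREATE TABLE College    (CollNo INTEGER PRIMARY KEY, CollName TEXT);
CREATE TABLE Student    (CollNo INTEGER, StudentNo INTEGER, StudName TEXT,
                         PRIMARY KEY (CollNo, StudentNo));
CREATE TABLE Course     (CollNo INTEGER, DeptNo INTEGER, CourseNum TEXT,
                         CourseName TEXT,
                         PRIMARY KEY (CollNo, DeptNo, CourseNum));
CREATE TABLE Enrollment (CollNo INTEGER, StudentNo INTEGER, DeptNo INTEGER,
                         CourseNum TEXT, Grade TEXT);

-- Hypothetical sample rows.
INSERT INTO College VALUES (1, 'Engineering'), (2, 'Law');
INSERT INTO Student VALUES (1, 10, 'Ann'), (1, 11, 'Raj'), (2, 10, 'Lee');
INSERT INTO Course VALUES (1, 5, 'C101', 'Databases'),
                          (1, 5, 'C102', 'Networks'),
                          (2, 7, 'L201', 'Contracts');
INSERT INTO Enrollment VALUES (1, 10, 5, 'C101', 'A'),
                              (1, 10, 5, 'C102', 'B'),
                              (1, 11, 5, 'C101', 'C'),
                              (2, 10, 7, 'L201', 'B');
""")

# Same join conditions as the slide, attached to each JOIN for clarity.
QUERY = """
SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
JOIN Student AS s    ON c1.CollNo = s.CollNo
JOIN Enrollment AS e ON s.CollNo = e.CollNo AND s.StudentNo = e.StudentNo
JOIN Course AS c2    ON e.CollNo = c2.CollNo AND e.DeptNo = c2.DeptNo
                    AND e.CourseNum = c2.CourseNum
WHERE c1.CollNo = ? AND s.StudentNo = ?
ORDER BY c2.CourseName
"""

rows = conn.execute(QUERY, (1, 10)).fetchall()
for row in rows:
    print(row)
```

Student number 10 exists in both colleges in this sample data, so the query needs both predicates in the WHERE clause to pick out a single student.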

Option #1: Conventional Joins
 o Search enrollment by College & Student
 o Get Student information
 o Get College information
[Figure: sample rows from the College Table (ID, College, Students), the Students Table (ID, Student Name, SS#, Phone) and the Enrollment Table (College, ID, Course Name, Student, Grade).]

Option #2: Materialized View

ID   College                  Students  ID    Course Name    ID    Student Name
234  Institute of Technology  1,334     C134  Mathematics    1145  John Cheechoo…
234  Institute of Technology  1,334     C134  Mathematics    1837  Ryane Clowe…
234  Institute of Technology  1,334     C134  Mathematics    2256  Patrick Marleau…
234  Institute of Technology  1,334     C134  Mathematics    2277  Jamie McGinn…
234  Institute of Technology  1,334     C134  Mathematics    4113  Torrey Mitchell…
234  Institute of Technology  1,334     C134  Mathematics    1145  …
385  Golden State College     2,224     G85   World History  7783  Joe Pavelski…
385  Golden State College     2,224     G85   World History  2234  Jeremy Roenick…
385  Golden State College     2,224     G85   World History  1177  Devin Setoguchi…
385  Golden State College     2,224     G85   World History  4113  Torrey Mitchell…
...

Option #3: Multi-Table Index
[Diagram: ScaleDB Multi-Table Index linking College, Students, Enrollment, Departments and Courses.]

Colleges

Coll_ID#  Coll_Name     Coll_Budget  Coll_Description
001       Agriculture   $1,234,567   Nice place to visit
002       Arts          $5,432,567   Sports not so good
003       Business      $9,999,666   Cool logo
004       Education     $3,234,567   Ugh Worcester
005       Engineering   $8,238,568   Serious work
006       Law           $7,237,767   Jumpy students
007       Liberal Arts  $9,898,777   Pretty campus
008       Medicine      $5,987,004   In Texas

Students (columns: Student_ID#, College_ID#, Student_Name, Student_Desc)

Student_Name   Student_Desc
Mike Hogan     Caucasian
Moshe Smith    Caucasian
Sally Shadmon  Native American
Billy Fleegle  African American
Saul Goode     African American
Tim Collins    Polynesian
Sam Gee        Asian
Rod Paulino    Asian

Enrollment (columns: College_ID#, Dept_ID#, Student_ID#, Grade)

Grade: B, C, B, A, B, C, F, D

Mapping Foreign Keys to Data Views
Create Students Table
 o Foreign key: College
Create Enrollment Table
 o Foreign key: Students
Create Course Table
 o Foreign key: Department
Create Department Table
 o Foreign key: College
Create College Table
[Diagram: Students, Enrollment, Course, Department, College.]
The Parent-Child tables are created in MySQL such that MySQL is able to operate over the new tables.
The data of the Parent-Child tables is assembled on the fly from the source tables.

Mapping Foreign Keys to Data Views
[Diagram: parent-child chains among Students, Enrollment, Course, Department and College.]
Physical files:
 1. College
 2. Department
 3. Student
 4. Course
 5. Enrollment
ScaleDB Meta-Data Tables:
 1. College
 2. College-Dept
 3. College-Dept-Course
 4. College-Students
 5. College-Students-Enrollment
 6. Department
 7. Students
 8. Course
 9. Enrollment
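The first five meta-data table names read like root-to-table paths through the foreign-key hierarchy. The sketch below shows one way such names could be derived; it is a guess at the naming pattern for illustration, not ScaleDB's actual procedure. The PARENT map follows the foreign keys listed on the previous slide, with "Dept" used for Department as in the meta-data names.

```python
# Child table -> parent table, following the foreign keys on the slide.
# None marks the root of the hierarchy.
PARENT = {
    "College": None,
    "Dept": "College",
    "Course": "Dept",
    "Students": "College",
    "Enrollment": "Students",
}


def path_from_root(table):
    """Walk the foreign keys up to the root and join the names root-first."""
    chain = []
    while table is not None:
        chain.append(table)
        table = PARENT[table]
    return "-".join(reversed(chain))


hierarchy_tables = sorted(path_from_root(table) for table in PARENT)
for name in hierarchy_tables:
    print(name)
```

Under this assumption every table contributes exactly one path, so five physical files yield five hierarchy entries.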

Enabling the MySQL optimizer to use a Multi-Table Index

Original query:

SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
FROM College AS c1
JOIN Student AS s
JOIN Enrollment AS e
JOIN Course AS c2
  ON ( c1.CollNo = s.CollNo
   AND s.CollNo = e.CollNo
   AND s.StudentNo = e.StudentNo
   AND e.CollNo = c2.CollNo
   AND e.DeptNo = c2.DeptNo
   AND e.CourseNum = c2.CourseNum )
WHERE c1.CollNo = X AND s.StudentNo = Y ;

Virtual table over the Multi-Table Index:

CREATE TABLE sdb_view_college_course_student (
  -- Table College columns
  L1_CollNo INT NOT NULL,
  L1_CollName CHAR(32) NOT NULL,
  L1_CollBudget INT NOT NULL,
  L1_CollDescription CHAR(60) NOT NULL,
  …
  -- Table Student columns
  L2_StudNo INT NOT NULL,
  L2_StudName CHAR(48) NOT NULL,
  …
  -- Table Enrollment columns
  L3_CourseNum CHAR(9) NOT NULL,
  L3_Grade CHAR(2) NOT NULL,
  …
  PRIMARY KEY ( L1_CollNo, L2_StudNo, L3_CourseNum )
) ENGINE = SCALEDB;

Equivalent query against the virtual table:

SELECT L1_CollName, L2_StudName, L3_CourseName, L3_Grade
FROM sdb_view_college_course_student
WHERE L1_CollNo = X AND L2_StudNo = Y ;

The Multi-Table Index
 o The Multi-Table Index appears to MySQL as a data table
 o ScaleDB does not maintain a data file associated with the Multi-Table Index
 o For a query using a virtual table, ScaleDB assembles the rows on the fly using the Multi-Table Index
 o ScaleDB indexes are different from B-tree indexes
 o ScaleDB indexes provide the same functionality as B-tree, plus…
    o They maintain referential integrity with minimal overhead
    o They allow you to search for the data and relationships
    o They are much smaller in size
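As a rough illustration of "assembling rows on the fly", the sketch below keeps entries from three tables in one sorted list of composite keys and builds joined rows with a prefix scan. The slides do not describe ScaleDB's internal index format, so the sorted list is a stand-in chosen for illustration, and the sample data and function names are hypothetical. Only the L1_/L2_/L3_ column naming comes from the slides.

```python
import bisect

# One sorted index holding entries from three tables. Key length gives the level:
#   (CollNo,)                    -> College row
#   (CollNo, StudNo)             -> Student row
#   (CollNo, StudNo, CourseNum)  -> Enrollment row
ENTRIES = sorted(
    [
        ((1,), {"L1_CollName": "Engineering"}),
        ((1, 10), {"L2_StudName": "Ann"}),
        ((1, 10, "C101"), {"L3_Grade": "A"}),
        ((1, 10, "C102"), {"L3_Grade": "B"}),
        ((1, 11), {"L2_StudName": "Raj"}),
        ((1, 11, "C101"), {"L3_Grade": "C"}),
        ((2,), {"L1_CollName": "Law"}),
    ],
    key=lambda entry: entry[0],
)
KEYS = [key for key, _ in ENTRIES]


def lookup(key):
    """Exact-match lookup of one entry, or None if the key is absent."""
    i = bisect.bisect_left(KEYS, key)
    if i < len(KEYS) and KEYS[i] == key:
        return ENTRIES[i][1]
    return None


def virtual_rows(coll_no, stud_no):
    """Assemble joined rows for one student; nothing is stored pre-joined."""
    college = lookup((coll_no,))
    student = lookup((coll_no, stud_no))
    if college is None or student is None:
        return []
    prefix = (coll_no, stud_no)
    rows = []
    # Enrollment entries sort directly after their parent Student entry.
    i = bisect.bisect_right(KEYS, prefix)
    while i < len(KEYS) and len(KEYS[i]) == 3 and KEYS[i][:2] == prefix:
        row = {"L3_CourseNum": KEYS[i][2]}
        row.update(college)
        row.update(student)
        row.update(ENTRIES[i][1])
        rows.append(row)
        i += 1
    return rows


result = virtual_rows(1, 10)
for row in result:
    print(row)
```

Because child keys extend their parent's key, one ordered structure answers both "find this row" and "find the rows related to it", which is the property the bullets attribute to indexing data and relationships together.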

Demo
 o Query with join
 o Query with Multi-Table Index
 o 2nd node virtual table

Benchmarking ScaleDB Index

Summary
ScaleDB Cluster
 o Multiple ScaleDB instances share the same physical data.
 o Connecting to the cluster is similar to connecting to a single node.
 o For the application, the cluster appears as a single node.
 o Transparent application failover
 o Transparent scalability
ScaleDB Indexes
 o Provide the B-tree functionality
 o High performance
 o Map relationships
 o Maintain referential integrity
 o Smaller footprint
 o Independent of the key size

ScaleDB Status and Product Availability
 o Started Beta Process
    o We are looking for beta companies
 o Product launch is scheduled for the June timeframe
 o Please talk to us if you are a developer interested in working with ScaleDB