Thursday April 17, 2008 ScaleDB Technical Presentation Scaling MySQL to New Heights.

Presentation transcript:

Thursday April 17, 2008 ScaleDB Technical Presentation Scaling MySQL to New Heights

ScaleDB for MySQL
Database Storage Engine: InnoDB, MyISAM, Cluster, Falcon, BDB, Merge, etc.

What Makes ScaleDB Better?
ScaleDB Advantages:
- Performance: New indexing delivers dramatic performance improvement
- Scalability: Designed for clustering with Plug-and-Cluster™ Architecture

Improving Performance

ScaleDB Indexing
- General Purpose Indexing: Conventional Indexing (B-tree)
- Special-purpose Index Add-ons*: Hash, Bitmap, Aggregate, etc.
  *Only supported by high-end commercial databases
- ScaleDB Index: A general purpose index that also delivers much of the functionality and performance of special-purpose index add-ons

ScaleDB: Multi-Table Indexing
- B-tree: Only indexes the data in tables (Index #1 to Index #5, one per table #1 to #5)
- ScaleDB: Indexes the data and relationships (one ScaleDB Index across tables #1 to #5)
Advantages:
- Faster
- Smaller
- Referential integrity
- More functionality

Describing Our Demo
- Scenario: Select information that is spread across 3 tables: Colleges, Students and Enrollment
- Relationships: Students are enrolled in courses within departments of colleges
- DDL Definitions

The Query

    SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
    FROM College AS c1
      STRAIGHT_JOIN Student AS s
      STRAIGHT_JOIN Enrollment AS e
      STRAIGHT_JOIN Course AS c2
        ON (     c1.CollNo = s.CollNo
             AND s.CollNo = e.CollNo
             AND s.StudentNo = e.StudentNo
             AND e.CollNo = c2.CollNo
             AND e.DeptNo = c2.DeptNo
             AND e.CourseNum = c2.CourseNum )
    WHERE c1.CollNo = X
      AND s.StudentNo = Y;

A Sample Scenario
Scenario: I need information that is spread across 3 tables: Colleges, Students and Enrollment

Colleges
Col_ID#  Col_Name    Col_Budget  Col_Description
0001     Amherst     $1,234,567  Nice place to visit
0002     Berkeley    $5,432,567  Sports not so good
0003     Harvard     $9,999,666  Cool logo
0004     Holy Cross  $3,234,567  Ugh Worcester
0005     MIT         $8,238,568  Serious work
0006     Cornell     $7,237,767  Jumpy students
0007     Stanford    $9,898,777  Pretty campus
0008     TCU         $5,987,004  In Texas

Students and Enrollment: shown with the same eight rows as Colleges

Column headers shown:
- Coll_ID#, Coll_Name, Coll_Budget, Coll_Description
- Dept_ID#, Dept_Name, Coll_ID#, Dept_Budget
- Course_ID#, Course_Name, Coll_ID#, Dept_ID#

Options:
#1: Conventional Joins
#2: Materialized View
#3: ScaleDB

The Query: execution plan (EXPLAIN output) for the query above

| table | type   | key            | key_len | rows | filtered |
| c1    | const  | PRIMARY        | 4       | 1    |          |
| s     | const  | PRIMARY        | 14      | 1    |          |
| e     | ref    | idx_EnrollStud | 4       | 3    |          |
| c2    | eq_ref | PRIMARY        | 17      | 1    |          |

Option #1: Conventional Joins

Colleges
Col_ID#  Col_Name      Col_Budget  Col_Description
001      Agriculture   $1,234,567  Nice place to visit
002      Arts          $5,432,567  Sports not so good
003      Business      $9,999,666  Cool logo
004      Education     $3,234,567  Ugh Worcester
005      Engineering   $8,238,568  Serious work
006      Law           $7,237,767  Jumpy students
007      Liberal Arts  $9,898,777  Pretty campus
008      Medicine      $5,987,004  In Texas

Students
Student_Name   Student_Desc
Mike Hogan     Caucasian
Moshe Smith    Caucasian
Sally Shadmon  Native American
Billy Fleegle  African American
Saul Goode     African American
Tim Collins    Polynesian
Sam Gee        Asian
Rod Paulino    Asian

Enrollment
Grade: B, C, B, A, B, C, F, D

Table columns:
- Colleges: Coll_ID#, Coll_Name, Coll_Budget, Coll_Description
- Students: Student_ID#, College_ID#, Student_Name, Student_Desc
- Enrollment: College_ID#, Dept_ID#, Student_ID#, Grade

Indexes used: Colleges Index(es), Students Index(es), Enrollment Index(es), combined by a Join

Query Result: 008 Medicine $5,987,004 In Texas | Saul Goode African American | 4455 B+ |
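As a rough illustration of the conventional-join option, the sketch below runs the deck's query shape against an in-memory SQLite database from Python. Everything here is an assumption for illustration: SQLite stands in for MySQL (so plain JOIN replaces STRAIGHT_JOIN), the column types, the columns behind idx_EnrollStud, the helper names build_demo_db and grades_for, and all sample rows are made up. Only the table names, column names and join conditions come from the slide.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with the four tables named in the query.

    Column types and sample rows are hypothetical; the slides do not show the DDL.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE College (CollNo INTEGER PRIMARY KEY, CollName TEXT);
        CREATE TABLE Student (CollNo INTEGER, StudentNo INTEGER, StudName TEXT,
                              PRIMARY KEY (CollNo, StudentNo));
        CREATE TABLE Course (CollNo INTEGER, DeptNo INTEGER, CourseNum INTEGER,
                             CourseName TEXT,
                             PRIMARY KEY (CollNo, DeptNo, CourseNum));
        CREATE TABLE Enrollment (CollNo INTEGER, StudentNo INTEGER,
                                 DeptNo INTEGER, CourseNum INTEGER, Grade TEXT);
        -- Index name taken from the plan on the slide; its columns are assumed.
        CREATE INDEX idx_EnrollStud ON Enrollment (CollNo, StudentNo);

        INSERT INTO College VALUES (1, 'Agriculture'), (8, 'Medicine');
        INSERT INTO Student VALUES (1, 1, 'Mike Hogan'), (8, 1, 'Saul Goode');
        INSERT INTO Course VALUES (1, 20, 201, 'Soil Science'),
                                  (8, 10, 101, 'Anatomy'),
                                  (8, 10, 102, 'Biochemistry');
        INSERT INTO Enrollment VALUES (1, 1, 20, 201, 'A'),
                                      (8, 1, 10, 101, 'B+'),
                                      (8, 1, 10, 102, 'A-');
    """)
    return conn


# Same join conditions as the slide, written with one ON clause per join.
QUERY = """
    SELECT c1.CollName, s.StudName, c2.CourseName, e.Grade
    FROM College AS c1
    JOIN Student AS s
      ON c1.CollNo = s.CollNo
    JOIN Enrollment AS e
      ON s.CollNo = e.CollNo AND s.StudentNo = e.StudentNo
    JOIN Course AS c2
      ON e.CollNo = c2.CollNo AND e.DeptNo = c2.DeptNo
     AND e.CourseNum = c2.CourseNum
    WHERE c1.CollNo = ? AND s.StudentNo = ?
    ORDER BY c2.CourseName
"""


def grades_for(conn, coll_no, student_no):
    """Return (college, student, course, grade) rows for one student."""
    return conn.execute(QUERY, (coll_no, student_no)).fetchall()


if __name__ == "__main__":
    for row in grades_for(build_demo_db(), 8, 1):
        print(row)
```

Each JOIN step here consults a separate per-table index, which is the step-wise work the slide attributes to conventional joins.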

Option #2: Materialized View
Copies (and synchronizes) the data from individual tables into one massive view

Materialized View
View columns: Coll_ID#, Coll_Name, Coll_Budget, Coll_Description, Student_ID#, Student_Name, Student_Desc, Dept_ID#, Grade

Coll_ID#  Coll_Name    Coll_Budget  Coll_Description     Student_Name  Student_Desc      Dept_ID#  Grade
…
001       Agriculture  $1,234,567   Nice place to visit  Mike Hogan    Caucasian         3345      A
001       Agriculture  $1,234,567   Nice place to visit  Mike Hogan    Caucasian         3235      B+
001       Agriculture  $1,234,567   Nice place to visit  Mike Hogan    Caucasian         3245      A-
001       Agriculture  $1,234,567   Nice place to visit  Mike Hogan    Caucasian         3245      B
001       Agriculture  $1,234,567   Nice place to visit  Mike Hogan    Caucasian         3235      A+
001       Agriculture  $1,234,567   Nice place to visit  Paul Martyn   Caucasian         3239      A-
001       Agriculture  $1,234,567   Nice place to visit  Paul Martyn   Caucasian         3239      B
001       Agriculture  $1,234,567   Nice place to visit  Paul Martyn   Caucasian         3240      A+
008       Medicine     $5,987,004   In Texas             Saul Goode    African American  4455      A
008       Medicine     $5,987,004   In Texas             Saul Goode    African American  4455      B+
008       Medicine     $5,987,004   In Texas             Saul Goode    African American  4455      A-
008       Medicine     $5,987,004   In Texas             Saul Goode    African American  4455      B
008       Medicine     $5,987,004   In Texas             Paul Martyn   Caucasian         4454      A-
008       Medicine     $5,987,004   In Texas             Paul Martyn   Caucasian         4454      B
008       Medicine     $5,987,004   In Texas             Paul Martyn   Caucasian         4454      A+

Base tables: Colleges, Students and Enrollment, with the same rows as in Option #1

Indexes used: Materialized View Indexes

Query Result: 008 Medicine $5,987,004 In Texas | Saul Goode African American | 4455 B+ |
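The copy-and-synchronize cost of the materialized-view option can be sketched as follows. This is a minimal illustration under stated assumptions: SQLite has no materialized views, so a plain table rebuilt by a hypothetical refresh_view helper plays that role; the schema, the table name EnrollmentView, the index name idx_view and the sample rows are all invented for the example.

```python
import sqlite3

# The join that the "view" precomputes; column names follow the deck's query.
VIEW_SQL = """
    SELECT c.CollNo, c.CollName, s.StudentNo, s.StudName, e.DeptNo, e.Grade
    FROM College AS c
    JOIN Student AS s ON s.CollNo = c.CollNo
    JOIN Enrollment AS e
      ON e.CollNo = s.CollNo AND e.StudentNo = s.StudentNo
"""


def make_db():
    """Create hypothetical base tables with a couple of sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE College (CollNo INTEGER PRIMARY KEY, CollName TEXT);
        CREATE TABLE Student (CollNo INTEGER, StudentNo INTEGER, StudName TEXT);
        CREATE TABLE Enrollment (CollNo INTEGER, StudentNo INTEGER,
                                 DeptNo INTEGER, Grade TEXT);
        INSERT INTO College VALUES (8, 'Medicine');
        INSERT INTO Student VALUES (8, 1, 'Saul Goode');
        INSERT INTO Enrollment VALUES (8, 1, 10, 'B+');
    """)
    return conn


def refresh_view(conn):
    """Rebuild the denormalized copy from the base tables (full refresh)."""
    conn.execute("DROP TABLE IF EXISTS EnrollmentView")
    conn.execute("CREATE TABLE EnrollmentView AS " + VIEW_SQL)
    conn.execute("CREATE INDEX idx_view ON EnrollmentView (CollNo, StudentNo)")


def lookup(conn, coll_no, student_no):
    """Answer the query from the precomputed copy, with no join at read time."""
    return conn.execute(
        "SELECT CollName, StudName, DeptNo, Grade FROM EnrollmentView "
        "WHERE CollNo = ? AND StudentNo = ? ORDER BY DeptNo, Grade",
        (coll_no, student_no),
    ).fetchall()


if __name__ == "__main__":
    db = make_db()
    refresh_view(db)
    db.execute("INSERT INTO Enrollment VALUES (8, 1, 11, 'A')")
    print(lookup(db, 8, 1))   # the new enrollment is not visible yet
    refresh_view(db)
    print(lookup(db, 8, 1))   # visible only after the copy is synchronized
```

Reads are a single indexed lookup, but the copy duplicates the college and student columns on every enrollment row and is stale until the next refresh, which is the trade-off the slide points at.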

Option #3: ScaleDB

Base tables: Colleges, Students and Enrollment, with the same rows as in Option #1

Table columns:
- Colleges: Coll_ID#, Coll_Name, Coll_Budget, Coll_Description
- Students: Student_ID#, College_ID#, Student_Name, Student_Desc
- Enrollment: College_ID#, Dept_ID#, Student_ID#, Grade

ScaleDB’s multi-table index is relationship-aware
A Single Index Lookup
ScaleDB Index spans: College, Departments, Courses, Students, Enrollment

Query Result: 008 Medicine $5,987,004 In Texas | Saul Goode African American | 4455 B+ |
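The slides do not describe how the multi-table index is built, so the following is only a toy model of the general idea of a relationship-aware index, not ScaleDB's actual data structure. It assumes one sorted key space shared by all tables, in which a child's key extends its parent's key, so that a parent and its related rows sit next to each other. The class MultiTableIndex, the key layout, the "S" and "E" tags and the sample rows are all hypothetical.

```python
from bisect import bisect_left, insort


class MultiTableIndex:
    """Toy index: rows from several tables share one sorted key space."""

    def __init__(self):
        self._keys = []   # sorted list of key tuples
        self._rows = {}   # key tuple -> row payload

    def insert(self, key, row):
        if key not in self._rows:
            insort(self._keys, key)
        self._rows[key] = row

    def get(self, key):
        return self._rows.get(key)

    def subtree(self, prefix):
        """Return all (key, row) pairs whose key starts with prefix.

        Matching keys are contiguous in sorted order, so this is one
        binary search followed by a short forward scan.
        """
        start = bisect_left(self._keys, prefix)
        found = []
        for key in self._keys[start:]:
            if key[:len(prefix)] != prefix:
                break
            found.append((key, self._rows[key]))
        return found


def build_index():
    """Load hypothetical college, student and enrollment rows."""
    idx = MultiTableIndex()
    idx.insert((1,), {"CollName": "Agriculture"})
    idx.insert((8,), {"CollName": "Medicine"})
    # Student keys extend the college key: (CollNo, "S", StudentNo)
    idx.insert((1, "S", 1), {"StudName": "Mike Hogan"})
    idx.insert((8, "S", 1), {"StudName": "Saul Goode"})
    idx.insert((8, "S", 2), {"StudName": "Paul Martyn"})
    # Enrollment keys extend the student key: (..., "E", DeptNo, CourseNum)
    idx.insert((1, "S", 1, "E", 20, 201), {"Grade": "A"})
    idx.insert((8, "S", 1, "E", 10, 101), {"Grade": "B+"})
    idx.insert((8, "S", 1, "E", 10, 102), {"Grade": "A-"})
    return idx


def student_report(index, coll_no, student_no):
    """Assemble (college, student, grade) rows from the single index."""
    prefix = (coll_no, "S", student_no)
    college = index.get((coll_no,))
    entries = index.subtree(prefix)
    if college is None or not entries or entries[0][0] != prefix:
        return []
    student = entries[0][1]
    return [(college["CollName"], student["StudName"], row["Grade"])
            for _key, row in entries[1:]]


if __name__ == "__main__":
    print(student_report(build_index(), 8, 1))
```

In this model the parent-child relationship is encoded in the key itself, so no separate per-table index is consulted to move from a student to that student's enrollments.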

Building Relationships in ScaleDB
- College: Create College
- Departments: Create Department - foreign key: College
- Courses: Create Course - foreign key: Department
- Students: Create Students - foreign key: College
- Enrollment: Create Enrollment - foreign key: Students
Relationship creation is automated
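The foreign-key chain listed above can be written out as ordinary DDL. The sketch below does so in SQLite from Python and shows the referential-integrity check that such declarations enable. It is an illustration only: the column names and types are assumptions, the helper names open_db and try_insert are hypothetical, and it says nothing about how ScaleDB itself derives relationships from the declarations.

```python
import sqlite3

# One table per slide entry; each REFERENCES clause mirrors the
# "foreign key" named on the slide. Columns are hypothetical.
SCHEMA = """
    CREATE TABLE College (
        CollNo INTEGER PRIMARY KEY, CollName TEXT);
    CREATE TABLE Department (
        DeptNo INTEGER PRIMARY KEY, DeptName TEXT,
        CollNo INTEGER NOT NULL REFERENCES College (CollNo));
    CREATE TABLE Course (
        CourseNum INTEGER PRIMARY KEY, CourseName TEXT,
        DeptNo INTEGER NOT NULL REFERENCES Department (DeptNo));
    CREATE TABLE Student (
        StudentNo INTEGER PRIMARY KEY, StudName TEXT,
        CollNo INTEGER NOT NULL REFERENCES College (CollNo));
    CREATE TABLE Enrollment (
        StudentNo INTEGER NOT NULL REFERENCES Student (StudentNo),
        Grade TEXT);
"""


def open_db():
    """Open an in-memory database with foreign-key enforcement switched on."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # off by default in SQLite
    conn.executescript(SCHEMA)
    return conn


def try_insert(conn, sql, params):
    """Run an INSERT; return False if it violates a constraint."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = open_db()
    print(try_insert(db, "INSERT INTO College VALUES (?, ?)", (8, "Medicine")))
    print(try_insert(db, "INSERT INTO Student VALUES (?, ?, ?)",
                     (1, "Saul Goode", 8)))
    # No college 99 exists, so this row is rejected.
    print(try_insert(db, "INSERT INTO Student VALUES (?, ?, ?)",
                     (2, "Nobody", 99)))
```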

Pros & Cons of Each Method
Criteria: Ease of Implementation, Real-Time Data, Performance, Tuning
Methods compared: Conventional Joins, Materialized Views, ScaleDB

Performance Variables
- Early performance benchmarks used a vanilla scenario
- Our performance advantage increases with:
  - Query/Schema Complexity
  - Referential Integrity Checks
  - Key Size
  - Data Size/Number of Keys
- Performance Advantage: 2X – 20X+

MySQL Integration
- ScaleDB leverages its index to assemble data across tables without step-wise joins
- MySQL’s query optimizer sees multiple tables, so it calls for step-wise joins
- ScaleDB tells the query optimizer about the joined tables; these are virtualized (built on the fly)
- When MySQL’s query optimizer recognizes ScaleDB, the phantom tables will be removed

Improving Scalability

The Challenges of Scaling
- How do I partition data?
- Predict usage patterns, application evolution, data growth patterns…all are moving targets
- Avoid data skew: bottlenecks caused by frequently accessed data on just a few nodes
- Data shipping between nodes (2-phase commit)
- Searches outside the partition column require participation by all nodes
- Scaling becomes an exercise in fire fighting
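Two of the partitioning problems listed above, searches outside the partition column and data skew, can be seen in a small model. The sketch assumes a hypothetical PartitionedTable that routes rows to nodes by taking an integer key modulo the node count; the class, its methods and the sample rows are made up for illustration and do not describe any particular product.

```python
class PartitionedTable:
    """Toy shared-nothing table: each row lives on exactly one node."""

    def __init__(self, num_nodes, partition_column):
        self.nodes = [[] for _ in range(num_nodes)]
        self.partition_column = partition_column

    def _node_for(self, value):
        # Hypothetical routing rule: integer key modulo the node count.
        return value % len(self.nodes)

    def insert(self, row):
        self.nodes[self._node_for(row[self.partition_column])].append(row)

    def search(self, column, value):
        """Return (matching rows, number of nodes that had to take part)."""
        if column == self.partition_column:
            targets = [self._node_for(value)]        # one node can answer
        else:
            targets = list(range(len(self.nodes)))   # every node must scan
        matches = [row for n in targets for row in self.nodes[n]
                   if row[column] == value]
        return matches, len(targets)

    def rows_per_node(self):
        return [len(rows) for rows in self.nodes]


if __name__ == "__main__":
    table = PartitionedTable(num_nodes=4, partition_column="StudentNo")
    for i in range(8):
        table.insert({"StudentNo": i, "CollNo": i % 2})
    print(table.search("StudentNo", 5)[1])   # nodes touched for a keyed lookup
    print(table.search("CollNo", 1)[1])      # nodes touched for a non-key search

    skewed = PartitionedTable(num_nodes=4, partition_column="StudentNo")
    for key in (0, 4, 8, 12):
        skewed.insert({"StudentNo": key, "CollNo": 1})
    print(skewed.rows_per_node())            # all rows land on one node
```

A lookup on the partition column is answered by one node, while a search on any other column fans out to all of them, and an unlucky key distribution concentrates the data on a single node.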

ScaleDB’s Plug-and-Cluster™
- Cluster-ready solution, just plug in a server
- No need to partition the data
- Based on shared-everything architecture
- Found in the highest-end commercial databases
- Eliminates all of the data partitioning problems

ScaleDB Cluster
Diagram labels: Local Lock Manager, Local Lock Manager, Shared Storage

ScaleDB Cluster
Diagram labels: Shared Storage, Global Lock Manager

ScaleDB Demo

In a nutshell…
MySQL
MySQL + ScaleDB

Summary
- Revolutionary indexing solution delivers a quantum leap in performance & scalability
- Results:
  - Performance improvements of 2X and up
  - 7X smaller index size (average)
- Stop jumping through hoops to avoid joins…FREE JOINS!
- Enables more complex applications, fresh data, lower TCO, superior scalability & performance
- We’re looking for appropriate beta testers