Presentation is loading. Please wait.

Presentation is loading. Please wait.

Prototyping SD-SQL Server: a Scalable Distributed Database System Soror SAHRI Witold LITWIN Thomas Schwarz

Similar presentations


Presentation on theme: "Prototyping SD-SQL Server: a Scalable Distributed Database System Soror SAHRI Witold LITWIN Thomas Schwarz"— Presentation transcript:

1 Prototyping SD-SQL Server: a Scalable Distributed Database System Soror SAHRI Witold LITWIN Thomas Schwarz Soror.sahri@dauphine.frSoror.sahri@dauphine.fr Witold.litwin@dauphine.fr tjschwarz@scu.eduWitold.litwin@dauphine.frtjschwarz@scu.edu Ceria Laboratory Comp. Eng. Dep. Santa Clara U.

2 WDAS Workshop, Santa Clara, CA, January 5th 2 Overview  Introduction  Overall Architecture  Application Interface  Implementation  Performance  Conclusion

3 WDAS Workshop, Santa Clara, CA, January 5th 3  Most of DBSs have distributed/parallel versions SQL Server, Oracle, DB2  DBSs do not provide dynamically scalable tables. All require manual repartitioning when tables scale-up. A Scalable Distributed Database System: SD-DBS Solution? Introduction Architecture Issue Solution

4 WDAS Workshop, Santa Clara, CA, January 5th 4 Applies SDDS technology to DBSs Why SDDSs? Provide many scalable distributed partitioning schemes. LH*, RP*, k-RP*, LH* RS … These schemes can serve as the basis for SD-DBS architecture Introduction Architecture Issue Solution

5 WDAS Workshop, Santa Clara, CA, January 5th 5 SDDS Technology for DBSs Basic Design Constraints  SDDS Key & tuple-at-the-time based access Search and insert Forwarding Access to system internals  SD-DBS Rich assertional SQL queries No access « under the cover » Application interface has to be used No forwarding Not built-in in any major DBMS

6 WDAS Workshop, Santa Clara, CA, January 5th 6 SD-SQL Server  A prototype SD-DBS  Runs on SQL Server  Uses linked SQL Server nodes Shared Nothing Architecture Client, Server & Peer SDDS nodes Up to 250 nodes at present  Uses updatable distributed partitioned views SDDS client image  Uses AFTER triggers To monitor local tables To split locally overflowing ones

7 WDAS Workshop, Santa Clara, CA, January 5th 7 SD-SQL Server Architecture Split User/Application Linked SQL Servers D1 NDBs D2DiDiDi +1 _D1_T SD-SQL server SD-SQL server SD-SQL client SSPC I I D1_T SD-SQL Server Managers _D1_T I T sd_select SD-SQL peer sd_insert

8 WDAS Workshop, Santa Clara, CA, January 5th 8 SD-SQL Server Architecture: Nodes, SDBs, NDBs  SD-SQL Server is a collection of distributed SD- SQL Server nodes. Linked SQL Server Nodes  An SD-SQL Server Node carries node databases (NDBs) SQL Server DBs Elements of some SDBs  A dynamic collection of NDBs with the same name forms a Scalable (Distributed) Database (SDB) Created at some node with one local NDB Becoming primary NDB and node for the SDB

9 WDAS Workshop, Santa Clara, CA, January 5th 9 DB1 SDB SD-SQL Server Architecture Node1Node2Node3Node i DB1DB1 …… DB1 DB1DB1 DB2 SDB DB2 DB3 SDB DB3 MDB

10 WDAS Workshop, Santa Clara, CA, January 5th 10  An SD-SQL Server NDB is Client NDB Carries only images Interfaces applications Server NDB Carries only the segments Peer NDB Both functions Primary NDB First created for an SDB Carries SDB meta-data Can be server or peer NDB only SD-SQL Server Architecture: Nodes, SDBs, NDBs

11 WDAS Workshop, Santa Clara, CA, January 5th 11  An SD-SQL Server node is Peer Node Carries any NDBs Client Node Carries only Client NDBs Server Node Carries only Server NDBs No application interface Primary Node First ever created  By a script  Can only be server or peer node  Carries the meta-DB (MDB) SD-SQL Server Architecture: Nodes, SDBs, NDBs

12 WDAS Workshop, Santa Clara, CA, January 5th 12  sd_create_node ‘Dell1’ /* Primary node created by script */  sd_create_node ‘Dell2’ /* Server by default */  sd_create_node ‘Dell3, ‘client’  sd_create_node ‘Ceria1’,’peer’  sd_alter_ node ‘Dell3’, ‘ADD server’ /* Becomes peer*/  sd_create_scalable_database ‘SkyServer, ‘Dell1’ /* Creates the primary SkyServer NDB as well at Dell1*/  sd_create_node_database ‘SkyServer’, ‘Dell3’, ‘client’ SD-SQL Server Architecture: Nodes, SDBs, NDBs

13 WDAS Workshop, Santa Clara, CA, January 5th 13 SD-SQL Server Architecture: Scalable Table : Creation  An SDB contains scalable (distributed) tables  Created by the sd_create_table command  Issued to client or peer NDB sd_create_table ‘PhotoObj (objid BIGINT PRIMARY KEY…)’, 10000

14 WDAS Workshop, Santa Clara, CA, January 5th 14 SD-SQL Server Architecture: Scalable Table : Images  A scalable (distributed) table is a collection of segments hidden behind images Scalable SQL Server distributed updatable partitioned views of the segments  Union-all views  Using Lazy Schema Validation option Primary image Created by SD-SQL Server at the table creation Resides at the creation node  Client or peer NDB where the command was issued  In the current prototype Has the name of the scalable table Secondary images Created later  by sd_create_image command Reside at other client or peer NDBs of the SDB Have a specific name, other than that of the table  To avoid name conflict

15 WDAS Workshop, Santa Clara, CA, January 5th 15 Images DB1 SDB N1.DB1N2.DB1N3.DB1 S S S PhotoObj ScalableTable Primary Image CREATE VIEW PhotoObj AS SELECT * FROM N1.DB1.PhotoObj UNION ALL SELECT * FROM N2.DB1.PhotoObj UNION ALL SELECT * FROM N3.DB1.PhotoObj

16 WDAS Workshop, Santa Clara, CA, January 5th 16 SD-SQL Server Architecture: Scalable Table : Segments  Segments are SQL tables Initially, the table has only one primary segment At some server or peer node Peer node could be the table creation node Splits produce the other segments Each is located at a different NDB  Within the SDB  If there is not enough NDBs, splits dynamically append new ones A split occurs when an insert overflows the segment capacity Measured in # of tuples At present all segments of a table have the same capacity Segments may be indexed By segments of SD-SQL Server scalable indexes

17 WDAS Workshop, Santa Clara, CA, January 5th 17 SD-SQL Server Architecture: Scalable Table : Split  A single insert may overflow One segment by one tuple Tuple insert split Produces half-half split appending a single new segment One segment by any number of tuples Bulk insert single segment split Appends one or several new segments Each new segment is 50% loaded Splitting segment is at least 50% - 100% loaded Several segments, each by any number of tuples Multiple segment split Appends one or several new segments Each new segment is 50% loaded Each splitting segment is at least 50% - 100% loaded

18 WDAS Workshop, Santa Clara, CA, January 5th 18 SD-SQL Server Architecture: Scalable Table : Split  Splits are range partitioned  With respect to the partition key Must be a key attribute (SQL Server restriction) 1st key attribute (SD-SQL Server default) Any other key attribute (user defined in sd_create_table command) E.g., foreign key  Split generate SQL Server check constraints Whenever the table has several segments The constraints fix the range of key for each segment

19 WDAS Workshop, Santa Clara, CA, January 5th 19 Tuple insert split S b+1 S S1S1 p b+1-p p=INT(b/2) C( S)=   { c: c  h = c (b+1-p)} C( S 1 )={c: c > l = c (b+1-p)} Check Constraint? b

20 WDAS Workshop, Santa Clara, CA, January 5th 20 Bulk Insert Single Segment Split (a) Initially (b) After the insert (c) After the split

21 WDAS Workshop, Santa Clara, CA, January 5th 21 Multiple Segment Split

22 WDAS Workshop, Santa Clara, CA, January 5th 22 SD-SQL Server Architecture: Scalable Table : Image Adjustment  Splits do not manipulate images  A split makes all existing images outdated The existing distributed partitioned views do not address any new segments  Image correctness is checked when a query addressing the image comes in Before SD-SQL Server executes the query  Image is adjusted if needed New view is produced

23 WDAS Workshop, Santa Clara, CA, January 5th 23 SD-SQL Server Command Interface  The application manipulates scalable tables through SD-SQL Server commands.  These start with sd_.... to distinguish from SQL Server commands for static tables  Command types:  Creation : sd_create_node… SDB, NDB, table, image, index  Alteration : sd_alter_node, sd_alter_table  Removal : sd_drop_node…  Search queries : sd_select  includes sd_select …into…  Creating a scalable table  Update queries : sd_insert, sd_update, sd_ delete

24 WDAS Workshop, Santa Clara, CA, January 5th 24 SD-SQL Server Command Interface  Every command is implemented as SQL Server stored procedure  Initially in MDB  Every standard SQL command has SD-SQL Server counterpart:  With slightly different syntax, besides the sd_ prefix  Brackets around standard SQL clauses, SD-SQL Server specific clauses…  Performing some SD-SQL Server specific processing  Generating some SQL command  To image(s) or every segment  SD-SQL Server commands do not support some SQL Server specific clauses  Case Of for instance  SQL Server create view command does not have SD-SQL Server counterpart

25 WDAS Workshop, Santa Clara, CA, January 5th 25 SD-SQL Server Command Interface  sd_create_table ‘Neighbors (htmid BIGINT, objid BIGINT, Neighborobjid BIGINT) ON PRIMARY KEY…)’, 500, ‘objid’  sd_alter_table ‘PhotoObj ADD t INT, 1000  sd_create_index ‘run_index ON Photoobj (run)‘  sd_create_image ‘Ceria1’, ‘PhotoObj’  sd_drop_image 'SD.Dell3_Photoobj‘ USE Skyserver /* SQL Server command */ sd_insert ‘INTO PhotoObj SELECT * FROM Ceria5.Skyserver-S.PhotoObj sd_select ‘* FROM PhotoObj’ sd_select ‘TOP 5000 * INTO PhotoObj1 FROM PhotoObj’, 500

26 WDAS Workshop, Santa Clara, CA, January 5th 26 SD-SQL Server Naming Rules  At each NDB, including MDB, SD-SQL Server has its own account named SD  For secondary images and segments  For the meta-tables  Otherwise SD-SQL Server uses the SQL Server public dbo account  For any primary image  Hence for every scalable table, for the applications  For SD-SQL Server stored procedures  commands etc.  SD-SQL Server does let scalable tables to be under user accounts  At present  E.g., table dell1.Skyserver.soror.photoObj can only be a static table

27 WDAS Workshop, Santa Clara, CA, January 5th 27 SD-SQL Server Naming Rules  Primary image of scalable table T has SQL Server name dbo.T in its NDB  Users at different NDBs may create different scalable tables T  Not at the same NDB  At every NDB, segment of T created at node N of the SDB bears the name SD._N_T.  At every NDB, secondary image of T created at node N of the SDB bears the name SD.N_T.  The rules avoid the name conflict  Between primary and secondary images and the segments of different scalable tables named T at their NDBs  Between SD-SQL Server objects and other SQL Server objects  Static tables and views

28 WDAS Workshop, Santa Clara, CA, January 5th 28 SD-SQL Server Meta-Tables  Contain various SD-SQL Server specific data in every NDB  Every server NDB (S-catalog)  SD.RP (SgmNd, CreatNd, Table)  Describes the actual partitioning of every scalable table with the primary segment at the NDB  SD.Size (CreatNd, Table, Size)  Contains the segment size for every scalable table at the NDB  SD.Primary (PrimNd, CreatNd, Table).  For every segment at the NDB, a tuple points towards the primary segment of the table the segment belongs to  SD.SDBNode (Node)  Points towards the primary NDB of the SDB.  SD.MDBNode (Node).  Points towards the primary node.

29 WDAS Workshop, Santa Clara, CA, January 5th 29 Scalable Tables S S S … S DB1 SDB N1.DB1N2.DB1N3.DB1 PhotoObj Scalable Table Ni.DB1 1000 Size N1.DB1 Primary Meta-Tables Ni.DB1 Nodes N1.DB1 N2.DB1 N3.DB1 RP

30 WDAS Workshop, Santa Clara, CA, January 5th 30 SD-SQL Client Meta-Tables  Client NDB (C-catalog)  SD.Image (Name, Type, PrimNd,Size)  registers all the local images  SD.Server (Node)  provides the server (peer) node(s) available for the primary segment of a table to create.  Contains only one tuple at present  May contain more  e.g., for the fault tolerance or load balancing.  SD.SDBNode (Node)  Points towards the primary NDB of the SDB.  SD.MDBNode (Node).  Points towards the primary node.

31 WDAS Workshop, Santa Clara, CA, January 5th 31 SD-SQL Peer & al. Meta-Tables  Peer NDB (P-catalog)  C-catalog UNION S-catalog  MDB  SD.Nodes (Node, Type)  Each tuple registers an SD-SQL Server node currently forming the SD-SQL configuration.  Each primary NDB  SD.NDB (Node, NDBType).  Registers all the NDBs currently composing the SDB.  NDBType indicates whether the NDB is a peer, server or client.

32 WDAS Workshop, Santa Clara, CA, January 5th 32 SD-SQL Server Table Evolution  The split leaves the overflowing segment at least half full.  Every new segment ends up half full  To attain the typical load factor of almost 70 %  Split processing tries to do not delay the commit of the insert triggering it  Splitting may be a relatively long operation  AFTER trigger tests the overflow  Asynchronous SQL Server job termed Splitter performs the split

33 WDAS Workshop, Santa Clara, CA, January 5th 33 SD-SQL Server Table Evolution  The allocation of nodes to new segments of a scalable table tries to randomly balance node loads among the clients and /or peers.  The splitting algorithm allocates nevertheless the same nodes to the successive segments of different scalable tables of the same client.  All this, to reduce query execution time  Usually the queries tend to address the tables of the same client

34 WDAS Workshop, Santa Clara, CA, January 5th 34 SD-SQL Server Table Evolution Concurrent execution of the split and of the scalable queries is efficient and serializable.  A concurrent scalable query that addresses the tuples in an overflowing segment  either manipulates them before the split migrates out any of them  or manipulates them only when the split is over

35 WDAS Workshop, Santa Clara, CA, January 5th 35 SD-SQL Server Table Evolution  SD SQL Server processes every command as a distributed transaction at Repeatable Read isolation level  Splits use exclusive locks on RP and segments  Shared locks on other meta-tables  E.g. SD.Primary  Scalable queries use basically shared locks on RP, Image and any other table involved  See details in the paper  Creation of new segment scheme  Keys  Check Constraint calculus  Indexing new segments  Serializability analysis  Deadlocks  Etc.

36 WDAS Workshop, Santa Clara, CA, January 5th 36 Concurrent Split Processing Splitter sd_alter N2.DB1 N1.DB1 N2.DB1 N3.DB1 RP S Exclusive Lock attente Exclusive Lock Shared Lock Exclusive Lock X X

37 WDAS Workshop, Santa Clara, CA, January 5th 37 SD-SQL Server Image Processing  Image Checking & Adjustment  Compares Image meta-table and RP  Expected / Actual # of segments of the table  Recreates the distributed partitioned view if needed  Updates Image  Image Binding  Finds whether a name in FROM clause depends on a scalable table  The name can be a view name or a table name  A view may depend on a view etc.  Processing parses the query and goes recursively through  Image table  SQL Server system tables:  sysobjects and sysdepends  Et the end, it determines all the image names involved and checks upon each of them

38 WDAS Workshop, Santa Clara, CA, January 5th 38 Experimental Performance Analysis  To determine the SD-SQL Server processing efficiency On P4 1.8 GHz PCs with 1 Gbs local net.  Use of the SkyServer BD as benchmark http://research.microsoft.com/~gray/SDSS http://research.microsoft.com/~gray/SDSS Use of the PhotoObj table as a scalable table. PhotoObj has 158,426 tuples (about 260 MB)

39 WDAS Workshop, Santa Clara, CA, January 5th 39 Experiments (Q) SELECT COUNT (*) FROM PhotoObj Execution time of (Q) on SQL Server and SD-SQL Server

40 WDAS Workshop, Santa Clara, CA, January 5th 40 Conclusion  Scalable distributed databases with scalable tables are now a reality with SD-SQL Server No more manual repartitioning Unlike in any other DBS we know about  See the “Related Work” in the paper  The performance analysis proves Efficiency of our design Immediate utility of SD-SQL Server  Future Work Quite a lot Our system is only the “proof-of-the-concept” See the paper

41 Thank You

42 WDAS Workshop, Santa Clara, CA, January 5th 42 Linked SQL Servers D1 D2DiDi+1 SSPC D1_T _D1_T User/Application T sd_create_table NDBs SD-SQL server SD-SQL client SD-SQL Server Managers SD-SQL peer User/Application sd_insert _D1_T Split

43 WDAS Workshop, Santa Clara, CA, January 5th 43 DB1 SDB Node1Node2Node3Node i DB1DB1 …… DB1 DB1DB1 DB2 SDB DB2 MDB


Download ppt "Prototyping SD-SQL Server: a Scalable Distributed Database System Soror SAHRI Witold LITWIN Thomas Schwarz"

Similar presentations


Ads by Google