An Overview of a Scalable Distributed Database System: SD-SQL Server

1 An Overview of a Scalable Distributed Database System: SD-SQL Server
Witold LITWIN, Soror SAHRI (Ceria Laboratory, Paris-Dauphine University) & Thomas SCHWARZ (Comp. Eng. Dep., Santa Clara U.)
BNCOD 2006

2 Overview
Plan
1. Introduction
3. SD-SQL Server Architecture
4. SD-SQL Server Application Interface
5. Implementation of SD-SQL Server
6. Conclusion & Future Work
Speaker notes: My presentation is organized around the following 6 points. I will start with an introduction that sets out the context and the objectives of my thesis.

3 Facts
Most DBSs have distributed/parallel versions
  SQL Server, Oracle, DB2…
DBSs do not provide dynamically scalable tables
  All require manual repartitioning when tables scale up

4 Facts http://ceria.dauphine.fr/CERIA-publications.html
Research Report, December 2005 [Oracle Database 10g]

5 Objective
Conceive a Scalable Distributed Database System (SD-DBS)
The SD-DBS should allow dynamic partitioning of scalable data

6 SDDSs
What's an SDDS?
  An SDDS is a new class of data structures
  Specific to multi-computers
Why SDDSs?
  SDDSs provide many scalable distributed partitioning schemes
    LH*, RP*, k-RP*, LH*RS…
  These schemes can serve as the basis for the SD-DBS architecture

7 SD-SQL Server?
SD-SQL Server is a Scalable Distributed Database System (SD-DBS)
SD-SQL Server uses the SD-DBS architecture
  Proposed by: Pr. Litwin, Pr. Schwarz & Pr. Risch (2002)
SD-SQL Server runs on Microsoft SQL Server 2000
  Shared Nothing Architecture
  Up to 250 nodes at present

8 Gross Architecture
[Figure: gross architecture. Labels: User/Application, sd_create_table, sd_insert, SD-SQL Server Managers (SD-SQL server, SD-SQL client, SD-SQL peer; S, P, C), linked SQL Servers, NDBs D1, D2, … Di, Di+1, scalable table T with segments _D1_T, Split.]

9 The Nodes, SDBs & NDBs
[Figure: nodes Node1, Node2, Node3, … Nodei. Labels: MDB, node databases DB1 and DB2 on the nodes, SDB DB1, SDB DB2.]

10 NDB Types
An SD-SQL Server NDB can be:
  Client NDB
    Carries only images
    Application interfaces
  Server NDB
    Carries only the segments
  Primary NDB
    First created for an SDB
  Peer NDB
    Both functions

11 Scalable Tables
A scalable (distributed) table is a collection of segments
  Segments are SQL tables
A scalable table has, initially, only one primary segment
  At some server or peer NDB
All the segments of a scalable table have the same characteristics

12 Scalable Tables: Meta-data
Each scalable table has meta-data:
  The segment capacity
  The actual partitioning of the scalable table
  The check constraint of each segment
    A check constraint defines the Min and Max for each segment
These meta-data are stored in the meta-tables

13 Scalable Tables: Meta-data
[Figure: the meta-tables of SDB DB1 (NDBs N1.DB1, N2.DB1, N3.DB1, … Ni.DB1) for scalable table T. Labels: Primary, RP, Size (1000), Nodes.]

14 Scalable Tables: Splitting
The number of segments in a scalable table is variable
If a segment overflows, its split is triggered
  A split occurs when an insert overflows the segment capacity
Splits produce other segments for a scalable table
Each new segment resulting from the split will have a check constraint
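
The overflow-then-split behaviour on this slide can be sketched in Python. This is only an illustration under stated assumptions, not SD-SQL Server code: the class name ScalableTable, its fields, and the in-memory lists standing in for SQL segment tables are hypothetical, and moving the upper INT(b/2) keys to the new segment is one plausible reading of the scheme.

```python
import bisect


class ScalableTable:
    """Hypothetical range-partitioned table.

    Segment i holds the keys in [lows[i], lows[i + 1]); the bounds play the
    role of the per-segment check constraints.
    """

    def __init__(self, capacity):
        if capacity < 2:
            raise ValueError("this sketch assumes a capacity of at least 2")
        self.capacity = capacity        # b: maximum tuples per segment
        self.lows = [float("-inf")]     # lower bound of each segment's range
        self.segments = [[]]            # initially a single primary segment

    def insert(self, key):
        # Route the key to the segment whose range covers it.
        i = bisect.bisect_right(self.lows, key) - 1
        bisect.insort(self.segments[i], key)
        # An insert that overflows the capacity triggers a split.
        if len(self.segments[i]) > self.capacity:
            self._split(i)

    def _split(self, i):
        seg = self.segments[i]
        cut = len(seg) - self.capacity // 2   # the upper INT(b/2) keys move
        upper = seg[cut:]
        del seg[cut:]
        # The new segment's lower bound is the smallest key it received.
        self.lows.insert(i + 1, upper[0])
        self.segments.insert(i + 1, upper)
```

Inserting the keys 1 to 10 into a table of capacity 4 triggers two splits and leaves three segments, so the number of segments follows the data rather than a fixed partitioning.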

15 Scalable Tables: Splitting
Check Constraint?
[Figure: segment S with b+1 tuples and capacity b, split into S and S1; labels p and b+1-p.]
p = INT(b/2)
C(S) = { c : c < h = c(b+1-p) }
C(S1) = { c : c >= c(b+1-p) }
SELECT TOP Pi * WITH TIES INTO Ni.S1 FROM S ORDER BY C ASC
SELECT TOP Pi * INTO Ni.Si FROM S ORDER BY C ASC
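
A small worked example may make the check constraints concrete. The Python sketch below rests on assumptions that the slide does not state: keys are distinct, c(i) is read as the key at 0-based position i in ascending order, and the function name split_segment is hypothetical. Under that reading h is the smallest key that moves to S1.

```python
def split_segment(keys, b):
    """Split an overflowing segment of b + 1 distinct keys.

    Returns (p, h, s, s1): the number of keys moved, the boundary key,
    the keys kept in S (c < h) and the keys moved to S1 (c >= h).
    """
    if b < 2 or len(keys) != b + 1:
        raise ValueError("expected b >= 2 and exactly b + 1 keys")
    ordered = sorted(keys)
    p = b // 2                       # p = INT(b/2)
    h = ordered[b + 1 - p]           # h = c(b+1-p), 0-based (assumption)
    s = [c for c in ordered if c < h]      # check constraint C(S)
    s1 = [c for c in ordered if c >= h]    # check constraint C(S1)
    return p, h, s, s1
```

With b = 4 and keys 10, 20, 30, 40, 50 this gives p = 2 and h = 40, so S keeps 10, 20, 30 under the constraint c < 40 and S1 receives 40, 50 under c >= 40.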

16 Scalable Tables: Splitting
Single segment bulk insert split

17 Scalable Tables: Splitting
Multi-segment split

18 Scalable Tables: Splitting
[Figure: scalable table T growing across nodes N1, N2, N3, N4, … Ni of SDB DB1. Labels: sd_create_node, sd_create_node_database, NDB DB1, sd_insert.]

19 Images
An image hides the scalable table partitioning
An image is an SQL Server distributed updateable partitioned view of the table
  An SQL Server union-all view with check constraints
An image resides on client or peer NDBs
All meta-data of an image are stored in the Image meta-table

20 Image Types
Primary image
  Resides at the creation node
  Has the name of the scalable table
Secondary images
  Reside at other client or peer NDBs of the SDB
  Have a specific name, other than that of the table
    To avoid name conflict

21 Image Adjustment
An image presents the actual partitioning of its scalable table
  Defines the partitioning as known to the client
  It does not address any new segments resulting from a split
Images are dynamically adjustable by the client
  When a query to the image comes in
    Image checking
    Image adjustment if necessary

22 Image Adjustment
Get the number of segments presented in the image, N1
Get the number of segments of the scalable table, N2
Compare N1 and N2:
  If N1 < N2 then Image Adjustment
    Alter the partitioned view definition
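
The N1 versus N2 comparison can be sketched as follows. This is a hedged Python illustration: the Image class and the check_and_adjust function are hypothetical names, and the list of current segments is passed in directly, whereas SD-SQL Server would read it from the meta-tables.

```python
from dataclasses import dataclass, field


@dataclass
class Image:
    """Hypothetical stand-in for an image: the segments its view unions."""
    name: str
    segments: list = field(default_factory=list)


def check_and_adjust(image, table_segments):
    """Compare N1 (image) with N2 (table); adjust the image if it lags.

    Returns True when an adjustment took place.
    """
    n1 = len(image.segments)      # segments presented in the image
    n2 = len(table_segments)      # segments of the scalable table
    if n1 < n2:
        # Stands in for altering the partitioned view definition.
        image.segments = list(table_segments)
        return True
    return False
```

A query would call such a check before it executes, so an image that is already up to date costs one comparison and no view alteration.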

23 Image: Example
[Figure: SDB DB1 with NDBs N1.DB1, N2.DB1, N3.DB1, N4.DB1; scalable table T and its primary image T.]
CREATE VIEW T AS SELECT * FROM N2.DB1.SD._N1_T
CREATE VIEW T AS SELECT * FROM N2.DB1.SD._N1_T UNION ALL SELECT * FROM N3.DB1.SD._N1_T UNION ALL SELECT * FROM N4.DB1.SD._N1_T
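
The view definitions above follow a regular pattern, one SELECT per segment joined by UNION ALL. The Python helper below generates that text; the function name image_view_sql and its parameters are hypothetical, and it only builds a string without talking to SQL Server.

```python
def image_view_sql(view_name, segment_names, verb="CREATE"):
    """Build the union-all view statement for an image.

    verb is "CREATE" for a new image or "ALTER" when an existing image
    is adjusted after a split.
    """
    if not segment_names:
        raise ValueError("an image needs at least one segment")
    selects = [f"SELECT * FROM {name}" for name in segment_names]
    return f"{verb} VIEW {view_name} AS " + " UNION ALL ".join(selects)


statement = image_view_sql(
    "T",
    ["N2.DB1.SD._N1_T", "N3.DB1.SD._N1_T", "N4.DB1.SD._N1_T"],
)
```

Passing verb="ALTER" with a longer segment list gives the statement an image adjustment would issue once new segments appear.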

24 Application Interface
The application interface manipulates scalable tables through SD-SQL Server commands
The SD-SQL Server commands start with 'sd_' to distinguish them from SQL Server commands for static tables
  INSERT -> sd_insert
  CREATE TABLE -> sd_create_table

25 Nodes Management
Node Creation
  sd_create_node 'Dell1' /* Server by default */
  sd_create_node 'Ceria', 'client'
Node Alteration
  sd_alter_node 'Ceria', 'ADD server' /* Becomes peer */
Node Removal
  sd_drop_node 'Ceria'

26 SDB & NDB Management
SDB Creation
  sd_create_scalable_database 'SkyServer', 'Dell1', 'Server', 2 /* Creates the primary SkyServer NDB as well at Dell1 */
SDB Alteration
  sd_create_node_database 'SkyServer', 'Ceria', 'Client'
SDB Removal
  sd_drop_scalable_database 'SkyServer'

27 Scalable Tables Management
Scalable Table Creation
  sd_create_table 'PhotoObj (objid BIGINT PRIMARY KEY…)', 10000
Scalable Table Alteration
  sd_alter_table 'PhotoObj ADD t INT', 1000
  sd_create_index 'run_index ON Photoobj (run)'
  sd_drop_index 'PhotoObj.run_index'
Scalable Table Removal
  sd_drop_table 'PhotoObj'

28 Image Adjustment
Secondary Image Creation
  sd_create_image 'Ceria', 'PhotoObj'
  sd_create_image 'Dell2', 'PhotoObj'
Secondary Image Removal
  sd_drop_image 'PhotoObj'

29 Scalable Queries Management
USE SkyServer /* SQL Server command */
Scalable Update Queries
  sd_insert 'INTO PhotoObj SELECT * FROM Ceria5.Skyserver-S.PhotoObj'
Scalable Search Queries
  sd_select '* FROM PhotoObj'
  sd_select 'TOP 5000 * INTO PhotoObj1 FROM PhotoObj', 500

30 Command Processing
Let Q be a scalable query using the PhotoObj image:
  sd_select 'COUNT (*) FROM PhotoObj'
Image Binding
  Find images in Q
  Checking of the PhotoObj image
Adjustment
  PhotoObj image adjustment
Execution of Q

31 Concurrency
SD-SQL Server processes every command as an SQL distributed transaction at the Repeatable Read isolation level
  Much less blocking than at the Serializable level
SD-SQL Server performs the split asynchronously with the insert that triggered it
  It launches the actual splitting as an asynchronous job called the splitter
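
The idea of an insert that returns while the split runs as a separate job can be sketched with Python threads. This is an analogy under loud assumptions, not the SD-SQL Server mechanism: the names Segment, splitter and insert are hypothetical, a thread stands in for the asynchronous job, and a single threading.Lock stands in for the exclusive lock a split takes on a segment.

```python
import threading


class Segment:
    """Hypothetical in-memory segment guarded by one lock."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.keys = []
        self.lock = threading.Lock()   # stands in for an exclusive lock


def splitter(segment, new_segments):
    """Background job: move the upper INT(b/2) keys to a new segment."""
    with segment.lock:
        if len(segment.keys) <= segment.capacity:
            return                      # another splitter already did the work
        segment.keys.sort()
        cut = len(segment.keys) - segment.capacity // 2
        new_segments.append(segment.keys[cut:])
        del segment.keys[cut:]


def insert(segment, key, new_segments):
    """Insert a key; on overflow, start the splitter and return at once."""
    with segment.lock:
        segment.keys.append(key)
        overflow = len(segment.keys) > segment.capacity
    if not overflow:
        return None
    job = threading.Thread(target=splitter, args=(segment, new_segments))
    job.start()
    return job                          # the caller may join it, or not
```

The insert releases its lock before the splitter starts, so the triggering command does not wait for the split to finish; other work on the same segment blocks only while the splitter holds the lock.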

32 Concurrency
Splits use exclusive locks on segments and on tuples in the RP meta-table
  Shared locks on other meta-tables: Primary, NDB meta-tables
Scalable queries basically use shared locks on meta-tables and any other table involved

33 Concurrency: Example
[Figure: lock conflict between the Splitter and sd_alter_table on the RP meta-table and the PhotoObj table of Dell1.SkyServer. Legend: exclusive lock (X), shared lock, waiting.]

34 Experimental Environment
6 machines, Pentium IV 1.7 GHz
RAM: 780 MB & 1 GB
Operating system: Windows 2K Server
Ethernet network: max bandwidth of 1 Gb/s
Use of SQL Analyzer for editing queries
Use of SQL Profiler to take measurements

35 The SkyServer Benchmark
We use the SkyServer database as a benchmark
  Provided and installed at Ceria by Dr. Gray
SkyServer brings the entire database of the Sloan Digital Sky Survey, SDSS
We use the PhotoObj table as an example scalable table
  In our experiments, PhotoObj has 158,426 tuples (about 260 MB)
  Originally, it has 14 M tuples

36 Splitting Measurements
Splitting of PhotoObj scalable table into 2, 3, 4 and 5 segments according to different capacities

37 Splitting Measurements
Splitting of PhotoObj scalable table (of 160 k tuples) with indexes into 2, 3, 4 and 5 segments

38 Image Adjustment (Q) sd_select ‘COUNT (*) FROM PhotoObj’
Query (Q1) execution time

39 Image Adjustment: Complex Query
sd_select 'TOP x.objid FROM PhotoObj x, PhotoObj y WHERE x.obj=y.obj AND x.objid>y.objid

40 Image Adjustment (Q) sd_select ‘COUNT (*) FROM Ti’

41 Image Adjustment: Interpretation
Scalable View Overhead sec per level and per table Real Overhead 13%

42 Comparison between SD-SQL Server & SQL Server
(Q): sd_select 'COUNT (*) FROM PhotoObj'
Execution time of (Q) on SQL Server and SD-SQL Server

43 Conclusion
Scalable distributed databases with scalable tables are now a reality with SD-SQL Server
  No more manual repartitioning
  Unlike in any other DBS we know about
The performance analysis proves
  Efficiency of our design
  Immediate utility of SD-SQL Server
Use of SD-SQL Server as a core component of a virtual repository of eGov documents

44 Future Works
More performance measurements
Query error processing
Management of data failures
  Use of the high availability methods
Application on other DBMSs
  Oracle, DB2, etc.

45 Thank you for your attention

