Parallel Databases COMP3211 Advanced Databases Dr Nicholas Gibbins - 2014-2015.

Overview 2 The I/O bottleneck Parallel architectures Parallel query processing –Inter-operator parallelism –Intra-operator parallelism –Bushy parallelism Concurrency control Reliability

The I/O Bottleneck

The Memory Hierarchy, Revisited

Type       Capacity     Latency
Registers  10^1 bytes   1 cycle
L1         10^4 bytes   <5 cycles
L2         10^5 bytes   5-10 cycles
RAM        –            20-30 cycles (10^-8 s)
Hard Disk  –            10^6 cycles (10^-3 s)

The I/O Bottleneck 5 Access time to secondary storage (hard disks) dominates performance of DBMSes Two approaches to addressing this: –Main memory databases (expensive!) –Parallel databases (cheaper!) Increase I/O bandwidth by spreading data across a number of disks

Definitions 6 Parallelism –An arrangement or state that permits several operations or tasks to be performed simultaneously rather than consecutively Parallel Databases –have the ability to split –processing of data –access to data –across multiple processors, multiple disks

Why Parallel Databases 7 Hardware trends Reduced elapsed time for queries Increased transaction throughput Increased scalability Better price/performance Improved application availability Access to more data In short, for better performance

Parallel Architectures

Shared Memory Architecture Tightly coupled Symmetric Multiprocessor (SMP). P = processor, M = memory. (Diagram: processors P connected to a single global memory.)

Software – Shared Memory Less complex database software. Limited scalability. Single buffer, single database storage. (Diagram: processors P connected to a single global memory.)

Shared Disc Architecture Loosely coupled, distributed memory. (Diagram: processors P, each with its own memory M, connected to a shared disc S.)

Software – Shared Disc Avoids the memory bottleneck. Each processor has its own database buffer, but the same page may be in more than one buffer at once, which can lead to incoherence. Needs a global locking mechanism. Single logical database storage. (Diagram: processors P with private memories M sharing a disc S.)

Shared Nothing Architecture Massively parallel, loosely coupled. High-speed interconnect between processors. (Diagram: processors P, each with its own memory M, connected by the interconnect.)

Software – Shared Nothing Each processor owns part of the data and has its own database buffer. A page is only ever in one local buffer, so there is no buffer incoherence. Needs distributed deadlock detection, a multiphase commit protocol, and the ability to break SQL requests into multiple sub-requests. (Diagram: processors P, each with its own memory M.)

Hardware vs. Software Architecture 15 It is possible to use one software strategy on a different hardware arrangement Also possible to simulate one hardware configuration on another –Virtual Shared Disk (VSD) makes an IBM SP shared nothing system look like a shared disc setup (for Oracle) From this point on, we deal only with shared nothing

Shared Nothing Challenges 16 Partitioning the data Keeping the partitioned data balanced Splitting up queries to get the work done Avoiding distributed deadlock Concurrency control Dealing with node failure

Parallel Query Processing

Dividing up the Work (Diagram: the application submits work to a coordinator process, which divides it among worker processes.)

Database Software on each node (Diagram: each node runs a DBMS instance with worker processes W1 and W2; coordinator processes C1 and C2 handle applications App1 and App2.)

Inter-Query Parallelism 20 Improves throughput Different queries/transactions execute on different processors –(largely equivalent to material in lectures on concurrency)

Intra-Query Parallelism 21 Improves response times (lower latency) Intra-operator (horizontal) parallelism –Operators decomposed into independent operator instances, which perform the same operation on different subsets of data Inter-operator (vertical) parallelism –Operations are overlapped –Pipeline data from one stage to the next without materialisation Bushy (independent) parallelism –Subtrees in query plan executed concurrently

Intra-Operator Parallelism

(Diagram: a SQL query is decomposed into subset queries, each executed by a different processor.)

Partitioning 24 Decomposition of operators relies on data being partitioned across the servers that comprise the parallel database –Access data in parallel to mitigate the I/O bottleneck Partitions should aim to spread I/O load evenly across servers Choice of partitions affords different parallel query processing approaches: –Range partitioning –Hash partitioning –Schema partitioning

Range Partitioning (Diagram: rows distributed to partitions by key range, e.g. A-H, I-P, Q-Z.)

Hash Partitioning (Diagram: rows of a table distributed to partitions by a hash of the partitioning key.)

Schema Partitioning (Diagram: whole tables assigned to different partitions, e.g. Table 1 and Table 2 on different nodes.)
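The three schemes can be summarised as simple placement functions. A minimal sketch in Python, assuming three nodes and hypothetical key and table names (the data layout is illustrative, not from the slides):

```python
# Sketch of range, hash and schema partitioning, assuming 3 nodes.
NODES = 3

def range_partition(key: str) -> int:
    """Place a row by the first letter of its key: A-H, I-P, Q-Z."""
    ranges = [("A", "H"), ("I", "P"), ("Q", "Z")]
    first = key[0].upper()
    for node, (lo, hi) in enumerate(ranges):
        if lo <= first <= hi:
            return node
    return NODES - 1  # anything outside the ranges goes to the last node

def hash_partition(key) -> int:
    """Place a row by hashing its partitioning key."""
    return hash(key) % NODES

def schema_partition(table_name: str) -> int:
    """Place whole tables on nodes (hypothetical assignment)."""
    assignment = {"customer": 0, "order": 1, "supplier": 2}
    return assignment[table_name]

print(range_partition("Gibbins"))            # 0 (falls in A-H)
print(hash_partition(42) in range(NODES))    # True
print(schema_partition("order"))             # 1
```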

Rebalancing Data 28 Data in proper balance Data grows, performance drops Add new nodes and disc Redistribute data to new nodes

Intra-Operator Parallelism 29 Example query: –SELECT c1,c2 FROM t WHERE c1>5.5 Assumptions: –100,000 rows –Predicates eliminate 90% of the rows Considerations for query plans: –Data shipping –Query shipping

Data Shipping (Plan: π c1,c2 over σ c1>5.5 over the union of partitions t1, t2, t3, t4 – the partitions are shipped to the coordinator, which performs the union, selection and projection.)

Data Shipping (Diagram: each worker ships its entire 25,000-tuple partition across the network; the coordinator applies the predicate and projection to produce 10,000 (c1,c2) tuples.)

Query Shipping (Plan: each partition t1…t4 is processed locally by its own π c1,c2 over σ c1>5.5; only the union is performed at the coordinator.)

Query Shipping (Diagram: each worker sends only its 2,500 qualifying tuples across the network; the coordinator unions them into 10,000 (c1,c2) tuples.)

Query Shipping Benefits 34 Database operations are performed where the data are, as far as possible Network traffic is minimised For basic database operators, code developed for serial implementations can be reused In practice, mixture of query shipping and data shipping has to be employed
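A minimal sketch of the difference, assuming each worker's partition of t is just a Python list of (c1, c2) rows and the coordinator is the calling function (names and data are illustrative, not from the slides):

```python
import random

# Four workers, each holding one horizontal partition of t as (c1, c2) rows.
random.seed(0)
partitions = [
    [(random.uniform(0, 6.0), f"w{n}_r{i}") for i in range(25_000)]
    for n in range(4)
]

def data_shipping(parts):
    """Every partition is shipped whole; the coordinator filters and projects."""
    shipped = [row for part in parts for row in part]   # all 100,000 rows cross the network
    return [(c1, c2) for (c1, c2) in shipped if c1 > 5.5]

def query_shipping(parts):
    """Each worker filters and projects locally; the coordinator only unions."""
    local = [[(c1, c2) for (c1, c2) in part if c1 > 5.5] for part in parts]
    return [row for res in local for row in res]        # only qualifying rows cross the network

assert sorted(data_shipping(partitions)) == sorted(query_shipping(partitions))
```

Both return the same answer; the difference is purely in how many tuples each worker puts on the network, which is why query shipping is preferred where possible.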

Inter-Operator Parallelism

36 Allows operators with a producer-consumer dependency to be executed concurrently –Results produced by producer are pipelined directly to consumer –Consumer can start before producer has produced all results –No need to materialise intermediate relations on disk (although available buffer memory is a constraint) –Best suited to single-pass operators
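Python generators give a compact single-process illustration of this pipelining: each operator pulls rows from its producer one at a time, so the consumer starts before the producer has finished and no intermediate relation is materialised. This is a sketch of the idea only, with made-up operator and column names:

```python
def scan(rows):
    """Producer: yields one row at a time rather than building a full relation."""
    for row in rows:
        yield row

def select(rows, predicate):
    """Filters rows as they arrive from its producer."""
    for row in rows:
        if predicate(row):
            yield row

def project(rows, columns):
    """Projects each row as it arrives."""
    for row in rows:
        yield tuple(row[c] for c in columns)

table = [{"c1": i * 0.5, "c2": f"r{i}"} for i in range(20)]

# Operators are chained, but nothing executes until the consumer pulls a row.
pipeline = project(select(scan(table), lambda r: r["c1"] > 5.5), ("c1", "c2"))
for c1, c2 in pipeline:
    print(c1, c2)
```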

Inter-Operator Parallelism (Diagram: timeline showing Scan, Join and Sort stages overlapped in a pipeline rather than executed one after another.)

Intra- + Inter-Operator Parallelism (Diagram: timeline with several Scan–Join–Sort pipelines running in parallel, combining both forms of parallelism.)

The Volcano Architecture 39 Basic operators as usual: –scan, join, sort, aggregate (sum, count, average, etc) The Exchange operator –Inserted between the steps of a query to: –Pipeline results –Direct streams of data to the next step(s), redistributing as necessary Provides mechanism to support both vertical and horizontal parallelism

Exchange Operators Example query: –SELECT county, SUM(order_item) FROM customer, order WHERE order.customer_id = customer.customer_id GROUP BY county ORDER BY SUM(order_item)

Exchange Operators (Diagram: serial plan – SCAN Customer and SCAN Order feed a HASH JOIN, followed by GROUP and SORT.)

Exchange Operators (Diagram: an EXCHANGE operator inserted between SCAN Customer and the HASH JOIN.)

Exchange Operators (Diagram: EXCHANGE operators inserted between both scans and the HASH JOIN.)

Exchange Operators (Diagram: the complete plan, with EXCHANGE operators between the scans and the HASH JOIN, and between the HASH JOIN and the GROUP/SORT stages.)
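A minimal sketch of what an exchange operator does: it sits between producer and consumer instances, pulling rows from every producer stream and redistributing them (here by hashing the grouping key) to the appropriate downstream instance. The two-instance setup and field names are assumptions for illustration, not the Volcano code, and this version buffers the output where a real exchange streams it:

```python
from collections import defaultdict

def exchange(producer_streams, key, n_consumers):
    """Redistribute rows from all producer instances into n_consumers streams,
    routing each row by a hash of the given key (hash partitioning)."""
    outputs = defaultdict(list)
    for stream in producer_streams:
        for row in stream:
            outputs[hash(row[key]) % n_consumers].append(row)
    return [outputs[i] for i in range(n_consumers)]

# Two scan instances produce customer rows partitioned by custkey ...
scan_outputs = [
    [{"custkey": 1, "county": "Hants"}, {"custkey": 3, "county": "Dorset"}],
    [{"custkey": 2, "county": "Hants"}, {"custkey": 4, "county": "Wilts"}],
]

# ... and the exchange re-partitions them on 'county', so each downstream
# GROUP BY instance sees every row for the counties routed to it.
for i, stream in enumerate(exchange(scan_outputs, key="county", n_consumers=2)):
    print(f"group-by instance {i}: {stream}")
```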

Bushy Parallelism

Bushy Parallelism Execute subtrees of the query plan concurrently. (Diagram: a query tree over relations R, S, T and U with σ and π operators, whose independent subtrees are evaluated in parallel.)

Parallel Query Processing

Some Parallel Queries 48 Enquiry Collocated Join Directed Join Broadcast Join Repartitioned Join Combine aspects of intra-operator and bushy parallelism

Orders Database
CUSTOMER (CUSTKEY, C_NAME, …, C_NATION, …)
ORDER (ORDERKEY, DATE, …, CUSTKEY, …, SUPPKEY, …)
SUPPLIER (SUPPKEY, S_NAME, …, S_NATION, …)

Enquiry/Query “How many customers live in the UK?” Each slave task SCANs its partition of the CUSTOMER table and COUNTs the matching rows; the subcounts are returned to the coordinator, which SUMs them and returns the total to the application.
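A minimal sketch of this partial-aggregation pattern, assuming each node's partition of CUSTOMER is just a Python list (data and node count are made up for illustration):

```python
# Hypothetical partitions of the CUSTOMER table across three nodes.
customer_partitions = [
    [{"custkey": 1, "c_nation": "UK"}, {"custkey": 2, "c_nation": "FR"}],
    [{"custkey": 3, "c_nation": "UK"}, {"custkey": 4, "c_nation": "UK"}],
    [{"custkey": 5, "c_nation": "DE"}],
]

def slave_count(partition, nation):
    """Each slave task scans only its own partition and returns a subcount."""
    return sum(1 for row in partition if row["c_nation"] == nation)

def coordinator_count(partitions, nation):
    """The coordinator sums the subcounts and returns the total to the application."""
    return sum(slave_count(p, nation) for p in partitions)

print(coordinator_count(customer_partitions, "UK"))   # 3
```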

Collocated Join “Which customers placed orders in July?” Requires a JOIN of CUSTOMER and ORDER. Both tables are partitioned on CUSTKEY (the same key), so corresponding entries are on the same node: each node SCANs and JOINs its local partitions, and the per-node results are UNIONed and returned to the application.

Directed Join “Which customers placed orders in July?” (tables have different partitioning keys). ORDER is partitioned on ORDERKEY and CUSTOMER on CUSTKEY. Slave task 1 SCANs ORDER and uses ORDER.CUSTKEY to direct each row to the node holding the matching CUSTOMER partition; slave task 2 SCANs CUSTOMER and performs the JOIN there; the coordinator UNIONs the results and returns them to the application.

Broadcast Join “Which customers and suppliers are in the same country?” SUPPLIER is partitioned on SUPPKEY and CUSTOMER on CUSTKEY, but the join is required on *_NATION. Slave task 1 SCANs SUPPLIER and BROADCASTs every row to each CUSTOMER node; slave task 2 SCANs CUSTOMER and performs the JOIN; the coordinator UNIONs the results and returns them to the application.

Repartitioned Join “Which customers and suppliers are in the same country?” SUPPLIER is partitioned on SUPPKEY and CUSTOMER on CUSTKEY, but the join is required on *_NATION. Slave tasks 1 and 2 SCAN and repartition both tables on *_NATION to localise and minimise the join effort; slave task 3 performs the JOIN on each new partition; the coordinator UNIONs the results and returns them to the application.
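A minimal sketch of the repartitioned join, reusing the hash-partitioning idea: both tables are re-hashed on the nation attribute so that each join task only matches rows that landed in its bucket. Data, column names and node count are illustrative assumptions:

```python
def repartition(rows, key, n_nodes):
    """Hash-repartition rows on the join key; bucket i goes to join task i."""
    buckets = [[] for _ in range(n_nodes)]
    for row in rows:
        buckets[hash(row[key]) % n_nodes].append(row)
    return buckets

def local_join(customers, suppliers):
    """Each join task matches only the rows it received."""
    return [(c["c_name"], s["s_name"], c["c_nation"])
            for c in customers for s in suppliers
            if c["c_nation"] == s["s_nation"]]

customer = [{"c_name": "Ada", "c_nation": "UK"}, {"c_name": "Bob", "c_nation": "FR"}]
supplier = [{"s_name": "Acme", "s_nation": "UK"}, {"s_name": "Brie", "s_nation": "FR"}]

n = 2
cust_parts = repartition(customer, "c_nation", n)
supp_parts = repartition(supplier, "s_nation", n)

# The coordinator unions the per-task join results.
result = [row for i in range(n) for row in local_join(cust_parts[i], supp_parts[i])]
print(result)   # the ('Ada', 'Acme', 'UK') and ('Bob', 'Brie', 'FR') pairs
```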

Concurrency Control

Concurrency and Parallelism 56 A single transaction may update data in several different places Multiple transactions may be using the same (distributed) tables simultaneously One or several nodes could fail Requires concurrency control and recovery across multiple nodes for: –Locking and deadlock detection –Two-phase commit to ensure ‘all or nothing’

Locking and Deadlocks 57 With Shared Nothing architecture, each node is responsible for locking its own data No global locking mechanism However: –T1 locks item A on Node 1 and wants item B on Node 2 –T2 locks item B on Node 2 and wants item A on Node 1 –Distributed Deadlock

Resolving Deadlocks One approach – timeouts. Timeout T2 after its wait exceeds a certain interval –Interval may need a random element to avoid ‘chatter’, i.e. both transactions giving up at the same time and then trying again Rollback T2 to let T1 proceed Restart T2, which can now complete

Resolving Deadlocks 59 More sophisticated approach (DB2) Each node maintains a local ‘wait-for’ graph Distributed deadlock detector (DDD) runs at the catalogue node for each database Periodically, all nodes send their graphs to the DDD DDD records all locks found in wait state Transaction becomes a candidate for termination if found in same lock wait state on two successive iterations
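A minimal sketch of the wait-for-graph idea: each node reports its local edges (Ti waits for Tj), the detector merges them into a global graph and looks for a cycle, whose members are candidate victims. This illustrates the principle only, not DB2's actual distributed deadlock detector:

```python
def find_cycle(edges):
    """Return one cycle in the merged wait-for graph, or None.
    edges maps a transaction to the set of transactions it waits for."""
    def dfs(node, path, visiting):
        for nxt in edges.get(node, ()):
            if nxt in visiting:                        # back edge => deadlock
                return path[path.index(nxt):] + [nxt]
            cycle = dfs(nxt, path + [nxt], visiting | {nxt})
            if cycle:
                return cycle
        return None

    for start in edges:
        cycle = dfs(start, [start], {start})
        if cycle:
            return cycle
    return None

# Local wait-for graphs sent to the detector by each node.
node1 = {"T1": {"T2"}}        # on node 1, T1 waits for a lock held by T2
node2 = {"T2": {"T1"}}        # on node 2, T2 waits for a lock held by T1

merged = {}
for local in (node1, node2):
    for txn, waits in local.items():
        merged.setdefault(txn, set()).update(waits)

print(find_cycle(merged))     # ['T1', 'T2', 'T1'] -> choose a victim to roll back
```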

Reliability

61 We wish to preserve the ACID properties for parallelised transactions –Isolation is taken care of by 2PL protocol –Isolation implies Consistency –Durability can be taken care of node-by-node, with proper logging and recovery routines –Atomicity is the hard part. We need to commit all parts of a transaction, or abort all parts Two-phase commit protocol (2PC) is used to ensure that Atomicity is preserved

Two-Phase Commit (2PC) Distinguish between: –The global transaction –The local transactions into which the global transaction is decomposed Global transaction is managed by a single site, known as the coordinator Local transactions may be executed on separate sites, known as the participants 62

Phase 1: Voting Coordinator sends “prepare T” message to all participants Participants respond with either “vote-commit T” or “vote-abort T” Coordinator waits for participants to respond within a timeout period 63

Phase 2: Decision If all participants return “vote-commit T” (to commit), send “commit T” to all participants. Wait for acknowledgements within timeout period. If any participant returns “vote-abort T”, send “abort T” to all participants. Wait for acknowledgements within timeout period. When all acknowledgements received, transaction is completed. If a site does not acknowledge, resend global decision until it is acknowledged. 64
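A minimal sketch of the two phases from the coordinator's point of view, with participants modelled as in-process objects; timeouts, logging and failure handling (covered on the following slides) are omitted, and all names are illustrative:

```python
class Participant:
    def __init__(self, name, will_commit=True):
        self.name, self.will_commit, self.state = name, will_commit, "INITIAL"

    def prepare(self, txn):                       # Phase 1: vote
        self.state = "READY" if self.will_commit else "ABORT"
        return "vote-commit" if self.will_commit else "vote-abort"

    def decide(self, txn, decision):              # Phase 2: act on the global decision
        self.state = "COMMIT" if decision == "commit" else "ABORT"
        return "ack"

def two_phase_commit(txn, participants):
    # Phase 1 (voting): send "prepare T" and collect the votes.
    votes = [p.prepare(txn) for p in participants]
    # Phase 2 (decision): commit only if every participant voted to commit.
    decision = "commit" if all(v == "vote-commit" for v in votes) else "abort"
    acks = [p.decide(txn, decision) for p in participants]
    assert all(a == "ack" for a in acks)          # real 2PC resends until acknowledged
    return decision

sites = [Participant("P1"), Participant("P2"), Participant("P3", will_commit=False)]
print(two_phase_commit("T", sites))               # abort: one participant voted abort
```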

Normal Operation (Diagram: coordinator C sends “prepare T” to participant P; P replies “vote-commit T”; once vote-commit T has been received from all participants, C sends “commit T” and P replies with “ack”.)

Logging (Diagram: the same prepare/vote-commit/commit/ack exchange as in normal operation, annotated with the points at which each site writes its log records.)

Aborted Transaction (Diagram: coordinator C sends “prepare T”; this participant replies “vote-commit T”, but vote-abort T is received from at least one participant, so C sends “abort T” and the participant replies with “ack”.)

Aborted Transaction (Diagram: coordinator C sends “prepare T”; this participant replies “vote-abort T”; since vote-abort T has been received from at least one participant, C sends “abort T” and the participants reply with “ack”.)

State Transitions (Diagram: commit case – the coordinator moves INITIAL → WAIT → COMMIT and the participant moves INITIAL → READY → COMMIT as prepare T, vote-commit T, commit T and ack are exchanged; commit T is sent once vote-commit T has been received from all participants.)

State Transitions (Diagram: abort case – the coordinator moves INITIAL → WAIT → ABORT and this participant moves INITIAL → READY → ABORT; abort T is sent because vote-abort T was received from at least one participant.)

State Transitions (Diagram: abort case for a participant that votes abort – the coordinator moves INITIAL → WAIT → ABORT, while this participant moves directly from INITIAL to ABORT after sending vote-abort T.)

Coordinator State Diagram (INITIAL → WAIT after sending “prepare T”; WAIT → ABORT on receiving “vote-abort T” and sending “abort T”; WAIT → COMMIT on receiving “vote-commit T” from all participants and sending “commit T”.)

Participant State Diagram (INITIAL → READY on receiving “prepare T” and sending “vote-commit T”; INITIAL → ABORT on receiving “prepare T” and sending “vote-abort T”; READY → COMMIT on receiving “commit T” and sending “ack”; READY → ABORT on receiving “abort T” and sending “ack”.)

Dealing with failures 74 If the coordinator or a participant fails during the commit, two things happen: –The other sites will time out while waiting for the next message from the failed site and invoke a termination protocol –When the failed site restarts, it tries to work out the state of the commit by invoking a recovery protocol The behaviour of the sites under these protocols depends on the state they were in when the site failed

Termination Protocol: Coordinator Timeout in WAIT –Coordinator is waiting for participants to vote on whether they're going to commit or abort –A missing vote means that the coordinator cannot commit the global transaction –Coordinator may abort the global transaction Timeout in COMMIT/ABORT –Coordinator is waiting for participants to acknowledge successful commit or abort –Coordinator resends global decision to participants who have not acknowledged 75

Termination Protocol: Participant Timeout in INITIAL –Participant is waiting for a “prepare T” –May unilaterally abort the transaction after a timeout –If “prepare T” arrives after unilateral abort, either: –resend the “vote-abort T” message or –ignore (coordinator then times out in WAIT) Timeout in READY –Participant is waiting for the instruction to commit or abort – blocked without further information –Participant can contact other participants to find one that knows the decision – cooperative termination protocol 76

Recovery Protocol: Coordinator Failure in INITIAL –Commit not yet begun, restart commit procedure Failure in WAIT –Coordinator has sent “prepare T”, but has not yet received all vote-commit/vote-abort messages from participants –Recovery restarts commit procedure by resending “prepare T” Failure in COMMIT/ABORT –If coordinator has received all “ack” messages, complete successfully –Otherwise, terminate 77

Recovery Protocol: Participant Failure in INITIAL –Participant has not yet voted –Coordinator cannot have reached a decision –Participant should unilaterally abort by sending “vote-abort T” Failure in READY –Participant has voted, but doesn't know what the global decision was –Cooperative termination protocol Failure in COMMIT/ABORT –Resend “ack” message 78

Parallel Utilities

80 Ancillary operations can also exploit the parallel hardware –Parallel Data Loading/Import/Export –Parallel Index Creation –Parallel Rebalancing –Parallel Backup –Parallel Recovery