Parallel and distributed databases R & G Chapter 22.

Slides:



Advertisements
Similar presentations
Chapter 7 LAN Operating Systems LAN Software Software Compatibility Network Operating System (NOP) Architecture NOP Functions NOP Trends.
Advertisements

Distributed databases
Chapter 13 (Web): Distributed Databases
City University London
Information Resources Management April 17, Agenda n Administrivia n Database Architectures.
ABCSG - Distributed Database 1 Data Management Distributed Database Data Replication.
2/18/2004 Challenges in Building Internet Services February 18, 2004.
Copyright © 2007 Ramez Elmasri and Shamkant B. Navathe Slide
Distributed Database Management Systems
Chapter 1 Introduction 1.1A Brief Overview - Parallel Databases and Grid Databases 1.2Parallel Query Processing: Motivations 1.3Parallel Query Processing:
Overview Distributed vs. decentralized Why distributed databases
EEC-681/781 Distributed Computing Systems Lecture 3 Wenbing Zhao Department of Electrical and Computer Engineering Cleveland State University
Chapter 12 Distributed Database Management Systems
McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies, Inc. All rights reserved. Chapter 17 Client-Server Processing, Parallel Database Processing,
©Silberschatz, Korth and Sudarshan18.1Database System Concepts Centralized Systems Run on a single computer system and do not interact with other computer.
CMU SCS Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications C. Faloutsos – A. Pavlo How to Scale a Database System.
Distributed databases
WPI, MOHAMED ELTABAKH PARALLEL & DISTRIBUTED DATABASES 1.
Distributed Databases
An Introduction to Infrastructure Ch 11. Issues Performance drain on the operating environment Technical skills of the data warehouse implementers Operational.
Database Architectures and the Web
PMIT-6102 Advanced Database Systems
Database Design, Application Development, and Administration, 5 th Edition Copyright © 2011 by Michael V. Mannino. All rights reserved. Chapter 18 Client-Server.
Copyright © 2004 Pearson Education, Inc.. Chapter 25 Distributed Databases and Client–Server Architectures.
1 Distributed and Parallel Databases. 2 Distributed Databases Distributed Systems goal: –to offer local DB autonomy at geographically distributed locations.
Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved DISTRIBUTED SYSTEMS.
Database Architectures and the Web Session 5
1 © Prentice Hall, 2002 Physical Database Design Dr. Bijoy Bordoloi.
Database Architecture Introduction to Databases. The Nature of Data Un-structured Semi-structured Structured.
Database Systems Design, Implementation, and Management Coronel | Morris 11e ©2015 Cengage Learning. All Rights Reserved. May not be scanned, copied or.
Chapter 18: Database System Architectures
Exercises for Chapter 2: System models
Parallel Databases COMP3017 Advanced Databases Dr Nicholas Gibbins
IMDGs An essential part of your architecture. About me
Alireza Angabini Advanced DB class Dr. M.Rahgozar Fall 88.
Database Systems: Design, Implementation, and Management Tenth Edition Chapter 12 Distributed Database Management Systems.
Database Systems: Design, Implementation, and Management Ninth Edition Chapter 12 Distributed Database Management Systems.
Week 5 Lecture Distributed Database Management Systems Samuel ConnSamuel Conn, Asst Professor Suggestions for using the Lecture Slides.
Parallel Databases COMP3211 Advanced Databases Dr Nicholas Gibbins
Usenix Annual Conference, Freenix track – June 2004 – 1 : Flexible Database Clustering Middleware Emmanuel Cecchet – INRIA Julie Marguerite.
Achieving Scalability, Performance and Availability on Linux with Oracle 9iR2-RAC Grant McAlister Senior Database Engineer Amazon.com Paper
Parallel Databases 77. Introduction 4 Basic idea: use multiple disks, memory and/or processors to speed up querying. 4 Measures –Throughput – how many.
1 Distributed Databases (DDBs) Chap Distributed Databases Distributed Systems goal: –to offer local DB autonomy at geographically distributed locations.
Oracle's Distributed Database Bora Yasa. Definition A Distributed Database is a set of databases stored on multiple computers at different locations and.
The Replica Location Service The Globus Project™ And The DataGrid Project Copyright (c) 2002 University of Chicago and The University of Southern California.
Distributed Databases DBMS Textbook, Chapter 22, Part II.
ASMA AHMAD 28 TH APRIL, 2011 Database Systems Distributed Databases I.
1 Distributed Databases BUAD/American University Distributed Databases.
Databases Illuminated
CS338Parallel and Distributed Databases11-1 Parallel and Distributed Databases Lecture Topics Multi-CPU and distributed systems Monolithic system Client–server.
Distributed database system
1 Distributed Databases Chapter 21, Part B. 2 Introduction v Data is stored at several sites, each managed by a DBMS that can run independently. v Distributed.
Authors Brian F. Cooper, Raghu Ramakrishnan, Utkarsh Srivastava, Adam Silberstein, Philip Bohannon, Hans-Arno Jacobsen, Nick Puz, Daniel Weaver, Ramana.
What does the Cloud mean for Data Management: Challenges and Opportunities Akrivi Vlachou Norwegian University of Science and Technology (NTNU), Trondheim,
Dynamo: Amazon’s Highly Available Key-value Store DAAS – Database as a service.
Mapping the Data Warehouse to a Multiprocessor Architecture
CMU SCS Carnegie Mellon Univ. Dept. of Computer Science /615 - DB Applications C. Faloutsos – A. Pavlo Lecture#24: Distributed Database Systems (R&G.
Introduction to Distributed Databases Yiwei Wu. Introduction A distributed database is a database in which portions of the database are stored on multiple.
1 Distributed Databases architecture, fragmentation, allocation Lecture 1.
Em Spatiotemporal Database Laboratory Pusan National University File Processing : Database Management System Architecture 2004, Spring Pusan National University.
An Introduction to Super-Scalability But first…
Distributed databases A brief introduction with emphasis on NoSQL databases Distributed databases1.
CPT-S Advanced Databases 11 Yinghui Wu EME 49.
Distributed Systems Architecure. Architectures Architectural Styles Software Architectures Architectures versus Middleware Self-management in distributed.
Distributed Databases
6/25/2018.
Introduction to NewSQL
Mapping the Data Warehouse to a Multiprocessor Architecture
Chapter 17: Database System Architectures
Database System Architectures
Presentation transcript:

Parallel and distributed databases R & G Chapter 22

What is a distributed database?

Why distribute a database Scalability and performance Resilience to failures Throughput Data size versus X X

Why distribute a database Data is already distributed Or needs to be distributed Data is in multiple systems

Why not distribute a database You must earn your complexity! Communication needed Must build a complex infrastructure Unpredictable latencies must be masked More types of failures More components to fail Network failures Congestion, timeouts More complex planning Communication cost plus I/O cost May have to deal with heterogeneity Different types of systems Different schemas, possibly incompatible Different administrative domains

Types of distributed databases

The old days: mainframes Definitely not distributed!

Client-server User interaction Data processing Network

Parallel database

Primary/secondary X

Multidatabase

How do they work? What is shared? How to distribute the data? How to process the data? How to update the data?

What is shared? Memory CPUsRAM Disk Most modern DBMSs

What is shared? Disk RAM Oracle RAC

What is shared? Nothing RAM Search engines, Teradata

Server 1Server 2Server 3Server 4 Bike$86 6/2/ Chair$10 6/5/ How to distribute the data? Couch$570 6/1/ Car$1123 6/1/ Lamp$19 6/7/ Bike$56 6/9/ Scooter$18 6/11/ Hammer$8000 6/11/

How to distribute the data? Hash partitioning Range partitioning (key,value) Hash() (key,value) <= X> X

Server 1Server 2Server 3Server 4 How to distribute the data? Bike Chair Couch Car Lamp Bike Scooter Hammer $86 $10 $570 $1123 $19 $56 $18 $8000 6/2/07 6/5/07 6/1/07 6/7/07 6/9/07 6/11/

Query processing Intra-operator parallelism Inter-operator parallelism

Parallel scanning filter Result

Sorting

Parallel hash join Hash()

Join

Semi-join

Inter-operator parallelism

Updating distributed data Synchronous: read-any-write-all Reads are fast

Updating distributed data Synchronous: voting

Updating distributed data Synchronous: voting Writes tolerant to disconnection

Consistency of distributed data Should provide ACID

Primary/secondary

Two-phase commit PREPARE PREPARED COMMIT

Two-phase commit PREPARE PREPAREDABORT

Two-phase commit PREPARE PREPARED ABORT

Two-phase commit PREPARE PREPARED X

Conclusion Parallelism and distribution very useful Performance Fault tolerance Scale But complex! Rethink lots of aspects of the system Must earn the complexity