Information Resources Management, April 17, 2001. Agenda: Administrivia; Database Architectures.

Presentation transcript:

Information Resources Management April 17, 2001

Agenda
- Administrivia
- Database Architectures

Administrivia
- Homework #8

Database Architectures
- Centralized
- Client-Server
- Parallel: single site
- Distributed: multiple sites

Database Architectures
[Diagram labels: Centralized, (Parallel), Distributed, Client-Server, Function, Data]

Centralized
- PC, Mini, or Mainframe
- Single Database
- Single Database Manager
- One or More Users
- Data and Function in One Place

Client-Server
- PCs to Mainframes to Minis
- PC to PC
- Mainframe to Mainframe
- Use Desktop Processing Power
- Better User Interface
- Greater Functionality
- Retain Centralized Control of Data

Client-Server: Basic Model
[Diagram: Client sends Request to Server; Server returns Result]

Servers
- Supercomputer
- Mainframe
- Mini
- PC Server
- All retain all data

Client-Server Architecture
[Diagram labels: Data, Function, Server (Back-End), Client (Front-End), Thin Client, Fat Client]

Functionality
- Presentation
- I/O Processing
- Validation
- Business Rules
- Application Logic
- Data Management
- Validation
- Error Handling

“Thin” Client
- Presentation Services Only
  - Accept Input
  - Format Output
  - Display
- Server does all processing

“Fat” Client
- Presentation
- Validation
- Application Logic - Programs
- Data Management
- Send SQL to Server
- Server is just DBMS

“In Between” Client
- Client
  - Presentation
  - Some Application Logic
- Server
  - Some Application Logic
  - Data Management and Services

Benefits of Client-Server
- Use Local Processing Power
- Better User Interface
- Some Functionality if System Down
- Use Sunk Costs of PCs
- Support Reengineering
- Support Intranets
- Flexibility, Scalability, Customizability

Challenges of Client-Server
- Cost of (Upgraded) PCs
- Network Reliance
- Distributing Application Updates
- Management of Complex System
- Problem Identification & Resolution
- Application Partitioning

Other Client-Server Architectures
- Traditional is Two-Tiered (client-server)
- Three-Tiered
  - Client - Application Server - DB Server
  - (PC - Mini - Mainframe)
  - (PC - PC Server - Mainframe)
- Beyond Three
  - PC - PC Server - Web Server - Mini - Mainframe

Client-Server vs. Distributed
- Client-Server: Application Distribution
- Distributed: Data Distribution
Often, “client-server” is used to refer to either application distribution or data distribution or both.

Middleware
- What if
  - Multiple databases (sources) need to be accessed from a single client?
  - Different kinds of clients?
  - Mix of clients and servers?
  - Want to take advantage of existing base of applications (legacy systems)?

Middleware
- Fat Clients just send SQL transactions
- Other types of transactions may be needed based on the server (system)

Middleware
Software that shields applications from the complexity of the operating environment.
[Diagram: Client, Middleware, System (Legacy), System (Legacy)]

Types of Middleware
- Transaction Processing (TP) Monitor
- Database Middleware
- Remote Procedure Call (RPC)
- Message-Oriented Middleware (MOM)
- Object-Request Brokers
  - (CORBA - ORB)

TP Monitor
- Synchronous: sender must wait
- Queuing
- Message Delivery
- Insured Delivery
- Either Direction

Database Middleware
- Variety of Clients/Platforms
- Variety of Servers/DBMSs/Platforms
- Specific to DB transactions (SQL)

Message-Oriented Middleware (MOM)
- Asynchronous: clients do not wait
- Queues & Queue Management/Recovery
- Message Delivery
- Insured Delivery
- Either Direction (like or EDI only transactions)
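A minimal sketch of the asynchronous behavior listed above, assuming a toy in-memory queue. `MessageQueue`, `send`, and `deliver_all` are hypothetical names for illustration, not the API of any MOM product.

```python
import queue


class MessageQueue:
    """Toy stand-in for a message-oriented middleware queue (hypothetical)."""

    def __init__(self):
        self._pending = queue.Queue()

    def send(self, message):
        # The client only enqueues the message; it does not wait for a result.
        self._pending.put(message)

    def deliver_all(self, handler):
        # Later, the server side drains the queue in arrival order.
        results = []
        while not self._pending.empty():
            results.append(handler(self._pending.get()))
        return results


mom = MessageQueue()
mom.send({"op": "debit", "amount": 10})
mom.send({"op": "credit", "amount": 10})
# The client is free to continue here; processing happens on delivery.
processed = mom.deliver_all(lambda message: message["op"].upper())
```

A TP monitor, by contrast, would have the sender wait for each reply before continuing.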

Advantages of Middleware
- Leverage sunk costs (legacy systems)
- Reduce development cost
- Reduce development time
- Increase responsiveness
- Improve overall systems management
- Consolidate diffuse information

Challenges of Middleware
- Cost
- Session management - Transaction state
- Security
- Network reliance
- Diversity of systems - lack of standards
- Constant technology change
- Availability of talent
- Middleware Management

Parallel and Distributed
- Client-Server is an attempt to improve performance
- Reduce time to execute a transaction
  - Parallel
- Reduce time to get the data
  - Distributed

Parallel Systems
- Single site for data
- Very Large databases
- Operations performed simultaneously

Parallel Database Architectures
- Shared Memory
- Shared Disk
- Shared Nothing
- Hierarchical

Shared Memory
[Diagram: three processors (P) and one memory (M)]

- Advantages
  - Extremely efficient communications
- Disadvantages
  - Max of 32/64 processors
  - Bus becomes bottleneck

Shared Disk
[Diagram: three processors (P) and three memories (M)]

- Advantages
  - No bus bottleneck
  - Fault tolerance provided
- Disadvantages
  - Disk access becomes bottleneck

Shared Nothing
[Diagram: three processors (P) and three memories (M)]

- Advantages
  - No disk bottleneck
  - Highly scalable
- Disadvantages
  - High communication overhead/cost
    - Between processors
    - To another processor’s data

Hierarchical
[Diagram: five processors (P) and three memories (M)]

Hierarchical
- Advantages
  - Best of all worlds
- Disadvantages
  - Worst of all worlds
  - Some high communication overhead/cost
    - Between subsystems
  - Complexity

Distributed Databases
- Client-Server - distribute functionality
- What about distributing data?

Distributed Databases
- Overview
- Distributed Storage
- Distributed Queries
- Distributed Transactions
- Multidatabase (Middleware)

Distributed Databases
- Multiple locations
- Single logical database
- Several physical databases
- Network connections

Advantages
- Sharing across locations
- Local control
- Availability

Challenges
- Development costs
- People & Equipment
- Testing
- Problem identification & resolution
- Technical expertise
- Network dependence
- Increased processing overhead

Distributed Data Storage
- Replication
- Fragmentation
- Both

Replication
- Data is repeated
- Spectrum of options available
  - Temporary replication of specific rows
  - Replicate infrequently changed data
  - Replicate by site
    - Central site: all / each local site: their data only
  - Full replication
    - Everything everywhere

Concerns with Replication
- Availability needed
- Amount of parallelism in reads
- Overhead of updates
- Keeping replicas updated
- Conflicting updates

Fragmentation
- Partitioning
- Divide data into subsets based on need
- Have to be able to pull back together to get original tables

Fragmentation
- Horizontal
  - by rows
  - specified conditions
- Vertical
  - by column
  - each requires primary key (or created key)
- Mixed
  - by row and column
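A small sketch of horizontal and vertical fragmentation, and of pulling the fragments back together into the original table. The `employees` rows and all function names are invented for illustration; rows are plain dictionaries rather than tables in a real DBMS.

```python
employees = [
    {"id": 1, "name": "Ann", "site": "NY", "salary": 50000},
    {"id": 2, "name": "Bob", "site": "Pgh", "salary": 42000},
    {"id": 3, "name": "Cy", "site": "NY", "salary": 61000},
]


def horizontal_fragment(rows, condition):
    # Horizontal: keep whole rows that satisfy the specified condition.
    return [row for row in rows if condition(row)]


def vertical_fragment(rows, key, columns):
    # Vertical: keep some columns, always carrying the primary key along.
    return [{key: row[key], **{c: row[c] for c in columns}} for row in rows]


def rebuild_horizontal(key, *fragments):
    # Union of the row subsets, put back in key order.
    merged = [row for fragment in fragments for row in fragment]
    return sorted(merged, key=lambda row: row[key])


def rebuild_vertical(key, left, right):
    # Join the column subsets on the shared primary key.
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


ny_rows = horizontal_fragment(employees, lambda r: r["site"] == "NY")
other_rows = horizontal_fragment(employees, lambda r: r["site"] != "NY")
directory = vertical_fragment(employees, "id", ["name", "site"])
payroll = vertical_fragment(employees, "id", ["salary"])
```

The vertical rebuild only works because each fragment repeats the key, which is why the slide requires a primary (or created) key in every vertical fragment.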

Fragmentation & Replication
- Repeat as necessary:
  - Replicate fragments
  - Fragment replicas
- Don’t lose track of what you have and where it is!

Network Transparency
- Distributing data should not require that the user know where or how it’s been distributed.
- The database should be seen as a single entity no matter how fragmented and replicated it becomes.

Network Transparency
- Some DBMSs are starting to provide this level of functionality so transparency exists even at the program level, but in many cases this “transparency” must be programmed into the applications.
- It must always be designed into the database.
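One way transparency might be programmed into an application is a catalog that records which site holds which fragment, so callers ask for a table by name only. This is a sketch under that assumption; `CATALOG` and `read_table` are hypothetical, and the per-site row lists stand in for remote databases.

```python
# Hypothetical catalog: table name -> site name -> rows stored at that site.
CATALOG = {
    "Employee": {
        "NY": [{"id": 1, "name": "Ann"}],
        "Pgh": [{"id": 2, "name": "Bob"}],
    }
}


def read_table(table, catalog=CATALOG):
    # The caller names only the table; the site lookup happens in here.
    rows = []
    for site in sorted(catalog[table]):
        rows.extend(catalog[table][site])
    return rows


names = [row["name"] for row in read_table("Employee")]
```

The caller never mentions NY or Pgh, which is the point: moving or re-fragmenting the data changes the catalog, not the calling code.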

Distributed Queries
- How do you query data that is everywhere?

Efficiency vs. Overhead
- Splitting the query apart
- Keeping track of the data/locations
- Making sure everything gets executed
- Putting the results back together
- Generating network traffic
- Handling partial results

Distributed Queries
- Full replication can avoid the overhead
  - Huge increase in update overhead
  - Parallel execution no longer possible
  - Additional costs of replication

Example
- 5 sites: NY, Pgh, Chicago, Dallas, Los Angeles
- Data fragmented by site, no replication
- Query (in Pgh): SELECT Name, Max (Salary) from Employee

Option 1 - High Bandwidth
1. Have all sites send their full employee tables to Pgh.
2. Build a temporary employee table.
3. Run the query against this table.

Option 2 - Not so High Bandwidth
1. Examine the query and determine it can be run separately at each location and the results combined.
2. Submit just the query to each location.
3. Wait for the results from each city.
4. As results return, build a temporary table (5 rows only).
5. Find the max using the temporary table.
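The two options can be compared with a short sketch. It reads the query as "name and salary of the highest-paid employee"; the site names come from the example, while the employee rows and function names are invented sample data.

```python
# Invented sample data: each site holds only its own employees (no replication).
SITES = {
    "NY": [("Ann", 50000), ("Cy", 61000)],
    "Pgh": [("Bob", 42000)],
    "Chicago": [("Dee", 58000)],
    "Dallas": [("Eli", 47000)],
    "Los Angeles": [("Fay", 66000), ("Gus", 39000)],
}


def top_earner(rows):
    # Row with the highest salary (second field of each tuple).
    return max(rows, key=lambda row: row[1])


def option_1(sites):
    # Ship every row to the querying site, then run the query once.
    temp_table = [row for rows in sites.values() for row in rows]
    return top_earner(temp_table), len(temp_table)


def option_2(sites):
    # Run the query at each site and ship back only one row per site.
    temp_table = [top_earner(rows) for rows in sites.values()]
    return top_earner(temp_table), len(temp_table)


answer_1, rows_shipped_1 = option_1(SITES)
answer_2, rows_shipped_2 = option_2(SITES)
```

Both options give the same answer; the second builds a 5-row temporary table regardless of how large the site tables are.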

Distributed Transactions
- Transaction Types
- Coordinators
- Commit Protocols
- Concurrency Controls
- Deadlocks

Transaction Types
- Local: transaction only needs local data
- Global: transaction uses non-local data
  - My global becomes someone else’s local
- Either type of transaction must still have ACID properties - global is the concern

System Structure
- Things to do:
  1. Process local transactions (transaction manager)
  2. Process and track global transactions (transaction coordinator)

Global Processing
1. Recognize as global
2. Break up transaction
3. Distribute pieces
4. Assemble results
5. Coordinate termination
6. Handle problems

Coordinator of Coordinators
- Coordinate among sites
- Detect problems
- Attempt to fix
- Share status with others

Coordinator Failure
- Backup Coordinator
  - receives all messages - maintains state
  - monitors coordinator
  - automatically takes over if coordinator down
  - avoids delays - increases overhead
- Election
  - highest pre-assigned number
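The election rule above fits in a few lines. This is a sketch only: `elect_coordinator` is a hypothetical name, and how sites discover which others are reachable is left out.

```python
def elect_coordinator(site_numbers, reachable):
    # Among the sites that can still be reached, the one with the highest
    # pre-assigned number becomes the new coordinator.
    candidates = [number for number in site_numbers if number in reachable]
    if not candidates:
        raise RuntimeError("no reachable site to elect")
    return max(candidates)


all_sites = [1, 2, 3, 4, 5]
# Site 5 (the old coordinator) and site 3 are down.
new_coordinator = elect_coordinator(all_sites, reachable={1, 2, 4})
```

If the network splits, each group of sites runs the same rule on a different reachable set and may pick a different coordinator.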

Commit Protocols
- Two-Phase
- Three-Phase
- All sites must commit or all sites have to rollback
- Replicated data only

Two-Phase Commit
- Phase 1
  - Send PREPARE to all sites
  - Sites respond READY or ABORT
- Phase 2
  - If all sites READY
    - COMMIT locally, then send COMMITs
  - If not READY or time expires
    - ROLLBACK locally, then send ROLLBACK

Two-Phase Commit
[Diagram: Coordinator and Sites] Site requests commit

Two-Phase Commit - Phase 1
[Diagram: Coordinator and Sites] Send PREPARE - all sites

Two-Phase Commit - Phase 1
[Diagram: Coordinator and Sites] Sites respond READY

Two-Phase Commit - Phase 2
[Diagram: Coordinator and Sites] COMMIT locally

Two-Phase Commit - Phase 2
[Diagram: Coordinator and Sites] Send COMMIT - all sites

Two-Phase Commit - Phase 1
[Diagram: Coordinator and Sites] Site responds ABORT or does not respond

Two-Phase Commit - Phase 2
[Diagram: Coordinator and Sites] ROLLBACK locally

Two-Phase Commit - Phase 2
[Diagram: Coordinator and Sites] Send ROLLBACK - all sites
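The two-phase commit message sequence can be sketched as a single coordinator function. The `Site` class and its methods are hypothetical; a vote of `None` stands in for a site that does not respond before the timeout, and there is no real network or logging here.

```python
class Site:
    """Toy participant (hypothetical); vote is "READY", "ABORT", or None."""

    def __init__(self, name, vote="READY"):
        self.name = name
        self.vote = vote
        self.state = "ACTIVE"

    def prepare(self):
        # Phase 1: a site that votes READY promises it can commit.
        if self.vote == "READY":
            self.state = "READY"
        return self.vote

    def finish(self, decision):
        # Phase 2: apply whatever the coordinator decided.
        self.state = decision


def two_phase_commit(sites):
    # Phase 1: send PREPARE to all sites and collect the votes.
    votes = [site.prepare() for site in sites]
    # Phase 2: commit only if every site answered READY.
    decision = "COMMIT" if all(vote == "READY" for vote in votes) else "ROLLBACK"
    for site in sites:
        site.finish(decision)
    return decision


all_ready = [Site("NY"), Site("Pgh"), Site("Dallas")]
one_silent = [Site("NY"), Site("Pgh", vote=None), Site("Dallas")]
outcome_ok = two_phase_commit(all_ready)
outcome_bad = two_phase_commit(one_silent)
```

Either every site ends in COMMIT or every site ends in ROLLBACK, which is the all-or-nothing property the protocol exists to provide.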

Site Failure - Recovery
- COMMIT and ROLLBACK as normal
- If READY only
  - Check with coordinator or other sites
  - Either COMMIT or ROLLBACK
  - If no one found, ROLLBACK
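The recovery rules for a restarted site can be written as one decision function. `recover_site` and `ask_others` are hypothetical names; `ask_others` is assumed to return "COMMIT", "ROLLBACK", or `None` when nobody can be found. Rolling back a site that never logged READY is an added assumption (it had not yet promised to commit).

```python
def recover_site(last_log_record, ask_others):
    # A logged COMMIT or ROLLBACK is simply completed as normal.
    if last_log_record in ("COMMIT", "ROLLBACK"):
        return last_log_record
    if last_log_record == "READY":
        # The site voted but never saw the outcome: ask around.
        answer = ask_others()
        if answer in ("COMMIT", "ROLLBACK"):
            return answer
        # Following the slide's rule: if no one is found, roll back.
        return "ROLLBACK"
    # Assumption: with no READY record the site never voted, so undo.
    return "ROLLBACK"


after_commit = recover_site("COMMIT", lambda: None)
learned = recover_site("READY", lambda: "COMMIT")
alone = recover_site("READY", lambda: None)
```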

Coordinator Failure
- Ask the sites
  - If one has COMMIT, then REDO
  - If one has ROLLBACK, then UNDO
  - If one doesn’t have READY, UNDO
  - If all READY only
    - Coordinator must decide
    - Sites must wait and locks are held
    - “Blocking” occurs
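The rules for resolving a transaction after the coordinator fails can be sketched as a function over the states the surviving sites report. The function name and the "BLOCKED" return value are illustrative labels, not protocol messages.

```python
def resolve_without_coordinator(site_states):
    # Each entry is the last thing a surviving site logged for the transaction.
    if "COMMIT" in site_states:
        return "REDO"      # the decision was COMMIT; finish it everywhere
    if "ROLLBACK" in site_states:
        return "UNDO"      # the decision was ROLLBACK
    if any(state != "READY" for state in site_states):
        return "UNDO"      # some site never voted READY, so COMMIT was impossible
    # Every site is READY and nobody knows the outcome: wait, holding locks.
    return "BLOCKED"


saw_commit = resolve_without_coordinator(["READY", "COMMIT", "READY"])
never_ready = resolve_without_coordinator(["READY", "ACTIVE"])
stuck = resolve_without_coordinator(["READY", "READY", "READY"])
```

The last case is the blocking problem of two-phase commit: the sites cannot safely decide on their own until the coordinator returns.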

Three-Phase Commit
- Phase 1
  - Send PREPARE
  - Sites respond READY or ABORT
- Phase 2
  - If all sites READY, send PRECOMMIT
  - Else, ROLLBACK
  - Sites must ACKNOWLEDGE
- Phase 3
  - If at least K sites ACKNOWLEDGE, send COMMIT

Coordinator Failure
- Three-Phase Commit prevents blocking
- If coordinator fails
  - New coordinator is selected
  - Sites queried to determine status
  - New coordinator resumes

Network Partitioning
- Network split creates two separate networks
- Each “half” selects a coordinator
- Coordinators make independent decisions
- Result could be different decisions
- Resolution of network problem may create need to resolve database problems

Concurrency Control
- Single Lock Manager
- Multiple Lock Managers

Single Lock Manager
- One site for all locking
- All other sites must go to it
- Can read from anywhere
- Updates must be to all copies
- Advantages: Simple, Easy deadlock detection
- Disadvantages: Bottleneck, Vulnerability

Simple Multiple Lock Mgrs
- Each site locks a unique partition of the data
  - non-replicated data
- Advantages: Fairly simple, reduced bottlenecks
- Disadvantages: Complicated deadlock detection

Majority Protocol
- Each site locks its own data
  - replication possible
- Ask the owner for a lock on data that isn’t local
- When multiple owners, n/2 + 1 (majority) must provide the lock
- Advantages: No bottlenecks
- Disadvantages: More messages sent, Complicated deadlock detection, More deadlocks (each gets 1/2)

Biased Protocol
- Reduced form of Majority Protocol
- For a READ, only need any single lock
- For a WRITE, need all locks
- Advantages: No bottlenecks, Reduced traffic
- Disadvantages: Update traffic, Deadlocks
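The lock-counting rules of the majority and biased protocols can be checked with a few small functions. This is a sketch: each `grants` list holds one True/False entry per replica saying whether that replica's owner granted the lock, and the function names are made up. Integer division is assumed for n/2.

```python
def majority_needed(n_copies):
    # n/2 + 1 with integer division: 3 of 4 copies, 3 of 5 copies.
    return n_copies // 2 + 1


def majority_lock_granted(grants):
    return sum(grants) >= majority_needed(len(grants))


def biased_read_granted(grants):
    # A READ needs a lock on any single copy.
    return any(grants)


def biased_write_granted(grants):
    # A WRITE needs a lock on every copy.
    return all(grants)


# Four replicas; two transactions have each locked two of them.
t1_grants = [True, True, False, False]
t2_grants = [False, False, True, True]
```

With `t1_grants` and `t2_grants`, neither transaction reaches the majority of 3, and neither can proceed until the other lets go: this is the "each gets 1/2" deadlock noted under the majority protocol.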

Primary Copy
- Site designated to hold “primary” copy
- Multiple sites
- Replicated Data
- All locks through that site
- Advantages: Fairly simple, reduced bottlenecks
- Disadvantages: Vulnerability, Complicated deadlock detection

Other Than Locking
- Timestamps
  - Centralized generation
  - Local generation
- Timestamp tests determine ability to read or write
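The slide does not spell out the timestamp tests, so the sketch below assumes the basic timestamp-ordering rule: a transaction may not read a value written by a younger transaction, and may not overwrite a value a younger transaction has already read or written. Locally generated timestamps are modeled as `(counter, site_id)` pairs so that they are unique across sites; all names are hypothetical.

```python
class Item:
    """Data item tracking the youngest reader and writer seen so far."""

    def __init__(self):
        self.read_ts = (0, 0)
        self.write_ts = (0, 0)


def make_timestamp(local_counter, site_id):
    # Local generation: the site id breaks ties between equal counters.
    return (local_counter, site_id)


def try_read(item, ts):
    if ts < item.write_ts:
        return False  # a younger transaction already wrote this item
    item.read_ts = max(item.read_ts, ts)
    return True


def try_write(item, ts):
    if ts < item.read_ts or ts < item.write_ts:
        return False  # a younger transaction already read or wrote it
    item.write_ts = ts
    return True


balance = Item()
wrote = try_write(balance, make_timestamp(5, 1))
stale_read = try_read(balance, make_timestamp(3, 2))
fresh_read = try_read(balance, make_timestamp(7, 2))
late_write = try_write(balance, make_timestamp(6, 1))
```

A transaction that fails a test would typically be rolled back and restarted with a new timestamp; that step is not shown.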

Deadlocks & Distributed Data
- Centralized
  - One Site
- Distributed
- Centralized: same advantages and disadvantages as other centralized control (database or locking)

Distributed Deadlock Detection
- Each site tracks all transactions accessing its own data
- Dummy transaction for transactions that originated here but are executing elsewhere
- If deadlock found that includes dummy transaction
  - Must send deadlock information to other sites
  - They check for deadlock
  - May have to pass on to another site
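A sketch of the local check each site might run, assuming the site keeps a wait-for graph (transaction -> transactions it is waiting on) and uses a node named "EXT" as the dummy transaction. The graph shape, the node name, and the function names are assumptions for illustration.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes, or None if there is none."""

    def visit(node, path):
        if node in path:
            return path[path.index(node):]
        for successor in graph.get(node, ()):
            cycle = visit(successor, path + [node])
            if cycle:
                return cycle
        return None

    for start in graph:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


def check_site(graph, dummy="EXT"):
    cycle = find_cycle(graph)
    if cycle is None:
        return "no deadlock"
    if dummy in cycle:
        # Only a possible deadlock: other sites must check their own graphs.
        return "send to other sites"
    return "local deadlock"


local_only = {"T1": ["T2"], "T2": ["T1"]}
through_dummy = {"T1": ["EXT"], "EXT": ["T2"], "T2": ["T1"]}
no_wait_cycle = {"T1": ["T2"], "T2": []}
```

A cycle through the dummy node is not proof of deadlock, because the remote part of the chain may not actually be waiting; that is why the information has to be passed on.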

Homework #9
- Continuing with the Carnegie Library
- Client/Server
- Distributed Database