Md Rezaul Huda Chowdhury Reza. Concurrency Control Modify concurrency control schemes for use in distributed environment. We assume that each site participates.


Md Rezaul Huda Chowdhury Reza

Concurrency Control
- Modify concurrency control schemes for use in a distributed environment.
- We assume that each site participates in the execution of a commit protocol to ensure global transaction atomicity.
- We assume all replicas of any item are updated.

Single-Lock-Manager Approach
- The system maintains a single lock manager that resides in a single chosen site, say S_i.
- When a transaction needs to lock a data item, it sends a lock request to S_i, and the lock manager determines whether the lock can be granted immediately.
  - If yes, the lock manager sends a message to the site that initiated the request.
  - If no, the request is delayed until it can be granted, at which time a message is sent to the initiating site.

Single-Lock-Manager Approach (Cont.)
- The transaction can read the data item from any one of the sites at which a replica of the data item resides.
- Writes must be performed on all replicas of a data item.
- Advantages of the scheme:
  - Simple implementation
  - Simple deadlock handling
- Disadvantages of the scheme:
  - Bottleneck: the lock manager site becomes a bottleneck.
  - Vulnerability: the system is vulnerable to lock manager site failure.
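The grant-or-delay behaviour of a single lock manager can be illustrated with a minimal Python sketch. It is not taken from the slides: the class name `SingleLockManager`, the lock modes `"S"`/`"X"`, and the FIFO waiting queue are assumptions made for illustration, and message passing between sites is reduced to return values.

```python
from collections import defaultdict, deque


class SingleLockManager:
    """Toy lock manager living at one chosen site (S_i in the slides)."""

    def __init__(self):
        self.holders = defaultdict(dict)    # item -> {txn: mode}
        self.waiting = defaultdict(deque)   # item -> FIFO queue of (txn, mode)

    def _compatible(self, item, txn, mode):
        others = [m for t, m in self.holders[item].items() if t != txn]
        if not others:
            return True
        # Shared locks coexist; an exclusive lock conflicts with any other lock.
        return mode == "S" and all(m == "S" for m in others)

    def request(self, txn, item, mode):
        """Return the reply sent to the initiating site: 'granted' or 'delayed'."""
        if not self.waiting[item] and self._compatible(item, txn, mode):
            self.holders[item][txn] = mode
            return "granted"
        self.waiting[item].append((txn, mode))
        return "delayed"

    def release(self, txn, item):
        """Release a lock; return the waiting transactions that are now granted."""
        self.holders[item].pop(txn, None)
        granted = []
        queue = self.waiting[item]
        while queue and self._compatible(item, queue[0][0], queue[0][1]):
            waiter, mode = queue.popleft()
            self.holders[item][waiter] = mode
            granted.append(waiter)
        return granted


lm = SingleLockManager()
print(lm.request("T1", "X", "X"))   # T1 gets the exclusive lock
print(lm.request("T2", "X", "S"))   # T2 must wait
print(lm.release("T1", "X"))        # T2 is granted once T1 releases
```

Because every request passes through this one object, deadlock handling stays simple, but the sketch also makes the bottleneck and single point of failure visible.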

Distributed Lock Manager
- In this approach, the locking functionality is implemented by lock managers at each site.
  - Lock managers control access to local data items.
  - Special protocols may be used for replicas.
- Advantage: work is distributed and can be made robust to failures.
- Disadvantage: deadlock detection is more complicated.
  - Lock managers cooperate for deadlock detection.
- Several variants of this approach:
  - Primary copy
  - Majority protocol
  - Biased protocol
  - Quorum consensus

Deadlock Handling
- Consider the following two transactions and history, with item X and transaction T1 at site 1, and item Y and transaction T2 at site 2:
  - T1: write(X), write(Y)
  - T2: write(Y), write(X)
- History:
  - T1: X-lock on X; write(X); wait for X-lock on Y
  - T2: X-lock on Y; write(Y); wait for X-lock on X
- Result: a deadlock that cannot be detected locally at either site.

Centralized Approach
- A global wait-for graph is constructed and maintained in a single site: the deadlock-detection coordinator.
  - Real graph: the real, but unknown, state of the system.
  - Constructed graph: the approximation generated by the coordinator during the execution of its algorithm.
- The global wait-for graph can be constructed when:
  - a new edge is inserted in or removed from one of the local wait-for graphs;
  - a number of changes have occurred in a local wait-for graph;
  - the coordinator needs to invoke cycle detection.
- If the coordinator finds a cycle, it selects a victim and notifies all sites. The sites roll back the victim transaction.
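A short Python sketch can show how a coordinator might merge local wait-for graphs and look for a cycle. The function names `merge_wait_for` and `find_cycle` and the dictionary representation (transaction mapped to the set of transactions it waits for) are assumptions for illustration. The two local graphs encode the T1/T2 example from the Deadlock Handling slide.

```python
def merge_wait_for(local_graphs):
    """Union local wait-for graphs (dict: txn -> set of txns it waits for)."""
    merged = {}
    for graph in local_graphs:
        for txn, waits_for in graph.items():
            merged.setdefault(txn, set()).update(waits_for)
    return merged


def find_cycle(graph):
    """Depth-first search; return a list of txns forming a cycle, or None."""
    WHITE, GREY, BLACK = 0, 1, 2
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in sorted(graph.get(node, ())):
            state = colour.get(nxt, WHITE)
            if state == GREY:                 # back edge: cycle found
                return path[path.index(nxt):]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        colour[node] = BLACK
        return None

    for start in sorted(graph):
        if colour.get(start, WHITE) == WHITE:
            cycle = visit(start)
            if cycle:
                return cycle
    return None


site1 = {"T2": {"T1"}}   # at site 1, T2 waits for T1's X-lock on X
site2 = {"T1": {"T2"}}   # at site 2, T1 waits for T2's X-lock on Y

print(find_cycle(site1), find_cycle(site2))        # no local cycle at either site
print(find_cycle(merge_wait_for([site1, site2])))  # the merged graph has a cycle
```

Victim selection and the roll-back notification are left out; any policy for picking a transaction from the returned cycle would be a further assumption.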

Availability
- High availability: the time for which the system is not fully usable should be extremely low (e.g % availability).
- Failures are more likely in large distributed systems.
- To be robust, a distributed system must:
  - Detect failures.
  - Reconfigure the system so computation may continue.
  - Recover and reintegrate when a site or link is repaired.
- Failure detection: distinguishing link failure from site failure is hard.
  - (Partial) solution: have multiple links; multiple link failure is likely a site failure.

Reconfiguration
- Abort all transactions that were active at a failed site.
  - Making them wait could interfere with other transactions, since they may hold locks at other sites.
  - However, if only some replicas of a data item failed, it may be possible to continue transactions that had accessed data at a failed site (more on this later).
- If replicated data items were at the failed site, update the system catalog to remove them from the list of replicas.
  - This should be reversed when the failed site recovers, but additional care is needed to bring values up to date.
- If a failed site was a central server for some subsystem, an election must be held to determine the new server.
  - E.g. name server, concurrency coordinator, global deadlock detector.

Reconfiguration (Cont.)
- Since a network partition may not be distinguishable from a site failure, the following situations must be avoided:
  - Two or more central servers elected in distinct partitions.
  - More than one partition updating a replicated data item.
- Updates must be able to continue even if some sites are down.
  - Solution: majority-based approach.
  - The alternative of "read one, write all available" is tantalizing but causes problems.
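The reason a majority-based approach avoids both situations can be sketched in a few lines of Python. The helper `has_majority` and the site names are hypothetical. The point is only that two disjoint partitions can never both hold a strict majority of the sites, so at most one of them may elect a server or update a replicated item.

```python
def has_majority(reachable_sites, all_sites):
    """True if the reachable sites form a strict majority of all sites."""
    all_sites = set(all_sites)
    return len(set(reachable_sites) & all_sites) > len(all_sites) / 2


sites = {"S1", "S2", "S3", "S4", "S5"}
partition_a = {"S1", "S2", "S3"}
partition_b = {"S4", "S5"}

print(has_majority(partition_a, sites))  # this partition may proceed
print(has_majority(partition_b, sites))  # this partition must wait
```

With an even number of sites split exactly in half, neither side has a strict majority, so neither proceeds. That is the price paid for never having two active partitions.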

Site Reintegration
- When a failed site recovers, it must catch up with all updates that it missed while it was down.
  - Problem: updates may be happening to items whose replica is stored at the site while the site is recovering.
  - Solution 1: halt all updates on the system while reintegrating a site.
    - Unacceptable disruption.
  - Solution 2: lock all replicas of all data items at the site, update to the latest version, then release the locks.
    - Other solutions with better concurrency are also available.

Heterogeneous Distributed Databases
- Many database applications require data from a variety of preexisting databases located in a heterogeneous collection of hardware and software platforms.
- Data models may differ (hierarchical, relational, etc.).
- Transaction commit protocols may be incompatible.
- Concurrency control may be based on different techniques (locking, timestamping, etc.).
- System-level details are almost certainly incompatible.
- A multidatabase system is a software layer on top of existing database systems, designed to manipulate information in heterogeneous databases.
  - It creates an illusion of logical database integration without any physical database integration.

Advantages
- Preservation of investment in existing:
  - hardware
  - system software
  - applications
- Local autonomy and administrative control.
- Allows use of special-purpose DBMSs.
- Step towards a unified homogeneous DBMS.
  - Full integration into a homogeneous DBMS faces:
    - Technical difficulties and cost of conversion.
    - Organizational/political difficulties: organizations do not want to give up control of their data, and local databases wish to retain a great deal of autonomy.

Unicast, Broadcast versus Multicast
- Unicast
  - One-to-one
  - Destination: unique receiver host address
- Broadcast
  - One-to-all
  - Destination: address of network
- Multicast
  - One-to-many
  - Multicast group must be identified
  - Destination: address of group
[Figure legend: unicast transfer, broadcast transfer, multicast transfer]

Multicast application examples
- Financial services
  - Delivery of news, stock quotes, financial indices, etc.
- Remote conferencing/e-learning
  - Streaming audio and video to many participants (clients, students)
  - Interactive communication between participants
- Data distribution
  - E.g., distributing experimental data from the Large Hadron Collider (LHC) at the CERN lab to interested physicists around the world

IP multicast
- Highly efficient bandwidth usage.
- Key architectural decision: add support for multicast in the IP layer.
[Figure: routers with multicast support connecting Berkeley, Gatech, Stanford, and CMU]

So what is the big issue?
- More than 20 years since the proposal, but no wide-area IP multicast deployment.
- Scalability (with number of groups): routers maintain per-group state.
- IP multicast is a best-effort multi-point delivery service: providing higher-level features such as reliability, congestion control, flow control, and security has proven more difficult than in the unicast case.
- Can we achieve efficient multi-point delivery without IP-layer support?

Application layer multicast
[Figure: overlay tree built among the end systems Stan1, Stan2, Berk1, Berk2, Gatech, and CMU, shown over the physical network joining Stanford, Berkeley, Gatech, and CMU]

Pros and Cons
- Scalability
  - Routers do not maintain per-group state.
  - End systems do, but they participate in very few groups.
- Potentially simplifies support for higher-level functionality
  - Leverage computation and storage of end systems.
  - Leverage solutions for unicast congestion, error, and flow control.
- Efficiency concerns
  - Redundant traffic on physical links.
  - Increase in latency due to end systems.

Multicasting Algorithms

Requirements of Multipoint Routing Algorithms
- Support reliable transmission: link failures should not increase delay or reduce resource availability.
- Return optimal routes, taking into consideration:
  - the price to be paid (bandwidth consumed);
  - the end-to-end delay (number of links traversed).
- Minimize network load.
  - Avoid loops.
  - Avoid traffic concentration on a few links or subnets.
- Minimize the state stored in routers.

Multipoint Routing Algorithms: Performance Metrics
The quality of a tree is judged along the following three dimensions:
- Low delay: end-to-end delay between source and receiver, relative to the shortest unicast path delay.
- Low cost:
  - cost of total bandwidth consumption;
  - cost of tree state information.
- Light traffic concentration:
  - maximum number of flows on a unidirectional link;
  - how evenly the routes are distributed.

Routing Algorithms
- All multi-point services use some kind of distribution tree.
- Multicast trees can be:
  - Shared across sources (shared trees): only one tree needs to be established for each group, and it is shared by all the sources within that group.
  - Source specific (shortest path trees): a shortest path tree rooted at each sending node needs to be established.

SOURCE BASED MULTIPOINT ROUTING: The Technique
- A Source Rooted Shortest Path Tree (SRSPT) algorithm:
  - Computes the shortest paths between the source and each of the receivers within the group.
  - Eliminates duplicate data copies on common links.
  - Maintains one SRSPT per sender.
- Concept: all receiving nodes compute the path towards the source independently.
- Used by: current IP multicast protocols, as applications are still small scale and local area.
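A source-rooted shortest path tree can be sketched in Python with Dijkstra's algorithm. This is an illustrative, centralized computation on a small made-up topology with symmetric link weights; real protocols let each receiver compute its path towards the source, as the slide notes. The graph, node names, and function names are assumptions.

```python
import heapq


def shortest_path_tree(graph, source):
    """Dijkstra: return (dist, parent) for every node reachable from source."""
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent


def multicast_tree(graph, source, receivers):
    """Union of shortest paths source -> receiver; shared links appear once."""
    dist, parent = shortest_path_tree(graph, source)
    edges = set()
    for receiver in receivers:
        node = receiver
        while parent[node] is not None:
            edges.add((parent[node], node))
            node = parent[node]
    return edges, {r: dist[r] for r in receivers}


# Hypothetical topology: node -> {neighbour: link delay}
graph = {
    "S": {"A": 1, "B": 4},
    "A": {"S": 1, "B": 1, "C": 5},
    "B": {"S": 4, "A": 1, "D": 1},
    "C": {"A": 5, "D": 1},
    "D": {"B": 1, "C": 1},
}

edges, delays = multicast_tree(graph, "S", ["C", "D"])
print(sorted(edges))
print(delays)
```

Storing the tree as a set of edges is what removes duplicate copies on common links: the path to C reuses the links already needed to reach D.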

SOURCE BASED MULTIPOINT ROUTING: Merits vs Demerits
- Advantages
  - SRSPTs are easy to compute: they use the classic unicast routing tables.
  - Efficient distributed implementations are possible.
  - The entire global topology is not required.
  - There can be no loops in the path returned.
- Disadvantages
  - Does not minimize the total cost of distribution.
  - Does not scale well.
  - One piece of state information per source and per group is kept in each router.
  - May fail badly if the underlying unicast routing is asymmetric.

SHARED TREE APPROACH OF MULTIPOINT ROUTING: Characteristics of Steiner Tree based algorithms
- The minimum Steiner tree: the minimal-cost subgraph spanning a given subset of nodes in a graph.
- The Steiner tree problem is NP-complete (in its decision form): known exact algorithms for finding the minimum Steiner tree in a graph have exponential worst-case cost.
- The tree designed is undirected: the solution is feasible only for symmetric links.
- Monolithic algorithm: it has to be run each time group membership changes.

SHARED TREE APPROACH OF MULTIPOINT ROUTING: Characteristics of Steiner Tree based algorithms (Cont.)
- The Steiner minimal tree (SMT) defines an absolute lower limit on tree cost, which serves as a reference for gauging the cost-optimality of heuristic alternatives.
- The SMT for all members of a multicast group is the same irrespective of the role of sender or receiver.
  - Only one state entry needs to be maintained per group.
  - It scales well for larger groups.
- The SMT may have unbounded delay.
  - The worst-case maximum end-to-end path length of an SMT can be the longest acyclic path within the graph.

SHARED TREE APPROACH OF MULTIPOINT ROUTING: Characteristics of Core Based Tree algorithms
- Concept: use the shortest path tree rooted at a node in the center of the network.
- Steps:
  - Choose an optimal center for the group. Multiple cores can be used for better fault tolerance and delay characteristics.
  - Group members send a join message to the center.
  - Intermediate nodes mark the interface from which the multicast information is received and forward it to the center.
- Choose the center to:
  - minimize the maximum or average delay for all members on the tree;
  - minimize the sum of tree-link costs.
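One of the center-selection criteria, minimizing the maximum delay to any group member, can be illustrated with a brute-force Python sketch. It assumes a small, connected, fully known topology and tries every node as a candidate core. The topology, node names, and function names are hypothetical, and the sketch says nothing about the cost-minimizing criterion or about how a real protocol would pick a core.

```python
import heapq


def distances(graph, source):
    """Dijkstra: shortest delay from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def choose_core(graph, members):
    """Try every node as the core; keep the one with the smallest worst-case delay."""
    best = None
    for candidate in sorted(graph):
        dist = distances(graph, candidate)
        worst = max(dist[m] for m in members)
        if best is None or worst < best[1]:
            best = (candidate, worst)
    return best


# Hypothetical topology: node -> {neighbour: link delay}
graph = {
    "A": {"B": 1},
    "B": {"A": 1, "C": 1, "E": 2},
    "C": {"B": 1, "D": 1},
    "D": {"C": 1},
    "E": {"B": 2},
}

core, worst_delay = choose_core(graph, ["A", "D", "E"])
print(core, worst_delay)
```

Once a core is chosen, each member's join message would travel along its shortest path to that node, and the union of those paths forms the shared tree.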

SHARED TREE APPROACH OF MULTIPOINT ROUTING: Advantages of Core Based Tree algorithms
- Work well with multiple senders/receivers: state information is stored per group, and is therefore scalable.
- Receiver-based approach: supports dynamic group membership with relative ease.
- Suitable for sparsely distributed receivers: SPTs will not have many common links.
- Do not have the unbounded delay problems of SMTs.
- Simple to implement: used as the basis of PIM and of the CBT interdomain routing protocol.

SHARED TREE APPROACH OF MULTIPOINT ROUTING: Disadvantages of Core Based Tree algorithms
- Incur extra delay compared to the RPF approach.
- Suffer from traffic concentration on links converging towards the center.
- Choosing the optimal center is an NP-complete problem.
- Locating the center requires complete knowledge of the network topology.

MULTIPOINT ROUTING: Trade-offs between algorithms
- No single tree can achieve both minimal cost and minimal delay.
  - Shortest path trees: minimize delay at the expense of cost.
  - Steiner minimal trees: minimize cost at the expense of delay.
  - Between these lies a spectrum of different types of trees offering different trade-offs.
- Different strategies for placing the routes result in different degrees of traffic concentration.