Distributed Database Management Systems


Distributed Database Management Systems Lecture - 1

References
1. Principles of Distributed Database Systems (2nd Edition), by M. T. Özsu and P. Valduriez
2. Distributed Database Systems, by D. Bell and J. Grimson, Addison-Wesley, 1992

References
3. Distributed Systems: Concepts and Design, 4th Edition, by G. Coulouris, J. Dollimore, and T. Kindberg, Addison-Wesley
Prerequisites: Database Management Systems, Computer Networks

Brief Course Introduction
- Introduction to databases and distributed systems in general
- Architectures and design issues of DDBS
- Technological aspects and designs
- Theoretical aspects of the topic

A Little Bit of History
Traditional file processing system: the very first form of business data processing
- Each program contains a description of the data that it manipulates
- Redundancy of data
- Problems in maintenance

Program and Data Interdependence
[Figure: the Library, Examination, and Registration applications, each with its own data files]

File Processing Systems
[Figure: the Library, Exam, and Registration files, with fields Reg_Number, Name, Father Name, Address, Books Issued, Class, Phone, Fine, Semester, Grade]
- Duplication of data
- Vulnerable to inconsistency
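
The duplication problem can be illustrated with a minimal sketch. The record contents, the registration number `R-001`, and the helper `distinct_values` are invented for illustration; the point is only that a change applied by one application leaves the other copies stale.

```python
# Hypothetical per-application files: each keeps its own copy of student data.
library_file = {"R-001": {"name": "Ali", "address": "12 Main St", "books_issued": 2}}
exam_file = {"R-001": {"name": "Ali", "address": "12 Main St", "grade": "A"}}
registration_file = {"R-001": {"name": "Ali", "address": "12 Main St", "semester": 3}}


def distinct_values(reg_number, field, *files):
    """Return the set of distinct values stored for one field across files."""
    return {f[reg_number][field] for f in files if reg_number in f}


# Only the registration application learns about the address change.
registration_file["R-001"]["address"] = "99 Lake Rd"

addresses = distinct_values(
    "R-001", "address", library_file, exam_file, registration_file
)
# More than one distinct value means the copies now disagree.
print(addresses)
```

With a single shared database there would be one stored address, so this kind of disagreement could not arise.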

Traditional File Processing

History Continues
Database approach (also called centralized database)
A database is a shared collection of logically related data

Database Approach
[Figure: PROGRAM 1, PROGRAM 2, PROGRAM 3, …. access one database through a shared layer for data description and manipulation]
Takes care of all the major drawbacks of the file system environment, plus more

Distributed Computing System
A number of autonomous processing elements that are connected through a computer network and that cooperate in performing their assigned tasks

Distributed Computing Systems
Distributed system software enables computers to coordinate and share
What is distributed?
- Processing logic
- Functions
- Data
- Control
All are relevant and important here

Classifications of DCS
Degree of coupling
- How closely the systems are connected
- May be measured as the ratio of messages interchanged to the amount of local processing
- Could be weak (over the network) or strong (if components are shared)
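
The ratio above can be written as a one-line calculation. This is only a sketch: the function name `coupling_ratio`, the unit used for local processing (operations), and the sample counts are assumptions, since the slide does not fix them.

```python
def coupling_ratio(messages_exchanged: int, local_operations: int) -> float:
    """Ratio of messages interchanged to local processing (assumed units)."""
    if local_operations <= 0:
        raise ValueError("local_operations must be positive")
    return messages_exchanged / local_operations


# Invented sample counts for two hypothetical systems.
networked_sites = coupling_ratio(messages_exchanged=50, local_operations=1000)
shared_memory_nodes = coupling_ratio(messages_exchanged=800, local_operations=1000)

# A higher ratio indicates stronger coupling under this measure.
print(networked_sites, shared_memory_nodes)
```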

Classifications of DCS
- Interconnection structure: could be point to point or a common interconnection channel
- Interdependence of components
- Synchronization
These factors are not totally independent

Why DCS?
- Suits some organizational structures; more reliable and responsive
- Nature of some applications
- Technological push

DCS Alerts
- Information pieces and lack of standards
- Difficulties in large application design
- Too many options available

Distributed DB and DBMS

Distributed Database: A collection of logically interrelated databases that are spread physically across multiple locations connected by a data communications link.

Main Characteristics
- Data at multiple sites
- DM at each site
- Local requirements
- Global perspective

Where to Apply
Two major reasons make an application a candidate for a DDBS:
- Large number of users
- Operations spread over a large geographical area

Example Applications
- Banking
- Air ticketing
- Business at multiple locations

Distributed DBMS: A software system that permits the management of the DDB and makes the distribution transparent to the users.
Decentralized Database: A collection of independent databases on non-networked computers.

Resembling Setups

Distributed Files: A collection of files stored on different computers of a network; not a DDBS.
A DDBS is logically related, has a common structure among its files, and is accessed via the same interface.

Resembling Setups
Multiprocessor system: multiple processors that share some form of memory
[Figure: processor units, memory, and I/O system; shared everything, tight coupling]

Resembling Setups
[Figure: computer systems (CPU, memory) with shared secondary storage; shared everything, loose coupling]

Resembling Setups
[Figure: computer systems (CPU, memory) connected through a switch; shared nothing]

Resembling Setups
A DDBS is also different from a centralized database that is accessed over a network in a client/server (C/S) setup

Reasons for DDBS

- Local units want control over data
- Consolidate data for integrated decisions
- Reduce telecommunication costs
- Reduce the risk of telecommunication failures

[Figure: distributed DBMS setup; labels: Distributed DBMS, DBMS 1 to DBMS n, Node 1 to Node n, Global User, Local User, Schema]

Objectives/Promises of DDBSs

Transparency User View System View

Data Independence
Data independence is a fundamental form of transparency that we look for within a DBMS. It is also the only type that is important within the context of a centralized DBMS.

Data Independence
Two types: logical data independence and physical data independence
- Logical data independence refers to the immunity of user applications to changes in the logical structure (i.e., schema) of the database
- Physical data independence deals with hiding the details of the storage structure from user applications
A transparent system hides the implementation details from its users. When a user application is written, it should not be concerned with the details of physical data organization. Therefore, the user application should not need to be modified when data organization changes occur due to performance considerations.
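
Both kinds of data independence can be sketched with Python's built-in `sqlite3` module. The table, view, and index names below are invented for illustration, and using a view as the application's interface is one common way to get logical data independence, not the only one. The application query in `app_query` stays the same while the schema is reorganized and an index is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (reg_number TEXT PRIMARY KEY, name TEXT, phone TEXT)")
conn.execute("INSERT INTO student VALUES ('R-001', 'Ali', '555-0100')")
# The application only ever reads through this view.
conn.execute(
    "CREATE VIEW student_contact AS SELECT reg_number, name, phone FROM student"
)


def app_query(connection):
    """The user application: its SQL text never changes in this sketch."""
    return connection.execute(
        "SELECT name, phone FROM student_contact WHERE reg_number = 'R-001'"
    ).fetchall()


before = app_query(conn)

# Logical change: phone numbers move into their own table, and the view is
# redefined to hide the change. (The old student.phone column is left in
# place to keep the sketch short; the view no longer reads it.)
conn.execute("CREATE TABLE contact (reg_number TEXT PRIMARY KEY, phone TEXT)")
conn.execute("INSERT INTO contact SELECT reg_number, phone FROM student")
conn.execute("DROP VIEW student_contact")
conn.execute(
    """CREATE VIEW student_contact AS
       SELECT s.reg_number, s.name, c.phone
       FROM student AS s JOIN contact AS c ON c.reg_number = s.reg_number"""
)

# Physical change: an index alters the storage and access path only.
conn.execute("CREATE INDEX idx_contact_phone ON contact (phone)")

after = app_query(conn)
print(before == after)
```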

Data Independence

Network Transparency
The user should not only be free from network management activities but should be unaware of even the existence of the network. Then there would be no difference between database applications that run on a centralized database and those that run on a distributed database. This type of transparency is referred to as network transparency or distribution transparency.
It has two parts: location transparency and naming transparency
- Location transparency means that the command used to perform a task is independent of where the data is located
- Naming transparency means that a unique name is provided for each object in the database
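
A minimal sketch of how a catalog can hide the network from the application. The site names, relation names, and the helpers `read_relation` and `move_relation` are hypothetical, and the "sites" are plain dictionaries rather than real machines.

```python
# Hypothetical sites, each holding some named relations.
SITES = {
    "karachi": {"ACCOUNT": [("A-1", 500)]},
    "lahore": {"CUSTOMER": [("C-1", "Ali")]},
}

# Global catalog: one unique name per object, mapped to the site storing it.
CATALOG = {"ACCOUNT": "karachi", "CUSTOMER": "lahore"}


def read_relation(name):
    """Application-facing read: the caller never mentions a site."""
    site = CATALOG[name]
    return SITES[site][name]


def move_relation(name, new_site):
    """Administrative move: only the catalog and the storage change."""
    old_site = CATALOG[name]
    SITES.setdefault(new_site, {})[name] = SITES[old_site].pop(name)
    CATALOG[name] = new_site


rows_before = read_relation("ACCOUNT")
move_relation("ACCOUNT", "lahore")
rows_after = read_relation("ACCOUNT")  # same call, different location
print(rows_before == rows_after)
```

The unique names in `CATALOG` stand in for naming transparency, and the fact that `read_relation` takes no site argument stands in for location transparency.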

Replication Transparency
For performance, reliability, and availability reasons, it is usually desirable to distribute data in a replicated fashion across the machines on a network.
Such replication helps performance, since diverse and conflicting user requirements can be more easily accommodated. It also helps availability: if one of the machines fails, a copy of the data is still available on another machine on the network.
Assuming that data are replicated, the transparency issue is whether the users should be aware of the existence of copies or whether the system should handle the management of copies and the user should act as if there is a single copy of the data.
Replication transparency refers only to the existence of replicas, not to their actual location.
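
The idea that the user acts as if there were a single copy can be sketched as follows. The class `ReplicatedItem` and the site names are invented, and the write rule (update every copy that is currently reachable) is a deliberate simplification: bringing a failed copy up to date after recovery and handling concurrent writers are outside this sketch.

```python
class ReplicatedItem:
    """One logical data item stored as copies on several hypothetical sites."""

    def __init__(self, value, sites):
        self.copies = {site: value for site in sites}
        self.up = set(sites)

    def fail(self, site):
        """Simulate a site failure."""
        self.up.discard(site)

    def read(self):
        """Return the value from any reachable copy; the caller names no site."""
        for site, value in self.copies.items():
            if site in self.up:
                return value
        raise RuntimeError("no replica available")

    def write(self, value):
        """Simplified rule: update every copy that is currently reachable."""
        for site in self.up:
            self.copies[site] = value


balance = ReplicatedItem(100, ["site_a", "site_b"])
balance.fail("site_a")          # one machine goes down
value_after_failure = balance.read()
balance.write(250)
value_after_write = balance.read()
print(value_after_failure, value_after_write)
```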

Fragmentation Transparency
The final form of transparency that needs to be addressed within the context of a distributed database system is fragmentation transparency. Fragmentation divides each relation into smaller fragments, each treated as a separate database object. This is commonly done for reasons of performance, availability, and reliability. Furthermore, fragmentation can reduce the negative effects of replication: each replica is not the full relation but only a subset of it; thus less space is required and fewer data items need to be managed.
There are two general types of fragmentation: horizontal fragmentation and vertical fragmentation.
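
A minimal sketch of the two fragmentation types, assuming the usual definitions: a horizontal fragment holds a subset of the rows (tuples), and a vertical fragment holds a subset of the columns (attributes) plus the key. The `EMP` relation, its contents, and the helper names are invented for illustration.

```python
EMP = [
    {"eno": "E1", "name": "Ali", "city": "Lahore", "salary": 4000},
    {"eno": "E2", "name": "Sara", "city": "Karachi", "salary": 5200},
    {"eno": "E3", "name": "Omar", "city": "Lahore", "salary": 3100},
]


def horizontal_fragment(rows, predicate):
    """Keep whole rows that satisfy a selection predicate."""
    return [row for row in rows if predicate(row)]


def vertical_fragment(rows, key, attributes):
    """Keep only some attributes, always including the key for rejoining."""
    return [{a: row[a] for a in (key, *attributes)} for row in rows]


# Horizontal: one fragment per city.
emp_lahore = horizontal_fragment(EMP, lambda r: r["city"] == "Lahore")
emp_karachi = horizontal_fragment(EMP, lambda r: r["city"] == "Karachi")

# Vertical: personal details separated from pay details.
emp_info = vertical_fragment(EMP, "eno", ["name", "city"])
emp_pay = vertical_fragment(EMP, "eno", ["salary"])

# Reconstruction: union for horizontal fragments, join on the key for vertical.
rebuilt_from_rows = sorted(emp_lahore + emp_karachi, key=lambda r: r["eno"])
pay_by_key = {r["eno"]: r for r in emp_pay}
rebuilt_from_columns = [{**info, **pay_by_key[info["eno"]]} for info in emp_info]
print(rebuilt_from_rows == EMP, rebuilt_from_columns == EMP)
```

Under fragmentation transparency the user would query `EMP` as a whole, and the system would perform the equivalent of these reconstructions behind the scenes.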

Responsibility of Transparency
Transparency is desirable, but there is a compromise between the level of transparency and difficulty/cost. (Gray argues that full transparency makes the management of distributed data very difficult and claims that “applications coded with transparent access to geographically distributed databases have: poor manageability, poor modularity, and poor message performance” [Gray, 1989].)
Transparency can be provided at three layers:
- The language/compiler: provides a uniform method of manipulating data and avoids connectivity details
- The operating system: already provides transparency in the form of device drivers
- The DBMS: the third layer at which transparency can be supported
In practice, we get a combination of all three

Layers of Transparency

Improved Performance
Data localization, that is, storing data close to its points of use, has two potential advantages:
1. Since each site handles only a portion of the database, contention for CPU and I/O services is not as severe as for centralized databases
2. Localization reduces remote access delays that are usually involved in wide area networks (for example, the minimum round-trip message propagation delay in satellite-based systems is about 1 second)

Easier System Expansion
It is much easier to accommodate increasing database sizes. Expansion can usually be handled by adding processing and storage power to the network. Obviously, it may not be possible to obtain a linear increase in “power,” since this also depends on the overhead of distribution. However, significant improvements are still possible.
Grosch’s law: it was commonly believed that one could purchase a computer four times as powerful by spending twice as much. With the advent of microcomputers and workstations and their price/performance characteristics, this law is now considered invalid, but this does not mean that mainframes are dead.
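
The law named above (usually spelled Grosch's law) amounts to computing power growing with the square of cost, which is where "four times the power for twice the price" comes from. A small sketch of that arithmetic, with the function name `grosch_power` and the constant `k` chosen here for illustration:

```python
def grosch_power(cost: float, k: float = 1.0) -> float:
    """Computing power under the quadratic rule: power = k * cost ** 2."""
    return k * cost ** 2


base = grosch_power(1.0)
doubled_budget = grosch_power(2.0)

# Doubling the spend gives four times the power under this rule.
print(doubled_budget / base)
```

Under this rule one large machine always beats several small ones of the same total cost, which is the assumption that the price/performance of microcomputers and workstations undermined.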

Thanks