Distributed Database Management Systems


2 Distributed Database Management Systems
Lecture - 1

3 References
1. Principles of Distributed Database Systems (2nd Edition), by M. T. Özsu and P. Valduriez
2. Distributed Database Systems, by D. Bell and J. Grimson, Addison-Wesley, 1992

4 References
3. Distributed Systems: Concepts and Design (4th Edition), by G. Coulouris, J. Dollimore, and T. Kindberg, Addison-Wesley
Prerequisites: Database Management Systems, Computer Networks

5 Brief Course Introduction
Introduction to databases and distributed systems in general
Architectures and design issues of DDBS
Technological aspects and designs
Theoretical aspects of the topic

6 A Little Bit of History
Traditional File Processing System: the very first form of business data processing
Each program contains the description of the data that it manipulates
Redundancy of data
Problems in maintenance

7 Program and Data Interdependence
[Figure: Library, Examination, and Registration applications, each with its own data files]

8 File Processing Systems
[Figure: Library, Exam, and Registration files with overlapping fields: Reg_Number, Name, Father Name, Address, Books Issued, Class, Phone, Fine, Semester, Grade]
Duplication of data
Vulnerable to inconsistency

9 Traditional File Processing

10 History continues
Database Approach (also called centralized database)
A database is a shared collection of logically related data

11 Database Approach
[Figure: PROGRAM 1, PROGRAM 2, PROGRAM 3, ... all access a single database, which holds the data description and handles data manipulation]
Takes care of all major drawbacks of the file-system environment, and more

12 Distributed Computing System
A number of autonomous processing elements that are connected through a computer network and that cooperate in performing their assigned tasks

13 Distributed Computing Systems
Distributed system software enables computers to coordinate and share
What is distributed? Processing logic, functions, data, control; all are relevant and important here

14 Classifications of DCS
Degree of coupling: how closely the systems are connected
May be measured as the ratio of messages exchanged to local processing
Could be weak (over the network) or strong (if components are shared)

15 Classifications of DCS
Interconnection structure: could be point-to-point or a common interconnection channel
Interdependence of components
Synchronization
These factors are not totally independent

16 Why DCS?
Suits some organizational structures; more reliable and responsive
Nature of some applications
Technological push

17 DCS’s Alerts
Information pieces and lack of standards
Difficulties in large application design
Too many options available

18 Distributed DB and DBMS

19 Distributed Database: A collection of logically interrelated databases that are spread physically across multiple locations connected by a data communications link.

20 Main Characteristics
Data at multiple sites
DM at each site
Local requirements
Global perspective

21 Where to apply
Two major reasons make an application a candidate for a DDBS:
Large number of users
Operations spread over a large geographical area

22 Example Applications
Banking
Air ticketing
Business at multiple locations

23 Distributed DBMS: A software system that permits the management of the DDB and makes the distribution transparent to the users
Decentralized Database: A collection of independent databases on non-networked computers.

24 Resembling Setups

25 Distributed Files: A collection of files stored on different computers of a network; not a DDBS
In a DDBS the data is logically related, the files share a common structure, and they are accessed via the same interface

26 Resembling Setups
Multiprocessor System: multiple processors that share some form of memory
[Figure: processor units sharing memory and the I/O system]
Shared Everything, Tight Coupling

27 Resembling Setups
[Figure: computer systems, each with its own CPU and memory, sharing secondary storage]
Shared Everything, Loose Coupling

28 Resembling Setups
[Figure: computer systems, each with its own CPU and memory, connected by a switch]
Shared Nothing

29 Resembling Setups
A DDBS also differs from a centralized database that is accessed over a network in a client/server (C/S) arrangement

30 Reasons for DDBS

31 Local units want control over data.
Consolidate data for integrated decisions.
Reduce telecommunication costs.
Reduce the risk of telecommunication failures.

32 [Figure: Distributed DBMS spanning Node 1 ... Node n, each running its own DBMS (DBMS 1 ... DBMS n). Other labels: Global User, Local User, Schema]

33 Objectives/Promises of DDBSs

34 Transparency User View System View

35 Data Independence
Data independence is a fundamental form of transparency that we look for within a DBMS
It is also the only type that is important within the context of a centralized DBMS

36 Data Independence
Two types: logical data independence and physical data independence
Logical data independence refers to the immunity of user applications to changes in the logical structure (i.e., schema) of the database
Physical data independence deals with hiding the details of the storage structure from user applications
A transparent system hides the implementation details from its users
When a user application is written, it should not be concerned with the details of physical data organization. Therefore, the user application should not need to be modified when data organization changes occur due to performance considerations
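
A minimal sketch of both kinds of data independence, using Python's built-in sqlite3 module. The table, view, and function names (student, student_view, names_by_class) and the sample rows are invented for illustration. The application query is written once against a view; the base tables are then restructured (a logical change) and an index is added (a physical change) without touching the application code.

```python
import sqlite3

def names_by_class(conn, class_name):
    """Application query: written against the view, never against base tables."""
    rows = conn.execute(
        "SELECT reg_number, name FROM student_view "
        "WHERE class = ? ORDER BY reg_number",
        (class_name,),
    )
    return rows.fetchall()

conn = sqlite3.connect(":memory:")

# Original logical schema: one wide table, exposed through a view.
conn.executescript("""
    CREATE TABLE student (reg_number INTEGER PRIMARY KEY, name TEXT, class TEXT, phone TEXT);
    INSERT INTO student VALUES (1, 'Ali', 'BSCS', '111');
    INSERT INTO student VALUES (2, 'Sara', 'MSCS', '222');
    INSERT INTO student VALUES (3, 'Omar', 'BSCS', '333');
    CREATE VIEW student_view AS SELECT reg_number, name, class FROM student;
""")
before = names_by_class(conn, "BSCS")

# Logical change: split the table in two and redefine the view over the new schema.
conn.executescript("""
    DROP VIEW student_view;
    CREATE TABLE student_core (reg_number INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE enrollment (reg_number INTEGER PRIMARY KEY, class TEXT);
    INSERT INTO student_core SELECT reg_number, name FROM student;
    INSERT INTO enrollment SELECT reg_number, class FROM student;
    DROP TABLE student;
    CREATE VIEW student_view AS
        SELECT c.reg_number, c.name, e.class
        FROM student_core c JOIN enrollment e ON e.reg_number = c.reg_number;
""")
after_logical = names_by_class(conn, "BSCS")

# Physical change: add an index purely for performance reasons.
conn.execute("CREATE INDEX idx_enrollment_class ON enrollment (class)")
after_physical = names_by_class(conn, "BSCS")
```

The three calls to names_by_class return the same rows, which is the point: only the view definition and storage details changed, not the application.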

37 Data Independence

38 Network Transparency
The user should not only be free from network management activities but should also be unaware of the very existence of the network
Then there would be no difference between database applications that run on a centralized database and those that run on a distributed database
This type of transparency is referred to as network transparency or distribution transparency
Two types: location transparency and naming transparency
Naming transparency means that a unique name is provided for each object in the database
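
One way to picture location and naming transparency is a catalog that maps each globally unique relation name to the site that stores it. The sketch below is a toy in-memory model, not how any particular DDBMS implements its catalog; Site, DistributedCatalog, the relation name ACCOUNT, and the site names are all hypothetical.

```python
class Site:
    """A hypothetical site that keeps its relations in local memory."""
    def __init__(self, name):
        self.name = name
        self.relations = {}

class DistributedCatalog:
    """Maps each globally unique relation name to the site that stores it."""
    def __init__(self):
        self._location = {}

    def register(self, relation, site, rows):
        # Naming transparency: a name may be used for only one object.
        if relation in self._location:
            raise ValueError(f"name {relation!r} is already in use")
        site.relations[relation] = list(rows)
        self._location[relation] = site

    def move(self, relation, new_site):
        # Relocating data changes the catalog entry, not the user's query.
        old_site = self._location[relation]
        new_site.relations[relation] = old_site.relations.pop(relation)
        self._location[relation] = new_site

    def scan(self, relation):
        # Location transparency: the caller names the relation, never the site.
        site = self._location[relation]
        return list(site.relations[relation])

lahore, karachi = Site("lahore"), Site("karachi")
catalog = DistributedCatalog()
catalog.register("ACCOUNT", lahore, [(1, 500), (2, 750)])

rows_before = catalog.scan("ACCOUNT")
catalog.move("ACCOUNT", karachi)
rows_after = catalog.scan("ACCOUNT")  # same call, data now lives elsewhere
```

The user-facing call, catalog.scan("ACCOUNT"), is identical before and after the move.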

39 Replication Transparency
For performance, reliability, and availability reasons, it is usually desirable to distribute data in a replicated fashion across the machines on a network
Such replication helps performance, since diverse and conflicting user requirements can be more easily accommodated
It also helps availability: if one of the machines fails, a copy of the data is still available on another machine on the network
Assuming that data are replicated, the transparency issue is whether the users should be aware of the existence of copies, or whether the system should handle the management of copies and the user should act as if there is a single copy of the data
Replication transparency refers only to the existence of replicas, not to their actual location
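
The idea can be sketched with a store that keeps several copies but exposes a single-copy interface. This is a deliberately simplified illustration: the classes Replica and ReplicatedItemStore are hypothetical, and the "write to every live copy, read from any live copy" policy used here is an assumption that ignores how a recovered replica would catch up on missed writes.

```python
class Replica:
    """One copy of the data at a hypothetical site."""
    def __init__(self, site):
        self.site = site
        self.data = {}
        self.up = True

class ReplicatedItemStore:
    """Presents several replicas to the user as if there were a single copy."""
    def __init__(self, replicas):
        self._replicas = list(replicas)

    def write(self, key, value):
        # Update every replica that is currently reachable.
        live = [r for r in self._replicas if r.up]
        if not live:
            raise RuntimeError("no replica available")
        for replica in live:
            replica.data[key] = value

    def read(self, key):
        # Serve the read from the first reachable replica.
        for replica in self._replicas:
            if replica.up:
                return replica.data[key]
        raise RuntimeError("no replica available")

site_a, site_b = Replica("A"), Replica("B")
store = ReplicatedItemStore([site_a, site_b])

store.write("balance", 100)
site_a.up = False                      # one machine fails
balance = store.read("balance")        # still answered, from the other copy
```

The caller never mentions a replica or a site; it only sees write and read on one logical item.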

40 Fragmentation Transparency
The final form of transparency that needs to be addressed within the context of a distributed database system is fragmentation transparency
Fragmentation divides each relation into smaller fragments, each treated as a separate database object
This is commonly done for reasons of performance, availability, and reliability
Furthermore, fragmentation can reduce the negative effects of replication: each replica is not the full relation but only a subset of it, so less space is required and fewer data items need to be managed
There are two general types of fragmentation: horizontal fragmentation and vertical fragmentation
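
The two types can be illustrated on a small in-memory relation. In this sketch the relation EMP, its columns, and the helper functions are made up for the example: horizontal fragments are subsets of rows and are recombined with a union, while vertical fragments are subsets of columns that each keep the key and are recombined with a join on that key.

```python
EMP = [
    {"eno": 1, "name": "Ali", "city": "Lahore", "salary": 50},
    {"eno": 2, "name": "Sara", "city": "Karachi", "salary": 60},
    {"eno": 3, "name": "Omar", "city": "Lahore", "salary": 55},
]

def horizontal_fragment(rows, predicate):
    """Keep whole rows that satisfy the predicate."""
    return [dict(r) for r in rows if predicate(r)]

def vertical_fragment(rows, key, columns):
    """Keep the key plus the chosen columns of every row."""
    return [{c: r[c] for c in (key, *columns)} for r in rows]

def union(key, *fragments):
    """Rebuild a relation from horizontal fragments, ordered by key."""
    merged = [r for fragment in fragments for r in fragment]
    return sorted(merged, key=lambda r: r[key])

def join(left, right, key):
    """Rebuild a relation from two vertical fragments by matching keys."""
    right_by_key = {r[key]: r for r in right}
    return [{**l, **right_by_key[l[key]]} for l in left]

# Horizontal: split rows by city.
emp_lahore = horizontal_fragment(EMP, lambda r: r["city"] == "Lahore")
emp_other = horizontal_fragment(EMP, lambda r: r["city"] != "Lahore")

# Vertical: split columns, repeating the key in both fragments.
emp_public = vertical_fragment(EMP, "eno", ["name", "city"])
emp_pay = vertical_fragment(EMP, "eno", ["salary"])

rebuilt_h = union("eno", emp_lahore, emp_other)
rebuilt_v = join(emp_public, emp_pay, "eno")
```

Under fragmentation transparency the user would query EMP as a whole, and the system would perform the equivalent of union or join behind the scenes.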

41 Responsibility of Transparency
Transparency is desirable, but there is a trade-off between the level of transparency and its difficulty and cost. Gray argues that full transparency makes the management of distributed data very difficult and claims that “applications coded with transparent access to geographically distributed databases have: poor manageability, poor modularity, and poor message performance” [Gray, 1989].
Transparency can be provided at three layers:
The language/compiler: provides a uniform method of manipulating data and hides connectivity details
The operating system: already provides transparency in the form of device drivers
The DBMS: the third layer at which transparency can be supported
In practice, we get a combination of all three

42 Layers of Transparency

43 Improved Performance
Data is stored close to where it is used (data localization). This has two potential advantages:
1. Since each site handles only a portion of the database, contention for CPU and I/O services is not as severe as for centralized databases
2. Localization reduces remote access delays that are usually involved in wide area networks (for example, the minimum round-trip message propagation delay in satellite-based systems is about 1 second)

44 Easier System Expansion
It is much easier to accommodate increasing database sizes. Expansion can usually be handled by adding processing and storage power to the network. Obviously, it may not be possible to obtain a linear increase in “power,” since this also depends on the overhead of distribution. However, significant improvements are still possible.
Grosch’s law: it was commonly believed that one could purchase a computer four times as powerful by spending twice as much. With the advent of microcomputers and workstations and their price/performance characteristics, this law is now considered invalid, but this does not mean that mainframes are dead.
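
Grosch's law is usually stated as computing power growing with the square of cost. Writing P for performance, C for cost, and k for a constant (symbols chosen here only for illustration), the "twice the money, four times the power" claim follows directly:

```latex
P(C) = k\,C^{2}
\quad\Longrightarrow\quad
\frac{P(2C)}{P(C)} = \frac{k\,(2C)^{2}}{k\,C^{2}} = 4
```

Under that assumption one large machine always beats several small ones of the same total cost, which is why the law's loss of validity matters for distributed systems.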

45 Thanks

