
1 CENG 553 Database Management Systems Nihan Kesim Çiçekli email: nihan@ceng.metu.edu.tr URL: http://www.ceng.metu.edu.tr/~nihan

2 Course Description
This is an advanced course on databases covering theoretical and practical issues of database management systems. After a review of relational databases, transaction processing and a survey of various topics in database systems will be presented. Topics include object-oriented databases, distributed databases, web databases, multimedia databases, data mining, and data warehouses.

3 References
– P. Lewis, A. Bernstein, M. Kifer, Database Systems: An Application-Oriented Approach, 2nd edition, Addison-Wesley, 2005.
– R. Ramakrishnan, Database Management Systems, 3rd edition, McGraw Hill, 2003.
– R. Elmasri, S.B. Navathe, Database Systems, 6th edition, Pearson, 2011.
– A. Silberschatz, H.F. Korth, S. Sudarshan, Database System Concepts, 4th edition, McGraw Hill, 2002.

4 Grading
– Term Paper: 30%
– Class Presentation: 20%
– Critiques: 15%
– Final: 35%
The term paper will be prepared by a team of two students.

5 Schedule
– March 17: Term paper topics and articles to be studied by each student will be determined.
– March 24: The schedule of class presentations will be announced.
– March 27: Articles to be critiqued by each student will be determined.
– May 5 - May 26: Class presentations.
– May 27: Term paper submission.

6 Suggested Topics
– Content-based querying in multimedia databases
– Automatic annotation in multimedia databases
– Recommender systems
– Web service composition
– Web service discovery
– Web mining
– Spatial databases and GIS
– Security and e-commerce
– Main memory databases
– DBMS interfaces

7 INTRODUCTION

8 Main Characteristics of the Database Approach
– Data abstraction: A data model is used to hide storage details and present the users with a conceptual view of the database.
– Support of multiple views of the data: Each user may see a different view of the database, which describes only the data of interest to that user.

9 Main Characteristics of the Database Approach
– Self-describing nature of a database system: A DBMS catalog stores the description of the database. The description is called meta-data. This allows the DBMS software to work with different databases.
– Insulation between programs and data: Called program-data independence. Allows changing data storage structures and operations without having to change the DBMS access programs.
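
The self-describing property can be seen in any relational system whose catalog is itself queryable. The following is a minimal sketch using SQLite through Python's standard `sqlite3` module; SQLite and the `student` table are assumptions chosen for illustration and are not part of the course material.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# The catalog is stored as an ordinary table: sqlite_master lists every schema object.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(student)")]

print(catalog)  # [('table', 'student')]
print(columns)  # [('id', 'INTEGER'), ('name', 'TEXT')]
```

Because the description lives in the database rather than in the program, the same client code can inspect any database it is pointed at.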

10 Main Characteristics of the Database Approach
– Sharing of data and multiuser transaction processing: allowing a set of concurrent users to retrieve and to update the database. Concurrency control within the DBMS guarantees that each transaction is correctly executed or completely aborted.

11 Historical Development of Database Technology
– Early database applications: The hierarchical and network models were introduced in the mid-1960s and dominated during the 1970s. A large share of worldwide database processing still occurs using these models.
– Relational model based systems: The model, originally introduced in 1970, was heavily researched and experimented with at IBM and in universities. Relational DBMS products emerged in the 1980s.

12 Historical Development of Database Technology
– Object-oriented applications: OODBMSs were introduced in the late 1980s and early 1990s to cater to the need for complex data processing in CAD and other applications. Their use has not taken off much.
– Data on the Web and e-commerce applications: The Web contains data in HTML (Hypertext Markup Language) with links among pages. This has given rise to a new set of applications, and e-commerce is using new standards like XML (eXtensible Markup Language).

13 Extending Database Capabilities
New functionality is being added to DBMSs in the following areas:
– Scientific applications
– Image storage and management
– Audio and video data management
– Data mining
– Data warehouses
– Spatial data management
– Time series and historical data management
The above gives rise to new research and development in incorporating new data types, complex data structures, new operations, and storage and indexing schemes in database systems.

14 Database Schema vs. Database State
– Schema: A description of a particular collection of data, using the given data model.
– Database state: Refers to the content of a database at a moment in time.
– Valid state: A state that satisfies the structure and constraints of the database.
Distinction:
– The database schema changes very infrequently.
– The database state changes every time the database is updated.
– Schema is also called intension, whereas state is called extension.
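
The distinction can be illustrated with a small sketch, again assuming SQLite via Python's `sqlite3` module. The `account` table and its `CHECK` constraint are hypothetical; the point is only that an update changes the state but not the schema, and that an update leading to an invalid state is rejected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)

def schema(c):
    # The stored table definitions (the intension).
    return [r[0] for r in c.execute("SELECT sql FROM sqlite_master WHERE type = 'table'")]

def state(c):
    # The current rows (the extension).
    return c.execute("SELECT id, balance FROM account ORDER BY id").fetchall()

schema_before, state_before = schema(conn), state(conn)
conn.execute("INSERT INTO account VALUES (1, 100)")
schema_after, state_after = schema(conn), state(conn)

# An update that would produce an invalid state violates the constraint.
try:
    conn.execute("INSERT INTO account VALUES (2, -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

final_state = state(conn)
```

Here `schema_before` equals `schema_after`, while the state grows from no rows to one row and stays there after the rejected insert.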

15 Three-Schema Architecture
Proposed to support DBMS characteristics of:
– Program-data independence.
– Support of multiple views of the data.

16 Three-Schema Architecture
Defines DBMS schemas at three levels:
– Internal schema at the internal level to describe physical storage structures and access paths. Typically uses a physical data model.
– Conceptual schema at the conceptual level to describe the structure and constraints for the whole database for a community of users. Uses a conceptual or an implementation data model.
– External schemas at the external level to describe the various user views. Usually uses the same data model as the conceptual level.
Mappings among schema levels are needed to transform requests and data. Programs refer to an external schema, and are mapped by the DBMS to the internal schema for execution.

17 Data Independence
– Logical data independence: The capacity to change the conceptual schema without having to change the external schemas and their application programs.
– Physical data independence: The capacity to change the internal schema without having to change the conceptual schema.
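
Both kinds of independence can be approximated in a small sketch, assuming SQLite via Python's `sqlite3` module. The `employee` table, the `directory` view, and the index name are hypothetical. The view plays the role of an external schema, the index is an internal-level change, and the added column is a conceptual-level change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ada", 900), (2, "Alan", 700)],
)
# External schema: this user view exposes only ids and names.
conn.execute("CREATE VIEW directory AS SELECT id, name FROM employee")

def app_query(c):
    # The application program refers only to the view.
    return c.execute("SELECT name FROM directory ORDER BY name").fetchall()

before = app_query(conn)

# Physical change: a new access path; no schema seen by the program changes.
conn.execute("CREATE INDEX idx_employee_name ON employee(name)")
after_index = app_query(conn)

# Conceptual change: a new attribute; the external view is unaffected.
conn.execute("ALTER TABLE employee ADD COLUMN dept TEXT")
after_column = app_query(conn)
```

The application query is never edited, and it returns the same rows in all three cases.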

18 DBMS Interfaces
– Stand-alone query language interfaces.
– Programmer interfaces for embedding DML in programming languages.
– User-friendly interfaces:
  – Menu-based
  – Forms-based, designed for naïve users
  – Graphics-based (point and click, drag and drop, etc.)
  – Natural language: requests in written English
  – Combinations of the above

19 Other DBMS Interfaces
– Speech as input and output.
– Web browser as an interface.
– Parametric interfaces (e.g. bank tellers) using function keys.
– Interfaces for the DBA:
  – Creating accounts, granting authorizations
  – Setting system parameters
  – Changing schemas or access paths

20 Concurrency Control
Concurrent execution of user programs is essential for good DBMS performance.
– Because disk accesses are frequent and relatively slow, it is important to keep the CPU humming by working on several user programs concurrently.
Interleaving actions of different user programs can lead to inconsistency: e.g., a check is cleared while the account balance is being computed.
DBMS ensures such problems don't arise: users can pretend they are using a single-user system.
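
The inconsistency caused by interleaving can be reproduced with a toy simulation. This is a sketch under stated assumptions: the `run` function and the schedule format (transaction, action, amount) are invented for illustration and do not model a real DBMS. Each transaction reads the balance into a private copy and later writes back that copy plus its own change.

```python
def run(schedule, balance):
    """Execute a schedule of (txn, action, amount) steps against one balance."""
    db = {"balance": balance}
    local = {}  # each transaction's private copy of the value it read
    for txn, action, amount in schedule:
        if action == "read":
            local[txn] = db["balance"]
        else:  # "write": store the private copy plus this transaction's change
            db["balance"] = local[txn] + amount
    return db["balance"]

# T1 withdraws 30 and T2 withdraws 50 from a balance of 100.
serial = [("T1", "read", 0), ("T1", "write", -30),
          ("T2", "read", 0), ("T2", "write", -50)]
interleaved = [("T1", "read", 0), ("T2", "read", 0),
               ("T1", "write", -30), ("T2", "write", -50)]

print(run(serial, 100))       # 20
print(run(interleaved, 100))  # 50: T1's withdrawal is lost
```

In the interleaved schedule both transactions read 100, so T2's write overwrites T1's. This is the kind of anomaly concurrency control is meant to rule out.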

21 Transaction: An Execution of a DB Program
Key concept is transaction, which is an atomic sequence of database actions (reads/writes).
Each transaction, executed completely, must leave the DB in a consistent state if the DB is consistent when the transaction begins.
– Users can specify some simple integrity constraints on the data, and the DBMS will enforce these constraints.
– Beyond this, the DBMS does not really understand the semantics of the data (e.g., it does not understand how the interest on a bank account is computed).
– Thus, ensuring that a transaction (run alone) preserves consistency is ultimately the user's responsibility!
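
A transaction and a simple integrity constraint can be sketched with Python's `sqlite3` module. SQLite, the `account` table, and the `transfer` helper are assumptions made for illustration. The connection's context manager commits if the block succeeds and rolls back if it raises, so a transfer either happens completely or not at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"  # simple integrity constraint
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(c, src, dst, amount):
    """Move `amount` from src to dst as one atomic unit."""
    try:
        with c:  # commit on success, roll back on any exception
            c.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            c.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

def balances(c):
    return c.execute("SELECT balance FROM account ORDER BY id").fetchall()

first = transfer(conn, 1, 2, 30)    # succeeds
after_first = balances(conn)        # [(70,), (80,)]
second = transfer(conn, 1, 2, 500)  # debit would make the balance negative
after_second = balances(conn)       # unchanged: the credit was rolled back too
```

The DBMS enforces the `CHECK` constraint, but it does not know that a transfer must debit one account and credit the other. Writing `transfer` so that it preserves that invariant is the programmer's job.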

22 Scheduling Concurrent Transactions
DBMS ensures that execution of {T1, ..., Tn} is equivalent to some serial execution T1′ ... Tn′.
– Before reading/writing an object, a transaction requests a lock on the object and waits until the DBMS gives it the lock. All locks are released at the end of the transaction. (Strict 2PL locking protocol.)
– Idea: If an action of Ti (say, writing X) affects Tj (which perhaps reads X), one of them, say Ti, will obtain the lock on X first and Tj is forced to wait until Ti completes; this effectively orders the transactions.
– What if Tj already holds a lock on Y and Ti later requests a lock on Y? Then Ti waits for Tj while Tj waits for Ti: deadlock! Ti or Tj is aborted and restarted.
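
The locking idea and the deadlock case can be sketched as a toy lock manager. This is a simplification under stated assumptions: the `LockManager` class is hypothetical, it uses exclusive locks only (no shared/exclusive modes), it runs single-threaded, and a blocked transaction is assumed to wait for exactly one other transaction.

```python
class LockManager:
    """Toy Strict 2PL lock manager with exclusive locks only."""

    def __init__(self):
        self.owner = {}      # object -> transaction holding its lock
        self.waits_for = {}  # blocked transaction -> transaction it waits on

    def request(self, txn, obj):
        holder = self.owner.get(obj)
        if holder is None or holder == txn:
            self.owner[obj] = txn
            self.waits_for.pop(txn, None)
            return "granted"
        self.waits_for[txn] = holder
        if self._has_cycle(txn):
            del self.waits_for[txn]
            return "deadlock"  # the caller should abort one transaction
        return "wait"

    def _has_cycle(self, start):
        # Follow the waits-for chain and see whether it leads back to start.
        seen = set()
        node = self.waits_for.get(start)
        while node is not None and node not in seen:
            if node == start:
                return True
            seen.add(node)
            node = self.waits_for.get(node)
        return False

    def release_all(self, txn):
        """Strict 2PL: every lock is released together at commit or abort."""
        for obj in [o for o, t in self.owner.items() if t == txn]:
            del self.owner[obj]
        self.waits_for.pop(txn, None)


lm = LockManager()
results = [
    lm.request("Ti", "X"),  # granted
    lm.request("Tj", "Y"),  # granted
    lm.request("Tj", "X"),  # wait: Ti holds X
    lm.request("Ti", "Y"),  # deadlock: Ti waits for Tj, which waits for Ti
]
lm.release_all("Ti")                 # abort Ti, releasing X
retry = lm.request("Tj", "X")        # granted now
```

Detecting the cycle in the waits-for relation is one common way to find deadlocks; the slide does not say which method a given DBMS uses.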

23 Ensuring Atomicity
DBMS ensures atomicity (all-or-nothing property) even if the system crashes in the middle of a transaction (Xact).
Idea: Keep a log (history) of all actions carried out by the DBMS while executing a set of Xacts:
– Before a change is made to the database, the corresponding log entry is forced to a safe location. (This is the write-ahead logging, or WAL, protocol; OS support for this is often inadequate.)
– After a crash, the effects of partially executed transactions are undone using the log. (Thanks to WAL, if a log entry wasn't saved before the crash, the corresponding change was not applied to the database!)

24 The Log
The following actions are recorded in the log:
– Ti writes an object: the old value and the new value. The log record must go to disk before the changed page!
– Ti commits/aborts: a log record indicating this action.
Log records are chained together by Xact id, so it's easy to undo a specific Xact (e.g., to resolve a deadlock).
Log is often duplexed and archived on "stable" storage.
All log-related activities (and in fact, all concurrency-control-related activities such as lock/unlock, dealing with deadlocks, etc.) are handled transparently by the DBMS.
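
The log records and the undo step can be sketched in memory. This is an illustration under stated assumptions: the `write`, `commit`, and `recover` functions and the record layout are hypothetical, the "database" is a plain dictionary, and nothing is actually forced to disk, so only the ordering rule and the undo logic are shown.

```python
def write(db, log, txn, key, new_value):
    # WAL rule: append the log record (old and new value) before changing the data.
    log.append(("write", txn, key, db.get(key), new_value))
    db[key] = new_value

def commit(log, txn):
    log.append(("commit", txn))

def recover(db, log):
    """Undo, newest first, every write of a transaction with no commit record."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, key, old_value, _ = rec
            if old_value is None:  # assumption: None means the key did not exist
                db.pop(key, None)
            else:
                db[key] = old_value
    return db

db, log = {"A": 100, "B": 50}, []
write(db, log, "T1", "A", 70)
write(db, log, "T1", "B", 80)
commit(log, "T1")
write(db, log, "T2", "A", 0)  # T2 is still running when the crash happens

recovered = recover(db, log)
print(recovered)  # {'A': 70, 'B': 80}
```

T1's committed changes survive, while T2's partial change is rolled back from the old value stored in its log record.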

25 Structure of a DBMS
A typical relational DBMS has a layered architecture. The figure does not show the concurrency control and recovery components. This is one of several possible architectures; each system has its own variations.
[Figure: layers, top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB. Annotation: these layers must consider concurrency control and recovery.]

26 Centralized Architecture
Centralized database system: combines everything into a single system, including DBMS software, hardware, application programs, and user interface processing software.

27 Single-User System
– Presentation services: displays forms, handles flow of information to/from screen.
– Application services: implements user request, interacts with DBMS.
– ACID properties automatic (isolation is trivial) or not required (this is not really an enterprise).
[Figure labels: user module, presentation services, application services, DBMS, centralized system]

28 Centralized Multi-User System
Dumb terminals connected to mainframe
– Application and presentation services on mainframe
ACID properties required
– Isolation: DBMS sees an interleaved schedule
– Atomicity and durability: system supports a major enterprise
Transaction abstraction is necessary; supplied by the DBMS's transaction support module.

29 Centralized Multi-User System
[Figure labels: user module, communication, central machine, presentation services, application services, DBMS (transaction support)]

30 Basic Client-Server Architecture
Specialized servers with specialized functions:
– File servers
– Printer servers
– Web servers
– E-mail servers
– DBMS server
Clients

31 Clients
– Provide appropriate interfaces and a client version of the system to access and utilize the server resources.
– Clients may be diskless machines, or PCs or workstations with disks that have only the client software installed.
– Connected to the servers via some form of network (LAN: local area network, wireless network, etc.).

32 DBMS Server
– Provides database query and transaction services to the clients.
– Sometimes called query and transaction servers.

33 Two Tier Client-Server Architecture
– User interface programs and application programs run on the client side.
– An interface such as ODBC (Open Database Connectivity) or JDBC provides an application program interface (API) that allows client-side programs to call the DBMS.

34 Two-Tiered Model of TPS
[Figure labels: client machines, presentation services, application services, communication, database server machine, DBMS]

35 Three Tier Client-Server Architecture
Common for Web applications.
Intermediate layer called Application Server or Web Server:
– stores the web connectivity software and the rules and business logic (constraints) part of the application used to access the right amount of data from the database server
– acts like a conduit for sending partially processed data between the database server and the client
Additional features (security):
– encrypt the data at the server before transmission
– decrypt data at the client

36 Three-Tiered Model of TPS
[Figure labels: client machines, presentation server, communication, application server machine, application server, database server machine, DBMS]

37 Classification of DBMSs
Based on the data model used:
– Traditional: relational, network, hierarchical.
– Emerging: object-oriented, object-relational.
Other classifications:
– Single-user (typically used with microcomputers) vs. multi-user (most DBMSs).
– Centralized (uses a single computer with one database) vs. distributed (uses multiple computers, multiple databases).

38 Classification of DBMSs
Distributed database systems have now come to be known as client-server-based database systems because they do not support a totally distributed environment, but rather a set of database servers supporting a set of clients.

39 Variations of Distributed Environments
– Homogeneous DDBMS
– Heterogeneous DDBMS
– Federated or multidatabase systems

40 Application Designer's View of a Distributed Database
Designer might see the individual schemas of each local database (called a multidatabase), in which case distribution is visible.
– Can be homogeneous (all databases from one vendor) or heterogeneous (databases from different vendors)
Designer might see a single global schema that integrates all local schemas (is a view), in which case distribution is hidden.
Designer might see a restricted global schema, which is the union of all the local schemas.
– Supported by some vendors of homogeneous systems

41 Views of Distributed Data
[Figure: (a) Multidatabase with local schemas; (b) Integrated distributed database with global schema]

42 Multidatabases
– Application must explicitly connect to each site.
– Application accesses data at a site using SQL statements based on that site's schema.
– Application may have to do reformatting in order to integrate data from different sites.
– Application must manage replication: know where replicas are stored and decide which replica to access.

43 Global and Restricted Global Schemas
Middleware provides integration of local schemas into a global schema.
– Application need not connect to each site.
– Application accesses data using the global schema; it need not know where data is stored (location transparency).
– Global joins are supported.
– Middleware performs necessary data reformatting.
– Middleware manages replication (replication transparency).
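
The difference between the multidatabase view and the global-schema view can be sketched with two in-memory SQLite databases standing in for two sites. Everything here is a hypothetical illustration: the site names, the `customer` and `client` tables, and the `global_customers` function, which plays the role of the middleware. Replication is not modelled.

```python
import sqlite3

# Site A and site B store the same kind of data under different local schemas.
site_a = sqlite3.connect(":memory:")
site_a.execute("CREATE TABLE customer (id INTEGER, full_name TEXT)")
site_a.execute("INSERT INTO customer VALUES (1, 'Ada Lovelace')")

site_b = sqlite3.connect(":memory:")
site_b.execute("CREATE TABLE client (client_no INTEGER, first_name TEXT, last_name TEXT)")
site_b.execute("INSERT INTO client VALUES (2, 'Alan', 'Turing')")

def global_customers():
    """Middleware role: present one global relation customers(id, name)."""
    rows = site_a.execute("SELECT id, full_name FROM customer").fetchall()
    # Reformat site B's rows to match the global schema.
    rows += [
        (client_no, f"{first} {last}")
        for client_no, first, last in site_b.execute(
            "SELECT client_no, first_name, last_name FROM client"
        )
    ]
    return sorted(rows)

print(global_customers())  # [(1, 'Ada Lovelace'), (2, 'Alan Turing')]
```

In the multidatabase setting the application itself would contain the two site-specific queries and the reformatting. With a global schema it calls only the middleware and never learns which site holds which row.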

