Presentation is loading. Please wait.

Presentation is loading. Please wait.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Database Management Systems Chapter 1 Professor: Iluju Kiringa

Similar presentations


Presentation on theme: "Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Database Management Systems Chapter 1 Professor: Iluju Kiringa"— Presentation transcript:

1 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Database Management Systems Chapter 1 Professor: Iluju Kiringa kiringa@site.uottawa.ca kiringa@site.uottawa.ca http://www.site.uottawa.ca/~kiringa SITE 5072

2 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke2 What Is a DBMS?  A database is a very large, integrated collection of data.  Organizations are faced with HUGE amounts of data that needs to be efficiently managed.  Models real-world organization / enterprise.  Entities (e.g., students, courses, faculty, and classrooms)  Relationships (e.g., Sue is enrolled in CSI3317; Iluju teaches CSI3317, CSI3317 is taught in TBT070)  A Database Management System (DBMS) is a software package designed to store and manage databases.

3 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke3 Drawbacks of Files Systems  Each application must move large datasets between main memory and secondary storage (must deal with, e.g., buffering, page-oriented access, etc.)  Each application must deal with some method of identifying all data items in case the available addressing mode is not sufficient (e.g., 32-bit addressing cannot directly access more than 4GB.)

4 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke4 Drawbacks of Files Systems (Cont’d)  Need special code for different queries (i.e. all query codes are ad hoc ).  Must protect data from inconsistency due to multiple concurrent users changing it.  Ensure consistent crash recovery.  Provide more security and access control than the password mechanism offered by operating systems.

5 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke5 Why Use a DBMS?  Data independence : application don’t see details of data representation and storage.  Efficient access : use of very sophisticated data storage and access methods.  Reduced application development time : functionalities of a DBMS need not be duplicated.  Data integrity and security : enforcing integrity constraints and access control.  Uniform data administration : experienced users administer data that is used by inexperienced ones.  Concurrent access, recovery from crashes : users access data without thinking of whoever else uses it.

6 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke6 Why Study Databases??  Datasets increasing in diversity and volume  Digital libraries, interactive video, Human Genome project, EOS project, Mars exploration projects, … ... need for DBMS exploding  Efficient management of such a diverse amount of data involves research in many fundamental issues.  Those issues are common to all the domains.  DBMS encompasses most of CS  OS, languages, theory, AI, multimedia, logic, etc. ?

7 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke7 Data Models  A data model is a collection of high-level constructs for describing data.  A schema is a description of a particular collection of data, using a given data model.  An instance of a schema is a sample set of data organized using a given schema.  The relational model is the most widely used model today.  Main concept: relation instance (relation), i.e. a table with rows and columns.  Every relation has a relation schema (schema), which describes the name and columns, or fields.

8 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke8 Levels of Abstraction  Data schemas are given at 3 levels of abstractions.  Many views, single conceptual (logical) schema and physical schema.  Views (or external schemas) describe how users see the data.  Conceptual schema defines logical structure  Physical schema describes the files and indexes used. * Schemas are defined using a data definition language (DDL). Physical Schema Conceptual Schema View 1View 2View 3

9 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke9 Example: Enterprise Database  Conceptual schema:  Emp(eid: integer, ename: string, age: integer, sal: real)  Works_in(eid: integer, did: integer, rating: real)  Dept(did: integer, budget: real, mgr: integer)  Physical schema:  Relations stored as ordered files.  Index on first column of Emp, and Dept; index on all columns of Works_in, etc.  External Schema (View):  Dept_count(did: integer, count: integer)

10 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke10 Example: University Database  Conceptual schema:  Students(sid: string, name: string, login: string, age: integer, gpa: real)  Courses(cid: string, cname: string, credits: integer)  Enrolled(sid: string, cid: string, grade: string)  Physical schema:  Relations stored as unordered files.  Index on first column of Students, and Courses; index on two first columns of Enrolled, etc.  External Schema (View):  Course_info(cid: string, enrollment: integer)

11 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke11 Data Independence *  Applications insulated from how data is structured and stored.  Each level of abstraction protected from changes in the structure of the level below it.  Logical data independence : Protection from changes in logical structure of data.  Physical data independence : Protection from changes in physical structure of data. * One of the most important benefits of using a DBMS!

12 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke12 Querying Data  A query is a question involving the stored data, e.g.,  What is the name of the professor who teaches CSI3317?  Which percentage of students got an A+ in CSI3317?  A query language is a special purpose language in which queries can be posed against databases.  Data is modified/queried using a data manipulation language (DML). So a query language is a subset of a DML  The relational model supports 2 query languages:  Relational calculus : logic-based query language  Relational algebra : based on a set of operators for manipulating relations.

13 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke13 Concurrency Control  Concurrent execution of user programs is essential for good DBMS performance.  Because disk accesses are frequent, and relatively slow, it is important to keep the cpu working by processing several user programs concurrently.  Interleaving actions of different user programs can lead to inconsistency: e.g., depositing money while account balance is being computed.  DBMS ensures such problems don’t arise: users can pretend they are using a single-user system.

14 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke14 Transaction: An Execution of a DB Program  A transaction is an atomic sequence of database actions (reads/writes) corresponding to the execution of a DB transaction program.  Each transaction, executed completely, must leave the DB in a consistent state if DB is consistent when the transaction begins.  Users can specify some simple integrity constraints on the data, and the DBMS will enforce these constraints.  Beyond this, the DBMS does not really understand the semantics of the data.  Thus, ensuring that a transaction (run alone) preserves consistency is ultimately the user’s responsibility!

15 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke15 Scheduling Concurrent Transactions  A DBMS ensures that execution of {T1,..., Tn} is equivalent to some serial execution of T1,..., Tn.  Before reading/writing an object, a transaction requests a lock on the object, and waits till the DBMS gives it the lock. All locks are released at the end of the transaction. (Strict 2PL locking protocol.)  Idea: If an action of Ti (say, writing X) affects Tj (which perhaps reads X), one of them, say Ti, will obtain the lock on X first and Tj is forced to wait until Ti completes; this effectively orders the transactions.  If Tj already has a lock on Y and Ti later requests a lock on Y, there is Deadlock! Ti or Tj is aborted and restarted!

16 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke16 Ensuring Atomicity  A DBMS ensures atomicity (all-or-nothing property) even if there is a crash in the middle of a transaction.  Idea: Keep a log (history) of all actions carried out by the DBMS while executing a set of transactions:  Before a change is made to the database, the corresponding log entry is forced to a safe location. ( WAL protocol ; OS support for this is often inadequate.)  After a crash, the effects of partially executed transactions are undone using the log. (Thanks to WAL, if log entry wasn’t saved before the crash, corresponding change was not applied to database!)

17 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke17 The Log  The following actions are recorded in the log:  Ti writes an object : the old value and the new value. Log record must go to disk before the changed page!  Ti commits/aborts : a log record indicating this action.  Log records are chained together by transaction id, so it’s easy to undo a specific transaction.  Log is often archived on “stable” storage.  All log related activities (and in fact, all CC related activities such as lock/unlock, dealing with deadlocks etc.) are handled transparently by the DBMS.

18 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke18 People Involved with Databases  End users: just use an interface to a DBMS!  DBMS vendors: IBM, Oracle, Informix, Microsoft, etc  DBMS researchers and implementers: invent new theories and algorithms for, and write code of DBMSs  DB application programmers: write C/C++/Java/… programs that interact with DBMSs  E.g. smart webmasters, CSI3317 students, etc  Database administrator (DBA)  Designs logical /physical schemas  Handles security and authorization  Data availability, crash recovery  Database tuning as needs evolve DBAs must understand how a DBMS works!

19 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke19 Structure of a DBMS  A typical DBMS has a layered architecture.  The figure does not show the concurrency control and recovery components.  This is one of several possible architectures; each system has its own variations. Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB These layers must consider concurrency control and recovery

20 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke20 Historical Perspective  Early 1960s: GE designed 1 st general-purpose DBMS, based on the network data model.  Late 1960s: IBM developed IMS, based on the hierarchical data model.  Early 1970s: E. Codd introduced the relational model to solve problems related to the previous models.  Late 1970s and early 1980s: work on transaction processing, mainly by Jim Gray and P. Bernstein,  All the 1980s: use of relational DBs became standard practice in large corporations, many vendors entered the market, standardization efforts on SQL, the query language for relational DBs, etc.

21 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke21 Historical Perspective (Cont’d)  Late 1980’s and early 1990s: richer data models (object- oriented, object-relational, etc), and more expressive query languages (Datalog, nested relations, etc) are introduced.  Late 1990s: major vendors extend relational DBMSs towards new data types (images, text, multimedia content) and queries based hereon.  All along the way:  Scientific recognition: Turing Awards to DB researchers E. Codd and J. Gray  Birth of a huge, multibillion industry (IBM, Oracle …)

22 Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke22 Summary  DBMS used to maintain, query large datasets.  Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security.  Levels of abstraction give data independence.  A DBMS typically has a layered architecture.  DB researchers, implementers, and administrators hold responsible jobs and are well-paid!  DBMS R&D is one of the broadest, most exciting areas in CS.  Databases are a multibillion industry !


Download ppt "Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Database Management Systems Chapter 1 Professor: Iluju Kiringa"

Similar presentations


Ads by Google