Published by Erick Beasley. Modified over 8 years ago.
1
CENG 352 Database Management Systems İsmail Sengör Altıngövde email: altingovde@ceng.metu.edu.tr URL: http://www.ceng.metu.edu.tr/~altingovde
2
CENG 352
Instructor: İsmail Sengör Altıngövde
Office: A-203
Email: altingovde@ceng.metu.edu.tr
Lecture Hours: Thu. 16:40 (BMB4); Fri. 15:40, 16:40 (BMB4)
Course Web page: http://cow.ceng.metu.edu.tr
Teaching Assistant: Abdullah Doğan, A-206, adogan@ceng.metu.edu.tr
3
Text Books and References
1. Raghu Ramakrishnan, Database Management Systems, McGraw Hill, 3rd edition, 2003 (textbook).
2. M. Kifer, A. Bernstein and P. Lewis, Database Systems: An Application Oriented Approach, Addison Wesley, 2nd edition, 2005.
3. A. Silberschatz, H.F. Korth, S. Sudarshan, Database System Concepts, McGraw Hill, 6th edition, 2010.
4. R. Elmasri, S.B. Navathe, Fundamentals of Database Systems, Addison-Wesley, 6th edition, 2011.
5. H. Garcia-Molina, J. D. Ullman, J. Widom, Database Systems: The Complete Book, Prentice Hall, 2002.
4
Required Background
Material from CENG351 (Entity-Relationship modeling, relational data model, relational algebra, B+-trees and hash-based indexing, algorithms for sorting/merging data files)
Intermediate programming skills
– C/C++/Java
5
Course Outline
Review (Relational Model)
– E/R modeling
– SQL
Database Application Development and Internet Applications
– Embedded SQL, cursors, dynamic SQL, JDBC classes and interfaces, stored procedures.
– HTTP protocol, HTML/XML documents, XML DTD, HTML forms, JavaScript, style sheets, CGI, application servers, servlets, JavaServer Pages.
6
Course Outline (cont.)
Relational Database Design
– Functional Dependency Theory: attribute closure, FD set closure, 3NF, BCNF, decomposition, lossless-join decomposition, dependency-preserving decomposition.
– Normalization of Relations: decomposition into 3NF, BCNF.
– Other Types of Dependencies: multivalued dependencies, join dependencies, 4NF, 5NF.
7
Course Outline (cont.) Query Processing –Algorithms for relational operators (sorting, indexing (hashing, tree-indexing), nested-loops join, sort-merge join, hash join), query evaluation plans. Query Optimization –Translation of SQL queries to RA, estimating the cost of an execution plan, equivalences of RA expressions, enumeration of alternative plans.
8
Course Outline (cont.) Transaction Processing –Properties of transactions, concurrent execution, schedules, recoverability. Concurrency Control –Serializability theory, two-phase locking, deadlock problem, timestamp-ordering, optimistic concurrency control. Crash Recovery –Logging, checkpointing, recovering from a system crash, example systems like ARIES.
9
Grading
Midterm: 25%
Final Exam: 35%
In-class written assignments: 20%
Project: 20%
Attendance:
– Once every week
– At least 30% attendance is mandatory to take the final exam
10
Grading Policies
Policy on final exam:
– Attendance should be at least 30%, AND
– The average grade of the in-class assignments should be at least 30
Policy on missed midterm:
– Make-up exam in the following week (only for legally valid excuses, and only if informed beforehand)
11
Projects
Projects are done in groups of two.
– Form your groups by the end of the add/drop period, i.e., Friday, Oct 16, 24:00
– and e-mail them to your TA
3 stages:
– Proposal & Preliminary E/R Report
– Design Report
– Final Report + Demo
12
Questions?
13
Files (CENG 351)
Data is not scattered hither and thither on disk. Instead, it is organized into files.
Files are organized into records.
Records are organized into fields.
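The file, record, field hierarchy can be sketched in a few lines of Python. This is only an illustration: the fixed-length layout below (5-byte sid, 10-byte name, 4-byte integer age) and the helper names `pack_record` and `unpack_record` are assumptions made for the example, not something the slides specify.

```python
import struct

# Hypothetical fixed-length record layout (an assumption for illustration):
# 5-byte sid, 10-byte name, 4-byte little-endian integer age, no padding.
RECORD = struct.Struct("<5s10si")

def pack_record(sid, name, age):
    """Encode one record (three fields) as a fixed-size byte string."""
    return RECORD.pack(sid.encode("ascii"), name.encode("ascii"), age)

def unpack_record(raw):
    """Decode one fixed-size byte string back into its three fields."""
    sid, name, age = RECORD.unpack(raw)
    return (sid.decode("ascii").rstrip("\x00"),
            name.decode("ascii").rstrip("\x00"),
            age)

# A "file" is just a sequence of records laid out back to back.
file_bytes = pack_record("53666", "Jones", 18) + pack_record("53688", "Smith", 18)

# Reading the file means slicing it into record-sized chunks.
records = [unpack_record(file_bytes[i:i + RECORD.size])
           for i in range(0, len(file_bytes), RECORD.size)]
```

Because every record has the same size, the i-th record starts at byte offset `i * RECORD.size`, which is what makes direct access to a record possible in this layout.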
14
File-Based Systems
Collection of application programs that perform services for the end users (e.g. reports).
Each program defines and manages its own data.
15
File-Based Processing [figure: Pearson Education © 2009]
16
Limitations of File-Based Approach
Separation and isolation of data
– Each program maintains its own set of data.
– Users of one program may be unaware of potentially useful data held by other programs.
Duplication of data
– Same data is held by different programs.
– Wasted space and potentially different values and/or different formats for the same item.
17
Limitations of File-Based Approach
Data dependence
– File structure is defined in the program code.
Incompatible file formats
– Programs are written in different languages, and so cannot easily access each other’s files.
Fixed queries/proliferation of application programs
– Programs are written to satisfy particular functions.
– Any new requirement needs a new program.
18
Database Approach
Arose because:
– Definition of data was embedded in application programs, rather than being stored separately and independently.
– No control over access and manipulation of data beyond that imposed by application programs.
Result:
– The database and Database Management System (DBMS).
19
Database Management Systems, Chapter 1
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke
20
What Is a DBMS?
A database is a very large, integrated collection of data.
It models a real-world enterprise:
– Entities (e.g., students, courses)
– Relationships (e.g., Madonna is taking CS564)
A Database Management System (DBMS) is a software package designed to store and manage databases.
21
Typical DBMS Functionality
Define a database: in terms of data types, structures and constraints
Construct or load the database on a secondary storage medium
Manipulate the database: querying, generating reports, insertions, deletions and modifications to its content
Process concurrently and share among a set of users and programs, while keeping all data valid and consistent
22
Database Management System (DBMS) [figure: Pearson Education © 2009]
23
Why Use a DBMS?
Data independence and efficient access.
Reduced application development time.
Data integrity and security.
Uniform data administration.
Concurrent access, recovery from crashes.
24
Why Study Databases?
Shift from computation to information
– at the “low end”: scramble to webspace (a mess!)
– at the “high end”: scientific applications
Datasets increasing in diversity and volume.
– Digital libraries, interactive video, Human Genome project, EOS project
– ... need for DBMS exploding
DBMS encompasses most of CS
– OS, languages, theory, AI, multimedia, logic
25
Data Models
A data model is a collection of concepts for describing data.
A schema is a description of a particular collection of data, using a given data model.
The relational model of data is the most widely used model today.
– Main concept: relation, basically a table with rows and columns.
– Every relation has a schema, which describes the columns, or fields.
26
Instance of Students Relation
Students(sid: string, name: string, login: string, age: integer, gpa: real)

sid    name   login       age  gpa
53666  Jones  jones@cs    18   3.4
53688  Smith  smith@ee    18   3.2
53650  Smith  smith@math  19   3.8
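As a rough sketch, the Students schema and the three rows of this instance can be loaded into an in-memory SQLite database from Python. SQLite and the type mapping (string to TEXT, real to REAL) are choices made for this example; the slides do not name a particular system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema of the relation: one column per field, each with a type.
conn.execute(
    "CREATE TABLE Students ("
    " sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)

# The instance: the three tuples shown on the slide.
rows = [
    ("53666", "Jones", "jones@cs", 18, 3.4),
    ("53688", "Smith", "smith@ee", 18, 3.2),
    ("53650", "Smith", "smith@math", 19, 3.8),
]
conn.executemany("INSERT INTO Students VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

# A declarative query: which students are named Smith?
smiths = conn.execute(
    "SELECT sid, gpa FROM Students WHERE name = ? ORDER BY sid", ("Smith",)
).fetchall()
```

The schema is declared once, and the instance is simply the set of rows currently stored under it.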
27
Data Models
Object-Based Data Models
– Entity-Relationship
– Semantic
– Functional
– Object-Oriented
Record-Based Data Models
– Relational Data Model
– Network Data Model
– Hierarchical Data Model
Physical Data Models
(Pearson Education © 2009)
28
Relational Data Model [figure: Pearson Education © 2009]
29
Network Data Model [figure: Pearson Education © 2009]
30
Hierarchical Data Model [figure: Pearson Education © 2009]
31
Levels of Abstraction
Many views, a single conceptual (logical) schema, and a single physical schema.
– External schemata (views) describe how users see the data.
– Conceptual schema defines the logical structure.
– Physical schema describes the files and indexes used.
Schemas are defined using DDL; data is modified/queried using DML.
[Figure: View 1, View 2 and View 3 sit on top of the Conceptual Schema, which sits on top of the Physical Schema]
32
Example: University Database
Conceptual schema:
– Students(sid: string, name: string, login: string, age: integer, gpa: real)
– Courses(cid: string, cname: string, credits: integer)
– Enrolled(sid: string, cid: string, grade: string)
Physical schema:
– Relations stored as unordered files.
– Index on first column of Students.
External Schema (View):
– Course_info(cid: string, enrollment: integer)
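One way to picture the external schema is as a SQL view defined over the conceptual schema. The sketch below uses SQLite from Python; the view definition (counting Enrolled rows per course) is an assumption about what `enrollment` means, and the sample courses and grades are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Conceptual schema (only the two relations the view needs).
CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

-- External schema: a view computed from the conceptual schema.
-- Assumption: enrollment = number of Enrolled rows for the course.
CREATE VIEW Course_info AS
  SELECT c.cid AS cid, COUNT(e.sid) AS enrollment
  FROM Courses c LEFT JOIN Enrolled e ON e.cid = c.cid
  GROUP BY c.cid;
""")

# Hypothetical sample data, not taken from the slides.
conn.executemany("INSERT INTO Courses VALUES (?, ?, ?)",
                 [("CS564", "Databases", 3), ("CS101", "Intro", 4)])
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)",
                 [("53666", "CS564", "A"), ("53688", "CS564", "B")])
conn.commit()

# Users of the view see only (cid, enrollment), never individual grades.
course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
```

The view stores no data of its own; it is recomputed from Courses and Enrolled whenever it is queried.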
33
Data Independence
Applications are insulated from how data is structured and stored.
Logical data independence: protection from changes in the logical structure of data.
Physical data independence: protection from changes in the physical structure of data.
One of the most important benefits of using a DBMS!
34
Data Independence
Logical Data Independence
– Refers to immunity of external schemas to changes in the conceptual schema.
– Conceptual schema changes (e.g. addition/removal of entities) should not require changes to external schemas or rewrites of application programs.
35
Data Independence
Physical Data Independence
– Refers to immunity of the conceptual schema to changes in the internal schema.
– Physical schema changes (e.g. using different file organizations, storage structures/devices) should not require changes to the conceptual or external schemas.
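Physical data independence can be illustrated with a small experiment: run a query, change the physical schema by adding an index, and run the same query text again. The sketch below uses SQLite from Python; the index name `idx_students_sid` is hypothetical, and the data is the Students instance shown earlier in the deck.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    " sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [("53666", "Jones", "jones@cs", 18, 3.4),
     ("53688", "Smith", "smith@ee", 18, 3.2),
     ("53650", "Smith", "smith@math", 19, 3.8)],
)
conn.commit()

# The application's query is written against the conceptual schema only.
QUERY = "SELECT name, gpa FROM Students WHERE sid = ?"

before = conn.execute(QUERY, ("53688",)).fetchall()

# Physical schema change: add an index on the first column of Students.
# (idx_students_sid is a hypothetical name chosen for this example.)
conn.execute("CREATE INDEX idx_students_sid ON Students (sid)")

# Same query text, same answer; only the access path may differ.
after = conn.execute(QUERY, ("53688",)).fetchall()
```

The application code did not change at all, even though the DBMS may now answer the query through the index instead of a full scan.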
36
Concurrency Control
Concurrent execution of user programs is essential for good DBMS performance.
– Because disk accesses are frequent, and relatively slow, it is important to keep the CPU humming by working on several user programs concurrently.
Interleaving actions of different user programs can lead to inconsistency: e.g., a check is cleared while the account balance is being computed.
DBMS ensures such problems don’t arise: users can pretend they are using a single-user system.
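A toy simulation shows how interleaving can go wrong. It models a lost update rather than the check-clearing case mentioned on the slide, and the schedule format (transaction name, action, amount) and the `run` helper are inventions for this sketch, not how a DBMS represents schedules.

```python
def run(schedule, balance=100):
    """Execute a schedule of (txn, action, amount) steps on one account.

    'read' copies the stored balance into the transaction's private
    workspace; 'write' stores workspace value + amount back.
    """
    db = {"balance": balance}
    workspace = {}
    for txn, action, amount in schedule:
        if action == "read":
            workspace[txn] = db["balance"]
        elif action == "write":
            db["balance"] = workspace[txn] + amount
    return db["balance"]

# T1 deposits 50, T2 withdraws 30. Run one after the other: 100 + 50 - 30.
serial = [("T1", "read", 0), ("T1", "write", 50),
          ("T2", "read", 0), ("T2", "write", -30)]

# Interleaved: both read the old balance, so T1's deposit is overwritten.
interleaved = [("T1", "read", 0), ("T2", "read", 0),
               ("T1", "write", 50), ("T2", "write", -30)]

serial_result = run(serial)
interleaved_result = run(interleaved)
```

Here the serial schedule ends at 120, while the interleaved one ends at 70: T1's deposit has vanished. Concurrency control exists to rule out interleavings of this kind.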
37
Transaction: An Execution of a DB Program
Key concept is transaction, which is an atomic sequence of database actions (reads/writes).
Each transaction, executed completely, must leave the DB in a consistent state if the DB is consistent when the transaction begins.
– Users can specify some simple integrity constraints on the data, and the DBMS will enforce these constraints.
– Beyond this, the DBMS does not really understand the semantics of the data (e.g., it does not understand how the interest on a bank account is computed).
– Thus, ensuring that a transaction (run alone) preserves consistency is ultimately the user’s responsibility!
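Atomicity and a simple integrity constraint can be sketched with Python's `sqlite3` module. The Accounts table, the `CHECK (balance >= 0)` constraint and the `transfer` function are assumptions made for this example. The `with conn:` block commits if its body succeeds and rolls back if an exception is raised.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A simple integrity constraint the DBMS can enforce: no negative balances.
conn.execute(
    "CREATE TABLE Accounts ("
    " owner TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 20)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic transaction."""
    try:
        with conn:  # commit on success, roll back on any exception
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE owner = ?",
                (amount, dst))
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE owner = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint; the credit made
        # earlier in the same transaction has been rolled back too.
        return False

def balances(conn):
    return dict(conn.execute("SELECT owner, balance FROM Accounts"))

ok = transfer(conn, "alice", "bob", 30)       # fits: both updates commit
failed = transfer(conn, "alice", "bob", 500)  # overdraws: nothing changes
final = balances(conn)
```

The DBMS enforces the declared constraint, but it has no idea whether 30 was the right amount to move; that part of consistency stays with whoever wrote the transaction.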
38
Structure of a DBMS
A typical DBMS has a layered architecture.
The figure does not show the concurrency control and recovery components.
This is one of several possible architectures; each system has its own variations.
Layers, from top to bottom (these layers must consider concurrency control and recovery):
– Query Optimization and Execution
– Relational Operators
– Files and Access Methods
– Buffer Management
– Disk Space Management
– DB
39
Database Languages
Data Definition Language (DDL)
– Allows the DBA or user to describe and name entities, attributes, and relationships required for the application,
– plus any associated integrity and security constraints.
Data Manipulation Language (DML)
– Provides basic data manipulation operations on data held in the database.
40
System Catalog
Repository of information (metadata) describing the data in the database.
One of the fundamental components of a DBMS.
Typically stores:
– names, types, and sizes of data items;
– constraints on the data;
– names of authorized users;
– data items accessible by a user and the type of access;
– usage statistics.
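For a concrete look at a catalog, SQLite exposes part of its metadata through the `sqlite_master` table and the `PRAGMA table_info` command. This is one system's mechanism, used here only as an example; SQLite's catalog covers names and types, not every item in the list above (it has no notion of authorized users, for instance).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    " sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)

# The catalog is itself queryable: which tables exist?
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Per-table metadata: each row is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute(
    "PRAGMA table_info(Students)")]
```

The point is that the description of the data is stored as data, so tools and the DBMS itself can look it up at run time.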
41
Components of Data-Intensive Systems
Three separate types of functionality:
– Data management
– Application logic
– Presentation
The system architecture determines whether these three components reside on a single system (“tier”) or are distributed across several tiers.
42
Single-Tier Architectures
All functionality combined into a single tier, usually on a mainframe
– User access through dumb terminals
Advantages:
– Easy maintenance and administration
Disadvantages:
– Today, users expect graphical user interfaces.
– Centralized computation of all of these interfaces is too much for a central system
[Figure: client, application logic and DBMS in a single tier]
43
Client-Server Architectures
Work division: Thin client
– Client implements only the graphical user interface
– Server implements business logic and data management
Work division: Thick client
– Client implements both the graphical user interface and the business logic
– Server implements data management
[Figure: thin clients connect over the network to a server running application logic and DBMS; thick clients run the application logic themselves and connect over the network to the DBMS]
44
Client-Server Architectures (Contd.)
Disadvantages of thick clients
– No central place to update the business logic
– Security issues: server needs to trust clients
  Access control and authentication need to be managed at the server
  Clients need to leave the server database in a consistent state
  One possibility: encapsulate all database access into stored procedures
– Does not scale to more than several hundred clients
  Large data transfer between server and client
  More than one server creates a problem: x clients, y servers: x*y connections
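The x*y figure is easy to check with a little arithmetic. In the sketch below, the two-tier count comes from the slide; the comparison count for a middle tier (x + y) is an assumption that holds only if every client connects to a single application server, which in turn holds one connection to each database server.

```python
def two_tier_connections(clients, servers):
    """Every client connects directly to every server: x * y."""
    return clients * servers

def middle_tier_connections(clients, servers):
    """Assumption: one application server sits in between, so each client
    and each database server needs only one connection to it: x + y."""
    return clients + servers

direct = two_tier_connections(200, 3)
via_middle = middle_tier_connections(200, 3)
```

With 200 clients and 3 servers, direct connections number 600, against 203 through a single middle tier under the stated assumption.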
45
The Three-Tier Architecture
– Presentation tier: Client Program (Web Browser)
– Middle tier: Application Server
– Data management tier: Database System
46
The Three Layers
Presentation tier
– Primary interface to the user
– Needs to adapt to different display devices (PC, PDA, cell phone, voice access?)
Middle tier
– Implements business logic (implements complex actions, maintains state between different steps of a workflow)
– Accesses different data management systems
Data management tier
– One or more standard database management systems
47
Example 1: Airline reservations
Build a system for making airline reservations
What is done in the different tiers?
Database System
– Airline info, available seats, customer info, etc.
Application Server
– Logic to make reservations, cancel reservations, add new airlines, etc.
Client Program
– Log in different users, display forms and human-readable output
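The division of work between the tiers can be sketched in miniature. Everything here is hypothetical: the class names `DataTier` and `ReservationService`, the `render` function, the Flights table and the flight code `TK1` are invented for the example, and all three tiers run in one Python process, whereas a real system would place them on separate machines.

```python
import sqlite3

class DataTier:
    """Data management tier: stores flights and available seats."""
    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE Flights (flight TEXT PRIMARY KEY, seats_left INTEGER)")
        self.conn.execute("INSERT INTO Flights VALUES ('TK1', 1)")
        self.conn.commit()

    def seats_left(self, flight):
        row = self.conn.execute(
            "SELECT seats_left FROM Flights WHERE flight = ?", (flight,)
        ).fetchone()
        return row[0] if row else None

    def take_seat(self, flight):
        # Decrement only if a seat is still free; rowcount tells us if it was.
        with self.conn:
            cur = self.conn.execute(
                "UPDATE Flights SET seats_left = seats_left - 1"
                " WHERE flight = ? AND seats_left > 0", (flight,))
        return cur.rowcount == 1

class ReservationService:
    """Middle tier: business logic for making a reservation."""
    def __init__(self, data):
        self.data = data

    def reserve(self, flight):
        if self.data.seats_left(flight) is None:
            return "unknown flight"
        return "confirmed" if self.data.take_seat(flight) else "sold out"

def render(user, flight, status):
    """Presentation tier: turn a result into human-readable output."""
    return f"{user}: {flight} {status}"

service = ReservationService(DataTier())
first = render("ann", "TK1", service.reserve("TK1"))
second = render("bob", "TK1", service.reserve("TK1"))
```

The presentation code never touches SQL, and the data tier knows nothing about users or messages, so either can be replaced without rewriting the other.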
48
Example 2: Course Enrollment
Build a system that students can use to enroll in courses
Database System
– Student info, course info, instructor info, course availability, pre-requisites, etc.
Application Server
– Logic to add a course, drop a course, create a new course, etc.
Client Program
– Log in different users (students, staff, faculty), display forms and human-readable output
49
Technologies
Client Program (Web Browser)
– HTML, JavaScript, XSLT
Application Server (Tomcat, Apache)
– JSP, Servlets, Cookies, CGI
Database System (DB2)
– XML, Stored Procedures
50
Advantages of the Three-Tier Architecture
Heterogeneous systems
– Tiers can be independently maintained, modified, and replaced
Thin clients
– Only presentation layer at clients (web browsers)
Integrated data access
– Several database systems can be handled transparently at the middle tier
– Central management of connections
Scalability
– Replication at middle tier permits scalability of business logic
Software development
– Code for business logic is centralized
– Interaction between tiers through well-defined APIs: can reuse standard components at each tier
51
The DBMS Marketplace
Relational DBMS companies (Oracle, Sybase) are among the largest software companies in the world.
IBM offers its relational DB2 system.
Microsoft offers SQL Server, plus Microsoft Access for the cheap DBMS on the desktop, answered by “lite” systems from other competitors.
Relational companies are also challenged by “object-oriented DB” companies, but have countered with “object-relational” systems, which retain the relational core while allowing type extension as in OO systems.
52
Databases make these folks happy...
End users and DBMS vendors
DB application programmers
– E.g., smart webmasters
Database administrator (DBA)
– Designs logical/physical schemas
– Handles security and authorization
– Data availability, crash recovery
– Database tuning as needs evolve
Must understand how a DBMS works!
53
Summary
DBMS used to maintain, query large datasets.
Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security.
Levels of abstraction give data independence.
A DBMS typically has a layered architecture.
DBAs hold responsible jobs and are well-paid!
DBMS R&D is one of the broadest, most exciting areas in CS.
54
Data Independence and the Three-Level Architecture [figure: Pearson Education © 2009]