CENG 352 Database Management Systems İsmail Sengör Altıngövde URL:

Slides:



Advertisements
Similar presentations
1 Introduction to Database Systems CSE444 Instructor: Scott Vandenberg University of Washington Winter 2000.
Advertisements

Database Management Systems 1 Ramakrishnan & Gehrke Introduction to Database Systems Chapter 1 Instructor: Mirsad Hadzikadic.
Chapter 1 Instructor: Murali Mani Database Management Systems.
1 541: Database Systems S. Muthu Muthukrishnan. 2 Some Data Collections I Have Played With….  Wireless call detail records.  U. S. Patents.  AskJeeves.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Introduction to Database Systems Chapter 1 Instructor: Wang-Chien Lee
Database: A collection of related data [Elmasri]. A database represents some aspect of real world called “miniworld” [Elmasri] or “enterprise” [Ramakrishnan].
Introduction to Database Systems Ch. 1, Ch. 2 Mr. John Ortiz Dept. of Computer Science University of Texas at San Antonio.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Database Management Systems Chapter 1 Instructor: Deborah Strahman
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Introduction to Database Systems Chapter 1 Instructor: Johannes Gehrke
CMPT 354, Simon Fraser University, Fall 2008, Martin Ester 1 Database Systems I Introduction.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Database Management Systems Chapter 1.
1 Introduction to Database Systems Ref. Ramakrishnan & Gehrke Chapter 1.
Introduction to Databases Transparencies
1 CENG 302 Introduction to Database Management Systems Nihan Kesim Çiçekli URL:
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Database Management Systems Chapter 1 Instructor: Ethan Jackson
CSCD34 - Data Management Systems,- A. Vaisman1 CSC D34 - Data Management Systems Instructor: Alejandro Vaisman University of Toronto.
CSC343H – Introduction to Databases
CSC2012 Database Technology & CSC2513 Database Systems.
Database Management Systems 1 Introduction to Database Systems Instructor: Xintao Wu Ramakrishnan & Gehrke.
CS6530 Graduate-level Database Systems Prof. Feifei Li.
Introduction: Databases and Database Users
1 CS862 - Advanced Database Systems Sang H. Son
Database Management Systems 1 Ramakrishnan & Gehrke Introduction to Database Systems Chpt 1 Instructor: Xintao Wu.
Database Management Systems 1 Ramakrishnan & Gehrke Introduction to Database Systems Chpt 1 Instructor: Weichao Wang.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Database Management Systems Chapter 1 Overview of Database Systems.
DATABASE MANAGEMENT SYSTEMS TERM B. Tech II/IT II Semester UNIT-I PPT SLIDES Text Books: (1) DBMS by Raghu Ramakrishnan (2) DBMS by Sudarshan and.
INFS614, Dr. Brodsky, GMU1 Database Management Systems INFS 614 Instructor: Professor Alex Brodsky
ICS 321 Fall 2009 Introduction to Database Systems Asst. Prof. Lipyeow Lim Information & Computer Science Department University of Hawaii at Manoa.
Introduction to Database Systems1. 2 Basic Definitions Mini-world Some part of the real world about which data is stored in a database. Data Known facts.
1 What Is a DBMS?  A very large, integrated collection of data.  Models real-world enterprise.  Entities (e.g., students, courses)  Relationships (e.g.,
1 Chapter 1 Introduction to Databases Transparencies.
Database Management Systems, R. Ramakrishnan and J. Gehrke1 Introduction to Database Systems Chapter 1 Instructor: Johannes Gehrke
-ebru a.s ATTRIBUTE: Description of entities For employee entity number, name, deptno, age, adr, salary..etc are attributes. RECORD: Stores whole.
CpSc 8620: DBMS Design Introduction. 2 Attribution Materials and lecture notes in this course are adapted from various sources, including the authors.
1 CS462- Database Systems Sang H. Son
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Why do you learn database?? Chapter 0.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Database Management Systems Chapter 1.
BBM 371 – Data Management Lecture 3: Basic Concepts of DBMS Prepared by: Ebru Akçapınar Sezer, Gönenç Ercan.
CENG 352 Database Management Systems Nihan Kesim Çiçekli URL:
1 CENG 351 CENG 351 Introduction to Data Management and File Structures Department of Computer Engineering METU.
Database Management Systems 1 Ramakrishnan & Gehrke Introduction to Database Systems Chpt 1 Instructor: Xin Zhang.
1 Database Management Systems Introduction Instructor: Oliver Schulte
CENG 352 Database Management Systems İsmail Sengör Altıngövde URL:
CS522 Advanced database Systems Huiping Guo Department of Computer Science California State University, Los Angeles Course administration.
Introduction to Databases Transparencies
COP Introduction to Database Structures
DATABASE MANAGEMENT SYSTEMS
Introduction to Database Systems Chapter 1
DATABASE MANAGEMENT SYSTEMS
CENG 351 Introduction to Data Management and File Structures
Introduction to Database Systems
Instructor: Elke Rundensteiner
Database Management Systems Chapter 1
Introduction to Databases Transparencies
Introduction to Databases
Introduction to Databases
Introduction to Database Systems
Overview of Database Systems Chpt 1
Instructor: Murali Mani
Database Management Systems Chapter 1
Introduction to Databases
Database Management Systems
Database Management Systems CSE594
Introduction to Databases
Introduction to Databases Transparencies
Sang H. Son CS6750: Database Systems The slides for this text are organized into chapters. This lecture covers Chapter 1. Chapter 1: Introduction.
Introduction to Database Systems
Database Management Systems Chapter 1
Data Independence Applications insulated from how data is structured and stored. Logical data independence: Protection from changes in logical structure.
Introduction to Database Systems Chpt 1
Presentation transcript:

CENG 352 Database Management Systems İsmail Sengör Altıngövde URL:

CENG 352 Instructor: İsmail Sengör Altıngövde Office: A Lecture Hours: Mon 15:40-17:30 (BMB4); Thu. 16:40 (BMB4) Course Web page: Teaching Assistant: Abdullah Doğan –A-206, 2

Text Books and References 1.Raghu Ramakrishnan, Database Management Systems, McGraw Hill, 3 rd edition, 2003 (text book). 2.M. Kifer, A. Bernstein and P. Lewis, Database Systems: An Application Oriented Approach, 2nd edition, Addison Wesley, A. Silberschatz, H.F. Korth, S. Sudarshan, Database System Concepts, McGraw Hill, 6 th edition, R. Elmasri, S.B. Navathe, Fundamentals of Database Systems, 6 th edition, Addison-Wesley, H. Garcia-Molina, J. D. Ullman, J. Widom, Database Systems The Complete Book, Prentice Hall,

Required Background Material from CENG351 (Entity- Relationship modeling, relational data model, relational algebra, B+-trees and hash-based indexing, algorithms for sorting/merging data files) Intermediate programming skills –C/C++/Java

Course Outline Review (Relational Model) –E/R modeling –SQL Database Application Development and Internet Applications –Embedded SQL, cursors, dynamic SQL, JDBC classes and interfaces, stored procedures. –http protocol, html/xml documents, xml dtd, html forms, javascript, style sheets, cgi, application servers, servlets, java server pages.

Course Outline (cont.) Relational Database Design –Functional Dependency Theory Attribute closure, FD set closure, 3NF, BCNF, decomposition lossless-join decomposition, dependency- preserving decomposition. –Normalization of Relations Decomposition into 3NF, BCNF. –Other Types of Dependencies Multivalued dependencies, join dependencies, 4NF, 5NF.

Course Outline (cont.) Query Processing –Algorithms for relational operators (sorting, indexing (hashing, tree-indexing), nested-loops join, sort-merge join, hash join), query evaluation plans. Query Optimization –Translation of SQL queries to RA, estimating the cost of an execution plan, equivalences of RA expressions, enumeration of alternative plans.

Course Outline (cont.) Transaction Processing –Properties of transactions, concurrent execution, schedules, recoverability. Concurrency Control –Serializability theory, two-phase locking, deadlock problem, timestamp-ordering, optimistic concurrency control. Crash Recovery –Logging, checkpointing, recovering from a system crash, example systems like ARIES.

Grading Midterm 25 % Final Exam 35 % Assignments & Quizzes 20 % Project 20 % Attendance: –Once in every week –50% attendance is mandatory to get the final exam 9

Grading Policies Policy on final exam: –Attendance should be > 50%, AND –The weighted average of the assignments & quizzes, OR the midterm grade should be at least 30 Policy on missed midterm: –make-up exam in the following week (only for the legal excuses and if informed beforehand) Lateness policy: –No late submission for the written assignments. –For the programming homework(s): can be late up to 3 days. Afterwards, will be penalized by 10%* day 2 –All assignments are to be your own work. 10

Projects Projects in groups of two. –Determine your groups till the end of add/drops, i.e., Oct 3 Friday, 24:00 3 stages: –Proposal & Preliminary E/R Report –Design Report –Final Report + Demo 11

Questions?

CENG Files Data is not scattered hither and thither on disk. Instead, it is organized into files. Files are organized into records. Records are organized into fields.

File-Based Systems Collection of application programs that perform services for the end users (e.g. reports). Each program defines and manages its own data. 14

File-Based Processing Pearson Education ©

Limitations of File-Based Approach Separation and isolation of data –Each program maintains its own set of data. –Users of one program may be unaware of potentially useful data held by other programs. Duplication of data –Same data is held by different programs. –Wasted space and potentially different values and/or different formats for the same item. 16

Limitations of File-Based Approach Data dependence –File structure is defined in the program code. Incompatible file formats –Programs are written in different languages, and so cannot easily access each other’s files. Fixed Queries/Proliferation of application programs –Programs are written to satisfy particular functions. –Any new requirement needs a new program. 17

Database Approach Arose because: –Definition of data was embedded in application programs, rather than being stored separately and independently. –No control over access and manipulation of data beyond that imposed by application programs. Result: –the database and Database Management System (DBMS). 18

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke19 Database Management Systems Chapter 1

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke20 What Is a DBMS?  A very large, integrated collection of data.  Models real-world enterprise.  Entities (e.g., students, courses)  Relationships (e.g., Madonna is taking CS564)  A Database Management System (DBMS) is a software package designed to store and manage databases.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke21 Typical DBMS Functionality  Define a database : in terms of data types, structures and constraints  Construct or load the database on a secondary storage medium  Manipulating the database : querying, generating reports, insertions, deletions and modifications to its content  Concurrent Processing and Sharing by a set of users and programs – yet, keeping all data valid and consistent

Database Management System (DBMS) Pearson Education ©

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke23 Why Use a DBMS?  Data independence and efficient access.  Reduced application development time.  Data integrity and security.  Uniform data administration.  Concurrent access, recovery from crashes.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke24 Why Study Databases??  Shift from computation to information  at the “low end”: scramble to webspace (a mess!)  at the “high end”: scientific applications  Datasets increasing in diversity and volume.  Digital libraries, interactive video, Human Genome project, EOS project ... need for DBMS exploding  DBMS encompasses most of CS  OS, languages, theory, AI, multimedia, logic ?

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke25 Data Models  A data model is a collection of concepts for describing data.  A schema is a description of a particular collection of data, using a given data model.  The relational model of data is the most widely used model today.  Main concept: relation, basically a table with rows and columns.  Every relation has a schema, which describes the columns, or fields.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke26 Instance of Students Relation Students( sid: string, name: string, login: string, age: integer, gpa: real ) sidnameloginagegpa

Data Models  Object-Based Data Models  Entity-Relationship  Semantic  Functional  Object-Oriented.  Record-Based Data Models  Relational Data Model  Network Data Model  Hierarchical Data Model.  Physical Data Models Pearson Education ©

Relational Data Model Pearson Education ©

Network Data Model Pearson Education ©

Hierarchical Data Model Pearson Education ©

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke31 Levels of Abstraction  Single conceptual (logical) schema and physical schema, many views * Schemas are defined using DDL; data is modified/queried using DML. Physical Schema Conceptual Schema View 1View 2View 3 Conceptual schema defines logical structure Physical schema describes the files and indexes used External shemata (views) describe how users see the data.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke32 Example: University Database  Conceptual schema:  Students(sid: string, name: string, login: string, age: integer, gpa:real)  Courses(cid: string, cname:string, credits:integer)  Enrolled(sid:string, cid:string, grade:string)  Physical schema:  Relations stored as unordered files.  Index on first column of Students.  External Schema (View):  Course_info(cid:string,enrollment:integer)

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke33 Data Independence *  Applications insulated from how data is structured and stored.  Logical data independence : Protection from changes in logical structure of data.  Physical data independence : Protection from changes in physical structure of data. * One of the most important benefits of using a DBMS!

Data Independence Logical Data Independence –Refers to immunity of external schemas to changes in conceptual schema. –Conceptual schema changes (e.g. addition/removal of entities). –Should not require changes to external schema or rewrites of application programs. 34

Data Independence Physical Data Independence –Refers to immunity of conceptual schema to changes in the internal schema. –Physical schema changes (e.g. using different file organizations, storage structures/devices). –Should not require change to conceptual or external schemas. 35

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke36 Concurrency Control  Concurrent execution of user programs is essential for good DBMS performance.  Because disk accesses are frequent, and relatively slow, it is important to keep the cpu humming by working on several user programs concurrently.  Interleaving actions of different user programs can lead to inconsistency: e.g., check is cleared while account balance is being computed.  DBMS ensures such problems don’t arise: users can pretend they are using a single-user system.

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke37 Transaction: An Execution of a DB Program  Key concept is transaction, which is an atomic sequence of database actions (reads/writes).  Each transaction, executed completely, must leave the DB in a consistent state if DB is consistent when the transaction begins.  Users can specify some simple integrity constraints on the data, and the DBMS will enforce these constraints.  Beyond this, the DBMS does not really understand the semantics of the data. (e.g., it does not understand how the interest on a bank account is computed).  Thus, ensuring that a transaction (run alone) preserves consistency is ultimately the user’s responsibility!

Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke38 Structure of a DBMS  A typical DBMS has a layered architecture.  The figure does not show the concurrency control and recovery components.  This is one of several possible architectures; each system has its own variations. Query Optimization and Execution Relational Operators Files and Access Methods Buffer Management Disk Space Management DB These layers must consider concurrency control and recovery

Database Languages Data Definition Language (DDL) –Allows the DBA or user to describe and name entities, attributes, and relationships required for the application –plus any associated integrity and security constraints. Data Manipulation Language (DML) –Provides basic data manipulation operations on data held in the database. 39

System Catalog Repository of information (metadata) describing the data in the database. One of the fundamental components of DBMS. Typically stores: –names, types, and sizes of data items; –constraints on the data; –names of authorized users; –data items accessible by a user and the type of access; –usage statistics. 40

The DBMS Marketplace Relational DBMS companies – Oracle, Sybase – are among the largest software companies in the world. IBM offers its relational DB2 system. Microsoft offers SQL-Server, plus Microsoft Access for the cheap DBMS on the desktop, answered by “lite” systems from other competitors. Relational companies also challenged by “object-oriented DB” companies. But countered with “object-relational” systems, which retain the relational core while allowing type extension as in OO systems. 41

Databases make these folks happy... End users and DBMS vendors DB application programmers – E.g., smart webmasters Database administrator (DBA) – Designs logical /physical schemas – Handles security and authorization – Data availability, crash recovery – Database tuning as needs evolve Must understand how a DBMS works!

Summary DBMS used to maintain, query large datasets. Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security. Levels of abstraction give data independence. A DBMS typically has a layered architecture. DBAs hold responsible jobs and are well-paid! DBMS R&D is one of the broadest, most exciting areas in CS.