CPSC-608 Database Systems
Fall 2011
Instructor: Jianer Chen
Office: HRBB 315C
Phone:
Notes 1
A database system sits between the users and a large volume of data.
Where do we store the data?
  in secondary storage (disks)
How are the data organized?
  in tables (relations)
How do we define relations?
  database administrator: DDL language
How do we manipulate relations?
  database programmer: DML (query) language
Database management system (DBMS): simply translates database programs into machine programs.
Then what is the difference between a DBMS and a programming language compiler?
A DBMS differs from a programming language compiler in six ways (the DBMS components responsible are given in parentheses):
1. it has to deal with data stored in hierarchical memory structures (file manager, buffer manager, main memory buffers);
2. it has to support efficient manipulation of data in hierarchical memory structures (index/file manager);
3. it needs to translate the input database program into an internal representation (DML compiler, DDL compiler);
4. it needs to produce efficient internal code that deals with data in hierarchical memory structures (query execution engine);
5. it needs to be consistent (transaction manager, concurrency control, lock table);
6. it needs to be reliable (logging & recovery).
[Figure: DBMS architecture. Secondary storage (disks) holds the tables (relations); the database administrator uses the DDL language and the database programmer uses the DML (query) language. DBMS components: file manager, buffer manager, main memory buffers, index/file manager, DML compiler, DDL compiler, query execution engine, transaction manager, concurrency control, lock table, logging & recovery. Parts of the figure are labeled "undergraduate database" and "graduate database".]
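The split between defining relations (DDL) and manipulating them (DML) can be illustrated with a small sketch. It assumes Python's standard `sqlite3` module as a stand-in DBMS and borrows the Beers relation from the terminology slide later in these notes; the function name `run_demo` is hypothetical.

```python
import sqlite3

def run_demo():
    # In-memory database: nothing is written to disk in this sketch.
    conn = sqlite3.connect(":memory:")

    # DDL: define the relation (the database administrator's task).
    conn.execute("CREATE TABLE Beers (name TEXT PRIMARY KEY, manf TEXT)")

    # DML: insert and query tuples (the database programmer's task).
    conn.executemany(
        "INSERT INTO Beers (name, manf) VALUES (?, ?)",
        [("Winterbrew", "Pete's"), ("Bud Lite", "Anheuser-Busch")],
    )
    rows = conn.execute(
        "SELECT name FROM Beers WHERE manf = ?", ("Pete's",)
    ).fetchall()
    conn.close()
    return rows

print(run_demo())  # [('Winterbrew',)]
```

The DBMS components listed above (buffer manager, query execution engine, and so on) are hidden behind these two kinds of statements.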
A Quick Review on Undergraduate Database
We have agreed: information (i.e., the database) is organized in tables (i.e., relations) stored on disks.
● How is information represented by relations?
● What are “good” table structures?
● What operations can we apply to tables?
How is information represented by relations?
Information consists of
● objects (i.e., entities), plus
● connections (i.e., relationships) among entities.
Thus, information can be given by entity/relationship (E/R) diagrams.
Read: Sections on The Entity/Relationship Model
How are E/R diagrams converted into relations (i.e., tables)?
Fairly straightforward:
● an entity set is given by a table where each column corresponds to a property (i.e., attribute) of the entities;
● a relationship among entities is given by a table whose columns correspond to the identifications of the related entities (that now become attributes).
Read: Section 4.5, From E/R Diagrams to Relational Designs
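The two conversion rules can be sketched as follows. This is only an illustration: the entity set Bars and the relationship Sells are hypothetical additions (only Beers appears in these notes), and Python's `sqlite3` module is assumed as the DBMS.

```python
import sqlite3

# Each entity set becomes a table of its attributes; the relationship
# becomes a table whose columns are the identifiers of the related entities.
SCHEMA = """
CREATE TABLE Beers (name TEXT PRIMARY KEY, manf TEXT);  -- entity set
CREATE TABLE Bars  (name TEXT PRIMARY KEY, addr TEXT);  -- entity set (hypothetical)
CREATE TABLE Sells (                                     -- relationship (hypothetical)
    bar  TEXT REFERENCES Bars(name),
    beer TEXT REFERENCES Beers(name),
    PRIMARY KEY (bar, beer)
);
"""

def build():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def columns(conn, table):
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

conn = build()
print(columns(conn, "Sells"))  # ['bar', 'beer']
```

Note that the Sells table stores no beer or bar properties of its own, only the keys of the entities it connects.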
What are “good” table structures?
● have no inconsistency;
● avoid redundancy;
● easy to use.
Typical questions:
● Should we split a table when it is too fat?
● Should we merge tables when they are too thin?
Read: Chapter 3, Design Theory for Relational Databases
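A small sketch of the "too fat" question, under assumed data: the relation (drinker, addr, beer) and its tuples are hypothetical. Each drinker's address is repeated once per beer, which is the redundancy a split removes; joining the two thin tables on the drinker's name recovers the original.

```python
# Hypothetical "fat" relation: (drinker, addr, beer).
fat = {
    ("Ann", "Elm St", "Bud Lite"),
    ("Ann", "Elm St", "Winterbrew"),
    ("Bob", "Oak St", "Bud Lite"),
}

def project(rel, idxs):
    """Keep only the columns at the given positions (set semantics)."""
    return {tuple(t[i] for i in idxs) for t in rel}

# Split into two thinner tables.
addresses = project(fat, (0, 1))  # (drinker, addr): each address stored once
likes = project(fat, (0, 2))      # (drinker, beer)

# Merge again by joining on the drinker's name.
rejoined = {(n, a, b) for (n, a) in addresses for (m, b) in likes if n == m}

print(len(addresses))   # 2
print(rejoined == fat)  # True
```

The rejoin is exact here only because each drinker has a single address in this data; Chapter 3 treats the general conditions.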
Some terminology: relation, attribute, tuples
A relation, Beers:

name         manf
Winterbrew   Pete’s
Bud Lite     Anheuser-Busch

Attributes: the column headers (name, manf).
Tuples: the rows.
Some terminology: keys and superkeys
Superkey: a set of attributes that uniquely determines a tuple.
Key: a superkey that is minimal, i.e., no proper subset of it is a superkey.
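The two definitions can be checked against a concrete table, as in the sketch below. The relation (bar, beer, price) and its rows are hypothetical, as are the function names. Keep in mind that a key is a constraint on every legal instance of a relation, so a test on one instance can refute a proposed key but cannot prove it.

```python
from itertools import combinations

def is_superkey(rows, attrs):
    """True if no two rows agree on all of the attributes in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True

def candidate_keys(rows, all_attrs):
    """Superkeys of this instance that contain no smaller superkey."""
    keys = []
    # Try smaller attribute sets first so minimal ones are found first.
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            if any(set(k) <= set(attrs) for k in keys):
                continue  # contains a smaller superkey, so not minimal
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys

# Hypothetical instance of a relation (bar, beer, price).
rows = [
    {"bar": "Joe's", "beer": "Bud Lite", "price": 2.5},
    {"bar": "Joe's", "beer": "Winterbrew", "price": 2.5},
    {"bar": "Sue's", "beer": "Bud Lite", "price": 2.5},
    {"bar": "Sue's", "beer": "Winterbrew", "price": 3.0},
]

print(candidate_keys(rows, ("bar", "beer", "price")))  # [('bar', 'beer')]
```

Here {bar, beer, price} is a superkey but not a key, because it contains the smaller superkey {bar, beer}.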
Some terminology: schemas and databases
Relation schema: relation name and attribute list.
Database schema: set of all relation schemas in the database.
Database: collection of relations.
Relational operations
Typically, selecting tuples that meet a given condition.
Core relational operations:
● Union, intersection, and difference:
  – the usual set operations;
  – extended to bags.
● Selection: picking certain rows.
● Projection: picking certain columns.
● Products and joins: compositions of relations.
● Renaming of relations and attributes.
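The core operations can be sketched over relations stored as Python sets of tuples, with the schema kept as a separate tuple of attribute names. This representation, the function names, and the Sells relation are assumptions made for illustration; union, intersection, and difference are then Python's own `|`, `&`, and `-` on sets.

```python
def select(rel, pred):
    """Selection: keep the rows that satisfy the condition."""
    return {t for t in rel if pred(t)}

def project(rel, schema, attrs):
    """Projection: keep the named columns (duplicates vanish in a set)."""
    idx = [schema.index(a) for a in attrs]
    return {tuple(t[i] for i in idx) for t in rel}

def product(r, s):
    """Cartesian product: every row of r paired with every row of s."""
    return {a + b for a in r for b in s}

def natural_join(r, r_schema, s, s_schema):
    """Natural join: pair rows that agree on all shared attributes."""
    common = [a for a in r_schema if a in s_schema]
    s_rest = [a for a in s_schema if a not in common]
    out = set()
    for a in r:
        for b in s:
            if all(a[r_schema.index(c)] == b[s_schema.index(c)] for c in common):
                out.add(a + tuple(b[s_schema.index(x)] for x in s_rest))
    return out, tuple(r_schema) + tuple(s_rest)

beers_schema = ("name", "manf")
beers = {("Winterbrew", "Pete's"), ("Bud Lite", "Anheuser-Busch")}

# Hypothetical relation Sells(bar, beer); renaming beer -> name only
# changes the schema, and makes the natural join with Beers possible.
sells = {("Joe's", "Bud Lite")}
sells_schema = ("bar", "name")

print(select(beers, lambda t: t[1] == "Pete's"))  # {('Winterbrew', "Pete's")}
joined, joined_schema = natural_join(sells, sells_schema, beers, beers_schema)
print(joined_schema)  # ('bar', 'name', 'manf')
print(joined)         # {("Joe's", 'Bud Lite', 'Anheuser-Busch')}
```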
Relational operations
Extended relational operations:
● δ = eliminate duplicates from bags.
● τ = sort tuples.
● γ = grouping and aggregation.
Read: Chapter 5, Algebraic and Logical Query Languages
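The three extended operators can be sketched over bags stored as Python lists, which allow repeated tuples. The function names, the simplified one-column form of γ, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict

def delta(bag):
    """δ: eliminate duplicates, keeping the first occurrence of each tuple."""
    return list(dict.fromkeys(bag))

def tau(bag, key_idx):
    """τ: sort tuples by the columns at the given positions."""
    return sorted(bag, key=lambda t: tuple(t[i] for i in key_idx))

def gamma(bag, group_idx, agg_idx, agg):
    """γ: group by one column and aggregate another (simplified form)."""
    groups = defaultdict(list)
    for t in bag:
        groups[t[group_idx]].append(t[agg_idx])
    return [(g, agg(vals)) for g, vals in groups.items()]

# Hypothetical bag of (bar, beer, price) tuples with one repeated tuple.
sells = [
    ("Joe's", "Bud Lite", 2.5),
    ("Joe's", "Winterbrew", 3.0),
    ("Sue's", "Bud Lite", 2.5),
    ("Sue's", "Bud Lite", 2.5),
]

print(len(delta(sells)))         # 3
print(tau(sells, (2,))[-1])      # ("Joe's", 'Winterbrew', 3.0)
print(gamma(sells, 0, 2, max))   # [("Joe's", 3.0), ("Sue's", 2.5)]
```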