ITEC 3220A Using and Designing Database Systems Instructor: Prof. Z. Yang Course Website: http://people.math.yorku.ca/~zyang/itec3220a.htm Office: TEL 3049
Course Objective Examine databases, trends in database management systems and their application in a wide range of organizational areas Provide an overview of database processing, both historical and discussion of recent trends in database management Provide the student with exposure to a range of tools, including a relational DBMS as well as an object-oriented DBMS
Textbooks Database Systems. Design, Implementation, and Management, 11th Edition - Peter Rob & Carlos Coronel
Marking Scheme Final exam (closed book) - 50% Midterm (closed book) - 35% Assignments (2 assignments) - 15% Lecture notes will be made available at: http://people.yorku.ca/~zyang/itec3220a.htm
Schedule Week 1 Database concepts and the relational database model Week 2 Entity relationship model Week 3 Normalization Week 4 SQL Week 5 SQL + lab Week 6 Advanced SQL + lab
Schedule (Cont’d) Week 7 Midterm Week 8 Database design & case study Week 9 Transaction management and concurrent control Week 10 Transaction management and concurrent control (Cont’d) ; Data warehousing Week 11 Objected-Oriented database Week 12 Review
Database Systems and Data Models Introduction Database Systems and Data Models
Basic Definition Data: raw facts Constitute building blocks of information Information: is produced by processing data and reveals meaning of data Good, timely, relevant information key to decision making Good decision making key to organizational survival Database: shared, integrated computer structure housing: End user data Metadata
An Example Converting data to information
An Example (Cont’d) Metadata
What is a Database Management System (DBMS) A collection of programs that manages the database structure and controls access to the data stored in the database Possible to share data among multiple applications or users Example: bank and its ATM machines Makes data management more efficient and effective End users have better access to more and better-managed data
DBMS Manages Interaction
File and File System Terminology Data Field Record File Raw Facts Group of characters with specific meaning Record Logically connected fields that describe a person, place, or thing File Collection of related records
Example
Disadvantages of File Processing Data Dependence Change in file’s data characteristics requires modification of data access programs Lengthy development time Excessive program maintenance Structural Dependence Change in file structure requires modification of related programs
Example
Disadvantages of File Processing (Cont’d) Data Redundancy Different and conflicting versions of same data Results of uncontrolled data redundancy Data anomalies Modification Insertion Deletion Data inconsistency Lack of data integrity
Solution: Database Approach Database consists of logically related data stored in a single repository Advantages of database approach Structural and data independence Minimal data redundancy Reduces inconsistency, data anomalies Improves data sharing and data quality Stores data structures, relationships, and access paths
Database vs. File Systems
Database System Environment Hardware: all the system's physical devices Software Operating system software DBMS software Application programs and utility software People Procedures Data
Database Models Collection of logical constructs used to represent data structure and relationships within the database Conceptual models: logical nature of data representation Implementation models: emphasis on how the data are represented in the database
Database Models: Historic Overview Flat files - 1960s - 1980s Hierarchical – 1970s - 1990s Network – 1970s - 1990s Relational – 1980s - present Object-oriented – 1990s - present Object-relational – 1990s - present Data warehousing – 1980s - present Web-enabled – 1990s - present
Hierarchical Database Model Logically represented by an upside down tree Each parent can have many children. Each child has only one parent.
Hierarchical Database Model (Cont’d) Advantages Conceptual simplicity Database security and integrity Data independence Efficiency Disadvantages Complex implementation Difficult to manage and lack of standards Lacks structural independence Application programming and use complexity Implementation limitations
Network Database Model Each record can have multiple parents Composed of sets Each set has owner record and member record. Member may have several owners.
Network Database Model (Cont’d) Advantages Conceptual simplicity Handles more relationship types Data access flexibility Promotes database integrity Data independence Conformance to standards Disadvantages System complexity Lack of structural independence
Relational Database Model Perceived by user as a collection of tables for data storage Tables are a series of row/column intersections. Tables related by sharing common entity characteristic(s).
Relational Database Model (Cont’d)
Relational Database Model (Cont’d) Schema for the table Graphical representation Text description AGENT(AGENT_CODE, AGENT_LNAME, AGENT_FNAME, AGENT_INITIAL, AGENT_AREACODE, AGETN_PHONE) AGENT AGENT_CODE AGENT_LNAME AGENT_FNAME AGENT_INITIAL AGENT_AREACODE AGENT_PHONE
Relational Database Model (Cont’d) Advantages Structural independence Improved conceptual simplicity Easier database design, implementation, management, and use Ad hoc query capability with SQL Powerful database management system Disadvantages Substantial hardware and system software overhead Poor design and implementation is made easy May promote “islands of information” problems
Object-Oriented Database Model Objects or abstractions of real-world entities are stored Attributes describe properties Collection of similar objects is a class Methods represent real world actions of classes Classes are organized in a class hierarchy Inheritance is ability of object to inherit attributes and methods of classes above it
OO Data Model Advantages Disadvantages Adds semantic content Visual presentation includes semantic content Database integrity Both structural and data independence Disadvantages Lack of OODM Complex navigational data access Steep learning curve High system overhead slows transactions
Costs and Risks of the Database Approach Up-front costs: Installation Management Cost and Complexity Conversion Costs Ongoing Costs Requires New, Specialized Personnel Need for Explicit Backup and Recovery Organizational Conflict
Review Basic concepts: data, information, database, DBMS, file, conceptual model, implementation model, etc Why database and its importance, cost and risk Different database models definition advantage disadvantage
The Relational Database Model Chapter 3 The Relational Database Model
In this chapter, you will learn: Basic components of the relational database model Entities and their attributes Relationships among entities Relational algebra Relationship in relational database Data redundancy
Basic Definition Entities and Attributes Tables Entity is a person, place, event, or thing about which data is collected Attributes are characteristics of the entity Tables Holds related entities or entity set Also called relations Comprised of rows and columns
Table Characteristics Two-dimensional structure with rows and columns Rows (tuples) represent single entity Columns represent attributes Row/column intersection represents single value Tables must have an attribute to uniquely identify each row Column values all have same data format Each column has range of values called attribute domain Order of the rows and columns is immaterial to the DBMS
Example Tables
Terminology for Relational Database Table-Oriented Set-oriented Record-Oriented Table Relation Record type Row Tuple Record Column Attribute Field
Key Consists of one or more attributes that determine other attributes Primary key (PK) is an attribute (or a combination of attributes) that uniquely identifies any given entity (row). Key’s role is based on determination If you know the value of attribute A, you can look up (determine) the value of attribute B
Keys (Cont’d) Composite key Key attribute Superkey Candidate key Composed of more than one attribute Key attribute Any attribute that is part of a key Superkey Any key that uniquely identifies each entity Candidate key A superkey without redundancies
Keys (Cont’d) Foreign key (FK) Referential integrity Secondary key An attribute whose values match primary key values in the related table Referential integrity FK contains a value that refers to an existing valid tuple (row) in another relation Secondary key Key used strictly for data retrieval purposes
Simple Relational Database
Controlled Redundancy Makes the relational database work Tables within the database share common attributes that enable us to link tables together. Multiple occurrences of values in a table are not redundant when they are required to make the relationship work. Redundancy is unnecessary duplication of data
Integrity Rules
Integrity Rules (cont’d)
Exercises Table name: TRUCK Table name: BASE Table name: TYPE
Exercises (Cont’d) For each table, identify the primary key and the foreign keys. Do the tables exhibit entity integrity? Explain So the tables exhibit referential integrity? Explain Identify the TRUCK table’s candidate key (s). For each table, identify a super key and a secondary key