Databases and DBMSs Todd S. Bacastow January 2005
A Process of Mapping. Real World → Conceptual Data Model → Logical Data Model → Physical Data Model. The conceptual model is a high-level model comprising “things,” “descriptions,” and “how things are connected.” Logical data models: relational, hierarchical, network, object-oriented.
Data Models. A data model describes the structure of a database: data types, relationships, and constraints. It also provides a set of basic operations (insert, delete, modify, retrieve) and may include user-defined operations.
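As a rough illustration of "structure plus basic operations," the sketch below uses Python's built-in sqlite3 module. The employee table, its columns, and the sample rows are hypothetical and do not come from the slides.

```python
import sqlite3

# Hypothetical schema for illustration only; names are not from the slides.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    " emp_no INTEGER PRIMARY KEY,"   # data type plus a key constraint
    " name TEXT NOT NULL,"           # NOT NULL constraint
    " pay REAL CHECK (pay >= 0))"    # CHECK constraint
)

# The four basic operations named on the slide.
conn.execute("INSERT INTO employee VALUES (1, 'Ada', 5000.0)")      # insert
conn.execute("INSERT INTO employee VALUES (2, 'Bob', 4000.0)")      # insert
conn.execute("UPDATE employee SET pay = 4500.0 WHERE emp_no = 2")   # modify
conn.execute("DELETE FROM employee WHERE emp_no = 1")               # delete
rows = conn.execute("SELECT emp_no, name, pay FROM employee").fetchall()  # retrieve
print(rows)
```

The CREATE TABLE statement carries the structural part of the model; the remaining statements exercise the operations.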
Types of Data Models. Conceptual: concepts such as entity, attribute, and relationship; e.g., the Entity-Relationship model (DBMS-independent). Logical: data represented by record structures; e.g., relational, network, hierarchical. Physical: describes how data is stored on disk.
DBMS Architecture. External Level: external views. Conceptual Level: conceptual schema. Internal Level: internal schema.
External Level. An external view describes a part of the database for a particular user group and hides the rest. The external level supports multiple views of a database and uses the same data model as the conceptual schema.
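One way to picture an external view is a SQL view that exposes only some columns. This is a minimal sketch using sqlite3; the employee table, the employee_directory view, and the idea that the pay column is the part to hide are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, dept_no INTEGER, pay REAL);
    INSERT INTO employee VALUES (1, 'Ada', 10, 5000.0), (2, 'Bob', 20, 4000.0);
    -- External view for a hypothetical "directory" user group: pay is hidden.
    CREATE VIEW employee_directory AS SELECT emp_no, name, dept_no FROM employee;
""")

cursor = conn.execute("SELECT * FROM employee_directory ORDER BY emp_no")
view_columns = [d[0] for d in cursor.description]  # column names visible through the view
view_rows = cursor.fetchall()
print(view_columns, view_rows)
```

Several such views can be defined over the same table, one per user group.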
Conceptual Level. Data abstraction hides unnecessary details; the conceptual level hides the physical layer. The conceptual schema describes data types, constraints, and user operations, and uses both conceptual and logical data models.
Internal Level. Defines physical storage on the disk and the data location: path, blocks, pages, … Device specific. Example internal schema:
  STORED_EMP BYTES=20
    PREFIX BYTES=6, OFFSET=0
    EMP# BYTES=6, OFFSET=6, INDEX=EMPX
    DEPT# BYTES=4, OFFSET=12
    PAY BYTES=4, OFFSET=16, ALIGN=FULLWORD
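The stored-record definition can be imitated with Python's struct module. This is only a sketch: it assumes field widths of 6, 6, 4, and 4 bytes, which is what the offsets 0, 6, 12, 16 and the 20-byte record length suggest, and it assumes PAY is a 4-byte integer and the other fields are raw bytes. The sample values are invented.

```python
import struct

# Assumed layout: PREFIX (6 bytes), EMP# (6 bytes), DEPT# (4 bytes), PAY (4-byte int).
# "<" selects little-endian with no padding, so the offsets are 0, 6, 12, 16.
STORED_EMP = struct.Struct("<6s6s4si")

# Pack one hypothetical employee record into its 20-byte stored form.
record = STORED_EMP.pack(b"\x00" * 6, b"E00001", b"D012", 5000)
prefix, emp_no, dept_no, pay = STORED_EMP.unpack(record)

# Read PAY directly from byte offset 16, the way a storage layer might.
(pay_at_offset,) = struct.unpack_from("<i", record, 16)
print(STORED_EMP.size, emp_no, dept_no, pay_at_offset)
```

Offset 16 is a multiple of 4, which is consistent with the FULLWORD alignment noted for PAY.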
DB Schema vs. DB State. Database schema: the description of the database, specified during database design. Database state (the extension of the schema): the current state of the database, a snapshot of the actual data instances in the DB. The state changes over time through updates; initially, a database is in the empty state, with no data.
DB Schema vs. DB State (con’t). Valid state: the DBMS checks that every state of the database satisfies the structure and constraints specified in the schema. Schema diagram: displays the database schema.
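The validity check can be seen in a small sqlite3 sketch: an update that would lead to an invalid state is rejected. The department and employee tables, the constraints, and the try_insert helper are hypothetical, chosen only to illustrate the idea.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (dept_no INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        emp_no INTEGER PRIMARY KEY,
        dept_no INTEGER NOT NULL REFERENCES department(dept_no),
        pay REAL CHECK (pay >= 0)
    );
    INSERT INTO department VALUES (10, 'Research');
""")

def try_insert(emp_no, dept_no, pay):
    """Return True if the resulting state is valid, False if the DBMS rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (emp_no, dept_no, pay))
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert(1, 10, 5000.0),  # satisfies every constraint
    try_insert(2, 99, 4000.0),  # department 99 does not exist
    try_insert(3, 10, -1.0),    # violates the CHECK on pay
]
employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(results, employee_count)
```

Only the first insert changes the database state; the other two leave it as it was.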
Example Schema
Database Schema. Designer goal: develop a schema that changes infrequently. Metadata: descriptions of the schema constructs and constraints, stored in the database catalog. Schema evolution: schema change prompted by a change in application requirements.
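In SQLite, the catalog can be queried like any other table, which gives a concrete picture of metadata. The sketch below reads the catalog and then applies one small schema change; the employee table and the added dept_no column are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT NOT NULL, pay REAL)")

# sqlite_master is SQLite's catalog: one row per schema object, including its definition.
catalog = conn.execute("SELECT type, name, sql FROM sqlite_master").fetchall()

# PRAGMA table_info lists a table's columns; index 1 of each row is the column name.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]

# Schema evolution: a new requirement adds a column, and the catalog reflects it.
conn.execute("ALTER TABLE employee ADD COLUMN dept_no INTEGER")
evolved_names = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]
print(catalog[0][:2], column_names, evolved_names)
```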
DBMS Mapping. Mappings for a multi-level DBMS transform a request specified at one level into a request at another level. Access: external → conceptual → internal → DB. Retrieve: DB → internal → conceptual → external. Three-Schema Architecture. Advantage: true data independence. Disadvantage: overhead cost of the mappings.
Data Independence. What happens when the schema changes at some level? Data independence is the capacity to change the schema at one level without having to change the schema at the next higher level. Two types: logical and physical data independence.
Data Independence (con’t). 1. Logical data independence: the capacity to change the conceptual schema without having to change the external schema; it applies when the database is logically reorganized. 2. Physical data independence: the capacity to change the internal schema without having to change the conceptual schema; it applies when the files are physically reorganized.
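Both kinds of independence can be sketched with sqlite3. The same query against a hypothetical staff view keeps working after an index is added (a physical change) and after the base table is split in two and the view is redefined (a logical change). All table, view, and index names are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, pay REAL);
    INSERT INTO employee VALUES (1, 'Ada', 5000.0), (2, 'Bob', 4000.0);
    CREATE VIEW staff AS SELECT emp_no, name FROM employee;
""")

# The external-level request stays the same throughout.
QUERY = "SELECT emp_no, name FROM staff ORDER BY emp_no"
before = conn.execute(QUERY).fetchall()

# Physical reorganization: add an index. No table or view definition changes.
conn.execute("CREATE INDEX employee_name_idx ON employee(name)")
after_physical = conn.execute(QUERY).fetchall()

# Logical reorganization: split employee into two tables and remap the view.
conn.executescript("""
    DROP VIEW staff;
    CREATE TABLE person (emp_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE salary (emp_no INTEGER PRIMARY KEY, pay REAL);
    INSERT INTO person SELECT emp_no, name FROM employee;
    INSERT INTO salary SELECT emp_no, pay FROM employee;
    DROP TABLE employee;
    CREATE VIEW staff AS SELECT emp_no, name FROM person;
""")
after_logical = conn.execute(QUERY).fetchall()
print(before == after_physical == after_logical)
```

In the logical case only the mapping (the view definition) is rewritten; the query issued by the user is untouched.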
DBMS Languages. Data Definition Language (DDL): defines the DB conceptual schema. Data Manipulation Language (DML): specifies database requests (update, retrieval). High-level DML describes which data to retrieve; low-level DML describes how to retrieve it.
DBMS Languages (con’t). High-level DML: set-oriented, declarative. Low-level DML: record-oriented, procedural. Types of DML: data sublanguage, a DML embedded in a general-purpose language (for DBAs); query language, a high-level, interactive, stand-alone DML (casual end users); user-friendly interface for DML (naïve users).
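The contrast between the two DML styles can be sketched as follows. The first request states which data is wanted and lets the DBMS decide how to get it; the second walks the records one at a time in the host program. The loop only imitates a record-oriented style (it still fetches rows through SQL), and the employee table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, dept_no INTEGER, pay REAL);
    INSERT INTO employee VALUES (1, 10, 5000.0), (2, 20, 4000.0), (3, 10, 3000.0);
""")

# High-level, set-oriented, declarative: describe the result, not the procedure.
(declarative_total,) = conn.execute(
    "SELECT SUM(pay) FROM employee WHERE dept_no = 10"
).fetchone()

# Low-level flavor, record-oriented and procedural: visit each record and decide.
procedural_total = 0.0
for emp_no, dept_no, pay in conn.execute("SELECT emp_no, dept_no, pay FROM employee"):
    if dept_no == 10:
        procedural_total += pay

print(declarative_total, procedural_total)
```

Because the SQL is embedded in Python here, the example also shows a DML used as a data sublanguage inside a general-purpose language.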
Database System Environment. DBMS component modules: managers (e.g., disk control), compilers (e.g., query), and processors.
DBMS Interfaces. Menu-based interfaces. Forms-based interfaces. Natural language interfaces, which translate requests into high-level queries. Command line.
System Utilities & Tools. Loading: loads existing data files into the database (DBMS conversion, reformatting the data). Backup: provides a backup copy of the database; an incremental backup records changes only. File reorganization: improves performance.
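As one concrete example of a backup utility, Python's sqlite3 module exposes SQLite's online backup through Connection.backup. The sketch below makes a full copy of a small in-memory database; it is not an incremental backup, and the employee table and its row are hypothetical.

```python
import sqlite3

# Source database with one hypothetical table and row.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT)")
source.execute("INSERT INTO employee VALUES (1, 'Ada')")
source.commit()

# Copy the whole database into a second connection (a full backup).
backup = sqlite3.connect(":memory:")
source.backup(backup)

copied = backup.execute("SELECT emp_no, name FROM employee").fetchall()
print(copied)
```

In practice the target would normally be a file on separate storage rather than another in-memory database.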
System Utilities & Tools (con’t). Performance monitoring: monitors database usage and provides statistics. Data dictionary (also called information repository): stores the catalog plus additional information, such as design decisions, usage standards, user information, and application program descriptions.
Mainframe/Terminal. Storage, logic, and presentation are all in the same place. No platform-specific user interface. Doesn’t take advantage of the client machine.
Client Server without stored procedures. The database server handles storage only; logic and presentation are in the client. Takes advantage of the client CPU. Logic changes require client redistribution. Integrity is not maintained if another DB tool is used. Each user needs to be a specific database user.
Client Server with stored procedures. The database handles storage and business logic. Logic is changed in one place, with no redistribution of the client. The code is DBMS dependent. Each user needs to be a specific database user.
Client Server with 3 tiers. Storage is in the database. Logic is in the transaction monitor (TP monitor). The client does presentation only. Authentication and access control can be done in the TP monitor. Each user does NOT have to be a database user.
Client Server with 3 tiers (con’t). Transaction monitor: a component that sits between the client and the database servers to ensure reliable updates of information. Used in airline reservation and banking systems.
Why 3 Tiers? Scalability: multiple transaction monitors, load balancing. Flexibility. Complexity: update multiple data stores; two-phase commit with multiple databases.
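Two-phase commit across several databases can be outlined with a toy coordinator. This is a simplified in-memory sketch, not a real transaction monitor: the Participant class, its can_commit flag, and the two_phase_commit function are hypothetical, and failures, timeouts, and logging are left out.

```python
class Participant:
    """Hypothetical in-memory stand-in for one database taking part in the commit."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes (ready to commit) or no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Commit on every participant only if all of them vote yes in phase 1."""
    votes = [p.prepare() for p in participants]  # phase 1: collect votes
    if all(votes):
        for p in participants:                   # phase 2: global commit
            p.commit()
        return True
    for p in participants:                       # phase 2: global abort
        p.rollback()
    return False


good = [Participant("orders"), Participant("billing")]
bad = [Participant("orders"), Participant("billing", can_commit=False)]
good_outcome = two_phase_commit(good)
bad_outcome = two_phase_commit(bad)
print(good_outcome, [p.state for p in good])
print(bad_outcome, [p.state for p in bad])
```

The point of the protocol is the all-or-nothing outcome: a single "no" vote aborts the update on every data store.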
Classifications of DBMSs. Data model (OO, relational, hierarchical). Number of users (single-user vs. multi-user). Number of database sites (centralized vs. distributed vs. federated). Special-purpose vs. general-purpose.