BD05/06 Chapter 1: Introduction
  - Purpose of database systems
  - Data abstraction levels
  - Data models
  - SQL: Data Definition Language and Data Manipulation Language
  - Transaction management
  - Database users
  - DBMS structure
  - Typical database architectures
Database management systems (DBMS)
  - Collection of interrelated data and a set of programs to access the data
  - DBMS contains relevant information about a particular enterprise
  - DBMS provides an environment that is both convenient and efficient to use
  - Database applications:
      - Banking: all transactions
      - Airlines: reservations, schedules
      - Universities: registration, grades
      - Sales: customers, products, purchases
      - Manufacturing: production, inventory, orders, supply chain
      - Human resources: employee records, salaries, tax deductions
DBMS vs File Systems (1)
  - In the early days, database applications were built on top of file systems
  - Drawbacks of using file systems to store data:
      - Data redundancy and inconsistency: multiple file formats, duplication of information in different files
      - Difficulty in accessing data: need to write a new program to carry out each new task
      - Data isolation: multiple files and formats
      - Integrity problems: integrity constraints (e.g. account balance > 0) become part of program code; hard to add new constraints or change existing ones
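The integrity-problem bullet can be made concrete. With a DBMS, a rule such as "account balance > 0" is declared once in the schema, not repeated in every program. A minimal sketch, assuming Python's built-in sqlite3 module as the DBMS; the table layout, the underscore column names and the sample row are illustrative assumptions, not part of the slides:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: the rule "balance > 0" is declared in the table
# definition, so the DBMS enforces it for every program that updates the data.
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance > 0))"
)
conn.execute("INSERT INTO account VALUES ('A-101', 500)")  # made-up sample row

try:
    # This update would leave a negative balance, so the DBMS rejects it.
    conn.execute(
        "UPDATE account SET balance = balance - 600 "
        "WHERE account_number = 'A-101'"
    )
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

balance = conn.execute(
    "SELECT balance FROM account WHERE account_number = 'A-101'"
).fetchone()[0]
```

Changing the rule later means changing one table definition, not every application that touches the account data.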
DBMS vs File Systems (2)
  - Drawbacks of using file systems (cont.)
      - Atomicity of updates
          - Failures may leave the database in an inconsistent state with partial updates carried out
          - E.g. a transfer of funds from one account to another should either complete or not happen at all
      - Concurrent access by multiple users
          - Concurrent access is needed for performance
          - Uncontrolled concurrent accesses can lead to inconsistencies
          - E.g. two people reading a balance and updating it at the same time
      - Security problems
      - ...
  - A DBMS offers solutions to all of the above problems!
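The "two people reading a balance and updating it at the same time" case can be traced by hand. The sketch below is a single-connection simulation, not real concurrency: it replays the problematic interleaving step by step, then shows the same two deposits expressed as updates the DBMS applies one after the other. The table, the account 'A' and the amounts are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 100)")  # hypothetical account


def read_balance(conn):
    return conn.execute(
        "SELECT balance FROM account WHERE account_number = 'A'"
    ).fetchone()[0]


# Uncontrolled interleaving: both users read 100 before either one writes.
seen_by_user1 = read_balance(conn)
seen_by_user2 = read_balance(conn)
conn.execute("UPDATE account SET balance = ? WHERE account_number = 'A'",
             (seen_by_user1 + 50,))
conn.execute("UPDATE account SET balance = ? WHERE account_number = 'A'",
             (seen_by_user2 + 50,))  # overwrites user 1's deposit
lost_update_balance = read_balance(conn)

# Same two deposits, but each one is a single statement, so the second
# update sees the result of the first.
conn.execute("UPDATE account SET balance = 100 WHERE account_number = 'A'")
for _ in range(2):
    conn.execute("UPDATE account SET balance = balance + 50 "
                 "WHERE account_number = 'A'")
controlled_balance = read_balance(conn)
```

In the first run one deposit is lost; in the second both are applied. A real DBMS reaches the second outcome under true concurrency through its concurrency-control component.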
Why use a DBMS?
  - Data independence and efficient access
  - Reduced application development time
  - Data integrity and security
  - Uniform data administration
  - Concurrent access, recovery from crashes
Levels of data abstraction (1)
  - Physical level: describes how data are stored.
  - Logical level: describes data stored in the database, and the relationships among the data.
  - View level: application programs hide details of data types. Views can also hide information (e.g., salary) for security purposes.
Levels of data abstraction (2)
Data Models
  - A collection of conceptual tools for describing:
      - data
      - data relationships
      - data semantics
      - data constraints
  - Ex:
      - Entity-Relationship model, UML class diagram
      - Relational model
      - Other models:
          - semi-structured data models (XML used to represent semi-structured data)
          - older models: network model and hierarchical model
Entity-Relationship Model
  - Example of schema in the entity-relationship model
A Sample Relational Database
Data Independence
  - Applications insulated from how data is structured and stored.
  - Logical data independence: protection from changes in the logical structure of data.
      - Ex: CustomerPublic(customer-id, customer-name), CustomerPrivate(customer-id, salary)
  - Physical data independence: protection from changes in the physical structure of data.
  - One of the most important benefits of using a DBMS!
Instances and Schemas
  - Schema: the logical structure of the database, a description of a particular collection of data, using a given data model
      - Analogous to type information of a variable in a program
      - Physical schema: database design at the physical level
      - Logical schema: database design at the logical level
  - Instance: the actual content of the database at a particular point in time
      - Analogous to the value of a variable
Levels of data abstraction (recap)
Example
  - External schema (View):
      - CustomerPublic(customer-id, customer-name)
  - Logical schema:
      - Customer(customer-id, customer-name, customer-street, customer-age)
      - Account(account-number, balance)
      - Depositor(account-number, customer-id)
  - Physical schema:
      - Relations stored as unordered files.
      - Index on first column of Customer.
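The external and logical schemas above can be sketched with a view. This is a minimal illustration, assuming SQLite through Python's sqlite3 module; hyphens in the slide's attribute names are replaced by underscores to get valid SQL identifiers, and the sample row is invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Logical schema (only the Customer relation is shown here).
    CREATE TABLE customer (
        customer_id     TEXT PRIMARY KEY,
        customer_name   TEXT,
        customer_street TEXT,
        customer_age    INTEGER
    );
    INSERT INTO customer VALUES ('C-1', 'Ana', 'Main St', 41);  -- made-up row

    -- External schema: a view exposing only the public attributes.
    CREATE VIEW customer_public AS
        SELECT customer_id, customer_name FROM customer;
""")

cursor = conn.execute("SELECT * FROM customer_public")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
```

An application that reads only customer_public never sees customer_street or customer_age, and keeps working if further columns are added to customer.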
SQL: Structured Query Language
  - A DBMS provides, as part of a single DB language (SQL):
      - a Data Definition Language (DDL)
      - a Data Manipulation Language (DML)
  - SQL is the most widely used declarative query language
      - Procedural: user specifies what data is required and how to get those data
      - Declarative: user specifies what data is required without specifying how to get those data
SQL: Data Definition Language (DDL)
  - DDL: specification notation for defining the database
      - Ex: create table account(account-number char(10), balance integer)
  - DDL compiler generates a set of tables stored in a data dictionary
  - Data dictionary contains metadata (i.e., data about data)
      - Ex: database schema, consistency constraints, access methods
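The DDL example and the data dictionary can be tried out directly. In this sketch SQLite stands in for the DBMS: its catalog table sqlite_master and the PRAGMA table_info command play the role of the data dictionary (other systems expose their catalogs differently). The column is written account_number because a hyphen is not valid in an unquoted SQL identifier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL statement from the slide, with an underscore in the column name.
conn.execute("CREATE TABLE account (account_number CHAR(10), balance INTEGER)")

# Metadata: SQLite records every schema object in its catalog table.
catalog = conn.execute("SELECT type, name, sql FROM sqlite_master").fetchall()

# Column-level metadata for one table: rows of (cid, name, type, ...).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]
```

Here catalog holds one entry describing the account table, and columns lists each attribute with its declared type: data about data, kept by the DBMS itself.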
SQL: Data Manipulation Language (DML)
  - Language for accessing and manipulating data
  - Ex: find the name of the customer with a given customer-id
      select customer.customer-name
      from customer
      where customer.customer-id = ‘ ’
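A runnable version of this query, as a sketch: the customer-id is passed as a parameter, so no particular value has to be written into the SQL text. The table contents and the helper name customer_name are hypothetical, and underscores replace the hyphens of the slide notation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id TEXT PRIMARY KEY, customer_name TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("C-1", "Ana"), ("C-2", "Rui")],  # made-up sample rows
)


def customer_name(conn, customer_id):
    """Return the name of the customer with the given id, or None if absent."""
    row = conn.execute(
        "SELECT customer.customer_name "
        "FROM customer "
        "WHERE customer.customer_id = ?",
        (customer_id,),
    ).fetchone()
    return row[0] if row else None
```

The query states only which rows are wanted; whether the DBMS scans the table or uses an index is left to the system, which is what makes the language declarative.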
Transaction Management
  - A transaction is a collection of operations that performs a single logical function in a database application
      - Ex: funds transfer includes withdraw from account A and deposit into account B
  - Transaction management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.
  - Transactions must be atomic, consistent, isolated and durable/persistent
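The funds-transfer example can be sketched as one transaction. This is an illustration under assumptions: SQLite via Python's sqlite3, a hypothetical account table with a non-negative balance rule, and a helper named transfer that is not part of any library. The deposit is deliberately issued before the withdrawal so that a failed withdrawal shows the deposit being undone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move amount from source to target; return True if committed."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_number = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # neither update is visible after the rollback


def balances(conn):
    return dict(conn.execute("SELECT account_number, balance FROM account"))
```

A transfer of 30 from A to B commits both updates. A transfer of 500 fails on the withdrawal, and the deposit already made to B is rolled back with it: all or nothing.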
Database Users (1)
  - Users are differentiated by the way they expect to interact with the system
  - Application programmers: interact with system through DML calls
  - Sophisticated users: form requests in a database query language
  - Naïve/End users: invoke one of the permanent application programs that have been written previously
      - E.g. people accessing database over the web, bank tellers, clerical staff
Database Users (2)
  - Database administrator (DBA): coordinates all the activities of the database system; has a good understanding of the enterprise's information resources and needs. Duties include:
      - Schema definition
      - Storage structure and access method definition
      - Schema and physical organization modification
      - Granting user authority to access the database
      - Specifying integrity constraints
      - Acting as liaison with users
      - Monitoring performance, tuning the system and responding to changes in requirements
Overall DBMS Structure
Database applications
  - Application programs generally access databases through one of:
      - Language (C, C++, Java, etc.) extensions to allow embedded SQL
      - Application program interfaces (e.g. ODBC/JDBC) which allow SQL queries to be sent to a database
Typical application architectures
  - Two-tier architecture: e.g. client programs using ODBC/JDBC to communicate with a database
  - Three-tier architecture: e.g. web-based applications, and applications built using "middleware"