1 CSE 480: Database Systems Lecture 1: Introduction Reference: Read Chapters 1 & 2 of the textbook
2 Database Systems are Pervasive
• Retail
• Banking
• Law enforcement
3 Database-Driven Web Sites
4 What is a Database?
• Collection of related data central to a given enterprise (mini-world or universe of discourse)
  – Examples:
    Banking: savings/checking accounts, mortgages, etc.
    Vehicle registration: car registration, year, make, etc.
    Student registration: name, PID, GPA, last semester enrolled, etc.
    Electronic medical records: name, SSN, date of birth, address, symptoms, diseases, medication, test results, etc.
5 Example of a Database
• Mini-world: UNIVERSITY environment
  – What are the mini-world concepts that need to be captured by the database?
• Entities:
  – STUDENTs
  – COURSEs
  – SECTIONs
  – DEPARTMENTs
  – INSTRUCTORs
6 Example of a Database
• Relationships between entities of the mini-world:
  – SECTIONs are for specific COURSEs
  – STUDENTs take SECTIONs
  – COURSEs have prerequisite COURSEs
  – INSTRUCTORs teach SECTIONs
  – COURSEs are offered by DEPARTMENTs
  – STUDENTs major in DEPARTMENTs
7 Example of a Database
• Constraints on the entities and relationships
  – Each course must have a unique course number
  – GPA must be a real number between 0 and 4.0
  – Each section has only one instructor, but an instructor can teach more than one section
• Database design (Lectures 2-4)
  – Specifying the entities, relationships, and constraints of a mini-world using the Entity-Relationship and Enhanced Entity-Relationship models
  – Role: Database Architect or Designer
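The constraints on this slide can be declared once and then enforced by the DBMS itself. The following is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table names, column names, and the instructor name "Smith" are illustrative assumptions, not part of the lecture's schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE course (
        course_number TEXT PRIMARY KEY           -- each course has a unique number
    );
    CREATE TABLE student (
        pid TEXT PRIMARY KEY,
        gpa REAL CHECK (gpa BETWEEN 0 AND 4.0)   -- GPA must lie between 0 and 4.0
    );
    CREATE TABLE section (
        section_id    INTEGER PRIMARY KEY,
        course_number TEXT NOT NULL REFERENCES course(course_number),
        instructor    TEXT NOT NULL              -- one instructor per section row
    );
""")

def try_insert(sql, params):
    """Return True if the DBMS accepts the row, False if a constraint rejects it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

ok_course = try_insert("INSERT INTO course VALUES (?)", ("CSE480",))
dup_course = try_insert("INSERT INTO course VALUES (?)", ("CSE480",))    # duplicate number
bad_gpa = try_insert("INSERT INTO student VALUES (?, ?)", ("A1", 4.7))   # GPA out of range
# The same instructor may appear in several section rows.
sec1 = try_insert("INSERT INTO section VALUES (?, ?, ?)", (1, "CSE480", "Smith"))
sec2 = try_insert("INSERT INTO section VALUES (?, ?, ?)", (2, "CSE480", "Smith"))
```

Because `instructor` is a single column of `section`, a section cannot have two instructors, while nothing prevents one instructor from appearing in many rows.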
8 Database Management System (DBMS)
• A collection of programs that enables users to create and maintain a database
• Examples of DBMS
  – MS Access, MS SQL Server, IBM DB2, Oracle, Sybase, Postgres, MySQL, and many more
• Why do we need a DBMS?
9 File Server Architecture (no DBMS)
[Figure: file server architecture with a thick client]
Source: Modern Database Management, 6th Edition, Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden
10 Client-Server DBMS Architecture
[Figure: DBMS running on the database server performs all data storage and access operations; thin client]
Source: Modern Database Management, 6th Edition, Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden
11 Three-tier Architecture
[Figure: business rules stored on the application server]
Source: Modern Database Management, 6th Edition, Jeffrey A. Hoffer, Mary B. Prescott, Fred R. McFadden
12 Typical DBMS Functionalities
• Define a database
  – Specify the structure of the data records
• Construct a database
  – Store the data on some storage medium controlled by the DBMS
• Manipulate the database
  – Query the database to retrieve specific data, update the database to reflect changes, and generate reports
• Support concurrent processing and sharing by users and applications
  – while keeping all the data valid and consistent
• Support protection/security measures to prevent unauthorized access
13 Characteristics of DBMS
• Self-describing
• Provides insulation between programs and data
• Allows multiple views
• Allows multi-user transaction processing
14 Characteristics of DBMS
• Self-describing nature of a database management system
  – A DBMS contains not only the data but also a complete description of its structure and constraints
    Structure: Student ID is 10 characters long, GPA is a real number
    Constraints: GPA must be between 0 and 4.0 (non-negative)
  – A DBMS catalog stores the description of the database
    The description is called meta-data
  – This allows the DBMS software to work with any type of data (banking, university, company, etc.)
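The self-describing property can be observed by asking a DBMS for its own catalog. The sketch below uses SQLite through Python's sqlite3 module, where the catalog is exposed as the `sqlite_master` table and the `PRAGMA table_info` command. Other systems expose the catalog differently, and the `student` table here is an assumed example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_id CHAR(10) PRIMARY KEY,"
    " gpa REAL CHECK (gpa BETWEEN 0 AND 4.0))"
)

# SQLite's catalog table holds one row per schema object, including its definition.
catalog_rows = conn.execute(
    "SELECT type, name, sql FROM sqlite_master WHERE name = 'student'"
).fetchall()

# PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk) per column.
columns = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(student)")}
```

Here `catalog_rows` holds the stored table definition (structure and constraint), and `columns` maps each column name to its declared type. Both are meta-data read back from the database rather than hard-coded in the program.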
15 Example of DBMS Catalog
Information in the DBMS catalog is needed for query processing and optimization (to be discussed further in Lectures 22-24)
16 Characteristics of DBMS
• Insulation between programs and data
  – Program-data independence
    Allows changing data storage structures and operations without changing the DBMS access programs
  – Program-operation independence
    In object-oriented (OO) and object-relational (OR) database systems, users can define operations (methods) on data through an interface; the implementation of each operation (method) can be specified separately
17 Characteristics of DBMS
• Support multiple views of the data
  – A database typically has many users, each of whom requires a different perspective (view) of the database
  – A common principle used by many organizations is that data must be accessible on a need-to-know basis
  – Example: a student database may contain information about each student's name, SSN, courses taken and grades, salary, etc.
    Users of the database include the registrar's office and the payroll department
    – The registrar doesn't need to know a student's salary
    – Payroll doesn't need to know a student's GPA
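The registrar/payroll example maps directly onto SQL views. This is a minimal sketch with Python's sqlite3 module. The view names, column names, and the sample row (including the placeholder SSN) are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (name TEXT, ssn TEXT, gpa REAL, salary REAL);
    INSERT INTO student VALUES ('John', '000-00-0000', 3.5, 1200.0);
    -- Each user group gets only the columns it needs to know.
    CREATE VIEW registrar_view AS SELECT name, gpa FROM student;
    CREATE VIEW payroll_view   AS SELECT name, ssn, salary FROM student;
""")

def column_names(view):
    """List the columns visible through a view (view name is trusted input here)."""
    cur = conn.execute(f"SELECT * FROM {view}")
    return [d[0] for d in cur.description]

registrar_cols = column_names("registrar_view")
payroll_cols = column_names("payroll_view")
```

SQLite has no user accounts, so this only shows the shape of each view. In a multi-user DBMS, access rights would typically be granted on the views rather than on the base table.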
18 Characteristics of DBMS
• Multi-user transaction processing
  – The database stores information about the current state of an enterprise
    Example: a bank database stores the balance of each customer account
  – When an event in the real world changes the state of the enterprise, a transaction is executed to cause the corresponding change in the database state
    A transaction is an executing program or process that includes one or more database accesses, such as reading or updating database records
    Each transaction is designed to maintain the correctness of the relationship between the database state and the real-world enterprise it models
  – Example: when a customer deposits $50 in a bank, a deposit transaction is executed to increase the account balance by $50
  – The concurrency control of the DBMS ensures correctness of the database when multiple transactions execute concurrently
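The $50 deposit can be sketched as a single transaction that either commits completely or leaves the database unchanged. The example uses Python's sqlite3 module. The `account` table, the starting balance of 100, and the helper names `deposit` and `balance` are assumptions for illustration, and the sketch runs in one connection, so it does not exercise concurrency control.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.commit()

def deposit(account_id, amount):
    """Increase the balance of one account inside a single transaction."""
    if amount <= 0:
        raise ValueError("deposit amount must be positive")
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?",
            (amount, account_id),
        )
        if cur.rowcount != 1:
            raise LookupError(f"no account with id {account_id}")

def balance(account_id):
    row = conn.execute(
        "SELECT balance FROM account WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]

deposit(1, 50)  # the real-world event: a customer deposits $50
```

After the call, the database state matches the real-world state: the balance of account 1 has grown by exactly the deposited amount.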
19 Database System Concepts
• Data Models
• Database Schema vs Database Instances
• DBMS Languages
20 Abstraction
• Data is actually stored as bits, but it is difficult to work with data at this level
• A DBMS provides a level of abstraction by hiding the details of data organization and storage
  – A data model is used to hide storage details and present the users with a conceptual view of the database
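21 Data Model
[Figure: the DBMS sits between the user/program and physical data storage. Through the data model, the user sees Student records such as (John, 21) and (Mary, 19), Course records such as (CSE480) and (CSE331), and Department records such as (CSE, Engr) and (ECE, Engr).]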
22 Examples of Data Models
• Network Model
• Hierarchical Model
• Relational Model (most widely used)
• Object-Oriented Data Models
• Object-Relational Models
• More recently, NoSQL
  – Google BigTable
  – Amazon Dynamo
  – Facebook Cassandra
23 Relational Data Model
• Proposed by Edgar Codd
  – E. F. Codd: A Relational Model of Data for Large Shared Data Banks. Commun. ACM 13(6) (1970)
• Models the data as relations (tables)
  – Advantages:
    Simple
    Mathematically based
    Has a set of powerful, high-level operators to analyze relational expressions (queries)
  – Queries are transformed into equivalent expressions automatically (query processing and optimization)
    Transformed expressions can be executed more efficiently
24 Database Schemas versus Instances
• In any data model, it is important to distinguish the description of the database from the database itself
• Database Schema:
  – The description of a database
    Includes descriptions of data elements, data types, and constraints
  – Schema Diagram: an illustrative display of a database schema
• Database Instance (State/Snapshot):
  – The actual data stored in the database at a particular moment in time
  – Valid State: a state that satisfies the structure and constraints of the database
25 Example of a Database Schema
26 Example of a Database State
27 Database Schema vs. Database State
• Distinction
  – The database schema changes very infrequently.
  – The database state changes every time the database is updated.
• Schema is also called intension
• State is also called extension
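The distinction can be checked directly: inserting rows changes the state but leaves the schema untouched. This is a small sketch using Python's sqlite3 module. The `course` table and its two rows are assumed examples.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_number TEXT PRIMARY KEY, dept TEXT)")

def schema():
    """The intension: the stored definition of the table."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'course'"
    ).fetchone()[0]

def state():
    """The extension: the rows present right now."""
    return conn.execute("SELECT * FROM course ORDER BY course_number").fetchall()

schema_before, state_before = schema(), state()
conn.execute("INSERT INTO course VALUES ('CSE480', 'CSE')")
conn.execute("INSERT INTO course VALUES ('CSE331', 'CSE')")
schema_after, state_after = schema(), state()
```

Comparing the four values shows the same schema text before and after the updates, and a state that has gone from empty to two rows.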
28 Three-Schema Architecture
[Figure: external schemas at the top, the internal schema below, and physical storage for data about students, courses, employment, etc.]
29 Internal Schema/Level
• Describes the details of how data is physically stored
  – Specifies how data is stored in files, tracks, and cylinders
  – Specifies the indices that support fast access to the rows of a table
  – Specifies the machine that holds the data (data may be distributed)
30 Conceptual Schema/Level
• Hides the details of physical data representation
  – In the relational model, the conceptual schema presents data as a set of tables (relations)
• The DBMS maps from the conceptual to the internal schema automatically
• Physical data independence
  – The internal schema can be changed without changing the conceptual schema
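Physical data independence can be illustrated by changing an access structure and rerunning an unchanged query. In this sketch (Python's sqlite3 module), adding an index is the internal-level change. The `student` table, its rows, and the index name `idx_student_gpa` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (pid TEXT, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [("A1", "John", 3.2), ("A2", "Mary", 3.9)],
)

QUERY = "SELECT name FROM student WHERE gpa > 3.5"

def plan():
    # The last field of each EXPLAIN QUERY PLAN row is a readable description.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + QUERY))

result_before, plan_before = conn.execute(QUERY).fetchall(), plan()
conn.execute("CREATE INDEX idx_student_gpa ON student(gpa)")  # internal-level change only
result_after, plan_after = conn.execute(QUERY).fetchall(), plan()
```

The query text and its result are identical before and after. Only the plan chosen by the DBMS is free to change, for example by using the new index.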
31 External Schema/Level
• An external schema customizes the conceptual schema to the needs of various users
• In the relational model, the external schema also presents data as a set of relations
[Figure: external schemas]
32 External Schema
• An application is written in terms of an external schema.
  – Different external schemas can be provided to different categories of users
• The DBMS maps the external to the conceptual schema automatically at run time
• Logical data independence
  – The conceptual schema can be changed without changing external schemas and application programs
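Logical data independence can be sketched by writing the application against a view and then changing the base table. The example uses Python's sqlite3 module. The names `student`, `registrar_view`, and `last_semester` are assumptions, and adding a column is only one simple kind of conceptual-schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (pid TEXT PRIMARY KEY, name TEXT, gpa REAL);
    INSERT INTO student VALUES ('A1', 'John', 3.2);
    CREATE VIEW registrar_view AS SELECT pid, name, gpa FROM student;
""")

def application_query():
    # The "application" only ever names the external schema (the view).
    return conn.execute("SELECT name, gpa FROM registrar_view").fetchall()

before = application_query()
# Conceptual-schema change: a new attribute is added to the base table.
conn.execute("ALTER TABLE student ADD COLUMN last_semester TEXT")
after = application_query()
```

The application code and its output are unchanged by the schema change, because the DBMS resolves the view against the current conceptual schema at run time.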
33 DBMS Languages
• Data Definition Language (DDL):
  – Used to specify the conceptual schema of a database
    In many DBMSs, DDL is also used to define internal and external schemas (views).
    In some DBMSs, a separate storage definition language (SDL) and view definition language (VDL) are used to define internal and external schemas

    CREATE TABLE DEPARTMENT (
        DNAME        VARCHAR(10) NOT NULL,
        DNUMBER      INTEGER     NOT NULL,
        MGRSSN       CHAR(9),
        MGRSTARTDATE CHAR(9)
    );
34 DBMS Languages
• Data Manipulation Language (DML)
  – Used to specify database retrievals and updates
  – Both DML and DDL can be embedded in a general-purpose programming language, such as C, C++, Java, or PHP

    INSERT INTO DEPARTMENT VALUES ('Payroll', 154, ' ', ' ');

    SELECT MgrSSN FROM DEPARTMENT WHERE DName = 'Payroll';
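To make the idea of embedding concrete, here is a minimal sketch that issues the DEPARTMENT DDL and DML from a host program. It uses Python's built-in sqlite3 module rather than the Java used elsewhere in the lecture. The manager fields are left NULL as a placeholder choice for this sketch, and the helper name `manager_ssn` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema (same columns as the DEPARTMENT example).
conn.execute("""
    CREATE TABLE DEPARTMENT (
        DNAME        VARCHAR(10) NOT NULL,
        DNUMBER      INTEGER     NOT NULL,
        MGRSSN       CHAR(9),
        MGRSTARTDATE CHAR(9)
    )
""")

# DML update: values are bound through "?" placeholders, not pasted into the SQL text.
# The two manager fields are NULL here; that is a placeholder assumption.
conn.execute(
    "INSERT INTO DEPARTMENT VALUES (?, ?, ?, ?)", ("Payroll", 154, None, None)
)
conn.commit()

def manager_ssn(dname):
    """DML retrieval: return the matching row as a 1-tuple, or None if absent."""
    return conn.execute(
        "SELECT MGRSSN FROM DEPARTMENT WHERE DNAME = ?", (dname,)
    ).fetchone()

payroll_row = manager_ssn("Payroll")
```

The host language supplies control flow and variables, while the SQL strings carry the DDL and DML. Binding parameters instead of concatenating strings also avoids quoting problems like the curly quotes that slide software tends to introduce.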
35 Example of SQL Embedded in Java
36 Database System Environment
37 MySQL Account
• Every registered student will have access to a MySQL account on mysql-user.cse.msu.edu
• To log in, go to:
  – Username is your CSE username
    Password is your PID
    Server Choice: mysql-user
• Send an email if you have problems logging in