COP Introduction to Database Structures

COP 3540 - Introduction to Database Structures

Welcome Course website: http://faculty.eng.fau.edu/yangk/COP3540/index.html

Data Data are raw or isolated facts from which the required information is produced. Information is a collection of processed data.

Data Banking, airlines, universities, credit card transactions, telecommunication, finance, sales, manufacturing, human resources, biology, ecology, geospatial.

Why is it important? Database systems are an essential component of life in modern society. Examples include banking systems, airline reservation systems, mobile services, and car navigation systems.

File Systems A file is a collection of records, which contain logically related data. Each record contains a logically connected set of one or more fields, where each field represents some characteristic of the real-world object that is being modeled. In a file-based system, the definition of the data is embedded in the application programs, rather than being stored separately and independently, and there is no control over the access and manipulation of data beyond that imposed by the application programs.

File Systems A file is a sequence of records. All records in a file are of the same record type. A file-processing system is supported by a conventional operating system. The system stores permanent records in various files, and it needs different application programs to extract records from, and add records to, the appropriate files.
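To make the point about embedded data definitions concrete, here is a minimal sketch of a file-based approach in Python. The record layout (`RECORD_FORMAT`), the field names, and the sample students are all hypothetical; the point is only that the layout lives in the program, so every program that reads the file has to repeat it.

```python
import struct

# Hypothetical fixed-length record layout: 8-byte id, 20-byte name, 4-byte float GPA.
# In a file-based system this definition exists only inside the application code.
RECORD_FORMAT = "<8s20sf"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)


def pack_student(sid: str, name: str, gpa: float) -> bytes:
    """Encode one student record using the layout above."""
    return struct.pack(RECORD_FORMAT, sid.encode(), name.encode(), gpa)


def unpack_student(raw: bytes) -> tuple[str, str, float]:
    """Decode one record; only works if the caller knows the same layout."""
    sid, name, gpa = struct.unpack(RECORD_FORMAT, raw)
    return sid.rstrip(b"\x00").decode(), name.rstrip(b"\x00").decode(), gpa


# Stand-in for the contents of a data file: two records back to back.
file_bytes = pack_student("53666", "Jones", 3.4) + pack_student("53688", "Smith", 3.2)

# Every record has the same type, so the file is read in fixed-size slices.
records = [
    unpack_student(file_bytes[i:i + RECORD_SIZE])
    for i in range(0, len(file_bytes), RECORD_SIZE)
]
```

If the layout changes (say, a longer name field), every program that embeds `RECORD_FORMAT` must be changed with it.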

Database A database is a collection of related data. More precisely, it is a shared collection of logically related data, and a description of this data, designed to meet the information needs of an organization. A database is also defined as a self-describing collection of integrated records. The description of the data is known as the system catalog (or data dictionary or metadata – the ‘data about data’). It is the self-describing nature of a database that provides program–data independence.

Database A database represents the entities, the attributes, and the logical relationships between the entities. An entity is a distinct object (a person, place, thing, concept, or event) in the organization that is to be represented in the database. An attribute is a property that describes some aspect of the object that we wish to record, and a relationship is an association between entities.

System Catalog The system catalog is one of the fundamental components of a DBMS. It contains ‘data about the data’, or metadata. The catalog should be accessible to users. The Information Resource Dictionary System is an ISO standard that defines a set of access methods for a data dictionary. This allows dictionaries to be shared and transferred from one system to another.
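As one illustration of a user-accessible catalog, the sketch below uses Python's built-in `sqlite3` module. SQLite is chosen here only for convenience and is not mentioned in the slides; the `Students` table is a hypothetical example. SQLite exposes its catalog through the `sqlite_master` table and the `PRAGMA table_info` command.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table, created only so the catalog has something to describe.
conn.execute("CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, gpa REAL)")

# sqlite_master is SQLite's catalog: one row per schema object.
catalog_rows = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE type = 'table'"
).fetchall()

# PRAGMA table_info returns one row per column; index 1 holds the column name.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(Students)")]
```

The application never declared the column names in its own code; it recovered them from the database's description of itself.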

Database Management System (DBMS) A Database Management System (DBMS) is software that manages and controls access to the database. It enables users to define, create, maintain, and control access to the database, and it interacts with the users’ application programs and the database. A DBMS provides a Data Definition Language (DDL), which allows users to define the database, and a Data Manipulation Language (DML), which allows users to insert, update, delete, and retrieve data from the database.

Database Management System (DBMS) A DBMS provides controlled access to the database through: a security system, which prevents unauthorized users from accessing the database; an integrity system, which maintains the consistency of stored data; a concurrency control system, which allows shared access to the database; a recovery control system, which restores the database to a previous consistent state following a hardware or software failure; and a user-accessible catalog, which contains descriptions of the data in the database.
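The integrity system can be illustrated with a declared constraint that the DBMS enforces on every update. This is a minimal sketch using Python's `sqlite3` module; the `Accounts` table and the rule that balances may not be negative are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical rule: an account balance may never be negative.
conn.execute(
    "CREATE TABLE Accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance REAL NOT NULL CHECK (balance >= 0))"
)

conn.execute("INSERT INTO Accounts VALUES (1, 100.0)")  # satisfies the rule

try:
    conn.execute("INSERT INTO Accounts VALUES (2, -50.0)")  # violates the rule
    rejected = False
except sqlite3.IntegrityError:
    # The DBMS, not the application, refuses the inconsistent row.
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Accounts").fetchone()[0]
```

Because the rule is stored with the data, every application that touches `Accounts` is subject to it without repeating the check.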

Database Management System (DBMS) Some advantages of the database approach include control of data redundancy, data consistency, sharing of data, and improved security and integrity. Some disadvantages include complexity, cost, reduced performance, and higher impact of a failure.

Files vs. DBMS With files alone, the application must: stage large datasets between main memory and secondary storage (e.g., buffering, page-oriented access, 32-bit addressing, etc.); include special code for different queries; protect data from inconsistency due to multiple concurrent users; handle crash recovery; and provide security and access control.

Application Program An application program is a computer program that interacts with the database by issuing an appropriate request (typically an SQL statement) to the DBMS. The more inclusive term database system is used to define a collection of application programs that interact with the database, along with the DBMS and the database itself.

Why Use a DBMS? Data independence and efficient access. Reduced application development time. Data integrity and security. Uniform data administration. Concurrent access, recovery from crashes.

Why Study Databases? Shift from computation to information: at the “low end”, a scramble to webspace (a mess!); at the “high end”, scientific applications. Datasets are increasing in diversity and volume: digital libraries, interactive video, Human Genome project, EOS project ... the need for DBMS is exploding. DBMS encompasses most of CS: OS, languages, theory, AI, multimedia, logic.

History of Database Management Systems The roots of the DBMS lie in file-based systems. The hierarchical and CODASYL systems represent the first generation of DBMSs. The hierarchical model is typified by IMS (Information Management System) and the network or CODASYL model by IDS (Integrated Data Store), both developed in the mid-1960s. The relational model, proposed by E. F. Codd in 1970, represents the second generation of DBMSs. It has had a fundamental effect on the DBMS community and there are now over one hundred relational DBMSs. The third generation of DBMSs is represented by the Object-Relational DBMS and the Object-Oriented DBMS.

Data Model A data model is a collection of concepts that can be used to describe a set of data, the operations to manipulate the data, and a set of integrity constraints for the data. A semantic data model is a more abstract, high-level data model that describes the meaning of its instances. The entity-relationship (ER) model is a widely used semantic data model.

Level of Abstraction in DBMS The ANSI-SPARC database architecture uses three levels of abstraction: external, conceptual, and internal. The external level consists of the users’ views of the database. The conceptual level is the community view of the database. It specifies the information content of the entire database, independent of storage considerations. The conceptual level represents all entities, their attributes, and their relationships, as well as the constraints on the data, and security and integrity information. The internal level is the computer’s view of the database. It specifies how data is represented, how records are sequenced, what indexes and pointers exist, and so on.

Level of Abstraction in DBMS The external/conceptual mapping transforms requests and results between the external and conceptual levels. The conceptual/internal mapping transforms requests and results between the conceptual and internal levels.

Level of Abstraction in DBMS A database schema is a description of the database structure. Data independence makes each level immune to changes to lower levels. Logical data independence refers to the immunity of the external schemas to changes in the conceptual schema. Physical data independence refers to the immunity of the conceptual schema to changes in the internal schema.

Levels of Abstraction in a DBMS

Database Language A database language consists of two parts: a Data Definition Language (DDL) and a Data Manipulation Language (DML). The DDL is used to specify the database schema. The DML is used to both read and update the database. The part of a DML that involves data retrieval is called a query language.
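The split between DDL and DML can be seen in SQL, which plays both roles in most relational systems. The sketch below runs SQL through Python's `sqlite3` module; the `Courses` rows are made-up sample data, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: specify the schema.
conn.execute("CREATE TABLE Courses (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER)")

# DML (update part): insert, update, and delete rows.
conn.executemany(
    "INSERT INTO Courses VALUES (?, ?, ?)",
    [("C1", "Databases", 3), ("C2", "Algorithms", 4), ("C3", "Networks", 3)],
)
conn.execute("UPDATE Courses SET credits = 4 WHERE cid = 'C1'")
conn.execute("DELETE FROM Courses WHERE cid = 'C3'")

# DML (query language part): retrieve data.
four_credit = conn.execute(
    "SELECT cid FROM Courses WHERE credits = 4 ORDER BY cid"
).fetchall()
```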

Review A data model is a collection of concepts for describing data. A schema is a description of a particular collection of data, using a given data model. The relational model of data is the most widely used model today. Main concept: the relation, basically a table with rows and columns. Every relation has a schema, which describes the columns, or fields.

Relational Model The relational data model was proposed by E. F. Codd. A mathematical relation is a subset of the Cartesian product of two or more sets. In database terms, a relation is any subset of the Cartesian product of the domains of the attributes. A relation is normally written as a set of n-tuples, in which each element is chosen from the appropriate domain. Relations are physically represented as tables, with the rows corresponding to individual tuples and the columns to attributes.
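The definition of a relation as a subset of a Cartesian product can be checked directly on small sets. In this sketch the two domains and the tuples are invented for illustration.

```python
from itertools import product

# Hypothetical attribute domains.
sid_domain = {"s1", "s2"}
grade_domain = {"A", "B", "C"}

# The Cartesian product contains every possible (sid, grade) pair.
cartesian = set(product(sid_domain, grade_domain))

# A relation is any subset of that product: here, two 2-tuples.
relation = {("s1", "A"), ("s2", "C")}

# Each element of each tuple comes from the matching domain.
is_relation = relation <= cartesian
```

Written as a table, `relation` has two rows (the tuples) and two columns (the attributes).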

Example: University Database
Conceptual schema:
  Students(sid: string, name: string, login: string, age: integer, gpa: real)
  Courses(cid: string, cname: string, credits: integer)
  Enrolled(sid: string, cid: string, grade: string)
Physical schema:
  Relations stored as unordered files.
  Index on first column of Students.
External schema (view):
  Course_info(cid: string, enrollment: integer)
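The university schema can be sketched in SQL, run here through Python's `sqlite3` module. Two things in the sketch are assumptions: the slide gives only the signature of `Course_info`, so defining `enrollment` as the number of `Enrolled` rows per course is a guess at the intent, and the inserted rows are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Conceptual schema: the three relations from the slide.
CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Courses  (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

-- External schema: a view. Its definition is an assumption (count of enrollments).
CREATE VIEW Course_info AS
    SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;

-- Made-up sample data.
INSERT INTO Enrolled VALUES
    ('53666', 'CS101', 'A'),
    ('53688', 'CS101', 'B'),
    ('53666', 'CS202', 'B');
""")

# Users of the view see only (cid, enrollment), not individual grades.
course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
```

Declaring `sid` as a primary key also makes SQLite build an index on the first column of `Students`, which loosely corresponds to the physical schema in the example.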

Example: University Database Applications insulated from how data is structured and stored. Logical data independence: Protection from changes in logical structure of data. Physical data independence: Protection from changes in physical structure of data. One of the most important benefits of using a DBMS!
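Physical data independence can be observed by changing the storage structures and re-running an unchanged query. The sketch below uses Python's `sqlite3` module; the index name `idx_students_gpa` and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("53666", "Jones", 3.4), ("53688", "Smith", 3.2), ("53650", "Smith", 3.8)],
)

# The application's query, written once against the conceptual schema.
query = "SELECT name FROM Students WHERE gpa > 3.3 ORDER BY name"

before = conn.execute(query).fetchall()

# A change at the physical level: add an index on gpa.
conn.execute("CREATE INDEX idx_students_gpa ON Students (gpa)")

# The same query text still works and returns the same answer.
after = conn.execute(query).fetchall()
```

Only the access path may differ after the index is added; the query and its result do not.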

Transaction Management A transaction is a series of actions, carried out by a single user or application program, which accesses or changes the contents of the database. A transaction is a logical unit of work consisting of one or more SQL statements that is guaranteed to be atomic with respect to recovery.

Transaction Management A DBMS must furnish a mechanism that will ensure either that all the updates corresponding to a given transaction are made or that none of them is made. A DBMS must furnish a mechanism to ensure that the database is updated correctly when multiple users are updating the database concurrently. A DBMS must furnish a mechanism for recovering the database in the event that the database is damaged in any way.
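The all-or-nothing requirement can be sketched with a funds transfer, again using Python's `sqlite3` module. The `Accounts` table, the non-negative balance rule, and the `transfer` helper are all hypothetical. In `sqlite3`, using the connection as a context manager commits if the block succeeds and rolls back if it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (id INTEGER PRIMARY KEY, balance REAL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [(1, 100.0), (2, 50.0)])
conn.commit()


def transfer(conn: sqlite3.Connection, src: int, dst: int, amount: float) -> bool:
    """Move money between accounts as one transaction (hypothetical helper)."""
    try:
        with conn:  # commit on success, roll back on any exception
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, 1, 2, 30.0)       # both updates are applied
failed = transfer(conn, 1, 2, 500.0)  # debit violates the CHECK, so the credit is undone

balances = conn.execute("SELECT id, balance FROM Accounts ORDER BY id").fetchall()
```

In the failed transfer the credit to account 2 had already been executed, yet it does not survive: either all of the transaction's updates are made or none is.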

Concurrency Control Concurrent execution of user programs is essential for good DBMS performance. Because disk accesses are frequent and relatively slow, it is important to keep the CPU humming by working on several user programs concurrently. Interleaving actions of different user programs can lead to inconsistency: e.g., a check is cleared while the account balance is being computed. The DBMS ensures such problems don’t arise: users can pretend they are using a single-user system.
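The kind of inconsistency that interleaving can cause is easiest to see when the interleaving is written out by hand. The sketch below is a plain Python simulation, not real concurrent code: T1 (a withdrawal of 30) and T2 (a deposit of 50) are hypothetical transactions on a starting balance of 100.

```python
# Uncontrolled interleaving: both transactions read before either writes.
balance = 100
t1_read = balance        # T1 reads 100
t2_read = balance        # T2 reads 100
balance = t1_read - 30   # T1 writes 70
balance = t2_read + 50   # T2 writes 150, silently overwriting T1's update
interleaved_result = balance

# Serial execution: T1 runs to completion, then T2.
balance = 100
balance = balance - 30   # T1
balance = balance + 50   # T2
serial_result = balance
```

The interleaved schedule loses the withdrawal, while the serial one does not. Concurrency control allows interleaving for performance but only in ways whose outcome matches some serial order.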

Roles in the Database Environment The Database Administrator (DBA) is responsible for the physical realization of the database, including physical database design and implementation, security and integrity control, maintenance of the operational system, and ensuring satisfactory performance of the applications for users. The logical database designer is concerned with identifying the data (that is, the entities and attributes), the relationships between the data, and the constraints on the data that is to be stored in the database. Other roles include application developers and end-users.

Databases make these folks happy ... End users and DBMS vendors. DB application programmers (e.g., smart webmasters). Database administrator (DBA): designs logical/physical schemas; handles security and authorization; ensures data availability and crash recovery; tunes the database as needs evolve.

Structure of a DBMS A typical DBMS has a layered architecture. The figure does not show the concurrency control and recovery components, which all of these layers must take into account. This is one of several possible architectures; each system has its own variations. Layers, from top to bottom: Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB.

Architecture of DBMS

Summary A DBMS is used to maintain and query large datasets. Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security. Levels of abstraction give data independence. A DBMS typically has a layered architecture. DBAs hold responsible jobs and are well-paid! DBMS R&D is one of the broadest, most exciting areas in CS.