COP Introduction to Database Structures


1 COP 3540 - Introduction to Database Structures

2 Welcome Course web-site

3 Data Data are raw or isolated facts from which the required information is produced. Information is a collection of processed data.

4 Data Banking, airlines, universities, credit card transactions, telecommunication, finance, sales, manufacturing, human resources, biology, ecology, geospatial.

7 Why is it important? Database systems are an essential component of life in modern society. Examples include banking systems, airline reservation systems, mobile phones, and car navigation systems.

8 File Systems A file is a collection of records that contain logically related data. Each record contains a logically connected set of one or more fields, where each field represents some characteristic of the real-world object that is being modeled. In a file-based system, the definition of the data is embedded in the application programs, rather than being stored separately and independently, and there is no control over the access and manipulation of data beyond that imposed by the application programs.

9 File Systems A file is a sequence of records.
All records in a file are of the same record type. A file-processing system is supported by a conventional operating system. The system stores permanent records in various files, and it needs different application programs to extract records from, and add records to, the appropriate files.
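
A minimal sketch of the file-based approach described on the two File Systems slides, assuming a hypothetical fixed-length student record (the field widths, names, and sample values are illustrative and not from the slides). The point to notice is that the record layout lives only in the program: any other program that reads the same file has to repeat the same format string.

```python
import io
import struct

# Hypothetical record layout, known only to this program:
# 8-byte student id, 20-byte name, 4-byte little-endian integer age.
RECORD_FORMAT = "<8s20si"
RECORD_SIZE = struct.calcsize(RECORD_FORMAT)


def write_record(f, sid, name, age):
    # struct pads short byte strings with NUL bytes up to the field width.
    f.write(struct.pack(RECORD_FORMAT, sid.encode(), name.encode(), age))


def read_records(f):
    # Read fixed-size chunks until the file is exhausted.
    records = []
    while True:
        chunk = f.read(RECORD_SIZE)
        if len(chunk) < RECORD_SIZE:
            break
        sid, name, age = struct.unpack(RECORD_FORMAT, chunk)
        records.append((sid.rstrip(b"\0").decode(),
                        name.rstrip(b"\0").decode(),
                        age))
    return records


data_file = io.BytesIO()  # stands in for a file on disk
write_record(data_file, "53666", "Jones", 18)
write_record(data_file, "53688", "Smith", 19)
data_file.seek(0)
students = read_records(data_file)
```

If the layout changes (say, a wider name field), every program that embeds `RECORD_FORMAT` must change with it, which is the dependence the database approach is meant to remove.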

10 Database A database is a collection of related data.
More precisely, a database is a shared collection of logically related data, and a description of this data, designed to meet the information needs of an organization. A database is also defined as a self-describing collection of integrated records. The description of the data is known as the system catalog (or data dictionary, or metadata: the ‘data about data’). It is the self-describing nature of a database that provides program–data independence.

11 Database A database represents the entities, the attributes, and the logical relationships between the entities. An entity is a distinct object (a person, place, thing, concept, or event) in the organization that is to be represented in the database. An attribute is a property that describes some aspect of the object that we wish to record, and a relationship is an association between entities.

12 System Catalog The system catalog is one of the fundamental components of a DBMS. It contains ‘data about the data’, or metadata. The catalog should be accessible to users. The Information Resource Dictionary System is an ISO standard that defines a set of access methods for a data dictionary. This allows dictionaries to be shared and transferred from one system to another.
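
As an illustration of a user-accessible catalog, the sketch below uses SQLite through Python's standard `sqlite3` module. SQLite is an assumption made here for convenience; the slides do not name a particular DBMS, and the `Students` table is a hypothetical example. SQLite keeps its schema in a built-in table called `sqlite_master`, and `PRAGMA table_info` lists the columns of a table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, age INTEGER)"
)

# sqlite_master is SQLite's catalog table: one row per schema object.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info yields one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2])
           for row in conn.execute("PRAGMA table_info(Students)")]
```

The program asks the database what it contains instead of carrying its own copy of the record layout.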

13 Database Management System (DBMS)
A Database Management System (DBMS) is software that manages and controls access to the database. A DBMS enables users to define, create, maintain, and control access to the database. It interacts with the users’ application programs and the database. A DBMS provides a Data Definition Language (DDL), which allows users to define the database, and a Data Manipulation Language (DML), which allows users to insert, update, delete, and retrieve data from the database.

14 Database Management System (DBMS)
A DBMS provides controlled access to the database. It may provide: a security system, which prevents unauthorized users from accessing the database; an integrity system, which maintains the consistency of stored data; a concurrency control system, which allows shared access to the database; a recovery control system, which restores the database to a previous consistent state following a hardware or software failure; and a user-accessible catalog, which contains descriptions of the data in the database.
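
One of the facilities listed above, the integrity system, can be sketched with a declared constraint. This example assumes SQLite via Python's `sqlite3` module and a hypothetical `Students` table whose `gpa` must lie between 0.0 and 4.0; the range is an illustrative assumption, not something the slides specify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students ("
    " sid TEXT PRIMARY KEY,"
    " gpa REAL CHECK (gpa BETWEEN 0.0 AND 4.0))"
)
conn.execute("INSERT INTO Students VALUES ('53666', 3.4)")

try:
    # Violates the CHECK constraint, so the DBMS refuses the row.
    conn.execute("INSERT INTO Students VALUES ('53688', 7.5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Students").fetchone()[0]
```

The rule is stated once, in the schema, and is enforced for every application that writes to the table.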

15 Database Management System (DBMS)
Some advantages of the database approach include control of data redundancy, data consistency, sharing of data, and improved security and integrity. Some disadvantages include complexity, cost, reduced performance, and higher impact of a failure.

16 Files vs. DBMS With files, the application must stage large datasets between main memory and secondary storage (e.g., buffering, page-oriented access, 32-bit addressing). Special code is needed for different queries. The application must protect data from inconsistency due to multiple concurrent users, and it must also handle crash recovery, security, and access control.

17 Application Program An application program is a computer program that interacts with the database by issuing an appropriate request (typically an SQL statement) to the DBMS. The more inclusive term database system is used to define a collection of application programs that interact with the database, along with the DBMS and the database itself.

18 Why Use a DBMS? Data independence and efficient access.
Reduced application development time. Data integrity and security. Uniform data administration. Concurrent access, recovery from crashes.

19 Why Study Databases? Shift from computation to information:
at the “low end”, a scramble to webspace (a mess!); at the “high end”, scientific applications. Datasets are increasing in diversity and volume: digital libraries, interactive video, the Human Genome project, the EOS project ... The need for DBMS is exploding. DBMS encompasses most of CS: OS, languages, theory, AI, multimedia, logic.

20 History of Database Management Systems
The roots of the DBMS lie in file-based systems. The hierarchical and CODASYL systems represent the first generation of DBMSs. The hierarchical model is typified by IMS (Information Management System) and the network or CODASYL model by IDS (Integrated Data Store), both developed in the mid-1960s. The relational model, proposed by E. F. Codd in 1970, represents the second generation of DBMSs. It has had a fundamental effect on the DBMS community and there are now over one hundred relational DBMSs. The third generation of DBMSs is represented by the Object-Relational DBMS and the Object-Oriented DBMS.

21 Data Model A data model is a collection of concepts that can be used to describe a set of data, the operations to manipulate the data, and a set of integrity constraints for the data. A semantic data model is a more abstract, high-level data model that describes the meaning of its instances. The entity-relationship (ER) model is a widely used semantic data model.

22 Level of Abstraction in DBMS
The ANSI-SPARC database architecture uses three levels of abstraction: external, conceptual, and internal. The external level consists of the users’ views of the database. The conceptual level is the community view of the database. It specifies the information content of the entire database, independent of storage considerations. The conceptual level represents all entities, their attributes, and their relationships, as well as the constraints on the data, and security and integrity information. The internal level is the computer’s view of the database. It specifies how data is represented, how records are sequenced, what indexes and pointers exist, and so on.

23 Level of Abstraction in DBMS
The external/conceptual mapping transforms requests and results between the external and conceptual levels. The conceptual/internal mapping transforms requests and results between the conceptual and internal levels.

24 Level of Abstraction in DBMS
A database schema is a description of the database structure. Data independence makes each level immune to changes to lower levels. Logical data independence refers to the immunity of the external schemas to changes in the conceptual schema. Physical data independence refers to the immunity of the conceptual schema to changes in the internal schema.
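
Logical data independence can be sketched with a view that keeps working after the conceptual schema grows. The example assumes SQLite via Python's `sqlite3` module; the `Student_names` view and the sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT)")
conn.execute("INSERT INTO Students VALUES ('53666', 'Jones')")

# External schema: a view that exposes only what this user group needs.
conn.execute("CREATE VIEW Student_names AS SELECT sid, name FROM Students")
before = conn.execute("SELECT * FROM Student_names").fetchall()

# Conceptual schema change: the base relation gains a new attribute.
conn.execute("ALTER TABLE Students ADD COLUMN login TEXT")
after = conn.execute("SELECT * FROM Student_names").fetchall()
```

Applications written against `Student_names` see the same rows before and after the change to `Students`.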

25 Levels of Abstraction in a DBMS

26 Database Language A database language consists of two parts: a Data Definition Language (DDL) and a Data Manipulation Language (DML). The DDL is used to specify the database schema. The DML is used to both read and update the database. The part of a DML that involves data retrieval is called a query language.
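
The split between DDL and DML can be sketched with SQL, which the slides mention as the typical request language. The example assumes SQLite via Python's `sqlite3` module; the course rows are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the schema.
conn.execute(
    "CREATE TABLE Courses (cid TEXT PRIMARY KEY, cname TEXT, credits INTEGER)"
)

# DML: insert, update, and delete rows.
conn.execute("INSERT INTO Courses VALUES ('COP3540', 'Database Structures', 3)")
conn.execute("INSERT INTO Courses VALUES ('TMP0001', 'Placeholder', 1)")
conn.execute("UPDATE Courses SET credits = 4 WHERE cid = 'COP3540'")
conn.execute("DELETE FROM Courses WHERE cid = 'TMP0001'")

# Query language: the retrieval part of the DML.
rows = conn.execute("SELECT cid, cname, credits FROM Courses").fetchall()
```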

27 Review A data model is a collection of concepts for describing data.
A schema is a description of a particular collection of data, using a given data model. The relational model of data is the most widely used model today. Main concept: relation, basically a table with rows and columns. Every relation has a schema, which describes the columns, or fields.

28 Relational Model The relational data model was proposed by E. F. Codd.
A mathematical relation is a subset of the Cartesian product of two or more sets. In database terms, a relation is any subset of the Cartesian product of the domains of the attributes. A relation is normally written as a set of n-tuples, in which each element is chosen from the appropriate domain. Relations are physically represented as tables, with the rows corresponding to individual tuples and the columns to attributes.
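
The definition above can be checked directly on small sets. In this sketch the two domains and the tuples are illustrative assumptions; it builds the Cartesian product of the domains and confirms that a chosen set of 2-tuples is a subset of it.

```python
from itertools import product

# Hypothetical attribute domains.
sid_domain = {"53666", "53688"}
grade_domain = {"A", "B", "C"}

# The Cartesian product contains every possible (sid, grade) pair.
cartesian = set(product(sid_domain, grade_domain))

# A relation is any subset of that product, written as a set of n-tuples.
relation = {("53666", "A"), ("53688", "B")}
is_relation = relation <= cartesian
```

Written as a table, `relation` has two rows (tuples) and two columns (attributes).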

29 Example: University Database
Conceptual schema:
Students(sid: string, name: string, login: string, age: integer, gpa: real)
Courses(cid: string, cname: string, credits: integer)
Enrolled(sid: string, cid: string, grade: string)
Physical schema:
Relations stored as unordered files.
Index on first column of Students.
External schema (view):
Course_info(cid: string, enrollment: integer)
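
A sketch of how the three schemas on this slide might be declared, assuming SQLite via Python's `sqlite3` module. The slide gives only the signature of `Course_info`, so the view definition (enrollment counted per course from `Enrolled`), the index name, and the sample rows are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the three relations from the slide.
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT,
                           age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- Physical schema: an index on the first column of Students.
    CREATE INDEX students_sid_idx ON Students (sid);

    -- External schema: one plausible definition of the Course_info view.
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment
        FROM Enrolled
        GROUP BY cid;
""")

conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("53666", "COP3540", "A"),
     ("53688", "COP3540", "B"),
     ("53666", "MAT1000", "C")],
)
course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid").fetchall()
```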

30 Example: University Database
Applications insulated from how data is structured and stored. Logical data independence: Protection from changes in logical structure of data. Physical data independence: Protection from changes in physical structure of data. One of the most important benefits of using a DBMS!
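
Physical data independence can be sketched by changing the storage structure underneath an unchanged query. The example assumes SQLite via Python's `sqlite3` module; the rows and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (sid TEXT, name TEXT, gpa REAL)")
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?)",
    [("53666", "Jones", 3.4), ("53688", "Smith", 3.2)],
)

# The application's query never mentions how the data is stored.
query = "SELECT name FROM Students WHERE sid = ?"
before = conn.execute(query, ("53688",)).fetchall()

# Physical change: add an index. The query text stays exactly the same.
conn.execute("CREATE INDEX students_sid_idx ON Students (sid)")
after = conn.execute(query, ("53688",)).fetchall()
```

Only the access path may differ after the index is created; the application code and its results do not.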

31 Transaction Management
A transaction is a series of actions, carried out by a single user or application program, which accesses or changes the contents of the database. A transaction is a logical unit of work consisting of one or more SQL statements that is guaranteed to be atomic with respect to recovery.

32 Transaction Management
A DBMS must furnish a mechanism that will ensure either that all the updates corresponding to a given transaction are made or that none of them is made. A DBMS must furnish a mechanism to ensure that the database is updated correctly when multiple users are updating the database concurrently. A DBMS must furnish a mechanism for recovering the database in the event that the database is damaged in any way.
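
The all-or-nothing requirement can be sketched with a funds transfer. The example assumes SQLite via Python's `sqlite3` module with its default transaction handling; the `Accounts` table, the non-negative balance rule, and the amounts are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts ("
    " id TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)",
                 [("A", 100), ("B", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception escapes it.
        with conn:
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst))
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE id = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "A", "B", 30)       # both updates are applied
failed = transfer(conn, "A", "B", 500)  # debit fails, credit is undone
balances = dict(conn.execute("SELECT id, balance FROM Accounts"))
```

In the second call the credit to `B` has already run when the debit violates the constraint, and the rollback removes it, so no partial transfer is visible.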

33 Concurrency Control Concurrent execution of user programs is essential for good DBMS performance. Because disk accesses are frequent, and relatively slow, it is important to keep the CPU humming by working on several user programs concurrently. Interleaving actions of different user programs can lead to inconsistency: e.g., a check is cleared while the account balance is being computed. The DBMS ensures such problems don’t arise: users can pretend they are using a single-user system.
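
The kind of inconsistency described above can be sketched without a database. The first function replays one bad interleaving by hand (both programs read the balance before either writes), and the `Account` class uses a lock as a simple stand-in for the concurrency control a DBMS provides. All names and amounts are hypothetical, and real DBMSs use more elaborate mechanisms than a single lock.

```python
import threading


def interleaved_deposits(balance, first, second):
    """Replay a lost update: both programs read, then both write."""
    read_1 = balance           # program 1 reads the balance
    read_2 = balance           # program 2 reads the same, now stale, value
    balance = read_1 + first   # program 1 writes its result
    balance = read_2 + second  # program 2 overwrites program 1's update
    return balance


class Account:
    def __init__(self, balance):
        self.balance = balance
        self._lock = threading.Lock()

    def deposit(self, amount):
        # The lock makes the read-modify-write step indivisible.
        with self._lock:
            self.balance = self.balance + amount


lost = interleaved_deposits(100, 50, 25)

account = Account(100)
threads = [threading.Thread(target=account.deposit, args=(amount,))
           for amount in (50, 25)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Here `lost` ends at 125 because the first deposit is overwritten, while `account.balance` ends at 175 because the two deposits are serialized.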

34 Roles in the Database Environment
The Database Administrator (DBA) is responsible for the physical realization of the database, including physical database design and implementation, security and integrity control, maintenance of the operational system, and ensuring satisfactory performance of the applications for users. The logical database designer is concerned with identifying the data (that is, the entities and attributes), the relationships between the data, and the constraints on the data that is to be stored in the database. Other roles include application developers and end-users.

35 Databases make these folks happy ...
End users and DBMS vendors. DB application programmers (e.g., smart webmasters). Database administrator (DBA): designs logical/physical schemas; handles security and authorization; data availability, crash recovery; database tuning as needs evolve.

36 Structure of a DBMS
A typical DBMS has a layered architecture. The figure does not show the concurrency control and recovery components. This is one of several possible architectures; each system has its own variations. Layers shown in the figure, from top to bottom (these layers must consider concurrency control and recovery): Query Optimization and Execution; Relational Operators; Files and Access Methods; Buffer Management; Disk Space Management; DB.

37 Architecture of DBMS

38 Summary A DBMS is used to maintain and query large datasets.
Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security. Levels of abstraction give data independence. A DBMS typically has a layered architecture. DBAs hold responsible jobs and are well-paid! DBMS R&D is one of the broadest, most exciting areas in CS.

