1
Introduction to Database Systems
Instructor: Xintao Wu
The slides for this text are organized into several modules. Each lecture contains about enough material for a 1.25 hour class period. (The time estimate is very approximate: it will vary with the instructor, and lectures also differ in length, so use this as a rough guideline.) This lecture is the first of two in Module (1).
Module (1): Introduction (DBMS, Relational Model)
Module (2): Storage and File Organizations (Disks, Buffering, Indexes)
Module (3): Database Concepts (Relational Queries, DDL/ICs, Views and Security)
Module (4): Relational Implementation (Query Evaluation, Optimization)
Module (5): Database Design (ER Model, Normalization, Physical Design, Tuning)
Module (6): Transaction Processing (Concurrency Control, Recovery)
Module (7): Advanced Topics
Ramakrishnan & Gehrke
3
History
60s: C. Bachman, GE, network data model
Late 60s: IBM IMS, hierarchical data model
E. Codd: relational model
80s: SQL, IBM System R; transactions (J. Gray)
Late 80s-90s: DB2, Oracle, Informix, Sybase
90s: Data warehouse, internet, NoSQL, NewSQL, M. Stonebraker
Turing award and Turing test?
Turing award list, Turing website
4
What Is a DBMS?
A database is a very large, integrated collection of data.
It models a real-world enterprise.
  Entities (e.g., students, courses)
  Relationships (e.g., Madonna is taking ITCS6160)
A Database Management System (DBMS) is a software package designed to maintain and utilize databases.
5
Why Use a DBMS?
Data independence and efficient access.
Reduced application development time.
Data integrity and security.
Uniform data administration.
Concurrent access, recovery from crashes.
7
Data Models
A data model is a collection of concepts for describing data.
A schema is a description of a particular collection of data, using the given data model.
The relational model of data is the most widely used model today.
  Main concept: relation, basically a table with rows and columns.
  Every relation has a schema, which describes the columns, or fields.
8
Levels of Abstraction
Many views, single conceptual (logical) schema and physical schema.
  Views describe how users see the data.
  Conceptual schema defines logical structure.
  Physical schema describes the files and indexes used.
[Figure: View 1, View 2 and View 3 above the Conceptual Schema, which sits above the Physical Schema]
Schemas are defined using DDL; data is modified/queried using DML.
9
Example: University Database
Conceptual schema:
  Students(sid: string, name: string, login: string, age: integer, gpa: real)
  Courses(cid: string, cname: string, credits: integer)
  Enrolled(sid: string, cid: string, grade: string)
Physical schema:
  Relations stored as unordered files.
  Index on first column of Students.
External Schema (View):
  Course_info(cid: string, enrollment: integer)
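The three schema levels above can be sketched in a few lines. This is only an illustration, assuming SQLite (through Python's standard sqlite3 module) as a stand-in DBMS; the index name, the sample rows and the second course id are made up for the example and do not come from the slides.

```python
import sqlite3

# SQLite in memory stands in for the DBMS (an assumption for illustration).
conn = sqlite3.connect(":memory:")

conn.executescript("""
    -- Conceptual schema: the three relations from the slide.
    CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
    CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
    CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

    -- Physical schema: an index on the first column of Students
    -- (the index name is hypothetical).
    CREATE INDEX students_sid_idx ON Students (sid);

    -- External schema: a view exposing only enrollment counts per course.
    CREATE VIEW Course_info AS
        SELECT cid, COUNT(*) AS enrollment
        FROM Enrolled
        GROUP BY cid;
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO Enrolled VALUES (?, ?, ?)",
    [("s1", "ITCS6160", "A"), ("s2", "ITCS6160", "B"), ("s1", "ITCS6114", "A")],
)

# A user of the view never sees Enrolled or the index directly.
course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid"
).fetchall()
print(course_info)
```

The CREATE TABLE and CREATE VIEW statements are DDL; the INSERT and SELECT statements are DML.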
10
Data Independence
Applications insulated from how data is structured and stored.
Logical data independence: Protection from changes in logical structure of data.
Physical data independence: Protection from changes in physical structure of data.
One of the most important benefits of using a DBMS!
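A small sketch of what this insulation can look like in practice, again assuming SQLite via Python's sqlite3 module as the DBMS. The function name, index name, added column and sample rows are all hypothetical. The same application query is run unchanged before and after a physical change (a new index) and a logical change (a new column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("s1", "Jones", "jones@cs", 18, 3.4),
        ("s2", "Smith", "smith@ee", 19, 3.8),
        ("s3", "Guldu", "guldu@music", 20, 3.9),
    ],
)

def high_gpa_names(conn):
    """Application query, written against column names only."""
    rows = conn.execute("SELECT name FROM Students WHERE gpa > 3.5 ORDER BY name")
    return [name for (name,) in rows]

before = high_gpa_names(conn)

# Physical change: add an index. The query text does not change.
conn.execute("CREATE INDEX students_gpa_idx ON Students (gpa)")
after_physical_change = high_gpa_names(conn)

# Logical change: add a column the application does not use.
conn.execute("ALTER TABLE Students ADD COLUMN major TEXT")
after_logical_change = high_gpa_names(conn)

print(before, after_physical_change, after_logical_change)
```

Larger logical changes, such as splitting a relation, would typically be hidden behind a view instead; that case is not shown here.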
11
Structure of a DBMS
A typical DBMS has a layered architecture.
The figure does not show the concurrency control and recovery components; these layers must consider concurrency control and recovery.
This is one of several possible architectures; each system has its own variations.
[Figure: layers, top to bottom]
  Query Optimization and Execution
  Relational Operators
  Files and Access Methods
  Buffer Management
  Disk Space Management
  DB
12
Transaction Management: ACID properties
Atomicity: All actions in the Xact happen, or none happen.
Consistency: If each Xact is consistent, and the DB starts consistent, it ends up consistent.
Isolation: Execution of one Xact is isolated from that of other Xacts.
Durability: If a Xact commits, its effects persist.
The Recovery Manager guarantees Atomicity & Durability, whereas Concurrency Control guarantees Consistency & Isolation.
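Atomicity is the easiest of the four properties to see in a few lines. The sketch below assumes SQLite via Python's sqlite3 module; the Accounts table, the transfer function and its fail_midway flag are hypothetical names made up for this illustration. A transfer that fails between its two writes is rolled back, so neither write is visible afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [("A", 1000), ("B", 2000)])
conn.commit()

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM Accounts"))

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` from src to dst as one transaction (all or nothing)."""
    try:
        conn.execute(
            "UPDATE Accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        if fail_midway:
            # Simulated failure after the first write, before the second.
            raise RuntimeError("simulated failure")
        conn.execute(
            "UPDATE Accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )
        conn.commit()    # both writes become visible together
    except RuntimeError:
        conn.rollback()  # the partial write to dst is undone

transfer(conn, "B", "A", 100, fail_midway=True)
after_failed_transfer = balances(conn)

transfer(conn, "B", "A", 100)
after_successful_transfer = balances(conn)

print(after_failed_transfer, after_successful_transfer)
```

Durability is not exercised here, since an in-memory database does not survive the process.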
13
Motivation of concurrency control
Consistency
Isolation
Example
  Two parallel transactions T1 and T2
  Serial execution
  Execution with interleaving actions
  Example shown on board
14
Example
Consider two transactions (Xacts):
  T1: BEGIN A=A+100, B=B-100 END
  T2: BEGIN A=1.06*A, B=1.06*B END
Intuitively, the first transaction is transferring $100 from B's account to A's account. The second is crediting both accounts with a 6% interest payment.
There is no guarantee that T1 will execute before T2 or vice-versa, if both are submitted together. However, the net effect must be equivalent to these two transactions running serially in some order.
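The two serial orders can be worked through numerically. This is a minimal sketch under stated assumptions: the starting balances (A = 1000, B = 2000) are made up, T1 is taken to be the $100 transfer from B to A described above, and interest is computed with integer arithmetic to avoid floating-point rounding.

```python
def t1(a, b):
    """T1: transfer $100 from B's account to A's account."""
    return a + 100, b - 100

def t2(a, b):
    """T2: credit both accounts with 6% interest (integer dollars)."""
    return a * 106 // 100, b * 106 // 100

start = (1000, 2000)  # hypothetical starting balances (A, B)

t1_then_t2 = t2(*t1(*start))
t2_then_t1 = t1(*t2(*start))

print("T1 then T2:", t1_then_t2, "total", sum(t1_then_t2))
print("T2 then T1:", t2_then_t1, "total", sum(t2_then_t1))
```

The two orders leave different balances in each account, but both are acceptable outcomes: each corresponds to some serial execution, and in both the bank pays 6% on the same combined total.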
15
Example (Contd.)
Consider a possible interleaving (schedule), with actions listed in time order:
  1. T1: A=A+100
  2. T2: A=1.06*A
  3. T1: B=B-100
  4. T2: B=1.06*B
This is OK. But what about:
  1. T1: A=A+100
  2. T2: A=1.06*A
  3. T2: B=1.06*B
  4. T1: B=B-100
The DBMS's view of the second schedule, as reads (R) and writes (W) in time order:
  T1: R(A), W(A)
  T2: R(A), W(A), R(B), W(B)
  T1: R(B), W(B)
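Whether an interleaving is acceptable can be checked by comparing its outcome with the outcomes of the serial orders. The sketch below makes several assumptions that are not in the slides: the starting balances are made up, the first schedule is taken to run T1's update of A, T2's update of A, T1's update of B, then T2's update of B, and the second schedule is taken to run both of T2's updates between T1's two updates. All function and variable names are hypothetical. The check compares final states for one starting state only, which is weaker than a general serializability test.

```python
# Individual actions of each transaction, operating on a shared state dict.
def t1_credit_a(s):   s["A"] += 100
def t1_debit_b(s):    s["B"] -= 100
def t2_interest_a(s): s["A"] = s["A"] * 106 // 100
def t2_interest_b(s): s["B"] = s["B"] * 106 // 100

START = {"A": 1000, "B": 2000}  # hypothetical starting balances

def run(schedule, start=START):
    """Apply the actions in the given time order to a copy of the start state."""
    state = dict(start)
    for action in schedule:
        action(state)
    return state

# Outcomes of the two serial orders.
serial_outcomes = [
    run([t1_credit_a, t1_debit_b, t2_interest_a, t2_interest_b]),  # T1; T2
    run([t2_interest_a, t2_interest_b, t1_credit_a, t1_debit_b]),  # T2; T1
]

# Assumed interleavings (see the lead-in).
schedule_1 = [t1_credit_a, t2_interest_a, t1_debit_b, t2_interest_b]
schedule_2 = [t1_credit_a, t2_interest_a, t2_interest_b, t1_debit_b]

def matches_a_serial_order(schedule):
    return run(schedule) in serial_outcomes

for name, schedule in [("schedule 1", schedule_1), ("schedule 2", schedule_2)]:
    print(name, run(schedule), "OK" if matches_a_serial_order(schedule) else "NOT OK")
```

In the second schedule the $100 being transferred earns interest in both accounts, so the final total is $6 higher than in any serial execution.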
16
Motivation of recovery management
Atomicity: Transactions may abort ("Rollback").
Durability: What if DBMS stops running? (Causes?)
Desired behavior after system restarts:
  T1, T2 & T3 should be durable.
  T4 & T5 should be aborted (effects not seen).
[Figure: timeline of transactions T1 to T5, with a crash point]
17
Summary
DBMS used to maintain, query large datasets.
Benefits include recovery from system crashes, concurrent access, quick application development, data integrity and security.
Levels of abstraction give data independence.
A DBMS typically has a layered architecture.
DBAs hold responsible jobs and are well-paid!
DBMS R&D is one of the broadest, most exciting areas in CS.