Introduction to Database Systems CIS 4301 Lecture Notes 1/10/2006.


1 Introduction to Database Systems CIS 4301 Lecture Notes 1/10/2006

2 What is a Database?
Lecture 1 © CIS 4301 - Spring 2006
- Collection of related data items that are stored for record-keeping and analysis
  - Could be stored on cards in a Rolodex, in a file cabinet, on a computer, …
- Computerized databases are managed by a Database Management System (DBMS)
  - Persistent storage: efficient, safe storage of large amounts of data
  - Programming interface: high-level language for specifying the operations the user wishes to perform on the data
  - Transaction management: concurrent access to data, with recovery in the event of failure

3 Importance of DBMS
- Amount of electronically available data is exploding
- Cost of storage is continuously dropping
  - Moore’s law: every 18 months, processor speed or disk capacity doubles, or the price drops by half
- Value of data as an organizational asset is widely accepted
- High demand in industry for powerful, flexible data management systems that store data efficiently and get the most out of large, complex data sets
  - e.g., data warehousing, data mining
- Largest databases
  - Federal Express, Wal-Mart, Knight-Ridder (Dialog), …
  - Tables with 1 billion or more rows
  - Approaching tens of TB of data
- Think of the consequences of storing this much data…?

4 Brief History of Data Management
- Early DBMSs (late 1960s) evolved from file-based processing systems
  - Need to support concurrent access to the data by many users, recovery, back-up, …
  - Roots in airline reservation systems (SABRE), banking systems, corporate record-storage systems
- Visualized the data much as it was stored
  - Tree-based (hierarchical model)
  - Graph-based (network model)
- Cumbersome to use; required programming to access data
[Figure: example data hierarchy with nodes DEPTS, EMPS, MGR, ITEMS, NAME, SS#]

5 Advent of Modern DBMS
- Early 1970s: Ted Codd invented a new data model (the relational data model) and the concept of data abstraction
- Soon thereafter, a team at IBM invented SQL (Structured Query Language)
  - Became the de facto standard for query languages based on the relational data model
- Commercial DBMSs based on the relational model are now widely accepted in industry
  - e.g., Microsoft Access, Oracle 9i, Sybase Adaptive Server, …
  - >10 billion dollar industry!

6 Characteristics of Modern Database Systems
- Support for concurrent access to data
- Safeguard data against accidental loss
- Maintain integrity of the database in light of changes
- Support for distributed data
- Control access to data
- More recently
  - Support for non-standard data
  - Support for heterogeneous data
  - Support for decision-support and analysis

7 Additional Requirements
- Increased usage and new applications for databases have resulted in additional requirements (since the early days)
  - High availability
  - High reliability
  - High throughput
  - Low response time
  - Extensible

8 Some Recent Trends
- DBMSs are getting smaller and smaller
  - A DBMS that can store GBs of data can run on a PC
- Databases are getting bigger and bigger
  - Multiple TBs (terabyte = 10^12 bytes) not uncommon
  - Databases also able to store images, video, audio
  - Databases stored on secondary storage devices
- Use of tertiary storage in OLTPs
  - Larger capacity than disks, but much slower response time (10-20 msec for disk vs. several sec.)
  - Tape, CD, etc. usually involve robotic conveyance
- DBMSs supporting parallel computing
  - Speed up query processing through parallelism (e.g., read data from many disks)
  - However, need special algorithms to partition data correctly
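One simple way to partition data for parallel reads is hash partitioning. The sketch below is only an illustration and is not taken from the lecture: the function name `hash_partition`, the sample rows, and the choice of CRC32 as the hash are all assumptions. It shows the basic property a partitioning scheme must preserve: every row lands in exactly one partition, and rows with the same key always land in the same one.

```python
import zlib

# Hypothetical sample rows: (ssn, name, department).
EMP_ROWS = [
    ("111", "Ann", "Sales"),
    ("222", "Bob", "Toys"),
    ("333", "Cy", "Sales"),
    ("444", "Dee", "Shoes"),
]

def hash_partition(rows, key_col, n_parts):
    """Assign each row to one of n_parts partitions by hashing its key column."""
    parts = [[] for _ in range(n_parts)]
    for row in rows:
        # CRC32 is deterministic across runs, unlike Python's built-in hash() for strings.
        h = zlib.crc32(str(row[key_col]).encode("utf-8"))
        parts[h % n_parts].append(row)
    return parts

# Partition by department so that all rows of one department share a partition.
partitions = hash_partition(EMP_ROWS, key_col=2, n_parts=3)
```

Each partition could then be stored on, and scanned from, a different disk. Real systems also have to deal with skew (one partition receiving far more rows than the others), which this sketch ignores.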

9 Types of DBMS
- General-purpose DBMS
- Multimedia DBMS
- Geographic information systems (GIS)
- Data warehouse DBMS
- Real-time DBMS
- Active DBMS

10 Actors
- System Analyst
- Database Designer
- Application Programmer
- Project Manager
- Database Administrator
- System Administrator
- End Users
  - Naïve end users
  - Sophisticated end users

11 When NOT to Use a DBMS
- Initial investment too high
- Too much overhead
- Application is simple, well-defined, not expected to change
- Stringent real-time requirements (use specialized real-time DBMS)
- Multi-user access to data is not required
- Alternative: collection of files managed by access programs

12 Some Terminology
- Database (DB)
  - Collection of related data that exists over a long period of time
- Database Management System (DBMS)
  - Collection of programs that
    - allows users to create a new database and specify its structure
    - gives users the ability to query and modify the data efficiently
    - keeps the data secure from accidents or unauthorized use
    - controls access to the data for many users at once
- Database System (DBS)
  - The database and the DBMS software together make up what is known as the database system

13 DBMS Languages
- Data Definition Language (DDL)
  - Used to define the conceptual and internal schemas
  - Includes a constraint definition language (CDL) for describing conditions that database instances must satisfy
  - Includes a storage definition language (SDL) to influence the layout of the physical schema (some DBMSs)
- Data Manipulation Language (DML)
  - Used to describe operations on the instances of a database
  - Procedural DML (how), e.g., relational algebra, vs. declarative DML (what), e.g., SQL
- Note: SQL includes a DML and a DDL in one!
- Host Language
  - General-purpose programming language that lets users embed DML commands (data sublanguage) into their code
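The roles of DDL, DML, and host language can be seen together in a small sketch. It assumes Python as the host language and the SQLite engine from Python's standard `sqlite3` module; the table `emps`, its columns, and the sample rows are made up for illustration and do not come from the lecture.

```python
import sqlite3

def demo_ddl_dml():
    """Run DDL and DML statements from a host language (Python) against SQLite."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database

    # DDL: define the schema, including constraints that every instance must satisfy.
    conn.execute(
        "CREATE TABLE emps ("
        " ssn TEXT PRIMARY KEY,"
        " name TEXT NOT NULL,"
        " salary INTEGER CHECK (salary >= 0))"
    )

    # DML: modify the instance.
    conn.executemany(
        "INSERT INTO emps VALUES (?, ?, ?)",
        [("111", "Ann", 60000), ("222", "Bob", 40000)],
    )

    # Declarative DML: say WHAT is wanted, not HOW to find it.
    rows = conn.execute(
        "SELECT name FROM emps WHERE salary > ? ORDER BY name", (50000,)
    ).fetchall()

    # A row that violates the CHECK constraint is rejected by the DBMS.
    try:
        conn.execute("INSERT INTO emps VALUES ('333', 'Eve', -1)")
        rejected = False
    except sqlite3.IntegrityError:
        rejected = True

    conn.close()
    return [r[0] for r in rows], rejected

names, rejected = demo_ddl_dml()
```

The SQL strings are the data sublanguage; the surrounding Python is the host language that embeds them.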

14 Architecture of a DBMS
[Figure: DBMS architecture diagram. Labels: Schema Modifications, Queries, Query Processor, Storage Manager, Transaction Subsystem, DBMS Software, Data Definition (Metadata), Database System]

15 Component Overview
- Data storage (incl. metadata)
  - e.g., names of relations, attributes, data types, etc.
  - Often, the DBMS maintains an index
    - Helps us find data items quickly given part of their value; how?
- Storage manager
  - Handles requests from the levels above, retrieves data from the store, and returns it in the format requested by queries
- Query processor
  - Processes not only queries but also requests for modifications, etc.
  - Figures out the best way to retrieve data
- Transaction subsystem
  - Handles concurrent transactions against the database
- Three types of input at top
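One possible answer to the "how?" about indexes is sketched below. It assumes a simple hash-style index built with a Python dictionary; real DBMSs more commonly use on-disk structures such as B-trees, and the names `build_index`, `scan_lookup`, and `index_lookup` as well as the sample rows are hypothetical.

```python
from collections import defaultdict

# Hypothetical stored rows: (ssn, name, department).
EMPS = [
    ("111", "Ann", "Sales"),
    ("222", "Bob", "Toys"),
    ("333", "Cy", "Sales"),
]

def build_index(rows, col):
    """Map each value of column `col` to the positions of the rows that contain it."""
    index = defaultdict(list)
    for pos, row in enumerate(rows):
        index[row[col]].append(pos)
    return index

def scan_lookup(rows, col, value):
    """Without an index: examine every row."""
    return [row for row in rows if row[col] == value]

def index_lookup(rows, index, value):
    """With an index: jump straight to the matching row positions."""
    return [rows[pos] for pos in index.get(value, [])]

dept_index = build_index(EMPS, col=2)
sales = index_lookup(EMPS, dept_index, "Sales")
```

Both lookups return the same rows, but the indexed one avoids touching rows that cannot match. The price is the extra storage for the index and the work of keeping it up to date when rows change.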

16 Transactional Requirements
- Maintain database consistency
  - Database restrictions stored as integrity constraints
  - Burden is on the user/programmer to ensure that each transaction preserves all such constraints
- Guarantee that a transaction is executed as a whole or not at all (atomicity guarantee)
  - e.g., either deposit the whole amount or no money at all
- Guarantee that no information is lost (durability)
- For multiple transactions running concurrently, guarantee that transactions do not interfere with each other (isolation guarantee)
  - The effect of multiple, concurrent transactions on the database should be the same as that of a serial execution of the transactions; why?
- Atomicity, durability, and isolation are guaranteed by the transaction subsystem
- More on transactions later in the semester
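The atomicity guarantee can be illustrated with a small sketch, again assuming Python's standard `sqlite3` module as the DBMS. The `accounts` table, the helper names, and the balances are made up for illustration. A transfer consists of two updates; if the second one violates an integrity constraint, the first one is rolled back as well.

```python
import sqlite3

def make_bank():
    """Create an in-memory database with two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        " id TEXT PRIMARY KEY,"
        " balance INTEGER NOT NULL CHECK (balance >= 0))"  # integrity constraint
    )
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 100), ("B", 50)])
    conn.commit()
    return conn

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction: all or nothing."""
    try:
        # `with conn:` commits if the block succeeds and rolls back if it raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The debit would overdraw src, so the earlier credit to dst is undone too.
        return False

def balances(conn):
    """Return the current balances as a dict keyed by account id."""
    return dict(conn.execute("SELECT id, balance FROM accounts"))

bank = make_bank()
ok = transfer(bank, "A", "B", 30)       # succeeds: A has enough money
failed = transfer(bank, "A", "B", 500)  # fails: the credit to B is rolled back
```

After the failed transfer the balances are the same as before it started, so no money is created or lost. Isolation between concurrent transactions is a separate guarantee that this single-connection sketch does not exercise.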

