Lecture 1 DBMS & More Atif Farid Mohammad Adjunct Professor UNCC
Introductions
How many of you know? PhP or ColdFusion or JavaScript
How many of you are? Grads or Undergrads
What are your majors? CS or SIS or Something Else
DBMS & More… Textbook: Fundamentals of Database Systems, Elmarsi/Navathe, Benjamin, 2011 (Sixth Edition) Workload: (30 %) 3 In Class Quiz (30%) Midterm exam (40% ) One project, parts 1 & 2
Any Questions ?
Objectives Introduction Example of database systems Database characteristics People associated with database Advantage of Using a DBMS Database implications When Not to use a DBMS 8
Introduction Databases: Examples of Databases include: Used to maintain information and to present data to users Examples of Databases include: Reservations systems (Hotel, Car, Airline) Transactions processing systems (Online Banking systems, local library) Investigations systems(Scientific Database systems) Multimedia database systems Geographic information systems (GIS) 9
Core Database Terminologies: 1 Any information (most likely in electronic form) worth preserving E.g., names, addresses, grades, etc. Database? A collection of related data describing the activities of one or more organizations Organized (or structured) for access and modification Preserved over a long period E.g. University Database 10
Core Terminologies: 3 Query? an operation that extracts specified data from the database E.g. get the list of all courses and grades taken by “Smith” Relation? an organization of data into a two-dimensional table, rows (tuples) represent basic entities or facts of some sort columns (attributes or fields) represent properties of those entities. Schema? a description of the structure of the data in a database also known as metadata 11
Database Def-1 A database is a shared collection of logically related data that is stored to meet the requirements of different users of an organization
A database is a self-describing collection of integrated records Database Def-2 A database is a self-describing collection of integrated records
Database Def-3 A database models a particular real world system in the computer in the form of data
The concept of a shared organizational database Management Control Planning Marketing Product Development Sales Corporate Database Accounting Accounts Payable Receivable Manufacturing Production Scheduling
A bit of History Computer initially used for computational/ engineering purposes Commercial applications introduced File Processing System
File Processing System A collection of programs that perform services for the end-users such as production of reports
File Processing Systems Library Applications Data Files Examination Applications Data Files Registration Applications Data Files Program and Data Interdependence
File Processing Systems Library Exam Registration Reg_Number Name Father Name Address Books Issued Class Phone Fine Semester Grade Duplication of Data Incompatible Formats Vulnerable to Inconsistency
Database Approach
Advantages of Database Approach Registration Examination Library Applications Database Management System University Students Database - Data Sharing - Data Independence - Controlled Redundancy - Better Data Integrity
Data & Information Company: Super Soft Dept: Sales Emp Name Age Salary Matt Damon 23 55 Bruce Willis 24 Katie Holmes 20 40 Robert Langdon 19
Schema Database Applications Database Management System (DBMS)
Other Advantages Data consistency Better data security Faster development of new applications Four minutes 24
They also provide Economy of scale Better concurrency control Better backup and recovery procedures 5 mins, 9 tot 25
Disadvantages Higher costs Conversion cost More difficult recovery Higher Costs include HW, SW, trained staff Conversion Cost includes from Manual to DB or from File system to DB, both involve costs 4 mins, 13 tot 26
Typical Components Software Application Programs DBMS Data Database End users Software interact Application Programmers Application Programs develop “What” to get Database Administrators DBMS maintain Database “How” to get Data Database Designers design Users
Levels of Data Real World Data Entity, Attribute e.g. A student, A class name Record type, Data item type e.g. Student record type Metadata Explain from this and later text followed by this diagram Data Occurrences Student record, Data item occurrence e.g. ‘s001’, ’Amir’, ‘CS101’ 28
Employee name, age, qual, sal Emp Name text Age number Sal number John Durso 23 55 Lisa Smith 24 Cindy Bates 20 40 Braden Sams 19
Levels of Data Real-world data Metadata Data Occurrence 4/21 30
Database Users Application Programmers End Users Naïve Sophisticated
Roles in the Database Environment Data Administrator (DA) Database Administrator (DBA) Database Designers (Logical and Physical) Application Programmers End Users (naive and sophisticated)
Functions of DBA Schema definition Granting data access Routine Maintenance Backups Monitoring disk space Monitoring jobs running
Database Properties A database must Represent some aspects of real world Collection of data must be logically coherent and meaningful Database is designed, build, and populated with data for specific purpose 34
Database Construction Database can be constructed Manually E.g., a library card catalog Computerized system A Specific set of applications Database Mgt systems 35
Database Mgt system (DBMS) Database management system (DBMS)? A collection of programs that enables users To define (specifying data type, etc) To construct (storing) To manipulate (reading, writing) To share (simultaneous access) To protect (security & privacy protection) To modify (changing requirements) Database system= DBMS + Database 36
37
Example: University Database System A typical university database system maintaining information regarding Students Courses Grades 38
Example: A University Database 39
Database Engineering Design of any DB application starts Requirements definition and analysis Conceptual Design Performed using Entity relationship Diagram Logical Design Performed using Orcale / Mysquel, etc Physical Design (storing/accessing/indexing) 40
Main Characteristics of DB Self-Describing nature of a database system ( catalog =Data + metadata) Insulation between programs and data ( program-data-independence) Support of multiple views of the data (virtual data) Sharing of data and multi-user transaction processing (concurrency control) Transaction and atomicity ACID 41
People involved in Database Involved two types of people people on the Scene people behind the Scene 42
People on the Scene Database administrators (DBA) Database designers Authorization, coordination, supervision of DB Database designers Defining, building, maintaining, etc End Users Casual, naive, sophisticated Software Engineer system analysts and application programmers 43
People behind the Scene DBMS system designers and implementers Tool developers Operators and maintenance personnel 44
File System vs. Database Mgt Systems Suppose an organization needs to manage a large collection of data, say, 500 GB (i.e., 500X1024 MB) Data is supposed to be accessed concurrently by employees Changes made to the data must be applied consistently Access to the data must be restricted 45
File systems vs. A DBMS (cont’) Using file systems, the data is stored in OS files Not enough MM Difficulty to directly access data (with 32-bit , we can access 4GB) Need to write special program to answer each question Duplicated efforts inconsistency Concurrent accesses Security policies … 46
Primary Advantages of Using a DBMS: 1 Controlling Redundancy Duplicate space and efforts inconsistency Restricting Unauthorized Access Security and authorization subsystem Providing persistent storage for program objects and data structure Impedance mismatch problem (incompatibility between PL and DBMS) Deriving new information from existing ones (view) 47
Primary Advantages of Using a DBMS: 2 Providing Multiple User Interfaces Representing Complex Relationships among Data Enforcing Integrity Constraints Providing Backup and crash Recovery Scheduling concurrent accesses to the data Reducing application development time 48
Implication of the Database Approach Standards can be enforced (data and display format) Application development time can be reduced Flexibility and maintainability Availability of up-to date information Economics of Scale (i.e., reducing overall costs of operations, sources and mgt) 49
When Not to Use A DBMS When there is unnecessary overhead resulted from: High initial up-front cost Generality that a DBMS provides for defining and processing data Overhead for providing security concurrency recovery integrity 50
When to Use File System A File system is desirable under these conditions: The database and application are simple, well-defined, and unchangeable Single-user access to the data is required performance 51
So, what is a Database? Collection of data central to some enterprise Essential to operation of enterprise Contains the only record of enterprise activity An asset in its own right Historical data can guide enterprise strategy Of interest to other enterprises State of database mirrors state of enterprise Database is persistent
What is a Database Management System? A Database Management System (DBMS) is a program that manages a database: Supports a high-level access language (e.g. SQL). Application describes database accesses using that language. DBMS interprets statements of language to perform requested database access.
What is a Transaction? When an event in the real world changes the state of the enterprise, a transaction is executed to cause the corresponding change in the database state With an on-line database, the event causes the transaction to be executed in real time A transaction is an application program with special properties - discussed later - to guarantee it maintains database correctness
What is a Transaction Processing System? Transaction execution is controlled by a TP monitor Creates the abstraction of a transaction, analogous to the way an operating system creates the abstraction of a process TP monitor and DBMS together guarantee the special properties of transactions A Transaction Processing System consists of TP monitor, databases, and transactions
Transaction Processing System DBMS database transactions DBMS database Transaction Processing Monitor
System Requirements High Availability: on-line => must be operational while enterprise is functioning High Reliability: correctly tracks state, does not lose data, controlled concurrency High Throughput: many users => many transactions/sec Low Response Time: on-line => users are waiting
System Requirements (con’t) Long Lifetime: complex systems are not easily replaced Must be designed so they can be easily extended as the needs of the enterprise change Security: sensitive information must be carefully protected since system is accessible to many users Authentication, authorization, encryption
Roles in Design, Implementation, and Maintenance of a TPS System Analyst - specifies system using input from customer; provides complete description of functionality from customer’s and user’s point of view Data Scientists: use all available data sources (internal and external) to analyze and gain insights to help decision makers. Database Designer - specifies structure of data that will be stored in database.
Roles in Design, Implementation and Maintenance of a TPS (con’t) Application Programmer - implements application programs (transactions) that access data and support enterprise rules Database Administrator - maintains database once system is operational: space allocation, performance optimization, database security System Administrator - maintains transaction processing system: monitors interconnection of HW and SW modules, deals with failures and congestion
OLTP vs. OLAP On-line Transaction Processing (OLTP) Day-to-day handling of transactions that result from enterprise operation Maintains correspondence between database state and enterprise state On-line Analytic Processing (OLAP) Analysis of information in a database for the purpose of making management decisions
OLAP Analyzes historical data (terabytes) using complex queries Due to volume of data and complexity of queries, OLAP often uses a data warehouse Data Warehouse - (offline) repository of historical data generated from OLTP or other sources Data Mining - use of warehouse data to discover relationships that might influence enterprise strategy
Examples - Supermarket OLTP Event is 3 cans of soup and 1 box of crackers bought; update database to reflect that event OLAP Last winter in all stores in northeast, how many customers bought soup and crackers together? Data Mining Are there any interesting combinations of foods that customers frequently bought together?
Big Data Era Every 2 days we create as much information as we did from the dawn of civilization up to 2003. (2010, Eric Shmidt, Google) Science: Astronomy, Physics, Bioinformatics, Neuroinfomatics, earth science, etc. Business: Automobile, Healthcare, Financial, infotaiment The automotive industry is projected to be the 2nd largest generator of data by 2015
Big Data Opportunities to Auto Google self-driving car (generates 1GB/s). Recommendation/alert to customers. Manufactures can better understand customers and market trends. Better driving behaviors, better cars, less accidents, and the bottom line is happier customers.
Big Data Opportunities to Auto Google self-driving car (generates 1GB/s). Recommendation/alert to customers. Manufactures can better understand customers and market trends. Better driving behaviors, better cars, less accidents, and the bottom line is happier customers.
Why is Database Important in the Future? “In the future, … any given discipline advances is likely to depend on… database, workflow management, visualization, and cloud computing technologies. “ G. Bell, T. Hey, A. Szalay, “Beyond the Data Deluge,” Science, Vol. 323, no. 5919, pp. 1297-1298, 2009.
Turing Awardees in DB Charles Bachman (1973) Edgar F. Codd (1981) Jim Gray (1998)
Charles Bachman Developer of IDS: the first database system
Edgar. F. Codd Inventor of the Relational Model
Jim Gray Founder of Transaction Processing
Questions ?