DCT 2033 DATABASE MANAGEMENT SYSTEM Chapter 1 File Systems and Databases
Objectives The difference between data and information What a database is, the different types of databases, and why they are valuable assets for decision making Why database design is important How modern databases evolved from files and file systems About flaws in file system data management How a database system differs from a file system and how a DBMS functions within the database system Why data models are important About the basic data-modeling building blocks What business rules are and how they affect database design How the major data models evolved, and their advantages and disadvantages How data models can be classified by level of abstraction
Examples of Database Applications Purchases from the supermarket Purchases using your credit card Booking a holiday at the travel agents Using the local library Taking out insurance Renting a video Using the Internet Studying at university
Data vs. Information Data: Raw facts; building blocks of information Unprocessed information Information: Data processed to reveal meaning Accurate, relevant, and timely information is key to good decision making Good decision making is the key to survival in a global environment
Transforming Raw Data into Information
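As a minimal sketch of the idea on this slide, the Python snippet below turns raw facts into information by totalling sales per region. The records, field names, and the function summarize_by_region are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical raw data: individual sales records (illustrative values only).
raw_sales = [
    {"region": "North", "amount": 120.0},
    {"region": "South", "amount": 80.0},
    {"region": "North", "amount": 200.0},
    {"region": "South", "amount": 50.0},
]


def summarize_by_region(records):
    """Process raw facts into information: total sales per region."""
    totals = defaultdict(float)
    for record in records:
        totals[record["region"]] += record["amount"]
    return dict(totals)


sales_by_region = summarize_by_region(raw_sales)
print(sales_by_region)
```

The individual records say little on their own; the per-region totals are the kind of processed result a decision maker can act on.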
Introducing the Database and the DBMS Database—shared, integrated computer structure that stores: End user data (raw facts) Metadata (data about data)
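The distinction between end-user data and metadata can be seen in a small sketch using Python's built-in sqlite3 module. SQLite is used here only as a convenient example DBMS, and the table customer and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    "CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT NOT NULL)"
)
conn.execute("INSERT INTO customer (cus_id, cus_name) VALUES (1, 'Aminah')")

# End-user data: the raw facts stored in the table.
end_user_data = conn.execute("SELECT cus_id, cus_name FROM customer").fetchall()

# Metadata: SQLite keeps a catalog (sqlite_master) describing the tables.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
column_info = [
    (row[1], row[2]) for row in conn.execute("PRAGMA table_info(customer)")
]

print(end_user_data)
print(table_names)
print(column_info)
```

Other DBMS products expose their metadata differently, so the catalog query shown is specific to SQLite.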
Introducing the Database and the DBMS (continued) DBMS (database management system): Collection of programs that manages database structure and controls access to data Possible to share data among multiple applications or users Makes data management more efficient and effective
Importance of DBMS It helps make data management more efficient and effective. Its query language allows quick answers to ad hoc queries. It provides end users better access to more and better-managed data. It promotes an integrated view of the organization’s operations, the “big picture.” It reduces the probability of inconsistent data.
The DBMS Manages the Interaction Between the End User and the Database
Database Systems 4 Types of Database Systems: 1) Number of Users Single-user (desktop database) Multiuser (workgroup database, enterprise database) 2) Scope Desktop Workgroup Enterprise 3) Location Centralized Distributed 4) Use Transactional (Production) Decision support Data warehouse
Types of Databases Single-user: Supports only one user at a time Desktop: Single-user database running on a personal computer Multi-user: Supports multiple users at the same time
Types of Databases (continued) Workgroup: Multi-user database that supports a small group of users or a single department Enterprise: Multi-user database that supports a large group of users or an entire organization
Types of Databases (continued) Can be classified by location: Centralized: Supports data located at a single site Distributed: Supports data distributed across several sites
Types of Databases (continued) Can be classified by use: Transactional (or production): Supports a company’s day-to-day operations Data warehouse: Stores data used to generate information required to make tactical or strategic decisions Often used to store historical data Structure is quite different
Why Database Design is Important Defines the database’s expected use Different approach needed for different types of databases Avoid redundant data A well-designed database facilitates data management and becomes a valuable information generator. A poorly designed database generates errors, which lead to bad decisions, and bad decisions can lead to the failure of an organization.
Historical Roots: Files and File Systems Manual file systems: Collection of file folders kept in a file cabinet Organization within folders based on data’s expected use (ideally logically related) System adequate for small amounts of data with few reporting requirements Finding and using data in growing collections of file folders became time-consuming and cumbersome File-based (computerized) systems: Collection of application programs that perform services for the end users (e.g. reports) Each program defines and manages its own data
Limitations of File-Based Approach Separation and isolation of data Each program maintains its own set of data. Users of one program may be unaware of potentially useful data held by other programs. Duplication of data Same data is held by different programs. Wasted space and potentially different values and/or different formats for the same item.
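The duplication problem described on this slide can be sketched in a few lines of Python. The two dictionaries stand in for files owned by two separate programs; the names sales_file, billing_file, update_phone, and find_conflicts are hypothetical.

```python
# Hypothetical: two programs each keep their own copy of the same customer.
sales_file = {"C001": {"name": "Lim", "phone": "555-0100"}}
billing_file = {"C001": {"name": "Lim", "phone": "555-0100"}}


def update_phone(file_records, customer_id, new_phone):
    """Each program updates only the file it owns."""
    file_records[customer_id]["phone"] = new_phone


def find_conflicts(file_a, file_b, field):
    """List the customers whose value for `field` differs between two files."""
    return [
        cid
        for cid in file_a
        if cid in file_b and file_a[cid][field] != file_b[cid][field]
    ]


# The sales program records a new phone number; billing never hears about it.
update_phone(sales_file, "C001", "555-0199")
conflicts = find_conflicts(sales_file, billing_file, "phone")
print(conflicts)
```

Because nothing ties the two copies together, the same item now has two different values, which is exactly the inconsistency the slide warns about.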
Limitations of File-Based Approach Requires extensive programming, a time-consuming, high-level activity As number of files expands, system administration becomes difficult Making changes in existing file structure is difficult File structure changes require modifications in all programs that use data in that file
Limitations of File-Based Approach Modifications are likely to produce errors, requiring additional time to “debug” the program Security features hard to program and therefore often omitted Data dependence Incompatibility of files
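The effect of a file structure change on a program that depends on that structure can be illustrated with a short Python sketch. The record layouts and the function phone_by_position are hypothetical.

```python
# Hypothetical record layouts for the same customer file.
old_record = "C001,Lim,555-0100"               # id, name, phone
new_record = "C001,Lim,Kuala Lumpur,555-0100"  # a city field was inserted


def phone_by_position(record):
    """File-system style access: the field position is hard-coded in the program."""
    return record.split(",")[2]


old_phone = phone_by_position(old_record)
new_phone = phone_by_position(new_record)  # now returns the city, not the phone

print(old_phone)
print(new_phone)
```

Every program that hard-codes the old layout has to be found, changed, and retested, which is why such modifications are error-prone.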
Files and File Systems
Database Systems Problems inherent in file systems make using a database system desirable File system Many separate and unrelated files Database Logically related data stored in a single logical data repository
Basic File Terminology
The Database System Environment Database system is composed of 5 main parts: Hardware Computer Peripherals Software Operating system software DBMS software Application programs and utility software People Systems administrators Database administrators (DBAs) Database designers Systems analysts and programmers End users Procedures Instructions and rules that govern the design and use of the database system Data Collection of facts stored in the database
Database Systems The complexity of database systems depends on various organizational factors: Organization’s size Organization’s function Organization’s corporate culture Organizational activities and environment Database solutions must be cost effective AND strategically effective.
DBMS Functions The DBMS performs functions that guarantee the integrity and consistency of data: 1. Data dictionary management defines data elements and their relationships; any changes made in a database structure are automatically recorded in the data dictionary 2. Data storage management stores data and related data entry forms, report definitions, data validation rules, procedural code, etc.
DBMS Functions (continued) 3. Data transformation and presentation translates logical requests into commands to physically locate and retrieve the requested data 4. Security management enforces user security and data privacy within the database
DBMS Functions (continued) 5. Multiuser access control uses sophisticated algorithms to ensure multiple users can access the database concurrently without compromising the integrity of the database 6. Backup and recovery management provides backup and data recovery procedures 7. Data integrity management promotes and enforces integrity rules
DBMS Functions (continued) 8. Database access languages and application programming interfaces provide data access through a query language; a DBMS query language contains two components: a data definition language (DDL) and a data manipulation language (DML) 9. Database communication interfaces allow the database to accept end-user requests via multiple, different network environments
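Two of the functions listed above, database access languages and data integrity management, can be sketched with Python's built-in sqlite3 module. SQLite and SQL are used here only as an example; the table student, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure, including integrity rules the DBMS will enforce.
conn.execute(
    """
    CREATE TABLE student (
        stu_id   INTEGER PRIMARY KEY,
        stu_name TEXT NOT NULL,
        stu_gpa  REAL CHECK (stu_gpa BETWEEN 0.0 AND 4.0)
    )
    """
)

# DML: insert, update, delete, and query the data.
conn.executemany(
    "INSERT INTO student (stu_id, stu_name, stu_gpa) VALUES (?, ?, ?)",
    [(1, "Ali", 3.2), (2, "Mei", 3.8), (3, "Raj", 2.9)],
)
conn.execute("UPDATE student SET stu_gpa = 3.0 WHERE stu_id = 3")
conn.execute("DELETE FROM student WHERE stu_id = 1")

# An ad hoc query answered directly by the query language.
honour_roll = conn.execute(
    "SELECT stu_name FROM student WHERE stu_gpa >= 3.0 ORDER BY stu_name"
).fetchall()

# Data integrity management: a row that breaks the CHECK rule is rejected.
try:
    conn.execute("INSERT INTO student VALUES (4, 'Zed', 5.5)")
    invalid_row_accepted = True
except sqlite3.IntegrityError:
    invalid_row_accepted = False

print(honour_roll)
print(invalid_row_accepted)
```

The rule is declared once in the table definition, so every application that uses the table is held to it without extra programming.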
Advantages of DBMS Control of redundancy Data consistency Sharing of data Improved data integrity Improved security Enforcement of standards More information from the same amount of data Economy of scale
Advantages of DBMS Balance of conflicting requirements Improved data accessibility and responsiveness Increased productivity Improved backup and recovery services
Disadvantages of DBMS Complexity, size Cost of DBMS Additional hardware costs Cost of conversion Performance Higher impact of a failure
Summary Data are raw facts. Information is the result of processing data to reveal its meaning. To implement and manage a database, use a DBMS. Database design defines the database structure. A well-designed database facilitates data management and generates accurate and valuable information. A poorly designed database can lead to bad decision making, and bad decision making can lead to the failure of an organization.
Summary (continued) Databases were preceded by file systems. Limitations of file system data management: requires extensive programming; system administration is complex and difficult; making changes to existing structures is difficult; security features are likely to be inadequate; independent files tend to contain redundant data. DBMSs were developed to address file systems’ inherent weaknesses.
Data Models Definition: A data model is a relatively simple representation, usually graphical, of real-world data structures. Function: To help understand the complexities of the real-world environment. Two components of data models: 1) Structure - the way the system structures data, or the way users perceive the data to be structured. 2) Operation - the facilities given to the user of the DBMS to manipulate data within the database.
Data Model Basic Building Blocks Basic building blocks of a data model: entity, attribute, relationship An entity is anything about which data are to be collected and stored (a person, place, thing, or event) An attribute is a characteristic or property of an entity A relationship describes an association among (two or more) entities One-to-many (1:M) relationship Many-to-many (M:N or M:M) relationship One-to-one (1:1) relationship
Data Model Basic Building Blocks Three Types of Relationships: One-to-many relationships (1:M) A painter paints many different paintings, but each one of them is painted by only that painter. PAINTER (1) paints PAINTING (M) Many-to-many relationships (M:N) An employee might learn many job skills, and each job skill might be learned by many employees. EMPLOYEE (M) learns SKILL (N) One-to-one relationships (1:1) Each store is managed by a single employee and each store manager (employee) only manages a single store. EMPLOYEE (1) manages STORE (1)
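One common way to realize the three relationship types in relational tables is sketched below with Python's sqlite3 module. This is an illustration under assumptions, not the only possible design: the table and column names are hypothetical, the 1:M link uses a foreign key, the M:N link uses a bridge table (employee_skill), and the 1:1 link uses a UNIQUE foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript(
    """
    -- 1:M  PAINTER paints PAINTING: each painting points to one painter.
    CREATE TABLE painter  (painter_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE painting (painting_id INTEGER PRIMARY KEY, title TEXT,
                           painter_id INTEGER NOT NULL REFERENCES painter(painter_id));

    -- M:N  EMPLOYEE learns SKILL: a bridge table holds one row per pairing.
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE skill    (skill_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee_skill (
        emp_id   INTEGER NOT NULL REFERENCES employee(emp_id),
        skill_id INTEGER NOT NULL REFERENCES skill(skill_id),
        PRIMARY KEY (emp_id, skill_id));

    -- 1:1  EMPLOYEE manages STORE: UNIQUE stops one employee managing two stores.
    CREATE TABLE store (store_id INTEGER PRIMARY KEY,
                        manager_id INTEGER NOT NULL UNIQUE REFERENCES employee(emp_id));

    INSERT INTO painter  VALUES (1, 'Painter One');
    INSERT INTO painting VALUES (1, 'Painting A', 1), (2, 'Painting B', 1);
    INSERT INTO employee VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO skill    VALUES (1, 'Welding'), (2, 'Wiring');
    INSERT INTO employee_skill VALUES (1, 1), (1, 2), (2, 1);
    INSERT INTO store    VALUES (1, 1);
    """
)


def count(sql):
    """Run a COUNT(*) query and return the single number it produces."""
    return conn.execute(sql).fetchone()[0]


paintings_by_painter_1 = count("SELECT COUNT(*) FROM painting WHERE painter_id = 1")
skills_of_ana = count("SELECT COUNT(*) FROM employee_skill WHERE emp_id = 1")
employees_with_welding = count("SELECT COUNT(*) FROM employee_skill WHERE skill_id = 1")

# A second store with the same manager would break the 1:1 rule.
try:
    conn.execute("INSERT INTO store VALUES (2, 1)")
    second_store_allowed = True
except sqlite3.IntegrityError:
    second_store_allowed = False

print(paintings_by_painter_1, skills_of_ana, employees_with_welding, second_store_allowed)
```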
The Evolution of Data Models Four categories of implementation data models: Hierarchical Network Relational Object-oriented
Hierarchical Model Basic Structure Collection of records logically organized to conform to the upside-down tree (hierarchical) structure. The top layer is perceived as the parent of the segment directly beneath it. The segments below other segments are the children of the segment above them. A tree structure is represented as a hierarchical path on the computer’s storage media.
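The parent and child structure described on this slide can be sketched as a small tree in Python. The class Segment, the function hierarchical_path, and the manufacturing-style segment names are hypothetical; the path is assumed here to visit a parent first and then its children from left to right (a preorder traversal).

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One segment (record type) in the upside-down tree."""
    name: str
    children: list = field(default_factory=list)

    def add_child(self, child):
        self.children.append(child)
        return child


def hierarchical_path(segment):
    """Visit the parent first, then each child subtree from left to right."""
    path = [segment.name]
    for child in segment.children:
        path.extend(hierarchical_path(child))
    return path


# Hypothetical manufacturing example: an assembly made of components and parts.
root = Segment("Final assembly")
component_a = root.add_child(Segment("Component A"))
component_b = root.add_child(Segment("Component B"))
component_a.add_child(Segment("Part A1"))
component_a.add_child(Segment("Part A2"))
component_b.add_child(Segment("Part B1"))

path = hierarchical_path(root)
print(path)
```

Each child has exactly one parent in this structure, so reaching a segment always means starting at the root and walking down.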
The Hierarchical Model Characteristics Basic concepts form the basis for subsequent database development Limitations lead to a different way of looking at database design Basic concepts show up in current data models Best understood by examining manufacturing process
The Hierarchical Model Advantages Conceptual simplicity Database security Data independence Database integrity Efficiency dealing with a large database
The Hierarchical Model Disadvantages Complex implementation Difficult to manage Lacks structural independence Applications programming and use complexity Implementation limitations Lack of standards
Network Database Models Basic Structure Set -- A relationship is called a set. Each set is composed of at least two record types: an owner (parent) record and a member (child) record. A set represents a 1:M relationship between the owner and the member.
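The owner and member idea can be sketched in Python as follows. The class NetworkSet, the set names, and the sample records are hypothetical. The sketch also shows a member record taking part in two sets with different owners, which the network model allows and the hierarchical model does not.

```python
class NetworkSet:
    """A set: one owner record linked to many member records (1:M)."""

    def __init__(self, name, owner):
        self.name = name
        self.owner = owner
        self.members = []

    def connect(self, member):
        """Attach a member record to this set."""
        self.members.append(member)


# Hypothetical records, represented here as plain dictionaries.
customer = {"type": "CUSTOMER", "name": "Ramas"}
sales_rep = {"type": "SALESREP", "name": "Hahn"}
invoice_1 = {"type": "INVOICE", "number": 1001}
invoice_2 = {"type": "INVOICE", "number": 1002}

# One set owned by the customer, another owned by the sales representative.
sales_set = NetworkSet("SALES", owner=customer)
commission_set = NetworkSet("COMMISSION", owner=sales_rep)

for invoice in (invoice_1, invoice_2):
    sales_set.connect(invoice)
commission_set.connect(invoice_1)  # the same invoice is a member of a second set

member_numbers = [member["number"] for member in sales_set.members]
print(member_numbers)
```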
The Network Data Model Advantages Conceptual simplicity Handles more relationship types Data access flexibility Promotes database integrity Data independence Conformance to standards Disadvantages System complexity Lack of structural independence
Relational Database Models Basic Structure RDBMS allows operations in a human logical environment. The relational database is perceived as a collection of tables. Each table consists of a series of row/column intersections. Tables (or relations) are related to each other by sharing a common entity characteristic. The relationship type is often shown in a relational schema. A table yields complete data and structural independence.
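How two tables are related by sharing a common characteristic can be sketched with Python's sqlite3 module. The tables agent and customer, their columns, and the sample rows are hypothetical; the shared column agent_code is what links them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE agent    (agent_code INTEGER PRIMARY KEY, agent_name TEXT);
    CREATE TABLE customer (cus_code INTEGER PRIMARY KEY, cus_name TEXT,
                           agent_code INTEGER);

    INSERT INTO agent    VALUES (501, 'Alby'), (502, 'Hahn');
    INSERT INTO customer VALUES (10010, 'Ramas', 502),
                                (10011, 'Dunne', 501),
                                (10012, 'Smith', 502);
    """
)

# The two tables are connected only through the shared agent_code values.
customers_with_agents = conn.execute(
    """
    SELECT c.cus_name, a.agent_name
    FROM customer AS c
    JOIN agent AS a ON a.agent_code = c.agent_code
    ORDER BY c.cus_code
    """
).fetchall()

print(customers_with_agents)
```

The query names what is wanted, not where the rows are stored, which is the independence from physical structure the slide refers to.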
Relational Database Models Advantages Structural independence Improved conceptual simplicity Easier database design, implementation, management & use Ad hoc query capability (SQL) Powerful database management system Disadvantages Substantial hardware and system software overhead Possibility of poor design and implementation Potential “islands of information” problems
Entity-Relationship Data Models It is one of the most widely accepted graphical data modeling tools. It graphically represents data as entities and their relationships in a database structure. It complements the relational data model concepts.
Basic Structure of ER Model E-R models are normally represented in an entity relationship diagram (ERD). An entity is represented by a rectangle. Each entity is described by a set of attributes. An attribute describes a particular characteristic of the entity. A relationship is represented by a diamond connected to the related entities.
Advantages of the ER model Exceptional conceptual simplicity Visual representation Effective communication tool Integrated with the relational database model Disadvantages of the ER model Limited constraint representation Limited relationship representation No data manipulation language Loss of information content
Data Redundancy Data redundancy results in data inconsistency Different and conflicting versions of the same data appear in different places Errors more likely to occur when complex entries are made in several different files and/or recur frequently in one or more files Data anomalies develop when required changes in redundant data are not made successfully
Data Redundancy Types of data anomalies: Update anomalies Occur when changes must be made to existing records Insertion anomalies Occur when entering new records Deletion anomalies Occur when deleting records
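A deletion anomaly can be sketched in Python with a flat file that repeats agent details on every customer row. The rows, field names, and the functions known_agents and delete_customer are hypothetical.

```python
# Hypothetical flat file: each customer row repeats the agent's phone number.
customer_rows = [
    {"customer": "Ramas", "agent": "Leah", "agent_phone": "555-0101"},
    {"customer": "Dunne", "agent": "Leah", "agent_phone": "555-0101"},
    {"customer": "Smith", "agent": "Alex", "agent_phone": "555-0102"},
]


def known_agents(rows):
    """Agent data exists only as a side effect of customer rows."""
    return {row["agent"]: row["agent_phone"] for row in rows}


def delete_customer(rows, customer):
    """Return the file contents with one customer's row removed."""
    return [row for row in rows if row["customer"] != customer]


agents_before = known_agents(customer_rows)
rows_after = delete_customer(customer_rows, "Smith")
agents_after = known_agents(rows_after)

print(agents_before)
print(agents_after)
```

Deleting the only customer served by Alex also removes every trace of Alex, although nobody asked for the agent to be deleted. The same layout produces update anomalies (Leah's phone must be changed on two rows) and insertion anomalies (a new agent cannot be recorded until a customer is assigned).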
Advantages of Database Processing Economy of scale Getting more information from same amount of data Sharing of data Balancing conflicting requirements Enforcement of standards Controlled redundancy Consistency Integrity Security Flexibility and responsiveness Increased programmer productivity Improved program maintenance Data independence
Disadvantages of Database Processing Size Complexity Cost of DBMSs Additional hardware requirements Higher impact of a failure Recovery more difficult
Recap Data, information, metadata, database, DBMS DBMS functions Entities, attributes, relationships Files, records, fields Data models (relational model, network model, hierarchical model, object-oriented model) Advantages and disadvantages of database processing