Chapter 9: Database Systems

Slides:



Advertisements
Similar presentations
Chapter 10: Designing Databases
Advertisements

C6 Databases.
Management Information Systems, Sixth Edition
Chapter Information Systems Database Management.
Chapter 9 Database Systems. 2 Chapter 9: Database Systems 9.1 Database Fundamentals 9.2 The Relational Model 9.3 Object-Oriented Databases 9.4 Maintaining.
Database Management: Getting Data Together Chapter 14.
Chapter 9: Database Systems
Chapter 4 Relational Databases Copyright © 2012 Pearson Education, Inc. publishing as Prentice Hall 4-1.
Database Features Lecture 2. Desirable features in an information system Integrity Referential integrity Data independence Controlled redundancy Security.
Databases and Database Management Systems
Mgt 20600: IT Management & Applications Databases Tuesday April 4, 2006.
Chapter 4 Relational Databases Copyright © 2012 Pearson Education 4-1.
Introduction. 
Database Technical Session By: Prof. Adarsh Patel.
311: Management Information Systems Database Systems Chapter 3.
I Information Systems Technology Ross Malaga 4 "Part I Understanding Information Systems Technology" Copyright © 2005 Prentice Hall, Inc. 4-1 DATABASE.
Professor Michael J. Losacco CIS 1110 – Using Computers Database Management Chapter 9.
C6 Databases. 2 Traditional file environment Data Redundancy and Inconsistency: –Data redundancy: The presence of duplicate data in multiple data files.
Databases Shortfalls of file management systems Structure of a database Database administration Database Management system Hierarchical Databases Network.
6.1 © 2010 by Prentice Hall 6 Chapter Foundations of Business Intelligence: Databases and Information Management.
Prepared By Prepared By : VINAY ALEXANDER ( विनय अलेक्सजेंड़र ) PGT(CS),KV JHAGRAKHAND.
Chapter 9 Database Systems Introduction to CS 1 st Semester, 2014 Sanghyun Park.
ITGS Databases.
Data resource management
Chapter 13 Designing Databases Systems Analysis and Design Kendall & Kendall Sixth Edition.
Chapter 9 Database Systems © 2007 Pearson Addison-Wesley. All rights reserved.
Chapter 9 Database Systems. © 2005 Pearson Addison-Wesley. All rights reserved 9-2 Chapter 9: Database Systems 9.1 Database Fundamentals 9.2 The Relational.
Advanced Accounting Information Systems Day 10 answers Organizing and Manipulating Data September 16, 2009.
Introduction to Database Chapter #9 Sec 9.1. What is a Database? A flat file is considered to be one-dimensional storage system because it presents its.
1st December 2015Birkbeck College, U. London1 Introduction to Computer Systems Lecturer: Steve Maybank Department of Computer Science and Information Systems.
Copyright © 2012 Pearson Education, Inc. Communication Skills.
Data Resource Management Lecture 8. Traditional File Processing Data are organized, stored, and processed in independent files of data records In traditional.
Data Resource Management Data Concepts Database Management Types of Databases Chapter 5 McGraw-Hill/Irwin Copyright © 2007 by The McGraw-Hill Companies,
© 2017 by McGraw-Hill Education. This proprietary material solely for authorized instructor use. Not authorized for sale or distribution in any manner.
Management Information Systems by Prof. Park Kyung-Hye Chapter 7 (8th Week) Databases and Data Warehouses 07.
Introduction To DBMS.
Databases Chapter 16.
DESIGNING DATABASE APPLICATIONS
Chapter 9 Database Systems
Fundamentals & Ethics of Information Systems IS 201
Database Management System
Fundamentals of Information Systems
Information Systems Database Management
Lecture 2 The Relational Model
Chapter 4 Relational Databases
Tools for Memory: Database Management Systems
Databases and Information Management
Chapter 3 The Relational Database Model
Introduction to Database Management System
Chapter 9: Database Systems
Chapter 2 Database Environment Pearson Education © 2009.
Basic Concepts in Data Management
MANAGING DATA RESOURCES
Week 11: Database Management System
Database Fundamentals
Database management concepts
Database Fundamentals(continuing)
MANAGING DATA RESOURCES
Databases and Information Management
Database management concepts
Accounting Information Systems 9th Edition
Chapter 9: Database Systems
Chapter 17 Designing Databases
DATABASE TECHNOLOGIES
Chapter 2 Database Environment Pearson Education © 2009.
INTRODUCTION A Database system is basically a computer based record keeping system. The collection of data, usually referred to as the database, contains.
Database management systems
Presentation transcript:

Chapter 9: Database Systems 9.1 Database Fundamentals 9.2 The Relational Model 9.3 Object-Oriented Databases 9.4 Maintaining Database Integrity 9.5 Traditional File Structures 9.6 Data Mining 9.7 Social Impact of Database Technology

Definition of a Database A database (DB) is collection of data with internal links that allow users to access the data from a variety of perspectives. Contrast: a flat file, which can only be accessed in one way.

Database Database is multidimensional, since internal links between its entries make the information accessible from a variety of perspectives Flat File is the traditional one-dimensional file storage system that presents its information from a single point of view

Example of a Flat File Panel 2.4.06 £3400 205 North Street Honeywell Case 5.5.06 £2000 15 Retail Road Univac 43 Plate 1.7.06 £1000 1 The Lane Sperry 37 Product Due date Price Customer address Customer name Order no.

Relational Database Order file Customer file Address Name Customer no. Product Due date Price Customer no. Order no. 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Order file Customer file

Relational Database linkage Order file Customer file Address Name Customer no. Product Due date Price Customer no. Order no. 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Order file Customer file

Figure 9.1 A file versus a database organization

Databases emerged as a means of integrating the information stored and maintained by a particular organisation Different departments/users can use the same data. Benefits: Reduce (or remove) duplication of data being stored in several places. Ensure all departments use the same versions of the data Easier to make changes to data -- change in one place only, not in several files

Advantages of a Single Central Database Easier to control access to information. Easier to update information. Easier (overall) to maintain and upgrade as technology improves.

Potential problems of data integration Security and Privacy issues Sensitive data (personal data, salary records ...) may be accessed by unauthorised personnel e.g. Need to restrict payroll user's access to exclude other financial data. The ability to control access to information in the database is as important as the ability to share it

Schemas To provide different users with different information within a database, database systems often rely on schemas and subschemas A schema is a description of the entire database, i.e. the contents of records and the links between them.

Example of a Schema Customer(Customer ID, Name, Address) Order(Order No. , Customer ID, Invoice No. , Status) Invoice(Invoice No. , Customer ID, Order No. , Status) Product(Product Code, Description) Bold underline: primary key Underline: foreign key http://en.wikipedia.org/wiki/Relational_model

Schemas and subschemas Schema = a description of the structure of an entire database, used by database software to maintain the database Subschema = a description of only that portion of the database pertinent to a particular user’s needs, used to prevent sensitive data from being accessed by unauthorised personnel

Database management systems (DBMS) Database Management System = a software layer that maintains a database and manipulates it in response to requests from applications Distributed Database = a database stored on multiple machines DBMS will mask this organisational detail from its users Data independence = the ability to change the organisation of a database without changing the application software which uses it

Figure 9.2 The conceptual layers of a database implementation Two software layers: application software (user interface) DBMS (database manipulation)

Advantages of the Conceptual Layers Application software is simplified as it is insulated from database management details. Easier to control user access through a single DBMS. Data independence: database organisation can be changed without changing the application software. to make a change in the DB for one user, need to change only the overall schema and the subschemas of users involved in the change.

Database models Database model = conceptual view of a database Relational database model Object-oriented database model Abstraction actual storage details are hidden from the user

9.2 Relational database model Name of relation Customers Address Name Customer no. Table heading 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Table All data is stored in rectangular tables called relations. Each row in a relation is a tuple cardinality = no.of tuples Each column is an attribute degree = no. of attributes

Advantages of this Relational Model Information is preserved. e.g. the name and address of a customer are recorded even if the customer has no orders. Information is not duplicated: e.g. the name and address of a customer with two orders are not entered twice.

Figure 9.3 A relation containing employee information

A relation does not specify the order of the attributes or the order of the tuples. Customers Customers 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Univac 15 Retail Road 103 Honeywell 205 North Street 54 Sperry 1 The Lane 102 is the same as

Repeated Tuples A relation cannot have repeated tuples 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Not a relation in a relational database.

Attributes in Different Relations An attribute of a given relation is not an attribute of any other relation. E.g. “Orders.customer no” is a different attribute from “Customers.customer no”.

Keys Primary key: an attribute whose value uniquely identifies a tuple. Composite key: a minimal set of attributes whose values together uniquely identify a tuple Foreign key: set of attributes pointing to a primary key or a composite key in another table.

Example of a Composite Key Post Code Town Street House no. Composite key: House no., Post Code. CH3 8PN Chester High St. 23 DB21 0TQ Bury Acacia Ave. 1

Example of a Foreign Key Product Due date Price Customer no. Order no. Address Name Customer no. 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Order file Customer file

Evaluating a relational design Avoid multiple concepts within one relation Can lead to redundant data Deleting a tuple could also delete necessary but unrelated information

Table Structure Each table should correspond to a single concept or task. Each row of a table should be uniquely identified by a key. Avoid multiple copies of information.

Improving a relational design Decomposition = dividing the columns of a relation into two or more relations, duplicating those columns necessary to maintain relationships Lossless or nonloss decomposition = a “correct” decomposition that does not lose any information Aim: to produce a better table structure

Example of a Lossless Decomposition Address Name Product Due date Price Customer no. Order no. 205 North Street Honeywell Panel 2.4.06 £3400 54 20 25 Retail Road Univac Case 5.5.06 £2000 103 43 1 The Lane Sperry Plate 1.7.06 £1000 102 37 Original table

Lossless Decomposition Address Name Customer no. Product Due date Price Customer no. Order no. 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Order file Customer file

Example of a Lossy Decomposition Dept Job Title Empl Id Planning Manager 18 Finance 17 Assistant 203

Lossy Decomposition Dept Job Title Job Title Empl Id Planning Manager Finance Assistant Manager 18 17 Assistant 203 After decomposition we can no longer tell in which department employees 17 and 18 worked

Figure 9.7 A relation and a proposed decomposition

Figure 9.4 A relation containing redundancy Problem: more than one concept has been combined in a single relation. Redesign the system with 3 relations...

Figure 9.5 An employee database consisting of three relations

Figure 9.6 Finding the departments in which employee 23Y34 has worked

Rules for database normalisation http://www.datamodel.org/NormalizationRules.html http://en.wikipedia.org/wiki/Database_normalization

Relational operations Extracting information from a database Can the answers to database queries be found in a systematic way? In the relational model the answers are always relations.

Example Database queries Get List of Order no and Product. List of tuples of orders for Plate. List of Order no and Name.

Contents of a Relational Database Customers Orders Address Name Customer no Product Due date Price Customer no Order no 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 All data in a relational database are stored in tables

Query 1: List of Order Nos. and Products Project: remove the attributes Customer no., Price and Due date to leave... Orders Product Due date Price Customer no Order no Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Product Order no Ans1: Panel 20 Case 43 Plate 37

Query 2: List of Tuples of Orders for Plate Select: take all tuples for which the product is Plate, to give... Product Due date Price Customer no Order no Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Ans2: Product Due date Price Customer no Order no Plate 1.7.06 £1000 102 37

Query 3: List of Order No. and Name Customers Orders Address Name Customer no Product Due date Price Customer no Order no 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Join: combine tuples in Orders and Customers, then use Project to retain the attributes Order no. and Name.

Query 3: List of Order No. and Name (cont'd) Join and then Project Temp <- Join Orders and Customers where O.Cno=C.Cno C.add C.Nm C.Cno O.Pr O.Due O.pr O.Cno O.no 205 North Street Honeywell 54 Panel 2.4.06 £3400 20 15 Retail Road Univac 103 Case 5.5.06 £2000 43 1 The Lane Sperry 102 Plate 1.7.06 £1000 37 Ans3 C.Nm O.no Honeywell 20 Univac 43 Sperry 37 Names of attributes are abbreviated

Project The original relation is R1, the new relation created by Project is R2 and the attributes projected to R2 are att1,…, attn. R2 <- Project att1,…attn from R1 Example: Ans1 <- Project Orders.Order no, Orders.Product from Orders

Figure 9.9 The PROJECT operation

Select The original relation is R1, the new relation created by Select is R2 and the conditions on the attribute values in R2 are c1,…cn. R2 <- Select from R1 where c1,…, cn. Example: Ans2 <- Select from Orders where Orders.Product=‘Plate’

Figure 9.8 The SELECT operation

Join The original relations are R1, R2, the new relation created by Join is R3 and the conditions on the attribute values in R3 are c1,…cn. R3 <- Join R1 and R2 where c1,…, cn Example: Temp <- Join Orders, Customers where O.Cno=C.Cno

Figure 9.10 The JOIN operation

Figure 9.11 Another example of the JOIN operation

Another Example for Join R3  Join Orders and Customers where Orders.Price>£1500 and O.Cno=C.Cno R3 C.add C.Nm C.Cno O.Pr O.Due O.pr O.Cno O.no 205 North Street Honeywell 54 Panel 2.4.06 £3400 20 15 Retail Road Univac 103 Case 5.5.06 £2000 43

Join and Keys A B C  Join A and B where A.foreign key=B.key C data 22 DB1 16 D2 22 28 D1 16 34 C  Join A and B where A.foreign key=B.key B.data B.key A.data A.Foreign key A.key C DB2 22 D2 28 DB1 16 D1 34

Structured Query Language (SQL) operations to manipulate tuples insert update delete select

Structured Query Language SQL is used to retrieve data and manipulate data. SQL is declarative. It describes the required information. It does not specify how the information is obtained. Adopted by ANSI (1986) and ISO (1987). First commercial release: Oracle Corporation (1979), then IBM (1979) http://en.wikipedia.org/wiki/SQL

SQL Data Retrieval 1 List of Order no and Product. select Order no, Product from Orders SQL command The attributes Order no, Product are projected from Orders Result

SQL Data Retrieval 2 List of tuples of orders for Plate. select * from Orders where Orders.Product=‘Plate’ SQL command * Means all attributes. The resulting relation contains all tuples in Orders for which Orders.Product=‘Plate’. Result

SQL Data Retrieval 3 List of Order no and Name. select Orders.Order no, Customers.name from Orders, Customers where Orders.Customer no = Customers.Customer no SQL command Join Orders and Customers. Project the the attributes Orders.Order no, Customers.name Result

SQL Command for Data Retrieval select <list of attributes> from <list of relations> where <list of conditions> select * from Orders, Customers where Orders.Price > 1500 and Orders.Customer no=Customers.Customer no Example

SQL Commands for Data Manipulation insert into Orders values (56, 54, ‘£1000’, ‘1.8.06’, ‘Plate’) delete from Orders where Orders.Customer no = 54 update Customers set Customers.Address = ‘16 High Street’ where Customers.Customer no = 102

Advantages of SQL High level language with short but powerful commands. Simple, natural syntax. Details of information retrieval left to the DBMS, allowing quick applications development. Multipurpose: querying, modifying, controlling access to data, defining the database http://www.craigsmullins.com/idug_sql.htm

Problems with SQL Large, complicated standard. Standard does not always fully specify the effects of a command. There are many commercial extensions of and variations on the standard, thus SQL code is usually not easily portable. http://en.wikipedia.org/wiki/SQL

SQL and Relational Databases SQL allows many deviations from the relational model, including: Duplicate tuples Unnamed attributes Two or more attributes with the same name Specification of the order of attributes http://en.wikipedia.org/wiki/Relational_model

9.3 Object-oriented databases Object-oriented database = a database constructed by applying the object-oriented paradigm Each data entity stored as a persistent object Relationships indicated by links between objects DBMS maintains inter-object links

Figure 9.13 The associations between objects in an object-oriented database

Advantages of object-oriented databases Matches design paradigm of object-oriented applications Intelligence can be built into attribute handlers Example: names of people Can handle exotic data types Example: multimedia Can store intelligent entities

9.4 Maintaining database integrity Transaction = a sequence of operations that must all happen together Example: transferring money between bank accounts Transaction log = non-volatile record of each transaction’s activities, built before the transaction is allowed to happen Commit point = point at which transaction has been recorded in log Roll-back = procedure to undo a failed, partially completed transaction Problem of cascading rollback.

Maintaining database integrity (continued) Simultaneous access problems Incorrect summary problem Lost update problem Locking = preventing others from accessing data being used by a transaction Shared lock: used when reading data Exclusive lock: used when altering data

9.5 Traditional File Structures Sequential Files Indexed Files Hash Files

Sequential files Sequential file = file whose contents can only be read in order Reader must be able to detect end-of-file (EOF) Data can be stored in logical records, sorted by a key field Greatly increases the speed of batch updates

Figure 9.14 A procedure for merging two sequential files

Figure 9.15 Applying the merge algorithm (Letters are used to represent entire records.  The particular letter indicates the value of the record’s key field.)

Figure 9.16 The structure of a simple employee file implemented as a text file

Indexed files Index = list of (key, location) pairs Sorted by key values location = where the record is stored

Figure 9.17 Opening an indexed file

Hashing Each record has a key The master file is divided into buckets A hash function computes a bucket number for each key value Each record is stored in the bucket corresponding to the hash of its key

Figure 9.18 Hashing the key field value 25X3Z to one of 41 buckets

Figure 9.19 The rudiments of a hashing system

Collisions in Hashing Collision = when two keys hash to the same bucket Major problem when table is over 75% full Solution: increase number of buckets and rehash all data

9.6 Data mining Data mining = a set of techniques for discovering patterns in collections of data Relies heavily on statistical analyses Data warehouse = static data collection to be mined Data cube = data presented from many perspectives to enable mining Raises significant ethical issues when it involves personal information

Data mining strategies Class description Class discrimination Cluster analysis Association analysis Outlier analysis Sequential pattern analysis

Data mining strategies Class description identifying properties that characterise a given group of data items what characterises people who buy small economical cars, or expensive mobile phones Class discrimination identifying properties that divide two groups what distinguishes customers who buy used cars from those who buy new ones; or what distinguishes a reader of Vatan from a reader of Cumhuriyet

Data mining strategies Cluster analysis trying to discover classes find properties of data items that identify groupings Association analysis looking for links between data groups do people who buy beer also buy potato chips?

Data mining strategies Outlier analysis identify data entries that are out of the ordinary e.g. identifying use of stolen credit cards from altered spending patterns Sequential pattern analysis try to identify patterns of behaviour over time e.g. economic trends, climate conditions...

Social impact of database technology Problems Massive amounts of personal data are being collected Often without knowledge or meaningful consent of affected people Data merging produces new, more invasive information Errors are widely disseminated and hard to correct Remedies Existing legal remedies largely ineffective Negative publicity may be more effective