Download presentation
Presentation is loading. Please wait.
1
Chapter 9: Database Systems
9.1 Database Fundamentals 9.2 The Relational Model 9.3 Object-Oriented Databases 9.4 Maintaining Database Integrity 9.5 Traditional File Structures 9.6 Data Mining 9.7 Social Impact of Database Technology
2
Definition of a Database
A database (DB) is collection of data with internal links that allow users to access the data from a variety of perspectives. Contrast: a flat file, which can only be accessed in one way.
3
Database Database is multidimensional, since internal links between its entries make the information accessible from a variety of perspectives Flat File is the traditional one-dimensional file storage system that presents its information from a single point of view
4
Example of a Flat File Panel 2.4.06 £3400 205 North Street Honeywell
Case 5.5.06 £2000 15 Retail Road Univac 43 Plate 1.7.06 £1000 1 The Lane Sperry 37 Product Due date Price Customer address Customer name Order no.
5
Relational Database Order file Customer file Address Name Customer no.
Product Due date Price Customer no. Order no. 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Order file Customer file
6
Relational Database linkage Order file Customer file Address Name
Customer no. Product Due date Price Customer no. Order no. 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Order file Customer file
7
Figure 9.1 A file versus a database organization
8
Databases emerged as a means of integrating the information stored and maintained by a particular organisation Different departments/users can use the same data. Benefits: Reduce (or remove) duplication of data being stored in several places. Ensure all departments use the same versions of the data Easier to make changes to data -- change in one place only, not in several files
9
Advantages of a Single Central Database
Easier to control access to information. Easier to update information. Easier (overall) to maintain and upgrade as technology improves.
10
Potential problems of data integration
Security and Privacy issues Sensitive data (personal data, salary records ...) may be accessed by unauthorised personnel e.g. Need to restrict payroll user's access to exclude other financial data. The ability to control access to information in the database is as important as the ability to share it
11
Schemas To provide different users with different information within a database, database systems often rely on schemas and subschemas A schema is a description of the entire database, i.e. the contents of records and the links between them.
12
Example of a Schema Customer(Customer ID, Name, Address)
Order(Order No. , Customer ID, Invoice No. , Status) Invoice(Invoice No. , Customer ID, Order No. , Status) Product(Product Code, Description) Bold underline: primary key Underline: foreign key
13
Schemas and subschemas
Schema = a description of the structure of an entire database, used by database software to maintain the database Subschema = a description of only that portion of the database pertinent to a particular user’s needs, used to prevent sensitive data from being accessed by unauthorised personnel
14
Database management systems (DBMS)
Database Management System = a software layer that maintains a database and manipulates it in response to requests from applications Distributed Database = a database stored on multiple machines DBMS will mask this organisational detail from its users Data independence = the ability to change the organisation of a database without changing the application software which uses it
15
Figure 9.2 The conceptual layers of a database implementation
Two software layers: application software (user interface) DBMS (database manipulation)
16
Advantages of the Conceptual Layers
Application software is simplified as it is insulated from database management details. Easier to control user access through a single DBMS. Data independence: database organisation can be changed without changing the application software. to make a change in the DB for one user, need to change only the overall schema and the subschemas of users involved in the change.
17
Database models Database model = conceptual view of a database
Relational database model Object-oriented database model Abstraction actual storage details are hidden from the user
18
9.2 Relational database model
Name of relation Customers Address Name Customer no. Table heading 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Table All data is stored in rectangular tables called relations. Each row in a relation is a tuple cardinality = no.of tuples Each column is an attribute degree = no. of attributes
19
Advantages of this Relational Model
Information is preserved. e.g. the name and address of a customer are recorded even if the customer has no orders. Information is not duplicated: e.g. the name and address of a customer with two orders are not entered twice.
20
Figure 9.3 A relation containing employee information
21
A relation does not specify the order of the attributes or the order of the tuples.
Customers Customers 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Univac 15 Retail Road 103 Honeywell 205 North Street 54 Sperry 1 The Lane 102 is the same as
22
Repeated Tuples A relation cannot have repeated tuples
205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Not a relation in a relational database.
23
Attributes in Different Relations
An attribute of a given relation is not an attribute of any other relation. E.g. “Orders.customer no” is a different attribute from “Customers.customer no”.
24
Keys Primary key: an attribute whose value uniquely identifies a tuple. Composite key: a minimal set of attributes whose values together uniquely identify a tuple Foreign key: set of attributes pointing to a primary key or a composite key in another table.
25
Example of a Composite Key
Post Code Town Street House no. Composite key: House no., Post Code. CH3 8PN Chester High St. 23 DB21 0TQ Bury Acacia Ave. 1
26
Example of a Foreign Key
Product Due date Price Customer no. Order no. Address Name Customer no. 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Order file Customer file
27
Evaluating a relational design
Avoid multiple concepts within one relation Can lead to redundant data Deleting a tuple could also delete necessary but unrelated information
28
Table Structure Each table should correspond to a single concept or task. Each row of a table should be uniquely identified by a key. Avoid multiple copies of information.
29
Improving a relational design
Decomposition = dividing the columns of a relation into two or more relations, duplicating those columns necessary to maintain relationships Lossless or nonloss decomposition = a “correct” decomposition that does not lose any information Aim: to produce a better table structure
30
Example of a Lossless Decomposition
Address Name Product Due date Price Customer no. Order no. 205 North Street Honeywell Panel 2.4.06 £3400 54 20 25 Retail Road Univac Case 5.5.06 £2000 103 43 1 The Lane Sperry Plate 1.7.06 £1000 102 37 Original table
31
Lossless Decomposition
Address Name Customer no. Product Due date Price Customer no. Order no. 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Order file Customer file
32
Example of a Lossy Decomposition
Dept Job Title Empl Id Planning Manager 18 Finance 17 Assistant 203
33
Lossy Decomposition Dept Job Title Job Title Empl Id Planning Manager Finance Assistant Manager 18 17 Assistant 203 After decomposition we can no longer tell in which department employees 17 and 18 worked
34
Figure 9.7 A relation and a proposed decomposition
35
Figure 9.4 A relation containing redundancy
Problem: more than one concept has been combined in a single relation. Redesign the system with 3 relations...
36
Figure 9.5 An employee database consisting of three relations
37
Figure 9.6 Finding the departments in which employee 23Y34 has worked
38
Rules for database normalisation
39
Relational operations
Extracting information from a database Can the answers to database queries be found in a systematic way? In the relational model the answers are always relations.
40
Example Database queries
Get List of Order no and Product. List of tuples of orders for Plate. List of Order no and Name.
41
Contents of a Relational Database
Customers Orders Address Name Customer no Product Due date Price Customer no Order no 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 All data in a relational database are stored in tables
42
Query 1: List of Order Nos. and Products
Project: remove the attributes Customer no., Price and Due date to leave... Orders Product Due date Price Customer no Order no Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Product Order no Ans1: Panel 20 Case 43 Plate 37
43
Query 2: List of Tuples of Orders for Plate
Select: take all tuples for which the product is Plate, to give... Product Due date Price Customer no Order no Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Ans2: Product Due date Price Customer no Order no Plate 1.7.06 £1000 102 37
44
Query 3: List of Order No. and Name
Customers Orders Address Name Customer no Product Due date Price Customer no Order no 205 North Street Honeywell 54 15 Retail Road Univac 103 1 The Lane Sperry 102 Panel 2.4.06 £3400 54 20 Case 5.5.06 £2000 103 43 Plate 1.7.06 £1000 102 37 Join: combine tuples in Orders and Customers, then use Project to retain the attributes Order no. and Name.
45
Query 3: List of Order No. and Name (cont'd)
Join and then Project Temp <- Join Orders and Customers where O.Cno=C.Cno C.add C.Nm C.Cno O.Pr O.Due O.pr O.Cno O.no 205 North Street Honeywell 54 Panel 2.4.06 £3400 20 15 Retail Road Univac 103 Case 5.5.06 £2000 43 1 The Lane Sperry 102 Plate 1.7.06 £1000 37 Ans3 C.Nm O.no Honeywell 20 Univac 43 Sperry 37 Names of attributes are abbreviated
46
Project The original relation is R1, the new relation created by Project is R2 and the attributes projected to R2 are att1,…, attn. R2 <- Project att1,…attn from R1 Example: Ans1 <- Project Orders.Order no, Orders.Product from Orders
47
Figure 9.9 The PROJECT operation
48
Select The original relation is R1, the new relation created by Select is R2 and the conditions on the attribute values in R2 are c1,…cn. R2 <- Select from R1 where c1,…, cn. Example: Ans2 <- Select from Orders where Orders.Product=‘Plate’
49
Figure 9.8 The SELECT operation
50
Join The original relations are R1, R2, the new relation created by Join is R3 and the conditions on the attribute values in R3 are c1,…cn. R3 <- Join R1 and R2 where c1,…, cn Example: Temp <- Join Orders, Customers where O.Cno=C.Cno
51
Figure 9.10 The JOIN operation
52
Figure 9.11 Another example of the JOIN operation
53
Another Example for Join
R3 Join Orders and Customers where Orders.Price>£1500 and O.Cno=C.Cno R3 C.add C.Nm C.Cno O.Pr O.Due O.pr O.Cno O.no 205 North Street Honeywell 54 Panel 2.4.06 £3400 20 15 Retail Road Univac 103 Case 5.5.06 £2000 43
54
Join and Keys A B C Join A and B where A.foreign key=B.key C data
22 DB1 16 D2 22 28 D1 16 34 C Join A and B where A.foreign key=B.key B.data B.key A.data A.Foreign key A.key C DB2 22 D2 28 DB1 16 D1 34
55
Structured Query Language (SQL)
operations to manipulate tuples insert update delete select
56
Structured Query Language
SQL is used to retrieve data and manipulate data. SQL is declarative. It describes the required information. It does not specify how the information is obtained. Adopted by ANSI (1986) and ISO (1987). First commercial release: Oracle Corporation (1979), then IBM (1979)
57
SQL Data Retrieval 1 List of Order no and Product.
select Order no, Product from Orders SQL command The attributes Order no, Product are projected from Orders Result
58
SQL Data Retrieval 2 List of tuples of orders for Plate. select *
from Orders where Orders.Product=‘Plate’ SQL command * Means all attributes. The resulting relation contains all tuples in Orders for which Orders.Product=‘Plate’. Result
59
SQL Data Retrieval 3 List of Order no and Name.
select Orders.Order no, Customers.name from Orders, Customers where Orders.Customer no = Customers.Customer no SQL command Join Orders and Customers. Project the the attributes Orders.Order no, Customers.name Result
60
SQL Command for Data Retrieval
select <list of attributes> from <list of relations> where <list of conditions> select * from Orders, Customers where Orders.Price > 1500 and Orders.Customer no=Customers.Customer no Example
61
SQL Commands for Data Manipulation
insert into Orders values (56, 54, ‘£1000’, ‘1.8.06’, ‘Plate’) delete from Orders where Orders.Customer no = 54 update Customers set Customers.Address = ‘16 High Street’ where Customers.Customer no = 102
62
Advantages of SQL High level language with short but powerful commands. Simple, natural syntax. Details of information retrieval left to the DBMS, allowing quick applications development. Multipurpose: querying, modifying, controlling access to data, defining the database
63
Problems with SQL Large, complicated standard.
Standard does not always fully specify the effects of a command. There are many commercial extensions of and variations on the standard, thus SQL code is usually not easily portable.
64
SQL and Relational Databases
SQL allows many deviations from the relational model, including: Duplicate tuples Unnamed attributes Two or more attributes with the same name Specification of the order of attributes
65
9.3 Object-oriented databases
Object-oriented database = a database constructed by applying the object-oriented paradigm Each data entity stored as a persistent object Relationships indicated by links between objects DBMS maintains inter-object links
66
Figure 9.13 The associations between objects in an object-oriented database
67
Advantages of object-oriented databases
Matches design paradigm of object-oriented applications Intelligence can be built into attribute handlers Example: names of people Can handle exotic data types Example: multimedia Can store intelligent entities
68
9.4 Maintaining database integrity
Transaction = a sequence of operations that must all happen together Example: transferring money between bank accounts Transaction log = non-volatile record of each transaction’s activities, built before the transaction is allowed to happen Commit point = point at which transaction has been recorded in log Roll-back = procedure to undo a failed, partially completed transaction Problem of cascading rollback.
69
Maintaining database integrity (continued)
Simultaneous access problems Incorrect summary problem Lost update problem Locking = preventing others from accessing data being used by a transaction Shared lock: used when reading data Exclusive lock: used when altering data
70
9.5 Traditional File Structures
Sequential Files Indexed Files Hash Files
71
Sequential files Sequential file = file whose contents can only be read in order Reader must be able to detect end-of-file (EOF) Data can be stored in logical records, sorted by a key field Greatly increases the speed of batch updates
72
Figure 9.14 A procedure for merging two sequential files
73
Figure Applying the merge algorithm (Letters are used to represent entire records. The particular letter indicates the value of the record’s key field.)
74
Figure 9.16 The structure of a simple employee file implemented as a text file
75
Indexed files Index = list of (key, location) pairs
Sorted by key values location = where the record is stored
76
Figure 9.17 Opening an indexed file
77
Hashing Each record has a key The master file is divided into buckets
A hash function computes a bucket number for each key value Each record is stored in the bucket corresponding to the hash of its key
78
Figure 9.18 Hashing the key field value 25X3Z to one of 41 buckets
79
Figure 9.19 The rudiments of a hashing system
80
Collisions in Hashing Collision = when two keys hash to the same bucket Major problem when table is over 75% full Solution: increase number of buckets and rehash all data
81
9.6 Data mining Data mining = a set of techniques for discovering patterns in collections of data Relies heavily on statistical analyses Data warehouse = static data collection to be mined Data cube = data presented from many perspectives to enable mining Raises significant ethical issues when it involves personal information
82
Data mining strategies
Class description Class discrimination Cluster analysis Association analysis Outlier analysis Sequential pattern analysis
83
Data mining strategies
Class description identifying properties that characterise a given group of data items what characterises people who buy small economical cars, or expensive mobile phones Class discrimination identifying properties that divide two groups what distinguishes customers who buy used cars from those who buy new ones; or what distinguishes a reader of Vatan from a reader of Cumhuriyet
84
Data mining strategies
Cluster analysis trying to discover classes find properties of data items that identify groupings Association analysis looking for links between data groups do people who buy beer also buy potato chips?
85
Data mining strategies
Outlier analysis identify data entries that are out of the ordinary e.g. identifying use of stolen credit cards from altered spending patterns Sequential pattern analysis try to identify patterns of behaviour over time e.g. economic trends, climate conditions...
86
Social impact of database technology
Problems Massive amounts of personal data are being collected Often without knowledge or meaningful consent of affected people Data merging produces new, more invasive information Errors are widely disseminated and hard to correct Remedies Existing legal remedies largely ineffective Negative publicity may be more effective
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.