Databases Unit 3_6
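Flat File Databases
- One table containing all the data
- Data must be entered in full each time, e.g. a customer's name and address is typed in again for every new record (data redundancy)
- Data consistency and data integrity are harder to maintain, because copies of the same data can disagree
- Very quick to set up
- Really only useful for small amounts of data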
Relational Database
- Two or more linked tables
- Related data in each table
- Much more complicated to set up
Terms
- Data consistency
- Data redundancy
- Data integrity
- Data independence
Consistency
Because the database uses the same piece of data in different contexts, e.g. a customer's name on an invoice or an order, the data is the same everywhere, i.e. consistent.
Redundancy
In a relational database, data is entered once and used by the whole system. This means that data is only stored ONCE, reducing data redundancy.
Integrity
Data is correct. Integrity depends on what is entered (GIGO: garbage in, garbage out), so it is protected by validation and verification.
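
Validation can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the original slides: the table Tbl_product, its fields and the "price must not be negative" rule are all hypothetical. The DBMS refuses any row that fails the rule.

```python
import sqlite3


def build_db():
    """Create an in-memory database with one validated table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Tbl_product (
            product_id   INTEGER PRIMARY KEY,
            product_name TEXT NOT NULL,                 -- presence check
            price        REAL NOT NULL CHECK (price >= 0)  -- range check
        )
        """
    )
    return conn


def try_insert(conn, name, price):
    """Return True if the row passes validation, False if the DBMS rejects it."""
    try:
        conn.execute(
            "INSERT INTO Tbl_product (product_name, price) VALUES (?, ?)",
            (name, price),
        )
        return True
    except sqlite3.IntegrityError:
        # Raised by SQLite when a NOT NULL or CHECK rule is broken.
        return False


conn = build_db()
accepted = try_insert(conn, "Widget", 2.50)   # sensible data
rejected = try_insert(conn, "Gadget", -1.00)  # garbage in: negative price
print(accepted, rejected)
```

Validation only checks that data is reasonable. It cannot tell whether 2.50 is the right price, which is why verification (checking the entry against its source) is still needed.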
Independence
The data (and the structure it is stored in) can be changed without having to change the applications which process it. In other words, the data and the applications using it are independent.
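
A small sketch of data independence, using Python's built-in sqlite3 module. The table Tbl_customer, its fields and the function customer_names are made up for illustration. The "application" asks for the field it needs by name, so adding a new field to the table does not break it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Tbl_customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO Tbl_customer (name) VALUES ('A. Smith')")


def customer_names(conn):
    """The 'application': it names the field it needs and nothing else."""
    query = "SELECT name FROM Tbl_customer ORDER BY customer_id"
    return [row[0] for row in conn.execute(query)]


before = customer_names(conn)

# Change the structure of the data: add a new field to the table.
conn.execute("ALTER TABLE Tbl_customer ADD COLUMN email TEXT")

# The application code is unchanged and still works.
after = customer_names(conn)
print(before == after)
```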
Primary & Foreign Keys
The primary key (key field) is a unique identifier for a record. Tables can be linked by associating a primary key with a field in another table; this field in the other table is called the foreign key.
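
The link between a primary key and a foreign key can be sketched with Python's built-in sqlite3 module. The tables Tbl_customer and Tbl_order and their fields are hypothetical. Note that SQLite only enforces foreign keys once the foreign_keys setting is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

conn.executescript(
    """
    CREATE TABLE Tbl_customer (
        customer_id INTEGER PRIMARY KEY,   -- primary key: unique per customer
        name        TEXT NOT NULL
    );
    CREATE TABLE Tbl_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL
                    REFERENCES Tbl_customer(customer_id),  -- foreign key
        order_date  TEXT
    );
    INSERT INTO Tbl_customer (customer_id, name) VALUES (1, 'A. Smith');
    INSERT INTO Tbl_order (order_id, customer_id, order_date)
        VALUES (100, 1, '2024-01-15');
    """
)


def order_is_accepted(conn, order_id, customer_id):
    """Try to add an order; return False if the customer does not exist."""
    try:
        conn.execute(
            "INSERT INTO Tbl_order (order_id, customer_id) VALUES (?, ?)",
            (order_id, customer_id),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# Customer 999 does not exist, so the foreign key rule rejects the order.
orphan_ok = order_is_accepted(conn, 101, 999)

# Follow the link from each order back to its customer.
rows = conn.execute(
    """
    SELECT o.order_id, c.name
    FROM Tbl_order AS o
    JOIN Tbl_customer AS c ON c.customer_id = o.customer_id
    """
).fetchall()
print(orphan_ok, rows)
```

The customer's name is stored once in Tbl_customer; every order reaches it through customer_id.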
Normalisation
- Minimises the duplication of data
- Eliminates redundant data
- Ensures data integrity
- Allows flexible extraction of information
1NF
Think of one table containing the customer data, their orders and the product information. This will clearly mean that there is repeated (redundant) data. Moving the repeating fields into their own customer, order and product tables reduces the redundancy: this is 1NF.
2NF
In the last example I deliberately left the orders in one table. Clearly this means having to create a new order for each product ordered by the customer. In other words, some of the data in the order table is dependent on only one of the key fields (product name, for example). To make this database more efficient we need to create a new table, the order details table, which links the order and product tables.
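
The order details idea can be sketched with Python's built-in sqlite3 module. All table names, field names and sample rows here are hypothetical. One order is stored once, and each product on it becomes one row in the linking table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Tbl_product (
        product_id   INTEGER PRIMARY KEY,
        product_name TEXT,
        unit_price   REAL
    );
    CREATE TABLE Tbl_order (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER,
        order_date  TEXT
    );
    -- The linking table: its key is the order AND the product together.
    CREATE TABLE Tbl_order_details (
        order_id   INTEGER REFERENCES Tbl_order(order_id),
        product_id INTEGER REFERENCES Tbl_product(product_id),
        quantity   INTEGER,
        PRIMARY KEY (order_id, product_id)
    );
    INSERT INTO Tbl_product VALUES (1, 'Pen', 0.50), (2, 'Pad', 2.00);
    INSERT INTO Tbl_order VALUES (100, 1, '2024-01-15');
    INSERT INTO Tbl_order_details VALUES (100, 1, 10), (100, 2, 3);
    """
)

# Only one order record exists, however many products are on it.
order_count = conn.execute("SELECT COUNT(*) FROM Tbl_order").fetchone()[0]

# Rebuild the lines of order 100 by following the links.
lines = conn.execute(
    """
    SELECT p.product_name, d.quantity, d.quantity * p.unit_price
    FROM Tbl_order_details AS d
    JOIN Tbl_product AS p ON p.product_id = d.product_id
    WHERE d.order_id = 100
    ORDER BY p.product_id
    """
).fetchall()
print(order_count, lines)
```

The product name and price depend only on the product, so they live in Tbl_product; the quantity depends on both the order and the product, so it lives in Tbl_order_details.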
3NF
We check that every non-key field depends on the key fields and on nothing else. In this example there are no fields that break the rule, but in the example below you could leave fields in the wrong table and they would repeat (only if you were really dozy though!)
Example
Reduce this list of fields into 3NF:
- Ward_number
- Ward_name
- Number_of_beds
- Nurse_name
- Nurse_staff_no
- Patient_number
- Patient_name
- Patient_address
- Patient_tel_no
- Patient_DOB
- Consultant_number
- Consultant_name
- Consultant_specialism
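
One possible answer is sketched below using Python's built-in sqlite3 module. It is not the only correct one, and it rests on assumptions the slide does not state: each nurse works on one ward, each patient is on one ward, and each patient has one consultant. Different assumptions would need extra linking tables.

```python
import sqlite3

# The fields from the exercise (with Consultant_name written without the stray space).
ORIGINAL_FIELDS = {
    "Ward_number", "Ward_name", "Number_of_beds",
    "Nurse_name", "Nurse_staff_no",
    "Patient_number", "Patient_name", "Patient_address",
    "Patient_tel_no", "Patient_DOB",
    "Consultant_number", "Consultant_name", "Consultant_specialism",
}

# Assumed relationships: nurse -> one ward, patient -> one ward and one consultant.
SCHEMA = """
CREATE TABLE Ward (
    Ward_number    INTEGER PRIMARY KEY,
    Ward_name      TEXT,
    Number_of_beds INTEGER
);
CREATE TABLE Nurse (
    Nurse_staff_no INTEGER PRIMARY KEY,
    Nurse_name     TEXT,
    Ward_number    INTEGER REFERENCES Ward(Ward_number)
);
CREATE TABLE Consultant (
    Consultant_number     INTEGER PRIMARY KEY,
    Consultant_name       TEXT,
    Consultant_specialism TEXT
);
CREATE TABLE Patient (
    Patient_number    INTEGER PRIMARY KEY,
    Patient_name      TEXT,
    Patient_address   TEXT,
    Patient_tel_no    TEXT,
    Patient_DOB       TEXT,
    Ward_number       INTEGER REFERENCES Ward(Ward_number),
    Consultant_number INTEGER REFERENCES Consultant(Consultant_number)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)


def columns(conn, table):
    """List the field names of a table, in the order they were defined."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


layout = {t: columns(conn, t) for t in ["Ward", "Nurse", "Consultant", "Patient"]}
all_fields = {field for fields in layout.values() for field in fields}

for table, fields in layout.items():
    print(table, fields)
print("Every original field is placed:", all_fields == ORIGINAL_FIELDS)
```

Each non-key field sits in the table whose key it depends on, e.g. Ward_name depends on Ward_number, so it is stored once in Ward rather than repeated for every patient.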
Security
- In a flat file database, no part of the data can be locked off: anyone who can open the file can see all of it
- In a relational database, restrictions can be imposed upon certain users
Data Warehouse
This is the term for a VERY large database. Consider all of the information being stored on a loyalty card…
Data Mining
This is the process of extracting information from a data warehouse to discover:
- Patterns in data
- Associations in the data
- Trends over time
- Lists of customers likely to buy a product
- Comparisons with competitors
- Modelling
- Predictions
- Location information
- Sale patterns
- Customer buying patterns
- Customer loyalty
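
A toy illustration of finding associations, written in plain Python. The shopping baskets are invented for the example, and real data mining tools use far more sophisticated methods on far more data. The sketch simply counts how often each pair of products appears in the same basket.

```python
from collections import Counter
from itertools import combinations

# Made-up baskets: each set is what one customer bought in one visit.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "bread", "milk"},
]


def pair_counts(baskets):
    """Count how many baskets contain each pair of products."""
    counts = Counter()
    for basket in baskets:
        # sorted() gives each pair one fixed spelling, e.g. ('bread', 'butter').
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

With these baskets the most common pair is bread and butter, bought together 3 times, which is the kind of association a shop might use when deciding what to place side by side.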
Applications of Data Mining
- Helping to counter terrorism (pattern tracking)
- Preventing shoplifting
- Identification of customer needs
DBMS
A Database Management System (DBMS) is a set of computer programs that controls the creation, maintenance and use of a database. It allows data to be independent of the programs that analyse the data.
DBMS
- Allows the database to be defined
- Allows queries
- Allows data to be appended
- Allows modification of the database
- Provides security
- Allows import and export
DBMS - Advantages
- Data is stored in a structured and logical way
- Data independence
- Avoids data redundancy
- Can be used by many users
- Data integrity is maintained
- Increased security
- Data definitions are standardised
DBMS - Disadvantages
- Difficult to use
- Costs can be high
- Centralisation could lead to problems, e.g. if the central system fails, every application that relies on it is affected
Queries
Structured Query Language (SQL) uses a list of commands to select data, e.g.
SELECT Surname FROM Tbl_personnel WHERE Pers_Dept = 'Production' AND per_salary >= 30000
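
To see a query like this run, here is a sketch using Python's built-in sqlite3 module. The table and field names come from the slide, but the staff records are invented, and an ORDER BY has been added so the output order is predictable. In SQL, "greater than or equal to" is written >=.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Tbl_personnel (Surname TEXT, Pers_Dept TEXT, per_salary INTEGER)"
)
# Invented sample records, for illustration only.
conn.executemany(
    "INSERT INTO Tbl_personnel VALUES (?, ?, ?)",
    [
        ("Ahmed", "Production", 32000),
        ("Brown", "Production", 28000),   # right department, salary too low
        ("Clark", "Sales", 35000),        # salary high enough, wrong department
        ("Davies", "Production", 30000),  # exactly 30000 still counts
    ],
)

query = """
SELECT Surname
FROM Tbl_personnel
WHERE Pers_Dept = 'Production' AND per_salary >= 30000
ORDER BY Surname
"""
surnames = [row[0] for row in conn.execute(query)]
print(surnames)
```

Only Ahmed and Davies meet both criteria, because AND requires the department and the salary condition to be true together.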
Using SQL you can:
- Combine data
- Select fields
- Specify criteria
- Select fields for a report
- Specify ordering or groups
- Re-use the query
- Save the results
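
Several of these abilities can be sketched in one go with Python's built-in sqlite3 module. Every name here (Tbl_dept, Tbl_staff, Qry_dept_pay, Tbl_dept_pay_snapshot) and every record is hypothetical. The query combines two tables, groups and orders the output, is stored as a view so it can be re-used, and has its results saved into a new table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Tbl_dept (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
    CREATE TABLE Tbl_staff (
        staff_id INTEGER PRIMARY KEY,
        surname  TEXT,
        dept_id  INTEGER REFERENCES Tbl_dept(dept_id),
        salary   INTEGER
    );
    INSERT INTO Tbl_dept VALUES (1, 'Production'), (2, 'Sales');
    INSERT INTO Tbl_staff VALUES
        (1, 'Ahmed', 1, 32000),
        (2, 'Brown', 1, 28000),
        (3, 'Clark', 2, 35000);

    -- Re-use the query: store it under a name as a view.
    CREATE VIEW Qry_dept_pay AS
        SELECT d.dept_name   AS dept_name,   -- select fields
               COUNT(*)      AS staff,
               AVG(s.salary) AS avg_salary
        FROM Tbl_staff AS s
        JOIN Tbl_dept AS d ON d.dept_id = s.dept_id   -- combine data
        GROUP BY d.dept_name;                         -- specify groups

    -- Save the results: copy the current output into a table.
    CREATE TABLE Tbl_dept_pay_snapshot AS SELECT * FROM Qry_dept_pay;
    """
)

# Run the stored query, ordered with the highest average salary first.
report = conn.execute(
    "SELECT dept_name, staff, avg_salary FROM Qry_dept_pay ORDER BY avg_salary DESC"
).fetchall()
saved = conn.execute("SELECT COUNT(*) FROM Tbl_dept_pay_snapshot").fetchone()[0]
print(report, saved)
```

The view always reflects the current data, whereas the snapshot table keeps the results as they were when it was created.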
Data Dictionary
- Details of tables
- Field names
- Field types
- Field lengths
- Validation
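
Many DBMSs let you read this information back. As a rough sketch, SQLite (through Python's built-in sqlite3 module) exposes it via PRAGMA table_info and the sqlite_master table. The table below reuses the field names from the query slide plus a hypothetical Pers_ID key; the types, lengths and the salary rule are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Tbl_personnel (
        Pers_ID    INTEGER PRIMARY KEY,
        Surname    VARCHAR(30) NOT NULL,
        Pers_Dept  VARCHAR(20),
        per_salary INTEGER CHECK (per_salary >= 0)
    )
    """
)


def data_dictionary(conn, table):
    """Return one entry per field: its name, declared type, and key/null flags."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [
        {"field": name, "type": col_type, "not_null": bool(notnull), "key": bool(pk)}
        for _cid, name, col_type, notnull, _default, pk in rows
    ]


def table_definition(conn, table):
    """Return the stored CREATE TABLE text, which includes any validation rules."""
    row = conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = ?", (table,)
    ).fetchone()
    return row[0]


entries = data_dictionary(conn, "Tbl_personnel")
definition = table_definition(conn, "Tbl_personnel")
for entry in entries:
    print(entry)
```

SQLite records a declared length such as VARCHAR(30) but does not enforce it, so this shows where the information is kept rather than how every DBMS applies it.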
Distributed Databases
Does exactly what it says on the tin: a database that acts as one system but is in fact spread across different locations.
Advantages
- Not in a single location
- Improved performance
- Disaster proof (!?)
- Local access means less network traffic
Disadvantages
- Much more complex
- Increased security risk
- If one server fails, users may lose access
- Relies on communications links
- Inconsistencies may occur between copies of the data