Introduction to FIS 318/618, Financial Systems & Databases: The Relational Database Model Oakland University School of Business Administration Accounting and Finance Joe Callaghan
The Relational Database Model Based on the theory of relational math (set theory) It is an “automatic transmission” database (with embedded relationships between tables) which replaces the “standard transmission” database (which employs flat-file techniques with explicit pointers between files and records) Flat-files (collections of similar records) are being replaced by collections of interrelated files Allows data to be broken down into logical, smaller, more manageable units - simplifies the organization of complex sets of data
Why A Relational Model? Duplicate data reduced - less input, maintenance, storage, and improved data integrity Data independence: Data can be thought of as being stored in tables regardless of how physically stored. Application independence: Databases defined independently from the systems and programs that will use them - allows users to create ad hoc queries, rather than only receive pre-specified reports A change in the database does not require rewriting all the application program codes. Ability to share same data across multiple applications and systems. It has the ability to maintain several tables of related information that can be accessed by several different users in many different ways - a single query can retrieve data from more than one table.
Some Definitions... Data: Raw facts about the organization and its business transactions that are of interest to the end user Database: A computer structure that houses a collection of data Relational database: Stores information about instances of entities (a specific sales event, salesperson), attributes of those entities (invoice no., salesperson ID) , and the relationships among these entities (each sale can only have one salesperson) - perceived by user to be a collection of two-dimensional tables RDBMS: Software that manages a relational database, controls access, and allows users to retrieve requested data through a standard data-access language, SQL.
Entity-type: Something of significance about which you want to store data in a database, e.g., customers, employees, suppliers, inventory items (note: this is a data modeling term – an entity becomes a table in a RDBMS) Table: An entity-type (e.g., customer) and its attributes Attribute: A property or characteristic of an entity. A column in a relational database table, e.g., customer name, reference #, address, zip ((note: this is a data modeling term – an attribute becomes a column in a RDBMS Row (tuple, record): A record of data in a database table - a single occurrence or entity instance Value: Data in a “cell” – the intersection between row and column in a database table
Types of Attributes Key (identifier in data modeling): Attribute, or combination of attributes, that determines the values of other attributes in each row Composite Key: Multiple-attribute keys; may be further subdivided, e.g., phone may be area code and number - can be a primary key Candidate Key (CK): Attribute (or a minimum combination of attributes) that uniquely identifies each row in a given table - there can be more than one CK (employee entity type: SSN; assigned ID#) Primary Key (PK) ( a unique identifier in data modeling): A CK selected to uniquely identify all other attributes in a given row; cannot be Null Foreign Key (FK): ( a relationship in data modeling): Attribute (combination of attributes) whose value(s) must match the Primary Key in another table in the same database, or whose value(s) must be Null Non-key Attribute: Attribute that is not part of a key
Attributes With A Null Value Null Value: An unknown attribute value (e.g., salesperson not yet allocated to a customer) - it is not a zero. It is an optional attribute. Inclusion of nulls in a table is important - they provide a consistent way to distinguish between valid data such as a 0 and missing data, e.g., an account payable with 0 is good to see; one with an unknown balance can indicate a significant problem In most cases, nulls appear as blanks on a query’s result table on a screen 1. Weaker referential integrity: You can have a null value in the FK if the relationship is a “many optional” relationship l at that end.
Relationships Data modeling term that indicates an association between tables: How the things of significance are related (A FK must match to an existing PK, or else be NULL) This controlled redundancy allows linking of tables (hence “relational”) Entity-Relationship Diagram (ERD): A data model (at the conceptual level) that shows the relationships enforcing business rules between entities (tables) in a database environment (Fig. 5.4)
Business Rules Narrative descriptions of policies, procedures, or principles in an organization Examples: A pilot cannot be on duty for more than 10 hours in a 24-hour period A professor must teach at least three classes in a semester A class may not have fewer than 10 enrollments
Concept to Definition
Example from Ch. 3 (P&S) Partial MSC: Multi-product merchandiser Sale to Customer Ship to Customer Multi-product merchandiser Salesperson, Shipping Clerk
Customer Invoice is sent an 1 M Invoice Line contains draws upon Inventory Figure 3.25 Entity-relationship diagram.
Customer Invoice is sent an 1 M Invoice Line contains InvoiceInven Inventory Figure 3.26 Revised entity-relationship diagram.
Figure 3.1 Sales transactions stored in a database accounting system.
Figure 3.7 The Customer relation, tblCustomer.
foreign key primary key Figure 3.8 Primary key and foreign key relationship.
Figure 3.9 Schemas of tables in the invoicing system.
primary key foreign key 500 records in this table Figure 3.10 Example rows in the Invoice table, tblInvoice.
primary key foreign key Figure 3.11 Example rows in the Invoice Line table, tblInvoiceLine.
primary key foreign key Figure 3.12 Example rows in the primary Inventory table, tblInventory.
primary key foreign key Figure 3.13 Example rows in the secondary Inventory table, tblInventoryDescription.
Normalization Process of taking a “raw” database and breaking it into logical units called tables, by following theoretical rules called normal forms The intent is to create a degree of controlled redundancy that allows two or more tables to be joined, by matching a FK in one table to a PK in another table Referential integrity (constraint created upon table creation) is enforced when every non-null FK value must match an existing PK value (if there is a FK, there has to be a PK for that FK in another table) Normalization has six nested normal forms Generally a well-formed business database will be normalized through 3rd normal form (3NF)
Benefits of Normalization Greater overall database organization Minimize data redundancies Data consistency within the database A more flexible database design Data can be used more productively A better handle on database security Disadvantage of Normalization Reduced database performance because database must locate requested tables and join data - requires additional processing logic
Normal Forms Normalization through a series of stages called NORMAL FORMS Each NF depends on normalization steps taken in the previous NF First Normal Form - 1NF Second Normal Form - 2NF Third Normal Form - 3NF
1NF First normal form rules: All key attributes must be defined There must be no repeating groups (values), i.e., each row/column intersection can have only one value All attributes must be functionally dependent on the PK, or part of the PK - e.g., SSN determines DOB, but DOB cannot determine SSN Hint: Put all attributes in a two-dimensional flat table, with no repeating values
General Journal Entry: Traditional View - Unnormalized Assume that the transaction # will reset to 1 at the beginning of the next fiscal year
GJ: First Normal Form
2NF Second Normal Form Rules: Table is in 1NF; and Table includes no partial dependencies; that is, no attribute is dependent on only portion of the primary key – must be dependent on entire PK Hint: Examine non-key attributes to determine whether any are dependent on only portion of a composite PK - this would violate 2NF If a table only has one attribute as a PK, then it is in 2NF.
Chart of Accounts Table
Transaction Listing Table
Transaction Detail Table (Base Table)
3NF Third Normal Form Rules: Table is in 2NF and There are no transitive dependencies Hint: You will violate 3NF if you can deduce the value of a non-key attribute by knowing the value of another non-key attribute
Normalized Transaction Detail (Base) Table
Example from Ch. 3 (P&S) Continued
repeating group Figure 3.14 Example table containing repeating groups.
Figure 3.15 Example rows of the Customer table in first normal form (1NF).
Figure 3.16 Functional dependencies in the Customer table.
Figure 3.17 Invoice table in second normal form (2NF).
Figure 3.18 Transitive dependencies in the Invoice table shown in Figure 3.17.
Operations Restrict aka Select Project Join SQL the standard language DDL: data definition language DML: data manipulation language Destructive: CUD Non-Destructive: R CRUD again
Figure 3.19 Select operation.
Figure 3.20 Project operation.
Figure 3.21 Join operation.
Figure 3.22 Joining tables with primary key/foreign key relationships.