Published by Penelope Hoover. Modified over 9 years ago.
1
SAN DIEGO SUPERCOMPUTER CENTER Introduction to Database Design July 2006 Ken Nunes knunes @ sdsc.edu
2
SAN DIEGO SUPERCOMPUTER CENTER Database Design Agenda Introductions General Design Considerations Entity-Relationship Model Normalization Overview of SQL Star Schemas Additional Information Q&A
3
SAN DIEGO SUPERCOMPUTER CENTER General Design Considerations Users Application Requirements Legacy Systems/Data
4
SAN DIEGO SUPERCOMPUTER CENTER Users Who are they? Administrative Scientific Technical Impact Access Controls Interfaces Service levels
5
SAN DIEGO SUPERCOMPUTER CENTER Application Requirements What kind of database? OnLine Analytical Processing (OLAP) OnLine Transactional Processing (OLTP) Budget Platform / Vendor Workflow? order of operations error handling reporting
6
SAN DIEGO SUPERCOMPUTER CENTER Legacy Systems/Data What systems are currently in place? Where does the data come from? How is it generated? What format is it in? What is the data used for? Which parts of the system must remain static?
7
SAN DIEGO SUPERCOMPUTER CENTER Entity - Relationship Model A logical design method which emphasizes simplicity and readability. Basic objects of the model are: Entities Relationships Attributes
8
SAN DIEGO SUPERCOMPUTER CENTER Entities Data objects detailed by the information in the database. Denoted by rectangles in the model. [Diagram: the entities Employee and Department]
9
SAN DIEGO SUPERCOMPUTER CENTER Attributes Characteristics of entities or relationships. Denoted by ellipses in the model. [Diagram: Employee with attributes Name and SSN; Department with attributes Name and Budget]
10
SAN DIEGO SUPERCOMPUTER CENTER Relationships Represent associations between entities. Denoted by diamonds in the model. [Diagram: Employee (Name, SSN) works in Department (Name, Budget); the relationship carries the attribute Start date]
11
SAN DIEGO SUPERCOMPUTER CENTER Relationship Connectivity Constraints on the mapping of the associated entities in the relationship. Denoted by variables between the related entities. Generally, values for connectivity are expressed as “one” or “many”. [Diagram: Employee (Name, SSN) and Department (Name, Budget) joined by the relationship work, with connectivity labels 1 and N and the attribute Start date]
12
SAN DIEGO SUPERCOMPUTER CENTER Connectivity One-to-one (1:1): Department has Manager. One-to-many (1:N): Department has Project. Many-to-many (N:M): Employee works on Project.
13
SAN DIEGO SUPERCOMPUTER CENTER ER example Retailer wants to create an online webstore. The retailer requires information on: Customers Items Orders
14
SAN DIEGO SUPERCOMPUTER CENTER Webstore Entities & Attributes Customers: name, credit card, address. Items: name, price, inventory. Orders: item, quantity, cost, date, status.
15
SAN DIEGO SUPERCOMPUTER CENTER Webstore Relationships Identify the relationships. The orders are recorded each time a customer purchases items, so the customer and order entities are related. Each customer may make several purchases, so the relationship is one-to-many. [Diagram: Customer (1) purchase Order (N)]
16
SAN DIEGO SUPERCOMPUTER CENTER Webstore Relationships Identify the relationships. An order consists of the items a customer purchases, and each item can be found in multiple orders. Since one order can contain multiple items and one item can appear in multiple orders, the relationship is many-to-many. [Diagram: Order (N) consists Item (M)]
17
SAN DIEGO SUPERCOMPUTER CENTER Webstore ER Diagram [Diagram: Customers (name, address, credit card) 1 purchase N Orders (item, quantity, cost, date, status); Orders N consists M Items (name, price, inventory)]
18
SAN DIEGO SUPERCOMPUTER CENTER Logical Design to Physical Design Creating relational SQL schemas from entity-relationship models. Transform each entity into a table with the key and its attributes. Transform each relationship as either a relationship table (many-to-many) or a “foreign key” (one-to-one and one-to-many).
19
SAN DIEGO SUPERCOMPUTER CENTER Entity tables Transform each entity into a table with a key and its attributes. [Diagram: Employee with attributes Name and SSN] create table employee (emp_no number, name varchar2(256), ssn number, primary key (emp_no));
20
SAN DIEGO SUPERCOMPUTER CENTER Foreign Keys Transform each one-to-one or one-to-many relationship as a “foreign key”. A foreign key is a reference in the child (many) table to the primary key of the parent (one) table. The parent table must exist before the child table that references it: create table department (dept_no number, name varchar2(50), primary key (dept_no)); create table employee (emp_no number, dept_no number, name varchar2(256), ssn number, primary key (emp_no), foreign key (dept_no) references department); [Diagram: Department (1) has Employee (N)]
21
SAN DIEGO SUPERCOMPUTER CENTER Foreign Key [Diagram: rows of the Department table linked to rows of the Employee table]
Accounting has 1 employee: Brian Burnett
Human Resources has 2 employees: Nora Edwards, Ben Smith
IT has 3 employees: Ajay Patel, John O’Leary, Julia Lenin
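The department and employee example above can be tried out with a small script. This is only a sketch: it uses Python's built-in sqlite3 module rather than the Oracle-style dialect on the slides, the emp_no and dept_no values are made up for illustration, and the ssn column is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
create table department (
    dept_no integer primary key,
    name    varchar(50)
);
create table employee (
    emp_no  integer primary key,
    dept_no integer references department (dept_no),
    name    varchar(256)
);
""")

# Key values below are hypothetical; the names come from the slide.
conn.executemany("insert into department values (?, ?)",
                 [(1, "Accounting"), (2, "Human Resources"), (3, "IT")])
conn.executemany("insert into employee values (?, ?, ?)", [
    (101, 1, "Brian Burnett"),
    (102, 2, "Nora Edwards"), (103, 2, "Ben Smith"),
    (104, 3, "Ajay Patel"), (105, 3, "John O'Leary"), (106, 3, "Julia Lenin"),
])

# Count employees per department by following the foreign key.
headcount = dict(conn.execute("""
    select d.name, count(*)
    from department d join employee e on e.dept_no = d.dept_no
    group by d.name
"""))

# A child row that points at a nonexistent parent is rejected.
try:
    conn.execute("insert into employee values (107, 99, 'No Department')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(headcount, orphan_rejected)
```

The rejected insert is the point of the foreign key: the child (many) table cannot reference a department that does not exist in the parent (one) table.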
22
SAN DIEGO SUPERCOMPUTER CENTER Many-to-Many tables Transform each many-to-many relationship as a table. The relationship table will contain the foreign keys to the related entities as well as any relationship attributes. create table project_employee_details (proj_no number, emp_no number, start_date date, primary key (proj_no, emp_no), foreign key (proj_no) references project, foreign key (emp_no) references employee); [Diagram: Employee (N) has Project (M), with the relationship attribute Start date]
23
SAN DIEGO SUPERCOMPUTER CENTER Many-to-Many tables [Diagram: rows of the Project table linked to rows of the Employee table through Project_employee_details]
Audit has 1 employee: Brian Burnett
Budget has 2 employees: Julia Lenin, Nora Edwards
Intranet has 3 employees: Julia Lenin, John O’Leary, Ajay Patel
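The project assignments above can be reproduced with a relationship table. The sketch below again assumes Python's sqlite3 module in place of the slides' dialect; the emp_no and proj_no values are hypothetical, and start_date is left null because the slides give no dates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table employee (emp_no integer primary key, name text);
create table project  (proj_no integer primary key, name text);
create table project_employee_details (
    proj_no    integer references project (proj_no),
    emp_no     integer references employee (emp_no),
    start_date text,
    primary key (proj_no, emp_no)
);
""")

# Hypothetical key values; names and assignments follow the slide.
employees = {1: "Brian Burnett", 2: "Nora Edwards", 3: "Ajay Patel",
             4: "John O'Leary", 5: "Julia Lenin"}
projects = {10: "Audit", 20: "Budget", 30: "Intranet"}
assignments = [(10, 1), (20, 5), (20, 2), (30, 5), (30, 4), (30, 3)]

conn.executemany("insert into employee values (?, ?)", employees.items())
conn.executemany("insert into project values (?, ?)", projects.items())
conn.executemany(
    "insert into project_employee_details (proj_no, emp_no) values (?, ?)",
    assignments)

# One employee can appear on many projects: walk the relationship table.
julia_projects = [row[0] for row in conn.execute("""
    select p.name
    from project p
    join project_employee_details d on d.proj_no = p.proj_no
    join employee e on e.emp_no = d.emp_no
    where e.name = 'Julia Lenin'
    order by p.name
""")]

# And one project can have many employees.
intranet_size = conn.execute(
    "select count(*) from project_employee_details where proj_no = 30"
).fetchone()[0]

print(julia_projects, intranet_size)
```

The composite primary key (proj_no, emp_no) keeps the same employee from being assigned to the same project twice.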
24
SAN DIEGO SUPERCOMPUTER CENTER Normalization A logical design method which minimizes data redundancy and reduces design flaws. Consists of applying various “normal” forms to the database design. The normal forms break down large tables into smaller subsets.
25
SAN DIEGO SUPERCOMPUTER CENTER First Normal Form (1NF) Each attribute must be atomic No repeating columns within a row. No multi-valued columns. 1NF simplifies attributes Queries become easier.
26
SAN DIEGO SUPERCOMPUTER CENTER 1NF Employee (unnormalized) Employee (1NF)
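As a rough illustration of the 1NF step, the Python sketch below splits a multi-valued column into one row per value. The rows and the `skills` column contents are hypothetical stand-ins, since the slide's tables did not survive in this transcript.

```python
# Hypothetical unnormalized rows: "skills" holds several values in one column.
unnormalized = [
    {"emp_no": 1, "name": "Brian Burnett", "skills": "auditing, bookkeeping"},
    {"emp_no": 2, "name": "Ajay Patel", "skills": "sql"},
]


def to_first_normal_form(rows):
    """Emit one row per (employee, skill) so every column value is atomic."""
    result = []
    for row in rows:
        for skill in row["skills"].split(","):
            result.append({
                "emp_no": row["emp_no"],
                "name": row["name"],
                "skill": skill.strip(),
            })
    return result


employee_1nf = to_first_normal_form(unnormalized)
for row in employee_1nf:
    print(row)
```

After the split, a query such as "who knows sql" becomes a simple equality test on the skill column instead of a substring search.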
27
SAN DIEGO SUPERCOMPUTER CENTER Second Normal Form (2NF) The table must be in 1NF, and each non-key attribute must be functionally dependent on the whole primary key. Functional dependence: one or more attributes uniquely determine the value of other attributes. Any non-dependent attributes are moved into a smaller (subset) table. 2NF improves data integrity. Prevents update, insert, and delete anomalies.
28
SAN DIEGO SUPERCOMPUTER CENTER Functional Dependence Name, dept_no, and dept_name are functionally dependent on emp_no. (emp_no -> name, dept_no, dept_name) Skills is not functionally dependent on emp_no since it is not unique to each emp_no. Employee (1NF)
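A functional dependency can be checked mechanically against sample data: it holds if no two rows agree on the determining columns but disagree on the dependent ones. The Python sketch below assumes a hypothetical helper named `determines` and made-up rows, because the slide's Employee (1NF) table is not available here.

```python
def determines(rows, lhs, rhs):
    """Return True if the columns in lhs functionally determine those in rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[col] for col in lhs)
        value = tuple(row[col] for col in rhs)
        # The first value seen for a key is remembered; any later
        # disagreement means the dependency does not hold.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical Employee (1NF) rows: one row per employee skill.
employee_1nf = [
    {"emp_no": 1, "name": "Brian Burnett", "dept_no": 10, "dept_name": "Accounting", "skill": "bookkeeping"},
    {"emp_no": 1, "name": "Brian Burnett", "dept_no": 10, "dept_name": "Accounting", "skill": "auditing"},
    {"emp_no": 2, "name": "Ajay Patel", "dept_no": 30, "dept_name": "IT", "skill": "sql"},
    {"emp_no": 2, "name": "Ajay Patel", "dept_no": 30, "dept_name": "IT", "skill": "networking"},
    {"emp_no": 3, "name": "Julia Lenin", "dept_no": 30, "dept_name": "IT", "skill": "sql"},
]

emp_determines_details = determines(
    employee_1nf, ["emp_no"], ["name", "dept_no", "dept_name"])
emp_determines_skill = determines(employee_1nf, ["emp_no"], ["skill"])

print(emp_determines_details, emp_determines_skill)
```

A check like this can only refute a dependency on the data at hand; whether a dependency is meant to hold is a design decision about the real-world rules.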
29
SAN DIEGO SUPERCOMPUTER CENTER 2NF Employee (1NF) Employee (2NF)Skills (2NF)
30
SAN DIEGO SUPERCOMPUTER CENTER Data Integrity Insert Anomaly: a row cannot be added without null values. e.g., inserting a new department that has no employees yet leaves the primary key emp_no null. Update Anomaly: a single name change requires multiple updates, which degrades performance and risks inconsistent data. e.g., changing the IT dept_name to IS. Delete Anomaly: deleting one fact also deletes wanted information. e.g., deleting the IT department removes employee Barbara Jones from the database. [Table: Employee (1NF)]
31
SAN DIEGO SUPERCOMPUTER CENTER Third Normal Form (3NF) Remove transitive dependencies. Transitive dependence: a non-key attribute depends on another non-key attribute rather than directly on the key, which usually means two separate entities exist within one table. Any transitive dependencies are moved into a smaller (subset) table. 3NF further improves data integrity. Prevents update, insert, and delete anomalies.
32
SAN DIEGO SUPERCOMPUTER CENTER Transitive Dependence Dept_no and dept_name are functionally dependent on emp_no; however, dept_name is determined by dept_no, so department can be considered a separate entity. [Table: Employee (2NF)]
33
SAN DIEGO SUPERCOMPUTER CENTER 3NF Employee (2NF) Employee (3NF) Department (3NF)
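To see what the 3NF split buys, the sketch below renames the IT department to IS, the update anomaly example used earlier in the deck. It assumes Python's sqlite3 module and hypothetical key values; with department in its own table, the rename touches exactly one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table department (
    dept_no   integer primary key,
    dept_name text
);
create table employee (
    emp_no  integer primary key,
    name    text,
    dept_no integer references department (dept_no)
);
insert into department values (10, 'Accounting'), (30, 'IT');
insert into employee values
    (1, 'Brian Burnett', 10),
    (2, 'Ajay Patel', 30),
    (3, 'Julia Lenin', 30);
""")

# The department name is stored once, so the rename is a one-row update.
cursor = conn.execute(
    "update department set dept_name = 'IS' where dept_name = 'IT'")
rows_updated = cursor.rowcount

# Every employee of that department sees the new name through the join.
staff = conn.execute("""
    select e.name, d.dept_name
    from employee e join department d on d.dept_no = e.dept_no
    order by e.emp_no
""").fetchall()

print(rows_updated, staff)
```

In the unnormalized layout the same rename would have to be repeated on every employee row of that department, which is where inconsistent copies creep in.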
34
SAN DIEGO SUPERCOMPUTER CENTER Other Normal Forms Boyce-Codd Normal Form (BCNF): strengthens 3NF by requiring the determinant of every non-trivial functional dependency to be a superkey (a column or set of columns that uniquely identifies a row). Fourth Normal Form (4NF): eliminate non-trivial multivalued dependencies whose determinant is not a superkey. Fifth Normal Form (5NF): eliminate join dependencies that are not implied by the candidate keys.
35
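SAN DIEGO SUPERCOMPUTER CENTER Normalizing our webstore (1NF)

items
| item_id | name | price | inventory |
|---|---|---|---|
| 34 | sweater red | 50 | 21 |
| 35 | sweater blue | 50 | 10 |
| 56 | t-shirt | 25 | 76 |
| 72 | jeans | 75 | 5 |
| 81 | jacket | 175 | 9 |

customers
| cust_id | name | address | credit_card_num | credit_card_type |
|---|---|---|---|---|
| 45 | Mike Speedy | 123 A St. | 45154 | visa |
| 45 | Mike Speedy | 123 A St. | 32499 | mastercard |
| 45 | Mike Speedy | 123 A St. | 12834 | discover |
| 78 | Frank Newmon | 2 Main St. | 45698 | visa |
| 102 | Joe Powers | 343 Blue Blvd. | 94065 | mastercard |
| 102 | Joe Powers | 343 Blue Blvd. | 10532 | discover |

orders
| order_id | cust_id | item_id | quantity | cost | date | status |
|---|---|---|---|---|---|---|
| 405 | 45 | 34 | 2 | 100 | 2/3/06 | shipped |
| 405 | 45 | 35 | 1 | 50 | 2/3/06 | shipped |
| 405 | 45 | 56 | 3 | 75 | 2/3/06 | shipped |
| 408 | 78 | 56 | 2 | 50 | 3/5/06 | refunded |
| 410 | 102 | 72 | 2 | 150 | 3/10/06 | shipped |
| 410 | 102 | 81 | 1 | 175 | 3/10/06 | shipped |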
36
SAN DIEGO SUPERCOMPUTER CENTER Normalizing our webstore (2NF & 3NF)

customers
| cust_id | name | address |
|---|---|---|
| 45 | Mike Speedy | 123 A St. |
| 78 | Frank Newmon | 2 Main St. |
| 102 | Joe Powers | 343 Blue Blvd. |

credit_cards
| cust_id | num | type |
|---|---|---|
| 45 | 45154 | visa |
| 45 | 32499 | mastercard |
| 45 | 12834 | discover |
| 78 | 45698 | visa |
| 102 | 94065 | mastercard |
| 102 | 10532 | discover |
37
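SAN DIEGO SUPERCOMPUTER CENTER Normalizing our webstore (2NF & 3NF)

order details
| order_id | item_id | quantity | cost |
|---|---|---|---|
| 405 | 34 | 2 | 100 |
| 405 | 35 | 1 | 50 |
| 405 | 56 | 3 | 75 |
| 408 | 56 | 2 | 50 |
| 410 | 72 | 2 | 150 |
| 410 | 81 | 1 | 175 |

items
| item_id | name | price | inventory |
|---|---|---|---|
| 34 | sweater red | 50 | 21 |
| 35 | sweater blue | 50 | 10 |
| 56 | t-shirt | 25 | 76 |
| 72 | jeans | 75 | 5 |
| 81 | jacket | 175 | 9 |

orders
| order_id | cust_id | date | status |
|---|---|---|---|
| 405 | 45 | 2/3/06 | shipped |
| 408 | 78 | 3/5/06 | refunded |
| 410 | 102 | 3/10/06 | shipped |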
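The normalized webstore tables can be loaded and cross-checked with a short script. This is a sketch using Python's sqlite3 module (an assumption, not the dialect on the slides); the table name order_details uses an underscore for convenience, and the orders date column is omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table items (
    item_id integer primary key, name text, price integer, inventory integer
);
create table orders (
    order_id integer primary key, cust_id integer, status text
);
create table order_details (
    order_id integer references orders (order_id),
    item_id  integer references items (item_id),
    quantity integer,
    cost     integer,
    primary key (order_id, item_id)
);
""")

conn.executemany("insert into items values (?, ?, ?, ?)", [
    (34, "sweater red", 50, 21), (35, "sweater blue", 50, 10),
    (56, "t-shirt", 25, 76), (72, "jeans", 75, 5), (81, "jacket", 175, 9),
])
conn.executemany("insert into orders values (?, ?, ?)", [
    (405, 45, "shipped"), (408, 78, "refunded"), (410, 102, "shipped"),
])
conn.executemany("insert into order_details values (?, ?, ?, ?)", [
    (405, 34, 2, 100), (405, 35, 1, 50), (405, 56, 3, 75),
    (408, 56, 2, 50), (410, 72, 2, 150), (410, 81, 1, 175),
])

# Total cost of each order, computed from its detail rows.
order_totals = dict(conn.execute(
    "select order_id, sum(cost) from order_details group by order_id"))

# Sanity check: each stored cost should equal quantity times item price.
mismatches = conn.execute("""
    select count(*)
    from order_details d join items i on i.item_id = d.item_id
    where d.cost <> d.quantity * i.price
""").fetchone()[0]

print(order_totals, mismatches)
```

Since cost can be derived from quantity and price in this data, storing it is a design choice; one common reason to keep it is to record the price charged at the time of the order.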
38
SAN DIEGO SUPERCOMPUTER CENTER Revisit webstore ER diagram [Diagram: revised ER diagram with the entities Customers (name, address), Credit card (card number, card type), Orders (date, status), Order details (quantity, cost), and Items (name, price, inventory), linked by the relationships purchase, have, and consists]
39
SAN DIEGO SUPERCOMPUTER CENTER Structured Query Language SQL is the standard language for data definition and data manipulation for relational database systems. Nonprocedural Universal
40
SAN DIEGO SUPERCOMPUTER CENTER Data Definition Language The aspect of SQL that defines and manipulates objects in a database. create tables alter tables drop tables create views
41
SAN DIEGO SUPERCOMPUTER CENTER Create Table create table customer (cust_id number, name varchar(50) not null, address varchar(256) not null, primary key (cust_id)); create table credit_card (cust_id number not null, credit_card_type varchar(10) not null, credit_card_num number not null, foreign key (cust_id) references customer); The card type column is sized to hold values such as “mastercard”, which would not fit in char(5). [Diagram: Customer (name, address) 1 have N Credit card (card number, card type)]
42
SAN DIEGO SUPERCOMPUTER CENTER Modifying Tables alter table customer modify name varchar(256); alter table customer add credit_limit number; drop table customer;
43
SAN DIEGO SUPERCOMPUTER CENTER Data Manipulation Language The aspect of SQL used to manipulate the data in a database. queries updates inserts deletes
45
SAN DIEGO SUPERCOMPUTER CENTER Select command Used to query data from database tables. Format: select <columns> from <tables> where <condition>;
46
SAN DIEGO SUPERCOMPUTER CENTER Query example select name from customers; result: Mike Speedy, Frank Newmon, Joe Powers

customers
| cust_id | name | address |
|---|---|---|
| 45 | Mike Speedy | 123 A St. |
| 78 | Frank Newmon | 2 Main St. |
| 102 | Joe Powers | 343 Blue Blvd. |
47
SAN DIEGO SUPERCOMPUTER CENTER Query example select name from customers where address = '123 A St.'; result: Mike Speedy

customers
| cust_id | name | address |
|---|---|---|
| 45 | Mike Speedy | 123 A St. |
| 78 | Frank Newmon | 2 Main St. |
| 102 | Joe Powers | 343 Blue Blvd. |
48
SAN DIEGO SUPERCOMPUTER CENTER Query example select * from customers, credit_cards where customers.cust_id = credit_cards.cust_id and type = 'visa';

customers
| cust_id | name | address |
|---|---|---|
| 45 | Mike Speedy | 123 A St. |
| 78 | Frank Newmon | 2 Main St. |
| 102 | Joe Powers | 343 Blue Blvd. |

credit_cards
| cust_id | num | type |
|---|---|---|
| 45 | 45154 | visa |
| 45 | 32499 | mastercard |
| 45 | 12834 | discover |
| 78 | 45698 | visa |
| 102 | 94065 | mastercard |
| 102 | 10532 | discover |

returns:
| cust_id | name | address | cust_id | num | type |
|---|---|---|---|---|---|
| 45 | Mike Speedy | 123 A St. | 45 | 45154 | visa |
| 78 | Frank Newmon | 2 Main St. | 78 | 45698 | visa |
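The two-table query can be run against the sample data as a check. The sketch below assumes Python's sqlite3 module, writes the join with explicit join ... on syntax (equivalent to listing both tables in the from clause), and wraps it in a hypothetical helper, `customers_with_card`, that passes the card type as a bound parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table customers (cust_id integer primary key, name text, address text);
create table credit_cards (
    cust_id integer references customers (cust_id),
    num     integer,
    type    text
);
""")
conn.executemany("insert into customers values (?, ?, ?)", [
    (45, "Mike Speedy", "123 A St."),
    (78, "Frank Newmon", "2 Main St."),
    (102, "Joe Powers", "343 Blue Blvd."),
])
conn.executemany("insert into credit_cards values (?, ?, ?)", [
    (45, 45154, "visa"), (45, 32499, "mastercard"), (45, 12834, "discover"),
    (78, 45698, "visa"), (102, 94065, "mastercard"), (102, 10532, "discover"),
])


def customers_with_card(card_type):
    """Join customers to their cards and keep rows of one card type."""
    return conn.execute("""
        select customers.cust_id, name, address, num, type
        from customers
        join credit_cards on customers.cust_id = credit_cards.cust_id
        where type = ?
        order by customers.cust_id
    """, (card_type,)).fetchall()


visa_holders = customers_with_card("visa")
for row in visa_holders:
    print(row)
```

Binding the value with a ? placeholder instead of pasting it into the SQL string avoids quoting mistakes and SQL injection.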
49
SAN DIEGO SUPERCOMPUTER CENTER Changing Data There are 3 commands that change data in a table. Insert: insert into <table> (<columns>) values (<values>); insert into customer (cust_id, name) values (3, 'Fred Flintstone'); Update: update <table> set <column> = <value> where <condition>; update customer set name = 'Mark Speedy' where cust_id = 45; Delete: delete from <table> where <condition>; delete from customer where cust_id = 45;
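The three commands can be exercised in sequence to watch the table change. This sketch assumes Python's sqlite3 module and a simplified customer table whose address column is allowed to be null, so that the slide's two-column insert succeeds as written.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified table for illustration: address is nullable here.
conn.execute(
    "create table customer (cust_id integer primary key, name text, address text)")
conn.executemany("insert into customer values (?, ?, ?)", [
    (45, "Mike Speedy", "123 A St."),
    (78, "Frank Newmon", "2 Main St."),
    (102, "Joe Powers", "343 Blue Blvd."),
])

# Insert: add a new row, naming only the columns being supplied.
conn.execute("insert into customer (cust_id, name) values (?, ?)",
             (3, "Fred Flintstone"))

# Update: change one column in the rows matched by the where clause.
updated = conn.execute("update customer set name = ? where cust_id = ?",
                       ("Mark Speedy", 45)).rowcount
name_after_update = conn.execute(
    "select name from customer where cust_id = 45").fetchone()[0]

# Delete: remove the rows matched by the where clause.
deleted = conn.execute("delete from customer where cust_id = ?", (45,)).rowcount
conn.commit()

remaining = [row[0] for row in
             conn.execute("select cust_id from customer order by cust_id")]
print(updated, name_after_update, deleted, remaining)
```

Both update and delete act on every row that matches the where clause, so leaving the clause off changes or removes the whole table.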
50
SAN DIEGO SUPERCOMPUTER CENTER Star Schemas Designed for data retrieval. Best for use in decision support tasks such as data warehouses and data marts. Denormalized: allows for faster querying because fewer joins are needed. Slow performance for insert, delete, and update transactions. Composed of two types of tables: facts and dimensions.
51
SAN DIEGO SUPERCOMPUTER CENTER Fact Table The main table in a star schema is the Fact table. Contains groupings of measures of an event to be analyzed. Measure: numeric data. Invoice Facts: units sold, unit amount, total sale price.
52
SAN DIEGO SUPERCOMPUTER CENTER Dimension Table Dimension tables are groupings of descriptors and measures of the fact. Descriptor: non-numeric data. Customer Dimension: cust_dim_key, name, address, phone. Time Dimension: time_dim_key, invoice date, due date, delivered date. Location Dimension: loc_dim_key, store number, store address, store phone. Product Dimension: prod_dim_key, product, price, cost.
53
SAN DIEGO SUPERCOMPUTER CENTER Star Schema Each dimension table forms a one-to-many relationship with the fact table: one dimension row (1) relates to many fact rows (N). Invoice Facts: cust_dim_key, loc_dim_key, time_dim_key, prod_dim_key, units sold, unit amount, total sale price. Customer Dimension: cust_dim_key, name, address, phone. Time Dimension: time_dim_key, invoice date, due date, delivered date. Location Dimension: loc_dim_key, store number, store address, store phone. Product Dimension: prod_dim_key, product, price, cost.
54
SAN DIEGO SUPERCOMPUTER CENTER Analyzing the webstore The manager needs to analyze the orders obtained from the webstore. From this we will use the order table to create our fact table. Order Facts: date, items, customers.
55
SAN DIEGO SUPERCOMPUTER CENTER Webstore Dimension We have 2 dimensions for the schema: customers and items. Customer Dimension: cust_dim_key, name, address, credit_card_type. Item Dimension: item_dim_key, name, price, inventory.
56
SAN DIEGO SUPERCOMPUTER CENTER Webstore Star Schema Order Facts: date, items, customers. Item Dimension: item_dim_key, name, price, inventory. Customer Dimension: cust_dim_key, name, address, credit_card_type. [Diagram: each dimension (1) joins Order Facts (N)]
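A typical decision-support query against this star schema joins the fact table to a dimension and aggregates a measure. The sketch below assumes Python's sqlite3 module; the table names customer_dim, item_dim, and order_facts and the quantity and cost measures are illustrative choices, the dimension keys simply reuse the webstore ids, and the date and credit_card_type columns are omitted for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table customer_dim (cust_dim_key integer primary key, name text, address text);
create table item_dim (item_dim_key integer primary key, name text, price integer, inventory integer);
create table order_facts (
    cust_dim_key integer references customer_dim (cust_dim_key),
    item_dim_key integer references item_dim (item_dim_key),
    quantity     integer,
    cost         integer
);
""")
conn.executemany("insert into customer_dim values (?, ?, ?)", [
    (45, "Mike Speedy", "123 A St."),
    (78, "Frank Newmon", "2 Main St."),
    (102, "Joe Powers", "343 Blue Blvd."),
])
conn.executemany("insert into item_dim values (?, ?, ?, ?)", [
    (34, "sweater red", 50, 21), (35, "sweater blue", 50, 10),
    (56, "t-shirt", 25, 76), (72, "jeans", 75, 5), (81, "jacket", 175, 9),
])
# One fact row per item ordered: (customer key, item key, quantity, cost).
conn.executemany("insert into order_facts values (?, ?, ?, ?)", [
    (45, 34, 2, 100), (45, 35, 1, 50), (45, 56, 3, 75),
    (78, 56, 2, 50), (102, 72, 2, 150), (102, 81, 1, 175),
])

# Total sales per customer: one join from the fact table to a dimension.
sales_by_customer = conn.execute("""
    select c.name, sum(f.cost)
    from order_facts f
    join customer_dim c on c.cust_dim_key = f.cust_dim_key
    group by c.name
    order by c.name
""").fetchall()

# Units sold per item: the same fact table, sliced by the other dimension.
units_by_item = dict(conn.execute("""
    select i.name, sum(f.quantity)
    from order_facts f
    join item_dim i on i.item_dim_key = f.item_dim_key
    group by i.name
"""))

print(sales_by_customer, units_by_item)
```

Each analysis needs only a single join from the fact table to the dimension of interest, which is the querying advantage the star layout is designed for.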
57
SAN DIEGO SUPERCOMPUTER CENTER Books and Reference Database Design for Mere Mortals, Michael J. Hernandez; Information Modeling and Relational Databases, Terry Halpin; Database Modeling and Design, Toby J. Teorey
58
SAN DIEGO SUPERCOMPUTER CENTER Continuing Education UCSD Extension Data Management Courses DBA Certificate Program Database Application Developer Certificate Program
59
SAN DIEGO SUPERCOMPUTER CENTER Data Central The Data Services Group provides Data Allocations for the research community. http://datacentral.sdsc.edu/ Tools and expertise for making data collections available to the broader scientific community. Provide disk, tape, and database storage resources.