2002.10.17 - IS 202 – FALL 2002, Prof. Ray Larson & Prof. Marc Davis, UC Berkeley SIMS, Tuesday and Thursday 10:30 am - 12:00 pm, Fall 2002

Presentation transcript:

SLIDE 1: IS 202 – FALL 2002
Prof. Ray Larson & Prof. Marc Davis, UC Berkeley SIMS
Tuesday and Thursday 10:30 am - 12:00 pm, Fall 2002
SIMS 202: Information Organization and Retrieval
Lecture 14: Database Design

SLIDE 2: Lecture Overview
Review
–Databases and Database Design
–Database Life Cycle
–ER Diagrams
Database Design
Normalization
Web-Enabled Databases

SLIDE 3: Lecture Overview (same outline as Slide 2)

SLIDE 4: Models (1)
[Diagram. Labels: Application 1, Application 2, Application 3, Application 4; Conceptual requirements (one per application); Conceptual Model; Logical Model; Internal Model; External Model (one per application)]

SLIDE 5: Database System Life Cycle
1 Design
2 Physical Creation
3 Conversion
4 Integration
5 Operations
6 Growth, Change, & Maintenance

SLIDE 6: Another View of the Life Cycle
1 Design
2 Physical Creation
3 Conversion
4 Integration
5 Operations
6 Growth, Change

SLIDE 7: Database Design Process
[Diagram. Labels: Application 1, Application 2, Application 3, Application 4; Conceptual requirements (one per application); Conceptual Model; Logical Model; Internal Model; External Model (one per application)]

SLIDE 8: Entity
An Entity is an object in the real world (or even imaginary worlds) about which we want or need to maintain information
–Persons (e.g.: customers in a business, employees, authors)
–Things (e.g.: purchase orders, meetings, parts, companies)
[Diagram: entity box labeled Employee]

SLIDE 9: Attributes
Attributes are the significant properties or characteristics of an entity that help identify it and provide the information needed to interact with it or use it (this is the metadata for the entities)
[Diagram. Labels: Employee; Last, Middle, First, Name, SSN, Age, Birthdate, Projects]

SLIDE 10: Relationships
Relationships are the associations between entities
They can involve one or more entities and belong to particular relationship types
–One to One
–One to Many
–Many to Many

SLIDE 11: Relationships
[Diagram. Labels: Class, Attends, Student; Part, Supplies project parts, Supplier, Project]

SLIDE 12: Types of Relationships
Concerned only with cardinality of relationship
[Diagram, Chen ER notation. Labels: Truck, Assigned, Employee; Project, Assigned, Employee; Project, Assigned, Employee; cardinality markers 1, n, and m]

SLIDE 13: More Complex Relationships
[Diagram. Labels: Project, Evaluation, Employee, Manager (1/n/n, 1/1/1, n/n/1); Project, Assigned, Employee (4(2-10), 1; SSN, Project, Date); Manages, Employee, Manages, Is Managed By (1, n)]

SLIDE 14: Weak Entities
Owe existence entirely to another entity
[Diagram. Labels: Order-line, Contains, Order; Invoice #, Part#, Rep#, Quantity, Invoice#]

SLIDE 15: Supertype and Subtype Entities
[Diagram. Labels: Clerk, Is one of, Sales-rep, Invoice, Other, Employee, Sold, Manages]

SLIDE 16: Many to Many Relationships
[Diagram. Labels: Employee, Project, Is Assigned, Project Assignment, Assigned; attributes SSN, Proj#; SSN, Proj#, Hours]
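
The diagram labels above suggest resolving the many-to-many Employee/Project relationship through a Project Assignment entity keyed by SSN and Proj# and carrying Hours. A minimal sketch of that idea using Python's built-in sqlite3 module follows; the table names, column names, and sample rows are assumptions made for illustration, not taken from the lecture.

```python
import sqlite3

# Hypothetical schema: names loosely follow the slide labels (SSN, Proj#, Hours).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT);
CREATE TABLE project (proj_no INTEGER PRIMARY KEY, title TEXT);
-- Associative table: one row per (employee, project) pairing.
CREATE TABLE project_assignment (
    ssn TEXT REFERENCES employee(ssn),
    proj_no INTEGER REFERENCES project(proj_no),
    hours REAL,
    PRIMARY KEY (ssn, proj_no)
);
INSERT INTO employee VALUES ('111', 'Ann');
INSERT INTO employee VALUES ('222', 'Bob');
INSERT INTO project VALUES (1, 'Catalog');
INSERT INTO project VALUES (2, 'Search');
INSERT INTO project_assignment VALUES ('111', 1, 10);
INSERT INTO project_assignment VALUES ('111', 2, 5);
INSERT INTO project_assignment VALUES ('222', 1, 8);
""")


def projects_for(ssn):
    """Titles of all projects one employee is assigned to."""
    rows = conn.execute(
        "SELECT p.title FROM project_assignment a "
        "JOIN project p ON p.proj_no = a.proj_no "
        "WHERE a.ssn = ? ORDER BY p.title",
        (ssn,),
    )
    return [title for (title,) in rows]


def staff_on(proj_no):
    """Names of all employees assigned to one project."""
    rows = conn.execute(
        "SELECT e.name FROM project_assignment a "
        "JOIN employee e ON e.ssn = a.ssn "
        "WHERE a.proj_no = ? ORDER BY e.name",
        (proj_no,),
    )
    return [name for (name,) in rows]
```

Each side of the original many-to-many relationship becomes a one-to-many relationship to the assignment table, so both directions are answered with the same kind of join.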

SLIDE 17: Lecture Overview (same outline as Slide 2)

SLIDE 18: Database Design Process
[Diagram repeated from Slide 7]

SLIDE 19: Database Design Process
[Diagram repeated from Slide 7]

SLIDE 20: Requirements Analysis
Conceptual Requirements
–Systems Analysis Process
Examine all of the information sources used in existing applications
Identify the characteristics of each data element
–Numeric
–Text
–Date/time
–Etc.
Examine the tasks carried out using the information
Examine results or reports created using the information

SLIDE 21: Database Design Process
[Diagram repeated from Slide 7]

SLIDE 22: Conceptual Design
Conceptual Model
–Merge the collective needs of all applications
–Determine what Entities are being used
Some object about which information is to be maintained
–What are the Attributes of those entities?
Properties or characteristics of the entity
What attributes uniquely identify the entity?
–What are the Relationships between entities?
How do the entities interact with each other?

SLIDE 23: Developing a Conceptual Model
Overall view of the database that integrates all the needed information discovered during the requirements analysis
Elements of the Conceptual Model are represented by diagrams, Entity-Relationship or ER Diagrams, that show the meanings and relationships of those elements independent of any particular database system or implementation details
Can also be represented using other modeling tools (such as UML)

SLIDE 24: Database Design Process
[Diagram repeated from Slide 7]

SLIDE 25: Logical Design
Logical Model
–How is each entity and relationship represented in the Data Model of the DBMS?
Hierarchic?
Network?
Relational?
Object-Oriented?

SLIDE 26: Database Design Process
[Diagram repeated from Slide 7]

SLIDE 27: Physical Design
Internal Model
–Choices of index file structure
–Choices of data storage formats
–Choices of disk layout

SLIDE 28: Database Design Process
[Diagram repeated from Slide 7]

SLIDE 29: Database Application Design
External Model
–User views of the integrated database
–Making the old (or updated) applications work with the new database design

SLIDE 30: Lecture Overview (same outline as Slide 2)

SLIDE 31: Normalization
Normalization theory is based on the observation that relations with certain properties support inserting, updating and deleting data more effectively than other sets of relations containing the same data
Normalization is a multi-step process beginning with an “unnormalized” relation
–Hospital example from Atre, S. Data Base: Structured Techniques for Design, Performance, and Management

SLIDE 32: Normal Forms
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)

SLIDE 33: Normalization
[Diagram. Labels:]
Functional dependency of nonkey attributes on the primary key - Atomic values only
Full functional dependency of nonkey attributes on the primary key
No transitive dependency between nonkey attributes
Boyce-Codd and Higher: All determinants are candidate keys - Single multivalued dependency

SLIDE 34: Unnormalized Relations
First step in normalization is to convert the data into a two-dimensional table
In unnormalized relations data can repeat within a column

SLIDE 35: Unnormalized Relations

SLIDE 36: First Normal Form
To move to First Normal Form a relation must contain only atomic values at each row and column
–No repeating groups
–A column or set of columns is called a Candidate Key when its values can uniquely identify the row in the relation
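
As a rough illustration of removing repeating groups, the sketch below flattens records that hold a list of surgeries into one atomic row per surgery, then checks which column sets identify rows uniquely. The field names and sample values are invented for this example and are not the rows of the hospital example.

```python
# Invented sample data: each record holds a repeating group of surgeries.
unnormalized = [
    {"patient_no": 1, "name": "Patient A",
     "surgeries": [("S10", "2002-01-05"), ("S11", "2002-03-09")]},
    {"patient_no": 2, "name": "Patient B",
     "surgeries": [("S10", "2002-02-14")]},
]


def to_first_normal_form(records):
    """Emit one row of atomic values per element of the repeating group."""
    rows = []
    for rec in records:
        for surgeon_no, surgery_date in rec["surgeries"]:
            rows.append((rec["patient_no"], rec["name"], surgeon_no, surgery_date))
    return rows


def uniquely_identifies(rows, column_indexes):
    """True if no two rows share the same values in the given columns.

    This only tests uniqueness in the sample rows; a real candidate key is a
    property of the schema and must also be minimal.
    """
    seen = set()
    for row in rows:
        key = tuple(row[i] for i in column_indexes)
        if key in seen:
            return False
        seen.add(key)
    return True


flat = to_first_normal_form(unnormalized)
```

After flattening, the patient number alone no longer identifies a row, which is why the 1NF relation needs a composite key.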

SLIDE 37: First Normal Form

SLIDE 38: 1NF Storage Anomalies
Insertion: A new patient has not yet undergone surgery -- hence no surgeon # -- Since surgeon # is part of the key we can’t insert
Insertion: If a surgeon is newly hired and hasn’t operated yet -- there will be no way to include that person in the database
Update: If a patient comes in for a new procedure, and has moved, we need to change multiple address entries
Deletion (type 1): Deleting a patient record may also delete all info about a surgeon
Deletion (type 2): When there are functional dependencies (like side effects and drug) changing one item eliminates other information

SLIDE 39: Second Normal Form
A relation is said to be in Second Normal Form when every nonkey attribute is fully functionally dependent on the primary key
–That is, every nonkey attribute needs the full primary key for unique identification
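
One way to see what "fully functionally dependent" means is to test, on sample rows, whether part of the key already determines a nonkey attribute. The sketch below is a minimal check under assumed column names and invented rows; sample data can only refute a dependency, it cannot prove one holds for the schema.

```python
def determines(rows, lhs, rhs):
    """True if, in these rows, equal values on lhs always imply equal values on rhs."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        value = tuple(row[c] for c in rhs)
        # setdefault stores the first value seen for this key and returns it.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical 1NF relation with a composite key (invented values).
KEY = ["patient_no", "surgeon_no", "surgery_date"]
rows = [
    {"patient_no": 1, "surgeon_no": "S10", "surgery_date": "2002-01-05",
     "patient_name": "Patient A", "surgery": "Appendectomy"},
    {"patient_no": 1, "surgeon_no": "S11", "surgery_date": "2002-03-09",
     "patient_name": "Patient A", "surgery": "Tonsillectomy"},
    {"patient_no": 2, "surgeon_no": "S10", "surgery_date": "2002-02-14",
     "patient_name": "Patient B", "surgery": "Tonsillectomy"},
]

# patient_name depends on only part of the key, so this relation is not in 2NF.
name_needs_only_patient_no = determines(rows, ["patient_no"], ["patient_name"])
# surgery is not determined by patient_no alone; it needs the full key.
surgery_needs_only_patient_no = determines(rows, ["patient_no"], ["surgery"])
surgery_needs_full_key = determines(rows, KEY, ["surgery"])
```

Attributes like patient_name that depend on part of the key are the ones moved to their own table when decomposing to 2NF.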

SLIDE 40: Second Normal Form

SLIDE 41: Second Normal Form

SLIDE 42: Second Normal Form

SLIDE 43: 1NF Storage Anomalies Removed
Insertion: Can now enter new patients without surgery
Insertion: Can now enter Surgeons who haven’t operated
Deletion (type 1): If Charles Brown dies the corresponding tuples from Patient and Surgery tables can be deleted without losing information on David Rosen
Update: If John White comes in for third time, and has moved, we only need to change the Patient table

SLIDE 44: 2NF Storage Anomalies
Insertion: Cannot enter the fact that a particular drug has a particular side effect unless it is given to a patient
Deletion: If John White receives some other drug because of the penicillin rash, and a new drug and side effect are entered, we lose the information that penicillin can cause a rash
Update: If drug side effects change (a new formula) we have to update multiple occurrences of side effects

SLIDE 45: Third Normal Form
A relation is said to be in Third Normal Form if there is no transitive functional dependency between nonkey attributes
–When one nonkey attribute can be determined by one or more other nonkey attributes there is said to be a transitive functional dependency
The side effect column in the Surgery table is determined by the drug administered
–Side effect is transitively functionally dependent on drug so Surgery is not 3NF
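
The drug and side effect dependency can be removed by splitting Surgery into a Surgery relation and a Drug relation. The following is a minimal sketch of that decomposition; the column layout and row values are assumptions for illustration (only the penicillin and rash pairing is mentioned in the slides).

```python
# Assumed layout: (patient_no, surgeon_no, surgery_date, drug, side_effect).
surgery = [
    (1, "S10", "2002-01-05", "Penicillin", "Rash"),
    (2, "S10", "2002-02-14", "Penicillin", "Rash"),
    (3, "S11", "2002-03-09", "Tetracycline", "Fever"),
]


def decompose_third_normal_form(rows):
    """Split off the drug -> side_effect dependency into its own relation."""
    surgery_rows = [(p, s, d, drug) for (p, s, d, drug, _effect) in rows]
    # A set keeps each (drug, side_effect) pair exactly once.
    drug_rows = sorted({(drug, effect) for (*_key, drug, effect) in rows})
    return surgery_rows, drug_rows


def rejoin(surgery_rows, drug_rows):
    """Recreate the original rows by looking up each drug's side effect."""
    effect_of = dict(drug_rows)
    return [(p, s, d, drug, effect_of[drug]) for (p, s, d, drug) in surgery_rows]


surgery_3nf, drug_3nf = decompose_third_normal_form(surgery)
```

Because each drug's side effect is now stored once, it can be recorded before any patient receives the drug and survives the deletion of any surgery row.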

SLIDE 46: Third Normal Form

SLIDE 47: Third Normal Form

SLIDE 48: 2NF Storage Anomalies Removed
Insertion: We can now enter the fact that a particular drug has a particular side effect in the Drug relation
Deletion: If John White receives some other drug as a result of the rash from penicillin, the information on penicillin and rash is maintained
Update: The side effects for each drug appear only once

SLIDE 49: Boyce-Codd Normal Form
Most 3NF relations are also BCNF relations
A 3NF relation can fail to be in BCNF only if all of the following hold:
–Candidate keys in the relation are composite keys (they are not single attributes)
–There is more than one candidate key in the relation, and
–The keys are not disjoint, that is, some attributes in the keys are common
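
The BCNF condition that every determinant is a key can be checked mechanically once the functional dependencies are written down. Below is a minimal sketch using the standard attribute-closure computation; the student/course/instructor relation is a common textbook illustration assumed here, not the relation shown in the lecture.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the given functional dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(all_attrs, fds):
    """Non-trivial dependencies whose left side does not determine every attribute."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and closure(lhs, fds) != set(all_attrs)
    ]


# Assumed example: each instructor teaches one course, and a student has one
# instructor per course. Candidate keys: (student, course), (student, instructor).
attrs = ("student", "course", "instructor")
fds = [
    (("student", "course"), ("instructor",)),
    (("instructor",), ("course",)),
]
violations = bcnf_violations(attrs, fds)
```

Here the two candidate keys are composite and overlap on student, matching the three conditions on the slide, and instructor is a determinant that is not a key.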

SLIDE 50: Most 3NF Relations Are Also BCNF – Is This One?

SLIDE 51: BCNF Relations

SLIDE 52: Fourth Normal Form
A relation is in Fourth Normal Form if it is in BCNF and all of its multivalued dependencies are trivial
Eliminate non-trivial multivalued dependencies by projecting into simpler tables
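
Projecting into simpler tables can be sketched as follows. The employee/skill/language relation is an invented example in which skills and languages are independent facts about an employee, so storing them together forces every combination to appear.

```python
from itertools import product

# Invented data: Ann's skills and languages are independent of each other.
employee_skill_language = {
    ("Ann", skill, language)
    for skill, language in product(["SQL", "Perl"], ["English", "French", "Spanish"])
} | {("Bob", "Java", "English")}


def project(rows, columns):
    """Project a set of tuples onto the given column positions, dropping duplicates."""
    return {tuple(row[c] for c in columns) for row in rows}


def natural_join_on_employee(left, right):
    """Join (employee, x) pairs with (employee, y) pairs on the employee column."""
    return {(e1, x, y) for (e1, x) in left for (e2, y) in right if e1 == e2}


employee_skill = project(employee_skill_language, (0, 1))
employee_language = project(employee_skill_language, (0, 2))
rejoined = natural_join_on_employee(employee_skill, employee_language)
```

The two projections hold each fact once, and joining them back yields exactly the original rows, so nothing is lost by the decomposition.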

SLIDE 53: Fifth Normal Form
A relation is in 5NF if every join dependency in the relation is implied by the keys of the relation
This implies that relations decomposed under the previous normal forms can be recombined via natural joins to recreate the original relation

SLIDE 54: Normalizing to Death
Normalization splits database information across multiple tables
To retrieve complete information from a normalized database, the JOIN operation must be used
JOIN tends to be expensive in terms of processing time, and very large joins are very expensive
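
To make the retrieval cost concrete, the sketch below shows how one complete record has to be reassembled from three normalized tables with JOINs. It uses Python's built-in sqlite3 module; the schema, the surgeon number, and the date are assumptions for illustration, and no timing is measured.

```python
import sqlite3

# Hypothetical normalized schema with one sample row per table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient (patient_no INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE drug (drug TEXT PRIMARY KEY, side_effect TEXT);
CREATE TABLE surgery (
    patient_no INTEGER, surgeon_no TEXT, surgery_date TEXT, drug TEXT
);
INSERT INTO patient VALUES (1, 'John White');
INSERT INTO drug VALUES ('Penicillin', 'Rash');
INSERT INTO surgery VALUES (1, 'S10', '2002-01-05', 'Penicillin');
""")

# Two joins are needed to see the patient name and side effect next to the surgery.
full_record = conn.execute("""
    SELECT p.name, s.surgeon_no, s.surgery_date, s.drug, d.side_effect
    FROM surgery s
    JOIN patient p ON p.patient_no = s.patient_no
    JOIN drug d ON d.drug = s.drug
""").fetchall()
```

Every additional decomposition adds another join to queries like this one, which is the trade-off the slide warns about.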

SLIDE 55: Lecture Overview (same outline as Slide 2)

SLIDE 56: Overview
Why use a database system for Web design and e-commerce?
What systems are available?
Pros and Cons of different Web database systems?
Text retrieval in database systems
Search engines for Intranet and Intrasite searching

SLIDE 57: Why Use a Database System?
Simple Web sites with only a few pages don’t need much more than static HTML files

SLIDE 58: Simple Web Applications
[Diagram. Labels: Server, Web Server, Internet, Files, Clients]

SLIDE 59: Adding Dynamic Content to the Site
Small sites can often use simple HTML and CGI scripts accessing data files to create dynamic content

SLIDE 60: Dynamic Web Applications 1
[Diagram. Labels: Server, CGI, Web Server, Internet, Files, Clients]

SLIDE 61: Issues For Scaling Up Web Applications
Performance
Scalability
Maintenance
Data integrity
Transaction support

SLIDE 62: Why Use a Database System?
Database systems have concentrated on providing solutions for all of these issues for scaling up Web applications
–Performance
–Scalability
–Maintenance
–Data integrity
–Transaction support
While systems differ in their support, most offer some support for all of these

SLIDE 63: Dynamic Web Applications 2
[Diagram. Labels: Server, database, CGI, DBMS, Web Server, Internet, Files, Clients, database]

SLIDE 64: Server Interfaces
Adapted from John P. Ashenfelter, Choosing a Database for Your Web Site
[Diagram. Labels: Database, Web Server, Web Application Server, Web DB App; HTML, JavaScript, DHTML; CGI, Web Server APIs, ColdFusion, PHP, Perl, Java, ASP; SQL; ODBC, Native DB interfaces, JDBC]

SLIDE 65: Web Application Server Software
ColdFusion
PHP
ASP
All of these are server-side scripting languages that embed code in HTML pages

SLIDE 66: ColdFusion
Developing WWW sites typically involved a lot of programming to build dynamic sites
–E.g., pages generated as a result of catalog searches, etc.
ColdFusion was designed to permit the construction of dynamic Web sites with only minor extensions to HTML through a DBMS interface

SLIDE 67: What ColdFusion Is Good For
Putting up databases onto the Web
Handling dynamic databases (frequent updates, etc.)
Making databases searchable and updateable by users

SLIDE 68: CFML
ColdFusion Markup Language
Read data from and update data to databases and tables
Create dynamic data-driven pages
Perform conditional processing
Populate forms with live data
Process form submissions
Generate and retrieve messages
Perform HTTP and FTP functions
Perform credit card verification and authorization
Read and write client-side cookies

SLIDE 69: Templates
Assume we have a database named contents_of_my_shopping_cart.mdb -- single table called contents...
Create an HTML page (uses extension .cfm), before...
SELECT * FROM contents;

SLIDE 70: Templates (cont.)
Contents of My Shopping Cart
Contents of My Shopping Cart
#Item# #Date_of_item# $#Price#

SLIDE 71: Templates (cont.)
Contents of My Shopping Cart
Bouncy Ball with Psychedelic Markings 12 December 1998 $0.25
Shiny Blue Widget 14 December 1998 $2.53
Large Orange Widget 14 December 1998 $3.75
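
The template slides show placeholders such as #Item# being replaced by column values, once per row returned by the query. The sketch below imitates that substitution in Python; it is not CFML and not how ColdFusion is implemented, and the render function and row dictionaries are hypothetical helpers built from the output shown on the slide.

```python
import re

# Placeholder line as it appears in the template transcript.
ROW_TEMPLATE = "#Item# #Date_of_item# $#Price#"

# Rows as a query on the contents table might return them (values from the slide).
rows = [
    {"Item": "Bouncy Ball with Psychedelic Markings",
     "Date_of_item": "12 December 1998", "Price": "0.25"},
    {"Item": "Shiny Blue Widget",
     "Date_of_item": "14 December 1998", "Price": "2.53"},
    {"Item": "Large Orange Widget",
     "Date_of_item": "14 December 1998", "Price": "3.75"},
]


def render(template, row):
    """Replace each #Column# placeholder with that column's value from the row."""
    return re.sub(r"#(\w+)#", lambda match: str(row[match.group(1)]), template)


page_lines = [render(ROW_TEMPLATE, row) for row in rows]
```

The template is written once and repeated for every row, which is what lets the page track the database contents without being edited by hand.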

SLIDE 72: CFIF and CFELSE
Item: #Item#

SLIDE 73: Photo Browser
The current photo browser uses a combination of
–JavaScript for expandable hierarchies
–Database in MS Access
–ColdFusion to search the database when one of the facets is selected
The database design for the photo database currently looks like…

SLIDE 74: Photo Browser ER

SLIDE 75: Photo Database
Let’s look at the photo database in the Access interface
–Multi-Facet queries
–Queries for multiple descriptors in the same facet (harder)
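
The actual photo database design is not preserved in this transcript, so the sketch below uses a purely hypothetical photo_descriptor table to show one way both kinds of query could be written. It counts how many of the requested (facet, descriptor) pairs each photo matches and keeps the photos that match all of them, which also covers several descriptors within the same facet.

```python
import sqlite3

# Hypothetical schema and data; not the lecture's photo database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE photo_descriptor (photo_id INTEGER, facet TEXT, descriptor TEXT);
INSERT INTO photo_descriptor VALUES (1, 'place', 'Berkeley');
INSERT INTO photo_descriptor VALUES (1, 'subject', 'building');
INSERT INTO photo_descriptor VALUES (1, 'subject', 'tree');
INSERT INTO photo_descriptor VALUES (2, 'place', 'Berkeley');
INSERT INTO photo_descriptor VALUES (2, 'subject', 'tree');
INSERT INTO photo_descriptor VALUES (3, 'place', 'Oakland');
INSERT INTO photo_descriptor VALUES (3, 'subject', 'building');
""")


def photos_with_all(pairs):
    """Ids of photos tagged with every distinct (facet, descriptor) pair given."""
    conditions = " OR ".join("(facet = ? AND descriptor = ?)" for _ in pairs)
    params = [value for pair in pairs for value in pair]
    sql = (
        f"SELECT photo_id FROM photo_descriptor WHERE {conditions} "
        "GROUP BY photo_id "
        "HAVING COUNT(DISTINCT facet || ':' || descriptor) = ? "
        "ORDER BY photo_id"
    )
    return [photo_id for (photo_id,) in conn.execute(sql, params + [len(pairs)])]


multi_facet = photos_with_all([("place", "Berkeley"), ("subject", "tree")])
same_facet = photos_with_all([("subject", "building"), ("subject", "tree")])
```

A plain WHERE clause cannot express "building AND tree" within one facet because no single row carries both values, which is one reason the same-facet case is harder.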

SLIDE 76: Assignment 7 (Database Design)
Involves
–Examining a Web Site (probably) using a DBMS for E-commerce to sell books
–Inferring the structure and kinds of entities and attributes used in that site (book info only)
–Creating your own design using ER diagrams showing the entities and relationships that you inferred

SLIDE 77: Next Week
Introduction to Information Retrieval