Wei-Pang Yang, Information Management, NDHU Normalization Unit 7 Normalization ( 表格正規化 ) 7-1.

Slides:



Advertisements
Similar presentations
Introduction to Database System Wei-Pang Yang, IM.NDHU, Final Test-1 Example: Banking Database 1. branch 2. customer 客戶 ( 存款戶, 貸款戶 ) 5. account.
Advertisements

Wei-Pang Yang, Information Management, NDHU More on Normalization Unit 18 More on Normalization ( 表格正規化探討 ) 18-1.
Software Engineering for Digital Home 單元 2 :軟體處理程序與需求分析 2-3 需求工程處理程序 Presenter: Away.
第七章 抽樣與抽樣分配 蒐集統計資料最常見的方式是抽查。這 牽涉到兩個問題: 抽出的樣本是否具有代表性?是否能反應出母體的特徵?
Teacher : Ing-Jer Huang TA : Chien-Hung Chen 2015/6/3 Course Embedded Systems : Principles and Implementations Weekly Preview Question CH3.5 ~ CH /10/31.
第二章 太陽能電池的基本原理 及其結構 2-1 太陽能電池的基本原理 2-2 太陽能電池的基本結構 2-3 太陽能電池的製作.
“Rule” By OX. By Check CREATE TABLE 員工薪資 ( 編號 int IDENTITY PRIMARY KEY, 薪資 smallmoney, CHECK ( 薪資 > 0 AND 薪資
Reference, primitive, call by XXX 必也正名乎 誌謝 : 部份文字取於前輩 TAHO 的文章.
BY OX. 檢視表與資料表的差異性 查詢 (query) 檢視表 (View) 的紀錄,是經由查詢 (query) 而來,而檢 視表的資料來源可以是單一資料表或是多資料表,甚 至其他檢視表 但檢視表中的紀錄只存在資料表中.
中央大學。范錚強 1 其他 ER 相關觀念 以及 OO 模型 國立中央大學 資訊管理系 范錚強 2007.
: Factstone Benchmark ★★☆☆☆ 題組: Problem Set Archive with Online Judge 題號: : Factstone Benchmark 解題者:鐘緯駿 解題日期: 2006 年 06 月 06 日 題意: 假設 1960.
1 實驗二 : SIP User Mobility 實驗目的 藉由 Registra 和 Redirect Server 的設計,深入瞭解 SIP 的運 作及訊息格式。 實作部分 ( 1 )實作一個 Registrar 來接收 SIP REGISTER ,而且 要將 REGISTER 中 Contact.
: OPENING DOORS ? 題組: Problem Set Archive with Online Judge 題號: 10606: OPENING DOORS 解題者:侯沛彣 解題日期: 2006 年 6 月 11 日 題意: - 某間學校有 N 個學生,每個學生都有自己的衣物櫃.
消費者物價指數反映生活成本。當消費者物價指數上升時,一般家庭需要花費更多的金錢才能維持相同的生活水準。經濟學家用物價膨脹(inflation)來描述一般物價持續上升的現象,而物價膨脹率(inflation rate)為物價水準的變動百分比。
Last modified 2004/02 An Introduction to SQL (Structured Query Language )
1.1 電腦的特性 電腦能夠快速處理資料:電腦可在一秒內處理數百萬個 基本運算,這是人腦所不能做到的。原本人腦一天的工 作量,交給電腦可能僅需幾分鐘的時間就處理完畢。 電腦能夠快速處理資料:電腦可在一秒內處理數百萬個 基本運算,這是人腦所不能做到的。原本人腦一天的工 作量,交給電腦可能僅需幾分鐘的時間就處理完畢。
STAT0_sampling Random Sampling  母體: Finite population & Infinity population  由一大小為 N 的有限母體中抽出一樣本數為 n 的樣 本,若每一樣本被抽出的機率是一樣的,這樣本稱 為隨機樣本 (random sample)
Chapter I: 資料探勘 - 初探. 目標 定義 DM 並瞭解如何用來解決問題 瞭解電腦是最佳學習概念的工具 瞭解 DM 可被用來當成可能問題解決策略 之時機 瞭解專家系統與資料探勘使用不同的方 法來完成相似的目標 瞭解如何從事前定義的類別資料所形成 概念定義來建立監督式學習的模型.
中央大學。范錚強 1 從 ER 到 Logical Schema ── 兼談 Schema Integration 國立中央大學 資訊管理系 范錚強 2005.
1 蛋白質簡介 暨南大學資訊工程系 2003/05/06. 2 蛋白質是由 20 種胺基酸所組成.
奶酪專賣店系統 組員: B 林家榕 B 莊舜婷.
具備人臉追蹤與辨識功能的一個 智慧型數位監視系統 系統架構 在巡邏模式中 ,攝影機會左右來回巡視,並 利用動態膚色偵測得知是否有移動膚色物體, 若有移動的膚色物體則進入到追蹤模式,反之 則繼續巡視。
© The McGraw-Hill Companies, Inc., 2008 第 6 章 製造流程的選擇與設計.
圖片索引專題 指導教授:陳淑媛 教授 黃伯偉 林育瑄. 動機 & 理念  目前圖像檢索系統中使用的大多都為利用文字 標籤圖像或是圖像輪廓特徵來進行搜尋,然而 輪廓特徵的缺點卻是所有組成圖像的線條都要 逐一處理相當耗時。  所以本研究的目標在於,提出一個以像素點為 特徵的有效率與正確率的圖像檢索演算法實作。
第 1 章 PC 的基本構造. 本章提要 PC 系統簡介 80x86 系列 CPU 及其暫存器群 記憶體: Memory 80x86 的分節式記憶體管理 80x86 的 I/O 結構 學習組合語言的基本工具.
Introduction to Java Programming Lecture 17 Abstract Classes & Interfaces.
第 1 章 認識資料庫系統 著作權所有 © 旗標出版股份有限公司.
: The largest Clique ★★★★☆ 題組: Contest Archive with Online Judge 題號: 11324: The largest Clique 解題者:李重儀 解題日期: 2008 年 11 月 24 日 題意: 簡單來說,給你一個 directed.
最新計算機概論 第 5 章 系統程式. 5-1 系統程式的類型 作業系統 (OS) : 介於電腦硬體與 應用軟體之間的 程式,除了提供 執行應用軟體的 環境,還負責分 配系統資源。
圖層的操作與管理 圖層的作用就如同一張張透明的賽璐璐片, 你可以將動畫中的每項物件, 放置在不同圖 層中, 圖層交疊就形成完整的畫面。在各圖 層中的物件, 做任何的移動或變化, 都不會 相互干擾, 所以當你編輯一個物件時, 只要 在物件所在的圖層進行操作, 將可大幅降低 製作過程的複雜度與難度。
: Fast and Easy Data Compressor ★★☆☆☆ 題組: Problem Set Archive with Online Judge 題號: 10043: Fast and Easy Data Compressor 解題者:葉貫中 解題日期: 2007 年 3.
McGraw-Hill/Irwin © 2003 The McGraw-Hill Companies, Inc.,All Rights Reserved. 參 資料蒐集的方法.
:Nuts for nuts..Nuts for nuts.. ★★★★☆ 題組: Problem Set Archive with Online Judge 題號: 10944:Nuts for nuts.. 解題者:楊家豪 解題日期: 2006 年 2 月 題意: 給定兩個正整數 x,y.
6-2 認識元件庫與內建元件庫 Flash 的元件庫分兩種, 一種是每個動畫專 屬的元件庫 (Library) ;另一種則是內建元 件庫 (Common Libraries), 兩者皆可透過 『視窗』功能表來開啟, 以下即為您說明。
Teacher : Ing-Jer Huang TA : Chien-Hung Chen 2015/6/25 Course Embedded Systems : Principles and Implementations Weekly Preview Question CH 2.4~CH 2.6 &
Image Interpolation Use SSE 指導教授 : 楊士萱 學 生 : 楊宗峰 日 期 :
JAVA 程式設計與資料結構 第二十章 Searching. Sequential Searching Sequential Searching 是最簡單的一種搜尋法,此演 算法可應用在 Array 或是 Linked List 此等資料結構。 Sequential Searching 的 worst-case.
演算法 8-1 最大數及最小數找法 8-2 排序 8-3 二元搜尋法.
第十二章 供應鏈管理之決策支援系統 Decision-Support Systems for Supply Chain Management
逆向選擇和市場失調. 定義  資料不對稱 在交易其中,其中一方較對方有多些資料。  逆向選擇 出現在這個情況下,就是當買賣雙方隨意在 市場上交易,與比較主動交易者作交易為佳 。
845: Gas Station Numbers ★★★ 題組: Problem Set Archive with Online Judge 題號: 845: Gas Station Numbers. 解題者:張維珊 解題日期: 2006 年 2 月 題意: 將輸入的數字,經過重新排列組合或旋轉數字,得到比原先的數字大,
概念性產品企劃書 呂學儒 李政翰.
中央大學。范錚強 1 資料模式 Data Modeling 國立中央大學 資訊管理系 范錚強
ArcINFO &Geodatabase 由 ESRI 產生 1970 ArcINFO 一開始被設計在迷你電 腦上, 後來逐漸發展, 在 UNIX 系統上也能 執行, 直到今天, 已經可以在不同的平台上 運作.
Teacher : Ing-Jer Huang TA : Chien-Hung Chen 2015/6/30 Course Embedded Systems : Principles and Implementations Weekly Preview Question CH7.1~CH /12/26.
著作權所有 © 旗標出版股份有限公司 第 3 章 資料庫物件的關係. 本章提要 Access 資料庫物件的關係 Access 資料庫物件的關係 簡介 Access 的七大物件 簡介 Access 的七大物件 Access 的群組 Access 的群組.
Microsoft Excel.
實體關係模型 (ER Model).
: Finding Paths in Grid ★★★★☆ 題組: Contest Archive with Online Judge 題號: 11486: Finding Paths in Grid 解題者:李重儀 解題日期: 2008 年 10 月 14 日 題意:給一個 7 個 column.
Mapping - 1 Mapping From ER Model to Relational DB.
第 1 章 PC 的基本構造. 本章提要 PC 系統簡介 80x86 系列 CPU 及其暫存器群 記憶體: Memory 80x86 的分節式記憶體管理 80x86 的 I/O 結構 學習組合語言的基本工具.
國立交通大學工業工程與管理學系 花卉供應鏈地理資料倉儲的設計與實作 研究生:曾世民 指導教授:梁高榮 博士 報告組員:吳思賢 H 詹賀翔 H 鄭安祐 H 邱耀信 H
Unit 6 Database Design and
Unit 1 Introduction to DBMS (Database Management Systems)
Data Manipulation 11 After this lecture, you should be able to:  Understand the differences between SQL (Structured Query Language) and other programming.
6.8 Case Study: E-R for Supplier-and-Parts Database
7-1 Unit 4 Relational Database Design 英文版: Chap 7 “Relational Database Design” 中文版:第 6 章 “ 關聯式資料庫設計 ”
Logical Database Design Unit 7 Logical Database Design.
Dr. T. Y. Lin | SJSU | CS 157A | Fall 2015 Chapter 3 Database Normalization 1.
Data Warehousing. Make the right decisions for your organization –Rapid access to all kinds of information –Research and analyze the past data –Identify.
Logical Database Design ( 補 ) Unit 7 Logical Database Design ( 補 )
NormalizationNormalization Chapter 4. Purpose of Normalization Normalization  A technique for producing a set of relations with desirable properties,
Further Normalization I
Introduction to Database System Wei-Pang Yang, IM.NDHU, Final Test-1 Example: Banking Database 1. branch 2. customer 客戶 ( 存款戶, 貸款戶 ) 5. account.
Lecture No 14 Functional Dependencies & Normalization ( III ) Mar 04 th 2011 Database Systems.
Advanced Database System
Dr. T. Y. Lin | SJSU | CS 157A | Fall 2015 Chapter 3 Database Normalization 1.
生物資訊程式語言應用 Part 4 MySQL.
Unit 7 Normalization (表格正規化).
Example: Banking Database
Presentation transcript:

Wei-Pang Yang, Information Management, NDHU Normalization Unit 7 Normalization ( 表格正規化 ) 7-1

Wei-Pang Yang, Information Management, NDHU PART II: 資料庫設計 (Database Design)  資料庫問題分析與架構規劃 :  若有一大量資料想利用 DBMS 建資料庫來管理。第一步要分析問題,找到使用者需求  實體 - 關係模型 (Entity-Relationship Model, 簡稱 E-R Model) 是一套資料庫的設計工具。我們可以 利用 E-R Model 分析資料庫問題。它可以把真實世界中複雜的問題中的事物和關係轉化為資料 庫中的資料架構  由於利用實體 - 關係模型設計資料庫時, 並不會牽涉到資料庫的操作、儲存方式等複雜的電腦運 作。所以, 我們會把心力放在需求分析去規劃想要的資料庫,並以實體 - 關係圖 (E-R Diagram) 來 呈現  資料庫的表格正規化 :  實體 - 關係圖很容易轉化為表格 (Tables) ,而資料庫就是由許多表格 (tables) 組成的  這些表格要正規化 (Normalization) 才能避免將來操作時的異常現象發生  設計介面增刪查改資料庫 :  如何方便、又有效率的管理存取資料庫是使用者最關心的二個要素  良好的介面設計,可以讓使用者方便的查詢、方便的新增、方便的刪除、方便的修改的處理資 料庫 7-2 Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 6-3 Real-world vs. E-R Model vs. Tables 1. branch 2. customer 客戶 ( 存款戶, 貸款戶 ) 3. depositor 存款戶 分公司 The real-world enterprise Semantic Data Model: Entity-Relationship (E-R) Data Model Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-4 Contents  7.1 Introduction  7.2 Functional Dependency  7.3 First Normal Form (1NF)  7.4 Second Normal Form (2NF)  7.5 Third Normal Form (3NF)  7.6 Good and Bad Decomposition Unit 7 Normalization

Introduction  Logical Database Design vs. Physical Database Design  Problem of Normalization  Normal Forms Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-6 Logical Database Design  Logical Database Design Semantic Modeling, eg. E-R model (UNIT 6) Normalization  Problem of Normalization Given some body of data to be represented in a database, how to decide the suitable logical structure they should have? what relations should exist? what attributes should they have? Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-7 Problem of Normalization S1, Smith, 20, London, P1, Nut, Red, 12, London, 300 S1, Smith, 20, London, P2, Bolt, Green, 17, Paris, 200. S4, Clark, 20, London, P5, Cam, Blue, 12, Paris, 400 Normalization S S# SNAME STATUS CITY S1 Smith 20 London.. P SP P# S# P# QTY S1 P1 300 S1 P2 200 S' S# SNAME STATUS S1 Smith 20 S P SP' P# S# CITY P# QTY S1 London P1 300 S1 London P ( 異常 ) Redundancy Update Anomalies! or Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU Supplier-and-Parts Database S1, Smith, 20, London, P1, Nut, Red, 12, London, 300 S1, Smith, 20, London, P2, Bolt, Green, 17, Paris, 200. S4, Clark, 20, London, P5, Cam, Blue, 12, Paris, 400 Normalization S# SNAME STATUS CITY S1 Smith 20 London S2 Jones 10 Paris S3 Blake 30 Paris S4 Clark 20 London S ? Unit 7 Normalization 7-8 SSP

Wei-Pang Yang, Information Management, NDHU 7-9 Normal Forms  A relation is said to be in a particular normal form if it satisfies a certain set of constraints. 1NF: A relation is in First Normal Form (1NF) iff it contains only atomic values. universe of relations (normalized and un-normalized) 1NF relations (normalized relations) 2NF relations 3NF relations 4NF relations BCNF relations 5NF relations Unit 7 Normalization

Functional Dependency  Functional Dependency (FD)  Fully Functional Dependency (FFD) Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-11 Functional Dependency  Functional Dependency Def: Given a relation R, R.Y is functionally dependent on R.X iff each X-value has associated with it precisely one Y-value (at any time). Note: X, Y may be the composite attributes.  Notation: R.X R.Y read as "R.X functionally determines R.Y" R X Y Unit 7 Normalization S# SNAME STATUS CITY S1 Smith 20 London S2 Jones 10 Paris S3 Blake 30 Paris S4 Clark 20 London S5 Adams 30 Athens S

Wei-Pang Yang, Information Management, NDHU 7-12 Functional Dependency (cont.) S S.S# S.SNAME S.S# S.STATUS S.S# S.CITY S.STATUS S.CITY FD Diagram: S# SNAME CITY STATUS S# SNAME STATUS CITY S1 Smith 20 London S2 Jones 10 Paris S3 Blake 30 Paris S4 Clark 20 London S5 Adams 30 Athens S Note: Assume STATUS is some factor of Supplier and no any relationship with CITY. Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-13 Functional Dependency (cont.) P SP P# PNAME CITY COLOR WEIGHT S# P# QTY  If X is a candidate key of R, then all attributes Y of R are functionally dependent on X. (i.e. X Y) P# PNAME COLOR WEIGHT CITY P1 Nut Red 12 London P2 Bolt Green 17 Paris P3 Screw Blue 17 Rome P4 Screw Red 14 London P5 Cam Blue 12 Paris P6 Cog Red 19 London P S# P# QTY S1 P1 300 S1 P2 200 S1 P3 400 S1 P4 200 S1 P5 100 S1 P6 100 S2 P1 300 S2 P2 400 S3 P2 200 S4 P2 200 S4 P4 300 S4 P5 400 SP Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-14 Fully Functional Dependency (FFD)  Def: Y is fully functionally dependent on X iff (1) Y is FD on X (2) Y is not FD on any proper subset of X. SP' (S#, CITY, P#, QTY) S# P# QTY FFD S# CITY P# QTY S1 London P1 300 S1 London P2 200 … …. …... S# P# CITY FD not FFD S# CITY FD S# CITY … ….. S#city 1 city 2 S1 London Taipei SP' Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-15 Fully Functional Dependency (cont.) 1. Normally, we take FD to mean FFD. 2. FD is a semantic notion. S# CITY Means: each supplier is located in precisely one city. 3. FD is a special kind of integrity constraint. CREATE INTEGRITY RULE SCFD CHECK FORALL SX FORALL SY (IF SX.S# = SY.S# THEN SX.CITY = SY.CITY); 4. FDs considered here applied within a single relation. SP.S# S.S# is not considered! (Ref P.10-9) Unit 7 Normalization

First Normal Form (1NF) Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-17 Normal Forms: 1NF  Def: A relation is in 1NF iff all underlying simple domains contain atomic values only. S# STATUS CITY (P#, QTY) S1 20 London {(P1, 300), (P2, 200),..., (P6, 100)} S2 10 Paris {(P1, 300), (P2, 400)} S3 10 Paris {(P2, 200)} S4 20 London {(P2, 200), (P4, 300), (P5, 400)} S # STATUS CITY P# QTY S1 20 London P1 300 S1 20 London P2 200 S1 20 London P3 400 S1 20 London P4 200 S1 20 London P5 100 S1 20 London P6 100 S2 10 Paris P1 300 S2 10 Paris P2 400 S3 10 Paris P2 200 S4 20 London P2 200 S4 20 London P4 300 S4 20 London P5 400 Key:(S#,P#), Normalized 1NF FIRST Suppose 1. CITY is the main office of the supplier. 2. STATUS is some factor of CITY Fact? Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU NF Problem: Update Anomalies! Update If suppler S1 moves from London to Paris, then 6 tuples must be updated! Insertion Cannot insert a supplier information if it doesn't supply any part, because that will cause a null key value. Deletion Delete the information that "S3 supplies P2", then the fact "S3 is located in Paris" is also deleted. S# STATUS CITY P# QTY..... S3 20 Paris P S5 30 Athens NULL NULL FIRST S # STATUS CITY P# QTY S1 20 London P1 300 S1 20 London P2 200 S1 20 London P3 400 S1 20 London P4 200 S1 20 London P5 100 S1 20 London P6 100 S2 10 Paris P1 300 S2 10 Paris P2 400 S3 10 Paris P2 200 S4 20 London P2 200 S4 20 London P4 300 S4 20 London P5 400 Key:(S#,P#), Normalized 1NF FIRST Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU NF Problem: Update Anomalies! (cont.) Suppose 1. CITY is the main office of the supplier. 2. STATUS is some factor of CITY (ref.p.7-10) Primary key of FIRST: (S#, P#) FD diagram of FIRST: S# P# CITY STATUS QTY FF D x FFD FFD FD: 1. S# STATUS 2. S# CITY 3. CITY STATUS 4. (S#, P#) QTY FFD primary key (S#, P#) STATUS primary key (S#, P#) CITY S # STATUS CITY P# QTY S1 20 London P1 300 S1 20 London P2 200 S1 20 London P3 400 S1 20 London P4 200 S1 20 London P5 100 S1 20 London P6 100 S2 10 Paris P1 300 S2 10 Paris P2 400 S3 10 Paris P2 200 S4 20 London P2 200 S4 20 London P4 300 S4 20 London P5 400 Key:(S#,P#), Normalized 1NF FIRST Unit 7 Normalization  Non-key attributes are NOT FFD on primary key. (e.g. QTY, STATUS, CITY in FIRST)

Second Normal Form (2NF) S # STATUS CITY P# QTY S1 20 London P1 300 S1 20 London P2 200 S1 20 London P3 400 S1 20 London P4 200 S1 20 London P5 100 S1 20 London P6 100 S2 10 Paris P1 300 S2 10 Paris P2 400 S3 10 Paris P2 200 S4 20 London P2 200 S4 20 London P4 300 S4 20 London P5 400 FIRST S# STATUS CITY S1 20 London S2 10 Paris S3 10 Paris S4 20 London S5 30 Athens SECOND (in 2NF) S# P# QTY S1 P1 300 S1 P2 200 S1 P3 400 S1 P4 200 S1 P5 100 S1 P6 100 S2 P1 300 S2 P2 400 S3 P2 200 S4 P2 200 S4 P4 300 S4 P5 400 SP (in 2NF)

Wei-Pang Yang, Information Management, NDHU 7-21 Normal Form: 2NF  Def: A relation R is in 2NF iff (1) R is in 1NF (i.e. atomic ) (2) Non-key attributes are FFD on primary key. ( QTY, STATUS, CITY in FIRST) FIRST is in 1NF, but not in 2NF (S#, P#) STATUS, and (S#, P#) CITY FFD Decompose FIRST into: SECOND (S#, STATUS, CITY): primary key: S# S# CITY STATUS FD: 1. S# STATUS 2. S# CITY 3. CITY STATUS S# P# CITY STATUS QTY FFD x FFD FFD SP (S#, P#, QTY): P rimary key: (S#, p#) S# P# QTY FD: 4. (S#, P#) QTY Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-22 Normal Form: 2NF (cont.) S# STATUS CITY S1 20 London S2 10 Paris S3 10 Paris S4 20 London S5 30 Athens SECOND (in 2NF) S # STATUS CITY P# QTY S1 20 London P1 300 S1 20 London P2 200 S1 20 London P3 400 S1 20 London P4 200 S1 20 London P5 100 S1 20 London P6 100 S2 10 Paris P1 300 S2 10 Paris P2 400 S3 10 Paris P2 200 S4 20 London P2 200 S4 20 London P4 300 S4 20 London P5 400 FIRST Update: S1 moves from London to Paris Insertion: (S5 30 Athens) Deletion Delete "S3 supplies P2 200", then the fact "S3 is located in Paris" is also deleted. Unit 7 Normalization S# P# QTY S1 P1 300 S1 P2 200 S1 P3 400 S1 P4 200 S1 P5 100 S1 P6 100 S2 P1 300 S2 P2 400 S3 P2 200 S4 P2 200 S4 P4 300 S4 P5 400 SP (in 2NF)

Wei-Pang Yang, Information Management, NDHU 7-23 Normal Form: 2NF (cont.)  A relation in 1NF can always be reduced to an equivalent collection of 2NF relations.  The reduction process from 1NF to 2NF is non-loss decomposition. FIRST(S#, STATUS, SCITY, P#, QTY) projections natural joins SECOND(S#, STATUS, CITY), SP(S#,P#,QTY)  The collection of 2NF relations may contain “more” information than the equivalent 1NF relation. (S5, 30, Athens) 2nd SP 1st Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-24 Problem: Update Anomalies in SECOND! Update Anomalies in SECOND UPDATE: if the status of London is changed from 20 to 60, then two tuples must be updated DELETE: delete supplier S5, then the fact "the status of Athens is 30" is also deleted! INSERT: cannot insert the fact "the status of Rome is 50"! Why: S.S# S.STATUS S.S# S. CITY S.CITY S.STATUS cause a transitive dependency S# STATUS CITY S1 20 London S2 10 Paris S3 10 Paris S4 20 London S5 30 Athens SECOND (in 2NF) S# CITY STATUS FD: 1. S# STATUS 2. S# CITY 3. CITY STATUS Unit 7 Normalization

Third Normal Form (3NF) Unit 7 Normalization S# STATUS CITY S1 20 London S2 10 Paris S3 10 Paris S4 20 London S5 30 Athens SECOND S# CITY S1 London S2 Paris S3 Paris S4 London S5 Athens SC (in 3NF) CITY STATUS Athens 30 London 20 Paris 10 Rome 50 CS (in 3NF)

Wei-Pang Yang, Information Management, NDHU 7-26 Normal Forms: 3NF  Def : A relation R is in 3NF iff (1) R is in 2NF (2) Every non-key attribute is non-transitively dependent on the primary key. e.g. STATUS is transitively on S# (i.e., non-key attributes are mutually independent) SP is in 3NF, but SECOND is not! S# P# QTY SP FD diagram S# STATUS CITY S1 20 London S2 10 Paris S3 10 Paris S4 20 London S5 30 Athens SECOND (not 3NF) S# CITY STATUS SECOND FD diagram Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-27 Normal Forms: 3NF (cont.)  Decompose SECOND into: SC(S#, CITY) primary key : S# FD diagram: CS(CITY, STATUS): primary key: CITY FD diagram: S# CITY STATUS S# CITY S1 London S2 Paris S3 Paris S4 London S5 Athens SC (in 3NF) CITY STATUS Athens 30 London 20 Paris 10 Rome 50 CS (in 3NF) S# STATUS CITY S1 20 London S2 10 Paris S3 10 Paris S4 20 London S5 30 Athens SECOND S# STATUS CITY Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-28 Normal Forms: 3NF (cont.) Unit 7 Normalization P# PNAME COLOR WEIGHT CITY P1 Nut Red 12 London P2 Bolt Green 17 Paris P3 Screw Blue 17 Rome P4 Screw Red 14 London P5 Cam Blue 12 Paris P6 Cog Red 19 London P  Consider P:  Def: A relation is in 1NF iff all underlying simple domains contain atomic values only.  Def: A relation R is in 2NF iff (1) R is in 1NF (i.e. atomic ) (2) Non-key attributes are FFD on primary key.  Def : A relation R is in 3NF iff (1) R is in 2NF (2) Every non-key attribute is non-transitively dependent on the primary key. (i.e., non-key attributes are mutually independent)

Wei-Pang Yang, Information Management, NDHU 7-29 Normal Forms: 3NF (cont.)  Note: (1) Any 2NF diagram can always be reduced to a collection of 3NF relations. (2) The reduction process from 2NF to 3NF is non-loss decomposition. (3) The collection of 3NF relations may contain "more information" than the equivalent 2NF relation. Unit 7 Normalization

Good and Bad Decomposition Unit 7 Normalization S# CITY S1 London S2 Paris S3 Paris S4 London S5 Athens SC (in 3NF) S# STATUS CITY S1 20 London S2 10 Paris S3 10 Paris S4 20 London S5 30 Athens SECOND S# CITY S1 London S2 Paris S3 Paris S4 London S5 Athens SC (in 3NF) CITY STATUS Athens 30 London 20 Paris 10 Rome 50 CS (in 3NF) S# STATUS S1 20 S2 10 S3 10 S4 20 S5 30 SS (in 3NF) S# STATUS CITY

Wei-Pang Yang, Information Management, NDHU Good and Bad Decomposition  Consider S# CITY STATUS transitive FD S# CITY STATUS Good ! (Rome, 50) can be inserted. ① Decomposition A: SC: CS: S# CITY S# STATUS Bad ! (Rome, 50) can not be inserted unless there is a supplier located at Rome. ② Decomposition B: SC: SS: S# STATUS CITY S# STATUS CITY Suppose 1. CITY is the main office of the supplier. 2. STATUS is some factor of CITY (ref. p.7-9) Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-32 Good and Bad Decomposition (cont.)  Independent Projection: (by Rissanen '77, ref[10.6]) Def: Projections R1 and R2 of R are independent iff (1) Any FD in R can be reduced from those in R1 and R2. (2) The common attribute of R1 and R2 forms a candidate key for at least one of R1 and R2. Decomposition A: SECOND(S#, STATUS, CITY)  (R) (1) FD in SECOND (R): S# CITY CITY STATUS S#  STATUS FD in SC and CS R1: S# CITY R2: CITY STATUS (2) Common attribute of SC and CS is CITY, which is the primary key of CS.  SC and CS are independent  Decomposition A is good! SC (S#, CITY) CS (CITY, STATUS) Projection R1 R R2 (R1) (R2) { S# STATUS Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU 7-33 Good and Bad Decomposition (cont.) Decomposition B: SECOND(S#, STATUS, CITY) (1) FD in SC and SS: S# CITY S#  STATUS  SC and SS are dependent  decomposition B is bad! though common attr. S# is primary key of both SC, SS. SC (S#, CITY) SS (S#, STATUS) { CITY STATUS Unit 7 Normalization

Wei-Pang Yang, Information Management, NDHU Normal Forms Unit 7 Normalization 7-34 universe of relations (normalized and un-normalized) 1NF relations (normalized relations) 2NF relations 3NF relations 4NF relations BCNF relations 5NF relations

Wei-Pang Yang, Information Management, NDHU More on Normalization Unit 18 More on Normalization  18.1 Introduction  18.2 Functional Dependency  18.3 First, Second, and Third Normal Forms (1NF, 2NF, 3NF)  18.4 Boyce/Codd Normal Form (BCNF)  18.5 Fourth Normal Form (4NF)  18.6 Fifth Normal Form (5NF) 7-35

Wei-Pang Yang, Information Management, NDHU Reduction E-R Model to Relational Tables Refer to UNIT 6, Sec. 7 Transfer your E-R model to Tables Check each Table to see if it is a 1NF, 2NF, 3NF Design Query Using SQL to define and create Tables Design some queries to access your database Using SQL to query your database … Unit 6 Database Design and the E-R Model 7-36 EX.part2.2: Tables and SQL

Wei-Pang Yang, Information Management, NDHU end of unit