(C) Copyright Fabian Pascal

1 (C) Copyright Fabian Pascal

2 DENORMALIZATION FOR PERFORMANCE: A COSTLY ILLUSION
SQLSaturday, Milan, October 2013
Fabian Pascal

3 INDUSTRY SOP
“A traditional normalized structure will not outperform a denormalized schema.” --Practitioner with 20 years' experience
“No major application will run in third normal form.” --G. Koch, ORACLE COMPLETE REFERENCE
“Denormalization can be described as a process for reducing the degree of normalization with the aim of improving query processing performance.” --Sanders & Shin, University of NY Buffalo

4 THE ARGUMENT
The higher the normal form, the more tables in the database;
More tables → more joins;
Join queries perform worse than single table queries;
Denormalize for performance.

5 NORMALIZATION DEGREES
<1NF: Unnormalized
1NF: Normalized *
2NF-4NF: Further normalized
5NF: Fully normalized **
* R-tables; ** “Improved” R-tables; (6NF)

6 RULE OF THUMB
3NF R-tables are usually also in 5NF. It is not very often that 3NF R-tables are not fully normalized. *
* Composite keys

7 BUSINESS MODEL
Business rules:
Every employee is assigned to one or more projects;
Every project has one or more employees assigned to it;
Every employee is assigned to one or more activities;
Every activity has one or more employees assigned to it;
Project and activity assignments are independent.
Employees-to-projects, employees-to-tasks (N:M)

8 LOGICAL MODEL (5NF)
EMP_PROJ: Project assignment of employee identified by employee number (EMP#) is to project identified by project name (PROJNAME)
EMP_ACT: Activity assignment of employee identified by employee number (EMP#) is to activity identified by activity name (ACTNAME)
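
The two predicates above can be sketched as two tables. This is a minimal sketch using Python's built-in sqlite3 module, not the author's implementation: the column name EMP_NO (standing in for EMP#), the composite keys, and the sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory database; table names follow the slide, everything else is assumed.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE EMP_PROJ (
        EMP_NO   INTEGER NOT NULL,   -- stands in for EMP#
        PROJNAME TEXT    NOT NULL,
        PRIMARY KEY (EMP_NO, PROJNAME)
    );
    CREATE TABLE EMP_ACT (
        EMP_NO  INTEGER NOT NULL,    -- stands in for EMP#
        ACTNAME TEXT    NOT NULL,
        PRIMARY KEY (EMP_NO, ACTNAME)
    );
""")

# Hypothetical assignments: each fact is recorded exactly once.
conn.executemany("INSERT INTO EMP_PROJ VALUES (?, ?)",
                 [(100, "P1"), (100, "P2"), (200, "P1")])
conn.executemany("INSERT INTO EMP_ACT VALUES (?, ?)",
                 [(100, "A1"), (100, "A2"), (200, "A1")])

proj_rows = conn.execute("SELECT COUNT(*) FROM EMP_PROJ").fetchone()[0]
act_rows = conn.execute("SELECT COUNT(*) FROM EMP_ACT").fetchone()[0]

# The composite key alone rejects a repeated assignment.
try:
    conn.execute("INSERT INTO EMP_PROJ VALUES (100, 'P1')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Because project and activity assignments live in separate tables, either kind can be inserted or deleted without touching the other.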

9 “BUNDLING” (<4NF)
Redundancy *
Update anomalies
INSERT: project of an employee assigned to no activity
DELETE: sole project of an employee assigned to multiple activities
Database bias
Harder to understand database
Error proneness
* Due to bundling
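
The redundancy and the two update anomalies listed above can be reproduced on a small bundled table. This is a hedged sketch using Python's built-in sqlite3 module; the table name EMP_PROJ_ACT, the column name EMP_NO (standing in for EMP#), and the sample rows are hypothetical, chosen only to illustrate the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One bundled table holding both project and activity assignments (assumed design).
conn.execute("""
    CREATE TABLE EMP_PROJ_ACT (
        EMP_NO   INTEGER NOT NULL,
        PROJNAME TEXT    NOT NULL,
        ACTNAME  TEXT    NOT NULL,
        PRIMARY KEY (EMP_NO, PROJNAME, ACTNAME)
    )
""")
conn.executemany("INSERT INTO EMP_PROJ_ACT VALUES (?, ?, ?)", [
    (100, "P1", "A1"), (100, "P1", "A2"),
    (100, "P2", "A1"), (100, "P2", "A2"),
    (200, "P1", "A1"), (200, "P1", "A2"),
])

# Redundancy: "employee 100 is assigned to P1" is stored once per activity.
copies_of_fact = conn.execute(
    "SELECT COUNT(*) FROM EMP_PROJ_ACT WHERE EMP_NO = 100 AND PROJNAME = 'P1'"
).fetchone()[0]

# INSERT anomaly: a project assignment for an employee with no activity
# cannot be recorded, because ACTNAME is part of the key.
try:
    conn.execute("INSERT INTO EMP_PROJ_ACT VALUES (300, 'P1', NULL)")
    insert_rejected = False
except sqlite3.IntegrityError:
    insert_rejected = True

# DELETE anomaly: removing the sole project of employee 200 also removes
# every record of that employee's activity assignments.
conn.execute("DELETE FROM EMP_PROJ_ACT WHERE EMP_NO = 200 AND PROJNAME = 'P1'")
activities_left = conn.execute(
    "SELECT COUNT(*) FROM EMP_PROJ_ACT WHERE EMP_NO = 200"
).fetchone()[0]
```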

10 DESIGN OPTIONS
Two 5NF R-tables: Domain, Column, Key *
One R-table: Domain, Column, Key, Redundancy control!
* FD, RI, Arb.

11 RC CONSTRAINT
a = b1 JOIN b2

12 THE THEORY & …
CREATE ASSERTION a_rd CHECK ((SELECT * FROM a) = (SELECT * FROM b1) JOIN (SELECT * FROM b2));
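
SQLite, used here through Python's built-in sqlite3 module, has no CREATE ASSERTION, so a redundancy control check of this kind has to be emulated by a query. The sketch below rests on assumptions the slides do not spell out: that a is a bundled table with columns EMP_NO, PROJNAME and ACTNAME, and that b1 and b2 are its projections on (EMP_NO, PROJNAME) and (EMP_NO, ACTNAME). The function name rc_constraint_holds is hypothetical.

```python
import sqlite3

# Join of the two assumed projections of a, matched on the shared EMP_NO column.
REJOIN = """
    SELECT b1.EMP_NO, b1.PROJNAME, b2.ACTNAME
    FROM (SELECT DISTINCT EMP_NO, PROJNAME FROM a) AS b1
    JOIN (SELECT DISTINCT EMP_NO, ACTNAME FROM a) AS b2
      ON b1.EMP_NO = b2.EMP_NO
"""

def rc_constraint_holds(conn):
    """Return True when a = b1 JOIN b2, comparing both sides as sets of rows."""
    bundled = set(conn.execute("SELECT EMP_NO, PROJNAME, ACTNAME FROM a"))
    rejoined = set(conn.execute(REJOIN))
    return bundled == rejoined

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE a (
        EMP_NO   INTEGER NOT NULL,
        PROJNAME TEXT    NOT NULL,
        ACTNAME  TEXT    NOT NULL,
        PRIMARY KEY (EMP_NO, PROJNAME, ACTNAME)
    )
""")
conn.executemany("INSERT INTO a VALUES (?, ?, ?)", [
    (100, "P1", "A1"), (100, "P1", "A2"),
    (100, "P2", "A1"), (100, "P2", "A2"),
])
holds_before = rc_constraint_holds(conn)

# Deleting a single row leaves the table inconsistent with its own projections.
conn.execute(
    "DELETE FROM a WHERE EMP_NO = 100 AND PROJNAME = 'P2' AND ACTNAME = 'A2'"
)
holds_after = rc_constraint_holds(conn)
```

A check like this would have to run after every update to the bundled table, which is the extra work the two separate 5NF tables avoid.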

13 … PRACTICE

14 DATA FUNDAMENTALS
“IT professionals and users require databases and DBMS’s that produce correct results and are efficient, but most of them do not know whether the practices and tools they employ are sound and optimal, or what the real sources of problems are if they are not.” -- PRACTICAL ISSUES IN DATABASE MANAGEMENT

15 MODEL & IMPLEMENTATION
“Consider a mathematical principle, say: (a+b) x (a-b) = a² - b². If you are using a calculator that uses a method involving this principle and it is slow relative to other methods, which would you blame, the method or the calculator?” -- A. Sen

16 BACKWARDS
“Deferred to Design are the compromises between more tables to eliminate redundancy and acceptable performance … But this normalization is not a concern as the [logical] model is built. Indeed, there is no realistic way of knowing whether the designer will choose to [further] normalize data and to what level.” -- Coad & Yourdon, OBJECT-ORIENTED ANALYSIS

17 PRACTICAL IMPLICATIONS
Full normalization: Purely logical; Neutral database; Formal guide; No update anomalies; Max integrity; Min constraints
Denormalization: Logical-physical confusion; Database bias; Ad-hoc; Update anomalies; Max integrity risk; RC constraints defeat purpose

18 RECOMMENDATIONS
Learn data fundamentals *
Don’t trust what you hear/read
Don’t confuse: Levels of representation; Model with implementation
Design logically sound databases
Demand TRDBMS’s

19 (C) Copyright Fabian Pascal

20 EDUCATION SERVICES
Education, as distinct from tool-specific training, useful for any and all DBMS products used;
Corrects myths and misconceptions about, and explains the practical implications of, data fundamentals: concepts, principles and methods that receive little, no, or incorrect coverage in the industry, in simple, accessible language;
For data professionals and users who interact with databases and prefer to think for themselves, understanding to "cookbooks", and soundness to marketing fads and fashion.

21 SEMINARS & PAPERS
Business Modeling for Database Design
The Costly Illusion: Normalization, Integrity and Performance
The Final NULL in the Coffin: A Relational Solution to Missing Data
Truly Relational: What It Really Means

22 DBDEBUNK BLOG
Debunkings of industry claims;
Articles on data fundamentals;
Weekly Quotes & To Laugh or Cry?: industry material for which it is difficult to know which of the two reactions is warranted;
Illustrates the poor state of foundation knowledge;
Offers an opportunity to test oneself on knowledge and comprehension of data fundamentals.

