Data Warehousing Lecture-7: De-normalization. Virtual University of Pakistan. Ahsan Abdullah, Assoc. Prof. & Head, Center for Agro-Informatics.

Presentation transcript:

Slide 1: Data Warehousing, Lecture-7: De-normalization
Virtual University of Pakistan
Ahsan Abdullah, Assoc. Prof. & Head, Center for Agro-Informatics Research, National University of Computers & Emerging Sciences, Islamabad

Slide 2: De-normalization

Slide 3: Striking a balance between “good” & “evil”
[Figure: a spectrum of designs. De-normalization moves toward one big flat file; normalization moves toward too many tables. Labels along the spectrum: Flat Table, Data Lists, Data Cubes, 1st Normal Form, 2nd Normal Form, 3rd Normal Form, 4+ Normal Forms.]

Slide 4: What is De-normalization?
- It is not chaos, more like a “controlled crash” with the aim of performance enhancement without loss of information.
- Normalization is a rule of thumb in DBMS, but in DSS ease of use is achieved by way of de-normalization.
- De-normalization comes in many flavors, such as combining tables, splitting tables, adding data etc., but all done very carefully.

Slide 5: Why De-normalization in DSS?
- It brings dispersed but related data items “close” together.
- Query performance in DSS depends significantly on the physical data model.
- Very early studies showed performance differences of orders of magnitude for different numbers of de-normalized tables and rows per table.
- The level of de-normalization should be carefully considered.

Slide 6: How does De-normalization improve performance?
De-normalization specifically improves performance in one of three ways:
- Reducing the number of tables, and hence the reliance on joins, which consequently speeds up performance.
- Reducing the number of joins required during query execution.
- Reducing the number of rows to be retrieved from the Primary Data Table.
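The second point above, fewer joins at query time, can be sketched with Python's built-in sqlite3 module. The table names (customer, sale, sale_flat), columns and rows below are hypothetical and not from the lecture; the sketch only shows that a pre-joined table answers the same question without a join, and it does not measure speed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalized tables (illustration only).
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, cust_id INTEGER, amount REAL);
    INSERT INTO customer VALUES (1, 'Lahore'), (2, 'Karachi');
    INSERT INTO sale VALUES (10, 1, 100.0), (11, 1, 50.0), (12, 2, 75.0);

    -- De-normalized copy: the join is paid once, when the table is built.
    CREATE TABLE sale_flat AS
        SELECT s.sale_id, s.cust_id, c.city, s.amount
        FROM sale s JOIN customer c ON c.cust_id = s.cust_id;
""")

# Normalized design: every query has to join sale to customer.
normalized = conn.execute("""
    SELECT c.city, SUM(s.amount)
    FROM sale s JOIN customer c ON c.cust_id = s.cust_id
    GROUP BY c.city ORDER BY c.city
""").fetchall()

# De-normalized design: the same answer from a single table, no join.
denormalized = conn.execute("""
    SELECT city, SUM(amount) FROM sale_flat
    GROUP BY city ORDER BY city
""").fetchall()

print(normalized == denormalized)
```

The price is that city is now stored once per sale row, which is the extra storage and maintenance cost the guidelines ask to be weighed.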

Slide 7: Four Guidelines for De-normalization
1. Carefully do a cost-benefit analysis (frequency of use, additional storage, join time).
2. Do a data requirement and storage analysis.
3. Weigh against the maintenance issue of the redundant data (triggers used).
4. When in doubt, don’t de-normalize.
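Guideline 3 mentions triggers as the mechanism for maintaining redundant data. A minimal sketch of that maintenance cost, assuming a hypothetical sale table that carries a redundant copy of customer.city (none of these names come from the lecture):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: sale.city duplicates customer.city.
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE sale (sale_id INTEGER PRIMARY KEY, cust_id INTEGER,
                       city TEXT, amount REAL);

    -- The trigger propagates every change of the source column to its copies.
    CREATE TRIGGER customer_city_sync AFTER UPDATE OF city ON customer
    BEGIN
        UPDATE sale SET city = NEW.city WHERE cust_id = NEW.cust_id;
    END;

    INSERT INTO customer VALUES (1, 'Lahore');
    INSERT INTO sale VALUES (10, 1, 'Lahore', 100.0), (11, 1, 'Lahore', 50.0);
""")

# One logical update on customer now also rewrites every matching sale row.
conn.execute("UPDATE customer SET city = 'Islamabad' WHERE cust_id = 1")
cities = [row[0] for row in conn.execute("SELECT city FROM sale ORDER BY sale_id")]
print(cities)
```

A single-row update fans out to as many rows as hold the redundant copy, which is why the slide pairs de-normalization with workloads that have few updates.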

Slide 8: Areas for Applying De-normalization Techniques
- Dealing with the abundance of star schemas.
- Fast access of time series data for analysis.
- Fast aggregate (sum, average etc.) results and complicated calculations.
- Multidimensional analysis (e.g. geography) in a complex hierarchy.
- Dealing with few updates but many join queries.
De-normalization will ultimately affect the database size and query performance.

Slide 9: Five principal De-normalization techniques
1. Collapsing Tables.
   - Two entities with a One-to-One relationship.
   - Two entities with a Many-to-Many relationship.
2. Splitting Tables (Horizontal/Vertical Splitting).
3. Pre-Joining.
4. Adding Redundant Columns (Reference Data).
5. Derived Attributes (Summary, Total, Balance etc).
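As one illustration from the list, technique 5 (derived attributes) can be sketched as follows. The order_line and order_summary tables and their rows are hypothetical; the idea shown is only that a total is computed once and stored, so that later reads do not repeat the aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical detail table (illustration only).
    CREATE TABLE order_line (order_id INTEGER, item TEXT, price REAL);
    INSERT INTO order_line VALUES
        (1, 'pen', 2.0), (1, 'ink', 3.5), (2, 'pad', 4.0);

    -- Derived attributes: total and line count are stored, not recomputed.
    CREATE TABLE order_summary AS
        SELECT order_id, SUM(price) AS total, COUNT(*) AS n_lines
        FROM order_line GROUP BY order_id;
""")

# Reads hit the small summary table instead of aggregating the detail rows.
totals = conn.execute(
    "SELECT order_id, total, n_lines FROM order_summary ORDER BY order_id"
).fetchall()
print(totals)
```

The stored totals go stale whenever order_line changes, so in practice they would need to be refreshed or maintained, for example by a trigger.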

Slide 10: Collapsing Tables
Normalized: Table 1 (ColA, ColB) and Table 2 (ColA, ColC)
De-normalized: one table (ColA, ColB, ColC)
- Reduced storage space.
- Reduced update time.
- Does not change the business view.
- Reduced foreign keys.
- Reduced indexing.
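The ColA/ColB/ColC picture can be sketched for the one-to-one case as below. The table names t1, t2 and collapsed and the sample values are assumptions made for illustration. The final check confirms that both original tables can be recovered from the collapsed one by projection, i.e. that no information is lost.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: two tables sharing the key ColA (one-to-one).
    CREATE TABLE t1 (ColA INTEGER PRIMARY KEY, ColB TEXT);
    CREATE TABLE t2 (ColA INTEGER PRIMARY KEY REFERENCES t1(ColA), ColC TEXT);
    INSERT INTO t1 VALUES (1, 'b1'), (2, 'b2');
    INSERT INTO t2 VALUES (1, 'c1'), (2, 'c2');

    -- De-normalized: one table, key stored once, no foreign key needed.
    CREATE TABLE collapsed AS
        SELECT t1.ColA, t1.ColB, t2.ColC
        FROM t1 JOIN t2 ON t2.ColA = t1.ColA;
""")

def rows(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Projections of the collapsed table reproduce the original tables exactly.
same_t1 = rows("SELECT ColA, ColB FROM collapsed ORDER BY ColA") == rows(
    "SELECT ColA, ColB FROM t1 ORDER BY ColA")
same_t2 = rows("SELECT ColA, ColC FROM collapsed ORDER BY ColA") == rows(
    "SELECT ColA, ColC FROM t2 ORDER BY ColA")
print(same_t1, same_t2)
```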