Download presentation
Presentation is loading. Please wait.
Published byAugusta Newton Modified over 9 years ago
1
Chapter 5 DATA WAREHOUSING Study Sections 5.2, 5.3, 5.5, Pages:231-233 & Snowflake schema
2
Business Intelligence Companies collect a large amount of data from their business operations. To keep track of that information, a business uses disparate software applications, such as Excel, Access, etc. Using multiple software makes it difficult to retrieve information in a timely manner and to perform analysis of the data. Business Intelligence (BI) represents the tools and systems that play a key role in integrating and analyzing all corporate data. Generally illustrates intelligence in the areas of customer profiling, market research, product profitability (by product, region, year), etc.
3
BI Architecture Consists of 3 system components –Data warehouse –Business analytics –Performance management (BPM)
4
Data warehouse –A repository of cleaned and integrated historical /stable data for the entire business –Extracted from independent databases (internal & external) –Transformed (ie. cleaned and reformatted) - A subset of a warehouse limited to a business function is called a Data Mart (eg. Sales).
5
DW vs. Transaction DBs Differences between standard Transactional databases & Data Warehouses: DWs are not designed for performing transaction entries, but only for planning and analysis DWs are not designed for retrieval of individual records; emphasis is on summarized data DWs data pulled and integrated from disparate databases, unlike Transaction db’s which are individual applications Transaction db’s are concerned with ‘now’; DW focuses on activity over a period A transaction db is volatile (eg. an order may be cancelled); In a DW, data is only added, never deleted (as it maintains a history) Transaction db is optimized for rapid retrieval; not DWs
6
Business analytics –Reporting and queries Multi-dimensional reports, eg. Pivot tables [see Exercise 8]; SQL Queries [Exercise 9] Cube analysis [Chapter 6] –Data, text and Web mining and other sophisticated mathematical and statistical tools for searching relationships [Chapter 7] These are tools that help analyze the data towards finding solutions:
7
Dashboard reports Production reports Business Performance Management (BPM) BPM supports monitoring, measuring, and comparing of sales, profit, cost, profitability, and other performance indicators
8
Transaction Data Systems Data Source = DW Data Source Views Analytic Tools SQL, Cubes
9
Extraction, Transformation, and Load (ETL) Process A data warehousing process consists of : Extraction (i.e., reading data from a database), Transformation (i.e., converting the extracted data from its previous form into the form in which it needs to be so that it can be placed into a data warehouse), and Load (i.e., storing the data into the data warehouse)
10
Data Integration and the Extraction, Transformation, and Load (ETL) Process
11
ETL Data from multiple Sources Newly integrated schema for the Data Warehouse
12
DW Schema Structures: Star Note that data is un-normalized
13
DW Schema Structures: Snowflake Note that data is normalized
14
Designing Fact Tables: Normalization Normalization is the process of gathering attributes into tables to eliminate redundant data (the redundancy here is EquipID EquipType)
15
Normalization Exercise FIRST (Supplier#, City, CityCode, Part#, Qty) Split the table into 3 different tables: –(Supplier#, City) –(City, CityCode) –(Supplier#, Part#,Qty) Although normalized databases have less data redundancies, they are less efficient in quickly processing the data. Hence, many DWs use Star schema.
16
Data Marts provide ‘views’ of the data in the Data Warehouse (we will be working with this in our SQL exercises)
17
Summary
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.