Data Warehousing ISYS 650. What is a data warehouse? A data warehouse is a subject-oriented, integrated, nonvolatile, time-variant collection of data.



What is a data warehouse?
A data warehouse is a subject-oriented, integrated, nonvolatile, time-variant collection of data in support of management’s decisions.
– Subject-oriented: data is organized around major subjects of the enterprise, such as sales, rather than individual transactions, and is oriented to decision making.
– Integrated: the same piece of information collected from various systems is referred to in only one way. Example: one system records Gender as M, F, another as Male, Female, and a third as Sex: 0, 1; the warehouse stores a single representation.
– Nonvolatile: data is loaded into the data warehouse on a scheduled basis and is not changed by day-to-day transactions.
– Time-variant: historical data is kept to support time-series and trend analysis.
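The "integrated" property can be illustrated with a minimal sketch. The mapping below is an assumption made for illustration only: the slide does not say which source systems use which codes, nor whether 0 or 1 stands for male, so the function name `normalize_gender` and the table `CANONICAL_GENDER` are hypothetical.

```python
# Hypothetical mapping: which system uses which code, and the meaning of
# 0 and 1, are assumptions made only for this illustration.
CANONICAL_GENDER = {
    "m": "M", "male": "M", "0": "M",
    "f": "F", "female": "F", "1": "F",
}


def normalize_gender(raw):
    """Return the single warehouse code ("M" or "F") for a source value."""
    key = str(raw).strip().lower()
    if key not in CANONICAL_GENDER:
        raise ValueError(f"unrecognized gender code: {raw!r}")
    return CANONICAL_GENDER[key]


# Values as they might arrive from three different operational systems.
source_values = ["M", "Female", 0, "male", 1, "F"]
integrated = [normalize_gender(v) for v in source_values]
print(integrated)
```

Whatever the source encoding, every row in the warehouse ends up with one of two codes, which is what allows queries to group on the field reliably.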

What is a Data Warehouse?
– A physical repository where relational data are specially organized to provide enterprise-wide, cleansed data in a standardized format.
– “The data warehouse is a collection of integrated, subject-oriented databases designed to support DSS functions, where each unit of data is nonvolatile and relevant to some moment in time.”

Need for Data Warehousing
Separation of operational and informational systems and data for improved performance.

Types of Data in a DW
– Current detailed data: consistent as of the time the data is extracted from the transaction system.
– Old detailed data: needs to be archived.
– Summarized data
– Metadata:
  – A directory of what is in the warehouse.
  – A guide to mapping data from the transaction database to the data warehouse.

Data Mart
A departmental data warehouse that stores only relevant data.
– Dependent data mart: a subset that is created directly from a data warehouse.
– Independent data mart: a small data warehouse designed for a strategic business unit or a department.

DW Framework

Data Integration and the Extraction, Transformation, and Load (ETL) Process

Representation of Data in DW
– Dimensional modeling: a retrieval-based system that supports high-volume query access.
– Star schema: the most commonly used and the simplest style of dimensional modeling.
  – Contains a fact table surrounded by and connected to several dimension tables.
  – The fact table contains the measures (numerical values) needed to perform decision analysis and query reporting.
  – Dimension tables contain classification and aggregation information about the values in the fact table.
– Snowflake schema: an extension of the star schema in which the diagram resembles a snowflake in shape.

Multidimensionality
The ability to organize, present, and analyze data by several dimensions, such as sales by region, by product, by salesperson, and by time (four dimensions).
Multidimensional presentation:
– Dimensions: products, salespeople, market segments, business units, geographical locations, distribution channels, country, or industry
– Measures: money, sales volume, head count, inventory profit, actual versus forecast
– Time: daily, weekly, monthly, quarterly, or yearly

Example: Northwind Database

Examples of Sales Analysis
– Total sales by Product
– Sales related to Customer:
  – Location: sales by City, Country
– Sales related to Time:
  – Quarterly, monthly, yearly sales
– Sales related to Employee

Analyze Sales Data: Detailed Business Data
– Total sales:
  – Amount of each detail line: Quantity*UnitPrice*(1-Discount), where Discount is stored as a fraction (for example, 0.15 for 15% off)
  – Total: Sum(Quantity*UnitPrice*(1-Discount))
– Total quantity sold: Sum(Quantity)
– Detailed business data:
  – Quantity*UnitPrice*(1-Discount)
  – Quantity
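The two measures can be checked on a few rows. This is a minimal sketch with made-up order detail lines; it assumes Discount is stored as a fraction of the price, as in Northwind's order details, so the net amount of a line is Quantity*UnitPrice*(1-Discount). The names `order_details` and `line_amount` are illustrative.

```python
# Hypothetical order detail lines: (quantity, unit_price, discount).
# Assumption: discount is a fraction, e.g. 0.25 means 25% off.
order_details = [
    (12, 14.0, 0.0),
    (10, 10.0, 0.0),
    (5, 40.0, 0.25),
]


def line_amount(quantity, unit_price, discount):
    """Net amount of one detail line after applying the discount."""
    return quantity * unit_price * (1 - discount)


# The two measures that later go into the fact table.
total_sales = sum(line_amount(q, p, d) for q, p, d in order_details)
total_quantity = sum(q for q, _, _ in order_details)

print(total_sales, total_quantity)
```

With a fractional discount, multiplying by Discount itself would yield the amount given away rather than the amount sold, and undiscounted lines would contribute zero.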

Dimensions for Data Analysis: Factors relevant to the detailed business data
Analyze sales by:
– Product, product category
– Location: City, State, Country
– Time: quarterly, yearly sales
– Employee
– Combinations of these dimensions, e.g. Product and Location, Product and Time
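Analyzing by one dimension or by a combination of dimensions amounts to grouping the detailed data on those fields and summing a measure. The sketch below uses a handful of invented rows; the field names and the helper `total_by` are assumptions for illustration, not part of the Northwind database.

```python
from collections import defaultdict

# Hypothetical detailed sales rows, already tagged with dimension values.
facts = [
    {"product": "Chai", "country": "Germany", "quarter": 3, "amount": 100.0},
    {"product": "Chai", "country": "UK", "quarter": 4, "amount": 50.0},
    {"product": "Tofu", "country": "Germany", "quarter": 3, "amount": 30.0},
    {"product": "Chai", "country": "Germany", "quarter": 4, "amount": 20.0},
]


def total_by(rows, *dimensions):
    """Sum the amount measure for each combination of the given dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["amount"]
    return dict(totals)


by_product = total_by(facts, "product")
by_product_and_location = total_by(facts, "product", "country")
print(by_product)
print(by_product_and_location)
```

Adding a dimension to the call splits each total into finer cells; removing one rolls the cells back up.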

Data Warehouse Design: Star Schema
– Dimension tables contain descriptions of the subjects of the business, such as customers, employees, locations, products, and time periods.
– The fact table contains detailed business data with links to the dimension tables.

Define Product Dimension
– Product table: ProductID, ProductName, SupplierID, CategoryID, QuantityPerUnit, UnitPrice, UnitsInStock, UnitsOnOrder, ReorderLevel, Discontinued
– Product dimension table: ProductID, ProductName, CategoryID

Define Employee Dimension
– Employees table: EmployeeID, LastName, FirstName, Title, TitleOfCourtesy, BirthDate, HireDate, Address, City, Region, PostalCode, Country, HomePhone, etc.
– Employee dimension: EmployeeID, FullName, Title, EmpCity

Define Location Dimension
– Customers table: CustomerID, CompanyName, ContactName, ContactTitle, Address, City, Region, PostalCode, Country, Phone, Fax
– Location dimension: LocationCode, City, Country
– Define Location Code: an artificial code created to link detailed business data with the city and country.
– In the Northwind database, I used a Make Table query to create a Location table from the Customers table with the City and Country fields. Then I used the Location table’s design view to add a LocationCode field with the AutoNumber data type.
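The idea behind the location code can be sketched outside Access. The example assumes the dimension should hold one row per distinct (City, Country) pair and that codes are assigned sequentially, in the way an AutoNumber field would; the customer rows are invented.

```python
# Hypothetical customer rows: (customer_id, city, country).
customers = [
    ("C1", "Berlin", "Germany"),
    ("C2", "Mexico City", "Mexico"),
    ("C3", "Mexico City", "Mexico"),
    ("C4", "London", "UK"),
]

# Assign a sequential artificial code to each distinct (city, country) pair.
location_codes = {}
for _, city, country in customers:
    key = (city, country)
    if key not in location_codes:
        location_codes[key] = len(location_codes) + 1

# Rows of the Location dimension: (LocationCode, City, Country).
location_dimension = [
    (code, city, country) for (city, country), code in location_codes.items()
]

# Lookup used later to tag each customer's sales with a location code.
customer_location = {
    cid: location_codes[(city, country)] for cid, city, country in customers
}

print(location_dimension)
print(customer_location)
```

Two customers in the same city share one code, so sales can be rolled up by location rather than by customer.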

Define Period Dimension
– In the Orders table, the period is given by OrderDate.
– In the data warehouse we define Period to be: PeriodCode, Year, Quarter. Examples (Year, Quarter, Month):
  – OrderDate: 04-Jul-1996 -> 1996, 3, 7
  – OrderDate: 20-Dec-1996 -> 1996, 4, 12
– In Access, create a query based on the Orders table with these calculated fields:
  – Year: Year(OrderDate)
  – Month: Month(OrderDate)
  – Quarter: IIf([Month]<=3,1,IIf([Month]<=6,2,IIf([Month]<=9,3,4)))
– Define Period Code: PeriodCode: CStr([Year]) + CStr([Quarter])
  – 1996, 3, 7 -> 19963
  – 1996, 4, 12 -> 19964
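The same derivation can be written as a short sketch in Python. The function name `period_fields` is illustrative; the quarter formula is an arithmetic equivalent of the nested IIf expression, and the period code is the year and quarter concatenated as text.

```python
from datetime import date


def period_fields(order_date):
    """Return (period_code, year, quarter, month) for an order date."""
    year = order_date.year
    month = order_date.month
    # Months 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4.
    quarter = (month - 1) // 3 + 1
    # Concatenate as text, like CStr(Year) + CStr(Quarter).
    period_code = f"{year}{quarter}"
    return period_code, year, quarter, month


print(period_fields(date(1996, 7, 4)))
print(period_fields(date(1996, 12, 20)))
```

Because the code is built by string concatenation rather than addition, 1996 and 3 give "19963", not 1999.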

Star Schema
– FactTable: LocationCode, PeriodCode, EmployeeID, ProductID, Qty, Amount
– Location Dimension: LocationCode, City, Country
– Employee Dimension: EmployeeID, FullName, Title, EmpCity
– Product Dimension: ProductID, ProductName, CategoryID
– Period Dimension: PeriodCode, Year, Quarter

A Query to retrieve data for Fact Table
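The query shown on the original slide is not preserved in this transcript. As a stand-in, the following is a minimal sketch of what such a query could look like, run against an in-memory SQLite database instead of Access. The table layouts are simplified, the rows are invented, Discount is assumed to be a fraction, and the choice to aggregate to one row per (LocationCode, PeriodCode, EmployeeID, ProductID) is an assumption rather than the author's stated design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID TEXT PRIMARY KEY, City TEXT, Country TEXT);
CREATE TABLE Location (LocationCode INTEGER PRIMARY KEY, City TEXT, Country TEXT);
CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerID TEXT,
                     EmployeeID INTEGER, OrderDate TEXT);
CREATE TABLE OrderDetails (OrderID INTEGER, ProductID INTEGER,
                           UnitPrice REAL, Quantity INTEGER, Discount REAL);

-- Hypothetical sample rows, not actual Northwind data.
INSERT INTO Customers VALUES ('C1', 'Berlin', 'Germany'), ('C2', 'London', 'UK');
INSERT INTO Location VALUES (1, 'Berlin', 'Germany'), (2, 'London', 'UK');
INSERT INTO Orders VALUES (1, 'C1', 5, '1996-07-04'),
                          (2, 'C2', 6, '1996-07-05'),
                          (3, 'C1', 5, '1996-08-10');
INSERT INTO OrderDetails VALUES (1, 11, 14.0, 12, 0.0),
                                (1, 42, 10.0, 10, 0.0),
                                (2, 11, 14.0, 5, 0.0),
                                (3, 11, 14.0, 3, 0.5);
""")

# Join the operational tables, translate each order into dimension keys,
# and sum the measures for every combination of keys.
FACT_QUERY = """
SELECT l.LocationCode,
       strftime('%Y', o.OrderDate)
           || ((CAST(strftime('%m', o.OrderDate) AS INTEGER) + 2) / 3)
           AS PeriodCode,
       o.EmployeeID,
       d.ProductID,
       SUM(d.Quantity) AS Qty,
       SUM(d.Quantity * d.UnitPrice * (1 - d.Discount)) AS Amount
FROM Orders AS o
JOIN OrderDetails AS d ON d.OrderID = o.OrderID
JOIN Customers AS c ON c.CustomerID = o.CustomerID
JOIN Location AS l ON l.City = c.City AND l.Country = c.Country
GROUP BY l.LocationCode, PeriodCode, o.EmployeeID, d.ProductID
ORDER BY l.LocationCode, PeriodCode, o.EmployeeID, d.ProductID
"""

fact_rows = conn.execute(FACT_QUERY).fetchall()
for row in fact_rows:
    print(row)
```

Each result row carries only dimension keys and measures, which matches the FactTable layout in the star schema: descriptive fields such as city or product name stay in the dimension tables.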

Transfer Data Between Access Databases
Create/Query/Design View:
– 1. Create the query with the data to transfer.
– 2. Click the Make Table button, then choose to make the table in the same database or in another database.
– 3. Click Run.