Data Warehousing.

Databases support:

- Transaction Processing Systems (TPS)
  - operational-level decisions
  - recording of transactions
- Decision Support Systems (DSS)
  - tactical and strategic decision making
  - analysis of historical records

Can one database support both?

A single RDBMS would have to serve two very different workloads:

- DSS: low concurrency, large reads, significant aggregation
- TPS: high concurrency, small transactions, limited aggregation

Yes… but at a cost in performance.

The Solution…

Give each workload its own database: the TPS runs on the production database (OLTP) and the DSS runs on a data warehouse. Data moves from the production database into the warehouse through Extract, Transport & Transformation, and Load.

- The TPS doesn't need 10 years of information; it can be kept lean.
- The warehouse can pre-compute certain common queries (adding up certain totals, etc.).
- The warehouse might denormalize data: take two big tables that you know people will be joining, and go ahead and join them. Non-normalized, joined tables are only a problem when updating; they are not a problem for read-only data.
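A minimal sketch of the denormalize-and-pre-compute idea, using Python's built-in sqlite3 module in place of a real warehouse. The tables `customers`, `orders`, `orders_denorm`, and `sales_by_state`, and all of the rows, are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical OLTP-style tables: orders reference customers by key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, name TEXT, state TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, amount INTEGER);
    INSERT INTO customers VALUES (1, 'Ann', 'Utah'), (2, 'Bob', 'Arizona');
    INSERT INTO orders VALUES (10, 1, 40), (11, 1, 90), (12, 2, 70);

    -- Warehouse side: join once at load time and store the joined rows.
    CREATE TABLE orders_denorm AS
        SELECT o.order_id, o.amount, c.name, c.state
        FROM orders o JOIN customers c ON c.cust_id = o.cust_id;

    -- Pre-compute a common total so reports do not re-aggregate each time.
    CREATE TABLE sales_by_state AS
        SELECT state, SUM(amount) AS total
        FROM orders_denorm GROUP BY state;
""")

# Read-only queries now need no join and no aggregation.
denorm_rows = conn.execute(
    "SELECT order_id, name, state, amount FROM orders_denorm ORDER BY order_id"
).fetchall()
totals = dict(conn.execute("SELECT state, total FROM sales_by_state"))
```

The customer name and state are repeated on every order row, which is exactly the redundancy that would complicate updates in the production database but is harmless in a read-only copy.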

OLTP vs DW Characteristics

OLTP Database                 | Data Warehouse
High Read/Write Concurrency   | Primarily Read Only
Highly Normalized             | Highly Denormalized
Limited Transaction History   | Massive Transaction History
Very Detailed Data            | Detailed and Summarized Data
Limited External Data         | Significant External Data

"OLTP" stands for on-line transaction processing. "External data" might include interest rates, stock prices, competitor revenues, etc.

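Data Marts (3-tier approach)

[Diagram: External Data Sources and the Production Database (OLTP) feed the Data Warehouse through ETL. The Data Warehouse feeds Data Mart A, Data Mart B, and Data Mart C through Transformation & Limitation. Each data mart serves its own DSS.]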

Data Marts (bottom-up approach)

[Diagram: there is no central warehouse. The Production Database (OLTP) and External Data Sources feed Data Mart A, Data Mart B, and Data Mart C directly, each through its own ETL. Each data mart serves its own DSS.]

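Multi-dimensional (Sales) Data

[Figure: a cube of sales amounts with three dimensions: item (Soda, Diet Soda, Lime Soda, Orange Soda), state (California, Utah, Arizona), and day (March 1, March 2, March 3). The visible face holds four item amounts per state: California 80, 110, 60, 25; Utah 40, 90, 50, 30; Arizona 70, 55, 60, 35.]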

Cube Operations

- Cube (group by option)
- Slice (implemented in Oracle with a where clause)
- Dice (implemented in Oracle with a where clause)
- Drill Down (implemented in report writers)
- Roll-up (group by option)
- Pivot (implemented by Access; not implemented by Oracle until the PIVOT clause was added in Oracle 11g)
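As a rough illustration of what these operations do to the data, here is a small sketch in plain Python rather than Oracle SQL. The function names are hypothetical, and only the first two rows of `SALES` come from the slides; the others are made up.

```python
from collections import defaultdict

# (item, state, day, amount); the first two rows are from the slides,
# the remaining rows are made-up sample data.
SALES = [
    ("Soda", "California", "2004-03-01", 80),
    ("Diet Soda", "California", "2004-03-01", 110),
    ("Soda", "Utah", "2004-03-01", 40),
    ("Diet Soda", "Utah", "2004-03-02", 90),
    ("Soda", "California", "2004-03-02", 70),
]

def slice_(rows, day):
    """Slice: fix one dimension (here, day) to a single value."""
    return [r for r in rows if r[2] == day]

def dice(rows, items, states):
    """Dice: keep a sub-cube by restricting several dimensions at once."""
    return [r for r in rows if r[0] in items and r[1] in states]

def roll_up(rows):
    """Roll-up: aggregate away item and day, leaving one total per state."""
    totals = defaultdict(int)
    for _item, state, _day, amount in rows:
        totals[state] += amount
    return dict(totals)

def pivot(rows):
    """Pivot: states become rows, items become columns, cells hold sums."""
    table = defaultdict(lambda: defaultdict(int))
    for item, state, _day, amount in rows:
        table[state][item] += amount
    return {state: dict(cols) for state, cols in table.items()}
```

Slice and dice are both just filters, which is why a where clause is enough for them; drill down is the reverse of roll-up, moving from the per-state totals back to the detailed rows.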

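Cube Data Example

    -- One row per item, state, and day.
    CREATE TABLE sales (
      Item   VARCHAR2(20),
      State  VARCHAR2(20),
      Amount NUMBER(6),
      Day    DATE
    );

    -- The date strings rely on the session's default date format.
    INSERT INTO sales VALUES ('Soda', 'California', 80, '01-Mar-2004');
    INSERT INTO sales VALUES ('Diet Soda', 'California', 110, '01-Mar-2004');
    …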

Examine these queries

    SELECT * FROM sales;

    SELECT Item, State, SUM(Amount) FROM sales GROUP BY Item, State;
    SELECT Item, State, SUM(Amount) FROM sales GROUP BY ROLLUP(Item, State);

    SELECT State, Item, SUM(Amount) FROM sales GROUP BY ROLLUP(State, Item);
    SELECT State, Item, SUM(Amount) FROM sales GROUP BY CUBE(State, Item);
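To see how the ROLLUP and CUBE queries above differ, the following sketch reproduces their grouping behaviour in plain Python. It is an emulation for illustration, not how Oracle evaluates the queries; the helper names are hypothetical, and the Utah row is made up.

```python
from itertools import combinations

ROWS = [
    {"Item": "Soda", "State": "California", "Amount": 80},
    {"Item": "Diet Soda", "State": "California", "Amount": 110},
    {"Item": "Soda", "State": "Utah", "Amount": 40},  # made-up row
]

def rollup_sets(cols):
    """ROLLUP(a, b) groups by (a, b), then (a), then (): prefixes only."""
    return [tuple(cols[:n]) for n in range(len(cols), -1, -1)]

def cube_sets(cols):
    """CUBE(a, b) groups by every subset of the listed columns."""
    return [c for n in range(len(cols), -1, -1) for c in combinations(cols, n)]

def aggregate(rows, cols, grouping_sets):
    """Sum Amount per grouping set; None marks a column that was rolled up."""
    out = {}
    for gs in grouping_sets:
        for row in rows:
            key = tuple(row[c] if c in gs else None for c in cols)
            out[key] = out.get(key, 0) + row["Amount"]
    return out

cols = ["Item", "State"]
rollup_result = aggregate(ROWS, cols, rollup_sets(cols))
cube_result = aggregate(ROWS, cols, cube_sets(cols))
```

ROLLUP(Item, State) yields per-item subtotals and a grand total but no per-state subtotals, so the column order matters. CUBE adds the per-state subtotals as well, which is why its result has more rows.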

Materialized Views

We looked at materialized views earlier this year. Materialized views are one of the primary tools for data warehousing in Oracle. Recall that materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data.

Unlike regular views, they are not constructed when requested (like a query) but are actually materialized in secondary storage, which makes them much faster to read.

In data warehouses, materialized views are used to precompute and store aggregated data such as sums and averages. Materialized views in these environments are typically referred to as summaries because they store summarized data.
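The difference between a regular view and a stored summary can be sketched with Python's sqlite3 module. SQLite has no materialized views, so the summary here is an ordinary table rebuilt by a hypothetical `refresh_summary` helper; this only imitates the idea and is not Oracle's mechanism. The names `state_totals_v` and `state_totals_mv` and the Lime Soda amount are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (item TEXT, state TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('Soda', 'California', 80),
                             ('Diet Soda', 'California', 110);
    -- Regular view: the query is re-run every time the view is read.
    CREATE VIEW state_totals_v AS
        SELECT state, SUM(amount) AS total FROM sales GROUP BY state;
""")

def refresh_summary(conn):
    """Rebuild the stored summary table (a stand-in for a full refresh)."""
    conn.executescript("""
        DROP TABLE IF EXISTS state_totals_mv;
        CREATE TABLE state_totals_mv AS
            SELECT state, SUM(amount) AS total FROM sales GROUP BY state;
    """)

def total(conn, relation, state):
    # relation is one of the two fixed names defined above, never user input.
    query = f"SELECT total FROM {relation} WHERE state = ?"
    return conn.execute(query, (state,)).fetchone()[0]

refresh_summary(conn)
conn.execute("INSERT INTO sales VALUES ('Lime Soda', 'California', 60)")

view_total = total(conn, "state_totals_v", "California")    # recomputed on read
stale_total = total(conn, "state_totals_mv", "California")  # stored, not yet refreshed
refresh_summary(conn)
fresh_total = total(conn, "state_totals_mv", "California")
```

The stored summary answers without touching the detail rows, at the price of needing a refresh after the underlying data changes.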

RDBMS Star Schema

A star schema presents a set of tables that are centered around an individual event of interest. This makes it easy for users of mining and reporting applications to analyze the event across many dimensions.

Sales (center of the star): SalesNO, SalesUnits, SalesDollars, SalesCost, ItemID, CustID, StoreID, DayID
Item:     ItemID, Name, UnitPrice, Brand, Category
Store:    StoreID, Manager, Street, City, Zip
Customer: CustID, Name, Phone, Street, City
Day:      DayID, DayOfMonth, Month, Year, DayOfWeek
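As an illustration of analyzing the central event across dimensions, this sketch builds a cut-down version of the schema with Python's sqlite3 module and totals SalesDollars by two different dimensions. Only a subset of the columns is created, the Customer table is left out for brevity, and every row is made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables (subset of the columns in the schema).
    CREATE TABLE Item  (ItemID INTEGER PRIMARY KEY, Name TEXT, Brand TEXT, Category TEXT);
    CREATE TABLE Store (StoreID INTEGER PRIMARY KEY, City TEXT);
    CREATE TABLE Day   (DayID INTEGER PRIMARY KEY, DayOfMonth INTEGER, Month INTEGER, Year INTEGER);

    -- Central table: one row per sale, pointing at each dimension.
    CREATE TABLE Sales (
        SalesNO INTEGER PRIMARY KEY, SalesUnits INTEGER, SalesDollars INTEGER,
        ItemID INTEGER REFERENCES Item,
        StoreID INTEGER REFERENCES Store,
        DayID INTEGER REFERENCES Day
    );

    -- Made-up sample rows.
    INSERT INTO Item  VALUES (1, 'Soda', 'Acme', 'Regular'), (2, 'Diet Soda', 'Acme', 'Diet');
    INSERT INTO Store VALUES (1, 'Provo'), (2, 'Tempe');
    INSERT INTO Day   VALUES (1, 1, 3, 2004), (2, 2, 3, 2004);
    INSERT INTO Sales VALUES (1, 8, 80, 1, 1, 1), (2, 11, 110, 2, 1, 1), (3, 4, 40, 1, 2, 2);
""")

# Analyze the same event by item category and store city.
by_category_city = conn.execute("""
    SELECT i.Category, s.City, SUM(f.SalesDollars)
    FROM Sales f
    JOIN Item i  ON i.ItemID = f.ItemID
    JOIN Store s ON s.StoreID = f.StoreID
    GROUP BY i.Category, s.City
    ORDER BY i.Category, s.City
""").fetchall()

# Analyze it again by day, without touching the other dimensions.
by_day = conn.execute("""
    SELECT d.DayOfMonth, SUM(f.SalesDollars)
    FROM Sales f JOIN Day d ON d.DayID = f.DayID
    GROUP BY d.DayOfMonth
    ORDER BY d.DayOfMonth
""").fetchall()
```

Each query joins the central table to only the dimensions it needs, and every join is a single key lookup from Sales to one dimension table.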