1
Data Warehousing
2
Databases support:
Transaction Processing Systems (TPS)
    operational level decisions
    recording of transactions
Decision Support Systems (DSS)
    tactical and strategic decision making
    analysis of historical records
3
Can one database support both?
[Diagram: a single RDBMS serving both the DSS and the TPS]
4
Can one database support both?
[Diagram: a single RDBMS serving both the DSS and the TPS]
DSS: low concurrency, large reads, significant aggregation
TPS: high concurrency, small transactions, limited aggregation
Yes… but at a cost in performance.
5
The Solution…
[Diagram: the TPS runs on the Production Database (OLTP); data moves through Extract, Transport & Transformation, and Load into the Data Warehouse, which serves the DSS]
The TPS doesn't need 10 years of information, so the production database can be kept lean. In the warehouse you can also pre-compute certain common queries (adding up certain totals, etc.). You might also *denormalize* data: take two big tables that you know people will be joining and store them already joined. Non-normalized, joined tables are only a problem when updating; they are not a problem for read-only data.
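The denormalize-and-pre-compute idea can be sketched in a few lines. This is a minimal illustration only: it uses Python's built-in `sqlite3` module instead of Oracle so that it runs anywhere, and the table names (`customer`, `orders`, `dw_orders`, `dw_sales_by_state`) and all rows are hypothetical, not taken from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized OLTP-style tables (hypothetical names and rows)
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT, state TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, cust_id INTEGER, amount INTEGER);
    INSERT INTO customer VALUES (1, 'Ann', 'Utah'), (2, 'Bob', 'Arizona');
    INSERT INTO orders VALUES (10, 1, 40), (11, 1, 90), (12, 2, 70);

    -- Warehouse side: join once at load time and store the joined rows
    CREATE TABLE dw_orders AS
        SELECT o.order_id, o.amount, c.name, c.state
        FROM orders o JOIN customer c ON c.cust_id = o.cust_id;

    -- Pre-compute a total that reports are expected to ask for often
    CREATE TABLE dw_sales_by_state AS
        SELECT state, SUM(amount) AS total FROM dw_orders GROUP BY state;
""")

# Read-only queries now hit the pre-joined, pre-summed tables directly
joined_count = conn.execute("SELECT COUNT(*) FROM dw_orders").fetchone()[0]
totals = dict(conn.execute("SELECT state, total FROM dw_sales_by_state"))
print(joined_count, totals)
```

The redundancy in `dw_orders` (the customer's name and state repeated on every order) would be awkward to keep consistent under updates, which is why this layout suits the read-only warehouse rather than the production database.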
6
OLTP vs DW Characteristics
OLTP Database                      Data Warehouse
High Read/Write Concurrency        Primarily Read Only
Highly Normalized                  Highly Denormalized
Limited Transaction History        Massive Transaction History
Very Detailed Data                 Detailed and Summarized Data
Limited External Data              Significant External Data

"OLTP" stands for on-line transaction processing. "External data" might include interest rates, stock prices, competitor revenues, etc.
7
Data Marts (3-tier approach)
[Diagram: External Data Sources and the Production Database (OLTP) feed the Data Warehouse through ETL; the warehouse feeds Data Marts A, B, and C through Transformation & Limitation; each data mart serves its own DSS]
8
Data Marts (bottom-up approach)
[Diagram: External Data Sources and the Production Database (OLTP) feed Data Marts A, B, and C directly through ETL, with no central data warehouse; each data mart serves its own DSS]
9
Multi-dimensional (Sales) Data
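[Diagram: a sales cube with three dimensions]
Item: Soda, Diet Soda, Lime Soda, Orange Soda
State: California, Utah, Arizona
Day: March 1, March 2, March 3
Sales amounts shown (one row per state, one value per item):
California    80   110   60   25
Utah          40    90   50   30
Arizona       70    55   60   35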
10
Cube Operations
Cube (group by option)
Slice (implemented in Oracle with a where clause)
Dice (implemented in Oracle with a where clause)
Drill Down (implemented in report writers)
Roll-up (group by option)
Pivot (implemented by Access; Oracle added a PIVOT clause in 11g)
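Slice and dice are both plain where-clause filters, which a short sketch can make concrete. The example below is an illustration under assumptions: it runs on Python's built-in `sqlite3` rather than Oracle, stores dates as ISO text, and only the first two rows come from the slides; the others are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, state TEXT, amount INTEGER, day TEXT)")
rows = [
    ("Soda", "California", 80, "2004-03-01"),        # from the slides
    ("Diet Soda", "California", 110, "2004-03-01"),  # from the slides
    ("Soda", "Utah", 40, "2004-03-01"),              # illustrative row
    ("Soda", "Arizona", 70, "2004-03-01"),           # illustrative row
    ("Soda", "California", 75, "2004-03-02"),        # illustrative row
]
conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)

# Slice: fix ONE dimension to a single value (here: one day)
slice_rows = conn.execute(
    "SELECT item, state, amount FROM sales "
    "WHERE day = '2004-03-01' ORDER BY item, state"
).fetchall()

# Dice: restrict SEVERAL dimensions to subsets, giving a smaller sub-cube
dice_rows = conn.execute(
    "SELECT day, state, amount FROM sales "
    "WHERE item = 'Soda' AND state IN ('California', 'Utah') "
    "ORDER BY day, state"
).fetchall()

print(slice_rows)
print(dice_rows)
```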
11
Cube Data Example
Create table sales (
    Item   varchar2(20),
    State  varchar2(20),
    Amount number(6),
    Day    date);
Insert into sales values('Soda','California',80,to_date('01-Mar-2004','DD-Mon-YYYY'));
Insert into sales values('Diet Soda','California',110,to_date('01-Mar-2004','DD-Mon-YYYY'));
…
12
Examine these queries
Select * from sales;
Select Item, State, sum(amount) from sales group by Item, State;
Select Item, State, sum(amount) from sales group by Rollup(Item, State);
Select State, Item, sum(amount) from sales group by Rollup(State, Item);
Select State, Item, sum(amount) from sales group by Cube(State, Item);
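To see what Rollup and Cube add on top of a plain group by, the sketch below spells out the grouping levels by hand. It is an emulation, not Oracle's implementation: SQLite (via Python's built-in `sqlite3`) has no ROLLUP or CUBE, so each level is written as a separate query joined with UNION ALL. The first two rows come from the slides; the Utah row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, state TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("Soda", "California", 80),        # from the slides
        ("Diet Soda", "California", 110),  # from the slides
        ("Soda", "Utah", 40),              # illustrative row
    ],
)

# Rollup(Item, State): detail rows, a subtotal per item, and a grand total
ROLLUP_SQL = """
SELECT item, state, SUM(amount) FROM sales GROUP BY item, state
UNION ALL
SELECT item, NULL, SUM(amount) FROM sales GROUP BY item
UNION ALL
SELECT NULL, NULL, SUM(amount) FROM sales
"""

# Cube(Item, State): everything Rollup gives, plus a subtotal per state
CUBE_SQL = ROLLUP_SQL + """
UNION ALL
SELECT NULL, state, SUM(amount) FROM sales GROUP BY state
"""

rollup = set(conn.execute(ROLLUP_SQL))
cube = set(conn.execute(CUBE_SQL))
print(sorted(cube - rollup, key=str))  # the rows only Cube produces
```

Rollup is order-sensitive: Rollup(State, Item) would produce per-state subtotals instead of per-item ones, while Cube produces both regardless of argument order.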
13
Materialized Views
We looked at materialized views earlier this year. Materialized views are one of the primary tools for data warehousing in Oracle. Recall that materialized views are schema objects that can be used to summarize, precompute, replicate, and distribute data. Unlike regular views, they are not constructed when requested (like a query) but are actually materialized in secondary storage, which makes them much faster to read. In data warehouses, materialized views are used to precompute and store aggregated data such as sums and averages. Materialized views in these environments are typically referred to as summaries because they store summarized data.
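The difference between a stored summary and a regular view shows up when the base table changes. The sketch below imitates a materialized view with an ordinary summary table that is rebuilt on demand; this is an assumption made so the example runs on Python's built-in `sqlite3`, which has no materialized views, and the helper `refresh_summary` and the table `sales_by_state` are hypothetical names. The third sales row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (item TEXT, state TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("Soda", "California", 80), ("Diet Soda", "California", 110)],  # from the slides
)

def refresh_summary(conn):
    """Rebuild the stored summary from scratch (a stand-in for a complete refresh)."""
    conn.execute("DROP TABLE IF EXISTS sales_by_state")
    conn.execute(
        "CREATE TABLE sales_by_state AS "
        "SELECT state, SUM(amount) AS total FROM sales GROUP BY state"
    )

def california_total(conn):
    """Read the precomputed total; no aggregation happens at query time."""
    return conn.execute(
        "SELECT total FROM sales_by_state WHERE state = 'California'"
    ).fetchone()[0]

refresh_summary(conn)
conn.execute("INSERT INTO sales VALUES ('Soda', 'California', 75)")  # illustrative row

stale = california_total(conn)   # summary was stored before the insert
refresh_summary(conn)
fresh = california_total(conn)   # summary now reflects the new row
print(stale, fresh)
```

The stored result is fast to read but can lag behind the base table until it is refreshed, which is an acceptable trade-off in a warehouse that is loaded periodically.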
14
RDBMS Star Schema
A star schema presents a set of tables that are centered around an individual event of interest. This makes it easy for users of mining and reporting applications to analyze the event across many dimensions.
[Diagram: the Sales table at the center, linked to the Item, Store, Customer, and Day tables]
Sales: SalesNO, SalesUnits, SalesDollars, SalesCost, ItemID, CustID, StoreID, DayID
Item: ItemID, Name, UnitPrice, Brand, Category
Store: StoreID, Manager, Street, City, Zip
Customer: CustID, Name, Phone, Street, City
Day: DayID, DayOfMonth, Month, Year, DayOfWeek
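A typical star-schema query joins the central Sales table to whichever dimension tables the analysis needs and then groups by dimension attributes. The sketch below is a trimmed illustration: it uses Python's built-in `sqlite3` instead of Oracle, keeps only the Item and Day dimensions (Store and Customer are omitted for brevity), and every row, including the brand name, is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables (column names from the slide, rows hypothetical)
    CREATE TABLE Item (ItemID INTEGER PRIMARY KEY, Name TEXT, UnitPrice REAL,
                       Brand TEXT, Category TEXT);
    CREATE TABLE Day (DayID INTEGER PRIMARY KEY, DayOfMonth INTEGER, Month INTEGER,
                      Year INTEGER, DayOfWeek TEXT);

    -- Fact table: one row per sale, with a foreign key to each dimension
    CREATE TABLE Sales (SalesNO INTEGER PRIMARY KEY, SalesUnits INTEGER,
                        SalesDollars REAL, SalesCost REAL,
                        ItemID INTEGER REFERENCES Item,
                        DayID INTEGER REFERENCES Day);

    INSERT INTO Item VALUES (1, 'Soda', 1.0, 'BrandA', 'Beverage'),
                            (2, 'Diet Soda', 1.0, 'BrandA', 'Beverage');
    INSERT INTO Day VALUES (1, 1, 3, 2004, 'Mon'), (2, 2, 3, 2004, 'Tue');
    INSERT INTO Sales VALUES (100, 80, 80.0, 50.0, 1, 1),
                             (101, 110, 110.0, 70.0, 2, 1),
                             (102, 75, 75.0, 45.0, 1, 2);
""")

# Analyze the sales event across two dimensions: item name and month
dollars_by_item_month = conn.execute("""
    SELECT i.Name, d.Month, SUM(s.SalesDollars)
    FROM Sales s
    JOIN Item i ON i.ItemID = s.ItemID
    JOIN Day d ON d.DayID = s.DayID
    GROUP BY i.Name, d.Month
    ORDER BY i.Name
""").fetchall()
print(dollars_by_item_month)
```

Swapping `i.Name` for `i.Brand` or `d.Month` for `d.DayOfWeek` changes the analysis without touching the fact table, which is the convenience the star layout is meant to provide.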