12/18/2014 PSU’s CS 587-3 25. Data Warehouses and Decision Support. Len Shapiro, for CS386, 11/2-3/05. Some slides taken from Ramakrishnan and Gehrke, with permission.

Presentation transcript:

Data Warehouses and Decision Support
Len Shapiro, for CS386, 11/2-3/05. Some slides taken from Ramakrishnan and Gehrke, with permission.

Overview
• Increasingly, organizations are analyzing current and historical data to identify useful patterns and support business strategies.
• Emphasis is on complex, interactive, exploratory analysis of very large datasets created by integrating data from across all parts of an enterprise; data is fairly static.
• Contrast such On-Line Analytic Processing (OLAP) with traditional On-Line Transaction Processing (OLTP): OLAP consists mostly of long queries, instead of short update transactions.

Enterprise Applications
OLTP / Operational / Production | DSS / Warehouse / DataMart
Operate the business / Clerks | Diagnose the business / Managers
Short queries, small amounts of data | Opposite (long queries, large amounts of data)
Queries change data | Opposite (queries mostly read data)
Customer inquiry, order entry, etc. | Statistics, visualization, data mining, etc.
Legacy applications, heterogeneous databases | Opposite
Often distributed | Often centralized (warehouse)
Current data | Current and historical data

Data Warehouse Requirements
Operational | Warehouse
General E-R diagrams | Multidimensional data model common
Locks necessary | No locks necessary
Crash recovery required | Crash recovery optional
Smaller volume of data | Huge volume of data
Need indexes designed to access small amounts of data | Need indexes designed to access large volumes of data

Data Warehousing
• Integrated data spanning long time periods, often augmented with summary information.
• Several gigabytes to terabytes common.
• Interactive response times expected for complex queries; ad-hoc updates uncommon.
[Figure: external data sources are extracted, transformed, loaded, and refreshed into the data warehouse, which has a metadata repository and supports OLAP and data mining.]

Multidimensional Data Model
• Collection of numeric measures, which depend on a set of dimensions.
  ◦ E.g., measure Sales, dimensions Product (key: pid), Location (locid), and Time (timeid).
[Figure: Sales cube with axes pid, timeid, and locid; the slice locid=1 is shown as a table with columns pid, timeid, locid, sales. Cell values are not recoverable.]

Examples of Dimensional Data
• Products(ProductID, StoreID, DateID, Sale)
  ◦ Product(ID, SKU, size, brand)
  ◦ Store(ID, Address, Sales District, Region, Manager)
  ◦ Date(Date, Week, Month, Holiday, Promotion)
• Claims(ProvID, MembID, Procedure, DateID, Cost)
  ◦ Providers(ID, Practice, Address, ZIP, City, State)
  ◦ Members(ID, Contract, Name, Address)
  ◦ Procedure(ID, Name, Type)
• Telecomm(CustID, SalesRepID, ServiceID, DateID)
  ◦ SalesRep(ID, Address, Sales District, Region, Manager)
  ◦ Service(ID, Name, Category)

MOLAP vs ROLAP
• Multidimensional data can be stored physically in a (disk-resident, persistent) array; such systems are called MOLAP systems. Alternatively, the data can be stored as a relation; such systems are called ROLAP systems.
• The main relation, which relates dimensions to a measure, is called the fact table. Each dimension can have additional attributes and an associated dimension table.
  ◦ E.g., Products(pid, pname, category, price)
  ◦ Fact tables are much larger than dimension tables.
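The two storage options can be sketched side by side. This is only an illustration under stated assumptions: the sample facts are made up, an in-memory SQLite table stands in for a ROLAP fact table, and a nested Python list stands in for the disk-resident array a real MOLAP system would use. The helper names (positions, rolap_lookup, molap_lookup) are hypothetical.

```python
import sqlite3

# Hypothetical sample facts: (pid, timeid, locid, sales).
FACTS = [(11, 1, 1, 25), (11, 2, 1, 8), (12, 1, 1, 30), (13, 3, 2, 15)]

# ROLAP: store the fact table as a relation.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales (pid INTEGER, timeid INTEGER, locid INTEGER, sales INTEGER)"
)
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?)", FACTS)


def rolap_lookup(pid, timeid, locid):
    """Find one cell by selecting on the fact table's dimension keys."""
    row = conn.execute(
        "SELECT sales FROM Sales WHERE pid = ? AND timeid = ? AND locid = ?",
        (pid, timeid, locid),
    ).fetchone()
    return row[0] if row else 0


def positions(values):
    """Map each distinct dimension value to an array position."""
    return {value: index for index, value in enumerate(sorted(set(values)))}


# MOLAP: store the same measure in a dense array with one axis per dimension.
pid_pos = positions(f[0] for f in FACTS)
time_pos = positions(f[1] for f in FACTS)
loc_pos = positions(f[2] for f in FACTS)

cube = [[[0] * len(loc_pos) for _ in time_pos] for _ in pid_pos]
for pid, timeid, locid, sales in FACTS:
    cube[pid_pos[pid]][time_pos[timeid]][loc_pos[locid]] = sales


def molap_lookup(pid, timeid, locid):
    """Find one cell by direct array indexing."""
    return cube[pid_pos[pid]][time_pos[timeid]][loc_pos[locid]]
```

The array reserves a cell for every combination of dimension values, including combinations with no sales, while the relation stores only the facts that exist.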

Dimension Hierarchies
• For each dimension, some of the attributes may be organized in a hierarchy:
[Figure: hierarchy diagrams for three dimensions. PRODUCT: PID, pname, category. TIME: date, week, quarter, year. LOCATION: ZIP, city, state.]

OLAP Queries
• Influenced by SQL and by spreadsheets.
• A common operation is to aggregate a measure over one or more dimensions.
  ◦ Find total sales.
  ◦ Find total sales for each city, or for each state.
  ◦ Find top five products ranked by total sales.
• Roll-up: Aggregating at different levels of a dimension hierarchy.
  ◦ E.g., given total sales by city, we can roll up to get sales by state.
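The aggregate queries and the roll-up above can be tried on a toy ROLAP schema. This is a minimal sketch, not the deck's own code: the Sales and Locations tables follow the column names used in these slides, but the rows are made up for illustration, city names are assumed to be unique, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Locations (locid INTEGER PRIMARY KEY, city TEXT, state TEXT);
CREATE TABLE Sales (pid INTEGER, timeid INTEGER, locid INTEGER, sales INTEGER);
-- Hypothetical sample rows.
INSERT INTO Locations VALUES
    (1, 'Madison', 'WI'), (2, 'Milwaukee', 'WI'), (3, 'Fresno', 'CA');
INSERT INTO Sales VALUES
    (11, 1, 1, 25), (11, 1, 2, 10), (12, 1, 1, 30), (12, 2, 3, 20), (13, 2, 3, 5);
""")

# Find total sales.
total_sales = conn.execute("SELECT SUM(sales) FROM Sales").fetchone()[0]

# Find total sales for each city, and for each state.
sales_by_city = dict(conn.execute("""
    SELECT L.city, SUM(S.sales)
    FROM Sales S JOIN Locations L ON S.locid = L.locid
    GROUP BY L.city
"""))
sales_by_state = dict(conn.execute("""
    SELECT L.state, SUM(S.sales)
    FROM Sales S JOIN Locations L ON S.locid = L.locid
    GROUP BY L.state
"""))

# Find the top five products ranked by total sales.
top_products = conn.execute("""
    SELECT pid, SUM(sales) AS total
    FROM Sales
    GROUP BY pid
    ORDER BY total DESC
    LIMIT 5
""").fetchall()

# Roll-up: derive the state totals from the city totals by moving one
# level up the Location hierarchy (city -> state).
city_to_state = dict(conn.execute("SELECT city, state FROM Locations"))
rolled_up = {}
for city, total in sales_by_city.items():
    state = city_to_state[city]
    rolled_up[state] = rolled_up.get(state, 0) + total
```

The roll-up reuses the finer-grained city totals rather than rescanning the fact table, which is why summary tables at one level of a hierarchy can answer queries at coarser levels.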

OLAP Queries (continued)
• Drill-down: The inverse of roll-up.
  ◦ E.g., given total sales by state, we can drill down to get total sales by city.
  ◦ E.g., we can also drill down on a different dimension to get total sales by product for each state.
• Pivoting: Aggregation on selected dimensions.
  ◦ E.g., pivoting on Location and Time yields this cross-tabulation:
  [Table: cross-tabulation with columns WI, CA, and Total, plus a Total row; the row labels and cell values are not recoverable.]
• Slicing and Dicing: Equality and range selections on one or more dimensions.
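Drill-down, slicing, and dicing can each be written as an ordinary SQL query over a fact table. The sketch below assumes the Sales and Locations column names used in these slides; the rows are made up for illustration, and the query helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Locations (locid INTEGER PRIMARY KEY, city TEXT, state TEXT);
CREATE TABLE Sales (pid INTEGER, timeid INTEGER, locid INTEGER, sales INTEGER);
-- Hypothetical sample rows.
INSERT INTO Locations VALUES
    (1, 'Madison', 'WI'), (2, 'Milwaukee', 'WI'), (3, 'Fresno', 'CA');
INSERT INTO Sales VALUES
    (11, 1, 1, 25), (11, 1, 2, 10), (12, 1, 1, 30), (12, 2, 3, 20), (13, 2, 3, 5);
""")


def query(sql, params=()):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql, params).fetchall()


# Drill-down within the Location hierarchy: state totals split by city.
by_state_city = query("""
    SELECT L.state, L.city, SUM(S.sales)
    FROM Sales S JOIN Locations L ON S.locid = L.locid
    GROUP BY L.state, L.city
    ORDER BY L.state, L.city
""")

# Drill-down on a different dimension: state totals split by product.
by_state_product = query("""
    SELECT L.state, S.pid, SUM(S.sales)
    FROM Sales S JOIN Locations L ON S.locid = L.locid
    GROUP BY L.state, S.pid
    ORDER BY L.state, S.pid
""")

# Slice: an equality selection on one dimension (locid = 1).
slice_loc1 = query(
    "SELECT pid, timeid, sales FROM Sales WHERE locid = ? ORDER BY pid, timeid",
    (1,),
)

# Dice: a range selection on Time combined with a selection on Location.
dice_total = query(
    "SELECT SUM(sales) FROM Sales WHERE timeid BETWEEN ? AND ? AND locid IN (2, 3)",
    (1, 2),
)[0][0]
```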

Comparison with SQL Queries
• The cross-tabulation obtained by pivoting can also be computed using a collection of SQL queries:

    SELECT SUM(S.sales)
    FROM Sales S, Times T, Locations L
    WHERE S.timeid = T.timeid AND S.locid = L.locid
    GROUP BY T.year, L.state

    SELECT SUM(S.sales)
    FROM Sales S, Times T
    WHERE S.timeid = T.timeid
    GROUP BY T.year

    SELECT SUM(S.sales)
    FROM Sales S, Locations L
    WHERE S.locid = L.locid
    GROUP BY L.state
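A runnable version of this collection of queries is sketched below. It makes a few assumptions: the rows and the years 1995 and 1996 are made up for illustration, Sales is joined to Locations on locid (the key of the Location dimension), the grouping columns are added to each SELECT list so the totals can be labeled, and a fourth query supplies the grand total in the corner of the cross-tabulation. The print_crosstab helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Times (timeid INTEGER PRIMARY KEY, year INTEGER);
CREATE TABLE Locations (locid INTEGER PRIMARY KEY, state TEXT);
CREATE TABLE Sales (pid INTEGER, timeid INTEGER, locid INTEGER, sales INTEGER);
-- Hypothetical sample rows.
INSERT INTO Times VALUES (1, 1995), (2, 1996);
INSERT INTO Locations VALUES (1, 'WI'), (2, 'CA');
INSERT INTO Sales VALUES
    (11, 1, 1, 25), (11, 2, 1, 8), (12, 1, 2, 30), (12, 2, 2, 20), (13, 2, 1, 5);
""")

# Inner cells: total sales for each (year, state) pair.
cells = {
    (year, state): total
    for year, state, total in conn.execute("""
        SELECT T.year, L.state, SUM(S.sales)
        FROM Sales S, Times T, Locations L
        WHERE S.timeid = T.timeid AND S.locid = L.locid
        GROUP BY T.year, L.state
    """)
}

# Row totals: total sales for each year.
row_totals = dict(conn.execute("""
    SELECT T.year, SUM(S.sales)
    FROM Sales S, Times T
    WHERE S.timeid = T.timeid
    GROUP BY T.year
"""))

# Column totals: total sales for each state.
col_totals = dict(conn.execute("""
    SELECT L.state, SUM(S.sales)
    FROM Sales S, Locations L
    WHERE S.locid = L.locid
    GROUP BY L.state
"""))

# Corner cell: the grand total.
grand_total = conn.execute("SELECT SUM(sales) FROM Sales").fetchone()[0]


def print_crosstab():
    """Print the cross-tabulation with one row per year and one column per state."""
    states = sorted(col_totals)
    print("year  " + "".join(f"{s:>7}" for s in states) + f"{'Total':>7}")
    for year in sorted(row_totals):
        row = "".join(f"{cells.get((year, s), 0):>7}" for s in states)
        print(f"{year:<6}{row}{row_totals[year]:>7}")
    footer = "".join(f"{col_totals[s]:>7}" for s in states)
    print(f"{'Total':<6}{footer}{grand_total:>7}")


print_crosstab()
```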

The CUBE Operator
• Generalizing the previous example, if there are k dimensions, we have 2^k possible SQL GROUP BY queries that can be generated through pivoting on a subset of dimensions.
• CUBE pid, locid, timeid BY SUM Sales
  ◦ Equivalent to rolling up Sales on all eight subsets of the set {pid, locid, timeid}; each roll-up corresponds to an SQL query of the form:

    SELECT SUM(S.sales)
    FROM Sales S
    GROUP BY grouping-list

• Much work has been done on optimizing the CUBE operator!
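The equivalence between CUBE and 2^k GROUP BY queries can be checked directly by generating one query per subset of the dimensions. The sketch below is a naive emulation, not an optimized CUBE implementation: it assumes made-up sample rows, uses SQLite (which has no CUBE operator), and interpolates column names into the SQL text, which is safe here only because they come from a fixed constant. The cube function name is hypothetical.

```python
import sqlite3
from itertools import combinations

DIMENSIONS = ("pid", "locid", "timeid")

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales (pid INTEGER, timeid INTEGER, locid INTEGER, sales INTEGER)"
)
# Hypothetical sample rows: (pid, timeid, locid, sales).
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?, ?)",
    [(11, 1, 1, 25), (11, 2, 1, 8), (12, 1, 2, 30), (12, 2, 2, 20), (13, 2, 1, 5)],
)


def cube(conn, dimensions):
    """Run one GROUP BY query per subset of the dimensions.

    Returns a dict mapping each grouping list (a tuple of column names) to a
    dict from group key to SUM(sales).
    """
    results = {}
    for size in range(len(dimensions) + 1):
        for grouping in combinations(dimensions, size):
            if grouping:
                columns = ", ".join(grouping)
                sql = f"SELECT {columns}, SUM(sales) FROM Sales GROUP BY {columns}"
            else:
                # The empty grouping list is the grand total.
                sql = "SELECT SUM(sales) FROM Sales"
            results[grouping] = {
                tuple(row[:-1]): row[-1] for row in conn.execute(sql)
            }
    return results


cube_result = cube(conn, DIMENSIONS)
for grouping, groups in cube_result.items():
    print(grouping, groups)
```

Each of the eight queries scans the fact table separately here. The optimization work mentioned above aims to avoid that, for example by computing coarser groupings from finer ones.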

Summary
• Decision support is an emerging, rapidly growing subarea of databases.
• Involves the creation of large, consolidated data repositories called data warehouses.
• Warehouses are exploited using sophisticated analysis techniques: complex SQL queries and OLAP “multidimensional” queries (influenced by both SQL and spreadsheets).
• New techniques for database design, indexing, view maintenance, and interactive querying need to be supported (CS587).