Some OLAP Issues
CMPT 455/826 - Week 9, Day 2, Jan-Apr 2009 (w9d2)


OLAP Features To Consider In A Data Warehousing System (based on Gorla)

Gaining Acceptance
- Hypothesis:
  - New technology won’t be utilized effectively if it isn’t accepted
  - and acceptance is based on:
    - perceived usefulness (PU): the degree to which a person believes that using a particular system would enhance his or her job performance
    - perceived ease of use (PEU): the degree to which a person believes that using a particular system would be free of effort

OLAP Features: Visualization
- allows users to create summary tables and charts interactively
- Measures: the presence of
  - multidimensional tables
  - multidimensional graphics

OLAP Features: Summarization
- the “degree of aggregation” of information
  - i.e. supporting directed acyclic graphs within a dimension
- Measures:
  - the number of hierarchies allowed (in a single dimension)
  - the level of detail
  - the capability to swap between summarized and detailed levels

OLAP Features: Navigation
- the capability to drill down or roll up between levels of detail
- the capability to get to the information you want
  - drill-down is going from more general to more detailed information in a particular domain (e.g. changing location focus from state to city)
  - roll-up is going from more detailed to more general information in a particular domain (e.g. changing location focus from city to state)
  - slicing is selecting certain rows in a table and ignoring the rest
  - dicing is selecting certain attributes in a table and ignoring the rest
- Measures:
  - shareability (number of concurrent users allowed)
  - data navigability (availability of drill-down, slicing-dicing, and drag-drop facilities)
  - the ability to extract detailed and real-time data
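The four navigation operations can be sketched on a tiny in-memory table. This is only an illustration: the fact rows, column order, and function names below are hypothetical, and slicing and dicing follow the definitions given on this slide (rows and attributes respectively).

```python
from collections import defaultdict

# Hypothetical fact rows: (state, city, product, sales)
FACTS = [
    ("BC", "Vancouver", "tea", 10),
    ("BC", "Vancouver", "coffee", 20),
    ("BC", "Burnaby", "tea", 5),
    ("AB", "Calgary", "tea", 7),
]

def total_by(rows, key_index):
    """Sum the sales column, grouped by the column at key_index."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[3]
    return dict(totals)

def slice_rows(rows, state):
    """Slicing as defined above: keep certain rows, ignore the rest."""
    return [row for row in rows if row[0] == state]

def dice_columns(rows, indexes):
    """Dicing as defined above: keep certain attributes, ignore the rest."""
    return [tuple(row[i] for i in indexes) for row in rows]

by_state = total_by(FACTS, 0)                # roll-up: city detail summarized to state
by_city = total_by(FACTS, 1)                 # drill-down: state totals broken out by city
bc_only = slice_rows(FACTS, "BC")            # rows for one state only
product_sales = dice_columns(FACTS, (2, 3))  # only the product and sales attributes
```

Roll-up and drill-down are the same grouping step applied at different levels of the location hierarchy, which is why one helper serves both.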

OLAP Features: Query Function
- Query engines extract data from multidimensional databases and generate outputs in 3D graphics
- Measures:
  - using pre-constructed query capability
  - simple query building with click-select feature
  - query building with query languages
  - concurrent run of queries

OLAP Features: Sophisticated Analysis
- Measures: (six most common types of analyses used in decision support)
  - statistical profiling (e.g. list customers with highest combined sales)
  - moving averages
  - cross dimension comparison (e.g. compare product sales by region over a period of time)
  - queries with self-defined formula
  - exception condition
  - what-if analysis
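Two of the analyses listed above, moving averages and exception conditions, are small enough to sketch directly. The sales figures, the window size, and the threshold are hypothetical, and a real OLAP tool would expose these as built-in functions rather than hand-written code.

```python
def moving_average(values, window):
    """Return the mean of each run of `window` consecutive values."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical monthly sales figures.
monthly_sales = [10, 20, 30, 40]

# Moving average over a two-month window.
smoothed = moving_average(monthly_sales, 2)

# Exception condition: report the positions of months below a threshold.
THRESHOLD = 15
exceptions = [i for i, value in enumerate(monthly_sales) if value < THRESHOLD]
```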

OLAP Features: Dimensionality
- Measures:
  - the number of allowable dimensions
  - capability to redefine a dimension
  - time for data refresh after redefinition

OLAP Features: Performance
- Measures: (response times for four basic functions)
  - standard report generation
  - customized report generation
  - graphic/chart generation
  - data navigation

An Analysis of Additivity in OLAP Systems (based on Horner)

Typical operations
- Roll-up
  - increases the level of aggregation along one or more classification hierarchies
- Drill-down
  - decreases the level of aggregation along one or more classification hierarchies
- Slice-Dice
  - selects and projects the data
- Pivoting
  - reorients the multi-dimensional data view to allow exchanging facts for dimensions symmetrically
- Merging
  - performs a union of separate roll-up operations

Summarization of measures
- The roll-up and merge operations
  - both use aggregate operators to combine finer-grained measures into summary data
- But not all fact data
  - is mathematically summarizable
- In certain instances
  - using the sum operator to summarize data can result in inaccurate summary outputs

Measures
- additive along a dimension
  - if the sum operator can be used to meaningfully aggregate values along all hierarchies in that dimension
- fully-additive
  - if it is additive across all dimensions
- semi-additive
  - if it is only additive across certain dimensions
- non-additive
  - if it is not additive across any dimension
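A small worked example may make the distinction concrete. The rows below are hypothetical: units sold behaves as a fully-additive measure, while stock on hand is semi-additive, since it can be summed across stores but not across months.

```python
# Hypothetical monthly snapshot rows: (month, store, units_sold, stock_on_hand)
ROWS = [
    ("Jan", "A", 5, 100),
    ("Jan", "B", 3, 50),
    ("Feb", "A", 4, 96),
    ("Feb", "B", 2, 48),
]

# Fully-additive: units sold can be summed over stores and over months.
units_sold_total = sum(row[2] for row in ROWS)

# Semi-additive: stock can be summed across stores within one month...
stock_in_jan = sum(row[3] for row in ROWS if row[0] == "Jan")

# ...but summing it across months counts the same goods twice.
# Store A never held this many units at any point in time.
stock_a_summed_over_time = sum(row[3] for row in ROWS if row[1] == "A")

# A meaningful alternative along the time dimension is the latest snapshot.
stock_a_latest = [row[3] for row in ROWS if row[1] == "A"][-1]
```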

Hierarchies
- A strict hierarchy
  - is one where each object at a lower level belongs to only one value at a higher level
- A non-strict hierarchy
  - can be thought of as a many-to-many relationship between a higher level of the hierarchy and the lower level
  - can result in multiple or alternate path hierarchies, whereby the lower object splits into two distinct higher level objects

Hierarchies
- Alternate and multiple path hierarchies
  - are important when summarizing measures, and can specifically present problems when merging data
- An inaccurate summarization can result
  - if summaries from different paths of the same hierarchy are merged
- Data cannot be merged
  - among classification attributes that have overlapping data instances
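The merging problem can be reproduced with a few lines. The people, teams, and amounts below are hypothetical; the point is only that an object with two parents in a non-strict hierarchy is counted once per parent when the roll-ups are merged.

```python
# Hypothetical sales per person (the lower level of the hierarchy).
SALES = {"alice": 100, "bob": 50}

# Non-strict hierarchy: bob belongs to two teams at the higher level.
TEAMS = {"alice": ["north"], "bob": ["north", "south"]}

# Roll up to the team level.
team_totals = {}
for person, amount in SALES.items():
    for team in TEAMS[person]:
        team_totals[team] = team_totals.get(team, 0) + amount

true_total = sum(SALES.values())          # each sale counted once
merged_total = sum(team_totals.values())  # bob's sales counted twice
overcount = merged_total - true_total
```

Each team total is correct on its own; the error only appears when the overlapping summaries are added together.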

Non-additive measures
- Derived data
  - Ratios and Percentages
  - Measures of Intensity
  - Average/Maximum/Minimum
- Numbers used for other than quantities
  - Measurements of direction
  - Codes (& arbitrarily assigned numbers)
  - Dates and time of day
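Ratios show why derived data is non-additive. In this sketch (the regions and figures are hypothetical), neither summing nor averaging the per-region price gives the overall price; the ratio has to be recomputed from its additive components.

```python
# Hypothetical rows: (region, revenue, units)
REGIONS = [("east", 100.0, 10), ("west", 300.0, 100)]

# Derived measure: average price per unit in each region.
price_per_region = [revenue / units for _, revenue, units in REGIONS]

# Summing the ratios has no meaning, and averaging them ignores
# how many units each region actually sold.
summed_ratio = sum(price_per_region)
mean_of_ratios = summed_ratio / len(REGIONS)

# Correct roll-up: aggregate the additive components, then divide.
total_revenue = sum(revenue for _, revenue, _ in REGIONS)
total_units = sum(units for _, _, units in REGIONS)
overall_price = total_revenue / total_units
```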

So, what does this all mean?
- It is important for us to be able to consider
  - how these papers on OLAP and Data Warehousing
  - relate to the other material
  - we have covered in this course

Going back to the beginning
- We discussed a taxonomy of data related concepts:
  - Wisdom
    - is de-contextualized truth, that is always true
    - is different from raw facts, which are highly context specific
  - Knowledge
    - is context-specific truth, that includes decisions
    - is the result of applying rules/algorithms to information in a given context
  - Information
    - is processed (extracted, summarized, etc.) data that is useful for making some decision
  - Data
    - is raw facts, which need not be numbers

Going further
- Metadata
  - is data that can be used to understand / process data
  - can include rules / algorithms
  - can be implicit (in the data types or structures)
  - can be explicit (in additional data attributes / tables / databases)

Going further - Metadata (cont)
- It is always important to consider for each data attribute
  - the syntax (generally dealt with in a database)
  - the semantics (often not dealt with explicitly)
- the operations in which it can participate and the TASKS they serve
  - maintaining the data
  - analyzing the data (including what queries can use it and how)
- the USERS of the data, who
  - own it
  - input / change it
  - have access to read it
- other important attributes of the attribute
  - value
  - privacy
  - risks
  - etc.

Ontologies can help
- Ontologies
  - contain
    - names of concepts
    - descriptions
    - rules
  - structure concepts from high level to individual attributes
  - support sharing of information
    - between the database and a user
    - across databases / systems / users

Dimensions can help, too
- Dimensions
  - organize data conceptually
  - in directed acyclic graphs
  - can support exploration
    - within a dimension
    - between multiple dimensions
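As a rough sketch of a dimension organized as a directed acyclic graph, the levels of a time dimension can be stored as a mapping from each level to the levels directly above it. The level names and the two roll-up paths below are assumptions chosen for illustration, not taken from the papers.

```python
# Hypothetical time dimension: each level maps to its direct parent levels.
# "day" has two parents, so there are two alternate roll-up paths.
TIME_LEVELS = {
    "day": ["week", "month"],
    "week": ["year"],
    "month": ["quarter"],
    "quarter": ["year"],
    "year": [],
}

def levels_above(level, parents):
    """Return every level reachable by rolling up from `level`."""
    seen = []
    stack = list(parents[level])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.append(current)
            stack.extend(parents[current])
    return seen

reachable_from_day = sorted(levels_above("day", TIME_LEVELS))
reachable_from_month = sorted(levels_above("month", TIME_LEVELS))
```

Exploration within the dimension then amounts to moving along the edges of this graph, one level at a time.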

Data Warehouses
- can provide dimensional storage of data to aid in exploration
- don’t rely on traditional normalization and other technology focused techniques
- do rely on techniques such as OLAP to help users explore them
- need a good understanding of the data so that it can be cleaned before being placed in them

OLAP
- refers to a collection of techniques for exploring data in Data Warehouses and other Dimensionally organized systems
- So far the papers have focused on exploration that involves summarizing, extracting, and processing DATA, producing largely numerical information
- What more should it do?
- What’s needed so that it can do this?