OLAP – On Line Analytical Processing


Session Objectives
At the end of this session, you will be able to:
- Define On Line Analytical Processing (OLAP)
- Understand the need for OLAP and its applications in BI
- Describe the various OLAP solutions and architectures
- Compare the different OLAP architectures
- Identify the evaluation parameters to consider when selecting an OLAP tool

What is OLAP?
- OLAP (On Line Analytical Processing) applications are designed for online, ad-hoc data access and analysis.
- Data is organized into multiple dimensions.
- Users get access to analytical content such as time series and trend analysis views, and to summary-level information.
- OLAP is a set of functionality that facilitates multidimensional analysis.
- It offers drill-down, drill-across, and slice-and-dice capabilities.

OLAP - Fast Analysis
- On Line: no piles of paper, please!
- Analytical: establish patterns
- Processing: data-based
OLAP can be summarized as Fast Analysis of Shared Multidimensional Information.
On-line Analytical Processing (OLAP) is a category of software technology that enables analysts, managers, and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information. OLAP is implemented in a multi-user client/server mode and offers consistently rapid response to queries, regardless of database size and complexity. OLAP helps the user synthesize enterprise information through comparative, personalized viewing, as well as through analysis of historical and projected data in various "what-if" data model scenarios. This is achieved through the use of an OLAP server.

Need for OLAP
- How many dimensions can we think in? (E.g. analysis by branch, product, agent, year.) 2 or 3.
- How many types of values can we handle? (E.g. Sales, Profit, Cost.) 1 or 2.
- How many levels can we handle? (E.g. the number of products we can analyze.)
- Spreadsheet view of one value type, with user-definable rows and columns.
- Different graphical views: cross-tab, pie chart.
- Drilling down level by level: Country, Region, Branch, Agents.

Need for OLAP
- Many parameters affect a measure (value); e.g. Sales is influenced by product, region, time, distribution channel, etc.
- Linear analysis = reports.
- Many totals are at one level.
- It is difficult to identify the key parameters.

OLAP in an Enterprise
OLAP Server: An OLAP server is a high-capacity, multi-user data manipulation engine specifically designed to support and operate on multi-dimensional data structures. A multi-dimensional structure is arranged so that every data item is located and accessed based on the intersection of the dimension members that define that item. The design of the server and the structure of the data are optimized for rapid ad-hoc information retrieval in any orientation, as well as for fast, flexible calculation and transformation of raw data based on formulaic relationships.
OLAP Client: End-user applications that can request information from OLAP servers and provide two-dimensional or multi-dimensional displays, user modifications, selections, ranking, calculations, etc., for visualization and navigation purposes. OLAP clients may be as simple as a spreadsheet program retrieving a slice for further work by a spreadsheet-literate user, or as high-functioned as a financial modeling or sales analysis application.

Uses of OLAP
Departments: Finance, Marketing, Sales, Manufacturing.
Analytical capabilities:
- Used by analysts and managers.
- Offers an aggregated view of the data, such as total revenues by customer profile, by product line, or by geographical region.

Functionality of OLAP Tools
- Provides the decision support front-end for data warehousing.
- Advanced statistical, financial, and analytical calculations.
- Appropriate tools to access data from a relational database.
- Appropriate tools to access or manage multidimensional data.
OLAP functionality is characterized by dynamic multi-dimensional analysis of consolidated enterprise data, supporting end-user analytical and navigational activities including:
- Calculations and modeling applied across dimensions, through hierarchies, and/or across members
- Trend analysis over sequential time periods
- Slicing subsets for on-screen viewing
- Drill-down to deeper levels of consolidation
- Reach-through to underlying detail data

Features of OLAP Applications
OLAP analytical features:
- Multi-dimensional views of data
- Calculation-intensive capabilities
- Time intelligence
The calculation engine in OLAP tools has a wide range of built-in calculations, such as ratios, time calculations, statistics, ranking, custom formulas/algorithms, and forecasting and modeling.
Multi-dimensional views: A dimension represents one variable. Sales data might include at least these five dimensions: Time, Product, Customer, Salesperson, Sales. A simple query using these dimensions could be: "How much revenue did Ram generate in January from all the products he sold to Ram & Co.?"
Calculation intensive: OLAP does more than simple aggregation. It can be used for share calculations (percentage of total), allocations (which use hierarchies from a top-down perspective), trend analysis, and forecasting using historical data.
Time intelligence: Time is a unique dimension because it is sequential in character. Business performance is almost always judged over time, for example this month vs. last month, or this month vs. the same month last year. Concepts such as year-to-date and period-over-period comparisons are used in OLAP.
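
The share and period-over-period calculations described on this slide can be sketched in a few lines of Python. This is only an illustration: the revenue figures, month names, and function names below are hypothetical and do not come from the presentation.

```python
# Hypothetical monthly revenue per product; the figures are made up for illustration.
revenue = {
    ("Jan", "Juice"): 100.0,
    ("Jan", "Milk"): 300.0,
    ("Feb", "Juice"): 150.0,
    ("Feb", "Milk"): 350.0,
}


def total_for_month(month):
    """Sum revenue over all products for one month."""
    return sum(v for (m, _), v in revenue.items() if m == month)


def share_of_total(month, product):
    """Share calculation: a product's percentage of the month's total."""
    return 100.0 * revenue[(month, product)] / total_for_month(month)


def period_over_period(current, previous):
    """Time intelligence: percentage change of the total between two periods."""
    prev_total = total_for_month(previous)
    return 100.0 * (total_for_month(current) - prev_total) / prev_total


print(share_of_total("Jan", "Juice"))    # 25.0
print(period_over_period("Feb", "Jan"))  # 25.0
```

A real OLAP calculation engine would evaluate such formulas across whole dimensions and hierarchies rather than over a small dictionary.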

Evolution of OLAP

Star Schema
- A Star Schema is a dimensional model created by mapping data entities from operational systems.
- It has a central table (the fact table) that links all the other tables (the dimension tables) together.
- Dimension: The same category of information. For example, year, month, day, and week are all part of the Time dimension.
- Measure: The property that can be summed or averaged using pre-computed aggregates.
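
As a rough sketch of the structure described above, the following Python snippet builds a tiny star schema in an in-memory SQLite database and sums a measure by one dimension attribute. The table names, column names, and rows are assumptions made for this example, not part of the presentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes (hypothetical names and rows).
CREATE TABLE dim_time    (time_id INTEGER PRIMARY KEY, year INTEGER, month TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, brand TEXT);

-- Central fact table: one foreign key per dimension plus the measure.
CREATE TABLE fact_sales (
    time_id    INTEGER REFERENCES dim_time(time_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    revenue    REAL
);

INSERT INTO dim_time    VALUES (1, 2002, 'Jan'), (2, 2002, 'Feb');
INSERT INTO dim_product VALUES (1, 'Juice', 'BrandA'), (2, 'Milk', 'BrandB');
INSERT INTO fact_sales  VALUES (1, 1, 10.0), (1, 2, 20.0), (2, 1, 5.0);
""")

# Join the fact table to a dimension and sum the measure per month.
rows = conn.execute("""
    SELECT t.month, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_time t ON t.time_id = f.time_id
    GROUP BY t.time_id, t.month
    ORDER BY t.time_id
""").fetchall()
print(rows)  # [('Jan', 30.0), ('Feb', 5.0)]
```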

Facts and Measures
Examples: Sales Revenue, Gross Margin, Net Profit, Cost, Profitability.
- Facts or measures are the Key Performance Indicators of an enterprise.
- Factual data about the subject area.
- Numeric, summarized.

Dimension
Example: Sales Revenue (measure) is qualified by questions such as: What was sold? Whom was it sold to? When was it sold? Where was it sold?
- Dimensions put measures in perspective.
- They are the what, when, and where qualifiers to the measures.
- Dimensions could be products, customers, time, geography, etc.

Star Schema

Star Schema Example

Star Schema with Sample Data

Cube
Multi-dimensional databases store information in the form of cubes. A cube is a collection of facts and related dimensions stored together in arrays: a group of data cells arranged by the dimensions of the data. For example, a spreadsheet exemplifies a two-dimensional array with the data cells arranged in rows and columns, each being a dimension. A three-dimensional array can be visualized as a cube with each dimension forming a side of the cube, including any slice parallel with that side. Higher-dimensional arrays have no physical metaphor, but they organize the data in the way users think of their enterprise. Typical enterprise dimensions are time, measures, products, geographical regions, sales channels, etc.
[Figure labels: Geography, Sales, HR, Time, Product]

Basic Terminology of a Cube
- Hierarchy: A hierarchy defines the navigation path for drilling up and drilling down. All attributes in a hierarchy belong to the same dimension.
- Levels: These are organized into one or more hierarchies, typically from a coarse-grained level (for example, Year) down to the most detailed one (for example, Day).
- Members: The individual category values (for example, 2002 or 21Jan2002).
- Measures: These are the data values that are summarized and analyzed. Examples of measures are sales figures or operational costs.
- Cells: Each cell is the intersection of one member from every dimension and stores the data for the measures.
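
To make the terms above concrete, here is a minimal sketch of a cube as a Python dictionary, where each cell is keyed by one member per dimension. The dimension names, members, and amounts are hypothetical sample values chosen for this example.

```python
# Dimensions and their members (hypothetical sample values).
dimensions = {
    "time": ["day1", "day2"],
    "product": ["p1", "p2"],
    "store": ["s1", "s2", "s3"],
}

# Each cell sits at the intersection of one member per dimension
# and stores the value of the measure (here: a sales amount).
cells = {
    ("day1", "p1", "s1"): 12,
    ("day1", "p2", "s1"): 11,
    ("day2", "p1", "s1"): 44,
}


def cell(time, product, store):
    """Look up a measure value; returns None for an empty intersection."""
    return cells.get((time, product, store))


print(cell("day1", "p1", "s1"))  # 12
print(cell("day2", "p2", "s3"))  # None
```

Storing only the populated cells, as this dictionary does, is one simple way to handle sparse data; a dense array would reserve space for every intersection.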

Basic Terminology of a Cube
Dimensions consist of: Dimension Name, Level, Hierarchy, Member.
[Figure: the Time dimension by level of detail. YEAR level: 1999, 2000, 2001. QUARTER level: Q3, Q4, Q1, Q2, Q3, Q4, Q1, Q2.]

Aggregates
Add up amounts for day 1.
In SQL: SELECT sum(amt) FROM SALE WHERE date = 1
Result: 81

Aggregates
Add up amounts by day.
In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date

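Another Example
Add up amounts by day and product.
In SQL: SELECT date, prodId, sum(amt) FROM SALE GROUP BY date, prodId
Moving from this result (by day and product) to the coarser one (by day only) is a rollup; the reverse is a drill-down.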
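
The aggregate queries on these slides can be tried out with Python's built-in sqlite3 module. The SALE rows below are hypothetical; they were chosen so that the day 1 total comes out to 81, the figure shown on the earlier slide. The query grouped by day and product also selects prodId so that each group can be identified in the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALE (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
# Hypothetical sample rows (not given in the presentation).
conn.executemany(
    "INSERT INTO SALE VALUES (?, ?, ?, ?)",
    [
        ("p1", "s1", 1, 12),
        ("p2", "s1", 1, 11),
        ("p1", "s3", 1, 50),
        ("p2", "s2", 1, 8),
        ("p1", "s1", 2, 44),
        ("p1", "s2", 2, 4),
    ],
)

# Add up amounts for day 1.
day1_total = conn.execute("SELECT sum(amt) FROM SALE WHERE date = 1").fetchone()[0]

# Add up amounts by day.
by_day = conn.execute(
    "SELECT date, sum(amt) FROM SALE GROUP BY date ORDER BY date"
).fetchall()

# Add up amounts by day and product (a drill-down from by_day).
by_day_product = conn.execute(
    "SELECT date, prodId, sum(amt) FROM SALE "
    "GROUP BY date, prodId ORDER BY date, prodId"
).fetchall()

print(day1_total)      # 81
print(by_day)          # [(1, 81), (2, 48)]
print(by_day_product)  # [(1, 'p1', 62), (1, 'p2', 19), (2, 'p1', 48)]
```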

Aggregates
- Operators: sum, count, max, min, median, and avg
- "Having" clause
- Using the dimension hierarchy: average by region (within store), maximum by month (within date)

The MOLAP Cube
[Figure: fact table view alongside the equivalent multi-dimensional cube; dimensions = 2]

3-D Cube
[Figure: fact table view alongside the equivalent multi-dimensional cube, with layers for day 1 and day 2; dimensions = 3]

Example
Dimensions: Time, Product, Store
Attributes: Product (upc, price, …), Store …
Hierarchies: Product → Brand → …; Day → Week → Quarter; Store → Region → Country
[Figure: cube with axes Store (NY, SF, LA), Product (Juice, Milk, Coke, Cream, Soap, Bread), and Time (M, T, W, Th, F, S, S); arrows mark roll-up to region, roll-up to brand, and roll-up to week. Cell values shown: 10, 34, 56, 32, 12. Highlighted cell: 56 units of bread sold in LA on M.]

Cube Aggregation: Roll-up
Example: computing sums
[Figure: the day 1, day 2, ... cube summed into aggregate tables, ending in the total 129; arrows labeled rollup and drill-down]
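
Roll-up on a cube amounts to summing out one or more dimensions. The sketch below does this for a dictionary-based cube; the cell values are hypothetical, chosen so that the grand total is the 129 shown on this slide, and the function name roll_up is an assumption made for the example.

```python
from collections import defaultdict

# Cell key: (day, product, store) -> amount. Hypothetical sample data.
cube = {
    (1, "p1", "s1"): 12,
    (1, "p2", "s1"): 11,
    (1, "p1", "s3"): 50,
    (1, "p2", "s2"): 8,
    (2, "p1", "s1"): 44,
    (2, "p1", "s2"): 4,
}


def roll_up(cells, keep):
    """Sum out every dimension whose index is not listed in `keep`."""
    totals = defaultdict(int)
    for key, amt in cells.items():
        totals[tuple(key[i] for i in keep)] += amt
    return dict(totals)


by_day = roll_up(cube, keep=(0,))     # keep only the day dimension
grand_total = roll_up(cube, keep=())  # sum out all dimensions

print(by_day)       # {(1,): 81, (2,): 48}
print(grand_total)  # {(): 129}
```

Drill-down is the reverse direction: going from by_day back to a result that keeps more dimensions, for example roll_up(cube, keep=(0, 1)).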

Aggregation Using Hierarchies
Hierarchy: store → region → country (store s1 in Region A; stores s2, s3 in Region B)
[Figure: the day 1 and day 2 cube aggregated from the store level to the region level]
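
Aggregating along a hierarchy can be expressed in SQL by joining the fact table to a table that maps each store to its region. The sketch below uses the store-to-region mapping given on this slide (s1 in Region A; s2 and s3 in Region B); the STORE table, the SALE rows, and the HAVING threshold are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SALE  (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER);
CREATE TABLE STORE (storeId TEXT PRIMARY KEY, region TEXT);

-- Mapping from the slide: s1 in Region A; s2 and s3 in Region B.
INSERT INTO STORE VALUES ('s1', 'A'), ('s2', 'B'), ('s3', 'B');

-- Hypothetical sales rows.
INSERT INTO SALE VALUES
    ('p1', 's1', 1, 12), ('p2', 's1', 1, 11), ('p1', 's3', 1, 50),
    ('p2', 's2', 1, 8),  ('p1', 's1', 2, 44), ('p1', 's2', 2, 4);
""")

# Roll up from store level to region level.
by_region = conn.execute("""
    SELECT st.region, sum(s.amt)
    FROM SALE s JOIN STORE st ON st.storeId = s.storeId
    GROUP BY st.region
    ORDER BY st.region
""").fetchall()

# HAVING filters groups after aggregation (the threshold 65 is arbitrary).
big_regions = conn.execute("""
    SELECT st.region, sum(s.amt)
    FROM SALE s JOIN STORE st ON st.storeId = s.storeId
    GROUP BY st.region
    HAVING sum(s.amt) > 65
    ORDER BY st.region
""").fetchall()

print(by_region)    # [('A', 67), ('B', 62)]
print(big_regions)  # [('A', 67)]
```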

Slicing
TIME = day 1
In SQL: SELECT * FROM SALE WHERE date = 1
[Figure: the day 1 layer cut from the day 1 / day 2 cube]
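
On a cube held in memory, slicing means fixing one member of one dimension and keeping the remaining dimensions. The following sketch shows this for a dictionary-based cube; the data and the function name slice_cube are hypothetical.

```python
# Cell key: (day, product, store) -> amount. Hypothetical sample data.
cube = {
    (1, "p1", "s1"): 12,
    (1, "p2", "s1"): 11,
    (1, "p1", "s3"): 50,
    (1, "p2", "s2"): 8,
    (2, "p1", "s1"): 44,
    (2, "p1", "s2"): 4,
}


def slice_cube(cells, dim_index, member):
    """Fix one dimension to a single member and drop it from the cell keys."""
    return {
        key[:dim_index] + key[dim_index + 1:]: amt
        for key, amt in cells.items()
        if key[dim_index] == member
    }


# TIME = day 1: a two-dimensional (product, store) slice of the cube.
day1 = slice_cube(cube, dim_index=0, member=1)
print(day1)
```

The result has one dimension fewer than the original cube, which corresponds to the WHERE date = 1 filter in the SQL version.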

OLAP Solutions and Architecture

OLAP - Classification
Online Analytical Processing (OLAP) can be done on:
- Relational databases
- Multidimensional databases
OLAP products are grouped into three categories:
- Relational OLAP (ROLAP)
- Multidimensional OLAP (MOLAP)
- Hybrid OLAP (HOLAP)

MOLAP: Multi-dimensional OLAP
MOLAP is a technology that uses a multi-dimensional database, which stores data as an n-dimensional cube. Each face of a multi-dimensional cube represents a business dimension. Every point of intersection of dimensions, called a cell, represents a business fact relating to those dimensions. In practice, MOLAP can store more than three dimensions in a cube.
[Figure: cube with dimensions Geography, Age Group, and Brand]

Architecture of MOLAP
[Figure components: Desktop Systems with MOLAP Client Tools; Thin Clients (WWW Browser) on the Intranet/Internet; Router and Firewall; LAN; MOLAP Server with MOLAP Application and MDDBMS/Data Cube; Connectivity Middleware; Data Mart Server (RDBMS). Annotations: cube size is critical; the connection to the RDBMS is non-live and is used for updating the MOLAP data cube only.]
- Multidimensional databases are also known as MDDB or MDDBS: a class of proprietary, non-relational database management tools that store and manage data in a multidimensional manner.
- MOLAP servers directly store multidimensional data in special data structures such as arrays or cubes.
- Source data is pre-consolidated. MDDs pre-calculate and store every measure at every hierarchy summary level at load time.
- MOLAP servers physically stage the processed multi-dimensional information to deliver consistent and rapid response times to end users.
- Multidimensional OLAP analysis usually requires proprietary (non-SQL) access tools.
- MOLAP often uses a 3-tier environment, where a middle-tier server preprocesses data from an RDBMS.
- Some OLAP tools access an RDBMS directly and build cubes as a fat client.
Issues: size of the data cube, cube deployment, size of the update data set.
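
The idea of pre-calculating aggregates at load time can be illustrated with a small sketch that computes the totals for every combination of dimensions up front, so that later queries become lookups. This is a simplification under stated assumptions: it covers only subsets of dimensions (not hierarchy levels within a dimension), the data is hypothetical, and it is not how any particular MOLAP product works internally.

```python
from collections import defaultdict
from itertools import combinations

# Cell key: (day, product, store) -> amount. Hypothetical sample data.
cells = {
    (1, "p1", "s1"): 12,
    (1, "p2", "s1"): 11,
    (1, "p1", "s3"): 50,
    (1, "p2", "s2"): 8,
    (2, "p1", "s1"): 44,
    (2, "p1", "s2"): 4,
}


def precompute_aggregates(cells, n_dims):
    """Return {kept_dims: {member_tuple: total}} for every subset of dimensions."""
    aggregates = {}
    for r in range(n_dims + 1):
        for keep in combinations(range(n_dims), r):
            totals = defaultdict(int)
            for key, amt in cells.items():
                totals[tuple(key[i] for i in keep)] += amt
            aggregates[keep] = dict(totals)
    return aggregates


aggregates = precompute_aggregates(cells, n_dims=3)

# Queries are now dictionary lookups instead of scans.
print(len(aggregates))     # 8 groupings for 3 dimensions (2 ** 3)
print(aggregates[(0,)])    # {(1,): 81, (2,): 48}
print(aggregates[()][()])  # 129
```

The number of groupings doubles with each added dimension, which gives some intuition for why the slide lists the size of the data cube as a key issue.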

MOLAP Products
Products:
- Oracle's Oracle Express Server
- Cognos - Powerplay Transformer
- Essbase (Hyperion Software)
- Holos (Seagate Software)
There are also many other MOLAP vendors in the market.
The main features of MOLAP products are:
- Forecasting / budgeting
- Summarized as well as detailed data for analysis
- Highly complex calculations
- Infrequent / small updates
- Three / N tier architecture
Upsides of MOLAP are:
- Instant response
- Value-add functions (ranking, % change)
- More intuitive commands for performing drill-down and other analytic operations
- A single cube holds answers to a number of requests

Architecture of ROLAP
[Figure components: Desktop Systems with ROLAP Client Tools; Thin Clients (WWW Browser) on the Intranet/Internet; Router / Firewall; LAN; ROLAP Server with ROLAP Application; Connectivity Middleware; Data Mart Server (RDBMS). Issues noted: aggregate awareness, response time, network capacity.]
- In ROLAP, data is stored in the form of records in relational databases. It also supports extensions to SQL and ad-hoc queries.
- ROLAP servers implement the multidimensional data model and its operations in a relational database.
- ROLAP uses RDBMS tables as its data source. The architecture can be designed as a 3/N tier system.
- The calculation engine may be located anywhere.
- RDBMSs support star and snowflake schemas.
- Various data types are supported, such as numeric, textual, spatial, audio, graphic, and video data.
- Different types of indexing mechanisms are available to improve query performance.
Features of ROLAP:
- ROLAP servers are highly scalable, in both number of users and amount of data.
- Calculation on the fly, which can be run from the client, the database server, or a middle tier.
- Limitless filtering and grouping.
- Can handle an extremely large number of dimensions.
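
One way to picture calculation on the fly in ROLAP is a thin layer that turns a multidimensional request (a list of dimensions to group by) into a SQL GROUP BY query against relational tables. The sketch below is only an illustration: the function build_query, the SALE table, and its rows are hypothetical, and real ROLAP servers are considerably more elaborate (for example, they may be aggregate aware and choose summary tables).

```python
import sqlite3

# Dimensions that may be requested; also guards against injecting arbitrary SQL.
ALLOWED_DIMENSIONS = {"date", "prodId", "storeId"}


def build_query(dimensions):
    """Translate a list of dimensions into an aggregate SQL query on SALE."""
    for d in dimensions:
        if d not in ALLOWED_DIMENSIONS:
            raise ValueError(f"unknown dimension: {d}")
    if not dimensions:
        return "SELECT sum(amt) FROM SALE"
    cols = ", ".join(dimensions)
    return f"SELECT {cols}, sum(amt) FROM SALE GROUP BY {cols} ORDER BY {cols}"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALE (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO SALE VALUES (?, ?, ?, ?)",
    [
        ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
        ("p2", "s2", 1, 8), ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
    ],
)

# The aggregates are computed by the relational engine at query time.
by_day = conn.execute(build_query(["date"])).fetchall()
grand_total = conn.execute(build_query([])).fetchone()[0]

print(build_query(["date"]))
print(by_day)       # [(1, 81), (2, 48)]
print(grand_total)  # 129
```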

ROLAP Products
Products:
- Brio Query Enterprise
- Business Objects
- Metacube
- DSS Server
- Information Advantage
The specified tools are market leaders in ROLAP.
Generally, ROLAP architecture offers these features:
- Analysis of transaction detail data
- Very large database
- Frequent / large updates
- Aggregate aware
- Monitoring
- Three / N tier architecture
Upsides of ROLAP are:
- Leverages RDBMS capability
- Value-add functions (ranking, % change)
- No additional loads
- No additional data sets to manage
- Scalability

Architecture of HOLAP
[Figure components: Desktop Systems with HOLAP Client Tools; Router / Firewall; LAN; MOLAP Server with MOLAP Application and MDDBMS/Data Cube; ROLAP Server with ROLAP Application.]
- HOLAP (Hybrid OLAP) is a mix of MOLAP and relational architecture that supports queries against summary and transaction data in an integrated fashion.
- HOLAP uses a multi-dimensional database server as middleware to access data stored in a relational database.
- It provides users with the detailed transaction data that contributes to the summary totals stored in the multidimensional server.
- A relational database stores most of the data. A separate multi-dimensional database stores most of the dense data, which is typically a small proportion of the total volume of data.
- HOLAP architectures are typically more complex to implement and administer than ROLAP or MOLAP architectures.
Issues: cube elements, integration with the RDBMS.
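
The split between summary data and transaction detail can be sketched as follows: summary totals are served from a pre-aggregated in-memory structure (standing in for the multidimensional store), while detail rows are fetched from a relational table on request. All names and rows here are hypothetical, and the sketch leaves out the integration and administration concerns the slide mentions.

```python
import sqlite3

# Relational side: holds the detailed transaction rows (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALE (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
conn.executemany(
    "INSERT INTO SALE VALUES (?, ?, ?, ?)",
    [
        ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
        ("p2", "s2", 1, 8), ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
    ],
)

# Multidimensional side: summary totals computed once, at load time.
summary_by_day = dict(conn.execute("SELECT date, sum(amt) FROM SALE GROUP BY date"))


def total_for_day(day):
    """Summary query: answered from the pre-aggregated store."""
    return summary_by_day[day]


def detail_for_day(day):
    """Drill through to the transactions behind a summary total."""
    return conn.execute(
        "SELECT prodId, storeId, amt FROM SALE WHERE date = ? "
        "ORDER BY prodId, storeId",
        (day,),
    ).fetchall()


print(total_for_day(1))   # 81
print(detail_for_day(2))  # [('p1', 's1', 44), ('p1', 's2', 4)]
```

The detail rows for a day add up to the stored summary total for that day, which is the consistency a hybrid setup has to maintain between its two stores.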

HOLAP Products
- Holos (Seagate Software)
- Microsoft SQL Server OLAP Services
- Pilot Software's Pilot Decision Support Suite
- SAS

MOLAP Vs ROLAP

Comparison of Architectures

Strength and Weakness of MOLAP/ROLAP

Session Summary
In this session, we have:
- Understood the need for OLAP and the significance of multidimensional analysis in a data warehouse.
- Discussed the evolution of OLAP.
- Explained the architectures, characteristics, merits, and demerits of various OLAP solutions.

Thank you