
OLAP – On Line Analytical Processing



Presentation transcript:

1 OLAP – On Line Analytical Processing

2 Session Objectives
At the end of this session, you will be able to:
Define On Line Analytical Processing
Understand the need for OLAP and the applications of OLAP in BI
Describe the various OLAP solutions and architectures
Compare the different OLAP architectures
List the evaluation parameters to consider when selecting an OLAP tool

3 What is OLAP?
OLAP (On Line Analytical Processing) applications are designed for online ad-hoc data access and analysis.
Data is organized into multiple dimensions.
OLAP gives access to analytical content such as time series and trend analysis views, and to summary-level information.
It is a set of functionality that facilitates multidimensional analysis.
It offers drill-down, drill-across, and slice-and-dice capabilities.

4 OLAP - Fast Analysis
On Line: No piles of paper, please!
Analytical: Establish patterns
Processing: Data-based
Fast Analysis of Shared Multidimensional Information
On-line Analytical Processing (OLAP) is a category of software technology that enables analysts, managers and executives to gain insight into data through fast, consistent, interactive access to a wide variety of possible views of information.
OLAP is implemented in a multi-user client/server mode and offers consistently rapid response to queries, regardless of database size and complexity.
OLAP helps the user synthesize enterprise information through comparative, personalized viewing, as well as through analysis of historical and projected data in various "what-if" data model scenarios. This is achieved through use of an OLAP Server.

5 Need for OLAP
How many dimensions can we think in? E.g. analysis by branch, product, agent, year: 2 or 3
How many types of values can we handle? E.g. Sales, Profit, Cost: 1 or 2
How many levels can we handle? E.g. the number of products we can analyze
Spreadsheet view of 1 value type: user-definable row, user-definable column
Different graphical views: cross-tab, pie chart
Drilling down, level by level: Country, Region, Branch, Agents

6 Need for OLAP
Many parameters affect a measure (value), e.g. Sales is influenced by product, region, time, distribution channel, etc.
Linear analysis = reports
Many totals are at one level
Difficult to identify the key parameters

7 OLAP in an Enterprise
OLAP Server
An OLAP server is a high-capacity, multi-user data manipulation engine specifically designed to support and operate on multi-dimensional data structures. A multi-dimensional structure is arranged so that every data item is located and accessed based on the intersection of the dimension members that define that item. The design of the server and the structure of the data are optimized for rapid ad-hoc information retrieval in any orientation, as well as for fast, flexible calculation and transformation of raw data based on formulaic relationships.
OLAP Client
OLAP clients are end-user applications that can request information from OLAP servers and provide two-dimensional or multi-dimensional displays, user modifications, selections, ranking, calculations, etc., for visualization and navigation purposes. An OLAP client may be as simple as a spreadsheet program retrieving a slice for further work by a spreadsheet-literate user, or as high-functioned as a financial modeling or sales analysis application.
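The idea that every data item is located by the intersection of dimension members can be sketched with a plain dictionary keyed by one member per dimension. This is only an illustration: the dimension names, members and values below are made up, and a real OLAP server uses far more compact storage.

```python
# Hypothetical cube: each cell is addressed by one member from every dimension.
DIMENSIONS = ("product", "region", "month")

cube = {
    ("Juice", "East", "Jan"): 120.0,
    ("Juice", "West", "Jan"): 95.0,
    ("Milk", "East", "Jan"): 60.0,
    ("Milk", "East", "Feb"): 75.0,
}


def cell(product, region, month):
    """Return the value at the intersection of the given members (0.0 if empty)."""
    return cube.get((product, region, month), 0.0)


def members(dimension):
    """List the distinct members that appear on one dimension."""
    axis = DIMENSIONS.index(dimension)
    return sorted({key[axis] for key in cube})


print(cell("Milk", "East", "Feb"))
print(members("month"))
```

A lookup never scans rows: the coordinates themselves are the address, which is why retrieval "in any orientation" costs the same.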

8 Uses of OLAP
Departments: Finance, Marketing, Sales, Manufacturing
Analytical capabilities: Used by analysts and managers. Offers an aggregated view of the data, such as total revenues by customer profile, by product line, or by geographical region.

9 Functionality of OLAP Tools
Provides the decision support front-end for data warehousing.
Advanced statistical, financial, and analytical calculations.
Appropriate tools to access data from a relational database.
Appropriate tools to access or manage multidimensional data.
OLAP functionality is characterized by dynamic multi-dimensional analysis of consolidated enterprise data supporting end-user analytical and navigational activities, including:
Calculations and modeling applied across dimensions, through hierarchies and/or across members
Trend analysis over sequential time periods
Slicing subsets for on-screen viewing
Drill-down to deeper levels of consolidation
Reach-through to underlying detail data

10 Features of OLAP Applications
OLAP analytical features: multi-dimensional views of data, calculation-intensive capabilities, time intelligence.
The calculation engine in OLAP tools has a wide range of built-in calculations, such as: ratios, time calculations, statistics, ranking, custom formulas/algorithms, forecasting and modeling.
Multi-dimensional views
A dimension represents one variable. Sales data might include at least these five dimensions: Time, Product, Customer, Salesperson, Sales. A simple query using the above dimensions could be: "How much revenue did Ram generate in January from all the products he sold to Ram & Co.?"
Calculation intensive
OLAP does more than simple aggregation. It can be used for share calculations (percentage of total), allocations (which use hierarchies from a top-down perspective), trend analysis, and forecasting using historical data.
Time intelligence
Time is a unique dimension because it is sequential in character. Business performance is almost always judged over time, e.g. this month vs. last month, or this month vs. the same month last year. Concepts such as year-to-date and period-over-period comparisons are used in OLAP.
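The share calculation and the time-intelligence concepts named above (year-to-date, period over period) reduce to small formulas. The sketch below uses invented monthly revenue figures and hypothetical function names purely to show the arithmetic; it is not taken from any particular OLAP tool.

```python
# Hypothetical monthly revenue per product; figures are made up for illustration.
revenue = {
    "Juice": {"Jan": 100.0, "Feb": 120.0, "Mar": 90.0},
    "Milk": {"Jan": 50.0, "Feb": 60.0, "Mar": 80.0},
}
MONTHS = ["Jan", "Feb", "Mar"]  # time is sequential, so order matters


def share_of_total(month):
    """Share calculation: each product's percentage of the month's total."""
    total = sum(by_month[month] for by_month in revenue.values())
    return {product: 100.0 * by_month[month] / total
            for product, by_month in revenue.items()}


def year_to_date(product, month):
    """Running total from the first month up to and including `month`."""
    upto = MONTHS[: MONTHS.index(month) + 1]
    return sum(revenue[product][m] for m in upto)


def period_over_period(product, month):
    """Percentage change versus the previous month (None for the first month)."""
    i = MONTHS.index(month)
    if i == 0:
        return None
    previous = revenue[product][MONTHS[i - 1]]
    return 100.0 * (revenue[product][month] - previous) / previous


print(share_of_total("Feb"))
print(year_to_date("Juice", "Mar"))
print(period_over_period("Juice", "Feb"))
```

Year-to-date and period-over-period both depend on the order of `MONTHS`, which is what sets the time dimension apart from dimensions such as product or customer.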

11 Evolution of OLAP

12 Star Schema
A Star Schema is a dimensional model created by mapping data entities from operational systems.
It has a central table (fact table) that links all the other tables (dimension tables) together.
Dimension: A category of information. For example, year, month, day, and week are all part of the Time dimension.
Measure: A property that can be summed or averaged using pre-computed aggregates.
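A minimal star schema can be tried out with Python's built-in sqlite3 module. The table names, column names and rows below are hypothetical; the point is only the shape: one fact table holding the measure and a foreign key per dimension, joined to the dimension tables at query time.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes (all names are hypothetical).
    CREATE TABLE dim_time    (time_id INTEGER PRIMARY KEY, day TEXT, month TEXT, year INTEGER);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, brand TEXT);
    CREATE TABLE dim_store   (store_id INTEGER PRIMARY KEY, city TEXT, region TEXT);

    -- Fact table: one foreign key per dimension plus the measure.
    CREATE TABLE fact_sales (
        time_id    INTEGER REFERENCES dim_time(time_id),
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        revenue    REAL
    );

    INSERT INTO dim_time VALUES (1, '2002-01-21', 'Jan', 2002), (2, '2002-02-03', 'Feb', 2002);
    INSERT INTO dim_product VALUES (1, 'Juice', 'BrandA'), (2, 'Milk', 'BrandB');
    INSERT INTO dim_store VALUES (1, 'NY', 'East'), (2, 'LA', 'West');
    INSERT INTO fact_sales VALUES (1, 1, 1, 10.0), (1, 2, 1, 20.0), (2, 1, 2, 30.0), (2, 2, 2, 40.0);
""")

# Revenue by month and region: the fact table is joined to two of its dimensions.
rows = conn.execute("""
    SELECT t.month, s.region, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_time  t ON t.time_id  = f.time_id
    JOIN dim_store s ON s.store_id = f.store_id
    GROUP BY t.month, s.region
    ORDER BY t.month, s.region
""").fetchall()
print(rows)
```

Every analytical question follows the same pattern: pick the dimensions to group by, join them to the fact table, and aggregate the measure.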

13 Facts and Measures
Examples: Sales Revenue, Gross Margin, Net Profit, Cost, Profitability
Facts or measures are the Key Performance Indicators of an enterprise.
Factual data about the subject area
Numeric, summarized

14 Dimension
Sales Revenue (Measure): What was sold? Whom was it sold to? When was it sold? Where was it sold?
Dimensions put measures in perspective.
They are the what, when and where qualifiers to the measures.
Dimensions could be products, customers, time, geography, etc.

15 Star Schema

16 Star Schema Example

17 Star Schema with Sample Data

18 Cube
Multi-dimensional databases store information in the form of cubes. A cube is a collection of facts and related dimensions stored together in arrays.
A cube is a group of data cells arranged by the dimensions of the data. For example, a spreadsheet exemplifies a two-dimensional array with the data cells arranged in rows and columns, each being a dimension. A three-dimensional array can be visualized as a cube with each dimension forming a side of the cube, including any slice parallel with that side. Higher-dimensional arrays have no physical metaphor, but they organize the data in the way users think of their enterprise. Typical enterprise dimensions are time, measures, products, geographical regions, sales channels, etc.
[Figure: cubes labeled Sales and HR, with axes Geography, Time and Product]

19 Basic Terminology of a Cube
Hierarchy: A hierarchy defines the navigating path for drilling up and drilling down. All attributes in a hierarchy belong to the same dimension.
Levels: These are organized into one or more hierarchies, typically from a coarse-grained level (for example, Year) down to the most detailed one (for example, Day).
Members: The individual category values (for example, 2002 or 21Jan2002).
Measures: These are the data values that are summarized and analyzed. Examples of measures are sales figures or operational costs.
Cells: These are the intersection of one member for every dimension and store the data for measures.
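To make the terms concrete, here is a small sketch of one Time hierarchy with three levels, where drilling up means re-summarizing the Day-level cells at a coarser level. The level names follow the examples above; the dates, values and the `drill_up` helper are hypothetical.

```python
from collections import defaultdict
from datetime import date

# Hypothetical Time dimension: one hierarchy with levels Year > Month > Day.
# Each level maps a Day member to its ancestor member at that level.
LEVELS = {
    "Year": lambda d: d.year,
    "Month": lambda d: (d.year, d.month),
    "Day": lambda d: d,
}

# Cells at the most detailed level: Day member -> sales measure (made-up values).
cells = {
    date(2002, 1, 21): 100,
    date(2002, 1, 22): 150,
    date(2002, 2, 1): 200,
    date(2003, 1, 5): 50,
}


def drill_up(level):
    """Summarize the Day-level cells at the requested level of the hierarchy."""
    to_member = LEVELS[level]
    totals = defaultdict(int)
    for day, sales in cells.items():
        totals[to_member(day)] += sales
    return dict(totals)


print(drill_up("Year"))
print(drill_up("Month"))
```

Drilling down is the reverse walk: from a Year member such as 2002 back to its Month members, and from there to the Day members.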

20 Basic Terminology of a Cube
Dimensions consist of: Dimension Name, Level, Hierarchy, Member
[Figure: Time dimension by level of detail; YEAR level with members 1999, 2000, 2001; QUARTER level with members Q1 to Q4 under each year]

21 Aggregates
Add up amounts for day 1.
In SQL: SELECT sum(amt) FROM SALE WHERE date = 1
Result: 81

22 Aggregates
Add up amounts by day.
In SQL: SELECT date, sum(amt) FROM SALE GROUP BY date

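23 Another Example
Add up amounts by day and product.
In SQL: SELECT date, prodId, sum(amt) FROM SALE GROUP BY date, prodId
Going from totals by (date, product) to totals by date is a roll-up; going the other way is a drill-down.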
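The three aggregate queries from these slides can be run against an in-memory SQLite database. The slides do not list the rows of SALE, so the six rows below are an assumption, chosen so that the day 1 total equals the 81 shown on the slide; the column names prodId, storeId, date and amt follow the queries. The last query also selects prodId so that each group can be identified in the output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALE (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
# Hypothetical sample rows (not given on the slides).
conn.executemany(
    "INSERT INTO SALE VALUES (?, ?, ?, ?)",
    [("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
     ("p2", "s2", 1, 8), ("p1", "s1", 2, 44), ("p1", "s2", 2, 4)],
)

# Slide 21: one total for a single day.
day1_total = conn.execute("SELECT sum(amt) FROM SALE WHERE date = 1").fetchone()[0]

# Slide 22: one total per day.
by_day = conn.execute(
    "SELECT date, sum(amt) FROM SALE GROUP BY date ORDER BY date").fetchall()

# Slide 23: one total per (day, product); drilling down from by_day.
by_day_product = conn.execute(
    "SELECT date, prodId, sum(amt) FROM SALE GROUP BY date, prodId "
    "ORDER BY date, prodId").fetchall()

print(day1_total)
print(by_day)
print(by_day_product)
```

Summing the per-product rows of one day gives back that day's row in `by_day`, which is the roll-up relationship between the two result sets.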

24 Aggregates
Operators: sum, count, max, min, median and avg
"Having" clause
Using a dimension hierarchy: average by region (within store), maximum by month (within date)
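As a sketch of several operators combined with a HAVING clause, the query below groups a SALE table by product and keeps only groups whose total exceeds a threshold. The rows and the threshold of 20 are assumptions made for illustration. The median operator is left out because SQLite builds do not reliably ship a median function.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SALE (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
# Hypothetical sample rows (not given on the slides).
conn.executemany(
    "INSERT INTO SALE VALUES (?, ?, ?, ?)",
    [("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
     ("p2", "s2", 1, 8), ("p1", "s1", 2, 44), ("p1", "s2", 2, 4)],
)

# WHERE filters rows before grouping; HAVING filters the groups afterwards.
rows = conn.execute("""
    SELECT prodId, count(*), min(amt), max(amt), avg(amt), sum(amt)
    FROM SALE
    GROUP BY prodId
    HAVING sum(amt) > 20
    ORDER BY prodId
""").fetchall()
print(rows)
```

With this data, product p2 totals 19 and is therefore dropped by the HAVING condition, while p1 is kept.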

25 The MOLAP Cube
[Figure: fact table view next to the equivalent multi-dimensional cube; dimensions = 2]

26 3-D Cube
[Figure: fact table view next to the equivalent multi-dimensional cube, with one layer for day 1 and one for day 2; dimensions = 3]

27 Example
Dimensions: Time, Product, Store
Attributes: Product (upc, price, …), Store …
Hierarchies: Product → Brand → …; Day → Week → Quarter; Store → Region → Country
[Figure: cube with Store axis (NY, SF, LA), Product axis (Juice, Milk, Coke, Cream, Soap, Bread) and Time axis (M, T, W, Th, F, S, S); sample cell values 10, 34, 56, 32, 12; arrows labeled roll-up to region, roll-up to brand and roll-up to week; callout: 56 units of bread sold in LA on M]

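28 Cube Aggregation: Roll-up
Example: computing sums
[Figure: the day 1 and day 2 layers of the cube are summed along each dimension, down to a grand total of 129; arrows labeled rollup and drill-down]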
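Rolling up a cube means summing out one or more dimensions. The sketch below stores a small cube as a dictionary and collapses any dimension not asked for. The cell values are an assumption (the slide's figure did not survive extraction), picked so that the grand total comes to the 129 shown on the slide; `roll_up` is a hypothetical helper.

```python
from collections import defaultdict

DIMS = ("prodId", "storeId", "date")

# Hypothetical cells keyed by (prodId, storeId, date).
cube = {
    ("p1", "s1", 1): 12, ("p2", "s1", 1): 11, ("p1", "s3", 1): 50,
    ("p2", "s2", 1): 8, ("p1", "s1", 2): 44, ("p1", "s2", 2): 4,
}


def roll_up(cells, keep):
    """Sum out every dimension that is not listed in `keep`."""
    axes = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for key, amt in cells.items():
        totals[tuple(key[a] for a in axes)] += amt
    return dict(totals)


by_prod_store = roll_up(cube, ("prodId", "storeId"))  # collapse the date axis
by_prod = roll_up(cube, ("prodId",))                  # collapse store and date
grand_total = roll_up(cube, ())                       # collapse everything

print(by_prod_store)
print(by_prod)
print(grand_total)
```

Each roll-up result is itself a smaller cube, so drilling down is simply asking for a result that keeps more dimensions.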

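29 Aggregation Using Hierarchies
Hierarchy: store → region → country
Store s1 is in Region A; stores s2 and s3 are in Region B.
[Figure: the day 1 and day 2 layers of the cube aggregated from store level up to region level]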
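Aggregating along a hierarchy amounts to joining the facts to the table that maps each member to its parent level. The store-to-region mapping below is the one stated on the slide (s1 in Region A; s2 and s3 in Region B); the SALE rows and the STORE table layout are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SALE  (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER);
    CREATE TABLE STORE (storeId TEXT PRIMARY KEY, region TEXT);

    -- Hypothetical sales rows.
    INSERT INTO SALE VALUES ('p1','s1',1,12), ('p2','s1',1,11), ('p1','s3',1,50),
                            ('p2','s2',1,8),  ('p1','s1',2,44), ('p1','s2',2,4);
    -- Mapping from the slide: s1 in Region A; s2 and s3 in Region B.
    INSERT INTO STORE VALUES ('s1','A'), ('s2','B'), ('s3','B');
""")

# Store-level facts rolled up to region level, per day.
by_day_region = conn.execute("""
    SELECT s.date, r.region, sum(s.amt)
    FROM SALE s JOIN STORE r ON r.storeId = s.storeId
    GROUP BY s.date, r.region
    ORDER BY s.date, r.region
""").fetchall()

# One more roll-up: drop the date as well.
by_region = conn.execute("""
    SELECT r.region, sum(s.amt)
    FROM SALE s JOIN STORE r ON r.storeId = s.storeId
    GROUP BY r.region
    ORDER BY r.region
""").fetchall()

print(by_day_region)
print(by_region)
```

Adding a country column to STORE and grouping by it would give the next level of the same hierarchy.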

30 Slicing
In SQL: SELECT * FROM SALE WHERE date = 1
[Figure: the cube's day 1 and day 2 layers, with the slice TIME = day 1 selected]
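On a cube held in memory, the same slice fixes one member on one dimension and returns a cube with that dimension removed. This is a sketch on hypothetical cell values; `slice_cube` is an illustrative helper, not a function of any OLAP product.

```python
DIMS = ("prodId", "storeId", "date")

# Hypothetical cells keyed by (prodId, storeId, date).
cube = {
    ("p1", "s1", 1): 12, ("p2", "s1", 1): 11, ("p1", "s3", 1): 50,
    ("p2", "s2", 1): 8, ("p1", "s1", 2): 44, ("p1", "s2", 2): 4,
}


def slice_cube(cells, dim, member):
    """Keep the cells where `dim` equals `member` and drop that dimension."""
    axis = DIMS.index(dim)
    return {key[:axis] + key[axis + 1:]: amt
            for key, amt in cells.items() if key[axis] == member}


day1 = slice_cube(cube, "date", 1)  # TIME = day 1: a 2-D (product x store) sheet
print(day1)
print(sum(day1.values()))
```

The result has one dimension fewer than the original cube, which is what distinguishes a slice from an ordinary row filter.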

31 OLAP Solutions and Architecture

32 OLAP - Classification
Online Analytical Processing (OLAP) can be done on: relational databases, multidimensional databases.
OLAP products are grouped into three categories:
Relational OLAP (ROLAP)
Multidimensional OLAP (MOLAP)
Hybrid OLAP (HOLAP)

33 MOLAP: Multi-dimensional OLAP
MOLAP is a technology that uses a multi-dimensional database, which stores data as an n-dimensional cube.
Each face of a multi-dimensional cube represents a business dimension.
Every point of intersection of dimensions, called a cell, represents a business fact relating to the dimensions.
In practice, a MOLAP cube can hold more than three dimensions.
[Figure: cube with axes Geography, Age Group and Brand]

34 Architecture of MOLAP
[Figure: Data Mart Server (RDBMS) linked through Connectivity Middleware to the MOLAP Server (MDDBMS/Data Cube, MOLAP Application) over a non-live connection used for updating the MOLAP data cube only; MOLAP Client Tools on Desktop Systems reach the server over the LAN; Thin Clients (WWW Browser) reach it over the Intranet/Internet through a Router and Firewall; note: Cube Size Critical]
Multidimensional databases, also known as MDDB or MDDBS, are a class of proprietary, non-relational database management tools that store and manage data in a multidimensional manner.
MOLAP servers directly store multidimensional data in special data structures such as arrays or cubes.
Source data is pre-consolidated. MDDBs pre-calculate and store every measure at every hierarchy summary level at load time.
MOLAP servers physically stage the processed multi-dimensional information to deliver consistent and rapid response times to end users.
Multidimensional OLAP analysis usually requires proprietary (non-SQL) access tools.
MOLAP often uses a 3-tier environment, where a middle-tier server preprocesses data from an RDBMS. Some OLAP tools access an RDBMS directly and build cubes as a fat client.
Issues: size of the data cube, cube deployment, size of the update data set
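Pre-calculating every measure at load time can be sketched as follows: each fact is added into every cell it contributes to, where each dimension is either kept at its member or replaced by an "all" marker. The facts, the marker and `build_cube` are assumptions for illustration, and the sketch covers only one level per dimension plus the grand total, not intermediate hierarchy levels.

```python
from itertools import product

ALL = "*"  # marker meaning "summed over this dimension"

# Hypothetical fact rows: (prodId, storeId, date, amt).
facts = [
    ("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
    ("p2", "s2", 1, 8), ("p1", "s1", 2, 44), ("p1", "s2", 2, 4),
]


def build_cube(rows):
    """Load step: pre-compute sum(amt) for every member-or-ALL combination."""
    cube = {}
    for prod, store, day, amt in rows:
        # Each fact feeds 2**3 = 8 cells: every dimension is kept or rolled up.
        for key in product((prod, ALL), (store, ALL), (day, ALL)):
            cube[key] = cube.get(key, 0) + amt
    return cube


cube = build_cube(facts)

# Query time: any total is a single lookup, with no scanning or summing.
print(cube[(ALL, ALL, 1)])      # total for day 1
print(cube[("p1", ALL, ALL)])   # total for product p1
print(len(facts), len(cube))    # the cube holds more cells than there are facts
```

The last line shows the trade-off behind the "cube size" issue: answers are instant, but the stored cube grows much faster than the source data as dimensions and members are added.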

35 MOLAP Products
Oracle's Oracle Express Server
Cognos PowerPlay Transformer
Essbase (Hyperion Software)
Holos (Seagate Software)
There are also many other MOLAP vendors in the market.
The main features of MOLAP products are:
Forecasting / budgeting
Summarized as well as detailed data for analysis
Highly complex calculations
Infrequent / small updates
Three / N-tier architecture
Upsides of MOLAP are:
Instant response
Value-add functions (ranking, % change)
More intuitive commands for performing drill-down and other analytic operations
A single cube holds answers to a number of requests

36 Architecture of ROLAP
[Figure: Data Mart Server (RDBMS) linked through Connectivity Middleware to the ROLAP Server (ROLAP Application); ROLAP Client Tools on Desktop Systems reach the server over the LAN; Thin Clients (WWW Browser) reach it over the Intranet/Internet through a Router / Firewall]
Issues: aggregate awareness, response time, network capacity
In ROLAP, data is stored in the form of records in relational databases. ROLAP also supports extensions to SQL and ad-hoc queries.
ROLAP servers implement the multidimensional data model and its operations in a relational database.
ROLAP uses RDBMS tables as its data source. The architecture can be designed as a 3/N-tier system. The calculation engine may be located anywhere.
RDBMSs support star / snowflake schemas.
Supports various data types such as numeric, textual, spatial, audio, graphic, and video data.
Different types of indexing mechanisms are available to improve query performance.
Features of ROLAP:
ROLAP servers are highly scalable, in both number of users and amount of data
Calculation on the fly; can be run from the client, the database server or a middle tier
Limitless filtering and grouping
Can handle an extremely large number of dimensions

37 ROLAP Products
Brio Query Enterprise
Business Objects
Metacube
DSS Server
Information Advantage
The tools listed above are market leaders in ROLAP.
Generally, a ROLAP architecture offers these features:
Analysis of transaction detail data
Very large databases
Frequent / large updates
Aggregate-aware
Monitoring
Three / N-tier architecture
Upsides of ROLAP are:
Leverages RDBMS capability
Value-add functions (ranking, % change)
No additional loads
No additional data sets to manage
Scalability

38 Architecture of HOLAP
[Figure: MOLAP Server (MDDBMS/Data Cube, MOLAP Application) and ROLAP Server (ROLAP Application) side by side; HOLAP Client Tools on Desktop Systems reach both over the LAN through a Router/Firewall]
HOLAP (Hybrid OLAP) is a mix of MOLAP and relational architecture that supports queries against summary and transaction data in an integrated fashion.
HOLAP uses a multi-dimensional database server as middleware to access data stored in a relational database. It provides users with the detailed transaction data that contributes to the summary totals stored in the multidimensional server.
A relational database stores most of the data. A separate multi-dimensional database stores most of the dense data, which is typically a small proportion of the total volume of data.
HOLAP architectures are typically more complex to implement and administer than ROLAP or MOLAP architectures.
Issues: cube elements, integration with the RDBMS
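The split described above can be sketched with two stores: a small pre-aggregated summary held in memory, standing in for the multidimensional server, and an SQLite table holding the transaction detail, standing in for the relational database. All names and rows are hypothetical, and a real HOLAP product decides far more carefully what to keep in each store.

```python
import sqlite3

# Relational side: holds all transaction detail (hypothetical rows).
detail = sqlite3.connect(":memory:")
detail.execute("CREATE TABLE SALE (prodId TEXT, storeId TEXT, date INTEGER, amt INTEGER)")
detail.executemany(
    "INSERT INTO SALE VALUES (?, ?, ?, ?)",
    [("p1", "s1", 1, 12), ("p2", "s1", 1, 11), ("p1", "s3", 1, 50),
     ("p2", "s2", 1, 8), ("p1", "s1", 2, 44), ("p1", "s2", 2, 4)],
)

# Multidimensional side: a small summary built once at load time.
summary = dict(detail.execute("SELECT date, sum(amt) FROM SALE GROUP BY date"))


def total_for_day(day):
    """Summary query: answered from the in-memory summary, no SQL at query time."""
    return summary[day]


def drill_through(day):
    """Detail query: fetch the transactions behind a summary total from the RDBMS."""
    return detail.execute(
        "SELECT prodId, storeId, amt FROM SALE WHERE date = ? "
        "ORDER BY prodId, storeId", (day,)).fetchall()


print(total_for_day(1))
print(drill_through(2))
```

The integration issue named on the slide shows up even here: the summary must be rebuilt or adjusted whenever the detail table changes, or the two stores drift apart.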

39 HOLAP Products
Holos (Seagate Software)
Microsoft SQL Server OLAP Services
Pilot Software's Pilot Decision Support Suite
SAS

40 MOLAP Vs ROLAP

41 Comparison of Architectures

42 Strength and Weakness of MOLAP/ROLAP

43 Strength and Weakness of MOLAP/ROLAP

44 Session Summary
In this session, we have:
Understood the need for OLAP and the significance of multidimensional analysis in a data warehouse.
Discussed the evolution of OLAP.
Explained the architectures, characteristics, merits and demerits of the various OLAP solutions.

45 Thank you

