Business Intelligence & Multi-Dimensional Databases Nirmal Jonnalagedda.

Outline
1. BI: History
2. BI: Overview
3. Common Functions of BI
4. BI: What can you do with it?
5. Multidimensional Databases
6. Contrast MDD and Relational Databases
7. When is MDD (In)appropriate?
8. MDD Features
9. Pros/Cons of MDD

BI: History
- The term was first used by IBM researcher Hans Peter Luhn.
- He defined intelligence as "the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal."
- BI is understood to have evolved from the decision support systems (DSS) of the 1960s.
- In the 1980s, DSS concepts evolved and branched into data warehouses, Executive Information Systems, and OLAP.

BI: An Overview
- There are many different opinions about what BI is; the definition depends on where you work.
- Generally, business analytics (BA) is a subset of BI.
- BI is the ability of an organization to take its capabilities and convert them into knowledge.
- It often includes the implementation of Key Performance Indicators (KPIs), trend analysis, and predictive modeling.

- What does BI provide? Historical, current, and predictive views of business operations.
- Where does BI get this information? From within your business.
- BI is not necessarily focused on the actions of others; that is called competitive analysis.
- End goal: support better decision making.
- BI is sometimes called a decision support system (DSS).

BI applications can vary widely in scope
- Enterprise-wide, focusing on critical business applications:
  - Monitoring the popularity of a product in a nationwide grocery chain
  - Tracking responses to mail offers and mailing only those who respond
- Department- or project-specific, focused on individual decisions and how they affect an organization:
  - Monitoring employee productivity and department spending

Common Functions of BI
- Reporting
- Online Analytical Processing
- Analytics
- Data, Process, and Text Mining
- Complex Event Processing
- Business Performance Management
- Benchmarking
- Predictive and Prescriptive Analytics

BI: What can you do with it?
- Identify cost-cutting ideas and practices
- Uncover new business opportunities
- React to and even predict retail demand
- Avoid repeating costly mistakes
- Especially useful in large enterprises with many departments
- Easily correlate and group business information and metrics into an understandable format
- Understand customer behavior

Database Evolution
- Flat files
- Hierarchical and Network
- Relational
- Distributed Relational
- Multidimensional

MDDB: Why?
- There is no single "best" data structure for all applications within an enterprise.
- Organizations have abandoned the search for the holy grail of a globally accepted database.
- Instead, they select the most appropriate data structure on a case-by-case basis from a palette of standard database structures.
- Multidimensional databases for OLAP?

- From econometric research conducted at MIT in the 1960s, the multidimensional database has matured into the database engine of choice for data analysis applications.
- It has an inherent ability to integrate and analyze large volumes of enterprise data.
- It offers a good conceptual fit with the way end users visualize business data.
- Most business people already think about their businesses in multidimensional terms.
- Managers tend to ask questions about product sales in different markets over specific time periods.

Spreadsheets: a 2D database?
- Functionalities
- What about a stack of similar spreadsheets for different times?
- Limitations? We cannot easily relate data in different sheets.

What is a Multi-Dimensional Database?
A multidimensional database (MDDB) is a computer software system designed to allow for the efficient and convenient storage and retrieval of large volumes of data that are
- intimately related, and
- stored, viewed, and analyzed from different perspectives.
These perspectives are called dimensions.

A Motivating Example
An automobile manufacturer wants to increase sales volumes by examining sales data collected throughout the organization. The evaluation would require viewing historical sales volume figures from multiple dimensions, such as:
- Sales volume by model
- Sales volume by color
- Sales volume by dealer
- Sales volume over time

Contrasting Relational and Multi-Dimensional Models: The Relational Structure

Multidimensional Structure
[Figure labels: Measurement, Dimension, Positions]

Differences between MDDB and Relational Databases
Normalized Relational | MDDB
Data is reorganized based on the query. Perspectives are placed in the fields, which tells us nothing about the contents. | Perspectives are embedded directly in the structure.
Browsing and data manipulation are not intuitive to the user. | Data retrieval and manipulation are easy.
Slows down for large datasets because multiple JOIN operations are needed. | Fast retrieval for large datasets due to the predefined structure.
Flexible: anything an MDDB can do can be done this way. | Relatively inflexible: changes in perspectives necessitate reprogramming of the structure.

Contrasting Relational Model and MDD: Example 2

Multidimensional Representation

Viewing Data - An Example
- Assume that each dimension has 10 positions, as shown in the cube above.
- How many records would there be in a relational table?
- What are the implications for viewing data from an end-user standpoint?
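The record-count question can be checked with a small sketch. It assumes the cube has three dimensions (model, color, and dealer, as in the sales example) and uses made-up position labels; only the figure of 10 positions per dimension comes from the slide.

```python
from itertools import product

# Hypothetical position labels: 10 per dimension, as the slide assumes.
models = [f"model_{i}" for i in range(10)]
colors = [f"color_{i}" for i in range(10)]
dealers = [f"dealer_{i}" for i in range(10)]

# Relational form: one record per (model, color, dealer) combination.
records = [(m, c, d, 0) for m, c, d in product(models, colors, dealers)]

# Multidimensional form: a 10 x 10 x 10 nested list of sales volumes.
cube = [[[0 for _ in dealers] for _ in colors] for _ in models]
cell_count = sum(len(row) for plane in cube for row in plane)

print(len(records), cell_count)  # 1000 1000
```

Both forms hold the same 1000 values; the difference is that the table repeats the three labels in every record, while the cube encodes them once as positions along its axes.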

Performance Advantages
- What is the volume figure when car type = SEDAN, color = BLUE, and dealer = GLEASON?
- RDBMS: all 1000 records might need to be searched to find the right record.
- The MDB has more "knowledge" about where the data lies: a maximum of 30 position searches (10 along each of the 3 dimensions).
- Average case: 15 vs. 500.
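The worst-case figures above can be reproduced with a toy model. This is only a sketch under stated assumptions: the table is an unindexed list scanned linearly, the cube is a nested list whose axis labels are searched one dimension at a time, and all labels other than SEDAN, BLUE, and GLEASON, as well as the volumes, are invented. Real database engines use indexes and other access paths, so the counts illustrate the slide's argument and are not a benchmark.

```python
from itertools import product

# Hypothetical labels; the three named positions come from the slide and are
# placed last so that both lookups hit their worst case.
models = [f"model_{i}" for i in range(9)] + ["SEDAN"]
colors = [f"color_{i}" for i in range(9)] + ["BLUE"]
dealers = [f"dealer_{i}" for i in range(9)] + ["GLEASON"]
axes = (models, colors, dealers)

# Made-up volumes: cell (i, j, k) holds 100*i + 10*j + k.
cube = [[[100 * i + 10 * j + k for k in range(10)] for j in range(10)]
        for i in range(10)]
records = [
    (models[i], colors[j], dealers[k], cube[i][j][k])
    for i, j, k in product(range(10), repeat=3)
]


def table_lookup(records, key):
    """Linear scan of the table; returns (volume, records examined)."""
    for examined, (m, c, d, volume) in enumerate(records, start=1):
        if (m, c, d) == key:
            return volume, examined
    return None, len(records)


def cube_lookup(cube, axes, key):
    """Locate each label along its own dimension; returns (volume, positions examined)."""
    examined = 0
    cell = cube
    for labels, wanted in zip(axes, key):
        position = labels.index(wanted)
        examined += position + 1  # positions scanned along this dimension
        cell = cell[position]     # descend one dimension
    return cell, examined


key = ("SEDAN", "BLUE", "GLEASON")
print(table_lookup(records, key))    # (999, 1000)
print(cube_lookup(cube, axes, key))  # (999, 30)
```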

- What are the total sales across all colors and dealers when model = SEDAN?
- RDBMS: all 1000 records must be searched to get the answer.
- MDB: sum the contents of one 10x10 "slice".
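A minimal sketch of the same comparison, again with a nested list standing in for the cube and an unindexed list standing in for the table. The volumes and all labels except SEDAN are invented for illustration.

```python
from itertools import product

# Hypothetical model labels; SEDAN is placed at position 0.
models = ["SEDAN"] + [f"model_{i}" for i in range(1, 10)]

# Made-up volumes: cell (i, j, k) holds 100*i + 10*j + k.
cube = [[[100 * i + 10 * j + k for k in range(10)] for j in range(10)]
        for i in range(10)]
records = [
    (models[i], j, k, cube[i][j][k])
    for i, j, k in product(range(10), repeat=3)
]

# Relational approach: every record is examined and filtered on the model.
relational_total = sum(volume for model, _, _, volume in records
                       if model == "SEDAN")

# Multidimensional approach: pick the SEDAN slice, then sum its 10 x 10 cells.
sedan_slice = cube[models.index("SEDAN")]
slice_total = sum(sum(row) for row in sedan_slice)
cells_read = sum(len(row) for row in sedan_slice)

print(relational_total, slice_total, cells_read)  # 4950 4950 100
```

The two totals agree, but the slice touches 100 cells where the scan touches 1000 records.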

- Data manipulation that requires a minute in an RDBMS may require only a few seconds in an MDB.
- For such workloads, MDBs can be an order of magnitude faster than RDBMSs.
- The performance benefits are greatest for queries that generate cross-tab views of data.
- The performance advantages offered by multidimensional technology facilitate the development of interactive decision support applications like OLAP that can be impractical in a relational environment.

Real World Benefits
- Ease of data presentation and navigation
- Ease of maintenance
- Performance

Ease of Data Presentation and Navigation
- Intuitive, spreadsheet-like data views are the natural output of MDDBs.
- Obtaining the same views in a relational environment requires either a complex SQL query or a SQL generator against an RDB to convert the table outputs into a more intuitive format.
- Even for end users well skilled in SQL, some forms of output, such as ranking reports (e.g., top ten, bottom 20%), were awkward or impossible to produce in older SQL dialects; dialects with row limits and window functions can now express them.
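For comparison, a simple ranking report can be written in SQL dialects that support row limits. The sketch below uses Python's built-in sqlite3 module and SQLite's LIMIT clause; the table name dealer_sales, its columns, and the figures are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dealer_sales (dealership TEXT, volume INTEGER)")
conn.executemany(
    "INSERT INTO dealer_sales VALUES (?, ?)",
    [("CLYDE", 10), ("CARR", 7), ("GLEASON", 12)],  # made-up figures
)

# "Top two dealerships by volume": sort descending, keep the first two rows.
top_two = conn.execute(
    "SELECT dealership, volume FROM dealer_sales "
    "ORDER BY volume DESC LIMIT 2"
).fetchall()
conn.close()

print(top_two)  # [('GLEASON', 12), ('CLYDE', 10)]
```

Percentile-style reports such as "bottom 20%" generally need window functions or a subquery, which is closer to the complexity the slide describes.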

Ease of Maintenance
- Maintenance is easy because data is stored as it is viewed.
- No additional overhead is required to translate user queries into requests for data.
- To provide the same intuitiveness, RDBs use indexes and sophisticated joins, which require significant maintenance and storage.

Performance
- Multidimensional databases achieve performance levels that are difficult to match in a relational environment.
- These high performance levels enable and encourage OLAP applications.
- The performance of MDBs can be matched by RDBs through database tuning, but:
  - It is not possible to tune the database for all possible ad hoc queries.
  - Tuning requires the resources of an expensive DB specialist.

Adding Dimensions - An Example

When is MDD (In)appropriate? First, consider situation 1

When is MDD (In)appropriate?
Now consider situation 2.
1. Set up an MDD structure for situation 1, with LAST NAME and Employee# as dimensions and AGE as the measurement.
2. Set up an MDD structure for situation 2, with MODEL and COLOR as dimensions and SALES VOLUME as the measurement.

When is MDD (In)appropriate?
MDD Structures for the Situations
- Note the sparseness in the second MDD representation.

When is MDD (In)appropriate?
- Our sales volume dataset has a great number of meaningful interrelationships.
- The interrelationships are more meaningful than the individual data elements themselves.
- The greater the number of inherent interrelationships between the elements of a dataset, the more likely it is that a study of those interrelationships will yield business information of value to the company.
- Highly interrelated dataset types should be placed in a multidimensional data structure for greatest ease of access and analysis.

When is MDD (In)appropriate?
- In the employee data, no last name matches more than one employee number, and no employee number matches more than one last name.
- In contrast, there is a sales figure associated with every combination of model and color, resulting in a completely filled 3x3 matrix.
- Performance suffers when the employee data is stored multidimensionally (RDB 9 vs. MDB 18).
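The contrast between the two situations can be expressed as a fill ratio: the share of cells in the array that actually hold a value. The sketch below uses invented data throughout. The nine employees are an assumption chosen to be consistent with the "RDB 9" figure, and the model and color labels are placeholders for the 3x3 sales matrix.

```python
def fill_ratio(cells, *dimension_sizes):
    """Fraction of the cells in a dense array that actually hold a value."""
    capacity = 1
    for size in dimension_sizes:
        capacity *= size
    return len(cells) / capacity


# Situation 1 (hypothetical data): each employee has exactly one last name
# and one employee number, so only the "diagonal" of the array is populated.
employees = {(f"NAME_{i}", f"EMP_{i}"): 30 + i for i in range(9)}

# Situation 2 (hypothetical data): every model/color pair has a sales figure.
models = ["SEDAN", "SPORTS COUPE", "MINI VAN"]
colors = ["BLUE", "RED", "WHITE"]
sales = {(m, c): 1 for m in models for c in colors}

employee_fill = fill_ratio(employees, 9, 9)
sales_fill = fill_ratio(sales, len(models), len(colors))

print(round(employee_fill, 3), sales_fill)  # 0.111 1.0
```

With one-to-one data the fill ratio falls as the dataset grows (n filled cells out of n x n), which is one way to see why the multidimensional layout adds cost without adding value there.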

When is MDD (In)appropriate?
- The relative performance advantages of storing multidimensional data in a multidimensional array increase as the size of the dataset increases.
- The relative performance disadvantages of storing non-multidimensional data in a multidimensional array increase as the size of the dataset increases.
- There is no inherent value in storing non-multidimensional data (such as the employee data) in multidimensional arrays.

When is MDD Appropriate?
- The greater the number of inherent interrelationships between the elements of a dataset, the more likely it is that a study of those interrelationships will yield business information of value to the company.
- Most companies have limited time and resources to devote to analyzing data.
- It therefore becomes critical that these highly interrelated dataset types be placed in a multidimensional data structure for greatest ease of access and analysis.

When is MDD Appropriate?
Examples of applications that are suited for multidimensional technology:
- Financial Analysis and Reporting
- Budgeting
- Promotion Tracking
- Quality Assurance and Quality Control
- Product Profitability
- Survey Analysis

MDD Features - Rotation
- Also referred to as "data slicing."
- Each rotation yields a different slice, or two-dimensional table of data: a different face of the cube.

MDD Features - Rotation

- All six views can be obtained by simple rotation.
- In MDBs, rotations are simple because no rearrangement of data is required.
- Rotation is also referred to as "data slicing."
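One way to arrive at six views is to count the ordered (row, column) pairs that can be drawn from three dimensions. The sketch below assumes the dimensions are MODEL, COLOR, and DEALERSHIP and uses a made-up 2x3 face. Note that a nested list has to be transposed to show the rotated view, whereas the slide's point is that an MDB can present it without rearranging stored data.

```python
from itertools import permutations

# Assumed dimensions of the sales cube.
dimensions = ("MODEL", "COLOR", "DEALERSHIP")

# Each view shows one dimension as rows and another as columns.
views = list(permutations(dimensions, 2))
print(len(views))  # 6

# A made-up MODEL-by-COLOR face: rows are models, columns are colors.
model_by_color = [
    [6, 5, 4],  # SEDAN:    BLUE, RED, WHITE
    [3, 5, 5],  # MINI VAN: BLUE, RED, WHITE
]

# Rotating the face by 90 degrees swaps rows and columns (COLOR by MODEL).
color_by_model = [list(column) for column in zip(*model_by_color)]
print(color_by_model)  # [[6, 3], [5, 5], [4, 5]]
```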

MDD Features - Ranging
- How does the sales volume of models painted with the new metallic blue compare with the sales of normal blue models?
- The user knows that only the Sports Coupe and Mini Van models have received the new paint treatment.
- The user also knows that only two dealers, Carr and Clyde, have an unconstrained supply of these models.

MDD Features - Ranging
- The end user selects the desired positions along each dimension.
- Also referred to as "data dicing."
- The data is scoped down to a subset grouping.

MDD Features - Ranging
- The reduced array can now be rotated and used in computations in the same way as the parent array.
- It is referred to as "data dicing" because the data is scoped down to a subset grouping.
- A complex SQL query is required in an RDB.
- Performance is better in an MDB because fewer resource-consuming searches are required.
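Ranging can be sketched as selecting a set of positions along each dimension. The example below models the cube as a dictionary keyed by (model, color, dealer). The Sports Coupe, Mini Van, Carr, Clyde, and the two blue paints follow the ranging question posed earlier; the remaining labels and all volumes are invented, and dice is a hypothetical helper name.

```python
from itertools import product

models = ["SEDAN", "SPORTS COUPE", "MINI VAN"]
colors = ["RED", "NORMAL BLUE", "METALLIC BLUE"]
dealers = ["CARR", "CLYDE", "GLEASON"]

# Made-up volumes: cells are numbered 1..27 in enumeration order.
cube = {cell: n
        for n, cell in enumerate(product(models, colors, dealers), start=1)}


def dice(cube, keep_models, keep_colors, keep_dealers):
    """Keep only the cells whose position on every dimension was selected."""
    return {
        (m, c, d): volume
        for (m, c, d), volume in cube.items()
        if m in keep_models and c in keep_colors and d in keep_dealers
    }


sub_cube = dice(
    cube,
    keep_models={"SPORTS COUPE", "MINI VAN"},
    keep_colors={"NORMAL BLUE", "METALLIC BLUE"},
    keep_dealers={"CARR", "CLYDE"},
)
print(len(cube), len(sub_cube))  # 27 8

# The reduced array supports the same computations as the parent array,
# for example totals per color to compare the two paints.
totals_by_color = {}
for (_, color, _), volume in sub_cube.items():
    totals_by_color[color] = totals_by_color.get(color, 0) + volume
print(totals_by_color)
```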

MDD Features - Roll-Ups & Drill Downs
- Users want different views of the same data, e.g., sales volume by model vs. sales volume by dealership.
- Many times the views are similar, e.g., sales volume by dealership vs. sales volume by district.
- There is a natural relationship between sales volumes at the DEALERSHIP level and sales volumes at the DISTRICT level.
- Sales volumes for all the dealerships in a district sum to the sales volume for that district.

MDD Features - Roll-Ups & Drill Downs
- Multidimensional database technology is specially designed to facilitate the handling of natural relationships.
- Define two related aggregates on the same dimension: one aggregation is dealership and the other is district.
- District is at a higher level of aggregation than dealership.

MDD Features - Roll-Ups & Drill Downs
- The figure presents a definition of a hierarchy within the organization dimension.
- The aggregations are perceived as being part of the same dimension.
- Moving up and moving down levels in a hierarchy are referred to as "roll-up" and "drill-down."
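A dealership-to-district hierarchy can be sketched as a simple parent mapping. The district names (EAST, WEST), the assignment of dealerships to districts, and the volumes are all hypothetical; roll_up and drill_down are illustrative helper names.

```python
# Hypothetical hierarchy within the organization dimension.
district_of = {"CLYDE": "EAST", "GLEASON": "EAST", "CARR": "WEST"}

# Made-up sales volumes at the dealership level.
sales_by_dealership = {"CLYDE": 10, "GLEASON": 12, "CARR": 7}


def roll_up(sales, parent_of):
    """Aggregate child-level volumes to the parent level of the hierarchy."""
    totals = {}
    for child, volume in sales.items():
        parent = parent_of[child]
        totals[parent] = totals.get(parent, 0) + volume
    return totals


def drill_down(sales, parent_of, parent):
    """Return the child-level volumes that make up one parent's total."""
    return {child: volume for child, volume in sales.items()
            if parent_of[child] == parent}


sales_by_district = roll_up(sales_by_dealership, district_of)
print(sales_by_district)  # {'EAST': 22, 'WEST': 7}
print(drill_down(sales_by_dealership, district_of, "EAST"))
# {'CLYDE': 10, 'GLEASON': 12}
```

Because both levels live on the same dimension, the district figures are derived from the dealership figures and never need to be stored or maintained separately in this sketch.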

MDD Features - Roll-Ups & Drill Downs

Queries
- The high degree of structure in an MDB makes the query language very simple and efficient.
- The query language is intuitive.
- The output is immediately useful to the end user.

Queries: Example
Display sales volume by model for each dealership:
    PRINT TOTAL.(SALES_VOLUME KEEP MODEL DEALERSHIP)

Queries: Example
Corresponding SQL:
    SELECT MODEL, DEALERSHIP, SUM(SALES_VOLUME)
    FROM SALES_VOLUME
    GROUP BY MODEL, DEALERSHIP
    ORDER BY MODEL, DEALERSHIP
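The SQL statement can be tried out with Python's built-in sqlite3 module. The table layout below (a COLOR column alongside MODEL, DEALERSHIP, and SALES_VOLUME) and the rows are assumptions made for illustration; only the query text comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: one row per model/color/dealership combination.
conn.execute(
    "CREATE TABLE SALES_VOLUME "
    "(MODEL TEXT, COLOR TEXT, DEALERSHIP TEXT, SALES_VOLUME INTEGER)"
)
conn.executemany(
    "INSERT INTO SALES_VOLUME VALUES (?, ?, ?, ?)",
    [
        ("SEDAN", "BLUE", "CLYDE", 6),  # made-up figures
        ("SEDAN", "RED", "CLYDE", 4),
        ("SEDAN", "BLUE", "CARR", 3),
        ("MINI VAN", "BLUE", "CARR", 5),
        ("MINI VAN", "RED", "CARR", 2),
    ],
)

# The query from the slide: total volume by model for each dealership.
rows = conn.execute(
    "SELECT MODEL, DEALERSHIP, SUM(SALES_VOLUME) "
    "FROM SALES_VOLUME "
    "GROUP BY MODEL, DEALERSHIP "
    "ORDER BY MODEL, DEALERSHIP"
).fetchall()
conn.close()

for row in rows:
    print(row)
# ('MINI VAN', 'CARR', 7)
# ('SEDAN', 'CARR', 3)
# ('SEDAN', 'CLYDE', 10)
```

The result is a list of rows, not a model-by-dealership grid; turning it into a cross-tab view takes a further pivoting step.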

Queries: Example

Pros/Cons of MDD
Pros:
- Cognitive advantages for the user
- Ease of data presentation and navigation, time dimension
- Performance
Cons:
- Less flexible
- Requires greater initial effort
