 MDX (Multi-Dimensional eXpressions) is a query language used to retrieve data from multi-dimensional databases.  MDX was originally designed by Microsoft.

Slides:



Advertisements
Similar presentations
Chapter 4 Tutorial.
Advertisements

Outline What is a data warehouse? A multi-dimensional data model Data warehouse architecture Data warehouse implementation Further development of data.
April 30, Data Warehousing and OLAP Technology: An Overview  What is a data warehouse?  Data warehouse architecture  From data warehousing to.
2008/2/141 OLAP (Online Analytical Processing). 2008/2/142 Data Warehouse A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile.
Data Warehousing CPS216 Notes 13 Shivnath Babu. 2 Warehousing l Growing industry: $8 billion way back in 1998 l Range from desktop to huge: u Walmart:
Introduction to Data Warehousing CPS Notes 6.
OLAP Services Business Intelligence Solutions. Agenda Definition of OLAP Types of OLAP Definition of Cube Definition of DMR Differences between Cube and.
1 Lecture 09: OLAP
Decision Support and Data Warehouse. Decision supports Systems Components Data management function –Data warehouse Model management function –Analytical.
Data Warehousing Xintao Wu. Evolution of Database Technology (See Fig. 1.1) 1960s: Data collection, database creation, IMS and network DBMS 1970s: Relational.
Data Sources Data Warehouse Analysis Results Data visualisation Analytical tools OLAP Data Mining Overview of Business Intelligence Data visualisation.
Dr. M. Sulaiman Khan Dept. of Computer Science University of Liverpool 2010 COMP207: Data Mining Data Warehousing COMP207: Data Mining.
Advanced Querying OLAP Part 2. Context OLAP systems for supporting decision making. Components: –Dimensions with hierarchies, –Measures, –Aggregation.
COMP 578 Data Warehousing And OLAP Technology Keith C.C. Chan Department of Computing The Hong Kong Polytechnic University.
© Tan,Steinbach, Kumar Introduction to Data Mining 8/05/ Data Warehouse and Data Cube Lecture Notes for Chapter 3 Introduction to Data Mining By.
Lab3 CPIT 440 Data Mining and Warehouse.
Data Warehousing. On-Line Analytical Processing (OLAP) Tools The use of a set of graphical tools that provides users with multidimensional views of their.
CSE6011 Warehouse Models & Operators  Data Models  relations  stars & snowflakes  cubes  Operators  slice & dice  roll-up, drill down  pivoting.
Microsoft SQL Server 2012 Analysis Services (SSAS) Reporting Services (SSRS)
Ch3 Data Warehouse part2 Dr. Bernard Chen Ph.D. University of Central Arkansas Fall 2009.
DATA WAREHOUSE (Muscat, Oman).
1 Data Warehousing and OLAP. 2 Data Warehousing & OLAP Defined in many different ways, but not rigorously.  A decision support database that is maintained.
Chapter 4 Tutorial.
CS346: Advanced Databases
Online Analytical Processing (OLAP) Hweichao Lu CS157B-02 Spring 2007.
OLAP OPERATIONS. OLAP ONLINE ANALYTICAL PROCESSING OLAP provides a user-friendly environment for Interactive data analysis. In the multidimensional model,
1 Basic concepts of On-Line Analytical processing DT211 /4.
8/20/ Data Warehousing and OLAP. 2 Data Warehousing & OLAP Defined in many different ways, but not rigorously. Defined in many different ways, but.
Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke1 Decision Support Chapter 23.
SQL Analysis Services Microsoft® SQL Server 2005 Analysis Services provides unified, fully integrated views of your business data to support online.
Ahsan Abdullah 1 Data Warehousing Lecture-11 Multidimensional OLAP (MOLAP) Virtual University of Pakistan Ahsan Abdullah Assoc. Prof. & Head Center for.
OnLine Analytical Processing (OLAP)
Data Warehousing Xintao Wu. Can You Easily Answer These Questions? What are Personnel Services costs across all departments for all funding sources? What.
1 Data Warehouses BUAD/American University Data Warehouses.
OLAP & DSS SUPPORT IN DATA WAREHOUSE By - Pooja Sinha Kaushalya Bakde.
Data Warehousing.
Module 1: Introduction to Data Warehousing and OLAP
Roadmap 1.What is the data warehouse, data mart 2.Multi-dimensional data modeling 3.Data warehouse design – schemas, indices 4.The Data Cube operator –
T.ROKAYAH BAYAN OLAP IN THE DATA WAREHOUSE. CHAPTER OBJECTIVES  Review the major features and functions of OLAP in detail  Grasp the intricacies of.
BI Terminologies.
October 28, Data Warehouse Architecture Data Sources Operational DBs other sources Analysis Query Reports Data mining Front-End Tools OLAP Engine.
BUSINESS ANALYTICS AND DATA VISUALIZATION
6.1 © 2010 by Prentice Hall 6 Chapter Foundations of Business Intelligence: Databases and Information Management.
13 1 Chapter 13 The Data Warehouse Database Systems: Design, Implementation, and Management, Seventh Edition, Rob and Coronel.
Dr. N. MamoulisAdvanced Database Technologies1 Topic 6: Data Warehousing & OLAP Defined in many different ways, but not rigorously. A decision support.
SHIFALI CHOUBEY GISE LAB IITB Decision Support System For Farmers.
1 On-Line Analytic Processing Warehousing Data Cubes.
Decision supports Systems Components
Data Mining Data Warehouses.
Business Intelligence Transparencies 1. ©Pearson Education 2009 Objectives What business intelligence (BI) represents. The technologies associated with.
What is OLAP?.
CSE 5331/7331 F'071 CSE 5331/7331 Fall 2007 Dimensional Modeling Margaret H. Dunham Department of Computer Science and Engineering Southern Methodist University.
Copyright© 2014, Sira Yongchareon Department of Computing, Faculty of Creative Industries and Business Lecturer : Dr. Sira Yongchareon ISCG 6425 Data Warehousing.
Data Warehousing.
1 Database Systems, 8 th Edition 1 Chapter 13 Business Intelligence and Data Warehouses Objectives In this chapter, you will learn: –How business intelligence.
Database Management Systems, 2 nd Edition. R. Ramakrishnan and J. Gehrke1 Data Warehousing and Decision Support.
The Need for Data Analysis 2 Managers track daily transactions to evaluate how the business is performing Strategies should be developed to meet organizational.
Database Management Systems 3ed, R. Ramakrishnan and J. Gehrke1 Data Warehousing and Decision Support Chapter 25.
Data Warehousing COMP3017 Advanced Databases Dr Nicholas Gibbins –
Data Warehousing and OLAP Outline u Models & operations u Implementing a warehouse u Future directions.
Data Mining: Data Warehousing
On-Line Analytic Processing
Data warehouse and OLAP
What is OLAP OLAP allows to model data in a multidimensional way like a data cube in order to look for the data from many perspectives.
OLAP Concepts and Techniques
Data Warehouse.
Data Warehousing and OLAP Technology for Data Mining
Lecture 4: From Data Cubes to ML
Data Warehouse.
Presentation transcript:

 MDX (Multi-Dimensional eXpressions) is a query language used to retrieve data from multi-dimensional databases.  MDX was originally designed by Microsoft and introduced along with Analysis Services 7.0 in 1998.

 Cube The cube is the foundation of a multidimensional database. Each cube typically contains more than two dimensions.  Measures The Measures object is basically a special dimension of the cube which is a collection of measures. Measures are quantitative entities which are used for analysis.

 Dimension Ex: Customer and Product.  Hierarchy Ex: Calendar of Date dimension.  Level Ex: Product Line and Country.  Member Ex: USA, Germany, France, UK, Australia and Canada are members of the Country level.

 you can use the following format for accessing a member: [DimensionName].[HierarchyName]. [LevelName].[MemberName] Ex: [Customer].[Country].Australia

 Let’s start to build our first MDX query. The basic MDX query has the following structure: SELECT axis A1,……, axis An ON COLUMNS, axis B1,……, axis Bn ON ROWS FROM cube  Let’s compare that to the similar SQL statement: SELECT column 1, column 2, …, column n FROM table

SELECT Measures.[Internet Sales Amount] on COLUMNS, {[Customer].[Country].[France], [Customer].[Country].[Germany], [Customer].[Country].[United Kingdom]} on ROWS FROM [Adventure Works]

CountryInternet Sales Amount France120,000 Germany999,999 United Kingdom55,000

SELECT Measures.[Sales] ON COLUMS FROM ProductsCube WHERE ([Product].[Color].[Silver])

 Members Views all members of a specific level. Ex: [Customer].[Country].members. The previous statement will view all members in the Country level.

 Children Views all members under a specific member in the hierarchy. Ex: [Date].[Calendar].[Calendar Year -2003].[Semester 2].children The previous statement will view Quarter 3 and Quarter 4 which are the members under the member Semester 2 of Calendar Year in the Calendar hierarchy.

 Named Sets: SELECT Measures.[Internet Sales Amount] on COLUMNS, {[Customer].[Country].[France], [Customer].[Country].[Germany], [Customer].[Country].[United Kingdom]} on ROWS FROM [Adventure Works]

WITH SET [EUROPE] AS '{[Customer].[Country].[France], [Customer].[Country].[Germany], [Customer].[Country].[United Kingdom]}' SELECT Measures.[Internet Sales Amount] on COLUMNS, [EUROPE] on ROWS FROM [Adventure Works]

 Calculated Members: WITH MEMBER [MEASURES].[Profit] AS ([Measures].[Internet Sales Amount] - [Measures].[Total Product Cost]) SELECT [MEASURES].[Profit] ON COLUMNS, [Customer].[Country].MEMBERS ON ROWS FROM [Adventure Works]

2008/2/1417

A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data in support of management’s decision-making process. 2008/2/1418

An OLAP system manages large amount of historical data, provides facilities for summarization and aggregation, and stores and manages information at different levels of granularity. 2008/2/1419

 High performance for both systems ◦ DBMS— tuned for OLTP: access methods, indexing, concurrency control, recovery ◦ Warehouse—tuned for OLAP: complex OLAP queries, multidimensional view, consolidation.  Different functions and different data: ◦ missing data: Decision support requires historical data which operational DBs do not typically maintain ◦ data consolidation: DS requires consolidation (aggregation, summarization) of data from heterogeneous sources ◦ data quality: different sources typically use inconsistent data representations, codes and formats which have to be reconciled 2008/2/1420

 A data warehouse is based on a multidimensional data model which views data in the form of a data cube.  A data cube, such as sales, allows data to be modeled and viewed in multiple dimensions  In data warehousing literature, an n-D base cube is called a base cuboid. The top most 0-D cuboid, which holds the highest-level of summarization, is called the apex cuboid. The lattice of cuboids forms a data cube. 2008/2/1421

2008/2/1422

 Relational OLAP (ROLAP): ◦ Use relational or extended-relational DBMS to store and manage warehouse data and OLAP middle ware to support missing pieces. ◦ Include optimization of DBMS backend, implementation of aggregation navigation logic, and additional tools and services ◦ greater scalability  Multidimensional OLAP (MOLAP): ◦ Array-based multidimensional storage engine (sparse matrix techniques) ◦ fast indexing to pre-computed summarized data  Hybrid OLAP (HOLAP): ◦ User flexibility, e.g., low level: relational, high-level: array.  Specialized SQL servers: ◦ specialized support for SQL queries over star.snowflake schemas 2008/2/1423

 Sales volume as a function of product, month, and region 2008/2/1424 Product Region Month Dimensions: Product, Location, Time Hierarchical summarization paths Industry Region Year Category Country Quarter Product City Month Week Office Day

2008/2/1425

 Roll up (drill-up): summarize data ◦ by climbing up hierarchy or by dimension reduction  Drill down (roll down): reverse of roll-up ◦ from higher level summary to lower level summary or detailed data, or introducing new dimensions  Slice and dice: ◦ project and select  Pivot (rotate): ◦ reorient the cube, visualization, 3D to series of 2D planes.  Other operations ◦ drill through: through the bottom level of the cube to its back-end relational tables (using SQL) 2008/2/1426

2008/2/1427

2008/2/1428

 Cube definition and computation in OLAP 1.define cube sales[item, city, year]: sum(sales_in_dollars) 2.compute cube sales  Transform it into a SQL-like language (with a new operator cube by) SELECT item, city, year, SUM (amount) FROM SALES CUBE BY item, city, year  Need compute the following Group-Bys (date, product, customer), (date,product),(date, customer), (product, customer), (date), (product), (customer) () 2008/2/1429 (item)(city) () (year) (city, item)(city, year)(item, year) (city, item, year)

The roll-up operation performs aggregation on a data cube, either by climbing up a concept hierarchy for a dimension or by dimension reduction such that one or more dimensions are removed from the given cube. Drill-down is the reverse of roll-up. It navigates from less detailed data to more detailed data. Drill-down can be realized by either stepping down a concept hierarchy for a dimension or introducing additional dimensions. 2008/2/1430

The slice operation performs a selection on one dimension of the given cube, resulting in a sub_cube. The dice operation defines a sub_cube by performing a selection on two or more dimensions. 2008/2/1431

2008/2/1432

2008/2/1433

2008/2/1434

2008/2/1435

The select clause defines axis dimensions on COLUMNS and on ROWS, where clause supplies slicer dimensions, and Cube is the name of the data cube. Select axis [, axis] From Cube Where slicer [, slicer] 2008/2/1436

 For the majority of MDX statements, the context of the query will be limited to a single cube. It is important to know how all data within a cube is divided into the following relationship: Dimensions Hierarchies Levels Members 2008/2/1437

SELECT {[Gender].[Gender].Members} ON COLUMNS, {[Product].[Product Family].Members} ON ROWS, FROM [Sales] WHERE ([Measures].[Unit Sales], [Customers].[All Customers], [Education Level].[All Education Level], [Marital Status].[All Martial status], [Promotions].[All Promotions], [Store].[All Stores], [Store Size in SQFT].[All], [Store Type].[All], [Yearly Income].[All Yearly Income] 2008/2/1438

Example on Star Schema 2008/2/1439

2008/2/1940 SELECT [SALES].[AMOUNT] ON COLUMNS, [store].[Kowloon] ON ROWS FROM SALES MDX SQL select sum(amount), area from SALES where (area='Kowloon') group by area

Graphical Description on Roll-up Example 2008/2/1441 Roll-up on Store Dimension (from store to area) Store is the child of Area

2008/2/1942 SELECT [SALES].[AMOUNT] ON COLUMNS, [time].[2003].[Q4].[Dec].[31], [time].[2003].[Q4].[Dec].[30],… …, [time].[2003].[Q4].[Dec].[2], [time].[2003].[Q4].[Dec].[1] ON ROWS FROM SALES MDX SQL select sum(amount), the_date from SALES where (the_date='2003-Dec-31') or (the_date='2003-Dec-30') or… …or (the_date='2003-Dec-2') or (the_date='2003-Dec-1') group by the_date

Graphical Description of Drill-down Example 2008/2/1443 Drill-down on Time Dimension (from month to date) Month is the parent of Date

2008/2/1944 MDX SQL SELECT [SALES].[AMOUNT] ON COLUMNS, [store].[Kowloon].[292] ON ROWS FROM SALES select sum(amount), storecode from SALES where (storecode='292') group by storecode

Graphical Description of Slice 2008/2/1445 Slice WHERE store = ‘292’

2008/2/1946 MDX SQL SELECT [SALES].[AMOUNT] ON COLUMNS, [store].[HK],[store].[NT], [store].[Kowloon] ON ROWS FROM SALES WHERE [time].[2003].[Q4].[Dec].[24 ] select sum(amount), area from SALES where ( (area='HK') or (area='NT') or (area='Kowloon')) and (the_date='2003-Dec-24') group by area

Graphical Description of Dice 2008/2/1447 Dice WHERE (store = ‘292’) and (time = ‘Dec’) and product = ‘ALL’

The CrossJoin ( ) function is used to generate the cross-product of two input sets. If two sets exist in two independent dimensions, the CrossJoin operator creates a new set consisting of all of the combinations of the members in the two dimensions. 2008/2/1448

In some respects, the Filter ( ) function and the slicer axis have similar purposes. The difference between the two is that the Filter ( ) function defines the members in a set, while slicers determine a slice of the cube returned from a query. 2008/2/1449

The Order ( ) function provides sorting capabilities within the MDX language. When the Order expression is used, it can either sort within the natural hierarchy (ASC and BDESC), or it can sort without the hierarchy (BASC and BDESC). The “B” indicates “break” hierachy. 2008/2/1450

The TopCount () and BottomCount() functions provide rank functionality critical in a decision support and data analysis environment. These expressions sort a set based on a numerical expression and pick the top index items based on rank order. 2008/2/1451

2008/2/1752

Dimension tables: [Gender].[Gender Members] [Product].[Product Name] [Marital Status].[All Martial status] [Promotions].[All Promotions], [Store].[All Stores], [Store Size in SQFT].[All], [Store Type].[All], [Yearly Income].[All Yearly Income] [Time].[Year] Fact table: [Measures].[Unit Sales], [Measures].[Store Cost], [Measures].[Store Sales], [Measures].[Sales Count], [Measures].[Store Sales Net] 2008/2/1453

Select CrossJoin({[Gender]. [Gender]. Members}, {[Time].[Year]. Members}) ON COLUMNS, {[Measures].Members} ON ROWS FROM [Sales] Where CrossJoin operator of source sets creates a new set consisting of all of the combinations of the members of the source sets. 2008/2/1454

 The two specific sets that are within the CrossJoin are the two members of the gender level of the Gender dimension, and the two years in the year level of the Time dimension.  The set of all members of the Measure dimension is included on the rows axis.  There is no member explicitly added as a slicer in this query. 2008/2/1455

Unit Sales131, , Store Cost111, , Store Sales280, , Sales count Store Sales Net168, , /2/1456 FM

SELECT {[Measures]. [Unit Sales]} ON COLUMNS, {Filter({[Product]. [Product Department].Members}, ([Gender]. [All Gender].[F],[Measures].[Unit Sales]) > 10000)} ON ROWS FROM [Sales] 2008/2/1457

The results of this query show that the set returned on the rows axis consists of product departments for which unit sales to females is greater than $ /2/1758

Unit Sales Frozen Food26, Produce37, Snack Foods30, Household27, /2/1459

SELECT {[Customers].[All Customers].[USA], [Customers].[All Customers].[USA].Children}ON COLUMNS, {TopCount({[Product].[Product Category].Members}, 5, [Measures].[Unit Sales]), BottomCount({[Product].[Product Category].Members}, 5, [Measures].[Unit Sales])} ON ROWS FROM [Sales] Where TopCount is to request the highest count of the data as a result of the query. Similarly, BottomCount is to request the lowest count of the data as a result of the query. 2008/2/1460

The product categories are included on the rows axis in a comma-separated list where different operators are used to specify a particular subset of the [Product].[Product Category]. Members set. Unit sales is used as the measure with which to select the top five product categories and the bottom five product categories. 2008/2/1461

USACAORWA Snack Foods 30, , , , Vegetables 20, , , , Dairy 12, , , , Jams & Jelies 11, , , , Fruit 11, , , , Canned Oysters Canned Shimp Hardware Candies Canned Food /2/1462

Select {[Marital Status].[All Marital Status].[S]} ON COLUMNS, {Order ({[Promotion Media].[Media Type].Members}, [Unit Sales], BDESC)} ON ROWS FROM [Sales] Where BDESC means sort descending without hierarchy. 2008/2/1463

The sort was performed using the [Marital Status].[All Marital Status].[S] member in descending order without hierarchy of the unit sales. 2008/2/1764

Marital StatusS No Media Daily Paper, Radio, TV 4, Daily Paper 3, Product Attachment 3, Daily Paper, Radio 3, Cash Register Handout 3, Sunday Paper, Radio 3, Street Handout 2, Sunday Paper 2, Bulk Mail 2, In Store Coupon 1, TV 1, Sunday Paper, Radio, TV 1, Radio 1, /2/1465

Select {[Gender], Members} ON COLUMNS, {TopCount ({[Product].[Product Name].Members},10, ([Gender].[Gender].[F], [Measures].[Unit Sales]))} ON ROWS FROM [Sales] WHERE ([Marital Status].[All Marital Status].[M], [Measures].[Unit Sales]) 2008/2/1466

 The columns axis contains all members of the gender dimension, [All Gender], [F]. and [M]. [All Gender] is included because the.Members function was placed on the gender dimension instead of on the [Gender].[Gender] level.  The fundamental set in the rows axis consists of names of products (members of the [Product].[Product Name] level). In this query the TopCount ( ) function is used to examine some of the products. Of specific interest here are the top 10 products in unit sales pruchased by females. Therefore, the index in the TopCount ( ) function is 10, and the numeric expression is the tuple ([Gender].[Gender].[F], [Measures].[Unit Sales]).  The slicer contains the two members explicitly defined, [Marital Status].[All Marital Status].[M] and [Measures].[Unit Sales], because only data with these characteristics is desired. 2008/2/1467

All GenderFM Fabulous Berry Juice Fast Beef Jerky BBB Best Pepper Ebony Cantelope Peart Cheable Wine Skinner Diel Cola Shdy Lake Manicotti Pearl Light Beer Shady Lake Rice Medly TriState Potatos /2/1468

Starting with the base cuboid [Year, Month, Customer, Product, Sales-person, Sales-quota, Actual-sales], what specific OLAP operation should be performed in order to list the total Actual Sales by Customer in year 2000? 2008/2/1469 The requested SQL statement is: Select Customer, Sum(Actual_sales) From Sales Where year = ‘2000’ Group by customer The requested MDX statement is: Select{[Sales].Actual_sales}on Columns ({[Customer].Customer_name},{[Time].Year}) on Rows from Cuboid Where([Time].[Year].[2000])

“Data Mining: Concepts and Techniques” Second edition by Han and Kamber, Morgan Kaufmann publishers, 2007, chapter 3, pp /2/1470

Discuss three methods of implementing an online analytical processing command. Give an example of using one of them with a given Star schema. 2008/2/1471

Suppose that a data warehouse consists of the three dimensions time, doctor, and patent, and the two measures count and charge, where charge is the fee that a doctor charges a patient for a visit. Starting with the base cuboid [day, doctor, patient], provide a MDX (Multidimensional Expression) query to list the total fee collected by each doctor in 2000? To obtain the same list, write an SQL query assuming the data is stored in a relational database with the table fee (day, month, year, doctor, hospital, patient, count, charge). 1. Starting with the base cuboid [day, doctor, patient], provide a MDX (Multidimensional Expression) query to list the total fee collected by each doctor in 2000? 2. To obtain the same list, write an SQL query assuming the data is stored in a relational database with the table fee (day, month, year, doctor, hospital, patient, count, charge). 2008/2/1472