Horizontal data sets: the number of attributes ranges from the same order as the number of records to several orders of magnitude higher. Example: genetic data.


Horizontal data sets: the number of attributes ranges from the same order as the number of records to several orders of magnitude higher. Example: genetic data sets can have 10,000 attributes and 100 records. With 10,000 attributes there are up to 100 million combinations of two attributes and up to 1 trillion three-attribute sets!
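The growth in candidate attribute sets can be checked with a short calculation. The sketch below counts unordered attribute sets with Python's standard `math.comb`; the function name `attribute_set_counts` is a hypothetical helper introduced only for illustration. The unordered counts fall below the 10^8 and 10^12 upper bounds quoted on the slide.

```python
from math import comb


def attribute_set_counts(n_attributes: int, max_size: int) -> dict:
    """Return {k: number of unordered k-attribute sets} for k = 1..max_size."""
    return {k: comb(n_attributes, k) for k in range(1, max_size + 1)}


# 10,000 attributes, as in the genetic data example on the slide.
counts = attribute_set_counts(10_000, 3)
for size, count in counts.items():
    print(f"{size}-attribute sets: {count:,}")
```

Even for sets of only three attributes the count is in the hundreds of billions, which is why enumerating attribute combinations directly is impractical for horizontal data sets.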

Data-driven algorithm: constructing the max-conf kernel for small data sets
Input: (i) a database DB; (ii) a fixed consequent C
Output: a set R of rules such that for any rule of the form X->C there exists a rule X'->C in R, where X' is a superset of X and X'->C has a higher confidence than X->C

Algorithm:
// DB(C) is the set of records that satisfy the consequent C
// RS is a working set that maintains the current subset of records satisfying the consequent
// COMMON is the set of descriptors common to all records in RS
MaxConfKernelSet(DB, C, DB(C), RS, COMMON) {
  i = size(RS) + 1;
  if (i == 1) { COMMON = descriptors in the ith record in DB(C); }
  RS = RS \union {ith record in DB(C)};
  while (i <= size(DB(C))) do {
    Delete from COMMON the descriptors not shared by the ith record;
    Compute the support of the records satisfying {COMMON - C};
    Compute the confidence of COMMON - C -> C;
    if ((COMMON - C) != null) {
      if sufficient support and not duplicate, output "COMMON - C -> C [support, conf]";
      MaxConfKernelSet(DB, C, DB(C), RS, COMMON);
    }
    RS = RS - {ith record in DB(C)};
    i++;
    RS = RS \union {ith record in DB(C)};
  }
}
Invoke: MaxConfKernelSet(DB, C, DB(C), null, null); // RS and COMMON are empty initially
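The idea behind the pseudocode, intersecting the descriptors of growing subsets of DB(C) and turning each non-empty intersection into a candidate rule, can be sketched in Python. This is an illustrative reading of the slide, not the author's implementation: the function name `max_conf_kernel`, the representation of records as frozensets of descriptor strings, the `min_support` parameter, and the sample data are all assumptions. The sketch passes the running intersection down the recursion instead of mutating a shared COMMON set.

```python
from typing import FrozenSet, List, Optional, Tuple

Record = FrozenSet[str]
Rule = Tuple[Record, int, float]  # (antecedent, support count, confidence)


def max_conf_kernel(db: List[Record], consequent: Record,
                    min_support: int = 1) -> List[Rule]:
    """Enumerate rules X -> C where X is the set of descriptors shared by
    some subset of the records that satisfy C (assumed reading of the slide)."""
    target = [r for r in db if consequent <= r]  # DB(C)
    seen = set()                                 # antecedents already output
    rules: List[Rule] = []

    def extend(start: int, common: Optional[Record]) -> None:
        for i in range(start, len(target)):
            # Keep only descriptors shared with the ith record of DB(C).
            shared = target[i] if common is None else common & target[i]
            antecedent = shared - consequent
            if not antecedent:
                continue  # adding more records can only shrink the intersection
            if antecedent not in seen:
                seen.add(antecedent)
                covered = sum(1 for r in db if antecedent <= r)
                hits = sum(1 for r in target if antecedent <= r)
                if hits >= min_support:
                    rules.append((antecedent, hits, hits / covered))
            extend(i + 1, shared)

    extend(0, None)
    return rules


# Made-up example data: descriptors are plain strings, "C" is the consequent.
demo_db = [frozenset(s) for s in (
    {"a", "b", "C"}, {"a", "d", "C"}, {"a", "b"}, {"b", "d", "C"})]
rules = max_conf_kernel(demo_db, frozenset({"C"}))
for antecedent, support, conf in rules:
    print(sorted(antecedent), "-> C", f"[support={support}, conf={conf:.2f}]")
```

The enumeration is exponential in the size of DB(C), which is consistent with the slide's restriction to small data sets.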

OLAP and statistical databases
Statistical databases (from the early 1980s): multidimensional data sets, concerned with summarization over the dimensions of the data sets. 2-D representations: census data, socioeconomic data, etc.
OLAP (on-line analytical processing): mid-1990s

Multi-dimensional Statistical Table

2-D representation of statistical data

A graph model for statistical data

A schema for statistical data

More schemas

Relational representation of statistical object

Automatic aggregation concept

Terms in SDB and OLAP

SDB and OLAP operators

Completeness of statistical algebra

Overlapping and time-varying categories

Physical organization

Encoding column category values

Array linearization
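The transcript gives only the slide title here. As a hedged illustration of what array linearization usually means, the sketch below maps a multidimensional cell index to a single offset in row-major order and back. The function names `linearize` and `delinearize` and the row-major choice are assumptions, not taken from the slides.

```python
from typing import Sequence, Tuple


def linearize(index: Sequence[int], shape: Sequence[int]) -> int:
    """Map a multidimensional cell index to a row-major offset."""
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError(f"index {i} out of range for dimension of size {n}")
        offset = offset * n + i
    return offset


def delinearize(offset: int, shape: Sequence[int]) -> Tuple[int, ...]:
    """Invert linearize: recover the cell index from a row-major offset."""
    index = []
    for n in reversed(shape):
        offset, i = divmod(offset, n)
        index.append(i)
    return tuple(reversed(index))


shape = (3, 4, 5)                    # e.g. 3 models x 4 years x 5 colors
position = linearize((1, 2, 3), shape)
print(position, delinearize(position, shape))
```

Storing cells at computed offsets means category values need not be stored with each cell, since the position alone identifies them.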

Header compression

Lattice of materialization

Partitioning of a data cube into subcubes

Cube operator

Data Cube – shortcomings of SQL

Sales roll-up by model, by year, and by color

Using ALL value

3 dimensional rollup in SQL

Cross-tabulation in SQL

Cross Tabulation

CUBE operator
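The transcript lists only the slide title. As a rough sketch of the semantics usually attributed to the CUBE operator, the code below aggregates a measure over every subset of the grouping dimensions and marks the aggregated-away dimensions with an ALL value. The function name `cube`, the row layout, and the sales figures are made up for illustration and are not taken from the slides.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, List, Sequence, Tuple

ALL = "ALL"  # placeholder for a dimension that has been aggregated away


def cube(rows: List[dict], dims: Sequence[str], measure: str) -> Dict[Tuple, int]:
    """Sum `measure` for every combination of kept and aggregated dimensions."""
    totals: Dict[Tuple, int] = defaultdict(int)
    for row in rows:
        for k in range(len(dims) + 1):
            for kept in combinations(dims, k):
                key = tuple(row[d] if d in kept else ALL for d in dims)
                totals[key] += row[measure]
    return dict(totals)


# Made-up sales rows over the dimensions model, year, and color.
sales = [
    {"model": "Chevy", "year": 1994, "color": "black", "units": 10},
    {"model": "Chevy", "year": 1994, "color": "white", "units": 20},
    {"model": "Chevy", "year": 1995, "color": "black", "units": 30},
    {"model": "Ford", "year": 1994, "color": "black", "units": 40},
]
result = cube(sales, ("model", "year", "color"), "units")
print(result[(ALL, ALL, ALL)], result[("Chevy", ALL, ALL)], result[(ALL, 1994, "black")])
```

With three dimensions each input row contributes to 2^3 = 8 cells, one per group-by in the cube.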

Support of histograms

A 3D data cube

ALL value and decoration field

Decorations

ROLLUP operator

Percentage of total as an aggregate function

Indices

Star schema