Chapter 15 Data Warehousing, OLAP, and Data Mining



Introduction
Data, data, data… everywhere! Information… that's another story! Especially the right information at the right time.
The goal of data warehousing is to make the right information available at the right time.
Data warehousing is both a data store (e.g., a database of some sort) and a process for bringing together disparate data from throughout an organization for decision-support purposes.

Introduction
Data warehouses are natural allies for data mining: the two work together well.
Data mining can help fulfill part of the goal of data warehousing: the right information at the right time.
Relational database management systems (RDBMS) such as Oracle, DB2, Sybase, Informix, Focus, and SQL Server are often used for data warehousing.

Definitions of a Data Warehouse
"A subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process" - W.H. Inmon
"A copy of transaction data, specifically structured for query and analysis" - Ralph Kimball

Data Warehouse
For organizational learning to take place, data from many sources must be gathered together and organized in a consistent and useful way; hence, data warehousing (DW).
DW allows an organization (enterprise) to remember what it has noticed about its data.
Data mining techniques make use of the data in a data warehouse.

Data Warehouse
[Diagram: the enterprise "database" of transaction data (customers, orders, transactions, vendors, etc.) is copied, organized, and summarized into the data warehouse, which is then used for data mining. Data miners are labeled "farmers" (they know) and "explorers" (unpredictable).]

Data Warehouse
A data warehouse is a copy of transaction data specifically structured for querying, analysis, reporting, and more rigorous data mining.
Note that the data warehouse contains a copy of the transactions, and the transaction system does not update or change that copy later.
Also note that this data is specially structured and may have been transformed when it was copied into the data warehouse.

Data Mart
A data mart is a smaller, more focused data warehouse: a mini-warehouse.
A data mart typically reflects the business rules of a specific business unit within an enterprise.

Data Warehouse to Data Mart
[Diagram: one data warehouse feeds three data marts, each of which supplies decision support information.]

Generic Architecture of Data
[Figure not reproduced; surviving labels: "(synonym)", "Transaction data".]

Transaction (Operational) Data
Operational (production) systems create massive numbers of transactions, such as sales, purchases, deposits, withdrawals, returns, refunds, phone calls, toll road usage, web site "hits", etc.
Transactions are the base level of data: the raw material for understanding customer behavior.
Unfortunately, operational systems change due to changing business needs.
Fortunately, operational systems can usually be changed to support changing business needs.
Data warehousing strategies need to be aware of operational system changes.

Operational Summary Data
Summaries cover a specific time period and are built from the transaction data for that period.
Other examples?
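As a minimal sketch of how a period summary can be derived from transaction data, the snippet below groups a handful of transactions by month. The records, field layout, and the `monthly_summary` helper are all hypothetical and chosen only for illustration; they do not come from the slides.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction records: (date, kind, amount)
transactions = [
    (date(2024, 1, 5), "sale", 120.0),
    (date(2024, 1, 20), "sale", 80.0),
    (date(2024, 1, 22), "refund", -30.0),
    (date(2024, 2, 3), "sale", 200.0),
]

def monthly_summary(rows):
    """Return the transaction count and total amount per (year, month)."""
    summary = defaultdict(lambda: {"count": 0, "total": 0.0})
    for day, _kind, amount in rows:
        bucket = summary[(day.year, day.month)]
        bucket["count"] += 1
        bucket["total"] += amount
    return dict(summary)

summary = monthly_summary(transactions)
print(summary[(2024, 1)])
```

The summary keeps only one row per period, which is why it is much smaller than the transaction data it is built from.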

Decision Support Summary Data
The data used to help make decisions about the business.
Financial data, such as:
  Income statements (profit and loss)
  Balance sheets (assets - liabilities = net worth)
  Sales summaries
  Other examples?
Data warehouses maintain this type of data; however, financial data "of record" (for audit purposes) usually comes from the operational databases and not from the data warehouse, which can be confusing.
Generally, it is a bad idea to use the same system for analytic and operational purposes.

Database Schema
A database schema defines the structure of the data, not the values of the data (e.g., first name, last name = structure; Ron Norman = values of the data).
In an RDBMS:
  Columns = fields = attributes (A, B, C)
  Rows = records = tuples (1-7)
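The structure/value distinction can be illustrated with Python's built-in `sqlite3` module. This is only a sketch: the `person` table and its column names are assumptions made up for the example, and SQLite stands in for whatever RDBMS is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Schema: the structure of the data (hypothetical table and column names).
conn.execute("CREATE TABLE person (first_name TEXT, last_name TEXT)")

# Values: one row (record, tuple) that fits that structure.
conn.execute("INSERT INTO person VALUES (?, ?)", ("Ron", "Norman"))

# PRAGMA table_info returns one row per column; index 1 is the column name.
columns = [info[1] for info in conn.execute("PRAGMA table_info(person)")]
rows = conn.execute("SELECT first_name, last_name FROM person").fetchall()

print("structure:", columns)
print("values:", rows)
```

The schema can be inspected even when the table is empty, which is the sense in which it describes structure rather than values.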

Logical & Physical Database Schema
Logical schema: describes the data in a way that is familiar to business users.
Physical schema: describes the data the way it will be stored in an RDBMS, which might differ from the way the logical schema shows it.

Metadata
General definition: data about data!
Examples:
  A library's card catalog (metadata) describes publications (data).
  A file system maintains permissions (metadata) about files (data).
A form of system documentation, including:
  Values legally allowed in a field (e.g., AZ, CA, OR, UT, WA, etc.)
  Description of the contents of each field (e.g., start date)
  Date when the data were loaded
  Indication of the currency of the data (last updated)
  Mappings between systems (e.g., A.this = B.that)
Invaluable: without it, you have to research to find this information.
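One way to picture the documentation items above is as a small record kept per field. The sketch below is an assumption-laden illustration: the `FieldMetadata` class, its attribute names, the dates, and the description text are hypothetical, and the allowed values are only the five codes the slide happens to list.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, Optional

@dataclass
class FieldMetadata:
    """Hypothetical metadata record describing one field in the warehouse."""
    name: str
    description: str
    allowed_values: FrozenSet[str] = frozenset()  # empty means "no restriction"
    loaded_on: Optional[date] = None
    last_updated: Optional[date] = None
    source_mapping: str = ""

    def is_valid(self, value: str) -> bool:
        """Check a value against the legally allowed values, if any are listed."""
        return not self.allowed_values or value in self.allowed_values

state_meta = FieldMetadata(
    name="state",
    description="State code (illustrative description)",
    allowed_values=frozenset({"AZ", "CA", "OR", "UT", "WA"}),  # partial list from the slide
    loaded_on=date(2024, 1, 1),       # made-up load date
    last_updated=date(2024, 1, 15),   # made-up currency indicator
    source_mapping="A.this = B.that",
)

print(state_meta.is_valid("CA"), state_meta.is_valid("XX"))
```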

Business Rules
Highest level of abstraction from operational (transaction) data.
Describe why relationships exist and how they are applied.
Examples:
  Need to have 3 forms of ID for credit
  Only allow a maximum daily withdrawal of $200
  After the 3rd log-in attempt, lock the log-in screen
  Accept no bills larger than $20
  Others?
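Three of the example rules above can be written as small predicate functions. This is a sketch only: the function names are hypothetical, and it assumes that "log-in attempt" means a failed attempt and that the withdrawal limit applies to the running total for the day.

```python
MAX_DAILY_WITHDRAWAL = 200    # dollars, from the example rule
MAX_LOGIN_ATTEMPTS = 3        # lock after the 3rd attempt
LARGEST_BILL_ACCEPTED = 20    # dollars

def can_withdraw(withdrawn_today: int, requested: int) -> bool:
    """Allow the withdrawal only if the day's total stays within the limit."""
    return withdrawn_today + requested <= MAX_DAILY_WITHDRAWAL

def is_locked(failed_attempts: int) -> bool:
    """Assumption: the screen locks once three failed attempts have occurred."""
    return failed_attempts >= MAX_LOGIN_ATTEMPTS

def accepts_bill(denomination: int) -> bool:
    """Reject any bill larger than the largest accepted denomination."""
    return denomination <= LARGEST_BILL_ACCEPTED

print(can_withdraw(150, 50), is_locked(3), accepts_bill(50))
```

Keeping the limits as named constants separates the rule (the business decision) from the code that enforces it.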

General Architecture for Data Warehousing
  Source systems
  Extraction, (clean), transformation, and load (ETL)
  Central repository
  Metadata repository
  Data marts
  Operational feedback
  End users (business)
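The extract, clean, transform, and load steps can be sketched in a few lines. Everything here is an illustrative assumption: the source rows, the cleaning rules, the `fact_orders` table name, and the use of an in-memory SQLite database as the "central repository".

```python
import sqlite3

# Hypothetical rows as they might arrive from a source system.
source_rows = [
    {"order_id": 1, "customer": " alice ", "amount": "19.99", "state": "ca"},
    {"order_id": 2, "customer": "Bob", "amount": "5.00", "state": "OR"},
    {"order_id": 3, "customer": "", "amount": "oops", "state": "WA"},
]

def extract():
    """Extract: copy rows out of the source system."""
    return list(source_rows)

def clean(rows):
    """Clean: drop rows with a non-numeric amount or a blank customer."""
    good = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue
        if not row["customer"].strip():
            continue
        good.append({**row, "amount": amount})
    return good

def transform(rows):
    """Transform: normalize names and state codes into warehouse format."""
    return [
        (r["order_id"], r["customer"].strip().title(), r["amount"], r["state"].upper())
        for r in rows
    ]

def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL, state TEXT)"
    )
    conn.executemany("INSERT INTO fact_orders VALUES (?, ?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(clean(extract())))
loaded = warehouse.execute("SELECT * FROM fact_orders ORDER BY order_id").fetchall()
print(loaded)
```

A real pipeline would also record what was rejected and when the load ran, which is the kind of information a metadata repository holds.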

Where does OLAP fit in?

OLAP Overview
Interactive, exploratory analysis of multidimensional data to discover patterns.

OLAP Architecture

Server Options
  Single processor
  Symmetric multiprocessor (SMP)
  Massively parallel processor (MPP)

OLAP Server Options
  ROLAP (Relational)
  MOLAP (Multidimensional)
  HOLAP (Hybrid)

OLAP – Online Analytical Processing
A definition:
  Data representation is in the form of a cube.
  OLAP goes beyond SQL with its analysis capabilities.
Key feature of OLAP: relevant multi-dimensional views, such as products, time, geography.
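To make the cube idea concrete, the sketch below stores one sales figure per (product, time, geography) cell and aggregates along chosen dimensions. The fact rows and the `build_cube` and `roll_up` helpers are hypothetical; a real OLAP server would use far more elaborate storage.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, quarter, region, sales)
facts = [
    ("widget", "Q1", "West", 100),
    ("widget", "Q1", "East", 150),
    ("widget", "Q2", "West", 120),
    ("gadget", "Q1", "West", 200),
    ("gadget", "Q2", "East", 50),
]

DIMENSIONS = ("product", "time", "geography")

def build_cube(rows):
    """Store one aggregated value per (product, time, geography) cell."""
    cube = defaultdict(int)
    for product, quarter, region, sales in rows:
        cube[(product, quarter, region)] += sales
    return dict(cube)

def roll_up(cube, keep):
    """Sum the cube down to the dimensions named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    result = defaultdict(int)
    for cell, value in cube.items():
        result[tuple(cell[i] for i in positions)] += value
    return dict(result)

cube = build_cube(facts)
by_product = roll_up(cube, ["product"])
by_product_time = roll_up(cube, ["product", "time"])
grand_total = roll_up(cube, [])
print(by_product)
```

Each choice of `keep` gives a different multi-dimensional view of the same underlying cells.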

OLAP Cube - 1

OLAP Cube - 2

OLAP Cube - 3 Star Structure (quite common)

OLAP Cube - 4 The Cube

OLAP Cube - 5 Three-Dimensional Cube Display

OLAP Cube - 6 Six-Dimensional Cube

Rotation (Pivot Table)
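Rotation swaps which dimension runs down the rows and which runs across the columns, without changing any values. A minimal sketch, assuming a two-dimensional slice of sales data; the figures and the `as_table` helper are hypothetical.

```python
# Hypothetical 2-D slice of a cube: sales[(product, quarter)]
sales = {
    ("widget", "Q1"): 250,
    ("widget", "Q2"): 120,
    ("gadget", "Q1"): 200,
    ("gadget", "Q2"): 50,
}

def as_table(cells, row_axis, col_axis):
    """Lay the cells out as nested dicts: table[row][column] = value.

    row_axis and col_axis are positions (0 or 1) within each cell key.
    """
    table = {}
    for key, value in cells.items():
        table.setdefault(key[row_axis], {})[key[col_axis]] = value
    return table

by_product = as_table(sales, row_axis=0, col_axis=1)   # products as rows
by_quarter = as_table(sales, row_axis=1, col_axis=0)   # rotated: quarters as rows

print(by_product["widget"]["Q2"], by_quarter["Q2"]["widget"])
```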

Drill Down
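Drilling down moves from a summarized level of a hierarchy to a more detailed one. The sketch below assumes a year, quarter, month hierarchy over made-up sales figures; the `summarize` helper and the level names are illustrative, not taken from the slides.

```python
from collections import defaultdict

# Hypothetical monthly sales rows: (year, quarter, month, amount)
rows = [
    (2024, "Q1", "Jan", 40),
    (2024, "Q1", "Feb", 60),
    (2024, "Q1", "Mar", 50),
    (2024, "Q2", "Apr", 70),
]

LEVELS = ("year", "quarter", "month")

def summarize(data, level):
    """Total the amounts at the given level of the time hierarchy."""
    depth = LEVELS.index(level) + 1
    totals = defaultdict(int)
    for *path, amount in data:
        totals[tuple(path[:depth])] += amount
    return dict(totals)

by_year = summarize(rows, "year")        # most summarized view
by_quarter = summarize(rows, "quarter")  # drill down one level
by_month = summarize(rows, "month")      # drill down to the finest level

print(by_year, by_quarter)
```

Going the other way, from months back to quarters or years, is the corresponding roll-up.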

OLAP Examples
http://perso.wanadoo.fr/bernard.lupin/english/example.htm
Excel Pivot Table example (similar to an OLAP cube)

Sample of OLAP products
Just a snippet from http://www.olapreport.com/ProductsIndex.htm; not an endorsement.

Data Mining versus OLAP

Data Mining versus OLAP
OLAP (Online Analytical Processing) provides a very good view of what is happening, but it cannot predict what will happen in the future or explain why it is happening.

Results of Data Mining Include:
  Forecasting what may happen in the future
  Classifying people or things into groups by recognizing patterns
  Clustering people or things into groups based on their attributes
  Associating what events are likely to occur together
  Sequencing what events are likely to lead to later events
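As one small illustration of the "associating" result, the sketch below counts how often pairs of items appear together and computes a simple confidence figure. The baskets and helper names are hypothetical, and this naive counting is a teaching sketch rather than any particular data mining product's algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market baskets (one set of items per transaction).
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

def pair_counts(transactions):
    """Count how many transactions contain each pair of items."""
    counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(items), 2):
            counts[pair] += 1
    return counts

def confidence(transactions, antecedent, consequent):
    """Share of transactions containing `antecedent` that also contain `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return 0.0
    return sum(consequent in t for t in with_antecedent) / len(with_antecedent)

counts = pair_counts(baskets)
bread_to_milk = confidence(baskets, "bread", "milk")
print(counts.most_common(2), round(bread_to_milk, 2))
```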

End of Chapter 15