Data Warehouse Objectives

Presentation transcript:

Data Warehouse Objectives Understand the needs and concepts of decision support systems Understand the concepts and issues of the Data Warehouse Understand the concepts of On-Line Analytical Processing (OLAP) Understand how the Data Warehouse is designed and implemented Understand the concepts of data mining Slide

The Need for Data Analysis Constant pressure from external and internal forces requires prompt tactical and strategic decisions. The decision-making cycle time is reduced, while problems are increasingly complex with a growing number of internal and external variables. Managers need support systems for facilitating quick decision making in a complex environment. Decision support systems (DSS). Slide

Decision Support Systems Decision Support is a methodology (or a series of methodologies) designed to extract information from data and to use such information as a basis for decision making. A decision support system (DSS) is an arrangement of computerized tools used to assist managerial decision making within a business. A DSS usually requires extensive data “massaging” to produce information. The DSS is used at all levels within an organization and is often tailored to focus on specific business areas or problems. The DSS is interactive and provides ad hoc query tools to retrieve data and to display data in different formats. Slide

Decision Support Systems Four Components of a DSS The data store component is basically a DSS database. The data extraction and filtering component is used to extract and validate the data taken from the operational database and the external data sources. The end user query tool is used by the data analyst to create the queries that access the database. The end user presentation tool is used by the data analyst to organize and present the data. Slide

Decision Support Systems Slide

Decision Support Systems Three Main Areas in Which DSS Data Differ from Operational Data Time span Operational data represent current (atomic) transactions. DSS data tend to cover a longer time frame. Granularity Operational data represent specific transactions that occur at a given time. DSS data must be presented at different levels of aggregation. Dimensionality Operational data focus on representing atomic transactions. DSS data can be analyzed from multiple dimensions. Slide
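
The granularity and dimensionality points can be illustrated with a minimal sketch. The transaction records, the field names (`region`, `product`, `month`, `amount`), and the `aggregate` helper are all hypothetical and not part of the original slides; the idea is only that the same atomic rows can be summarized at different levels and along different dimensions.

```python
from collections import defaultdict

# Hypothetical atomic (operational) transactions: one row per sale.
transactions = [
    {"date": "2024-01-05", "region": "East", "product": "Widget", "amount": 100.0},
    {"date": "2024-01-17", "region": "East", "product": "Gadget", "amount": 250.0},
    {"date": "2024-02-02", "region": "West", "product": "Widget", "amount": 75.0},
    {"date": "2024-02-20", "region": "East", "product": "Widget", "amount": 125.0},
]

# Derive a month attribute so the time dimension can be summarized.
for row in transactions:
    row["month"] = row["date"][:7]

def aggregate(rows, *dimensions):
    """Sum 'amount' over the given dimensions (a coarser or finer grain)."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row["amount"]
    return dict(totals)

# The same data viewed at different levels of aggregation and dimensions.
by_region = aggregate(transactions, "region")
by_region_month = aggregate(transactions, "region", "month")
by_product = aggregate(transactions, "product")
```

Here `by_region` is the coarsest view, while `by_region_month` adds the time dimension and so sits closer to the atomic grain.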

Decision Support Systems The DSS Database Requirements Database Schema The DSS database schema must support complex (non-normalized) data representations. The queries must be able to extract multidimensional time slices. Data Extraction and Loading The DBMS must support advanced data extracting and filtering tools. The data extraction capabilities should support different data sources and multiple vendors. Data filtering capabilities must include the ability to check for inconsistent data and to apply data validation rules. The DBMS must support advanced data integration, aggregation, and classification capabilities. Slide
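
As a rough sketch of the extraction and filtering requirement, the code below merges records from two hypothetical sources and separates clean rows from rows that break validation rules. The source names, fields, and rules are assumptions made up for illustration; a real extraction tool would be far more elaborate.

```python
def validate(record):
    """Return a list of rule violations for one extracted record."""
    problems = []
    if not record.get("customer_id"):
        problems.append("missing customer_id")
    if record.get("amount") is None or record["amount"] < 0:
        problems.append("amount must be non-negative")
    if record.get("currency") not in {"USD", "EUR"}:
        problems.append("unknown currency")
    return problems

def extract_and_filter(sources):
    """Merge records from several sources, separating clean rows from rejects."""
    clean, rejected = [], []
    for source_name, records in sources.items():
        for record in records:
            problems = validate(record)
            tagged = {**record, "source": source_name}  # keep track of the origin
            if problems:
                rejected.append((tagged, problems))
            else:
                clean.append(tagged)
    return clean, rejected

# Hypothetical operational and external sources.
sources = {
    "orders_db": [
        {"customer_id": "C1", "amount": 40.0, "currency": "USD"},
        {"customer_id": "", "amount": 15.0, "currency": "USD"},
    ],
    "external_feed": [
        {"customer_id": "C2", "amount": -5.0, "currency": "GBP"},
    ],
}
clean, rejected = extract_and_filter(sources)
```

Keeping the rejected rows together with their violations, rather than silently dropping them, makes the inconsistent data available for later review.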

Decision Support Systems End-User Analytical Interface The DSS DBMS must support advanced data modeling and data presentation tools, data analysis tools, and query generation and optimization components. The end user analytical interface is one of the most critical components. Database Size Requirements The DBMS must be capable of supporting very large databases (VLDB). Slide

The Data Warehouse The Data Warehouse is an integrated, subject-oriented, time-variant, non-volatile database that provides support for decision making. Integrated The Data Warehouse is a centralized, consolidated database that integrates data retrieved from the entire organization. Subject-Oriented The Data Warehouse data is arranged and optimized to provide answers to questions coming from diverse functional areas within a company. Slide

The Data Warehouse Time-Variant The Warehouse data represent the flow of data through time. It can even contain projected data. Non-Volatile Once data enter the Data Warehouse, they are never removed. The Data Warehouse is always growing. Slide
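
A minimal sketch of the time-variant and non-volatile properties, assuming a hypothetical `SnapshotStore` class: each periodic batch is appended with a snapshot date, and nothing is ever updated or deleted, so the history of a value can be read back.

```python
class SnapshotStore:
    """Append-only store: rows are stamped with a snapshot date, never changed."""

    def __init__(self):
        self._rows = []

    def load_batch(self, snapshot_date, rows):
        """Append one periodic batch, stamping each row with its snapshot date."""
        for row in rows:
            self._rows.append({**row, "snapshot_date": snapshot_date})

    def history(self, product):
        """Return every recorded row for a product, ordered by snapshot date."""
        matches = [r for r in self._rows if r["product"] == product]
        return sorted(matches, key=lambda r: r["snapshot_date"])

    def __len__(self):
        return len(self._rows)

store = SnapshotStore()
store.load_batch("2024-01-31", [{"product": "Widget", "price": 9.99}])
store.load_batch("2024-02-29", [{"product": "Widget", "price": 10.49}])

# Both prices are kept; the newer snapshot does not overwrite the older one.
prices_over_time = [r["price"] for r in store.history("Widget")]
```

The class deliberately exposes no update or delete method, which is how this sketch expresses the non-volatile rule.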

The Data Warehouse Slide

The Data Warehouse Data Mart A Data Mart is a small, single-subject Data Warehouse subset that provides decision support to a small group of people. Data Marts can serve as a test vehicle for companies exploring the potential benefits of Data Warehouses. Data Marts address local or departmental problems, while a Data Warehouse involves a company-wide effort to support decision making at all levels in the organization. Slide

The Data Warehouse The Evolution of the Data Warehouse Reporting systems of the 1980s Direct access to the operational data through a menu interface. Text-only presentation tools. Sophisticated form of decision support Lightly summarized data extracted from the operational database. Stored in an RDBMS and accessed through a SQL-based query tool. Predefined as well as ad hoc query capabilities. Use of spreadsheets or statistical packages to analyze data. Use of desktop tools. Slide

The Data Warehouse Twelve Rules That Define a Data Warehouse 1. The Data Warehouse and operational environments are separated. 2. The Data Warehouse data are integrated. 3. The Data Warehouse contains historical data over a long time horizon. 4. The Data Warehouse data are snapshot data captured at a given point in time. 5. The Data Warehouse data are subject-oriented. 6. The Data Warehouse data are mainly read-only with periodic batch updates from operational data. No online updates are allowed. 7. The Data Warehouse development life cycle differs from classical systems development. The Data Warehouse development is data driven; the classical approach is process driven. Slide

The Data Warehouse 8. The Data Warehouse contains data with several levels of detail: current detail data, old detail data, lightly summarized data, and highly summarized data. 9. The Data Warehouse environment is characterized by read-only transactions to very large data sets. The operational environment is characterized by numerous update transactions to a few data entities at a time. 10. The Data Warehouse environment has a system that traces data sources, transformations, and storage. 11. The Data Warehouse’s metadata are a critical component of this environment. The metadata identify and define all data elements. The metadata provide the source, transformation, integration, storage, usage, relationships, and history of each data element. 12. The Data Warehouse contains a charge-back mechanism for resource usage that enforces optimal use of the data by end users. Slide

On-Line Analytical Processing On-Line Analytical Processing (OLAP) is an advanced data analysis environment that supports decision making, business modeling, and operations research activities. Four Main Characteristics of OLAP Use multidimensional data analysis techniques. Provide advanced database support. Provide easy-to-use end user interfaces. Support client/server architecture. Slide

On-Line Analytical Processing Slide

On-Line Analytical Processing Relational OLAP Relational On-Line Analytical Processing (ROLAP) provides OLAP functionality by using relational databases and familiar relational query tools. Extensions to RDBMS Multidimensional data schema support within the RDBMS Data access language and query performance optimized for multidimensional data Support for very large databases Slide

On-Line Analytical Processing Multidimensional OLAP (MOLAP) MOLAP extends OLAP functionality to multidimensional database management systems (MDBMS). MDBMS use special proprietary techniques to store data in matrix-like n-dimensional arrays. MDBMS end users visualize the stored data as a multidimensional cube known as a data cube. Data cubes are created by extracting data from the operational databases or from the Data Warehouse. Data cubes are static and require front-end design work. MOLAP systems are generally faster than their ROLAP counterparts. They are also more resource-intensive. Slide
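
The data cube idea can be sketched with a plain nested list standing in for an n-dimensional array. This is only an illustration with made-up dimension members and sales figures; actual MDBMS products use their own proprietary storage, which is not shown here.

```python
from itertools import product as cartesian

# Hypothetical dimension members for a 3-dimensional cube.
products = ["Widget", "Gadget"]
regions = ["East", "West"]
quarters = ["Q1", "Q2"]

# cube[p][r][q] holds one sales figure for (product, region, quarter).
cube = [[[0.0 for _ in quarters] for _ in regions] for _ in products]

# Load the cube from extracted facts (the cube is built ahead of query time).
facts = [
    ("Widget", "East", "Q1", 100.0),
    ("Widget", "West", "Q1", 60.0),
    ("Gadget", "East", "Q2", 200.0),
    ("Widget", "East", "Q2", 40.0),
]
for prod, region, quarter, sales in facts:
    cube[products.index(prod)][regions.index(region)][quarters.index(quarter)] += sales

def slice_by_quarter(quarter):
    """Fix the time dimension and return a product-by-region matrix."""
    q = quarters.index(quarter)
    return [[cube[p][r][q] for r in range(len(regions))] for p in range(len(products))]

def total_by_product(prod):
    """Roll up over region and quarter for one product."""
    p = products.index(prod)
    cells = cartesian(range(len(regions)), range(len(quarters)))
    return sum(cube[p][r][q] for r, q in cells)
```

Because every cell is addressed directly by its dimension indexes, a slice or roll-up needs no joins, which hints at why precomputed cubes can answer such queries quickly at the cost of extra storage.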

Star Schema The star schema is a data modeling technique used to map multidimensional decision support data into a relational database. Star schemas yield an easily implemented model for multidimensional data analysis while still preserving the relational structure of the operational database. Four Components: Facts Dimensions Attributes Attribute hierarchies Slide
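
A small star schema can be sketched with Python's built-in `sqlite3` module. The table names (`fact_sales`, `dim_product`, `dim_location`, `dim_time`), columns, and rows are hypothetical examples, not taken from the slides: one fact table holds the measures, and each dimension table holds the attributes used to group them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: descriptive attributes (city -> region is a hierarchy).
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT, region TEXT);
    CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, month TEXT, year INTEGER);

    -- Fact table: numeric measures plus one foreign key per dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        location_id INTEGER REFERENCES dim_location(location_id),
        time_id INTEGER REFERENCES dim_time(time_id),
        units INTEGER,
        revenue REAL
    );

    INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware');
    INSERT INTO dim_location VALUES (1, 'Boston', 'East'), (2, 'Denver', 'West');
    INSERT INTO dim_time VALUES (1, '2024-01', 2024), (2, '2024-02', 2024);
    INSERT INTO fact_sales VALUES
        (1, 1, 1, 10, 100.0),
        (1, 2, 1, 5, 50.0),
        (2, 1, 2, 2, 80.0),
        (1, 1, 2, 3, 30.0);
""")

# Aggregate the revenue fact along two dimensions at a rolled-up level.
rows = conn.execute("""
    SELECT l.region, t.year, SUM(f.revenue)
    FROM fact_sales AS f
    JOIN dim_location AS l ON l.location_id = f.location_id
    JOIN dim_time AS t ON t.time_id = f.time_id
    GROUP BY l.region, t.year
    ORDER BY l.region
""").fetchall()
```

Grouping by `region` and `year` instead of `city` and `month` moves up the attribute hierarchies without changing the fact table.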

Star Schema Slide

Multidimensional Cube Slide

Hierarchies in Multidimensional Analysis Slide

Data Mining In contrast to the traditional (reactive) DSS tools, the data mining premise is proactive. Data mining tools automatically search the data for anomalies and possible relationships, thereby identifying problems that have not yet been identified by the end user. Data mining tools -- based on algorithms that form the building blocks for artificial intelligence, neural networks, inductive rules, and predicate logic -- initiate analyses to create knowledge. Slide

Data Mining Four Phases of Data Mining 1. Data Preparation Identify and cleanse data sets. The Data Warehouse is usually used for data mining operations. 2. Data Analysis and Classification Identify common data characteristics or patterns using: data groupings, classifications, clusters, or sequences; data dependencies, links, or relationships; data patterns, trends, and deviations. Slide

Data Mining 3. Knowledge Acquisition Select the appropriate modeling or knowledge acquisition algorithms. Examples: neural networks, decision trees, rules induction, genetic algorithms, classification and regression trees, memory-based reasoning, nearest neighbor, and data visualization. 4. Prognosis Predict future behavior and forecast business outcomes using the data mining findings. Slide
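
To make one of the listed algorithms concrete, here is a minimal k-nearest-neighbor sketch that covers the knowledge acquisition and prognosis phases in miniature. The customer features, segment labels, and the `predict` function are hypothetical, and the features are left unscaled for brevity, which a real analysis would likely need to address.

```python
import math

# Hypothetical labelled customers: (monthly_spend, visits_per_month) -> segment.
training = [
    ((20.0, 1.0), "occasional"),
    ((25.0, 2.0), "occasional"),
    ((200.0, 8.0), "frequent"),
    ((180.0, 10.0), "frequent"),
]

def predict(point, k=3):
    """Classify a point by majority vote among its k nearest training examples."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], point))
    votes = {}
    for _, label in by_distance[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)

# Prognosis: predict the segment of customers not seen before.
new_big_spender = predict((190.0, 9.0))
new_light_user = predict((22.0, 1.0))
```

The training set plays the role of the prepared data, and `predict` is the forecasting step applied to new cases.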