ISQS 3358, Business Intelligence
Data Warehousing
Zhangxi Lin, Texas Tech University

Outline
So far, students have learned:
◦ Basic concepts of business intelligence
◦ The definition and importance of a data warehouse
In this lecture, the following topics will be covered:
◦ SQL Server 2008 data mart case study
  ▪ How to access data in a network directory
  ▪ How to access SQL Server 2008 on the Citrix server
  ▪ How to load data from an Excel file into a database
◦ Data warehouse overview
◦ Data warehouse architecture

Data Warehousing Definitions and Concepts
Data warehouse: a physical repository where relational data are specially organized to provide enterprise-wide, cleansed data in a standardized format
◦ Video: Overview of data warehouse (2’38”)
◦ Video: Benefits of data warehouse (3’18”)

Data Mart Definition
Data mart: a localized data warehouse that stores only the data relevant to a department or even an individual
◦ Dependent data mart: a subset that is created directly from a data warehouse
◦ Independent data mart: a small data warehouse designed for a strategic business unit or a department
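A minimal sketch of a dependent data mart, assuming a toy warehouse table held in an in-memory SQLite database. The table names (`warehouse_sales`, `dept_mart`), the columns, and the sample rows are hypothetical and are not part of the lecture; the point is only that the mart is derived directly from the warehouse rather than loaded from source systems.

```python
import sqlite3

# Hypothetical warehouse table with a few illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE warehouse_sales (department TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [("marketing", "TX", 120.0), ("finance", "TX", 75.0), ("marketing", "CA", 40.0)],
)


def build_dependent_mart(conn, department):
    """Rebuild dept_mart as the subset of the warehouse relevant to one department."""
    conn.execute("DROP TABLE IF EXISTS dept_mart")
    conn.execute("CREATE TABLE dept_mart (region TEXT, amount REAL)")
    conn.execute(
        "INSERT INTO dept_mart "
        "SELECT region, amount FROM warehouse_sales WHERE department = ?",
        (department,),
    )
    return conn.execute("SELECT region, amount FROM dept_mart ORDER BY region").fetchall()


mart_rows = build_dependent_mart(conn, "marketing")
print(mart_rows)
```

An independent data mart would instead be loaded from its own sources, without going through `warehouse_sales`.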

Data Mart - The IMW Case
IMW, short for Internet Media Works!, is an application service provider (ASP) for real estate information services. It is headquartered in Austin, Texas. Its CEO is Gary Anderson. Web page:

Why Do We Need a Data Mart?
A data mart complements centralized data warehousing based on the Unified Dimensional Model (UDM) in situations where the UDM cannot be used:
◦ Legacy databases
◦ Data come from non-database sources
◦ No physical connection to the centralized data warehouse
◦ Data are not clean

Data Mart Structures
Fact tables
◦ Measures
Dimension tables
◦ Dimensions and hierarchies
◦ Attributes (or columns)
Dimensional modeling: stars and snowflakes

Measures
A measure is a numeric quantity expressing some aspect of the organization’s performance. The information represented by this quantity is used to support or evaluate the decision making and performance of the organization.
◦ A measure is also called a fact
◦ The table holding measure information is called a fact table
◦ Video: Dimensions vs. Measures (2’38”)
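To make the distinction concrete, here is a small sketch in which a measure is summed and grouped by a dimension attribute. The tables `fact_listing` and `dim_city`, the `listing_fee` measure, and the sample rows are assumptions made for illustration, not the lecture's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension table: descriptive attributes used to slice the data.
CREATE TABLE dim_city (city_id INTEGER PRIMARY KEY, city_name TEXT);
-- Fact table: one row per listing, holding the numeric measure listing_fee.
CREATE TABLE fact_listing (
    listing_id  INTEGER PRIMARY KEY,
    city_id     INTEGER REFERENCES dim_city(city_id),
    listing_fee REAL
);
INSERT INTO dim_city VALUES (1, 'Austin'), (2, 'Lubbock');
INSERT INTO fact_listing VALUES (1, 1, 50.0), (2, 1, 30.0), (3, 2, 20.0);
""")


def fee_by_city(conn):
    """Aggregate the measure (listing_fee) along the city dimension."""
    return conn.execute("""
        SELECT d.city_name, SUM(f.listing_fee), COUNT(*)
        FROM fact_listing AS f
        JOIN dim_city AS d ON d.city_id = f.city_id
        GROUP BY d.city_name
        ORDER BY d.city_name
    """).fetchall()


totals = fee_by_city(conn)
print(totals)
```

The dimension supplies the grouping label; the fact table supplies the numbers being added up.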

Commrex Real Estate Operational Database
Users: property listors, webmaster, marketing manager of IMW
Objective: Encourage realtors to use the online ASP services with the best information services to increase IMW’s revenue.
Value chain
◦ Listors create their account
◦ Listors post their real estate properties to the web-based database services and pay listing fees
◦ Property buyers search the website-based database and buy properties from listors. This is the incentive for listors to use the ASP services
Business processes
◦ Listor sign-up
◦ Listor account management
◦ Property data posting
◦ Property search
◦ Property database maintenance

IMW’s Database ERD Model
[Figure: entity-relationship diagram covering the Property Listing Database and the Membership Database. Visible fields include Property ID, Listor ID, Address, Property Type, City, Company ID, Chapter, Functions, Specializations, Comp Name, Telephone #, Listor Name, UpdateDate, Feature, Type Name, Subtype 1 to Subtype n, TransactionID, PropID, and UserID. Relationships are marked M:1 and M:M; the legend distinguishes primary keys, secondary keys, and links to a table.]

Commrex Data Warehousing
Users: CEO of IMW, IMW business analyst, IMW marketing manager
Analytic themes
◦ Fast retrieval of business key performance indicators (KPIs)
◦ Decision making on business promotions
Applications
◦ Geographic distribution of property listings
◦ Scorecard for main performance indicators
◦ Dashboard
Questions
◦ How should the data warehouse be modeled?
◦ What is required in data transformation and preprocessing?
◦ Are any dimensions missing for data warehousing?
◦ How should routine data warehouse updates be performed (frequency, timing, etc.)?

IMW’s Data Warehouse Dimensional Model
[Figure: dimensional model with a Property Listing Fact table linked to a Membership Dimension, a Company Dimension, a Property Type Dimension, and a set of date fields (Year, Quarter, Month, Date). Visible fields include Property ID, Listor ID, Address, PropType, City, Company ID, Chapter, Functions, Specializations, Comp Name, Telephone #, Listor Name, UpdateDate, Features, and SubName. The legend distinguishes primary keys, secondary keys, and links to a table.]
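The dimensional model can be approximated as a star schema. The sketch below is only one plausible reading of the diagram: the assignment of columns to tables, the key choices, the `dim_date` table, and all sample rows are assumptions. It uses SQLite so it can run anywhere, whereas the course itself uses SQL Server 2008.

```python
import sqlite3

# Assumed star schema: one fact table referencing four dimension tables.
SCHEMA = """
CREATE TABLE dim_company (company_id INTEGER PRIMARY KEY, comp_name TEXT);
CREATE TABLE dim_membership (listor_id INTEGER PRIMARY KEY, listor_name TEXT);
CREATE TABLE dim_property_type (prop_type TEXT PRIMARY KEY, sub_name TEXT);
CREATE TABLE dim_date (date_key TEXT PRIMARY KEY, year INTEGER, quarter INTEGER, month INTEGER);
CREATE TABLE fact_property_listing (
    property_id INTEGER PRIMARY KEY,
    listor_id   INTEGER REFERENCES dim_membership(listor_id),
    company_id  INTEGER REFERENCES dim_company(company_id),
    prop_type   TEXT REFERENCES dim_property_type(prop_type),
    update_date TEXT REFERENCES dim_date(date_key),
    city        TEXT
);
"""

# Made-up rows, purely for illustration.
SAMPLE_DATA = """
INSERT INTO dim_company VALUES (1, 'Example Realty');
INSERT INTO dim_membership VALUES (10, 'Example Listor');
INSERT INTO dim_property_type VALUES ('Office', 'Low-rise'), ('Retail', 'Strip center');
INSERT INTO dim_date VALUES
    ('2015-11-03', 2015, 4, 11),
    ('2016-01-15', 2016, 1, 1),
    ('2016-02-01', 2016, 1, 2);
INSERT INTO fact_property_listing VALUES
    (100, 10, 1, 'Office', '2015-11-03', 'Austin'),
    (101, 10, 1, 'Office', '2016-01-15', 'Austin'),
    (102, 10, 1, 'Retail', '2016-02-01', 'Lubbock');
"""


def listings_per_quarter(conn):
    """Roll listing counts up the date hierarchy (Date -> Quarter -> Year)."""
    return conn.execute("""
        SELECT d.year, d.quarter, COUNT(*)
        FROM fact_property_listing AS f
        JOIN dim_date AS d ON d.date_key = f.update_date
        GROUP BY d.year, d.quarter
        ORDER BY d.year, d.quarter
    """).fetchall()


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA + SAMPLE_DATA)
per_quarter = listings_per_quarter(conn)
print(per_quarter)
```

In this sketch the fact table carries no numeric column, so the count of listings serves as the measure; a listing fee column could be added in the same way.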

Data Warehouse Overview

Data Warehousing Characteristics
Basic characteristics of data warehousing
◦ Subject oriented
◦ Integrated
◦ Time variant (time series)
◦ Nonvolatile (data are not changed once loaded)
Others
◦ Web based
◦ Relational/multidimensional
◦ Client/server
◦ Real-time
◦ Includes metadata
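The time-variant and nonvolatile properties can be illustrated with an append-only store: each load adds a dated row, and earlier rows are never updated or deleted. The `SnapshotStore` class and its sample values are hypothetical, a toy stand-in rather than how a real warehouse engine works.

```python
from datetime import date


class SnapshotStore:
    """Append-only store: rows are added with a load date and never changed."""

    def __init__(self):
        self._rows = []  # list of (load_date, key, value)

    def load(self, load_date, key, value):
        # Nonvolatile: new facts are appended; existing rows are left untouched.
        self._rows.append((load_date, key, value))

    def as_of(self, key, when):
        """Time variant: return the latest value loaded on or before `when`."""
        candidates = [(d, v) for d, k, v in self._rows if k == key and d <= when]
        if not candidates:
            return None
        return max(candidates, key=lambda item: item[0])[1]

    def history(self, key):
        """Return the full time series for a key, oldest first."""
        return sorted((d, v) for d, k, v in self._rows if k == key)


store = SnapshotStore()
store.load(date(2016, 1, 1), "listing_count", 100)
store.load(date(2016, 2, 1), "listing_count", 130)

january_view = store.as_of("listing_count", date(2016, 1, 15))
latest_view = store.as_of("listing_count", date(2016, 3, 1))
print(january_view, latest_view)
```

Because nothing is overwritten, the January figure remains available after the February load, which is what makes trend analysis possible.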

Data Warehousing Process Overview
Data in the DW are constantly accumulated.
◦ Organizations continuously collect data, information, and knowledge at an increasingly accelerated rate and store them in computerized systems
The number of users is constantly increasing.
◦ The number of users needing to access the information continues to increase as a result of improved reliability and availability of network access, especially the Internet
Organizations that use a data warehouse rely on it more and more.

Data Warehousing: More Concepts
Operational data store (ODS): a type of database often used as an interim area for a data warehouse, especially for customer information files
Enterprise data warehouse (EDW): a large-scale data warehouse used across the enterprise for decision support. It integrates different sources of information into a consolidated information system.
Metadata (Video, 1’41”): data about data. In a data warehouse, metadata describe the contents of the data warehouse and the manner of its use
◦ Syntactic metadata, structural metadata, and semantic metadata

Data Warehousing Process Overview

Data Warehousing Process Overview
The major components of a data warehousing process
◦ Data sources
◦ Data extraction
◦ Data loading
◦ Comprehensive database
◦ Metadata
◦ Middleware tools

Data Warehouse Architectures

Three Parts of the Data Warehouse
◦ The data warehouse that contains the data and associated software
◦ Data acquisition (back-end) software that extracts data from legacy systems and external sources, consolidates and summarizes them, and loads them into the data warehouse
◦ Client (front-end) software that allows users to access and analyze data from the warehouse
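The back-end role described above (extract, consolidate and summarize, load) can be sketched in a few lines. Everything concrete here is an assumption for illustration: the two sources (`LEGACY_ROWS`, `EXTERNAL_CSV`), the cleaning rules, and the `fee_summary` target table. A production pipeline on SQL Server 2008 would typically be built with SSIS rather than hand-written Python.

```python
import csv
import io
import sqlite3

# Two hypothetical sources: a "legacy system" and an "external" CSV feed.
LEGACY_ROWS = [{"city": "austin ", "fee": "50"}, {"city": "Lubbock", "fee": "20"}]
EXTERNAL_CSV = "city,fee\nAustin,30\n,15\n"


def extract():
    """Pull raw rows from both sources into one list of dicts."""
    external = list(csv.DictReader(io.StringIO(EXTERNAL_CSV)))
    return LEGACY_ROWS + external


def transform(rows):
    """Consolidate (standardize city names, drop unusable rows) and summarize."""
    totals = {}
    for row in rows:
        city = row["city"].strip().title()
        if not city:
            continue  # a row without a city cannot be assigned to any group
        totals[city] = totals.get(city, 0.0) + float(row["fee"])
    return sorted(totals.items())


def load(conn, summary):
    """Write the summarized rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fee_summary (city TEXT PRIMARY KEY, total_fee REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO fee_summary VALUES (?, ?)", summary)


conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))

# Front-end side: a client query against the loaded warehouse table.
loaded = conn.execute("SELECT city, total_fee FROM fee_summary ORDER BY city").fetchall()
print(loaded)
```

The final query plays the part of the front-end client, reading only from the warehouse and never from the original sources.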

Three-Tier Data Warehouse

Alternative Data Warehouse Architectures (1)

Alternative Data Warehouse Architectures (2)

Alternative Data Warehouse Architectures (3)

Alternative Data Warehouse Architectures (4)

Alternative Data Warehouse Architectures (5)

Architectures Comparison

Teradata’s EDW

Structure and Components of Business Intelligence
[Figure: diagram labels include MS SQL Server 2008, SSMS, SSIS, SSAS, SSRS, BIDS, SAS EM, and SAS EG.]

Exercise 1: Walk Through the Data Warehousing Process
Learning objectives
◦ To gain a general impression of how to use SQL Server 2008 to implement a data mart
Tasks
◦ Create your database with SSMS, named ISQS lastname
◦ Import data from Commrex_2011.xls
◦ Use SSMS to create an ERD diagram
◦ Create an SSAS project using BIDS
◦ Define the data source, data source view, and cube
Deliverable
◦ One-page printout of the screenshot of the cube diagram
◦ Due Friday, Feb 5, 2016; either submit a hardcopy to the TA or to

Distributed Business Intelligence
Dealing with big data: the open and distributed approach
◦ LAMP: Linux, Apache, MySQL, PHP/Perl/Python
◦ Hadoop
◦ MapReduce
◦ HDFS
◦ NoSQL
◦ ZooKeeper
◦ Storm

Hadoop – for BI in the Cloudera
Hadoop is a free, Java-based programming framework that supports the processing of large data sets in a distributed computing environment. Hadoop makes it possible to run applications on systems with thousands of nodes involving thousands of terabytes.
Hadoop was inspired by Google's MapReduce, a software framework in which an application is broken down into numerous small parts. Doug Cutting, Hadoop's creator, named the framework after his child's stuffed toy elephant.

MapReduce
MapReduce is a framework for processing parallelizable problems across huge datasets using a large number of computers (nodes), collectively referred to as a cluster or a grid.
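The programming model can be shown with the classic word-count example. This is a single-process sketch in plain Python, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) and the sample documents are made up for illustration. On a real cluster the map and reduce calls would run on different nodes, with the framework handling the shuffle between them.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run map over every document, shuffle, then reduce each key independently."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())


counts = map_reduce(["data warehouse data mart", "data mart"])
print(counts)
```

Each `map_phase` call depends only on its own document and each `reduce_phase` call only on its own key, which is what makes the problem parallelizable across nodes.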

Cloudera’s Hadoop System

Comparison between big data platform and traditional BI platform