MGS4020_05.ppt/Jun 23, 2016/Page 1 Georgia State University - Confidential MGS 4020 Business Intelligence Ch 2 – The Data Warehouse Jun 23, 2016.

Agenda
- Designing & Building the Data Warehouse
- Data Warehouse
- Relational Database

The Data Warehouse
- is physically separated from all other operational systems
- holds aggregated data and transactional data for management, separate from the data used for online transaction processing

Data Flow
[Diagram labels: Operational Data Store, Data Warehouse, Data Mart, Metadata, Legacy Systems, Personal Data Warehouse]

Characteristics of a Data Warehouse
- Subject Orientation
- Data Integrated
- Consistent Naming and Measurement Attributes
- Time Variant
- Nonvolatility

Metadata

What is Metadata?
- Data about data
- Without metadata, the data is meaningless
- Provides consistency of the truth

Components of Metadata
- Transformation Mapping
- Extraction and Relationship History
- Algorithms for Summarization (and calculations)
- Data Ownership
- Patterns of Warehouse Access
- Business-friendly naming conventions
- Status Information

7 Deadly Sins of Data Warehousing
1. “If you build it, they will come”
2. Omission of an architectural framework
3. Underestimating the importance of documenting assumptions
4. Failure to use the right tool for the job
5. Life cycle abuse
6. Ignorance concerning the resolution of data conflicts
7. Failure to learn from mistakes

Data Warehouse Vendors
- Business Objects
- Cognos
- Hyperion
- IBM
- Microsoft
- NCR / Teradata
- Oracle
- SAS

Agenda
- Data Warehouse
- Designing & Building the Data Warehouse
- Relational Database

Relational Database

A relational database is a collection of data items organized as a set of formally described tables from which data can be accessed or reassembled in many different ways without having to reorganize the database tables. The relational database was invented by E. F. Codd at IBM in 1970.

The standard user and application program interface to a relational database is the Structured Query Language (SQL). SQL statements are used both for interactive queries for information from a relational database and for gathering data for reports.

A relational database is a set of tables containing data fitted into predefined categories. Each table (which is sometimes called a relation) contains one or more data categories in columns. Each row contains a unique instance of data for the categories defined by the columns. For example, a typical business order entry database would include a table that described a customer with columns for name, address, phone number, and so forth. Another table would describe an order: product, customer, date, sales price, and so forth. A user of the database could obtain a view of the database that fitted the user's needs. For example, a branch office manager might like a view or report on all customers that had bought products after a certain date. A financial services manager in the same company could, from the same tables, obtain a report on accounts that needed to be paid.
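The order entry example can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table layouts, the view name recent_buyers, the sample rows, and the cutoff date are all assumptions made up here, not taken from the slides.

```python
import sqlite3

# In-memory database; every name and row below is hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    cust_no INTEGER PRIMARY KEY,
    name    TEXT,
    phone   TEXT
);
CREATE TABLE orders (
    order_no    INTEGER PRIMARY KEY,
    cust_no     INTEGER,
    product     TEXT,
    order_date  TEXT,   -- ISO 8601 dates compare correctly as text
    sales_price REAL
);
INSERT INTO customer VALUES (1, 'Ann', '555-0101'), (2, 'Bob', '555-0102');
INSERT INTO orders VALUES
    (10, 1, 'Widget', '2016-05-01', 20.0),
    (11, 2, 'Gadget', '2016-06-15', 35.0);

-- One user's view of the shared tables: customers who bought after a date.
CREATE VIEW recent_buyers AS
SELECT DISTINCT c.name
FROM customer AS c
JOIN orders AS o ON o.cust_no = c.cust_no
WHERE o.order_date > '2016-06-01';
""")

recent_buyers = [row[0] for row in conn.execute("SELECT name FROM recent_buyers")]
print(recent_buyers)
```

The underlying tables are not reorganized; a different user could define a different view over the same two tables.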

Relational Database

When creating a relational database, you can define the domain of possible values in a data column and further constraints that may apply to that data value. For example, a domain of possible customers could allow up to ten possible customer names but be constrained in one table to allowing only three of these customer names to be specifiable.

The definition of a relational database results in a table of metadata, or formal descriptions of the tables, columns, domains, and constraints. Meta is a prefix that in most information technology usages means "an underlying definition or description." Thus, metadata is a definition or description of data, and metalanguage is a definition or description of language.

A database is a collection of data that is organized so that its contents can easily be accessed, managed, and updated. The most prevalent type of database is the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways. A distributed database is one that can be dispersed or replicated among different points in a network. An object-oriented programming database is one that is congruent with the data defined in object classes and subclasses.

SQL (Structured Query Language) is a standard interactive and programming language for getting information from and updating a database. Although SQL is both an ANSI and an ISO standard, many database products support SQL with proprietary extensions to the standard language. Queries take the form of a command language that lets you select, insert, update, find out the location of data, and so forth.
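A minimal sketch of a column constraint and of the metadata the database keeps about its own tables, using Python's sqlite3 module. The account table and the three permitted customer names are hypothetical. PRAGMA table_info and the sqlite_master catalog are SQLite-specific; other products expose their metadata differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Restrict the customer column to three permitted names (hypothetical values).
conn.execute("""
    CREATE TABLE account (
        account_no INTEGER PRIMARY KEY,
        customer   TEXT NOT NULL
                   CHECK (customer IN ('Acme', 'Globex', 'Initech'))
    )
""")
conn.execute("INSERT INTO account VALUES (1, 'Acme')")

# A value outside the permitted set violates the CHECK constraint.
try:
    conn.execute("INSERT INTO account VALUES (2, 'Umbrella')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# Metadata: column names and declared types, plus the stored table definition.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(account)")]
schema_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE name = 'account'"
).fetchone()[0]

print(rejected, columns)
```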

Business Intelligence Environment
[Diagram labels: Internal Source Systems, External Data Sources, Extract, Transformation and Load, Data Warehouse, Data Mart]

Relational Database
- IBM DB2, DB2/400
- Microsoft SQL Server
- Teradata
- Oracle
- Sybase
- Informix / Red Brick
- Microsoft Access
- MySQL

SQL – Structured Query Language
1. DDL – Data Definition Language: Create, Drop, Alter
2. DML – Data Manipulation Language: Insert, Update, Delete, Select
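The two statement families can be tried side by side with Python's sqlite3 module. This is a sketch with made-up data: the addr_book layout loosely follows the deck's later examples, and the e-mail address is a placeholder.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define and change structure.
conn.execute("CREATE TABLE addr_book (name TEXT, company TEXT)")
conn.execute("ALTER TABLE addr_book ADD COLUMN e_mail TEXT")

# DML: add, change, read, and remove rows.
conn.execute("INSERT INTO addr_book VALUES ('John', 'Microsoft', NULL)")
conn.execute("UPDATE addr_book SET e_mail = 'john@example.com' WHERE name = 'John'")
rows_after_update = conn.execute(
    "SELECT name, company, e_mail FROM addr_book"
).fetchall()
conn.execute("DELETE FROM addr_book WHERE name = 'John'")
rows_after_delete = conn.execute("SELECT * FROM addr_book").fetchall()

# DDL again: remove the structure itself.
conn.execute("DROP TABLE addr_book")
remaining_tables = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

print(rows_after_update, rows_after_delete, remaining_tables)
```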

Relational Database
[Diagram labels: RDBMS, Software Application, SQL Request, Result Set]

Business Intelligence Environment
[Diagram labels: Internal Source Systems, External Data Sources, Extract, Transformation and Load, Data Warehouse, Data Mart, Extract, Transformation and Load]

SQL Select Statement

    SELECT   column1, column2, ...
    FROM     table1, table2, ...
    WHERE    criteria1 AND/OR criteria2
    ORDER BY column1, column2, ...

SQL Select Statement

    SELECT   column1, column2, ...
    FROM     table1, table2, ...
    WHERE    criteria1 AND/OR criteria2
    GROUP BY column1, column2, ...
    HAVING   criteria1 AND/OR criteria2
    ORDER BY column1, column2, ...

GROUP BY and HAVING are the aggregation clauses. In a complete statement they come after WHERE and before ORDER BY.
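A small worked example of the aggregation clauses, run through Python's sqlite3 module. The sales table and its rows are invented for illustration. WHERE filters individual rows before grouping, while HAVING filters the groups after SUM has been computed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, product TEXT, amount REAL);
INSERT INTO sales VALUES
    ('East',  'Widget', 100.0),
    ('East',  'Gadget', 250.0),
    ('West',  'Widget',  40.0),
    ('West',  'Gadget',  30.0),
    ('North', 'Widget', 500.0);
""")

totals = conn.execute("""
    SELECT   region, SUM(amount) AS total
    FROM     sales
    WHERE    amount > 35          -- row filter, applied before grouping
    GROUP BY region
    HAVING   SUM(amount) >= 300   -- group filter, applied after aggregation
    ORDER BY total DESC
""").fetchall()

print(totals)  # [('North', 500.0), ('East', 350.0)]
```

West is dropped by HAVING: after the WHERE filter removes its 30.0 row, its remaining total is only 40.0.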

SQL – Example 1

SQL
    CREATE TABLE ADDR_BOOK (
        NAME    char(30),
        COMPANY char(20),
        E_MAIL  char(25)
    )

Output
    Name    Company
    John
    Jeff

SQL – Example 2

2a) SQL
    SELECT NAME, COMPANY, E_MAIL
    FROM   ADDR_BOOK
    WHERE  COMPANY = 'Microsoft'

Output
    Name    Company
    John

2b) Table - Product
    ID   Name          Category
    I    Internet      A
    B    Browsers      A
    A    Application   Null
    G    Graphics      Null

SQL
    SELECT ID, NAME
    FROM   PRODUCT
    WHERE  CATEGORY = NULL

Note: in standard SQL a comparison with = NULL is never true, so this query returns no rows. To find the rows with a missing category, write WHERE CATEGORY IS NULL.
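The behavior of NULL in Example 2b can be checked directly. The sketch below loads the Product table from the slide into an in-memory SQLite database (the TEXT column types are an assumption) and runs the comparison both ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (id TEXT, name TEXT, category TEXT);
INSERT INTO product VALUES
    ('I', 'Internet',    'A'),
    ('B', 'Browsers',    'A'),
    ('A', 'Application', NULL),
    ('G', 'Graphics',    NULL);
""")

# "= NULL" evaluates to unknown for every row, so nothing is selected.
equals_null = conn.execute(
    "SELECT id, name FROM product WHERE category = NULL"
).fetchall()

# "IS NULL" is the test that matches missing values.
is_null = conn.execute(
    "SELECT id, name FROM product WHERE category IS NULL ORDER BY id"
).fetchall()

print(equals_null)  # []
print(is_null)      # [('A', 'Application'), ('G', 'Graphics')]
```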

SQL – Example 3

SQL
    SELECT ADDR_BOOK.NAME, COMPANY.
    FROM   ADDR_BOOK, COMPANY
    WHERE  ADDR_BOOK.EMPLOYEE_ID = COMPANY.EMPLOYEE_ID

Output
    Name
    John
    Jeff

SQL – Example 4

SQL
    CREATE TABLE CUSTOMER (
        CUST_NO    INTEGER,
        FIRST_NAME CHAR(30),
        LAST_NAME  CHAR(30),
        ADDRESS    CHAR(50),
        CITY       CHAR(30),
        STATE      CHAR(2),
        ZIP_CODE   CHAR(9),
        COUNTRY    CHAR(20)
    )

    CREATE TABLE ORDER (
        ORDER_NO     INTEGER,
        DATE_ENTERED DATE,
        CUST_NO      INTEGER
    )

SQL
    SELECT ORDER.ORDER_NO, CUSTOMER.FIRST_NAME, CUSTOMER.LAST_NAME,
           CUSTOMER.ADDRESS, CUSTOMER.CITY, CUSTOMER.ZIP_CODE, CUSTOMER.COUNTRY
    FROM   ORDER, CUSTOMER
    WHERE  ORDER.CUST_NO = CUSTOMER.CUST_NO
    AND    ORDER.DATE_ENTERED = ' '

Note: ORDER is a reserved word in SQL, so most products require the table name to be quoted or renamed (for example, ORDERS).
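A runnable variant of this join, using Python's sqlite3 module, might look as follows. Several things here are assumptions rather than the slide's content: the table is named orders because ORDER is reserved, the column types are SQLite's TEXT and INTEGER, the rows are invented, and the date 2016-06-23 stands in for the value left blank on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    cust_no INTEGER, first_name TEXT, last_name TEXT,
    address TEXT, city TEXT, state TEXT, zip_code TEXT, country TEXT
);
CREATE TABLE orders (order_no INTEGER, date_entered TEXT, cust_no INTEGER);

-- Hypothetical sample rows.
INSERT INTO customer VALUES
    (1, 'Ann', 'Lee', '1 Main St', 'Atlanta', 'GA', '30303', 'USA'),
    (2, 'Bob', 'Ray', '2 Oak Ave', 'Macon',   'GA', '31201', 'USA');
INSERT INTO orders VALUES
    (100, '2016-06-23', 1),
    (101, '2016-06-24', 2);
""")

# Match each order to its customer, then keep one day's orders.
orders_on_date = conn.execute("""
    SELECT o.order_no, c.first_name, c.last_name, c.city
    FROM   orders AS o
    JOIN   customer AS c ON o.cust_no = c.cust_no
    WHERE  o.date_entered = ?
""", ("2016-06-23",)).fetchall()

print(orders_on_date)  # [(100, 'Ann', 'Lee', 'Atlanta')]
```

The explicit JOIN ... ON form is equivalent to listing both tables in FROM and putting the key comparison in WHERE, as the slide does.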

SQL – Example 5

SQL
    CREATE TABLE ADDR_BOOK (
        NAME    char(30),
        COMPANY char(20),
        E_MAIL  char(25)
    )

Output
    Name    Company
    John
    Jeff

SQL – Example 6 – Referential Integrity

SQL
    CREATE TABLE CUSTOMER (
        CUST_NO    INTEGER PRIMARY KEY,
        FIRST_NAME CHAR(30),
        LAST_NAME  CHAR(30),
        ADDRESS    CHAR(50),
        CITY       CHAR(30),
        ZIP_CODE   CHAR(9),
        COUNTRY    CHAR(20)
    )

    CREATE TABLE ORDER (
        ORDER_NO     INTEGER PRIMARY KEY,
        DATE_ENTERED DATE,
        CUST_NO      INTEGER REFERENCES CUSTOMER (CUST_NO)
    )

SQL
    CREATE TABLE ORDER_ITEMS (
        ORDER_NO   INTEGER,
        ITEM_NO    INTEGER,
        PRODUCT    CHAR(30),
        QUANTITY   INTEGER,
        UNIT_PRICE MONEY
    )

    ALTER TABLE ORDER_ITEMS
        ADD CONSTRAINT PK_ORDER_ITEMS PRIMARY KEY (ORDER_NO, ITEM_NO)

    ALTER TABLE ORDER_ITEMS
        ADD CONSTRAINT FK_ORDER_ITEMS_1 FOREIGN KEY (ORDER_NO) REFERENCES ORDER (ORDER_NO)

Note: the exact ALTER TABLE syntax for adding keys, and the MONEY data type, vary between database products.
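What referential integrity enforces can be seen in a short sqlite3 sketch. The tables are trimmed-down, hypothetical versions of the ones above (orders instead of ORDER, fewer columns), and try_sql is a helper defined only for this illustration. SQLite checks foreign keys only after PRAGMA foreign_keys = ON; other products enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves FK checks off by default
conn.executescript("""
CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, last_name TEXT);
CREATE TABLE orders (
    order_no INTEGER PRIMARY KEY,
    cust_no  INTEGER REFERENCES customer (cust_no)
);
INSERT INTO customer VALUES (1, 'Lee');
INSERT INTO orders VALUES (100, 1);
""")

def try_sql(sql):
    """Run one statement and report whether the database accepted it."""
    try:
        conn.execute(sql)
        return "ok"
    except sqlite3.IntegrityError:
        return "rejected"

# An order for a customer that does not exist breaks the reference.
orphan_insert = try_sql("INSERT INTO orders VALUES (101, 99)")
# Deleting a customer that still has orders would leave an orphan.
parent_delete = try_sql("DELETE FROM customer WHERE cust_no = 1")
# An order for an existing customer is fine.
valid_insert = try_sql("INSERT INTO orders VALUES (102, 1)")

print(orphan_insert, parent_delete, valid_insert)
```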

SQL – Example 7 – Index

When you have a primary key, you already have an implicitly (or explicitly) defined unique index on the primary key columns. It's generally a good idea to define non-unique indexes on the foreign keys.

SQL
    CREATE UNIQUE INDEX PK_CUSTOMER ON CUSTOMER (CUST_NO)
    CREATE UNIQUE INDEX PK_ORDER ON ORDER (ORDER_NO)
    CREATE INDEX FK_ORDER_1 ON ORDER (CUST_NO)
    CREATE UNIQUE INDEX PK_ORDER_ITEMS ON ORDER_ITEMS (ORDER_NO, ITEM_NO)
    CREATE INDEX FK_ORDER_ITEMS_1 ON ORDER_ITEMS (ORDER_NO)
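The implicit unique index can be observed in SQLite, as a sketch under the assumption of a simplified order_items table. PRAGMA index_list is SQLite-specific; it lists each index on a table together with a flag saying whether the index is unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE order_items (
    order_no INTEGER,
    item_no  INTEGER,
    product  TEXT,
    PRIMARY KEY (order_no, item_no)   -- creates a unique index implicitly
);
-- Explicit non-unique index on the foreign key column.
CREATE INDEX fk_order_items_1 ON order_items (order_no);
""")

# Map index name -> is_unique, using columns 1 (name) and 2 (unique flag).
indexes = {
    row[1]: bool(row[2])
    for row in conn.execute("PRAGMA index_list(order_items)")
}
print(indexes)
```

The result contains two entries: the explicitly created fk_order_items_1, which is not unique, and an automatically named unique index that backs the composite primary key.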

Agenda
- Data Warehouse
- Designing & Building the Data Warehouse
- Relational Database

Why Business Intelligence
1. Improve consistency and accuracy of reporting
2. Reduce stress on operational systems for reporting and analysis
3. Faster access to information
4. BI tools provide increased analytical capabilities
5. Empowering the Business User
6. Companies are realizing that data is a company’s most underutilized asset

ERM vs. DM
- ERM – Entity Relationship Model
  - Remove redundancy
  - Efficiency of transactions
- DM – Dimensional Model
  - Intuitive view of the data
  - Efficiency of access and analysis

[Figure: Dimensional Model]

[Figure: Retail Sales Dimensional Model (Partial)]

Fact Table
1. Contains Foreign Keys that relate to Dimension Tables
2. Has a many-to-one relationship to Dimension Tables
3. Contains Metrics to be aggregated
4. Typically does not contain any non-foreign key or non-metric data elements
5. Level of Granularity defines depth and flexibility of analysis

Dimension Table
1. Contains a Primary Key that relates to the Fact Table(s)
2. Has a one-to-many relationship to the Fact Table(s)
3. Contains Descriptive data used to limit and aggregate metrics from the Fact Table(s)
4. Can sometimes contain pre-aggregated data
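The roles of fact and dimension tables can be illustrated with a very small star schema in sqlite3. Everything here is hypothetical: the table names fact_sales, dim_product, and dim_date, their columns, and the sample rows. The fact table holds only foreign keys and metrics, and the dimensions supply the descriptive columns used to limit (WHERE) and aggregate (GROUP BY) those metrics.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one row per product / per day, with descriptive data.
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date    (date_key TEXT PRIMARY KEY, quarter TEXT);

-- Fact table: foreign keys to the dimensions plus metrics to aggregate.
CREATE TABLE fact_sales (
    product_key  INTEGER REFERENCES dim_product (product_key),
    date_key     TEXT    REFERENCES dim_date (date_key),
    quantity     INTEGER,
    sales_amount REAL
);

INSERT INTO dim_product VALUES (1, 'Cola', 'Beverage'), (2, 'Chips', 'Snack');
INSERT INTO dim_date VALUES ('2016-03-31', 'Q1'), ('2016-04-01', 'Q2');
INSERT INTO fact_sales VALUES
    (1, '2016-03-31', 2, 3.0),
    (1, '2016-04-01', 1, 1.5),
    (2, '2016-04-01', 4, 8.0);
""")

# Limit by a date attribute, aggregate by a product attribute.
q2_by_category = conn.execute("""
    SELECT   p.category, SUM(f.sales_amount) AS sales
    FROM     fact_sales AS f
    JOIN     dim_product AS p ON p.product_key = f.product_key
    JOIN     dim_date    AS d ON d.date_key    = f.date_key
    WHERE    d.quarter = 'Q2'
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

print(q2_by_category)  # [('Beverage', 1.5), ('Snack', 8.0)]
```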

Warehouse Architecture Specification
- Common Sources
- Common Dimensions
- Common Business Rules
- Common Semantics
- Common Metrics

Time Dimension
- Week – defined by an end-of-week day
- Month – January, February, March, ...
- Quarter
  - Q1: 01/01 – 03/31
  - Q2: 04/01 – 06/30
  - Q3: 07/01 – 09/30
  - Q4: 10/01 – 12/31
- Year – 2000, 2001, 2002, 2003
- Date (Primary Key) – a day, 365 per year (366 in leap years)
- Fiscal Month – 4/4/5
- Fiscal Quarter
- Fiscal Year

Time Dimension
- Weekday/Weekend
- Day of Week – Monday, Tuesday, Wednesday, ...
- Season – Winter, Spring, Summer, Fall
- Holiday – Labor Day, 4th of July, Memorial Day, ...
- Date (Primary Key) – a day, 365 per year (366 in leap years)
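Rows for a time dimension like this are usually generated rather than typed in. The Python sketch below builds one row per calendar day with a few of the attributes listed above. The function names, the dictionary layout, and the season rule (whole months, Northern Hemisphere, December to February as Winter) are assumptions chosen for illustration. Week, holiday, and fiscal attributes are left out because they depend on business rules the slides do not specify.

```python
from datetime import date, timedelta

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday",
        "Saturday", "Sunday"]
# Assumed season rule: whole calendar months, Northern Hemisphere.
SEASONS = {12: "Winter", 1: "Winter", 2: "Winter",
           3: "Spring", 4: "Spring", 5: "Spring",
           6: "Summer", 7: "Summer", 8: "Summer",
           9: "Fall", 10: "Fall", 11: "Fall"}

def time_dimension_row(d):
    """Derive the descriptive attributes for one calendar day."""
    return {
        "date": d.isoformat(),                     # primary key
        "year": d.year,
        "quarter": f"Q{(d.month - 1) // 3 + 1}",   # Q1 = Jan-Mar, ...
        "month": MONTHS[d.month - 1],
        "day_of_week": DAYS[d.weekday()],          # weekday(): Monday = 0
        "is_weekend": d.weekday() >= 5,
        "season": SEASONS[d.month],
    }

def build_time_dimension(year):
    """Return one row per day of the given calendar year."""
    start = date(year, 1, 1)
    n_days = (date(year + 1, 1, 1) - start).days   # 365, or 366 in a leap year
    return [time_dimension_row(start + timedelta(days=i)) for i in range(n_days)]

rows = build_time_dimension(2016)
print(len(rows))
print(rows[174])  # the row for 2016-06-23
```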