3.Planning & Project management

Presentation transcript:

3.Planning & Project management 2/23/2012 3.Planning & Project management/D.S.Jagli

3. Planning & Project management: topics to be covered
- How is it different?
- Life-cycle approach
- The development phases
- Dimensional analysis
- Dimensional modeling
  - Star schema
  - Snowflake schema

3. Planning & Project management
Reasons for DWH project failure:
- Improper planning
- Inadequate project management
Planning for a data warehouse is necessary. Key issues that need to be planned:
- Value and expectations
- Risk assessment
- Top-down or bottom-up
- Build or buy
- Single vendor or best of breed
- Business requirements, not technology
- Top management support
- Justification

3. Planning & Project management
Example for a DWH project: outline for the overall plan
- Introduction
- Mission statement
- Scope
- Goals & objectives
- Key issues & options
- Value & expectations
- Justification
- Executive sponsorship
- Implementation strategy
- Tentative schedule
- Project authorization

3.1 How is it different?
A DWH project is different from an OLTP system project.
Distinguishing features of a DWH and challenges for project management:
- Data acquisition
- Data storage
- Information delivery

3.2 The life-cycle approach
Fig: DW functional components and SDLC

DWH project plan: sample outline

3.3 DWH Development Phases
- Project plan
- Requirements definition
- Design
- Construction
- Deployment
- Growth and maintenance
Interleaved within the design and construction phases are the three tracks (data acquisition, data storage, and information delivery), along with the definition of the architecture and the establishment of the infrastructure.

3.4 Dimensional Analysis
A data warehouse is an information delivery system. It is not about technology, but about solving users' problems and providing strategic information to the user. In the phase of defining requirements, you need to concentrate on what information the users need, not on how you are going to provide the required information.

Dimensional Analysis
Usage of information unpredictable: In providing information about the requirements for an operational system, the users are able to give you precise details of the required functions, information content, and usage patterns. For a data warehouse, by contrast, users generally cannot state their requirements this precisely, because how the information will be used is unpredictable.
Dimensional nature of business data: Even though the users cannot fully describe what they want in a data warehouse, they can provide you with very important insights into how they think about the business.

Managers think in business dimensions: example

Dimensional Nature of Business Data

Examples of Business Dimensions

INFORMATION PACKAGES: A NEW CONCEPT
Information packages are a novel idea for determining and recording information requirements for a data warehouse. The concept helps us give a concrete form to the various insights, nebulous thoughts, and opinions expressed during the process of collecting requirements. The information packages, put together while collecting requirements, are very useful for taking the development of the data warehouse to the next phases.

Requirements Not Fully Determinate
Information packages enable us to:
- Define the common subject areas
- Design key business metrics
- Decide how data must be presented
- Determine how users will aggregate or roll up
- Decide the data quantity for user analysis or query
- Decide how data will be accessed
- Establish data granularity
- Estimate data warehouse size
- Determine the frequency for data refreshing
- Ascertain how information must be packaged

An information package.

Business Dimensions
- Business dimensions form the underlying basis of the new methodology for requirements definition.
- Data must be stored to provide for the business dimensions.
- The business dimensions and their hierarchical levels form the basis for all further phases.

Dimension Hierarchies/Categories
Examples:
- Product: model name, model year, package styling, product line, product category, exterior color, interior color, first model year
- Dealer: dealer name, city, state, single brand flag, date of first operation
- Customer demographics: age, gender, income range, marital status, household size, vehicles owned, home value, own or rent
- Payment method: finance type, term in months, interest rate, agent
- Time: date, month, quarter, year, day of week, day of month, season, holiday flag

Key Business Metrics or Facts
The numbers the users analyze are the measurements or metrics that measure the success of their departments. These are the facts that indicate to the users how their departments are doing in fulfilling their departmental objectives.

Example: automobile sales
The set of meaningful and useful metrics for analyzing automobile sales is as follows:
- Actual sale price
- MSRP sale price
- Options price
- Full price
- Dealer add-ons
- Dealer credits
- Dealer invoice
- Amount of down payment
- Manufacturer proceeds
- Amount financed
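
The dimensions and metrics listed for automobile sales can be recorded together as one information package. The following is only a sketch of one possible representation: the class name `InformationPackage`, its fields, and the ordering of the hierarchy levels are assumptions made for illustration, and only a subset of the attributes from the slides is shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class InformationPackage:
    """One subject area: its business dimensions and the facts measured (hypothetical structure)."""
    subject: str
    # dimension name -> attributes / hierarchy levels recorded for it
    dimensions: Dict[str, List[str]] = field(default_factory=dict)
    # key business metrics (facts) analyzed along those dimensions
    facts: List[str] = field(default_factory=list)

    def summary(self) -> str:
        return f"{self.subject}: {len(self.dimensions)} dimensions, {len(self.facts)} facts"


automaker_sales = InformationPackage(
    subject="Automaker sales",
    dimensions={
        # Level ordering (coarse to fine) is an assumption, not taken from the slides.
        "Time": ["Year", "Quarter", "Month", "Date"],
        "Product": ["Product category", "Product line", "Model name", "Model year"],
        "Dealer": ["State", "City", "Dealer name"],
        "Customer demographics": ["Age", "Gender", "Income range", "Marital status"],
        "Payment method": ["Finance type", "Term in months", "Interest rate", "Agent"],
    },
    facts=[
        "Actual sale price", "MSRP sale price", "Options price", "Full price",
        "Dealer add-ons", "Dealer credits", "Dealer invoice",
        "Amount of down payment", "Manufacturer proceeds", "Amount financed",
    ],
)

print(automaker_sales.summary())
```

In the design phase, each dimension in such a structure would typically become a dimension table and the facts would become columns of the fact table.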

3.5 DIMENSIONAL MODELING
- Star schema
- Snowflake schema

FROM REQUIREMENTS TO DATA DESIGN
- The requirements definition completely drives the data design for the data warehouse.
- A group of data elements forms a data structure.
- Logical data design includes determining the various data elements that are needed and combining the data elements into structures of data.
- Logical data design also includes establishing the relationships among the data structures.

FROM REQUIREMENTS TO DATA DESIGN
- The information package diagrams form the basis for the logical data design for the data warehouse.
- The data design process results in a dimensional data model.

From requirements to data design.

Dimensional Modeling Basics: Formation of the automaker sales fact table.

Formation of the automaker dimension tables.

Concept of Keys for Dimension Tables
Surrogate keys
- A surrogate key is the primary key for a dimension table and is independent of any keys provided by source data systems.
- Surrogate keys are created and maintained in the data warehouse and should not encode any information about the contents of records. Automatically increasing integers make good surrogate keys.
- The original key for each record is carried in the dimension table but is not used as the primary key.
- Surrogate keys provide the means to maintain data warehouse information when dimensions change.
Business keys (natural keys)
- A business key has a meaning. It can be generated from source system data or used as-is from a source system field.
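
The role of surrogate keys when a dimension changes can be sketched with an in-memory SQLite database. This is an illustration under stated assumptions, not the presenter's method: the table `dealer_dim`, the business key column `dealer_code`, the helper `add_dealer_version`, and the sample values are all hypothetical, and keeping the old row alongside the new one is just one way of handling a changed dimension record.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dealer_dim (
        dealer_key   INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key, no built-in meaning
        dealer_code  TEXT NOT NULL,                      -- business (natural) key from the source system
        dealer_name  TEXT NOT NULL,
        city         TEXT NOT NULL
    )
""")


def add_dealer_version(code, name, city):
    """Insert a dimension row and return the surrogate key the warehouse assigned."""
    cur = conn.execute(
        "INSERT INTO dealer_dim (dealer_code, dealer_name, city) VALUES (?, ?, ?)",
        (code, name, city),
    )
    return cur.lastrowid


# The same dealer (same business key) relocates; the warehouse keeps both versions,
# each under its own surrogate key.
first_key = add_dealer_version("D-100", "Hilltop Motors", "Pune")
second_key = add_dealer_version("D-100", "Hilltop Motors", "Mumbai")

versions = conn.execute(
    "SELECT dealer_key, city FROM dealer_dim WHERE dealer_code = ? ORDER BY dealer_key",
    ("D-100",),
).fetchall()
print(versions)
```

Because the business key is carried as an ordinary column, it can repeat across rows, while fact rows recorded before and after the change can point at different surrogate keys.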

The criteria for combining the tables into a dimensional model:
- The model should provide the best data access.
- The whole model must be query-centric.
- It must be optimized for queries and analyses.
- The model must show that the dimension tables interact with the fact table.
- It should also be structured in such a way that every dimension can interact equally with the fact table.
- The model should allow drilling down or rolling up along dimension hierarchies.

The dimensional model: a STAR schema
With these requirements, we find that a dimensional model with the fact table in the middle and the dimension tables arranged around the fact table satisfies the conditions.

Case study: STAR schema for automaker sales.
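
A STAR schema of this kind can be sketched as table definitions, with the fact table in the middle referencing each dimension table. The table and column names below are assumptions chosen for illustration (the original figure is not available in this transcript), and only three dimensions and three of the metrics are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: one row per product, dealer, or calendar date.
    CREATE TABLE product_dim (
        product_key   INTEGER PRIMARY KEY,
        model_name    TEXT,
        product_line  TEXT
    );
    CREATE TABLE dealer_dim (
        dealer_key   INTEGER PRIMARY KEY,
        dealer_name  TEXT,
        city         TEXT,
        state        TEXT
    );
    CREATE TABLE time_dim (
        time_key   INTEGER PRIMARY KEY,
        full_date  TEXT,
        month      INTEGER,
        quarter    INTEGER,
        year       INTEGER
    );
    -- Fact table in the middle: foreign keys to every dimension plus the metrics.
    CREATE TABLE auto_sales_fact (
        product_key        INTEGER NOT NULL REFERENCES product_dim(product_key),
        dealer_key         INTEGER NOT NULL REFERENCES dealer_dim(dealer_key),
        time_key           INTEGER NOT NULL REFERENCES time_dim(time_key),
        actual_sale_price  REAL,
        options_price      REAL,
        amount_financed    REAL,
        PRIMARY KEY (product_key, dealer_key, time_key)
    );
""")

# Column 2 of each PRAGMA foreign_key_list row is the referenced table name.
referenced = sorted(
    row[2] for row in conn.execute("PRAGMA foreign_key_list(auto_sales_fact)")
)
print(referenced)
```

Listing the foreign keys of the fact table shows the star shape directly: every dimension table is reached from the fact table in a single step.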

E-R Modeling Versus Dimensional Modeling
Data warehouse (dimensional modeling):
- The DW is meant to answer questions about the overall process
- The DW focus is on how managers view the business
- The DW focus is on business trends
- Information is centered around a business process
- Answers show how the business measures the process
- The measures are studied in many ways along several business dimensions
OLTP systems (E-R modeling):
- OLTP systems capture details of events or transactions
- OLTP systems focus on individual events
- An OLTP system is a window into micro-level transactions
- A picture at the detail level is necessary to run the business
- Suitable only for questions at the transaction level
- Data consistency, non-redundancy, and efficient data storage are critical

E-R Modeling Versus Dimensional Modeling
- Dimensional modeling for the data warehouse
- E-R modeling for OLTP systems

THE STAR SCHEMA

Star Schemas
- A data modeling technique that maps multidimensional decision support data into a relational database.
- Current relational modeling techniques do not serve the needs of advanced data requirements.

Star Schema: 4 components
- Facts
- Dimensions
- Attributes
- Attribute hierarchies

Facts
- Numeric measurements (values) that represent a specific business aspect or activity.
- Stored in a fact table at the center of the star schema.
- The fact table contains facts that are linked through their dimensions.
- Updated periodically with data from operational databases.

Dimensions
- Qualifying characteristics that provide additional perspectives to a given fact.
- DSS data is almost always viewed in relation to other data.
- Dimensions are normally stored in dimension tables.

Attributes
- Dimension tables contain attributes.
- Attributes are used to search, filter, or classify facts.
- Dimensions provide descriptive characteristics about the facts through their attributes.
- Designers must define common business attributes that will be used to narrow a search, group information, or describe dimensions (e.g., time, location, product).
- There is no mathematical limit to the number of dimensions (3-D makes it easy to model).

Attribute Hierarchies
- Provide a top-down data organization
- Support aggregation and drill-down / roll-up data analysis
- Attributes from different dimensions can be grouped to form a hierarchy
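
Roll-up and drill-down along an attribute hierarchy amount to aggregating the same facts at a coarser or finer level. The sketch below uses a time hierarchy (year, quarter, month) in an in-memory SQLite database; the tables, the helper `total_by`, and the sample figures are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE time_dim (
        time_key INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER, month INTEGER
    );
    CREATE TABLE sales_fact (
        time_key INTEGER REFERENCES time_dim(time_key), sale_amount REAL
    );
    INSERT INTO time_dim VALUES (1, 2011, 4, 11), (2, 2011, 4, 12),
                                (3, 2012, 1, 1),  (4, 2012, 1, 2);
    INSERT INTO sales_fact VALUES (1, 100.0), (2, 150.0), (3, 200.0),
                                  (3, 50.0),  (4, 300.0);
""")


def total_by(*levels):
    """Aggregate sales at the given hierarchy levels.

    The level names are interpolated into the SQL text, so they must be
    trusted column names of time_dim, never user input.
    """
    cols = ", ".join(levels)
    sql = (
        f"SELECT {cols}, SUM(f.sale_amount) "
        "FROM sales_fact f JOIN time_dim t ON t.time_key = f.time_key "
        f"GROUP BY {cols} ORDER BY {cols}"
    )
    return conn.execute(sql).fetchall()


by_year = total_by("year")                       # rolled up to the top level
by_month = total_by("year", "quarter", "month")  # drilled down to the lowest level
print(by_year)
print(by_month)
```

Moving from `by_year` to `by_month` is a drill-down; the reverse direction is a roll-up. The fact rows themselves never change, only the grouping level does.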

Concept of Keys for Star Schema
Surrogate keys
- Surrogate keys are simply system-generated sequence numbers and are independent of any keys provided by source data systems. They do not have any built-in meaning.
- Surrogate keys are created and maintained in the data warehouse and should not encode any information about the contents of records. Automatically increasing integers make good surrogate keys.
- The original key for each record is carried in the dimension table but is not used as the primary key.
Business keys
- The natural keys carried over from the source systems.
Primary keys
- Each row in a dimension table is identified by a unique value of an attribute designated as the primary key of the dimension.
Foreign keys
- Each dimension table is in a one-to-many relationship with the central fact table, so the primary key of each dimension table must be a foreign key in the fact table.
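
The primary key / foreign key relationship between a dimension table and the fact table can be checked with a small experiment. This is a sketch with hypothetical table names and values; note that SQLite only enforces foreign keys when `PRAGMA foreign_keys` is switched on for the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE product_dim (
        product_key INTEGER PRIMARY KEY,   -- primary key of the dimension
        model_name  TEXT
    );
    CREATE TABLE sales_fact (
        product_key INTEGER NOT NULL REFERENCES product_dim(product_key),  -- foreign key
        sale_amount REAL
    );
    INSERT INTO product_dim VALUES (1, 'Sedan LX');
""")

# A fact row whose key exists in the dimension is accepted.
conn.execute("INSERT INTO sales_fact VALUES (1, 21500.0)")

# A fact row pointing at a non-existent dimension row is rejected.
try:
    conn.execute("INSERT INTO sales_fact VALUES (99, 18000.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

fact_rows = conn.execute("SELECT COUNT(*) FROM sales_fact").fetchone()[0]
print(orphan_rejected, fact_rows)
```

One dimension row may be referenced by many fact rows, but every fact row must reference exactly one existing dimension row, which is the one-to-many relationship described above.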

Star Schema for Sales (figure: dimension tables and fact table)

Star Schema Representation
- Facts and dimensions are represented by physical tables in the data warehouse database.
- The fact table is related to each dimension table in a many-to-one relationship (primary key / foreign key relationships).
- The fact table is related to many dimension tables.
- The primary key of the fact table is a composite key made up of the primary keys of the dimension tables.
- Each fact table is designed to answer a specific DSS question.

Star Schema
- The fact table is always the largest table in the star schema.
- Each dimension record is related to thousands of fact records.
- The star schema facilitates data retrieval functions.
- The DBMS first searches the dimension tables before the larger fact table.
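
A typical star schema query constrains the small dimension tables and then aggregates the matching rows of the fact table. The example below is a sketch with invented tables and data; how a particular DBMS actually orders the table accesses is up to its query optimizer and is not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE product_dim (
        product_key INTEGER PRIMARY KEY, model_name TEXT, product_line TEXT
    );
    CREATE TABLE dealer_dim (
        dealer_key INTEGER PRIMARY KEY, dealer_name TEXT, state TEXT
    );
    CREATE TABLE sales_fact (
        product_key       INTEGER REFERENCES product_dim(product_key),
        dealer_key        INTEGER REFERENCES dealer_dim(dealer_key),
        actual_sale_price REAL
    );
    INSERT INTO product_dim VALUES (1, 'Sedan LX', 'Sedan'), (2, 'Hauler 2T', 'Truck');
    INSERT INTO dealer_dim VALUES (1, 'Hilltop Motors', 'MH'), (2, 'Coastal Autos', 'GA');
    INSERT INTO sales_fact VALUES (1, 1, 20000.0), (2, 1, 35000.0),
                                  (2, 2, 36000.0), (2, 1, 34000.0);
""")

# Filter on a dimension attribute (dealer state), group by another (product line),
# and aggregate the numeric facts.
rows = conn.execute(
    """
    SELECT p.product_line, COUNT(*), SUM(f.actual_sale_price)
    FROM sales_fact f
    JOIN product_dim p ON p.product_key = f.product_key
    JOIN dealer_dim  d ON d.dealer_key  = f.dealer_key
    WHERE d.state = ?
    GROUP BY p.product_line
    ORDER BY p.product_line
    """,
    ("MH",),
).fetchall()
print(rows)
```

Every dimension is one join away from the fact table, which is what keeps such queries easy to write and to read.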

Star Schema: advantages
- Easy to understand
- Optimizes navigation
- Most suitable for query processing

THE SNOWFLAKE SCHEMA

"Snowflaking" is a method of normalizing the dimension tables in a STAR schema.

Sales: a simple STAR schema.

Product dimension: partially normalized

When to Snowflake
The principle behind snowflaking is normalization of the dimension tables by removing low-cardinality attributes and forming separate tables. In a similar manner, some situations provide opportunities to separate out a set of attributes and form a subdimension.
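
Snowflaking a dimension can be sketched by moving a low-cardinality attribute out of the dimension table into its own table. In this illustration the product category is separated from a product dimension; the table names, the `category_key` column, and the sample rows are assumptions, not taken from the original figures.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- STAR form: the category name is repeated in every product row.
    CREATE TABLE product_dim_star (
        product_key INTEGER PRIMARY KEY, product_name TEXT, category_name TEXT
    );
    INSERT INTO product_dim_star VALUES
        (1, 'Sedan LX',  'Passenger car'),
        (2, 'Sedan GT',  'Passenger car'),
        (3, 'Hauler 2T', 'Commercial vehicle');

    -- Snowflake form: the low-cardinality attribute gets its own table.
    CREATE TABLE category_dim (
        category_key  INTEGER PRIMARY KEY AUTOINCREMENT,
        category_name TEXT UNIQUE
    );
    CREATE TABLE product_dim (
        product_key  INTEGER PRIMARY KEY,
        product_name TEXT,
        category_key INTEGER REFERENCES category_dim(category_key)
    );

    INSERT INTO category_dim (category_name)
        SELECT DISTINCT category_name FROM product_dim_star ORDER BY category_name;
    INSERT INTO product_dim
        SELECT s.product_key, s.product_name, c.category_key
        FROM product_dim_star s
        JOIN category_dim c ON c.category_name = s.category_name;
""")

category_count = conn.execute("SELECT COUNT(*) FROM category_dim").fetchone()[0]

# Reading the category back now needs one additional join.
rebuilt = conn.execute(
    """
    SELECT p.product_key, p.product_name, c.category_name
    FROM product_dim p
    JOIN category_dim c ON c.category_key = p.category_key
    ORDER BY p.product_key
    """
).fetchall()
original = conn.execute(
    "SELECT * FROM product_dim_star ORDER BY product_key"
).fetchall()
print(category_count, rebuilt == original)
```

The normalized form stores each category name once, but any query that needs the category has to join through `product_dim` to `category_dim`.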

Advantages and Disadvantages
Advantages:
- Small savings in storage space
- Normalized structures are easier to update and maintain
Disadvantages:
- The schema is less intuitive, and end-users are put off by the complexity
- Browsing through the contents becomes difficult
- Query performance is degraded because of additional joins

Thank you