Agile data warehouse design

2 Agile data warehouse design
Dao Vo Confidential

3 Overview of data warehousing
Agenda
Overview of data warehousing
Designing and implementing a data warehouse
Waterfall BI/WH development
Agile BI/WH development framework
Q&A
Audience?

4 Overview of data warehousing
What is a data warehouse?

5 OVERVIEW OF DATA WAREHOUSING
The business problem
What is a data warehouse?
BI/WH architectures

6 Module 1: Introduction to Data Warehousing
Course 10777A Module 1: Introduction to Data Warehousing
The Business Problem
Key business data is distributed across multiple systems
Notes: Emphasize that the drivers for a data warehousing solution are typically business-related rather than technical. Ask students about the various applications and data stores in their organizations. How easy or difficult is it for students to collate data from them to create an overall, business-wide view of key measures that drive strategic business decision making?

7 Module 1: Introduction to Data Warehousing
The Business Problem
Finding the information required for business decision making is time-consuming and error-prone

8 Module 1: Introduction to Data Warehousing
The Business Problem
Fundamental business questions are hard to answer

9 What Is a Data Warehouse?
Notes: Explain that this course uses a fairly generic definition for a data warehouse. There are many very specific definitions in use throughout the data warehousing industry, and students may be aware of some of the more common schools of thought with regard to database design, including those of Bill Inmon and Ralph Kimball. This course does not advocate one approach over another, although the lab solutions and data warehouse schema design that are discussed in this course are more in line with a Kimball-based approach than any other.

10 What Is a Data Warehouse?
A centralized store of business data for reporting and analysis
Typically, a data warehouse:
  Contains large volumes of historical data
  Is optimized for querying data (as opposed to inserting or updating)
  Is incrementally loaded with new business data at regular intervals
  Provides the basis for enterprise business intelligence solutions
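The "incrementally loaded" point can be illustrated with a minimal sketch. It assumes a watermark approach, where each load copies only the source rows newer than the latest row already in the warehouse. The table names `source_orders` and `warehouse_orders`, the `modified_at` column, and the sample rows are hypothetical and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_orders (order_no INTEGER, modified_at TEXT, revenue REAL);
CREATE TABLE warehouse_orders (order_no INTEGER, modified_at TEXT, revenue REAL);
""")

def incremental_load(conn):
    """Copy only source rows newer than the newest row already loaded."""
    # The watermark is the latest timestamp in the warehouse ('' when empty).
    # ISO-8601 date strings compare correctly as plain text.
    (watermark,) = conn.execute(
        "SELECT COALESCE(MAX(modified_at), '') FROM warehouse_orders"
    ).fetchone()
    # Caveat of this simple sketch: a late row that shares the watermark
    # timestamp would be skipped.
    cur = conn.execute(
        "INSERT INTO warehouse_orders "
        "SELECT order_no, modified_at, revenue FROM source_orders "
        "WHERE modified_at > ?",
        (watermark,),
    )
    return cur.rowcount

# First interval: two new source rows arrive and are loaded.
conn.executemany("INSERT INTO source_orders VALUES (?, ?, ?)", [
    (1001, "2024-01-01", 500.0),
    (1002, "2024-01-02", 120.0),
])
first_load = incremental_load(conn)

# Second interval: only the one new row is loaded.
conn.execute("INSERT INTO source_orders VALUES (1003, '2024-01-03', 250.0)")
second_load = incremental_load(conn)

# Nothing changed in the source, so nothing is loaded.
third_load = incremental_load(conn)
```

Real warehouses usually handle updates and deletions as well, which this sketch deliberately leaves out.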

11 Designing and implementing a data warehouse
How do you design a data warehouse and BI solution?

12 DESIGN AND IMPLEMENT WH
Introduction to Dimensional Modeling
Star Schemas
Considerations for Dimension Tables
Considerations for Fact Tables
Snowflake Schemas

13 Warehouse Modeling

14 Introduction to Dimensional Modeling
Course 10777A Module 3: Designing and Implementing a Data Warehouse
Business questions focus on measures that are aggregated by business dimensions
  Measures are facts about the business
  Dimensions are ways in which the measures can be aggregated
Measures: Quantity, Revenue, Cost, Profit
Dimensions: Time, Product Line, Region, Customer, Sales person, Product
Notes: At this stage, the focus is on identifying the measures and dimensions, not on defining specific fact and dimension tables. Point out that the measures and dimensions by which the tables will be aggregated are generally identified through discussions with the stakeholders who will use the data warehouse.
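A minimal sketch of what "measures aggregated by dimensions" means in practice. The sample rows and the helper name `aggregate` are hypothetical, chosen only to illustrate the idea; the column names follow the measures and dimensions listed on the slide.

```python
from collections import defaultdict

# Hypothetical sample data: each row is one sale, described by
# dimensions (region, product) and measures (quantity, revenue, cost).
sales = [
    {"region": "North", "product": "Bike", "quantity": 2, "revenue": 500.0, "cost": 300.0},
    {"region": "North", "product": "Helmet", "quantity": 5, "revenue": 150.0, "cost": 75.0},
    {"region": "South", "product": "Bike", "quantity": 1, "revenue": 250.0, "cost": 150.0},
]

def aggregate(rows, dimension, measure):
    """Sum one measure, grouped by the values of one dimension."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[dimension]] += row[measure]
    return dict(totals)

# "What is the revenue by region?" and "How many units per product?"
revenue_by_region = aggregate(sales, "region", "revenue")
quantity_by_product = aggregate(sales, "product", "quantity")
```

The same measure can be sliced by any dimension without changing the data, which is the property that dimensional modeling is designed to preserve.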

15 Module 3: Designing and Implementing a Data Warehouse
Star Schemas
Group related dimensions into dimension tables
Group related measures into fact tables
Relate fact tables to dimension tables by using foreign keys
FactOrders: CustomerKey, SalesPersonKey, ProductKey, ShippingAgentKey, TimeKey, OrderNo, LineItemNo, Quantity, Revenue, Cost, Profit
DimSalesPerson: SalesPersonKey, SalesPersonName, StoreName, StoreCity, StoreRegion
DimCustomer: CustomerKey, CustomerName, City, Region
DimDate: DateKey, Year, Quarter, Month, Day
DimProduct: ProductKey, ProductName, ProductLine, SupplierName
DimShippingAgent: ShippingAgentKey, ShippingAgentName
Notes: Encourage students to suggest other examples of business events that include measures, and the dimensions by which they may be aggregated.
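As a rough illustration of the star schema above, the following sketch builds a cut-down version in an in-memory SQLite database and answers one business question with a single join per dimension. Only two of the slide's dimension tables are included, and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimCustomer (
    CustomerKey INTEGER PRIMARY KEY,
    CustomerName TEXT, City TEXT, Region TEXT
);
CREATE TABLE DimProduct (
    ProductKey INTEGER PRIMARY KEY,
    ProductName TEXT, ProductLine TEXT, SupplierName TEXT
);
-- The fact table holds the measures plus one foreign key per dimension.
CREATE TABLE FactOrders (
    CustomerKey INTEGER REFERENCES DimCustomer(CustomerKey),
    ProductKey  INTEGER REFERENCES DimProduct(ProductKey),
    OrderNo INTEGER, LineItemNo INTEGER,
    Quantity INTEGER, Revenue REAL, Cost REAL, Profit REAL
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO DimCustomer VALUES (?, ?, ?, ?)", [
    (1, "Alice", "Leeds", "North"),
    (2, "Bob", "Bristol", "South"),
])
conn.executemany("INSERT INTO DimProduct VALUES (?, ?, ?, ?)", [
    (10, "Road Bike", "Bikes", "Acme"),
    (11, "Helmet", "Accessories", "Acme"),
])
conn.executemany("INSERT INTO FactOrders VALUES (?, ?, ?, ?, ?, ?, ?, ?)", [
    (1, 10, 1001, 1, 1, 500.0, 300.0, 200.0),
    (1, 11, 1001, 2, 2, 60.0, 30.0, 30.0),
    (2, 10, 1002, 1, 1, 500.0, 300.0, 200.0),
])

# "Revenue by customer region": one join from the fact table to the dimension.
star_revenue_by_region = conn.execute("""
    SELECT c.Region, SUM(f.Revenue)
    FROM FactOrders AS f
    JOIN DimCustomer AS c ON c.CustomerKey = f.CustomerKey
    GROUP BY c.Region
    ORDER BY c.Region
""").fetchall()
```

Because each dimension is a single denormalized table, every question of this kind needs only one join per dimension involved.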

16 Module 3: Designing and Implementing a Data Warehouse
Snowflake Schemas
FactOrders: CustomerKey, SalesPersonKey, ProductKey, ShippingAgentKey, TimeKey, OrderNo, LineItemNo, Quantity, Revenue, Cost, Profit
DimSalesPerson: SalesPersonKey, SalesPersonName, StoreKey
DimStore: StoreKey, StoreName, GeographyKey
DimGeography: GeographyKey, City, Region
DimCustomer: CustomerKey, CustomerName, GeographyKey
DimDate: DateKey, Year, Quarter, Month, Day
DimShippingAgent: ShippingAgentKey, ShippingAgentName
DimProduct: ProductKey, ProductName, ProductLineKey, SupplierKey
DimProductLine: ProductLineKey, ProductLineName
DimSupplier: SupplierKey, SupplierName
Notes: Alert students to the danger that is inherent in refactoring a star schema to a snowflake schema. Most database professionals are experienced in designing OLTP databases, and can have a tendency toward normalization. When you design a data warehouse, you need to keep this tendency in check and ensure that you only create normalized snowflake dimensions when they can be justified by one of the considerations described in this topic.
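To show the cost of snowflaking, here is a minimal sketch of the sales person branch of the schema above in an in-memory SQLite database. The sample rows are hypothetical, and FactOrders is trimmed to the columns the query needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimGeography (
    GeographyKey INTEGER PRIMARY KEY, City TEXT, Region TEXT
);
CREATE TABLE DimStore (
    StoreKey INTEGER PRIMARY KEY, StoreName TEXT,
    GeographyKey INTEGER REFERENCES DimGeography(GeographyKey)
);
CREATE TABLE DimSalesPerson (
    SalesPersonKey INTEGER PRIMARY KEY, SalesPersonName TEXT,
    StoreKey INTEGER REFERENCES DimStore(StoreKey)
);
CREATE TABLE FactOrders (
    SalesPersonKey INTEGER REFERENCES DimSalesPerson(SalesPersonKey),
    OrderNo INTEGER, LineItemNo INTEGER, Quantity INTEGER, Revenue REAL
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO DimGeography VALUES (?, ?, ?)", [
    (1, "Leeds", "North"),
    (2, "Bristol", "South"),
])
conn.executemany("INSERT INTO DimStore VALUES (?, ?, ?)", [
    (1, "Leeds Central", 1),
    (2, "Bristol Docks", 2),
])
conn.executemany("INSERT INTO DimSalesPerson VALUES (?, ?, ?)", [
    (1, "Chi", 1),
    (2, "Dan", 1),
    (3, "Eve", 2),
])
conn.executemany("INSERT INTO FactOrders VALUES (?, ?, ?, ?, ?)", [
    (1, 1001, 1, 1, 500.0),
    (2, 1002, 1, 2, 120.0),
    (3, 1003, 1, 1, 250.0),
])

# "Revenue by store region" now walks three joins instead of one:
# fact -> sales person -> store -> geography.
snowflake_revenue_by_region = conn.execute("""
    SELECT g.Region, SUM(f.Revenue)
    FROM FactOrders AS f
    JOIN DimSalesPerson AS sp ON sp.SalesPersonKey = f.SalesPersonKey
    JOIN DimStore AS s ON s.StoreKey = sp.StoreKey
    JOIN DimGeography AS g ON g.GeographyKey = s.GeographyKey
    GROUP BY g.Region
    ORDER BY g.Region
""").fetchall()
```

In the star version, StoreRegion sits directly on DimSalesPerson, so the same question needs a single join. The extra joins are the price of normalization that the notes warn about.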

17 Warehouse Modeling

18 Waterfall BI/WH development
Traditional SDLC to develop a BI/WH product

19 Waterfall BI/WH development
SDLC Overview

20 Waterfall BI/WH development

21 SDLC OVERVIEW

22 Agile BI/WH development framework
Incremental development framework for BI/WH product

23 Agile BI/WH development framework
Agile BI/WH life cycle
Agile DW design overview
Agile ETL solution

24 Agile BI/WH Life Cycle

25 Agile BI/WH Life Cycle

26 Agile DW design overview
How do you design to answer business questions?

27 Agile DW design overview
How do we ask questions?
The 7Ws framework
Design using natural language
Straightforward methodology
Model storming
BEAM methodology

28 How do we ask questions?
Events/Transactions:
  An immutable "fact" that occurs at a time and place
  Immutable: does not change
Interrogatives: Who, What, When, Where, Why
  Descriptive context that fully describes the event
  A set of "dimensions" that describe events

29 The 7Ws framework
Who, What, When, Where, Why, How, How Many

30 The 7Ws framework
How (facts): Much, Many, Often, £ $ €
Who: Customer, Employee, Seller, Organization
What: Product, Service, Transactions, Booking, Event
Why: Causal, Promotion, Reason, Weather, Competition
Where: Location, Geographic, Store, Ship to, Hospital
When: Time, Day, Month, Year
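One way to read the 7Ws is as a classification of the details that describe a business event: the "how many" details become facts and the other Ws become dimensions. The sketch below illustrates that reading. The class names `EventDetail` and `BusinessEvent` and the sample order event are hypothetical and not part of the presentation.

```python
from dataclasses import dataclass, field

SEVEN_WS = ("who", "what", "when", "where", "why", "how", "how many")

@dataclass
class EventDetail:
    name: str
    w_type: str  # must be one of SEVEN_WS

    def __post_init__(self):
        if self.w_type not in SEVEN_WS:
            raise ValueError(f"unknown W type: {self.w_type!r}")

@dataclass
class BusinessEvent:
    name: str
    details: list = field(default_factory=list)

    def dimensions(self):
        """Details that describe the event (every W except 'how many')."""
        return [d.name for d in self.details if d.w_type != "how many"]

    def facts(self):
        """Quantities measured by the event (the 'how many' details)."""
        return [d.name for d in self.details if d.w_type == "how many"]

# Hypothetical event: "Customer orders Product".
order = BusinessEvent("Customer orders Product", [
    EventDetail("Customer", "who"),
    EventDetail("Product", "what"),
    EventDetail("Order Date", "when"),
    EventDetail("Store", "where"),
    EventDetail("Promotion", "why"),
    EventDetail("Quantity", "how many"),
    EventDetail("Revenue", "how many"),
])
order_dimensions = order.dimensions()
order_facts = order.facts()
```

Here `order_dimensions` lists the candidate dimension tables and `order_facts` lists the candidate measures for a fact table.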

31 Design using natural language
Verbs – Events – Relationships – Fact Tables
Nouns – Details – Entities – Dimensions
Main Clause – Subject-Verb-Object
Prepositions – connect additional details to the main clause
Interrogatives – The 7Ws – Dimension Types

32 Straightforward methodology
[Diagram: nine numbered modeling steps. Labels: Subject-Verb-Object; Declare Event Type; Initial Data Examples; Who; What; When; Where; How (many): Quantities - Facts; Why; How; Sufficient Detail; Fact Granularity]

33 Design using natural language
Verbs – Events – Relationships – Fact Tables
Nouns – Details – Entities – Dimensions
Main Clause – Subject-Verb-Object
Prepositions – connect additional details to the main clause
Interrogatives – The 7Ws – Dimension Types

34 Business Event Analysis and Modeling (BEAM✲)
An agile approach to dimensional modeling

35 Model Storming
Quick
Inclusive (completeness)
Interactive
Fun
Data Modeler and BI Stakeholders

36 BEAM✲ Methodology
Structured, non-technical, collaborative working conversation directly with BI users
BI users' business process, organizational, hierarchical, and data knowledge
Focused data profiling
Logical and physical dimensional data models
Example data
Detailed and testable ETL specification
DW prototype
Data Modeler and BI Stakeholders

37 Q&A

38 Thank you.

