
1 Data Warehouse Architecture
By Slavko Stemberger
Copyright © Starsoft Inc, 2000

2 Some Acronyms/Terms
OLTP – On-Line Transaction Processing (operational system)
OLAP – On-Line Analytical Processing
ROLAP – Relational OLAP

3 Some Acronyms/Terms
MOLAP – Multidimensional OLAP
Metadata – Data about data (data dictionary)
Source System – An operational system that provides data for the data warehouse

4 Some Acronyms/Terms
ETL – Extract, Transform and Load
Data Warehouse – A queryable source of data
Data Mart – A logical subset of a data warehouse
Data Staging Area – An intermediate storage location used for ETL

5 Data Structures/Databases
Flat Files
Hierarchical DB
Network DB
Relational DB
O-O DB
Dimensional DB

6 Modeling Methods
Entity-Relationship (E-R)
Dimensional
Object Oriented (O-O)

7 Entity-Relationship Modeling
Used for operational systems
Instantaneous snapshot of the business
Removes data redundancy (eliminates update anomalies)
Shows detailed relationships
Complex network of entities can be difficult for end-users to understand

8 Dimensional Modeling
Used in data warehouses
Data duplication is allowed (in the dimensions)
Query based
Easier for users to understand
  – Not as much detail is shown as in E-R

9 Dimensional Models
The “Cube”
Star Schema
Snowflake Schema

10 The “Cube”
Some view this as limited to data marts
Logical structure of ALL data warehouses
Can be implemented physically in an RDB like Oracle

11 Star Schema
There is data redundancy (in the dimensions)
Easy to understand
Flexible in the type of questions that can be asked
Supports very large data warehouses
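
As an illustration of the star schema described on this slide, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows (dim_product, dim_store, fact_sales) are assumptions made up for the example and do not come from the presentation. Note how the category text is repeated in the product dimension: that is the redundancy the slide mentions.

```python
import sqlite3

# In-memory database; all tables and rows below are hypothetical examples.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (
    product_key INTEGER PRIMARY KEY,  -- meaningless surrogate key
    name        TEXT,
    category    TEXT                  -- repeated per product (redundant)
);
CREATE TABLE dim_store (
    store_key INTEGER PRIMARY KEY,
    city      TEXT,
    region    TEXT
);
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    store_key   INTEGER REFERENCES dim_store(store_key),
    units       INTEGER,
    dollars     REAL
);
""")

conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Widget", "Hardware"), (2, "Gadget", "Hardware"), (3, "Manual", "Books")],
)
conn.executemany(
    "INSERT INTO dim_store VALUES (?, ?, ?)",
    [(1, "Toronto", "East"), (2, "Vancouver", "West")],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(1, 1, 10, 100.0), (2, 1, 5, 75.0), (3, 2, 2, 30.0), (1, 2, 4, 40.0)],
)

# One join from the fact table to a dimension answers "sales by category".
sales_by_category = conn.execute("""
    SELECT p.category, SUM(f.dollars)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_key = f.product_key
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()
print(sales_by_category)
```

Grouping by a store attribute instead only requires swapping the joined dimension, which is one way to read the slide's point about flexibility.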

12 Snowflake Schema
Solves some problems that cannot be solved with a star schema
“Normalized” star schema
More complex than the star schema: harder to understand and work with

13 Dimension Tables
A set of independent variables that affect an observation
Each variable has a known, relatively small set of values
4 to 20 dimensions per data warehouse/data mart is the norm

14 Dimension Tables (cont…)
Columns are descriptive and usually textual
Some numeric values are descriptive
  – Numeric descriptive values should be suspected of being facts, e.g. standard product price may be a fact because it can change, and one can ask “What was the average standard price of the product over the last 12 months?”

15 Dimension Tables (cont…)
Dimension keys should be meaningless surrogate keys
Time dimension keys may be/should be assigned in the order of the dates in the fact table; this allows physical partitioning
In general avoid “smart” keys: they should be meaningless
Avoid production keys
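
The key-assignment advice on this slide can be sketched in a few lines of Python. This is only an illustration: the helper name build_date_keys, the start date, and the sample production identifiers are all hypothetical and not taken from the presentation.

```python
from datetime import date, timedelta

def build_date_keys(start, days):
    """Assign sequential integer keys to consecutive dates, in date order."""
    return {start + timedelta(days=i): i + 1 for i in range(days)}

# Time dimension keys follow date order, so a range of keys is a range of dates.
date_keys = build_date_keys(date(2000, 1, 1), 366)

# Hypothetical production keys from a source system. The warehouse assigns its
# own meaningless integers and keeps the production key only as an attribute.
production_ids = ["SKU-B-0042", "SKU-A-0007", "SKU-C-1000"]
product_keys = {pid: key for key, pid in enumerate(production_ids, start=1)}

print(date_keys[date(2000, 1, 1)], date_keys[date(2000, 12, 31)])
print(product_keys)
```

Because the surrogate keys carry no meaning, a change to the production numbering scheme would not force a change to the fact table.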

16 Dimension Tables - Granularity
Definition: The level of detail of the data
Keep the grain of the data as small as possible (as detailed as possible)
  – This makes the warehouse more resistant to change
  – It is easier to add attributes to existing dimensions
  – Superior results in data mining operations

17 Dimension Tables - “Types”
Time
Degenerate
“Junk”
Other

18 Dimension Tables - Time
All data marts and warehouses have at least one time dimension
Must be consistent across all fact tables
Create partial attributes year, month and day and their concatenations (year + month, year + month + day, year + week, …)
  – Without the concatenations, it is difficult to ask for time ranges
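
A possible way to generate time dimension rows with the partial attributes and concatenations listed on this slide is sketched below. The column names and the choice of ISO week numbering are assumptions for the example; the presentation does not specify either.

```python
from datetime import date, timedelta

def time_dimension_row(key, d):
    """Build one time dimension row with partial and concatenated attributes."""
    iso_year, iso_week, _ = d.isocalendar()  # ISO week numbering (an assumption)
    return {
        "date_key": key,
        "year": d.year,
        "month": d.month,
        "day": d.day,
        "year_month": f"{d.year:04d}-{d.month:02d}",
        "year_month_day": d.isoformat(),
        "year_week": f"{iso_year:04d}-W{iso_week:02d}",
    }

start = date(2000, 1, 1)
time_dim = [time_dimension_row(i + 1, start + timedelta(days=i)) for i in range(366)]

# With the concatenated attribute, "all of March 2000" is a single equality test.
march_keys = [row["date_key"] for row in time_dim if row["year_month"] == "2000-03"]
print(len(march_keys), march_keys[0], march_keys[-1])
```

Zero-padded concatenations sort correctly as text, so a range such as "2000-03" through "2000-06" can also be expressed as a simple comparison.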

19 Dimension Tables - Degenerate
Dimensions with only one attribute
Usually a control document id such as order number, invoice number, etc.
No value in creating a physical table
Put the id into the fact table
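
To illustrate a degenerate dimension, the sketch below stores an order number directly in a fact table, with no separate dimension table, and still groups by it. The table name fact_order_line, its columns, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_order_line (
    date_key     INTEGER,
    product_key  INTEGER,
    order_number TEXT,     -- degenerate dimension: the id lives in the fact table
    units        INTEGER,
    dollars      REAL
);
""")
conn.executemany(
    "INSERT INTO fact_order_line VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "ORD-1001", 2, 20.0),
        (1, 2, "ORD-1001", 1, 15.0),
        (2, 1, "ORD-1002", 3, 30.0),
    ],
)

# The order number still works as a grouping attribute without its own table.
order_totals = conn.execute("""
    SELECT order_number, SUM(dollars)
    FROM fact_order_line
    GROUP BY order_number
    ORDER BY order_number
""").fetchall()
print(order_totals)
```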

20 Dimension Tables - “Junk”
Given: Leftover flags and text attributes
Possible Actions:
  – Put these flags into the fact table
  – Make each one into a dimension
  – Drop them from the design
  – Create one dimension with all combinations of these flags
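
The last action on this slide, one dimension holding all combinations of the leftover flags, can be sketched with itertools.product. The flag names and values below are invented for the example.

```python
from itertools import product

# Hypothetical leftover flags and their possible values.
flags = {
    "is_gift": [False, True],
    "payment_type": ["cash", "credit"],
    "rush_order": [False, True],
}

# One junk dimension row per combination, each with a surrogate key.
junk_dim = [
    {"junk_key": key, **dict(zip(flags, combo))}
    for key, combo in enumerate(product(*flags.values()), start=1)
]

# Lookup used while loading facts: flag values -> single junk_key.
junk_lookup = {
    tuple(row[name] for name in flags): row["junk_key"] for row in junk_dim
}

print(len(junk_dim))
print(junk_lookup[(True, "credit", False)])
```

The fact table then carries a single junk_key column instead of three separate flag columns. The row count is the product of the value counts, so this approach suits small flag sets.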

21 Fact Tables
Dimension keys
Degenerate dimension keys (if they exist)
Facts
  – Additive
  – Semi-additive
  – Non-additive
  – None (factless tables)

22 Facts - Additive
These are measures of activity
Can be added across all combinations of dimensions
Examples: sales in dollars or units

23 Facts - Semi-additive/Non-additive
These are measures of intensity
Some may be added across some dimensions but not others
  – e.g. Bank Balance
Some may not be added at all
  – e.g. Temperature
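
The bank balance example can be made concrete with a small sketch. The account names, dates, and amounts are made up. Summing balances across accounts for one day is meaningful, while summing across days is not, so an average is used over time instead.

```python
# (day, account, end-of-day balance); hypothetical sample data.
balances = [
    ("2000-01-01", "A", 100.0),
    ("2000-01-01", "B", 50.0),
    ("2000-01-02", "A", 120.0),
    ("2000-01-02", "B", 50.0),
]

# Valid: add across the account dimension within a single day.
total_by_day = {}
for day, _account, balance in balances:
    total_by_day[day] = total_by_day.get(day, 0.0) + balance

# Not valid: adding the daily totals together would double count money.
# Across the time dimension, use an average (or a period-end value) instead.
avg_daily_total = sum(total_by_day.values()) / len(total_by_day)

print(total_by_day)
print(avg_daily_total)
```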

24 Closing
Other things to look at
  – Mutating dimensions
  – Hierarchical data (e.g. product structures)
  – Security
  – Data Loading
  – Cleansing
  – etc.

