Published by Adrian Blankenship. Modified over 9 years ago.
1
Data warehousing theory and modelling techniques
Graduate course on dimensional modelling
2
1. EXTENDED DIMENSION TABLE DESIGNS
1.1. Many-to-many dimensions
1.2. Many-to-many traps
1.3. Role-playing dimensions
1.4. Organisation and parts hierarchies
1.5. Time stamping the changes
1.6. Building an audit dimension
1.7. Conclusion: too few or too many dimensions
3
2. EXTENDED FACT TABLE DESIGNS
2.1. Facts of differing granularity and allocating
2.2. Time of day
2.3. Multiple units of measurement
2.4. Value band reporting
4
1.1. Many-to-many dimensions: if one of the dimensions has many values
5
Solution:
- A bridge table between the fact table and the many-to-many dimension.
- The original diagnoses key in the fact table must be generalised to a special diagnoses key: a group key. Each group carries a weighting factor per diagnosis, and the weighting factors of one group sum to 1.00.
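The bridge-table idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table layouts, the column names (`diagnosis_group_key`, `billed_amount`) and all values are hypothetical and do not come from the slides.

```python
# Hypothetical sample data; every name and number here is an assumption.

# Bridge table rows: (diagnosis_group_key, diagnosis_key, weighting_factor)
bridge = [
    (1, "D10", 0.5),
    (1, "D20", 0.3),
    (1, "D30", 0.2),
    (2, "D10", 1.0),
]

# Fact rows point to a diagnosis group instead of a single diagnosis.
facts = [
    {"patient": "P1", "diagnosis_group_key": 1, "billed_amount": 1000.0},
    {"patient": "P2", "diagnosis_group_key": 2, "billed_amount": 400.0},
]


def check_weights(bridge_rows, tolerance=1e-9):
    """Return True if the weighting factors of every group sum to 1.00."""
    totals = {}
    for group_key, _diagnosis, weight in bridge_rows:
        totals[group_key] = totals.get(group_key, 0.0) + weight
    return all(abs(total - 1.0) <= tolerance for total in totals.values())


def weighted_amount_by_diagnosis(fact_rows, bridge_rows):
    """Spread each fact's amount over its diagnoses using the weights."""
    result = {}
    for fact in fact_rows:
        for group_key, diagnosis, weight in bridge_rows:
            if group_key == fact["diagnosis_group_key"]:
                share = fact["billed_amount"] * weight
                result[diagnosis] = result.get(diagnosis, 0.0) + share
    return result


weights_ok = check_weights(bridge)
report = weighted_amount_by_diagnosis(facts, bridge)
```

Because the weights of each group sum to 1.00, the weighted report adds up to the same grand total as the fact table itself, so amounts are not double counted across diagnoses.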
6
1.1. Many-to-many dimensions: if one of the dimensions has many values
7
1.2. Many-to-many traps: be aware of different cardinality when a dimension is attached to several fact tables
8
1.3. Role-playing dimensions
A role in a data warehouse is a situation where a single dimension appears several times in the same fact table. The underlying dimension may exist as a single physical table, but each of the roles must be presented in a separately labelled view.
E.g. in the telecommunications industry, for a single call we might register:
- source system provider
- local switch provider
- long distance provider
- added value service provider
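A minimal sketch of role-playing views, using Python's built-in sqlite3 module. The table names, column names and provider names are assumptions made for illustration; the slides only state that one physical dimension is exposed through separately labelled views.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One physical provider dimension and a call fact table with four provider roles.
conn.executescript("""
CREATE TABLE provider (provider_key INTEGER PRIMARY KEY, provider_name TEXT);
CREATE TABLE call_fact (
    call_id INTEGER PRIMARY KEY,
    source_system_provider_key INTEGER,
    local_switch_provider_key INTEGER,
    long_distance_provider_key INTEGER,
    added_value_service_provider_key INTEGER,
    duration_seconds INTEGER
);
""")

# One separately labelled view per role, all reading the same physical table.
roles = ["source_system", "local_switch", "long_distance", "added_value_service"]
for role in roles:
    conn.execute(
        f"CREATE VIEW {role}_provider AS "
        f"SELECT provider_key AS {role}_provider_key, "
        f"provider_name AS {role}_provider_name FROM provider"
    )

conn.executemany(
    "INSERT INTO provider VALUES (?, ?)",
    [(1, "Alpha"), (2, "Beta"), (3, "Gamma")],
)
conn.execute("INSERT INTO call_fact VALUES (100, 1, 2, 3, 1, 95)")

# Each role joins through its own view, so column names never collide.
row = conn.execute("""
SELECT s.source_system_provider_name,
       l.local_switch_provider_name,
       d.long_distance_provider_name,
       a.added_value_service_provider_name
FROM call_fact f
JOIN source_system_provider s
  ON f.source_system_provider_key = s.source_system_provider_key
JOIN local_switch_provider l
  ON f.local_switch_provider_key = l.local_switch_provider_key
JOIN long_distance_provider d
  ON f.long_distance_provider_key = d.long_distance_provider_key
JOIN added_value_service_provider a
  ON f.added_value_service_provider_key = a.added_value_service_provider_key
""").fetchone()
```

The views cost no extra storage, and the distinct column labels keep report headings unambiguous when the same provider appears in two roles of one call.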
9
1.4. Organisation and parts hierarchies
11
1.4. Organisation and parts hierarchies
12
1.5. Time stamping
- E.g. the human resources environment of a large enterprise with 100,000 employees
- 3 kinds of queries against the HR data:
  1. summary statistics of the entire employee database (monthly)
  2. profiling the employee population at any moment in time (month end or not)
  3. every employee transaction must be represented distinctly: we want to see every transaction on a given employee, with the correct transaction sequence and the correct timing of each transaction
- Query type 3 = detailed transaction history, or fundamental truth
13
1.5. Time stamping
E.g. an employee transaction dimension time stamped with current and next transaction dates and times
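A small sketch of how current and next transaction time stamps can support both a point-in-time profile and a per-employee transaction sequence. The row layout, the field names and the far-future placeholder for the latest row are assumptions, not details taken from the slides.

```python
from datetime import datetime

# Assumed convention: the most recent row of an employee gets a far-future "next" stamp.
FAR_FUTURE = datetime(9999, 12, 31)

# Hypothetical employee transaction dimension rows.
employee_transactions = [
    {"employee": "E1", "transaction": "hire", "status": "junior analyst",
     "ts": datetime(2020, 1, 15, 9, 0), "next_ts": datetime(2021, 6, 1, 9, 0)},
    {"employee": "E1", "transaction": "promotion", "status": "senior analyst",
     "ts": datetime(2021, 6, 1, 9, 0), "next_ts": FAR_FUTURE},
    {"employee": "E2", "transaction": "hire", "status": "analyst",
     "ts": datetime(2021, 3, 10, 9, 0), "next_ts": datetime(2022, 2, 1, 17, 0)},
    {"employee": "E2", "transaction": "termination", "status": "left",
     "ts": datetime(2022, 2, 1, 17, 0), "next_ts": FAR_FUTURE},
]


def profile_at(rows, moment):
    """Return, per employee, the row in effect at `moment` (any moment, month end or not)."""
    return {r["employee"]: r for r in rows if r["ts"] <= moment < r["next_ts"]}


def history_of(rows, employee):
    """Return every transaction of one employee in the correct sequence."""
    return sorted((r for r in rows if r["employee"] == employee), key=lambda r: r["ts"])


snapshot = profile_at(employee_transactions, datetime(2021, 4, 20, 12, 0))
e1_sequence = [r["transaction"] for r in history_of(employee_transactions, "E1")]
```

Storing the next transaction time on each row turns the point-in-time question into a simple range test, with no need to search for the following row at query time.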
14
1.6. Building an audit dimension during the extract process in the data staging phase
15
2.1. Facts of differing granularity and allocating
E.g. shipment invoice
- Individual fact records should be at the lowest atomic level.
- When faced with facts of differing granularity, try to force all the facts to the lowest level.
- When allocating facts to the lowest level is impossible, the higher-level facts have to be presented in separate tables.
- High-level plans or forecasts: make sure that the actual aggregates exist at the same level as the plans.
  1. Aggregate table and plan table share exactly the same dimensions => combine them in a single physical table.
  2. Aggregate table and plan table have different dimensions => combining them in a single physical table is impossible.
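One way to force a higher-level fact down to the lowest level is proportional allocation. The sketch below assumes a header-level freight charge on a shipment invoice that is spread over the line items in proportion to their amounts; the allocation rule, the function name and the figures are illustrative assumptions, not something the slides prescribe.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def allocate(header_amount, line_amounts):
    """Spread a header-level amount over line items in proportion to their amounts.

    The last line absorbs the rounding remainder so the parts always sum
    back to the header amount. Assumes a non-zero total of line amounts.
    """
    total = sum(line_amounts)
    shares = []
    for amount in line_amounts[:-1]:
        share = (header_amount * amount / total).quantize(CENT, rounding=ROUND_HALF_UP)
        shares.append(share)
    shares.append(header_amount - sum(shares))
    return shares


# Hypothetical invoice: one freight charge, three line items of equal value.
freight = Decimal("10.00")
line_amounts = [Decimal("30.00"), Decimal("30.00"), Decimal("30.00")]
allocated = allocate(freight, line_amounts)
```

Letting one line absorb the remainder keeps the allocated facts additive: summing the line-level freight over the invoice reproduces the original header value exactly.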
16
2.1. Facts of differing granularity and allocating
17
2.1. Facts of differing granularity and allocating: high level plans or forecasts
18
2.1. Facts of differing granularity and allocating: high level plans or forecasts
The plan version dimension makes it impossible to combine the plan table with an aggregate table
19
2.2. Time of day
20
2.3. Multiple units of measure The wrong design when fact table quantities need to be expressed in several units of measure
21
2.3. Multiple units of measure
The recommended design for multiple units of measure
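The slide's diagram is not reproduced in this transcript. A commonly described approach, assumed here, is to store each quantity once in a base unit and to carry the conversion factors alongside it in the same fact record, so that every report derives other units from the same numbers. All field names and values below are hypothetical.

```python
# Hypothetical fact record: one stored quantity plus its conversion factors.
fact_row = {
    "product": "widget",
    "quantity_base_units": 240,  # stored once, in individual items
    "items_per_case": 12,        # conversion factor carried with the fact
    "items_per_pallet": 480,     # conversion factor carried with the fact
}


def quantity_in(row, factor_name):
    """Express the stored base quantity in another unit via its conversion factor."""
    return row["quantity_base_units"] / row[factor_name]


cases = quantity_in(fact_row, "items_per_case")
pallets = quantity_in(fact_row, "items_per_pallet")
```

Keeping the factors next to the quantity avoids a separate column per unit for every measure and avoids a conversion lookup that different reports might apply inconsistently.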
22
2.4. Value band reporting
Report of the form:

balance range    number of accounts    total of balances
0-1000           45678                 $10222543
1001-2000        36154                 $45455789
2001-5000        11485                 $30851455
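A report of this form can be produced from a small band definition table that is matched against each account balance. The sketch below is an illustration only: the band table layout, the function name and the sample balances are assumptions, and the balances are unrelated to the figures in the report above. Balances are assumed to be whole currency units, matching the 1000/1001 band edges.

```python
# Band definition table: (band name, lower bound, upper bound), bounds inclusive.
bands = [
    ("0-1000", 0, 1000),
    ("1001-2000", 1001, 2000),
    ("2001-5000", 2001, 5000),
]

# Hypothetical account balances in whole currency units.
balances = [250, 990, 1500, 1999, 4200]


def band_report(account_balances, band_table):
    """Count accounts and total their balances per value band."""
    report = {name: {"accounts": 0, "total": 0} for name, _low, _high in band_table}
    for balance in account_balances:
        for name, low, high in band_table:
            if low <= balance <= high:
                report[name]["accounts"] += 1
                report[name]["total"] += balance
                break  # bands do not overlap, so stop at the first match
    return report


report = band_report(balances, bands)
```

Keeping the bands in a table rather than in the query logic means the ranges can be redefined, or several alternative band sets offered, without touching the facts.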