Published by Lee Tucker. Modified over 8 years ago.
DWH, Prof. Bayer, SS 2002

Solution proposal for exercise 1 on sheet 1

Table        Attribute     Type       Cardinality
Caller       Prefix        smallint   100
             Number        integer    10^7
             Name          string
             Address       string
             ...
Callee       Prefix        smallint   100
             Number        integer    10^7
             Name          string
             Address       string
             ...
TimeOfCall   Year          smallint   10
             Month         string     12
             Day           smallint   31
             Hour          smallint   24
             Minute        smallint   60
Rate         Cents         decimal
LocCaller    XCoord        integer    10^4
             YCoord        integer    10^4
LocCallee    XCoord        integer    10^4
             YCoord        integer    10^4
CallsFacts   Prefix ... YCoord (13 key components, 7 dimensions)
             DurationSec   integer
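The count of 13 key components can be checked with a small sketch. The dictionary layout and the names `KEY_COMPONENTS` and `count_key_components` are illustrative and not from the slides; the sketch also assumes that Name and Address are descriptive attributes rather than key components, and it leaves Rate out because no cardinality is given for it.

```python
# Key components per dimension table, with the cardinalities from the schema.
# Assumption: Name and Address are descriptive attributes, not part of the key.
KEY_COMPONENTS = {
    "Caller": {"Prefix": 100, "Number": 10**7},
    "Callee": {"Prefix": 100, "Number": 10**7},
    "TimeOfCall": {"Year": 10, "Month": 12, "Day": 31, "Hour": 24, "Minute": 60},
    "LocCaller": {"XCoord": 10**4, "YCoord": 10**4},
    "LocCallee": {"XCoord": 10**4, "YCoord": 10**4},
}


def count_key_components(schema):
    """Count the attributes that together form the compound key of CallsFacts."""
    return sum(len(attributes) for attributes in schema.values())


print(count_key_components(KEY_COMPONENTS))  # 13
```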
Solution proposal for exercise 2 on sheet 1

Data volume: 10^8 calls/day * 365 days/year * 49 B/call = 1.7885 * 10^12 B/year ~ 2 TB/year

The size of the space spanned by the dimensions is:
100 * 10^7 * 100 * 10^7 * 10 * 12 * 31 * 24 * 60 * 10^4 * 10^4 * 10^4 * 10^4 = 5.4 * 10^40

The number of tuples is 10^8 calls/day * 365 days/year = 3.65 * 10^10

Sparsity: 1 - (3.65 * 10^10 / (5.4 * 10^40)) ~ 1 - 10^-30 = 0.999999999999999999999999999999
i.e. extremely sparse, but not unusual for data mining.
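The arithmetic for exercise 2 can be reproduced with exact integers. This is a minimal sketch; the variable names are illustrative, and `Fraction` is used because `1 - density` would round to exactly 1.0 in double-precision floating point.

```python
from fractions import Fraction
from math import prod

CALLS_PER_DAY = 10**8
DAYS_PER_YEAR = 365
BYTES_PER_CALL = 49

# Cardinalities of the 13 key components, in the order used in the product:
# caller prefix/number, callee prefix/number, year, month, day, hour, minute,
# then the four coordinates.
CARDINALITIES = [100, 10**7, 100, 10**7, 10, 12, 31, 24, 60,
                 10**4, 10**4, 10**4, 10**4]

tuples_per_year = CALLS_PER_DAY * DAYS_PER_YEAR
bytes_per_year = tuples_per_year * BYTES_PER_CALL
space_size = prod(CARDINALITIES)

# Fraction of the cells in the key space that actually hold a fact.
density = Fraction(tuples_per_year, space_size)
sparsity = 1 - density

print(f"data volume: {bytes_per_year:.4e} B/year")
print(f"space size:  {space_size:.2e} cells")
print(f"density:     {float(density):.2e}")
```

The density comes out slightly below 10^-30, so rounding the sparsity to 1 - 10^-30 is on the conservative side.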
Solution proposal for exercise 3 on sheet 1

The partially aggregated cube has
100 * 100 * 10 * 12 * 31 * 24 * 60 * 10^16 = 5.4 * 10^26 tuples;
it cannot be computed or stored with foreseeable technology.
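The product for the partially aggregated cube omits the two Number components (10^7 each), which suggests the cube is aggregated over the caller and callee numbers. Under that reading, a short sketch of the size estimate looks as follows; the names are illustrative.

```python
from math import prod

# Key components that remain after aggregating over both Number attributes
# (assumption: this is what the product on the slide expresses).
REMAINING_CARDINALITIES = {
    "CallerPrefix": 100,
    "CalleePrefix": 100,
    "Year": 10,
    "Month": 12,
    "Day": 31,
    "Hour": 24,
    "Minute": 60,
    "Coordinates": 10**16,  # four coordinates with 10^4 values each
}

cube_cells = prod(REMAINING_CARDINALITIES.values())
facts_per_year = 10**8 * 365

print(f"cube cells: {cube_cells:.2e}")
print(f"cells per stored fact: {cube_cells / facts_per_year:.2e}")
```

Even this aggregated cube has on the order of 10^16 cells for every fact recorded in a year, which is why materializing it is out of reach.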
Solution proposal for exercise 4 on sheet 1

1. Bit vectors of length 3.65 * 10^10 bits = 36.5 * 10^9 bits ~ 5 * 10^9 B = 5 GB uncompressed.
   Bit vectors for TimeOfCall: 137 vectors of 5 GB ~ 685 GB
   Bit vectors for LocCaller: 10,000 vectors of 5 GB ~ 50 TB per coordinate
   Bit vectors are usable at most for TimeOfCall, but not for the other dimensions.
2. Compound B-Trees: 45 B/compound key + 4 B/fact
   Relation size: ~ 2 TB / 8 KB/page ~ 0.25 * 10^9 pages (about 225,000,000 pages with the unrounded 1.7885 * 10^12 B)
   Height of B-Tree: at 45 B/key + 4 B/pointer, the branching degree is ~ 160 for 8 KB pages, giving height 5.
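The index estimates can be sketched in a few lines. The names are illustrative, 8 KB is taken as 8192 B, and `btree_height` uses a deliberately simple model (every level multiplies the number of addressable entries by the branching degree) rather than a precise B-tree occupancy analysis.

```python
TUPLES = 365 * 10**8        # facts per year, from exercise 2
PAGE_BYTES = 8 * 1024       # assumption: 8 KB page = 8192 B
ENTRY_BYTES = 45 + 4        # compound key plus pointer

# One bit per tuple and per distinct attribute value.
bitvector_bytes = TUPLES // 8                        # about 4.6 GB, rounded to 5 GB
time_vectors = 10 + 12 + 31 + 24 + 60                # one vector per TimeOfCall value
time_index_bytes = time_vectors * bitvector_bytes
loc_coord_index_bytes = 10**4 * bitvector_bytes      # per coordinate

# Upper bound on the branching degree; the slide uses ~160.
max_fanout = PAGE_BYTES // ENTRY_BYTES


def btree_height(entries, fanout):
    """Smallest number of levels whose fanout product covers all entries."""
    height, capacity = 1, fanout
    while capacity < entries:
        capacity *= fanout
        height += 1
    return height


print(time_vectors, max_fanout, btree_height(TUPLES, 160))
```

With the unrounded 4.56 GB per vector, the TimeOfCall bitmaps come to about 625 GB instead of 685 GB; the conclusion does not change.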
Solution proposal for exercise 4 on sheet 1, continued

Example Query1:
    select C.Name, C.Prefix, C.Number, F.Year, F.Month, sum(F.DurationSec)
    from Caller C, CallsFacts F
    where C.Name = 'Rudolf Bayer'
      and C.Prefix = F.Prefix
      and C.Number = F.Number
    group by C.Name, C.Prefix, C.Number, F.Year, F.Month

Time estimate with B-Tree index:
    ~ 10 calls/day = 3650 calls/year
    3650 calls/year * 49 B/call = 178,850 B/year
    178,850 B / 8000 B/page ~ 23 pages
    23 pages * 10 ms/page ~ ¼ second

Example Query2: ... from Callee C ...
    B-Tree not usable, since the callee components do not lead the compound key
    Relation scan of ~ 2 TB at 10 MB/s ~ 2 days
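Both time estimates follow from the figures on the slides. The sketch below recomputes them; the variable names are illustrative, and the 10 ms per page access and 10 MB/s scan rate are the values assumed in the solution.

```python
import math

BYTES_PER_CALL = 49
PAGE_BYTES = 8000            # rounded page size used in the estimate
SECONDS_PER_PAGE = 0.010     # 10 ms per random page access
SCAN_BYTES_PER_SECOND = 10 * 10**6
RELATION_BYTES = 2 * 10**12  # ~ 2 TB/year from exercise 2

# Query 1: one caller, about 10 calls per day, fetched via the B-tree.
calls_per_year = 10 * 365
caller_bytes = calls_per_year * BYTES_PER_CALL
pages = math.ceil(caller_bytes / PAGE_BYTES)
index_seconds = pages * SECONDS_PER_PAGE

# Query 2: no usable index, so the whole fact relation is scanned.
scan_seconds = RELATION_BYTES / SCAN_BYTES_PER_SECOND
scan_days = scan_seconds / 86_400

print(f"Query1: {pages} pages, {index_seconds:.2f} s")
print(f"Query2: {scan_days:.1f} days")
```

The gap between roughly a quarter of a second and more than two days is the point of the exercise: the compound B-tree only helps when the query restricts the leading key components.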