1
Creating HIGH PERFORMANCE TABULAR MODELS
James Beresford, Director, Agile BI, Sydney, Australia
2
Creating High Performance Tabular Models
Thinking Tabular
Compression
Snowflaking and Stars
Complexity
Loading
Performance
Security
Server stuff
2 | 11/21/2018
3
A little about me
www.bimonkey.com
@BI_Monkey
www.agilebi.com.au
4
Thinking Tabular
Tabular and OLAP are different
Tabular and SQL are different
I will guide you to new thinking!
5
Compression #1 – How it works
A few clever things are going on, but Dictionary Compression is the one that matters the most.
6
Compression #1 – Dictionary Compression
Values are translated to Keys

Value (many bytes)    Key (single byte)
Bob Smith             1
Fred Gorilla          2
Jane Chimp            3
7
Compression #1 – Dictionary Compression
Value           Key    Amount
Bob Smith       1      23.50
Fred Gorilla    2      10.99
Jane Chimp      3      11.99

Further Amount values on the slide: 23.78, 22.98, 1.05, 7.66

What you see is not what the engine is storing
Displayed but not stored
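The value-to-key translation on these two slides can be sketched in a few lines of Python. This is only an illustration of the general idea of dictionary encoding, not the engine's actual implementation; the function name `dictionary_encode` and the repeated sample names are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def dictionary_encode(values: List[str]) -> Tuple[Dict[str, int], List[int]]:
    """Return (dictionary, encoded column) for a list of text values.

    Hypothetical helper: each distinct value gets a small integer key,
    and the column is stored as the list of keys.
    """
    dictionary: Dict[str, int] = {}
    encoded: List[int] = []
    for value in values:
        if value not in dictionary:
            # Keys start at 1 to mirror the slide's Value/Key table.
            dictionary[value] = len(dictionary) + 1
        encoded.append(dictionary[value])
    return dictionary, encoded


# Sample column with repeats (the repeats are made up for illustration).
names = ["Bob Smith", "Fred Gorilla", "Jane Chimp", "Bob Smith", "Fred Gorilla"]
dictionary, encoded = dictionary_encode(names)
print(dictionary)
print(encoded)
```

Each long text value appears once in the dictionary, while the column itself holds only the small keys, which is why repeated values cost so little.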
8
Compression #2 – Not all Data is equal
Some data compresses better than others, or is more efficient generally

Good             OK                                             Bad
Repeated Text    Date                                           Unique Text
Integers         Decimals (the fewer decimal places the better) DateTime
Boolean          Currency                                       Floats
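One plausible reason Date sits ahead of DateTime in this table, assuming dictionary size tracks the number of distinct values in a column, is cardinality. The sketch below counts distinct values in a made-up event log; the event spacing and date range are assumptions chosen only for illustration.

```python
from datetime import datetime, timedelta

# Hypothetical event log: one event every 7 minutes across 30 days.
start = datetime(2018, 1, 1)
event_count = 30 * 24 * 60 // 7
events = [start + timedelta(minutes=7 * i) for i in range(event_count)]

# A DateTime column needs one dictionary entry per distinct timestamp.
distinct_datetimes = len(set(events))

# Truncated to Date, the same rows need only one entry per day.
distinct_dates = len({event.date() for event in events})

print(distinct_datetimes, distinct_dates)
```

The same rows produce thousands of distinct DateTime values but only a few dozen distinct Date values, so the Date column has far more repetition to exploit.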
9
Compression #3 - Repetition
Repetition is ok
Repetition is ok
10
Compression #3 - Repetition
Many rows of data are often better than many columns
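A rough way to see the rows-versus-columns point, assuming each column carries its own dictionary of distinct values, is to count dictionary entries for the same data in wide and tall shapes. The table contents, column names, and the helper `dictionary_entries` below are all hypothetical.

```python
# Wide shape: one column per month (made-up sample data).
wide_rows = [
    {"Product": "A", "Jan": 10, "Feb": 10, "Mar": 20},
    {"Product": "B", "Jan": 20, "Feb": 10, "Mar": 10},
    {"Product": "C", "Jan": 10, "Feb": 20, "Mar": 20},
]
months = ["Jan", "Feb", "Mar"]

# Tall shape: one row per product and month.
tall_rows = [
    {"Product": row["Product"], "Month": month, "Amount": row[month]}
    for row in wide_rows
    for month in months
]


def dictionary_entries(rows, columns):
    """Sum the number of distinct values per column (one dictionary per column)."""
    return sum(len({row[column] for row in rows}) for column in columns)


wide_entries = dictionary_entries(wide_rows, months)
tall_entries = dictionary_entries(tall_rows, ["Month", "Amount"])
print(wide_entries, tall_entries)
```

In the wide shape every month column repeats the same amounts in its own dictionary; in the tall shape a single Amount column shares one dictionary, and the gap widens as more month columns are added.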
11
Compression #4 – Digging in
SSAS DMV: DISCOVER_OBJECT_MEMORY_USAGE
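As a sketch of how the output of this DMV might be summarised once it has been exported, the Python below totals memory per parent object. The query string is the usual form for querying an SSAS DMV; the sample rows, the simplified parent paths, and the helper `memory_by_parent` are made up for illustration, and the column names should be checked against the DMV output on your own server.

```python
from collections import defaultdict

# Query to run against the SSAS instance (for example from an MDX query window).
DMV_QUERY = "SELECT * FROM $SYSTEM.DISCOVER_OBJECT_MEMORY_USAGE"

# Hypothetical exported rows; real parent paths are longer and the numbers are invented.
sample_rows = [
    {"OBJECT_PARENT_PATH": "Sales", "OBJECT_ID": "Amount",
     "OBJECT_MEMORY_NONSHRINKABLE": 4_000_000, "OBJECT_MEMORY_SHRINKABLE": 0},
    {"OBJECT_PARENT_PATH": "Sales", "OBJECT_ID": "OrderDateTime",
     "OBJECT_MEMORY_NONSHRINKABLE": 9_000_000, "OBJECT_MEMORY_SHRINKABLE": 500_000},
    {"OBJECT_PARENT_PATH": "Customer", "OBJECT_ID": "Name",
     "OBJECT_MEMORY_NONSHRINKABLE": 1_200_000, "OBJECT_MEMORY_SHRINKABLE": 0},
]


def memory_by_parent(rows):
    """Total shrinkable plus non-shrinkable bytes per parent path, largest first."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["OBJECT_PARENT_PATH"]] += (
            row["OBJECT_MEMORY_NONSHRINKABLE"] + row["OBJECT_MEMORY_SHRINKABLE"]
        )
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


usage = memory_by_parent(sample_rows)
for parent, total_bytes in usage.items():
    print(f"{parent}: {total_bytes / 1_000_000:.1f} MB")
```

Sorting by total makes the most expensive tables and columns stand out first, which is the starting point for deciding what to trim or reshape.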
12
Compression #4 – Digging in
Handily, I built an explorer: memory-usage-in-tabular-models/
13
Compression #4 – Digging in
Demo!
Not all data is equal
Rows vs Columns
User Experience
14
Snowflaking
Relationships are expensive
15
Snowflaking
Flatten Everything
16
Snowflaking – OLAP style
Three related tables:

Animal: Gorilla, Orangutan, Grey Hippo, Dwarf Hippo, Zebra, Normal Horse, This Giraffe, That Giraffe, Great White, Etc…
Animal SubCategory: Great Apes, Regular Apes, Hippopotamuses, Horses, Giraffes, Sharks, Freshwater Eels
Animal Category: Apes, Ungulates, Fish
17
Snowflaking – Tabular style
One flattened Animal table:

Category     SubCategory       Animal
Apes         Great Apes        Gorilla
Apes         Regular Apes      Orangutan
Ungulates    Hippopotamuses    Grey Hippo
Ungulates    Hippopotamuses    Dwarf Hippo
Ungulates    Horses            Zebra
Ungulates    Horses            Normal Horse
Ungulates    Giraffes          This Giraffe
Ungulates    Giraffes          That Giraffe
Fish         Sharks            Great White
Etc…
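The move from the OLAP-style snowflake to the single Tabular-style table amounts to resolving the lookups once, before the model is loaded. A minimal Python sketch follows, assuming hypothetical surrogate keys (the slides show no keys) and a subset of the animals; in practice this would more likely be a view or ETL step.

```python
# Snowflaked lookup tables with made-up surrogate keys.
categories = {1: "Apes", 2: "Ungulates", 3: "Fish"}
subcategories = {
    10: ("Great Apes", 1),
    12: ("Hippopotamuses", 2),
    13: ("Horses", 2),
    15: ("Sharks", 3),
}
animals = [
    ("Gorilla", 10),
    ("Grey Hippo", 12),
    ("Dwarf Hippo", 12),
    ("Zebra", 13),
    ("Great White", 15),
]


def flatten(animals, subcategories, categories):
    """Resolve both lookups so every row carries Category and SubCategory text."""
    rows = []
    for animal, subcategory_key in animals:
        subcategory_name, category_key = subcategories[subcategory_key]
        rows.append({
            "Category": categories[category_key],
            "SubCategory": subcategory_name,
            "Animal": animal,
        })
    return rows


flat_animals = flatten(animals, subcategories, categories)
for row in flat_animals:
    print(row)
```

The flattened rows repeat Category and SubCategory text, which the earlier compression slides suggest is cheap, and the model no longer has to navigate two relationships at query time.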
18
Snowflaking – Tabular style
Demo!
Hierarchies
User Experience
19
Snowflaking – Tabular style
Need to flatten to get hierarchies anyway
Relationships cost
Repetition is ok
20
Stars
Simple stars are ok!
Use Integer Keys*
There is a cardinality limit around 2m keys per query
* This needs validating
21
Complexity
Complex stars are less ok
Many to Many handling is not great
But it does work…
The more relationships to navigate, the more likely it is to break
After a point the processing kills the model
22
Loading – what can you control?
Processing steps shown in the diagram: Reads Data, Compresses, Row Calculations, Builds Relationships, Processing Complete
23
Loading
ETL is more efficient than Cube
Use Columnstores
Use Views to abstract
Reduce the work involved in processing to just compression and building relationships
24
Performance
Storage Engine Good
Formula Engine Bad
Learn to love DAX Studio
25
Performing – Storage Engine
Brute force scanning of memory
Fast, sub-second responses over vast data
Storage – Brute Scanning
Simple but Fast
26
Performing – Formula Engine
Single Threaded calculations
Biggest Bottleneck
Formula – CPU Calculations
Complex but Slow
27
Badly written security slows everything down
Same rules apply to security as to all other calculations
28
Security
Demo: Hierarchy walking approaches
29
Server Stuff
Memory overhead = 3x Cube size
Caches
Query Execution
NEVER RUN OUT OF MEMORY!!
25GB/s - $9/GB
1GB/s - $0.03/GB
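The 3x rule of thumb on this slide can be turned into a quick sizing check. The sketch below assumes the rule means total RAM to provision is roughly three times the in-memory model size; the slide does not spell this out, and the function name and example size are hypothetical.

```python
def ram_to_provision_gb(model_size_gb: float, overhead_factor: float = 3.0) -> float:
    """Estimate RAM to provision for a model of the given size.

    Assumption: the slide's "3x Cube size" covers the model plus caches
    and query execution, so the estimate is simply size times factor.
    """
    if model_size_gb < 0:
        raise ValueError("model_size_gb must be non-negative")
    return model_size_gb * overhead_factor


# Example: a hypothetical 20 GB model.
print(ram_to_provision_gb(20))
```

Keeping the factor as a parameter makes it easy to rerun the estimate if measurements on a real server suggest a different multiple.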
30
Server Stuff
More, faster RAM
Faster CPU
NUMA issues
31
A Big Thanks to our Sponsors