Creating HIGH PERFORMANCE TABULAR MODELS


1 Creating HIGH PERFORMANCE TABULAR MODELS
James BERESFORD DIRECTOR AGILE BI SYDNEY, AUSTRALIA

2 Creating High Performance Tabular Models
Thinking Tabular
Compression
Snowflaking and Stars
Complexity
Loading
Performance
Security
Server stuff
2 | 11/21/2018

3 A little about me
www.bimonkey.com
@BI_Monkey
www.agilebi.com.au

4 Thinking Tabular
Tabular and OLAP are different
Tabular and SQL are different
I will guide you to new thinking!

5 Compression #1 – How it works
A few clever things are going on, but…
Dictionary Compression is the one that matters the most

6 Compression #1 – Dictionary Compression
Values are translated to Keys
Value         Key
Bob Smith     1
Fred Gorilla  2
Jane Chimp    3
Value: Many Bytes
Key: Single Byte

7 Compression #1 – Dictionary Compression
Value         Key  Amount
Bob Smith     1    23.50
Fred Gorilla  2    10.99
Jane Chimp    3    11.99
                   23.78
                   22.98
                   1.05
                   7.66
What you see is not what the engine is storing
Value column: displayed but not stored
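A minimal sketch of the dictionary encoding idea on these two slides, written in Python purely for illustration. The function names are hypothetical and this is not how the engine is actually implemented; it only shows values being swapped for small integer keys and looked up again for display.

```python
def dictionary_encode(values):
    """Replace each value with a small integer key; return (dictionary, keys)."""
    dictionary = {}  # value -> key, assigned in order of first appearance
    keys = []
    for value in values:
        if value not in dictionary:
            dictionary[value] = len(dictionary) + 1  # keys start at 1, as on the slide
        keys.append(dictionary[value])
    return dictionary, keys


def dictionary_decode(dictionary, keys):
    """Rebuild the displayed values from the stored keys."""
    reverse = {key: value for value, key in dictionary.items()}
    return [reverse[key] for key in keys]


# Names taken from the slide; the repeated rows are made up for illustration.
names = ["Bob Smith", "Fred Gorilla", "Jane Chimp", "Bob Smith", "Fred Gorilla"]
dictionary, keys = dictionary_encode(names)
decoded = dictionary_decode(dictionary, keys)
```

Each distinct name is stored once in the dictionary, and every row holds only a key, which is why repeated values cost so little.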

8 Compression #2 – Not all Data is equal
Some data compresses better than others, or is more efficient generally
Good           OK                                         Bad
Repeated Text  Date                                       Unique Text
Integers       Decimals (fewer decimal places is better)  DateTime
Boolean        Currency                                   Floats
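One way to see why DateTime sits in the "Bad" column: with dictionary compression, the number of distinct values drives the dictionary size. The sketch below uses made-up timestamps (one every 15 minutes) and compares the distinct count of a single DateTime column against separate date and time-of-day columns. Splitting the column this way is a common modelling suggestion, not something the slide itself states.

```python
from datetime import datetime, timedelta


def distinct_count(values):
    """Number of distinct values, i.e. the size a dictionary would need."""
    return len(set(values))


# Made-up data: 10,000 events, one every 15 minutes, starting at midnight.
start = datetime(2018, 1, 1, 0, 0, 0)
timestamps = [start + timedelta(minutes=15 * i) for i in range(10_000)]

# The same information split into two lower-cardinality columns.
dates = [ts.date() for ts in timestamps]
times = [ts.time() for ts in timestamps]

datetime_cardinality = distinct_count(timestamps)  # every timestamp is unique
date_cardinality = distinct_count(dates)
time_cardinality = distinct_count(times)  # at most 96 quarter-hour slots per day
```

The two split columns together need far fewer dictionary entries than the combined column, while no information is lost.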

9 Compression #3 - Repetition
Repetition is ok
Repetition is ok

10 Compression #3 - Repetition
Many rows of data are often better than many columns
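The rows-versus-columns comparison on this slide was an image that did not survive, so the following is only an illustrative sketch with made-up data: a wide table (one column per month) is unpivoted into a long table with many rows and few columns. The function and column names are hypothetical.

```python
def unpivot(rows, id_column, value_columns, name_column="Measure", value_column="Value"):
    """Turn one wide row per entity into one narrow row per (entity, column) pair."""
    long_rows = []
    for row in rows:
        for column in value_columns:
            long_rows.append(
                {id_column: row[id_column], name_column: column, value_column: row[column]}
            )
    return long_rows


# Made-up wide table: one column per month.
wide = [
    {"Customer": "Bob Smith", "Jan": 23.50, "Feb": 10.99, "Mar": 11.99},
    {"Customer": "Fred Gorilla", "Jan": 23.78, "Feb": 22.98, "Mar": 1.05},
]

long_rows = unpivot(wide, "Customer", ["Jan", "Feb", "Mar"])
```

The long form repeats the customer and month names on many rows, which is exactly the kind of repetition that dictionary compression handles well.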

11 Compression #4 – Digging in
SSAS DMV: DISCOVER_OBJECT_MEMORY_USAGE
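A hedged sketch of how the DMV output might be summarised once it has been fetched. The query string is the standard DMV form; the Python roll-up, the sample rows, and the object paths in them are made up for illustration, and only the column names are taken from the DMV's rowset.

```python
from collections import defaultdict

# Query to run from a client connected to the SSAS instance (for example an MDX query window).
DMV_QUERY = "SELECT * FROM $SYSTEM.DISCOVER_OBJECT_MEMORY_USAGE"


def memory_by_parent(rows):
    """Sum shrinkable and non-shrinkable bytes per parent path, largest first."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["OBJECT_PARENT_PATH"]] += (
            row["OBJECT_MEMORY_SHRINKABLE"] + row["OBJECT_MEMORY_NONSHRINKABLE"]
        )
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical rows standing in for real DMV output; paths and sizes are invented.
sample_rows = [
    {"OBJECT_PARENT_PATH": "Model.Sales", "OBJECT_ID": "CustomerName",
     "OBJECT_MEMORY_SHRINKABLE": 0, "OBJECT_MEMORY_NONSHRINKABLE": 5_000_000},
    {"OBJECT_PARENT_PATH": "Model.Sales", "OBJECT_ID": "Amount",
     "OBJECT_MEMORY_SHRINKABLE": 1_000, "OBJECT_MEMORY_NONSHRINKABLE": 2_000_000},
    {"OBJECT_PARENT_PATH": "Model.Date", "OBJECT_ID": "Date",
     "OBJECT_MEMORY_SHRINKABLE": 0, "OBJECT_MEMORY_NONSHRINKABLE": 40_000},
]

usage = memory_by_parent(sample_rows)
```

Sorting by total bytes puts the most expensive objects at the top, which is usually where the compression work pays off first.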

12 Compression #4 – Digging in
Handily, I built an explorer memory-usage-in-tabular-models/

13 Compression #4 – Digging in
Demo!
Not all data is equal
Rows vs Columns
User Experience

14 Snowflaking
Relationships are expensive

15 Snowflaking
Flatten Everything

16 Snowflaking – OLAP style
Animal: Gorilla, Orangutan, Grey Hippo, Dwarf Hippo, Zebra, Normal Horse, This Giraffe, That Giraffe, Great White, Etc…
Animal SubCategory: Great Apes, Regular Apes, Hippopotamuses, Horses, Giraffes, Sharks, Freshwater Eels
Animal Category: Apes, Ungulates, Fish

17 Snowflaking – Tabular style
Category   SubCategory     Animal
Apes       Great Apes      Gorilla
Apes       Regular Apes    Orangutan
Ungulates  Hippopotamuses  Grey Hippo
Ungulates  Hippopotamuses  Dwarf Hippo
Ungulates  Horses          Zebra
Ungulates  Horses          Normal Horse
Ungulates  Giraffes        This Giraffe
Ungulates  Giraffes        That Giraffe
Fish       Sharks          Great White
Etc…
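Flattening the snowflake amounts to resolving each lookup once, ahead of time, so the model holds one wide table instead of three related ones. A minimal sketch using a few of the animals from the slide; the data structures and function name are assumptions made for illustration, and in practice this would more likely be done in the ETL layer or a view.

```python
# OLAP-style snowflake: three separate tables linked by lookups.
animals = [
    ("Gorilla", "Great Apes"),
    ("Grey Hippo", "Hippopotamuses"),
    ("Dwarf Hippo", "Hippopotamuses"),
    ("Great White", "Sharks"),
]
subcategory_to_category = {
    "Great Apes": "Apes",
    "Hippopotamuses": "Ungulates",
    "Sharks": "Fish",
}


def flatten(animal_rows, category_lookup):
    """Resolve the lookups once, producing (Category, SubCategory, Animal) rows."""
    return [
        (category_lookup[subcategory], subcategory, animal)
        for animal, subcategory in animal_rows
    ]


flat = flatten(animals, subcategory_to_category)
```

The flat rows repeat "Ungulates" and "Hippopotamuses", but no relationship has to be navigated at query time.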

18 Snowflaking – Tabular style
Demo!
Hierarchies
User Experience

19 Snowflaking – Tabular style
Need to flatten to get hierarchies anyway
Relationships cost
Repetition is ok

20 Stars
Simple stars are ok!
Use Integer Keys*
There is a cardinality limit of around 2 million keys per query
* This needs validating

21 Complexity
Complex stars are less ok
Many to Many handling is not great
But it does work…
The more relationships to navigate, the more likely it is to break
After a point the processing kills the model

22 Loading – what can you control?
Diagram labels: Compresses, Reads Data, Row Calculations, Processing Complete, Builds Relationships

23 Loading
ETL is more efficient than Cube
Use Columnstores
Use Views to abstract
Reduce the work involved in processing to just compression and building relationships

24 Performance
Storage Engine Good
Formula Engine Bad
Learn to love DAX Studio

25 Performing – Storage Engine
Brute force scanning of memory
Fast, sub-second responses over vast data
Storage – Brute Scanning
Simple but Fast

26 Performing – Formula Engine
Single Threaded calculations
Biggest Bottleneck
Formula – CPU Calculations
Complex but Slow

27 Badly written security slows everything down
Same rules apply to security as to all other calculations

28 Security
Demo: Hierarchy walking approaches

29 Server Stuff
Memory overhead = 3x Cube size
Caches
Query Execution
NEVER RUN OUT OF MEMORY!!
25GB/s - $9/GB
1GB/s - $0.03/GB
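The 3x rule of thumb can be turned into a quick sizing check. The sketch below assumes the slide means the server should have roughly three times the model's in-memory size available in total; whether the figure is a total or an addition on top of the model is not stated. The function names and example sizes are made up for illustration.

```python
def required_memory_gb(model_size_gb, overhead_factor=3.0):
    """Memory budget under the assumed reading: total = factor x model size."""
    return model_size_gb * overhead_factor


def fits_in_memory(model_size_gb, server_ram_gb, overhead_factor=3.0):
    """True if the server has at least the budgeted memory for this model."""
    return required_memory_gb(model_size_gb, overhead_factor) <= server_ram_gb


# Hypothetical sizes: a 20 GB model and a 30 GB model on a 64 GB server.
budget_small = required_memory_gb(20)
small_fits = fits_in_memory(20, 64)
large_fits = fits_in_memory(30, 64)
```

Under this reading a 20 GB model needs a 60 GB budget and just fits on a 64 GB server, while a 30 GB model does not.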

30 Server Stuff
More, faster RAM
Faster CPU
NUMA issues

31 A Big Thanks to our Sponsors

