Published by Inmaculada Serrano. Modified over 5 years ago.
1
Tools and techniques for managing big data with Power BI
Adam Marshall
2
Our sponsors: Platinum and event, Gold, Global, Silver, Bronze, Raffle
3
Agenda
- Smart Building: need, use cases, sensors, data flow
- Power BI: tools for managing big data, my experience
4
About me
- Adam Marshall, Senior Consultant at EVRY
- From Leeds, UK
- Based in Oslo
- Background in Finance & Business, i.e. from the Excel world
- adam_a_marshall
Notes: Great to be talking at SQL Saturday. Been coming for the last 5 years, got to have a beer and 30 mins with Marco Russo about 4 years ago.
5
Project: Smart Building
Use data to optimise the use of buildings within the workplace:
- Optimal use of real estate
- Good working environment
- Productive and happy employees
- Optimal running and maintenance
Measurements:
- Footfall
- Movement
- Temperature
- Light
- Carbon dioxide
- Sound
- VOC (Volatile Organic Compounds)
- Productivity Index (weighted measure based on comfort scores)
Notes: Goal: make better use of buildings by using the data within them. Explain what this means across the 4 areas. 1st phase: create a report for FM to understand their buildings. Run through what we measured.
6
Flow of Data
Sensor → Yanzi Cloud → Telenor IoT Hub → SQL Server (Azure) → Power BI
Notes: Telenor App. Aggregations written to a customer DB. Data read into PBI (design). My involvement was on the SQL and Power BI side.
7
Customer requirements
- Data at the lowest level available (if we need it)
- Data going back to the beginning of time (if we need it)
- Navigate and drill down through various levels of time aggregation
- Data in real time
- Warning when things are going wrong
Notes: In short, the customer wants everything. Started with this brief to see what was possible.
8
Goal: to explain how much a building or location was in use
- Sensors at desks, meeting rooms, coffee zones.
- Sensors count movements in the room.
- Summarized in SQL as movement/no movement per minute.
- Explains % utilization.
- Report shows utilization over time, by location.
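The movement/no movement per minute summary maps directly onto a utilization percentage. A minimal sketch, assuming utilization means the share of minutes with at least one movement; the sample data and the function name `utilization_pct` are hypothetical, and in the project itself the summarizing was done in SQL.

```python
from collections import defaultdict

def utilization_pct(minute_flags):
    """Return the % of minutes with movement, per location.

    minute_flags: iterable of (location, moved) pairs, one pair per minute,
    where moved is True if any movement was counted in that minute.
    """
    active = defaultdict(int)
    total = defaultdict(int)
    for location, moved in minute_flags:
        total[location] += 1
        if moved:
            active[location] += 1
    return {loc: 100.0 * active[loc] / total[loc] for loc in total}

# Hypothetical sample: four minutes per location.
sample = [
    ("Desk 1", True), ("Desk 1", False), ("Desk 1", True), ("Desk 1", True),
    ("Meeting room A", False), ("Meeting room A", False),
    ("Meeting room A", True), ("Meeting room A", False),
]
result = utilization_pct(sample)
print(result)
```

Keeping the active and total minute counts separate, rather than storing only the percentage, means the figure can be recomputed correctly for any longer period or group of locations.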
9
- Shows carbon dioxide
- Report is dynamic
- Min/Avg/Max
10
How to succeed with big data in Power BI
- Storage mode matters
- Optimize your model
- New features (aggregations, incremental refresh)
- Back to source
- Keep report pages and visuals simple
- Performance analyzer
Notes: Jump to the conclusion, these are my tips. Will go through them individually in more detail, but very quickly: a one-liner explaining each one.
11
Choosing a Storage Mode: Direct Query vs Import
Direct Query: data source (Azure SQL DB) is queried each time you access or interact with the report.
- Slow loading of visuals
- Real time
- Lose functionality in query editor / DAX
- No refresh issues for big data
Import Mode: a copy of the data is imported from the source and cached in memory.
- Fast loading of visuals
- Not real time
- Full functionality of query editor / DAX
- Refresh times are a problem for big data
Notes: Start with Import mode (the default): it imports a copy of the data from the source and queries it in memory. Explain pros and cons and why it worked for me. Reference the DQ model that was too slow and lost lots of DAX functionality. Import mode is your friend (unless using aggregations).
12
How to succeed with big data in Power BI
- Storage mode matters
- Optimize your model
- New features (aggregations, incremental refresh)
- Back to source
- Keep report pages and visuals simple
- Performance analyzer
Notes: Most important bit! The model is the foundation of your report.
13
Original Model
Notes: Explain my model:
- Date and time tables, along with a sensor dimension, slicing a fact table.
- Built measures for min, max and average based on the average value (this is not great, as it takes an average of averages).
- This was a good start and follows some best practices, but the problem was granularity (too many rows).
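The average-of-averages problem is easy to show with two pre-aggregated rows. A minimal sketch with made-up numbers: when the groups hold different numbers of readings, the plain mean of the group averages drifts away from the true average, which needs each group weighted by its count.

```python
# Hypothetical pre-aggregated rows: (sensor, hour, avg_value, reading_count).
rows = [
    ("co2-1", 9, 400.0, 60),   # 60 readings averaging 400
    ("co2-1", 10, 1000.0, 2),  # only 2 readings averaging 1000
]

# Average of averages: treats both hours as equally important.
avg_of_avg = sum(avg for _, _, avg, _ in rows) / len(rows)

# True average: weight each hourly average by its number of readings,
# which is the same as total sum divided by total count.
total_sum = sum(avg * n for _, _, avg, n in rows)
total_count = sum(n for _, _, _, n in rows)
true_avg = total_sum / total_count

print(avg_of_avg)
print(round(true_avg, 2))
```

Storing a sum and a count alongside (or instead of) the average is one way to keep the measure correct at every level of aggregation.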
14
Microsoft guidance on optimizing model
- Follows general modelling advice
- Star schema
- Fact table with few columns
- Reduce cardinality of data
- Only bring in data that you need
- Keep queries in query editor simple
- Avoid DAX that requires evaluation of every row, e.g. RANKX
Notes: Start with this link for good guidance on optimizing the model. Summarising the main points: I followed most of these, but as I say, the customer wanted granular data, so I needed some better tools.
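One common way to reduce cardinality, in line with having separate date and time tables, is to split a timestamp column into a date part and a time-of-day part. A minimal sketch with made-up per-minute data, counting distinct values before and after the split:

```python
from datetime import datetime, timedelta

# Hypothetical data: one reading per minute for three days.
start = datetime(2019, 1, 1)
timestamps = [start + timedelta(minutes=i) for i in range(3 * 24 * 60)]

# One combined datetime column: every value is distinct.
datetime_cardinality = len(set(timestamps))

# Split into a date column and a time-of-day column.
date_cardinality = len({ts.date() for ts in timestamps})
time_cardinality = len({ts.time() for ts in timestamps})

print(datetime_cardinality, date_cardinality, time_cardinality)
```

The combined column grows with every new minute of data, while the time-of-day column stays capped at 1,440 distinct values and the date column grows by one value per day.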
15
How to succeed with big data in Power BI
- Storage mode matters
- Optimize your model
- New features (aggregations, incremental refresh)
- Back to source
- Keep report pages and visuals simple
- Performance analyzer
Notes: Luckily, PBI came with improvements!
16
Evolution
- Direct Query / Import mode
- Incremental Refresh: May 2018 (preview)
- Composite Models: Jul 2018 (preview)
- Aggregations: Sep 2018 (preview)
Notes: First had to make a simple choice between DQ or Import mode. Then came incremental refresh, composite models and aggregations, so started to look into these… Aggregations came out of preview in July. Incremental refresh is still in preview.
17
Incremental refresh
- Only refresh the last x days/weeks/months/years of data
- Keep this data for x days/weeks/months/years
- Power BI Premium feature
Notes:
- What is it: refresh the latest data, not the history. Premium feature.
- Setup (briefly): create DateTime parameters, use these to filter the fact table you want to refresh incrementally, then follow the setup dialogue. RUN DEMO in PBI!
- Problems: the first refresh can be problematic in the Service, when the parameters are removed and it times out. Can't re-download the report once published. Dates table does not update when it is based on the min and max date of the fact table.
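The rolling-window idea behind incremental refresh can be sketched outside Power BI. This is an illustration only, not how the Service implements it: the function names and sample rows are hypothetical, and the only Power BI detail relied on is that the DateTime parameters are named `RangeStart` and `RangeEnd` and are used to filter the fact table.

```python
from datetime import date, timedelta

def plan_refresh(today, refresh_days, keep_days):
    """Split the stored date range into 'reload' and 'keep as-is' parts.

    Returns (archive_start, refresh_start, range_end), where
    dates in [refresh_start, range_end) are reloaded from the source,
    dates in [archive_start, refresh_start) are kept without reloading,
    and anything before archive_start is dropped.
    """
    range_end = today + timedelta(days=1)  # exclusive upper bound
    refresh_start = range_end - timedelta(days=refresh_days)
    archive_start = range_end - timedelta(days=keep_days)
    return archive_start, refresh_start, range_end

def rows_to_reload(rows, refresh_start, range_end):
    """Filter fact rows the way a RangeStart/RangeEnd filter would."""
    return [r for r in rows if refresh_start <= r[0] < range_end]

# Hypothetical fact rows: (reading_date, value), one per day for ten days.
today = date(2019, 8, 31)
facts = [(today - timedelta(days=i), i) for i in range(10)]

archive_start, refresh_start, range_end = plan_refresh(
    today, refresh_days=3, keep_days=7
)
reloaded = rows_to_reload(facts, refresh_start, range_end)
print(archive_start, refresh_start, range_end, len(reloaded))
```

The filter is half-open (inclusive start, exclusive end) so that a row sitting exactly on a boundary is loaded once rather than twice.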
18
Incremental refresh: dates issue
- DateRanges gives the min and max dates, which then drive the filters on the Date table.
- Works in PBI Desktop, but not in the PBI Service.
- Answer: do the filtering in SQL, i.e. push it back to the source!
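Doing the date filtering in SQL could look roughly like the following. A minimal sketch, using an in-memory SQLite database as a stand-in for the real source; the table and view names (`dim_date`, `fact_reading`, `v_dim_date`) are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dim_date (date_key TEXT PRIMARY KEY);
    CREATE TABLE fact_reading (date_key TEXT, value REAL);

    INSERT INTO dim_date VALUES
        ('2019-01-01'), ('2019-01-02'), ('2019-01-03'),
        ('2019-01-04'), ('2019-01-05');
    INSERT INTO fact_reading VALUES
        ('2019-01-02', 410.0), ('2019-01-03', 455.0), ('2019-01-04', 430.0);

    -- The view restricts the date dimension to the range present in the
    -- fact table, so the report loads it without min/max logic of its own.
    CREATE VIEW v_dim_date AS
    SELECT d.date_key
    FROM dim_date AS d
    WHERE d.date_key BETWEEN (SELECT MIN(date_key) FROM fact_reading)
                         AND (SELECT MAX(date_key) FROM fact_reading);
""")

dates = [
    row[0]
    for row in con.execute("SELECT date_key FROM v_dim_date ORDER BY date_key")
]
print(dates)
```

Because the view is evaluated at the source on every refresh, the date range follows the fact table without depending on how parameters behave in the Service.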
19
Composite models
Instead of choosing Direct Query or Import mode, you can now have both in the same model.
- Import tables are updated when the refresh runs.
- Direct Query tables send a query back to the data source when the user navigates the report.
- A Dual table uses the last imported copy when queried via the import fact table, but switches to DQ when queried via the DQ fact table.
Diagram labels: Dual, Import, Direct Query
Notes: What is it. How did I use it (real time, last 24 hours). SHOW NEXT SLIDE. Can't get a model when using streaming in PBI.
20
Problem with DQ – slice and dice will use cache of data
21
Aggregations
- Requires a composite model to work.
- Create a DQ model at the base (import coming soon), supported by imported tables at a higher level of aggregation.
- Note: didn't actually need to split Time into three separate tables…
Diagram labels: Direct Query, Import, Dual
Notes:
- What is it? (Don't need to decide granularity in advance: have several versions.)
- Have to have DQ in the base (an update is coming soon so you can just use import).
- DEMO, USE PAPER! Show 3 tables, contents and row count. Show AGGSDEMO (with 1 measure at heart). Show how to set up on the daily table.
- Final point: will show how to detect which table has been hit later…
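The idea of keeping several versions of the fact table at different granularities, and answering each query from the cheapest one that is detailed enough, can be sketched as follows. This is a simplified illustration, not Power BI's actual matching logic; the table names, grains and row counts are hypothetical.

```python
# Grains ordered from finest to coarsest.
GRAINS = ["minute", "hour", "day"]

# Hypothetical tables: (name, grain, storage mode, row count).
TABLES = [
    ("FactMinute", "minute", "DirectQuery", 50_000_000),
    ("AggHour", "hour", "Import", 900_000),
    ("AggDay", "day", "Import", 40_000),
]

def pick_table(requested_grain):
    """Pick the smallest table that is detailed enough for the request.

    A table can answer a query if its grain is the same as, or finer than,
    the grain the visual asks for.
    """
    wanted = GRAINS.index(requested_grain)
    candidates = [t for t in TABLES if GRAINS.index(t[1]) <= wanted]
    return min(candidates, key=lambda t: t[3])

for grain in GRAINS:
    name, _, mode, _ = pick_table(grain)
    print(f"{grain}: {name} ({mode})")
```

In this sketch a daily visual is served from the small imported table, and only a drill-down to minute level falls through to the Direct Query base table. Rolling up like this is only safe for measures that re-aggregate cleanly, such as sum, count, min and max; an average needs its sum and count carried in the aggregated tables.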
22
How to succeed with big data in Power BI
- Storage mode matters
- Optimize your model
- New features (aggregations, incremental refresh)
- Back to source
- Keep report pages and visuals simple
- Performance analyzer
23
Back to Source
Push ETL back to the source system
- Cumbersome transformations with large data can be painful on a PC: push them back to SQL Server
- Avoid the incremental refresh date issue
- Avoid complex DAX (precalculate in the query editor if you don't need dynamic calculations)
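Pushing a transformation back to the source typically means letting the database do the grouping before the data reaches the report. A minimal sketch, again using in-memory SQLite as a stand-in for SQL Server; the `reading` table and its columns are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE reading (sensor TEXT, ts TEXT, value REAL);
    INSERT INTO reading VALUES
        ('co2-1', '2019-01-01 09:00', 400.0),
        ('co2-1', '2019-01-01 09:01', 420.0),
        ('co2-1', '2019-01-01 10:00', 500.0),
        ('co2-1', '2019-01-02 09:00', 450.0);
""")

# Aggregate to one row per sensor per day in the database, keeping sum and
# count so that a correct average can still be computed at any higher level.
daily = con.execute("""
    SELECT sensor,
           substr(ts, 1, 10) AS day,
           MIN(value)        AS min_value,
           MAX(value)        AS max_value,
           SUM(value)        AS sum_value,
           COUNT(*)          AS reading_count
    FROM reading
    GROUP BY sensor, substr(ts, 1, 10)
    ORDER BY sensor, day
""").fetchall()

for row in daily:
    print(row)
```

The report then imports a few rows per sensor per day instead of one row per reading, and the min, max and average measures become simple operations over precalculated columns.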
24
How to succeed with big data in Power BI
- Storage mode matters
- Optimize your model
- New features (aggregations, incremental refresh)
- Back to source
- Keep report pages and visuals simple
- Performance analyzer
25
Keep Visualisations Simple
Notes:
- Explain graphs.
- Box and whisker: too complex, too many points, calculated on the fly.
- SLOWVIZDEMO: show how badly they perform. PAPER!
- Performance analyzer and its component parts.
- Introduce DAX Studio: connect to the model, server timings for key figures.
- Run daily summary. Explain the two engines. Aggs rewrite.
- Repeat for quarter and last hour.
- Define measures.
26
How to succeed with big data in Power BI
- Storage mode matters
- Optimize your model
- New features (aggregations, incremental refresh)
- Back to source
- Keep report pages and visuals simple
- Performance analyzer
Notes: Summary!!
27
adam_a_marshall