Tools and techniques for managing big data with Power BI

Presentation transcript:

Tools and techniques for managing big data with Power BI. Adam Marshall


Agenda: Smart Building; Power BI. Need, use cases, sensors, data flow. Tools for managing big data. My experience.

About me: Adam Marshall, Senior Consultant at EVRY. From Leeds, UK; based in Oslo. Background in finance and business, i.e. from the Excel world. adam.marshall@evry.com, adam_a_marshall. Notes: Great to be talking at SQL Saturday. I have been coming for the last 5 years, and about 4 years ago I got to have a beer and 30 minutes with Marco Russo.

Project: Smart Building. Use data to optimise the use of buildings within the workplace: optimal use of real estate, a good working environment, productive and happy employees, and optimal running and maintenance. Measurements: footfall, movement, temperature, light, carbon dioxide, sound, VOC (volatile organic compounds), and a productivity index (a weighted measure based on comfort scores). Notes: The goal is to make better use of buildings by using the data within them; explain the 4 areas. First phase: create a report for FM to understand their buildings. Run through what we measured.

Flow of data: Sensor -> Yanzi Cloud -> Telenor IoT Hub -> SQL Server (Azure) -> Power BI. Notes: Telenor app; aggregations are written to a customer DB; data is read into PBI (design). My involvement was on the SQL and Power BI side.

Customer requirements: data at the lowest level available (if we need it); data going back to the beginning of time (if we need it); navigation and drill-down through various levels of time aggregation; data in real time; warnings when things are going wrong. Notes: In short, the customer wants everything. We started with this brief to see what was possible.

Goal: to explain how much a building or location was in use. Sensors at desks, in meeting rooms and in coffee zones count movements in the room. The counts are summarized in SQL to movement/no movement per minute. Explain % utilization. The report shows utilization over time and by location.
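
The utilization figure can be illustrated with a minimal sketch. It assumes that % utilization means the share of minutes with movement per location; the talk does not give the exact formula, and the function name and sample locations below are hypothetical.

```python
from collections import defaultdict


def utilization_by_location(minute_flags):
    """Return % of minutes with movement per location.

    minute_flags: iterable of (location, had_movement) pairs, one per minute,
    mirroring the movement/no movement summary produced in SQL.
    """
    active = defaultdict(int)  # minutes with movement
    total = defaultdict(int)   # all observed minutes
    for location, had_movement in minute_flags:
        total[location] += 1
        if had_movement:
            active[location] += 1
    return {loc: 100.0 * active[loc] / total[loc] for loc in total}


# Hypothetical sample: four minutes at a desk, two in a meeting room.
sample = [
    ("Desk 1", True), ("Desk 1", False), ("Desk 1", False), ("Desk 1", True),
    ("Room A", True), ("Room A", True),
]
result = utilization_by_location(sample)
```

In the project itself this summary was done in SQL before the data reached Power BI; the Python version only shows the arithmetic.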

The report shows carbon dioxide. It is dynamic, with min/avg/max values.

How to succeed with big data in Power BI: storage mode matters; optimize your model; new features (aggregations, incremental refresh); back to source; keep report pages and visuals simple; Performance analyzer. Notes: Jumping to the conclusion, these are my tips. I will go through them individually in more detail, but very quickly: a one-liner explaining each one.

Choosing a storage mode: DirectQuery vs Import. DirectQuery: the data source (Azure SQL DB) is queried each time you access or interact with the report; slow loading of visuals; real time; you lose functionality in the query editor and DAX; no refresh issues for big data. Import mode: a copy of the data is imported from the source and cached in memory; fast loading of visuals; not real time; full functionality of the query editor and DAX; refresh times are a problem for big data. Notes: Start with Import mode (the default), which imports a copy of the data from the source and queries it in memory. Explain the pros and cons and why it worked for me. Reference the DQ model that was too slow and lost lots of DAX functionality. Import mode is your friend (unless you use aggregations).

How to succeed with big data in Power BI. Notes: The most important bit! The model is the foundation of your report.

Original model. Notes: Explain my model: a date and time table, along with a sensor dimension, slicing a fact table. I built measures for min, max and average based on the average value (this is not great, as it takes an average of averages). This was a good start and follows some best practices, but the problem was granularity (too many rows).
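
Why an average of averages is "not great" can be shown with a small worked example. The rows below are invented for illustration: each holds a pre-aggregated average and the number of raw readings behind it.

```python
from statistics import mean

# Hypothetical pre-aggregated rows: (average value, number of raw readings).
rows = [(20.0, 1), (30.0, 9)]


def average_of_averages(rows):
    """Treats every pre-aggregated row as equally important."""
    return mean(avg for avg, _ in rows)


def weighted_average(rows):
    """Recovers the true average by weighting each row by its reading count."""
    total_readings = sum(count for _, count in rows)
    return sum(avg * count for avg, count in rows) / total_readings


naive = average_of_averages(rows)  # (20 + 30) / 2
true_avg = weighted_average(rows)  # (20*1 + 30*9) / 10
```

The two results differ whenever the rows summarize different numbers of readings, so storing a sum and a count (rather than only the average) keeps the measure correct at any level of aggregation.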

Microsoft guidance on optimizing the model follows general modelling advice (https://docs.microsoft.com/en-us/power-bi/power-bi-reports-performance): use a star schema; keep the fact table to few columns; reduce the cardinality of the data; only bring in the data that you need; keep queries in the query editor simple; avoid DAX that requires evaluation of every row, e.g. RANKX. Notes: Start with this link for good guidance on optimizing the model. Summarising the main points: I followed most of these but, as I say, the customer wanted granular data, so I needed some better tools.
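
One common way to reduce cardinality, shown here as an illustration rather than as the approach taken in this project, is to split a combined timestamp column into separate date and time columns. The sketch counts distinct values for one week of per-minute timestamps; the start date is arbitrary.

```python
from datetime import datetime, timedelta


def cardinalities(timestamps):
    """Return distinct counts for (combined timestamp, date part, time part)."""
    combined = len(set(timestamps))
    dates = len({ts.date() for ts in timestamps})
    times = len({ts.time() for ts in timestamps})
    return combined, dates, times


# One week of per-minute readings (hypothetical start date).
start = datetime(2019, 1, 1)
minute_stamps = [start + timedelta(minutes=i) for i in range(7 * 24 * 60)]

combined, dates, times = cardinalities(minute_stamps)
```

The combined column has one distinct value per minute of the week, while the split columns have only as many distinct values as there are days and minutes in a day, which is what a columnar engine compresses well.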

How to succeed with big data in Power BI. Notes: Luckily, PBI came with improvements!

DirectQuery / Import mode evolution: DirectQuery / Import mode, then incremental refresh (May 2018, preview), composite models (Jul 2018, preview) and aggregations (Sep 2018, preview). Notes: First you had to make a simple choice between DQ and Import mode. Then came incremental refresh, composite models and aggregations, so I started to look into these. Aggregations came out of preview in July; incremental refresh is still in preview.

Incremental refresh: only refresh the last x days/weeks/months/years of data; keep this data for x days/weeks/months/years; Power BI Premium feature. Notes: What is it? It refreshes the latest data, not the history. It is a Premium feature. Setup (briefly): create DateTime parameters, use these to filter the fact table you want to refresh incrementally, then follow the setup dialogue. Run the demo in PBI! Problems: the first refresh can be problematic in the Service, when the parameters are removed and it times out; you can't re-download the report once it is published; the Dates table does not update when it is based on the min and max date of the fact table.
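
The "refresh the last x, keep for y" policy can be sketched as a classification of daily partitions. This is a simplified illustration, not how the Power BI Service manages partitions internally; the function name, the daily grain and the inclusive window boundaries are assumptions.

```python
from datetime import date, timedelta


def plan_partitions(today, stored_days, refresh_days, keep_days):
    """Classify stored daily partitions as refresh, keep (untouched) or drop.

    refresh_days: reload the last N days, counting today.
    keep_days: retain the last M days in total, counting today.
    """
    refresh_start = today - timedelta(days=refresh_days - 1)
    keep_start = today - timedelta(days=keep_days - 1)
    plan = {"refresh": [], "keep": [], "drop": []}
    for day in sorted(stored_days):
        if day >= refresh_start:
            plan["refresh"].append(day)  # recent data is reloaded
        elif day >= keep_start:
            plan["keep"].append(day)     # history stays as it is
        else:
            plan["drop"].append(day)     # outside the retention window
    return plan


# Hypothetical example: ten stored days, refresh 2 days, keep 7 days.
today = date(2019, 6, 10)
stored = [date(2019, 6, d) for d in range(1, 11)]
plan = plan_partitions(today, stored, refresh_days=2, keep_days=7)
```

Only the partitions in the refresh group are queried from the source on each run, which is why refresh times stay manageable as the history grows.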

Incremental refresh: dates issue. DateRanges gives the min and max dates, which then drive the filters on the Date table. This works in PBI Desktop, but not in the PBI Service. Answer: do the filtering in SQL and push it back to the source!

Composite models: instead of choosing DirectQuery or Import mode, you can now have both in the same model. Import tables are updated when the refresh runs. DirectQuery tables send a query back to the data source when the user navigates the report. A Dual table uses the last imported copy when queried via the import fact table, but switches to DQ when queried via the DQ fact table. Storage modes shown: Dual, Import, DirectQuery. Notes: What is it? How did I use it (real time for the last 24 hours)? Show the next slide. You can't get the model when using streaming in PBI.
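
The Dual behaviour described above can be reduced to a small rule of thumb. The sketch is deliberately simplified and assumes a query is answered from the source as soon as any DirectQuery table is involved; the real engine has more cases, and the function and mode names are hypothetical.

```python
VALID_MODES = {"import", "directquery", "dual"}


def answered_from(table_modes):
    """Return 'source' or 'cache' for a query touching tables in these modes.

    A dual table follows its partner: it behaves like import next to import
    tables and like DirectQuery next to DirectQuery tables.
    """
    modes = set(table_modes)
    unknown = modes - VALID_MODES
    if unknown:
        raise ValueError(f"unknown storage mode(s): {sorted(unknown)}")
    return "source" if "directquery" in modes else "cache"


# Date dimension (dual) joined to the imported fact table.
history_query = answered_from(["dual", "import"])
# Same dimension joined to the DirectQuery fact table (e.g. last 24 hours).
realtime_query = answered_from(["dual", "directquery"])
```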

Problem with DQ: slicing and dicing will use a cache of the data.

Aggregations: requires a composite model to work. Create a DQ model at the base (import coming soon), supported by imported tables at a higher level of aggregation. Note: I didn't actually need to split Time into three separate tables. Storage modes shown: DirectQuery, Import, Dual. Notes: What is it? (You don't need to decide the granularity in advance; you can have several versions.) You have to have DQ at the base (an update is coming soon so that you can just use import). Demo (use paper!): show the 3 tables, their contents and row counts; show AGGSDEMO (with 1 measure at its heart); show how to set it up on the daily table. Final point: I will show later how to detect which table has been hit.
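
The idea behind aggregations, answering a query from the coarsest table that still contains every column the query groups by, can be sketched as follows. The table and column names are hypothetical, and the real Power BI engine decides this from the aggregation mappings configured in the model rather than from a lookup like this one.

```python
# Hypothetical tables, ordered from coarsest to finest grain.
# The finest one stands in for the DirectQuery base table.
AGG_TABLES = [
    ("fact_daily", {"date", "sensor"}),
    ("fact_hourly", {"date", "hour", "sensor"}),
    ("fact_minute", {"date", "hour", "minute", "sensor"}),
]


def pick_table(group_by):
    """Return the first (coarsest) table that has every requested column."""
    needed = set(group_by)
    for name, columns in AGG_TABLES:
        if needed <= columns:
            return name
    raise ValueError(f"no table can answer a query on {sorted(needed)}")


daily_hit = pick_table(["date"])
hourly_hit = pick_table(["date", "hour"])
base_hit = pick_table(["minute", "sensor"])
```

A daily summary visual would hit the small imported table, while a drill-down to minute level falls through to the base table, so the report author does not have to pick one granularity in advance.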


Back to source: push ETL back to the source system. Cumbersome transformations on large data can be painful on a PC, so push them back to SQL Server. Avoid the incremental refresh date issue. Avoid complex DAX (precalculate in the query editor if you don't need dynamic calculations).


Keep visualisations simple. Notes: Explain the graphs. Box and whisker is too complex: too many points, calculated on the fly. SLOWVIZDEMO: show how badly they perform (use paper!). Performance analyzer and its component parts. Introduce DAX Studio: connect to the model and look at the server timings for key figures. Run the daily summary. Explain the two engines. Aggs rewrite. Repeat for quarter and last hour. Define measures.

How to succeed with big data in Power BI. Notes: Summary!

adam.marshall@evry.com adam_a_marshall