Understanding the Azure Data Stack Matan Yungman, CTO, Madeira
Where Did We Come From?
Things Slowly Started to Change: Volume, Variety, Velocity
Things Slowly Started to Change: New Data Models (Key-Value, Document, Graph, Column-Oriented, …), not conforming to old rules
Things Slowly Started to Change: Distributed Computing, Cloud Computing
About: MadeiraData.com, SQLServerRadio.co.il, SQLServerRadio.com
How to Start: Technologies used in your company, Architecture view, Just do it
Cloud Models
On Premises: you manage Applications, Data, Runtime, Middleware, O/S, Virtualization, Servers, Storage, Networking.
Infrastructure (as a Service): you manage Applications, Data, Runtime, Middleware, O/S. Managed by Microsoft: Virtualization, Servers, Storage, Networking.
Platform (as a Service): you manage Applications, Data. Managed by Microsoft: Runtime, Middleware, O/S, Virtualization, Servers, Storage, Networking.
Software (as a Service): managed by Microsoft: Applications, Data, Runtime, Middleware, O/S, Virtualization, Servers, Storage, Networking.
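The split of responsibilities across the four cloud models can be sketched as a small lookup. This is only an illustration: the split points follow the commonly drawn version of this diagram, and the names `LAYERS`, `CUSTOMER_MANAGED` and `responsibilities` are made up for the example.

```python
# Layers of the stack, ordered from the application down to the hardware.
LAYERS = ["Applications", "Data", "Runtime", "Middleware", "O/S",
          "Virtualization", "Servers", "Storage", "Networking"]

# How many layers, counted from the top, the customer manages in each model.
# Assumption: these split points follow the commonly drawn diagram.
CUSTOMER_MANAGED = {
    "On Premises": 9,
    "IaaS": 5,
    "PaaS": 2,
    "SaaS": 0,
}


def responsibilities(model):
    """Return (you_manage, provider_manages) for a cloud model."""
    split = CUSTOMER_MANAGED[model]
    return LAYERS[:split], LAYERS[split:]


for model in CUSTOMER_MANAGED:
    you, provider = responsibilities(model)
    print(f"{model}: you manage {len(you)} layers, "
          f"the provider manages {len(provider)}")
```

Moving from left to right in the diagram only moves the split point; the list of layers stays the same.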
IaaS - Azure VMs: Manage through the Management Portal, scripting (Windows, Linux and Mac), or the REST API. Boot VM from new disk. [Diagram: client devices, from point-of-service devices, kiosks, smart phones, tablets and PCs to ATMs and vending machines, connecting to a server in the cloud]
Azure VMs - Databases: Structured: SQL, Oracle, DB2. Unstructured/NoSQL: Hadoop, Cloudera, MongoDB, Couch, etc. [Diagram: client devices connecting to database servers]
Azure SQL Database: A relational database-as-a-service, fully managed by Microsoft. For cloud-designed apps when near-zero administration and enterprise-grade capabilities are key. [Comparison table: SQL Server in a VM vs. Azure SQL Database, by Best for…, TCO benefits, Scalability, Resources]
Azure SQL Database
SLA: Increased from 99.9% to 99.99% uptime SLA.
Performance: New service design point enables scale-up of resources, delivering predictable throughput & performance.
Protection: Point-in-time restore, geo-restore, and standard and active geo-replication protect against human- and environment-initiated events.
Compliance: Azure certifications (ISO, HIPAA BAA, EU Model Clause); auditing on SQL Database.
Flexibility: Hourly billing & broad set of price points.
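To make the move from 99.9% to 99.99% concrete, the arithmetic below converts an uptime percentage into the downtime it allows. It is plain arithmetic, not a statement of how the SLA is measured or credited; the function name `allowed_downtime_minutes` and the 30-day month are assumptions for the example.

```python
def allowed_downtime_minutes(sla_percent, period_days):
    """Maximum downtime, in minutes, that still meets the SLA over the period."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100 - sla_percent) / 100


for sla in (99.9, 99.99):
    per_month = allowed_downtime_minutes(sla, 30)
    per_year = allowed_downtime_minutes(sla, 365)
    print(f"{sla}%: {per_month:.1f} min per 30 days, {per_year:.1f} min per year")
```

One extra nine cuts the allowed downtime by a factor of ten: from about 43 minutes to about 4 minutes per 30 days.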
Store data in columnar format for massive compression. Load data into or out of memory for next-generation performance, with up to 60% improvement in data loading speed. Updateable and clustered for real-time trickle loading. Updateable clustered columnstore vs. table with customary indexing: up to 100x faster queries, up to 15x more compression. [Diagram: columnstore index representation, parallel query execution, query results]
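Why a columnar layout compresses well can be shown with a toy example: once values of the same column sit next to each other, repeated values collapse into short runs. The sketch below uses run-length encoding as one common columnar technique; it is not a description of SQL Server's internal storage format, and the sample rows are invented.

```python
from itertools import groupby

# Hypothetical sample rows: (order_id, country, status).
rows = [
    (1, "IL", "shipped"),
    (2, "IL", "shipped"),
    (3, "IL", "pending"),
    (4, "US", "pending"),
    (5, "US", "pending"),
    (6, "US", "shipped"),
]


def to_columns(rows):
    """Pivot row tuples into one list per column."""
    return [list(col) for col in zip(*rows)]


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


order_ids, countries, statuses = to_columns(rows)
encoded_countries = run_length_encode(countries)
encoded_statuses = run_length_encode(statuses)

# A query that touches one column reads only that column's runs.
us_rows = sum(count for value, count in encoded_countries if value == "US")

print(encoded_countries)
print(encoded_statuses)
print(us_rows)
```

In a row layout the same `country` values are separated by the other fields of each row, so there are no runs to collapse.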
Connecting islands of data with PolyBase: Bringing Hadoop point solutions and the data warehouse together for users and IT. Provides a single T-SQL query model for PDW and Hadoop with rich features of T-SQL, including joins without ETL. Uses the power of MPP to enhance query execution performance. Supports Windows Azure HDInsight to enable new hybrid cloud scenarios. Provides the ability to query non-Microsoft Hadoop distributions, such as Hortonworks and Cloudera. [Diagram: a Select… query sent to SQL Server Parallel Data Warehouse; PolyBase reaches Hadoop (Microsoft Azure HDInsight, Hortonworks for Windows and Linux, Cloudera) and returns a result set]
Introducing SQL Data Warehouse: Fully managed relational data warehouse-as-a-service. The first elastic cloud data warehouse with enterprise-grade capabilities. Support your smallest to largest data sets.
Elastic Scale: Spin up for heavy workloads, cycle down for daily activity. Buy time to insight based on what you need, when you need it. Choose the combo of compute and storage that meets your needs.
Sample - Portal UX
Pause: Data remains in place, with no reloading / restoring of data. When paused, cost is minimal: cloud-scale storage only. Automate via PowerShell/REST API.
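The effect of pausing on the bill can be sketched with simple arithmetic. All prices below are made-up placeholders, not Azure list prices, and the model assumes only what the slide states: compute is billed while running, storage is billed all the time.

```python
# Placeholder prices for illustration only; these are NOT real Azure prices.
COMPUTE_PER_HOUR = 10.0  # hypothetical compute charge while running
STORAGE_PER_HOUR = 0.5   # hypothetical storage charge, running or paused

HOURS_IN_MONTH = 30 * 24


def monthly_cost(running_hours, total_hours=HOURS_IN_MONTH):
    """Cost of a month in which compute is billed only while running."""
    return running_hours * COMPUTE_PER_HOUR + total_hours * STORAGE_PER_HOUR


always_on = monthly_cost(running_hours=HOURS_IN_MONTH)
business_hours = monthly_cost(running_hours=22 * 8)  # 22 working days, 8 h each

print(f"always on:      {always_on:.2f}")
print(f"business hours: {business_hours:.2f}")
```

With these placeholder numbers, pausing outside working hours leaves the storage charge unchanged and removes most of the compute charge.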
SQL Server Compatibility: Mature enterprise-ready SQL for sophisticated DW scenarios. Existing SQL Server scripts and tools just work. Continuous enhancements on language surface. Mature column-store technology for best-in-class DW query performance. Benefits listed: modular programming (write once, execute multiple times); faster code execution; encapsulated programming logic; easier maintenance of large tables; improved performance; enhanced scalability and availability; proper use and comparison of characters in different languages.
Document Database
What is a document database? Ideally suited to this kind of document:
{
  "id": "13244_user",
  "firstName": "John",
  "lastName": "Smith",
  "age": 25,
  "employmentHistory": [
    { "company": "Contoso Inc",
      "start": {"date": "Thu, 02 Apr :54:45 GMT", "epoch": },
      "position": "CEO" },
    { "start": {"date": "Thu, 02 Apr :54:45 GMT", "epoch": },
      "end": {"date": "Thu, 01 Apr :54:45 GMT", "epoch": },
      "position": "GM" }
  ],
  "address": {
    "streetAddress": "21 2nd Str",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021"
  },
  "children": [
    {"name": "Megan", "age": 10},
    {"name": "Bruce", "age": 7},
    {"name": "Angus", "sports": ["football", "basketball", "hockey"]}
  ],
  "mobileNumber": " "
}
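Reading such a document from application code could look like the sketch below. It uses a trimmed copy of the document, keeping only fields whose values are fully shown on the slide, and plain Python with the standard `json` module rather than any database client.

```python
import json

# Trimmed copy of the slide's document: only fields with complete values.
raw = """
{
  "id": "13244_user",
  "firstName": "John",
  "lastName": "Smith",
  "age": 25,
  "address": {"streetAddress": "21 2nd Str", "city": "New York",
              "state": "NY", "postalCode": "10021"},
  "children": [
    {"name": "Megan", "age": 10},
    {"name": "Bruce", "age": 7},
    {"name": "Angus", "sports": ["football", "basketball", "hockey"]}
  ]
}
"""

person = json.loads(raw)

# Nested objects are reached by walking the keys; no join is involved.
city = person["address"]["city"]

# The children do not share one shape: Angus has no "age",
# Megan and Bruce have no "sports". .get() tolerates the missing keys.
ages = {child["name"]: child.get("age") for child in person["children"]}
sports = [s for child in person["children"] for s in child.get("sports", [])]

print(city)
print(ages)
print(sports)
```

The whole person, including address and children, travels as one unit, which is the property the slide points at.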
Come as you are Data normalization ORM
What is a document database? Part of the NoSQL family of databases. Built for simplicity, scale and performance. Non-relational, no schema enforced. Flexible query options.
Microsoft Azure Data Services: A fully managed, scalable, queryable, schema-free JSON document database service for modern applications. Transactional processing, rich query, managed as a service, elastic scale, internet accessible (HTTP/REST), schema-free data model, arbitrary data formats.
Azure DocumentDB: Fully managed, highly scalable NoSQL document database service. Query over schema-free JSON. Multi-document transactions. Tunable, high performance. Fully managed and designed for massive scale. [Icons: JS, { }, SQL]
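What "query over schema-free JSON" means can be illustrated with an in-memory filter over documents of different shapes. This is a concept sketch in plain Python, not the DocumentDB client API or its query language; `get_path`, `query` and the sample documents are hypothetical.

```python
_MISSING = object()


def get_path(doc, path, default=None):
    """Follow a dotted path such as 'address.city' through nested dicts."""
    current = doc
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


def query(docs, path, predicate):
    """Return documents that have a value at `path` satisfying `predicate`."""
    results = []
    for doc in docs:
        value = get_path(doc, path, _MISSING)
        if value is not _MISSING and predicate(value):
            results.append(doc)
    return results


# Hypothetical documents with different shapes in one collection.
docs = [
    {"id": "1", "firstName": "John", "address": {"city": "New York"}},
    {"id": "2", "firstName": "Dana", "address": {"city": "Tel Aviv"}, "age": 31},
    {"id": "3", "name": "Contoso Inc", "employees": 120},
]

new_yorkers = query(docs, "address.city", lambda city: city == "New York")
adults = query(docs, "age", lambda age: age >= 18)

print([d["id"] for d in new_yorkers])
print([d["id"] for d in adults])
```

Documents that lack the queried property are simply skipped, so no shared schema has to be declared up front.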
Azure Stream Analytics: Process real-time data in Azure. Consumes millions of real-time events from Event Hub, collected from devices, sensors, infrastructure, and applications. Performs time-sensitive analysis using a SQL-like language against multiple real-time streams and reference data. Outputs to persistent stores, dashboards, or back to devices. [Diagram: client devices, from point-of-service devices and kiosks to sensors, ATMs and vending machines, sending events]
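A typical time-sensitive analysis is an aggregate over fixed, non-overlapping time windows (often called tumbling windows). The sketch below shows that idea in plain Python over an in-memory list; it is not Stream Analytics' SQL-like language, and the function name and sensor readings are invented for the example.

```python
from collections import defaultdict


def tumbling_window_avg(events, window_seconds):
    """Average value per device for each fixed, non-overlapping window.

    events: iterable of (timestamp_seconds, device_id, value).
    Returns {(window_start, device_id): average}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, device, value in events:
        # Every event falls into exactly one window.
        window_start = (ts // window_seconds) * window_seconds
        key = (window_start, device)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}


# Hypothetical sensor readings: (seconds since start, device, temperature).
events = [
    (1, "sensor-a", 20.0),
    (4, "sensor-a", 22.0),
    (7, "sensor-b", 30.0),
    (11, "sensor-a", 26.0),
    (14, "sensor-b", 34.0),
    (19, "sensor-b", 36.0),
]

averages = tumbling_window_avg(events, window_seconds=10)
for (start, device), avg in sorted(averages.items()):
    print(f"[{start}, {start + 10}) {device}: {avg:.1f}")
```

A real streaming engine computes the same kind of result incrementally as events arrive, instead of over a finished list.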
Power BI Dashboards
Power BI Graphs
Power BI NLP
Power BI Integrations
Azure Data Factory: Orchestrate trusted information production in Azure. Connect to relational or non-relational data that is on-premises or in the cloud. Orchestrate data movement & data processing. Publish to Power BI users as a searchable data view. Operationalize (schedule, manage, debug) workflows. Lifecycle management, monitoring.
Azure Machine Learning
Not Covered but Worth Mentioning: Azure Tables, Azure Search, Azure Redis Cache, Cortana
How to Start: Technologies used in your company, Architecture view, Just do it
Please fill out the evaluation forms
Special thanks to our great sponsors!