Download presentation
Presentation is loading. Please wait.
1
Where Should My Data Live (and Why)?
Matt Gordon, Architect II, Insight Digital Innovation
2
Speaker Info Matt Gordon Architect II Matt.Gordon@insight.com
3
Evaluations Saturday feedback: https://forms.gle/rddzwGffw3qa8JZu5
Conference feedback:
4
About Me – What I’ve Done
15+ years of SQL Server experience Microsoft Data Platform MVP 2019 IDERA ACE MCSE: Data Management and Analytics Two-time PASS Summit speaker Leader of Lexington, KY (USA) PASS Local Group
5
About Me – Where I Live
6
About Me – Where I Live (Kentucky, USA)
7
How I Picked My Twitter Handle and Domain Name
8
Agenda Where Does Our Data Live Now?
Why Does Our Data Live Where It Does? Cloud, On-Premises, or Both? Case Studies Wrap-up
9
Definitions
10
Discussion Points Cloud is just somebody else’s computer in somebody else’s datacenter Rapid development from cloud providers constantly expands options Are you locked into deployment locations for certain platforms? Database engine always on-premises Hadoop always in cloud Blending of technologies and platforms may/may not be the right answer
11
Where Does Our Data Live Now?
12
Where Does Our Data Live Now?
How many of us do not have a single data environment in the cloud? How many of us have only test/dev/QA data environments in the cloud? How many of us have a “trial” production data environment in the cloud? How many of us have all production data environments in the cloud? How many of us have all (or nearly all) data environments in the cloud?
13
Why Does Our Data Live Where It Does?
14
Why Does Our Data Live Where It Does?
On-Premises Pros Cost Leveraging “investments” Can cost less if uptime is not critical Comfort level “I can go see it” Physical control and security Data accessible even when all external telecom is down Licensing
15
Why Does Our Data Live Where It Does?
On-Premises Cons Generally requires large up-front investment Requires corresponding infrastructure Rackspace, cooling, cabling, telecom, fire suppression, etc. May require backup datacenter Depends on uptime requirements On-site personnel often needed to maintain operations More expensive from a resource perspective
16
Why Does Our Data Live Where It Does?
Cloud Pros Cost Buy only what you need Scalability (vertical and horizontal) Global redundancy Storage durability Data availability from all locations PaaS often satisfactory to government security audits/approvals High availability and disaster recovery often built-in*
17
Why Does Our Data Live Where It Does?
Cloud Cons Can require robust Internet connectivity VPN cost can be significant Minimal to no control over underlying infrastructure “Noisy neighbors” Design apps to deal with connection hiccups more efficiently Perception of lighter security “Things happen by magic”
18
Cloud, On-Premises, or Both?
19
Cloud Deployment Options (Azure)
SQL Server (IaaS) Azure SQL DW Azure SQL DB Hadoop Mimics on-premises behavior but resources are on Azure Full control of configuration Full control of maintenance MPP cloud-based, scale-out, relational database Separates storage and compute Can pause compute capacity when not needed PaaS flavor of SQL Server database Very limited control of maintenance Limited control of configuration Microsoft’s flavor is known as HDInsight Used for semi-structured data Can connect from database engine using PolyBase Mention databricks for unstructured data on Azure and Amazon slides
20
Cloud Deployment Options (Amazon)
SQL Server on EC2 Amazon Redshift Amazon RDS Hadoop Mimics on-premises behavior but resources are on Amazon EC2 Full control of configuration Full control of maintenance Amazon equivalent of Azure SQL DW Fully managed Easily scalable Amazon PaaS offering Supports six database engines Minimal configuration control Amazon’s HDInsight equivalent is EMR Supports traditional Hadoop tooling Can connect from database engine using PolyBase Mention databricks for unstructured data on Azure and Amazon slides
21
Cloud Deployment Options (Google)
Google Compute Engine Google BigQuery Google Cloud SQL Hadoop SQL Server on Google Cloud Platform is IaaS offering Full control of configuration Full control of maintenance Multiple versions and editions supported Currently requires large amounts of data for efficient performance PaaS flavor of database engines Supports MySQL and PostgreSQL (beta) Fully-managed Google’s fully-managed flavor is known as Google Cloud Dataproc Used for semi-structured data Can connect from database engine using PolyBase Google BigQuery
22
On-Premises Deployment Options
SQL Server Microsoft APS PaaS Database Hadoop Traditional deployment of the database engine Full control of configuration Full control of maintenance MPP appliance Evolution of Parallel Data Warehouse Architecture of Azure SQL DW based on this design No on-premises equivalent of Azure SQL Database Microsoft’s flavor is known as HDInsight Many other non-Microsoft deployment options Can connect from database engine using PolyBase
23
Pause for One More PaaS Option…
Azure SQL Database Managed Instance New deployment model of Azure SQL Database Provides near 100% compatibility with SQL Server on-premises Enterprise Edition Preserves PaaS capabilities Automatic patching & version updates Automated backups High availability Using DMS (Data Migration Service) is a popular lift & shift path Not in west central – still regional specific
24
Hybrid Deployment Options/Scenarios
On-Premises App Servers & Azure SQL DB Availability Groups with Azure Replica(s) Easy to create and destroy databases as needed for development and deployment Removes management responsibility from devs Good choice if DBA team short on resources Uses Azure as backup datacenter(s) Requires robust network infrastructure Good for minimum datacenter proximity requirements
25
Hybrid Deployment Options/Scenarios
Replication to Azure IaaS VM Replication to Azure SQL Database Tried and true technology in use Identical to doing this on-premises other than network portion Good way to ease into comfort with the cloud Azure SQL Database can be a replication subscriber Eases DBA team into cloud and PaaS interactions Straightforward setup
26
Setting up replication to Azure SQL Database
Demo Setting up replication to Azure SQL Database
27
Hybrid Deployment Options/Scenarios
Log Shipping to Azure IaaS VM PolyBase to Azure Blob Storage Popular with customers who want a copy of data stored completely off-site Straightforward setup Expands environment without requiring cluster or other complicated infrastructure Great for querying large quantities of semi-structured data Good way to introduce team to PolyBase Subject of our first case study
28
Case Studies
29
Transportation Planning Agency
Statistical models generating TBs of output every year Storage costs spiraling upward and difficult to manage Output stored in relational database tables requiring constant maintenance Output generated as text files which were fed into the relational tables Loaded output files into Azure Blob Storage (cold) Query performance increased Storage costs decreased by 96% ($2k per year vs. $75k per year)
30
Querying statistical model outputs stored in Azure Blob Storage
Demo Querying statistical model outputs stored in Azure Blob Storage
31
Geospatial Research Center
Hosted Hadoop cluster Hosted HDFS storage storing Excel, CSV, XML, JSON, etc. SQL Server installed on Azure VMs Database engine, DQS, MDS, and SSAS in use PolyBase used to query semi-structured data from main SQL Server databases Data consumers presented with common interface to access heterogeneous data
32
Wrap-up
33
Discussion Points Cloud is just somebody else’s computer in somebody else’s datacenter Rapid development from cloud providers constantly expands options Are you locked into deployment locations for certain platforms? Database engine always on-premises Hadoop always in cloud Blending of technologies and platforms may/may not be the right answer
34
Recommendations Set expectations what cloud technologies are and what they can do Management Team HA/DR isn’t done by magic – it’s just different Stay abreast of new technologies Research Training Azure Stack Embrace it all! HPE, Dell, Lenovo, etc for Azure Stack partners
35
Thank You For Attending!
Matt Gordon Architect II
Similar presentations
© 2025 SlidePlayer.com. Inc.
All rights reserved.