Where Should My Data Live (and Why)?


1 Where Should My Data Live (and Why)?
Matt Gordon, Architect II, Insight Digital Innovation

2 Speaker Info
Matt Gordon, Architect II
Matt.Gordon@insight.com

3 Evaluations
Saturday feedback: https://forms.gle/rddzwGffw3qa8JZu5
Conference feedback:

4 About Me – What I’ve Done
15+ years of SQL Server experience
Microsoft Data Platform MVP
2019 IDERA ACE
MCSE: Data Management and Analytics
Two-time PASS Summit speaker
Leader of Lexington, KY (USA) PASS Local Group

5 About Me – Where I Live

6 About Me – Where I Live (Kentucky, USA)

7 How I Picked My Twitter Handle and Domain Name

8 Agenda
Where Does Our Data Live Now?
Why Does Our Data Live Where It Does?
Cloud, On-Premises, or Both?
Case Studies
Wrap-up

9 Definitions

10 Discussion Points
Cloud is just somebody else’s computer in somebody else’s datacenter
Rapid development from cloud providers constantly expands options
Are you locked into deployment locations for certain platforms?
  Database engine always on-premises
  Hadoop always in cloud
Blending of technologies and platforms may/may not be the right answer

11 Where Does Our Data Live Now?

12 Where Does Our Data Live Now?
How many of us do not have a single data environment in the cloud?
How many of us have only test/dev/QA data environments in the cloud?
How many of us have a “trial” production data environment in the cloud?
How many of us have all production data environments in the cloud?
How many of us have all (or nearly all) data environments in the cloud?

13 Why Does Our Data Live Where It Does?

14 Why Does Our Data Live Where It Does?
On-Premises Pros
Cost
  Leveraging “investments”
  Can cost less if uptime is not critical
Comfort level
  “I can go see it”
Physical control and security
Data accessible even when all external telecom is down
Licensing

15 Why Does Our Data Live Where It Does?
On-Premises Cons
Generally requires large up-front investment
Requires corresponding infrastructure
  Rack space, cooling, cabling, telecom, fire suppression, etc.
May require backup datacenter
  Depends on uptime requirements
On-site personnel often needed to maintain operations
  More expensive from a resource perspective
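
Whether a backup datacenter is needed "depends on uptime requirements", so it can help to translate an availability target into a yearly downtime budget. The following is a minimal sketch of that arithmetic; the function name and the sample targets are illustrative assumptions, not figures from the talk.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(availability_percent: float) -> float:
    """Return the yearly downtime budget implied by an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return HOURS_PER_YEAR * (1 - availability_percent / 100)


# Sample targets chosen for illustration only.
for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% uptime allows about {hours:.2f} hours of downtime per year")
```

A target of 99.9% leaves under nine hours of downtime per year, which is the kind of number that tends to decide whether a single site is enough.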

16 Why Does Our Data Live Where It Does?
Cloud Pros
Cost
  Buy only what you need
Scalability (vertical and horizontal)
Global redundancy
Storage durability
Data availability from all locations
PaaS often satisfactory to government security audits/approvals
High availability and disaster recovery often built-in*

17 Why Does Our Data Live Where It Does?
Cloud Cons
Can require robust Internet connectivity
VPN cost can be significant
Minimal to no control over underlying infrastructure
“Noisy neighbors”
Design apps to deal with connection hiccups more efficiently
Perception of lighter security
“Things happen by magic”
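
One common way to "design apps to deal with connection hiccups" is to retry transient failures with exponential backoff. The sketch below is only an illustration of that pattern and is not taken from the talk: `TransientConnectionError`, `run_with_retry`, and `flaky_query` are hypothetical names, and a real application would catch whichever exception its database driver raises for dropped connections.

```python
import random
import time


class TransientConnectionError(Exception):
    """Hypothetical stand-in for a driver's 'connection dropped' error."""


def run_with_retry(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(); on a transient failure, wait and try again.

    The wait doubles on each attempt (exponential backoff) and includes a
    little random jitter so many clients do not retry in lockstep.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 10))


# Usage: a fake query that fails twice before succeeding.
attempts_seen = []


def flaky_query():
    attempts_seen.append(1)
    if len(attempts_seen) < 3:
        raise TransientConnectionError("connection reset")
    return "42 rows"


# A no-op sleep keeps the demonstration instant.
print(run_with_retry(flaky_query, sleep=lambda seconds: None))
```

Passing `sleep` as a parameter is a design choice that keeps the helper easy to test; production code would leave it at the default.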

18 Cloud, On-Premises, or Both?

19 Cloud Deployment Options (Azure)
SQL Server (IaaS)
  Mimics on-premises behavior but resources are on Azure
  Full control of configuration
  Full control of maintenance
Azure SQL DW
  MPP cloud-based, scale-out, relational database
  Separates storage and compute
  Can pause compute capacity when not needed
Azure SQL DB
  PaaS flavor of SQL Server database
  Very limited control of maintenance
  Limited control of configuration
Hadoop
  Microsoft’s flavor is known as HDInsight
  Used for semi-structured data
  Can connect from database engine using PolyBase
Note: mention Databricks for unstructured data on Azure and Amazon slides
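
Separating storage from compute matters mostly for cost: storage is billed continuously, while compute is billed only while it runs. The sketch below illustrates that idea with a toy cost model. The rates and workload sizes are made-up assumptions for illustration and are not real Azure prices, and `monthly_cost` is a hypothetical helper.

```python
def monthly_cost(storage_tb, compute_hours, storage_rate_per_tb, compute_rate_per_hour):
    """Storage is billed for the whole month; compute only for the hours it runs."""
    return storage_tb * storage_rate_per_tb + compute_hours * compute_rate_per_hour


# Made-up rates for illustration only; these are not real Azure prices.
STORAGE_RATE = 25.0  # currency units per TB per month (hypothetical)
COMPUTE_RATE = 1.5   # currency units per compute hour (hypothetical)

# Same 10 TB of data, two compute schedules.
always_on = monthly_cost(10, 30 * 24, STORAGE_RATE, COMPUTE_RATE)
business_hours = monthly_cost(10, 22 * 10, STORAGE_RATE, COMPUTE_RATE)

print(f"Compute always on:       {always_on:.2f}")
print(f"Paused outside workdays: {business_hours:.2f}")
```

Under these assumed numbers the storage charge is identical in both cases, and the entire difference comes from the hours the compute tier is paused.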

20 Cloud Deployment Options (Amazon)
SQL Server on EC2
  Mimics on-premises behavior but resources are on Amazon EC2
  Full control of configuration
  Full control of maintenance
Amazon Redshift
  Amazon equivalent of Azure SQL DW
  Fully managed
  Easily scalable
Amazon RDS
  Amazon PaaS offering
  Supports six database engines
  Minimal configuration control
Hadoop
  Amazon’s HDInsight equivalent is EMR
  Supports traditional Hadoop tooling
  Can connect from database engine using PolyBase
Note: mention Databricks for unstructured data on Azure and Amazon slides

21 Cloud Deployment Options (Google)
Google Compute Engine
  SQL Server on Google Cloud Platform is IaaS offering
  Full control of configuration
  Full control of maintenance
  Multiple versions and editions supported
Google BigQuery
  Currently requires large amounts of data for efficient performance
Google Cloud SQL
  PaaS flavor of database engines
  Supports MySQL and PostgreSQL (beta)
  Fully-managed
Hadoop
  Google’s fully-managed flavor is known as Google Cloud Dataproc
  Used for semi-structured data
  Can connect from database engine using PolyBase

22 On-Premises Deployment Options
SQL Server
  Traditional deployment of the database engine
  Full control of configuration
  Full control of maintenance
Microsoft APS
  MPP appliance
  Evolution of Parallel Data Warehouse
  Architecture of Azure SQL DW based on this design
PaaS Database
  No on-premises equivalent of Azure SQL Database
Hadoop
  Microsoft’s flavor is known as HDInsight
  Many other non-Microsoft deployment options
  Can connect from database engine using PolyBase

23 Pause for One More PaaS Option…
Azure SQL Database Managed Instance
New deployment model of Azure SQL Database
Provides near 100% compatibility with SQL Server on-premises Enterprise Edition
Preserves PaaS capabilities
  Automatic patching & version updates
  Automated backups
  High availability
Using DMS (Database Migration Service) is a popular lift & shift path
Not in West Central; availability is still region-specific

24 Hybrid Deployment Options/Scenarios
On-Premises App Servers & Azure SQL DB
  Easy to create and destroy databases as needed for development and deployment
  Removes management responsibility from devs
  Good choice if DBA team short on resources
Availability Groups with Azure Replica(s)
  Uses Azure as backup datacenter(s)
  Requires robust network infrastructure
  Good for minimum datacenter proximity requirements

25 Hybrid Deployment Options/Scenarios
Replication to Azure IaaS VM
  Tried and true technology in use
  Identical to doing this on-premises other than network portion
  Good way to ease into comfort with the cloud
Replication to Azure SQL Database
  Azure SQL Database can be a replication subscriber
  Eases DBA team into cloud and PaaS interactions
  Straightforward setup

26 Setting up replication to Azure SQL Database
Demo: Setting up replication to Azure SQL Database

27 Hybrid Deployment Options/Scenarios
Log Shipping to Azure IaaS VM
  Popular with customers who want a copy of data stored completely off-site
  Straightforward setup
  Expands environment without requiring cluster or other complicated infrastructure
PolyBase to Azure Blob Storage
  Great for querying large quantities of semi-structured data
  Good way to introduce team to PolyBase
  Subject of our first case study

28 Case Studies

29 Transportation Planning Agency
Statistical models generating TBs of output every year
Storage costs spiraling upward and difficult to manage
Output stored in relational database tables requiring constant maintenance
Output generated as text files which were fed into the relational tables
Loaded output files into Azure Blob Storage (cold)
Query performance increased
Storage costs decreased by 96% ($2k per year vs. $75k per year)
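
The savings figure can be sanity-checked from the two yearly costs quoted on the slide. This is a small sketch of that arithmetic, with `percent_reduction` as a hypothetical helper name. The dollar amounts on the slide are rounded, so the result need not match the quoted percentage exactly.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is lower than `before`."""
    if before <= 0:
        raise ValueError("the starting cost must be positive")
    return (before - after) / before * 100


# Yearly storage costs as quoted on the slide (rounded figures).
reduction = percent_reduction(before=75_000, after=2_000)
print(f"Storage cost reduction: {reduction:.1f}%")
```

With the rounded inputs this works out to roughly 97%, in line with the reduction of about 96% stated above; the small gap is presumably due to rounding of the dollar amounts.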

30 Querying statistical model outputs stored in Azure Blob Storage
Demo: Querying statistical model outputs stored in Azure Blob Storage

31 Geospatial Research Center
Hosted Hadoop cluster
Hosted HDFS storage storing Excel, CSV, XML, JSON, etc.
SQL Server installed on Azure VMs
Database engine, DQS, MDS, and SSAS in use
PolyBase used to query semi-structured data from main SQL Server databases
Data consumers presented with common interface to access heterogeneous data
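
The idea of a "common interface to access heterogeneous data" can be illustrated outside SQL Server as well. The sketch below is not PolyBase and is not from the talk; it is a plain Python analogy in which CSV, JSON, and XML inputs are all exposed through one hypothetical `read_rows` function that always returns a list of dictionaries. The function names and the flat XML layout are assumptions made for the example.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET


def rows_from_csv(text):
    """Parse CSV text with a header row into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def rows_from_json(text):
    """Parse a JSON array of objects into a list of dicts."""
    return [dict(item) for item in json.loads(text)]


def rows_from_xml(text):
    """Parse XML into a list of dicts.

    Assumes a flat layout: <rows><row><col>value</col>...</row>...</rows>
    """
    return [{child.tag: child.text for child in row} for row in ET.fromstring(text)]


READERS = {"csv": rows_from_csv, "json": rows_from_json, "xml": rows_from_xml}


def read_rows(fmt, text):
    """Single entry point: callers name the format and always get list[dict]."""
    try:
        reader = READERS[fmt]
    except KeyError:
        raise ValueError(f"unsupported format: {fmt}") from None
    return reader(text)


# The same record in three formats comes back in one shape.
print(read_rows("csv", "site,count\nA,3\n"))
print(read_rows("json", '[{"site": "A", "count": "3"}]'))
print(read_rows("xml", "<rows><row><site>A</site><count>3</count></row></rows>"))
```

Consumers of `read_rows` never need to know which storage format sits behind the data, which is the same benefit the case study attributes to querying through the database engine.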

32 Wrap-up

33 Discussion Points
Cloud is just somebody else’s computer in somebody else’s datacenter
Rapid development from cloud providers constantly expands options
Are you locked into deployment locations for certain platforms?
  Database engine always on-premises
  Hadoop always in cloud
Blending of technologies and platforms may/may not be the right answer

34 Recommendations
Set expectations about what cloud technologies are and what they can do
  Management
  Team
HA/DR isn’t done by magic – it’s just different
Stay abreast of new technologies
  Research
  Training
  Azure Stack
Embrace it all!
Note: HPE, Dell, Lenovo, etc. are Azure Stack partners

35 Thank You For Attending!
Matt Gordon Architect II

