SQL Server 2019 and Big Data Clusters on Dell EMC Storage

Presentation transcript:

SQL Server 2019 and Big Data Clusters on Dell EMC Storage Doug Bernhardt Doug.Bernhardt@dell.com

Thank you Sponsors! Platinum Sponsor: Gold Sponsors: Global Alliance Partners: Visit the Sponsor Booths Lots of Great Raffle Prizes! Get your parking paid via Sponsor Bingo

PASSMN – News/Info Thanks to all our sponsors of 2019! We need Speakers & Sponsors for 2020 PASSMN Meetings! Sign up to present at one of the monthly meetings! Monthly Meetup: 3rd Tuesday of Each Month (except Oct) at Microsoft MTC in Edina (food usually provided) Signup on Meetup: https://www.meetup.com/MN-SQL-Server-User-Group-PASSMN/ Board Member Elections in November/December: Your chance to help out the MN SQL community!

November 3rd Through November 8th Join the brightest data professionals focused on the Microsoft Data Platform! Pre-Conference Sessions – Monday/Tuesday Conference – Wednesday through Friday

SQLSaturday #913 – After Party Location: 4th Floor of Mall of America Time: 6:30PM – 10PM There will be drinks and appetizers as well as free game cards and bowling! Hang out with some new friends you’ve made.

Intro Doug Bernhardt Senior Principal Engineer – Dell EMC Technical marketing Dell EMC since 2012 SQL Server since 1994 Architecture, Performance, Tuning, Storage Integration Not a typo in the title…..Big T, little m 

Agenda SQL Server 2019 BDC Overview Storage Changes in SQL 2019 BDC Storage Q&A This presentation came out of my journey and lessons learned deploying SQL 2019 Big Data Clusters as part of the Microsoft Early Adoption Program. Agenda: In the SQL Server 2019 BDC overview I’ll talk about the major components and what we need to know from a deployment standpoint. I won’t go in depth on all the various use cases or capabilities; that’s for other sessions. Deployment overview: what’s the big deal? Why an entire session on deployment? This is SQL Server, so you just run setup.exe, right? Deployment on VMware (or any hypervisor): why in the world would you essentially do virtualization on virtualization? We will walk through the steps and various tips of deploying your own environment on VMware, so my goal is that coming away from this session you feel confident that you could do this yourself…and if not, at least you’ll know who to ask.

SQL Server 2019 BDC Overview

SQL Server 2019 BDC Intro Connect all of your data Relational noSQL Hadoop Create intelligence from all your data SQL Apache Spark Single pane of glass management Azure Data Studio Scalable cluster Simplified deployment Same but different https://docs.microsoft.com/en-us/sql/big-data-cluster/big-data-cluster-overview The MS story with SQL Server 2019 is that they have created a platform to connect all of your data, whether that be relational data in SQL Server or other relational databases, noSQL sources, or data sources like Hadoop. They’re also enabling you to use tools like SQL that you probably already know and love, as well as other powerful analytics tools such as Apache Spark. You can do this in a single pane of glass with Azure Data Studio. Instead of building a Big Data environment and then figuring out how to scale it later, the ability to build and deploy a scalable cluster is built in from the beginning. This is what you will install by default. The deployment of a Big Data environment has been greatly simplified. In the past, you had to install your database instances, a Hadoop cluster, and Spark, and each of these had to be installed and configured separately. With SQL 2019 BDCs a simplified deployment has been created. It’s worth noting, in case you’re wondering, that this is a separate product of sorts from the traditional SQL Server on Windows, or even SQL Server on Linux. So while there is an instance of SQL Server deployed as part of the cluster, it’s installed with a completely different set of tools and it’s a completely different installation experience than SQL Server on Windows, or even on Linux.

Big data clusters deployed on Kubernetes, OpenShift, AKS [Diagram: Microsoft architecture diagram for SQL Server 2019 BDC. Labels: Control, Controller, Svc Proxy, Kibana, Grafana, Configuration Store (SQL Server), Elastic Search, InfluxDB; SQL Server master instance; Compute pool, SQL Compute Instance; Data pool, SQL Datapool Instance; Storage pool, Spark, SQL Server, HDFS Datanode, "Directly read from HDFS"; Application pool, Custom apps, Applications; External data sources, IoT data, BI, Analytics; Kubernetes pod, Node, Storage, Persistent storage.] Let’s take a quick peek under the hood. There’s a lot going on here, but some main observations: Where’s my Windows OS? What’s with this Kubernetes piece? HDFS? What about this persistent storage layer at the bottom? All great questions, and we’re going to get into that. The main takeaways here are the unfamiliar components and what we have to do from an infrastructure or DevOps standpoint to stand this thing up. We’ve got this Kubernetes piece that we need to understand, and then there’s this persistent storage layer that needs to be resolved as well.

Big data clusters deployed on Kubernetes, OpenShift, AKS [Diagram: Microsoft architecture diagram for SQL Server 2019 BDC, repeated from the previous slide.] Let’s take a quick peek under the hood. There’s a lot going on here, but some main observations: Where’s my Windows OS? It’s not an omission, there isn’t any. What’s with this Kubernetes piece? HDFS, Spark? What about this persistent storage layer at the bottom? Lots of pieces need to be built out, and lots of unfamiliar components need to be deployed from an infrastructure or DevOps standpoint to stand this thing up. Luckily Microsoft has helped us out with a lot of this, but first we’ve got this Kubernetes piece that we need to understand, and then there’s this persistent storage layer that needs to be resolved as well.

What did they (MS) do? Data lake components containerized Kubernetes SQL Server Spark HDFS Elasticsearch …and more Kubernetes Tools Deployment Management So what did they do with the architecture that we just looked at to enable this simplified deployment…..a few things… First, all the different components that we saw have been placed inside containers. Next, they chose Kubernetes as a platform – so to bring you up to speed, Kubernetes, or K8s if you’re cool, is an orchestration platform for containers. It allows for management, scaling, and deployment, among other things, of containers across a cluster of hosts. Finally, they provided a handful of tools for the deployment and management of BDCs

Some background on Docker and K8s Containers are designed to be ephemeral Containerized applications were supposed to be stateless No concept of “persistent storage” until 2018. 2018 We need to save data → Persistent storage is born K8s 1.10 (1.16 = current release) Already on second iteration Before we jump right in, a little background to put things into perspective. Containers are ephemeral, or short lived, meaning that if they die, you just spin up another and you’re back in business…and when they do die, their storage is recreated. Did you catch that? So at this point, if you store your database inside a container, what happens if the container dies and is recreated? Bye-Bye DB! For those of you that were involved in the early days of web development, tell me if this sounds familiar: containerized applications were supposed to be stateless. In 2018, someone wisely decided that sometimes it’s good to keep data! Because of this history, you will get some K8s purists that will look at you like you have 3 heads when you tell them you’re going to deploy a database environment on K8s. This is not for us to decide, as we are mere mortals. But this does give some context to the things that you will experience when working with K8s, and it’s important to understand the difference between container storage and persistent storage. Coming from a database background, we assume all storage is persistent; in K8s, it’s a special type.
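To make the ephemeral-versus-persistent distinction concrete, here is a minimal toy model in Python. It is not how Docker or K8s are implemented; the `ToyContainer` class, the file paths, and the plain dictionary standing in for a volume are all hypothetical names invented for this illustration. It only shows the behavior described above: files written inside the container vanish when it is recreated, while files written to storage that lives outside the container survive.

```python
class ToyContainer:
    """Toy model of a container: its own filesystem does not survive a restart."""

    def __init__(self, persistent_volume=None):
        self.local_fs = {}               # ephemeral: lives and dies with the container
        self.volume = persistent_volume  # persistent: a dict owned by the caller

    def write_local(self, path, data):
        self.local_fs[path] = data

    def write_volume(self, path, data):
        if self.volume is None:
            raise RuntimeError("no persistent volume attached")
        self.volume[path] = data

    def restart(self):
        # A recreated container starts over from its image, so local files are gone.
        self.local_fs = {}

    def files(self):
        found = set(self.local_fs)
        if self.volume is not None:
            found |= set(self.volume)
        return sorted(found)


volume = {}  # stands in for storage that exists outside the container
box = ToyContainer(persistent_volume=volume)
box.write_local("/tmp/scratch.mdf", b"database stored in the container")
box.write_volume("/data/sales.mdf", b"database stored on the volume")

before = box.files()
box.restart()
after = box.files()

print("before restart:", before)
print("after restart: ", after)
```

Only the file on the volume is still listed after the restart, which is the "Bye-Bye DB" scenario in miniature.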

Storage Changes in SQL Server 2019 BDC

K8s Storage vs Persistent Storage Container based Folders within container Ephemeral Persistent Storage Cluster based Presented to pod Folder within container Persistent :) By default, storage in K8s is ephemeral, so if you deploy SQL Server as a container and you then create a database or restore a database into it without presenting outside storage, when you restart the container, you will lose that data. When running containers on a single host outside of K8s, not a big deal, you can present a local volume. Howe
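As a rough sketch of what "presenting outside storage" to a pod looks like, the Python below builds two Kubernetes manifests as dictionaries and prints them as JSON, which `kubectl apply -f` accepts alongside YAML. This is a plain SQL Server container, not a BDC deployment (BDC storage is configured through its own deployment tooling). The object names `mssql-data` and `mssql-demo`, the storage class `my-storage-class`, and the 10Gi size are assumptions made up for the example; the SA password is deliberately left out and would normally come from a Secret.

```python
import json


def make_pvc(name, storage_class, size_gi):
    """PersistentVolumeClaim: a request for cluster-level storage."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],   # read-write from a single node
            "storageClassName": storage_class,  # hypothetical class name
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


def make_sql_pod(name, pvc_name):
    """Pod that mounts the claim where SQL Server on Linux keeps its data."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [{
                "name": "mssql",
                "image": "mcr.microsoft.com/mssql/server:2019-latest",
                # The SA password is omitted here; supply it from a Secret.
                "env": [{"name": "ACCEPT_EULA", "value": "Y"}],
                # Inside the container the volume just looks like a folder.
                "volumeMounts": [{"name": "data", "mountPath": "/var/opt/mssql"}],
            }],
            "volumes": [{
                "name": "data",
                "persistentVolumeClaim": {"claimName": pvc_name},
            }],
        },
    }


pvc = make_pvc("mssql-data", "my-storage-class", 10)
pod = make_sql_pod("mssql-demo", "mssql-data")
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": [pvc, pod]}, indent=2))
```

The point of the sketch is the last two stanzas of the pod: the claim is requested at the cluster level, presented to the pod as a volume, and shows up as a folder within the container, matching the slide.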

Persistent Storage Concepts Single host – easy Multiple hosts – tougher Persistent Storage In-tree (cloud provider) Out-of-tree (CSI) CSI expanded features Snapshots Cloning For a single host, you can just present a local volume. There’s no concern about sharing, because it’s only one host. Multiple hosts, that’s a different story. Anyone have experience with Windows Server Failover Clustering and SQL Server failover clustering? Remember the concept of shared storage on SAN volumes? It’s a similar concept: multiple hosts can have access and need to have access, but ownership needs to be managed and controlled, because not everyone can be allowed to write to the same location at the same time.

CSI and Dell EMC Storage VMware supports both enables all CSI bare metal support PowerMax, VxFlex OS, Isilon, XtremIO, Unity (Beta) The Unity CSI driver is due to release by EOY, so by the time SQL Server 2019 is released and you get your hands on those bits, it should be ready to go!

HDFS and Tiering HDFS Tiering Underlying filesystem for BDC Replication factor Access via Azure Data Studio and azdata Tiering BDCs allow for external HDFS mount External mount displays as folder ADLS, S3, Dell EMC Isilon By default the replication factor is 2; I’m still getting to the bottom of how to change it and what will be supported.
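The replication factor matters for storage sizing because HDFS keeps that many copies of every block. A minimal sizing sketch in Python follows; the function name, the headroom parameter, and the example figures are assumptions for illustration only, not a Dell EMC or Microsoft sizing rule.

```python
def raw_capacity_tb(logical_tb, replication_factor=2, headroom=0.0):
    """Raw storage needed to hold `logical_tb` of HDFS data.

    replication_factor: copies HDFS keeps of each block (2 is the BDC
        default quoted above).
    headroom: fraction of raw capacity to leave free, e.g. 0.5 keeps half free.
    """
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    return logical_tb * replication_factor / (1 - headroom)


for factor in (1, 2, 3):
    print(f"10 TB of data at replication factor {factor}: "
          f"{raw_capacity_tb(10, factor):.1f} TB raw")
```

Moving from a factor of 2 to 3 adds another full copy of the data, so it is worth settling the replication question before sizing the persistent volumes behind the storage pool.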

Storage Snapshots How do you protect big data environments? Multiple formats Large data footprint Leverage storage snapshots K8s CSI driver support Dell EMC supported Array features AppSync K8s has recognized the power and importance of this through CSI features such as volume snapshots and cloning. These utilize array features to make this happen.
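For a sense of what the CSI snapshot and cloning features look like from the K8s side, the Python sketch below builds a `VolumeSnapshot` and a new claim restored from it, printed as JSON manifests. It uses the `snapshot.storage.k8s.io/v1` API, which became generally available in Kubernetes releases later than the 1.16 mentioned earlier, so treat the version as an assumption. All object names, the snapshot class `my-snapshot-class`, and the storage class `my-storage-class` are hypothetical; real values depend on the CSI driver installed.

```python
import json


def make_volume_snapshot(name, pvc_name, snapshot_class):
    """Ask the CSI driver for a point-in-time snapshot of an existing claim."""
    return {
        "apiVersion": "snapshot.storage.k8s.io/v1",
        "kind": "VolumeSnapshot",
        "metadata": {"name": name},
        "spec": {
            "volumeSnapshotClassName": snapshot_class,  # hypothetical class
            "source": {"persistentVolumeClaimName": pvc_name},
        },
    }


def make_pvc_from_snapshot(name, snapshot_name, storage_class, size_gi):
    """New claim whose contents are populated from the snapshot."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class,  # hypothetical class
            "dataSource": {
                "apiGroup": "snapshot.storage.k8s.io",
                "kind": "VolumeSnapshot",
                "name": snapshot_name,
            },
            "resources": {"requests": {"storage": f"{size_gi}Gi"}},
        },
    }


snap = make_volume_snapshot("mssql-data-snap", "mssql-data", "my-snapshot-class")
restored = make_pvc_from_snapshot("mssql-data-restored", "mssql-data-snap",
                                  "my-storage-class", 10)
print(json.dumps(snap, indent=2))
print(json.dumps(restored, indent=2))
```

The manifests only describe the request; the snapshot itself is taken by the CSI driver, which is where the array features come in.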

Dell EMC Storage Q&A

Thank You! Stop by the Dell EMC Booth and say hello :) Doug.Bernhardt@Dell.com We can talk about SQL, we can talk about storage, you can play “stump the chump”, or you can just say Hi :).