Docker for DBAs SQL Saturday 8/17/2019
About Me – Brad(ley) Nielsen 11+ years development experience Masters in Human Computer Interaction IU Graduate Certificate in Data Science MCSE in Business Intelligence Microsoft Professional Program for Data Science 1+ Years working with Docker
Contents Docker What is it? How to use it? Manage State in Docker SQL Server + Docker Docker benefits Big topic. Try to smooth over nuances where possible. Look at a narrow slice of the ecosystem. Most Docker documentation is shrouded in app dev talk
What is Docker? Open platform for developing, shipping, and running applications. Main Benefits Write once deploy anywhere Improved security Fast deployment Efficient resource utilization https://opensource.com/resources/what-docker https://www.docker.com/resources/what-container Start vague and get more detailed as we go Understand whole to understand the pieces.
Container Analogy Container imagery is everywhere. Why are they called containers? Analogy to shipping containers. Before shipping containers, moving stuff was hard: cargo came in different shapes and sizes. Containers standardized that.
Container Analogy Docker containers aim to do the same with applications
Docker Components
Image: Binary “executable”
Container: Running “executable”
Registry: Versioned image storage and sharing
Host: Resource that spawns and manages containers
Image = rough analogy of an “executable”. Container = double click and run the “executable”.
Image Overview Images are layered, read-only snapshots of applications. Contains: Application; Dependencies (DLLs, Jars, libraries, etc.); Any other file specified by the build (config files, etc.). Does not include the operating system kernel; containers share the host's kernel. Unit of deployment in Docker. Relying on the host's kernel instead of booting its own operating system is why Docker starts so much faster than a VM.
Image Overview Images are built in layers to save space and provide reusability. Union file system. Take note of the R/W layer: it goes away when the container is removed and recreated (a plain stop/start keeps it). Note the R/O layers.
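The layering on this slide can be modeled, very loosely, with Python's `collections.ChainMap`. This is only a sketch of the idea, not how Docker's storage drivers work: the file paths and contents are made up, and `start_container` is a hypothetical helper. Lookups fall through from the top layer to the bottom, and every write lands in the top (R/W) layer only.

```python
from collections import ChainMap

# Read-only image layers, lowest first (contents are invented for illustration).
base_layer = {"/bin/sh": "shell binary", "/etc/app.conf": "debug=false"}
app_layer = {"/app/server": "app binary", "/etc/app.conf": "debug=true"}


def start_container(*image_layers):
    """Return a container view: a fresh, empty R/W layer over the image layers."""
    # ChainMap searches its maps left to right, so the newest layer goes first
    # and the empty dict in front plays the role of the R/W layer.
    return ChainMap({}, *reversed(image_layers))


container = start_container(base_layer, app_layer)

# ChainMap sends writes to the first map only; the image layers stay untouched.
container["/data/rows.db"] = "new rows"

# A replacement container built from the same image gets a new, empty R/W layer.
replacement = start_container(base_layer, app_layer)

print(container["/etc/app.conf"])      # upper layer shadows the lower one
print("/data/rows.db" in replacement)  # the earlier write is not visible here
```

The same two image layers back both containers, which is the space saving and reusability the slide refers to.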
Container Overview Container = Image + R/W Layer + Process(es) Images may be used to create one to many containers Processes within a container are isolated Much less resource usage vs VM From a process's point of view, it appears to be alone. (Various OS tricks.) Containers don’t recreate OS resources
Container Overview Containers are to applications as virtual machines are to operating systems. Don’t lean too heavily on the VM analogy: “lightweight VM” is a good mental model, but the two have little in common technically. (Diagram: Container vs. Virtual Machine)
Registry Overview
Stores, shares, and versions Docker images.
Repository = 1 type of image
Registry = many repositories
Types: Local, Private, Public
Private = usually a corporate registry. Analogy: Repository = Table, Registry = DB.
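The registry/repository split shows up directly in image names, which take the form `registry/repository:tag`. A minimal sketch of pulling those parts apart follows; `split_image_reference` is a hypothetical helper, it ignores digests (`@sha256:...`), and it simplifies Docker's real normalization rules (for example, Docker Hub official images actually live under a `library/` prefix).

```python
def split_image_reference(ref):
    """Split 'registry/repository:tag' into (registry, repository, tag)."""
    # A tag follows the last ':' only if that ':' comes after the last '/';
    # otherwise the ':' belongs to a registry port such as localhost:5000.
    last_slash = ref.rfind("/")
    last_colon = ref.rfind(":")
    if last_colon > last_slash:
        name, tag = ref[:last_colon], ref[last_colon + 1:]
    else:
        name, tag = ref, "latest"  # Docker assumes 'latest' when no tag is given

    # The first path component is a registry host if it looks like one.
    first, _, rest = name.partition("/")
    if rest and ("." in first or ":" in first or first == "localhost"):
        return first, rest, tag
    return "docker.io", name, tag  # no host given: default public registry


print(split_image_reference("mcr.microsoft.com/mssql/server:2017-latest"))
print(split_image_reference("localhost:5000/sales-db"))
```

In the table/database analogy from the slide, the first element is the "DB", the second is the "table", and the tag picks one version of the image.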
Host Overview
Docker Daemon
Azure: Azure App Service, Azure Kubernetes Service, Azure Container Instances
Other Orchestrators: Docker Swarm, Elastic Container Service, HashiCorp Nomad
Docker Daemon = single machine. Orchestrator = advanced multi-container deployments, usually cluster based.
Azure Host Overview
Azure Container Instances: Container equivalent to Azure Functions. Per-second billing.
Azure App Service: Allows you to run any web stack in App Service. One to many containers. Same frills as other App Service apps.
Azure Kubernetes Service: For complicated multi-container services.
ACI = serverless. App Service = web apps. AKS = heavy-duty enterprise cluster. There are a couple of others, but those are the main ways.
Put It All Together We’ll go through these steps in a demo
How to use it? DEMO Build Push to Repository Deploy to Azure Container Instances Deploy to Azure App Service
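For readers without the live demo, the build, push, and deploy steps can be sketched as the CLI commands they roughly correspond to. This is not the demo's actual script: the registry, image, and resource group names are placeholders, and `build_push_deploy` is a hypothetical helper that only assembles the commands without running them.

```python
import shlex


def build_push_deploy(registry, repository, tag, resource_group, container_name):
    """Return the CLI commands for build -> push -> deploy, without running them."""
    image = f"{registry}/{repository}:{tag}"
    return [
        # Build an image from the Dockerfile in the current directory.
        ["docker", "build", "-t", image, "."],
        # Upload the image to the registry named in its tag.
        ["docker", "push", image],
        # Start the image as a container in Azure Container Instances.
        ["az", "container", "create",
         "--resource-group", resource_group,
         "--name", container_name,
         "--image", image],
    ]


# All names below are placeholders for illustration.
deploy_commands = build_push_deploy(
    "myregistry.azurecr.io", "demo-app", "v1", "demo-rg", "demo-app")
for command in deploy_commands:
    print(shlex.join(command))
```

A private registry also requires logging in before the push (for example with `docker login`), and the deployment target needs credentials to pull from it; both are left out of this sketch.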
Let’s talk about state in Docker Containers are stateless Most apps need some sort of state management Databases are very much stateful What is state? Things you need to save if the app reboots
Option 0: Use a database Most Docker apps are stateless Rely on a database that lives outside the cluster Obviously not going to work when your app is a database
Option 1: Embrace stateless Useful for testing Fast spin up and tear down Changes to data do not persist once the container is removed and recreated Great for QA and Testing
Option 2: Volumes Conceptually like network drives for containers Stored in host file system, but managed by Docker Can also mount remote storage such as Azure Files Default is folder on the host system
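As a rough illustration of the volume option, the sketch below assembles the two commands involved: creating a Docker-managed named volume, then mounting it into a container. The image name, volume name, and mount path are placeholders, and `run_with_volume` is a hypothetical helper that only builds the command lists.

```python
import shlex


def run_with_volume(image, volume_name, mount_path):
    """Return commands that create a named volume and mount it into a container."""
    return [
        # Create a volume that Docker manages on the host file system.
        ["docker", "volume", "create", volume_name],
        # -d runs detached; -v NAME:PATH mounts the volume at PATH in the container.
        ["docker", "run", "-d", "-v", f"{volume_name}:{mount_path}", image],
    ]


# Placeholder names for illustration only.
volume_commands = run_with_volume("demo-app:v1", "appdata", "/data")
for command in volume_commands:
    print(shlex.join(command))
```

Anything the container writes under the mount path lands in the volume rather than in the container's R/W layer, so it is still there when a new container mounts the same volume.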
Option 3: StatefulSets Kubernetes only Uses Volume claims (Docker Volumes for clusters) Other advanced features for stateful applications Other orchestrators have similar features
SQL Server on Docker
Linux: SQL Server 2017 or 2019
Windows: SQL Server Express (limited support)
Operate identically to their VM counterparts
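A minimal sketch of what starting the Linux SQL Server image looks like, assuming Microsoft's published image `mcr.microsoft.com/mssql/server` and its documented `ACCEPT_EULA` and `SA_PASSWORD` environment variables. The container name, password, and volume name are placeholders, and `sql_server_run_command` is a hypothetical helper that only builds the command.

```python
import shlex


def sql_server_run_command(sa_password, name="sql1", host_port=1433,
                           data_volume=None,
                           image="mcr.microsoft.com/mssql/server:2017-latest"):
    """Return a 'docker run' command for a SQL Server on Linux container."""
    command = [
        "docker", "run", "-d", "--name", name,
        "-e", "ACCEPT_EULA=Y",               # the image will not start without this
        "-e", f"SA_PASSWORD={sa_password}",  # must meet SQL Server complexity rules
        "-p", f"{host_port}:1433",           # host port -> SQL Server's default port
    ]
    if data_volume:
        # SQL Server on Linux keeps its data under /var/opt/mssql; mounting a
        # volume there lets databases outlive the container.
        command += ["-v", f"{data_volume}:/var/opt/mssql"]
    return command + [image]


# Placeholder password and volume name for illustration only.
sql_command = sql_server_run_command("Placeholder!Passw0rd", data_volume="sqldata")
print(shlex.join(sql_command))
```

Once the container is running, tools such as SSMS or sqlcmd connect to `localhost,1433` the same way they would to an instance on a VM.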
SQL Server on Docker DEMO Deploy to Azure App Service
SQL Server Kubernetes HA Groups Easy deployment… if you know Kubernetes
SQL Server 2019 Big Data Cluster
• A compute pool is a group of SQL Server pods used for parallel ingestion of data from an external source (such as Oracle, HDFS, or another SQL Server) and for cross-partition aggregation and shuffling of the data as part of a query.
• A storage pool is a group of pods containing SQL Server engine, HDFS data node, and Spark containers. This provides the scalable storage tier along with the collocated compute for SQL Server and Spark right next to the data.
• A data pool is a group of SQL Server engine pods that is used either to cache data from an external source or to store an incoming stream of append-only data. In either case, the data is partitioned and distributed across all of the SQL Server instances in the pool.
Beware version 1
Docker Benefits Higher server density over VMs Reduce errors moving from development to production Easy app scaling Easy deployment Security DevOps integration Worked on my machine
Docker Drawbacks Steep learning curve Orchestrators (Kubernetes, etc.) make this curve steeper May be overkill for smaller applications Caution is recommended for production DB deployments