Using External Persistent Volumes to Reduce Recovery Times and Achieve High Availability
Dinesh Israni, Senior Software Engineer, Portworx Inc.

Agenda
- Types of stateful services
- Advantages of using External Persistent Volumes
- Introduction to Portworx
- Deploying services to use Portworx volumes
- Demo
- Q&A

Types of stateful services
- Simple applications that run non-clustered
  - Don’t do their own replication across nodes
  - Examples: WordPress, MySQL, PostgreSQL
- Applications that run clustered
  - Maintain their own replicas across nodes
  - Repair data when nodes crash or are replaced
  - Examples: Cassandra, HDFS

Why is the replication strategy important?
- Things go wrong all the time
  - Nodes crash, networks have issues, disks fail
- Clustered applications
  - There is always another copy on another node
  - Replace the node and bootstrap
  - Recovery can take a long time, though
- Non-clustered applications
  - If you have no backup, you lose your data when a node fails

How can External Persistent Storage help?
- Provide high availability for applications even if they don’t provide replication
- Reduce recovery times for applications that do provide their own replication
- Virtualize your storage so that you maximize utilization of your clusters

About Portworx
- First production-ready software-defined storage solution designed for microservices
- Container-granular virtual storage
- Run your workloads local to your storage
- Snapshots and CloudSnaps for backup and DR
- Bring-Your-Own-Key encryption
- Automate provisioning and control repeatedly on-prem and in any cloud
- Runs as a container itself on your agents!

Portworx Is Cloud-Native Storage Built for Containers and Schedulers
[Diagram: a scheduler (Kubernetes, Mesosphere, Swarm) runs containers such as Nginx, Python, and Cassandra (labeled html5, rest, db) on top of Portworx, which spans cloud, SSD, HDD, and EBS storage.]
- POD-aware
- Any infrastructure
- Programmatic SAN
- No more volumes or storage per application to manage

Block Level Replication
- Synchronous replication
- All replicas are accessible from any node in the cluster
- If a replica fails, it is repaired when it comes back up
- If a replica fails permanently, the data is re-replicated onto another node in the cluster
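
As a rough illustration of the behaviour listed on this slide, here is a toy in-memory model of synchronous block replication. It is only a sketch and not how Portworx is implemented: the class `ReplicatedVolume`, its methods, and the node names are all hypothetical, and real block replication also has to handle ordering, partial writes, and quorum.

```python
class ReplicatedVolume:
    """Toy model of a volume whose blocks are mirrored on several nodes."""

    def __init__(self, replica_nodes):
        # node name -> {block number: data}
        self.replicas = {node: {} for node in replica_nodes}
        self.down = set()

    def _live(self):
        return [n for n in self.replicas if n not in self.down]

    def write(self, block, data):
        # Synchronous: the write returns only after every live replica has it.
        live = self._live()
        if not live:
            raise RuntimeError("no live replica")
        for node in live:
            self.replicas[node][block] = data

    def read(self, block):
        # Any live replica can serve the read.
        live = self._live()
        if not live:
            raise RuntimeError("no live replica")
        return self.replicas[live[0]][block]

    def fail(self, node):
        self.down.add(node)

    def recover(self, node):
        # Replica came back: repair it from a healthy copy.
        source = self.replicas[self._live()[0]]
        self.replicas[node] = dict(source)
        self.down.discard(node)

    def replace(self, dead_node, new_node):
        # Replica is gone for good: re-replicate onto another node.
        source = self.replicas[self._live()[0]]
        del self.replicas[dead_node]
        self.down.discard(dead_node)
        self.replicas[new_node] = dict(source)


vol = ReplicatedVolume(["node-a", "node-b", "node-c"])
vol.write(0, b"hello")
vol.fail("node-a")              # temporary failure
vol.write(1, b"world")          # node-a misses this write
vol.recover("node-a")           # repaired when it comes back up
vol.fail("node-b")              # permanent failure
vol.replace("node-b", "node-d") # data re-replicated onto node-d
```

After this sequence, `node-a` and `node-d` both hold blocks 0 and 1, and the volume still has three replicas.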

Recovery times – Without External Storage
- Tasks use local storage, so they are pinned to a node
- Node crash:
  - The faster your node reboots, the lower the recovery time
  - But maintenance windows can be long
  - Network outages can last hours
- Node failure:
  - Requires copying ALL the data that was on the failed node
  - Causes network congestion
  - Reduces performance of the entire cluster
  - Example: for Cassandra you need to “bootstrap” and “repair” a new node

Recovery times – With External Storage
- Storage is accessible from any node in the cluster
- Node crash:
  - The task can fail over to another node as soon as the scheduler determines the node is down
- Node failure:
  - Treated like a node crash
  - Block-level replication ensures another node has an exact copy of the same data
  - Example: for Cassandra you just need to run “repair” and can skip the “bootstrap” step
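
To make the difference concrete, the following back-of-the-envelope sketch compares the two recovery paths. Every number in it is a made-up assumption chosen for illustration, not a measurement, and the function names are hypothetical. It treats 1 GB as 1000 MB and ignores the extra slowdown from network congestion.

```python
def rebuild_seconds(data_gb, throughput_mb_per_s):
    """Time to copy all of a failed node's data onto a replacement node."""
    return data_gb * 1000 / throughput_mb_per_s


def failover_seconds(detection_s, restart_s):
    """Time for the scheduler to notice the node is down and restart the task elsewhere."""
    return detection_s + restart_s


# Hypothetical inputs: 1000 GB on the failed node, 100 MB/s sustained copy rate,
# 60 s for the scheduler to declare the node down, 30 s to restart the task.
local = rebuild_seconds(data_gb=1000, throughput_mb_per_s=100)
external = failover_seconds(detection_s=60, restart_s=30)

print(f"local storage, full re-copy: {local:.0f} s")
print(f"external volume, failover:   {external} s")
```

Under these assumptions the full re-copy takes 10000 seconds (close to three hours), while failover onto an existing replica takes 90 seconds. The real gap depends on data size, network capacity, and scheduler timeouts.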

Advantages of using Portworx
- Using a SAN or NAS with container workloads is an anti-pattern
  - Static, out-of-band provisioning
  - Increases latencies
  - Introduces issues and complexities during failure scenarios
  - Network issues can bring down ALL services on your cluster
- Portworx is built for microservices from the ground up
  - Tight integration with schedulers
  - Container-granular volumes
  - Unified solution for hybrid deployments
  - Works with VMs, in the cloud, and on bare metal; truly cloud-native

Why not just use EBS directly?
- Doesn’t work for hybrid deployments; you need different solutions for on-prem and cloud
- Limit of 16 EBS volumes per EC2 instance
- EBS volumes regularly get stuck in the attaching/detaching state, which requires manual intervention
- Poor performance unless you use Provisioned IOPS, which is expensive
- Failover is slow

Using Portworx with your stateful services
- Deploy simple services using Marathon or from the DCOS Universe
- Deploy services based on dcos-commons from the Universe
- Use the dcos-commons framework to write your own distributed services

Deploying through Marathon
- Just use “pxd” as the volume-driver
- You can provide volume options while deploying services (size, replication factor, encryption key, etc.)
- No need to pre-provision volumes
- Can be used to deploy services with Docker and UCR!
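
A minimal sketch of what such a Marathon app definition might look like for the Docker containerizer, built here as a Python dictionary and printed as JSON. The `volume-driver` key and the `pxd` value come from the slide. The helper `marathon_app`, the app id, image, volume name, resource sizes, and the inline `name=...,size=...,repl=...` option string are assumptions for illustration; check the Portworx documentation for the exact option names your version accepts.

```python
import json


def marathon_app(app_id, image, volume_name, mount_path, size_gb, repl):
    """Build a Marathon app definition that mounts a Portworx volume (sketch)."""
    # Assumed inline volume spec: options first, then the container mount path.
    volume_spec = f"name={volume_name},size={size_gb},repl={repl}:{mount_path}"
    return {
        "id": app_id,
        "cpus": 0.5,
        "mem": 1024,
        "instances": 1,
        "container": {
            "type": "DOCKER",
            "docker": {
                "image": image,
                # Marathon passes these through as `docker run` flags.
                "parameters": [
                    {"key": "volume-driver", "value": "pxd"},
                    {"key": "volume", "value": volume_spec},
                ],
            },
        },
    }


app = marathon_app(
    app_id="/mysql",                 # hypothetical app id
    image="mysql:5.7",               # hypothetical image
    volume_name="mysql_vol",         # hypothetical volume name
    mount_path="/var/lib/mysql",
    size_gb=10,
    repl=3,
)
print(json.dumps(app, indent=2))
```

Because the volume options travel with the app definition, there is no separate provisioning step; the volume would be created on first use if it does not already exist.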

Using Portworx with your stateful services
- Enhanced dcos-commons to support Portworx volumes
- Supported services with Portworx (now available in the DCOS Universe!)
  - Cassandra
  - Hadoop
  - Elasticsearch
  - Kafka
- Supports failover of tasks when nodes crash
  - Higher uptime for your services
  - Reduces recovery times
- Tasks are co-located with the volumes
  - Reduces latencies
  - Reduces network congestion

Creating services using dcos-commons and Portworx Volumes
- dcos-commons makes it easier to write and deploy complex stateful services
  - But it only supports ROOT and MOUNT volumes
  - Not efficient when dealing with failed nodes
- Added support for Portworx volumes in the dcos-commons SDK
  - As easy as adding docker_volume_name and docker_volume_options to any service you write for dcos-commons
  - Just build the service and deploy
  - Volumes will be automatically provisioned
- Source:
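
As a hedged sketch of the idea, the snippet below builds the volume section of a service definition as a Python dictionary. Only `docker_volume_name` and `docker_volume_options` are taken from the slide. The helper `portworx_volume`, the `type` value, the comma-separated option format, and the example names are assumptions, so consult the Portworx dcos-commons source for the real schema.

```python
def portworx_volume(path, size_mb, volume_name, options):
    """Return a volume section for a dcos-commons task (illustrative only)."""
    return {
        "path": path,
        "type": "DOCKER",  # assumption: stands in for ROOT or MOUNT
        "size": size_mb,
        "docker_volume_name": volume_name,
        # Assumed format: comma-separated key=value pairs.
        "docker_volume_options": ",".join(f"{k}={v}" for k, v in options.items()),
    }


volume = portworx_volume(
    path="data",                       # hypothetical mount path inside the task
    size_mb=10240,
    volume_name="cassandra-node-0",    # hypothetical volume name
    options={"size": 10, "repl": 3},   # hypothetical options
)
print(volume)
```

Apart from these extra fields, the rest of the service definition would look the same as one that uses ROOT or MOUNT volumes.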

Demo Time!

Any Questions?

Learn More
- Visit us at our MesosCon booth
- Visit www.portworx.com
- Download from Mesosphere Service Catalog:
- Get started ASAP: get-started-asap.html
- Download free PX-Developer:
- Request a demo and free trial of PX-Enterprise at
