Ceph: de facto storage backend for OpenStack

Presentation transcript:

Ceph: de facto storage backend for OpenStack (OpenStack Summit 2013, Hong Kong)

Whoami
💥 Sébastien Han
💥 French Cloud Engineer working for eNovance
💥 Daily job focused on Ceph and OpenStack
💥 Blogger
Personal blog: http://www.sebastien-han.fr/blog/
Company blog: http://techs.enovance.com/
Worldwide offices coverage: we design, build and run clouds, anytime, anywhere

Ceph: what is it?

The project
Unified distributed storage system
Started in 2006 as a PhD project by Sage Weil
Open source under the LGPL license
Written in C++
Building the future of storage on commodity hardware
Why insist on commodity hardware? Open source means no vendor lock-in, and no software or hardware lock-in either. You don’t need big boxes anymore: you can mix diverse hardware (old, crappy or recent), which means the cluster moves along with your needs and your budget, and it obviously makes Ceph easy to test.

Key features
Self managing/healing
Self balancing
Painless scaling
Data placement with CRUSH

It provides numerous features:
Self healing: if something breaks, the cluster reacts and triggers a recovery process.
Self balancing: as soon as you add a new disk or a new node, the cluster moves and rebalances data.
Self managing: periodic tasks such as scrubbing check object consistency, and if something is wrong Ceph repairs the object.
Painless scaling: it’s fairly easy to add a new disk or node, especially with all the tools out there to deploy Ceph (Puppet, Chef, ceph-deploy).
Intelligent data placement: you can logically reflect your physical infrastructure and build placement rules; objects are automatically placed, balanced and migrated in a dynamic cluster.

CRUSH: Controlled Replication Under Scalable Hashing
Pseudo-random placement algorithm: fast calculation, no lookup, repeatable and deterministic
Statistically uniform distribution
Rule-based configuration: infrastructure topology aware, adjustable replication

The way CRUSH is configured is somewhat unique. Instead of defining pools for different data types, workgroups, subnets, or applications, CRUSH is configured with the physical topology of your storage network. You tell it how many buildings, rooms, shelves, racks, and nodes you have, and you tell it how you want data placed. For example, you could tell CRUSH that it’s okay to have two replicas in the same building, but not on the same power circuit. You also tell it how many copies to keep.
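
To make the idea concrete, here is a deliberately simplified Python sketch of the kind of deterministic, hash-based placement CRUSH performs. It is not the real CRUSH algorithm (no bucket types, no topology rules), just an illustration of why placement is repeatable, needs no lookup table, and can respect per-OSD weights; the object name and OSD weights below are made up.

    import hashlib

    def straw_draw(obj_id, replica, osd_name, weight):
        """Deterministic pseudo-random 'draw' for one OSD, scaled by its weight."""
        h = hashlib.md5(f"{obj_id}:{replica}:{osd_name}".encode()).hexdigest()
        return int(h, 16) / 16**32 * weight   # value in [0, weight)

    def place(obj_id, osds, replicas=3):
        """Pick `replicas` distinct OSDs for an object; same input, same output."""
        chosen = []
        for r in range(replicas):
            candidates = [(straw_draw(obj_id, r, name, w), name)
                          for name, w in osds.items() if name not in chosen]
            chosen.append(max(candidates)[1])
        return chosen

    osds = {"osd.0": 1.0, "osd.1": 1.0, "osd.2": 2.0, "osd.3": 1.0}
    print(place("rbd_data.1234.0000000000000017", osds))  # repeatable, no central lookup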

Overview
RADOS is a distributed object store, and it’s the foundation for Ceph. On top of RADOS, we have built three systems that give you several ways to access data:
RGW: native RESTful API, S3 and Swift compatible, multi-tenancy and quotas, multi-site capabilities, disaster recovery
RBD: thinly provisioned, full and incremental snapshots, copy-on-write cloning, native Linux kernel driver support, supported by KVM and Xen
CephFS: POSIX-compliant semantics, subdirectory snapshots
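
For readers who want to see the RADOS and RBD layers from code, below is a minimal, hedged sketch using the python-rados and python-rbd bindings. It assumes a reachable cluster configured in /etc/ceph/ceph.conf and an existing pool named 'rbd'; the image and snapshot names are invented. RGW (plain S3/Swift over HTTP) and CephFS (a POSIX mount) are not shown.

    import rados
    import rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')              # everything below ends up in RADOS

    # RADOS: store a raw object directly in the pool
    ioctx.write_full('hello-object', b'stored as a native RADOS object')

    # RBD: thin-provisioned block image, snapshot, then a copy-on-write clone
    r = rbd.RBD()
    r.create(ioctx, 'base-image', 10 * 1024 ** 3,  # 10 GiB, allocated lazily
             old_format=False, features=rbd.RBD_FEATURE_LAYERING)
    with rbd.Image(ioctx, 'base-image') as img:
        img.create_snap('gold')
        img.protect_snap('gold')                   # clones require a protected snapshot
    r.clone(ioctx, 'base-image', 'gold', ioctx, 'vm-disk-1',
            features=rbd.RBD_FEATURE_LAYERING)     # instant copy-on-write clone

    ioctx.close()
    cluster.shutdown()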

Building a Ceph cluster: general considerations

How to start?
➜ Use case
   IO profile: bandwidth? IOPS? Mixed?
   Guaranteed IOs: how many IOPS or how much bandwidth per client do I want to deliver?
   Usage: do I use Ceph standalone, or is it combined with a software solution?
➜ Amount of data (usable, not raw; see the sizing sketch below)
   Replica count
   Failure ratio: how much data am I willing to rebalance if a node fails?
   Do I have a data growth plan?
➜ Budget :-)
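
Since the usable-versus-raw question trips people up, here is a tiny back-of-the-envelope sizing sketch, assuming plain replication. All the numbers (disk count, replica count, failure reserve, near-full margin) are illustrative, not recommendations.

    # Back-of-the-envelope capacity planning for a replicated Ceph cluster.
    raw_tb        = 48 * 4        # e.g. 12 nodes x 4 OSDs x 4 TB drives
    replicas      = 3             # replica count chosen above
    failure_ratio = 1 / 12        # keep room to rebalance one failed node out of 12
    nearfull      = 0.85          # don't plan to run the cluster past ~85% full

    usable_tb = raw_tb / replicas * (1 - failure_ratio) * nearfull
    print(f"raw: {raw_tb} TB -> plan for roughly {usable_tb:.0f} TB usable")
    # raw: 192 TB -> plan for roughly 50 TB usable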

Things that you must not do
➜ Don’t put RAID underneath your OSDs: Ceph already manages the replication, a degraded RAID hurts performance, and it reduces the usable space of the cluster
➜ Don’t build high-density nodes with a tiny cluster: failure considerations, data to rebalance, potential full cluster
➜ Don’t run Ceph on your hypervisors (unless you’re broke)

State of the integration: including Havana’s best additions

Why is Ceph so good?
It unifies storage for OpenStack: Ceph tightly interacts with the OpenStack components.

Havana’s additions
Complete refactor of the Cinder driver:
   librados and librbd usage
   flatten volumes created from snapshots (see the sketch below)
   clone depth
Cinder backup with a Ceph backend:
   backing up within the same Ceph pool (not recommended)
   backing up between different Ceph pools
   backing up between different Ceph clusters
   support for RBD stripes
   differentials
Nova:
   libvirt_image_type = rbd
   directly boot all the VMs in Ceph
Volume QoS
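
To illustrate what “flatten volumes created from snapshots” and “clone depth” mean in practice, here is a hedged python-rbd sketch that walks an image’s parent chain and flattens the image once the chain gets too deep, roughly the behaviour the refactored Cinder driver implements. The pool name, volume name and max_depth value are invented, and the sketch assumes all parents live in the same pool.

    import rados
    import rbd

    def clone_depth(ioctx, name):
        """Count how many copy-on-write parents sit behind an image."""
        with rbd.Image(ioctx, name) as img:
            try:
                _pool, parent, _snap = img.parent_info()
            except rbd.ImageNotFound:      # no parent: this is a base image
                return 0
        return 1 + clone_depth(ioctx, parent)

    def maybe_flatten(ioctx, name, max_depth=5):
        """Detach an image from its parent chain once the chain gets too long."""
        if clone_depth(ioctx, name) > max_depth:
            with rbd.Image(ioctx, name) as img:
                img.flatten()              # copies parent blocks in, breaking the dependency

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('volumes')
    maybe_flatten(ioctx, 'volume-1234')
    ioctx.close()
    cluster.shutdown()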

Today’s Havana integration

Is Havana the perfect stack? …

Well, almost…

What’s missing?
Direct URL download for Nova: already in the pipeline, probably for 2013.2.1
Nova snapshot integration with Ceph snapshots: https://github.com/jdurgin/nova/commits/havana-ephemeral-rbd

Icehouse and beyond: the future

Tomorrow’s integration

Icehouse roadmap:
   Implement “bricks” for RBD
   Re-implement the snapshotting function to use RBD snapshots
   RBD on Nova bare metal
   Volume migration support
   RBD stripes support
« J » potential roadmap:
   Manila support

Ceph, what’s coming up? Roadmap

Firefly
Tiering: cache pool overlay
Erasure coding
Ceph OSD on ZFS
Full support of OpenStack Icehouse
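
A quick worked comparison shows why erasure coding is interesting next to plain replication; the (k=8, m=4) profile below is just an example, not a Firefly default.

    # Raw capacity consumed per byte of usable data: replication vs erasure coding.
    def overhead_replicated(replicas):
        return replicas                    # 3 copies -> 3x raw per usable byte

    def overhead_erasure(k, m):
        return (k + m) / k                 # 8 data + 4 coding chunks -> 1.5x

    print(f"3x replication : {overhead_replicated(3):.2f}x raw per usable byte")
    print(f"EC k=8, m=4    : {overhead_erasure(8, 4):.2f}x raw per usable byte")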

Many thanks! Questions?
Contact: sebastien@enovance.com
Twitter: @sebastien_han
IRC: leseb