1
Section 7 Erasure Coding Overview
2
Objectives
What is Erasure Coding?
Erasure Coding in Ceph
Configure Erasure Coding
3
What is Erasure Coding?
4
What is Erasure Code?
In information theory: an erasure code is a forward error correction (FEC) code for the binary erasure channel, which transforms a message of k symbols into a longer message (code word) with n symbols such that the original message can be recovered from a subset of the n symbols. The fraction r = k/n is called the code rate. The fraction k’/k, where k’ denotes the number of symbols required for recovery, is called reception efficiency.
Thanks Wikipedia, that really helps!
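The two fractions in that definition are easy to compute directly. The sketch below is only an illustration: the function names are made up for this example, and the k=3, n=5 values are borrowed from the k=3, m=2 write example later in this section.

```python
from fractions import Fraction


def code_rate(k: int, n: int) -> Fraction:
    """Code rate r = k/n: the share of each code word that is original data."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return Fraction(k, n)


def reception_efficiency(k_prime: int, k: int) -> Fraction:
    """Reception efficiency k'/k, where k' symbols are required for recovery."""
    if k <= 0 or k_prime <= 0:
        raise ValueError("k and k' must be positive")
    return Fraction(k_prime, k)


# k=3 data symbols encoded into n=5 symbols (hypothetical example values).
rate = code_rate(3, 5)
# If any 3 of the 5 symbols are enough to recover, then k' equals k.
efficiency = reception_efficiency(3, 3)
```

With these values the code rate is 3/5, meaning three fifths of what is stored is original data and the rest is recovery information.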
5
Why do we have Erasure Coding?
The default data protection strategy in SES is simple replication
  Defined by the size parameter of a pool
  Each object is replicated a number of times to provide resilience
This approach is simple and effective but comes at a price
  For a replication size of n, the raw storage requirement is n times the amount of data being stored
  Data replication overhead is high, especially as replication size increases
Erasure coding provides an alternative
  Trades off some resilience and performance to lower the raw disk requirements for storage
6
Replication vs Erasure Coding
Replication (default)
  Use for active data
  Simple and fast
  Uses more disk space
Erasure coding
  Use for archive data
  Calculates recovery data
  Definable redundancy level
  Needs a cache layer for use with RBD
    Data is accessed via a replicated pool and then migrated to the Erasure Coded pool
Can use both at the same time
  But in different pools
7
Erasure Coding in Ceph
8
A quick video overview...
9
Normal Ceph Read/Write
10
Erasure Coded Write
11
Erasure Coded Write
Encode takes place on the primary OSD host node
Example is k=3, m=2, so 5 OSDs are required
12
Erasure Coded Write
With k=3, data is split into 3 shards, each written to a different OSD (via CRUSH calculation)
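The splitting step can be pictured with a toy sketch. This is not how Ceph lays out data internally: the function name, the zero-byte padding and the simple contiguous split are all assumptions made for illustration, and OSD placement via CRUSH is not modelled at all.

```python
from typing import List


def split_into_shards(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-length shards, zero-padding the tail.

    Assumption: padding with zero bytes and cutting contiguous slices is
    a simplification chosen for this example only.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    shard_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(shard_len * k, b"\0")
    return [padded[i * shard_len:(i + 1) * shard_len] for i in range(k)]


# Hypothetical 8-byte object split for k=3.
shards = split_into_shards(b"ABCDEFGH", 3)
```

Each of the three resulting shards would then go to a different OSD. The padding is needed here only because 8 bytes do not divide evenly by 3.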
13
Erasure Coded Write
With m=2, 2 recovery shards are calculated and written to different OSDs
14
Erasure Coding (just the basics)
Calculates parity blocks to recover data
Configurable K+M parameters (example at 10+6)
  All data now stored on 16 disks
  Requires any 10 disks to recover
  Data safe with up to 6 disk failures
  Only requires 60% extra raw capacity
Performance
  All disks need to acknowledge writes
  Slower recovery (think RAID 6)
  More chance of failure during rebuild
Do not use K+M of 10+1: you need to have sufficient failure cover
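The 10+6 figures follow directly from k and m. A minimal sketch of the arithmetic, using a hypothetical helper name (`ec_summary`) that is not part of any Ceph tooling:

```python
def ec_summary(k: int, m: int) -> dict:
    """Derive the headline numbers for a K+M erasure coding layout."""
    if k <= 0 or m < 0:
        raise ValueError("need k > 0 and m >= 0")
    return {
        "total_shards": k + m,             # disks holding each object
        "shards_needed": k,                # minimum shards to rebuild data
        "failures_tolerated": m,           # shards that may be lost
        "extra_raw_percent": 100 * m / k,  # overhead on top of the data
    }


ten_six = ec_summary(10, 6)
ten_one = ec_summary(10, 1)
```

For 10+6 this gives 16 shards, 10 needed for recovery, 6 tolerated failures and 60% extra raw capacity. The 10+1 layout costs only 10% extra but survives a single failure across 11 disks, which is why the slide warns against it.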
15
Erasure Coding overview
Makes storage much cheaper
  With no reduction in reliability (if properly configured)
Great for archival storage
Power consumption advantages
  Trades disk running for CPU load for writing and recovery
Makes reads, writes and recovery slower
  You will probably want to add a cache tier
  To maximize the performance
Can access via RADOS
  RADOS is the Ceph native API
Requires a cache tier for RBD access
  But you probably want one anyway
16
Erasure Coding Plugins
The EC algorithm and implementation are pluggable
  jerasure/gf-complete (free, open, and very fast)
  ISA-L (Intel library; optimized for modern Intel processors)
  LRC (locally repairable code; layers over existing plugins)
  SHEC (trades extra storage for recovery efficiency; new from Fujitsu)
Parameterized
  Pick “k” and “m”, stripe size
OSD handles data path, placement, rollback, etc.
Erasure plugin handles
  Encode and decode maths
  Given these available shards, which ones should I fetch to satisfy a read?
  Given these available shards and these missing shards, which ones should I fetch to recover?
17
Erasure Coding Parameters
Two key parameters in Erasure Coding configuration
K: Erasure coding works by splitting data into shards which are then written to separate OSDs. K determines the number of shards into which data is split.
  The default is k=2
M: Erasure coding calculates additional data which is used to reconstruct missing shards (for example, after an OSD failure). M determines how many additional shards are calculated, and this is also the number of OSD failures which an erasure coded pool can withstand.
  The default is m=1
The default values stripe data across two OSDs and store calculated data on a third. The loss of any one OSD is tolerable, similar to a replication size of 2
For 1GB of storage, a replication size of 2 needs 2GB, whereas erasure coded data with k=2/m=1 only requires 1.5GB
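The 2GB versus 1.5GB comparison can be checked with two one-line formulas. The function names below are invented for this sketch, and the figures ignore any filesystem or metadata overhead.

```python
def replicated_raw_gb(data_gb: float, size: int) -> float:
    """Raw capacity used by a replicated pool: one full copy per replica."""
    return data_gb * size


def erasure_coded_raw_gb(data_gb: float, k: int, m: int) -> float:
    """Raw capacity used by an EC pool: k data shards plus m recovery shards,
    each shard holding 1/k of the data."""
    return data_gb * (k + m) / k


replica_2 = replicated_raw_gb(1, 2)      # replication size of 2
ec_2_1 = erasure_coded_raw_gb(1, 2, 1)   # default k=2, m=1
```

Both layouts survive the loss of one OSD, but the erasure coded pool needs 1.5GB of raw space per 1GB stored rather than 2GB.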
18
Configure Erasure Coding
19
Erasure Code Profiles
Profiles define the parameters for erasure coding
Profile contains
  k and m values
  CRUSH ruleset
  Plugin
    Default jerasure
  Technique
    Default reed_sol_van
When a pool is created as an EC pool, the profile determines the setup for Erasure Coding
  This cannot be changed later
20
Command: ceph osd erasure-code-profile
Syntax: ceph osd erasure-code-profile OPTIONS
Option  Description
get     View details of an existing EC profile
set     Set a profile; requires k and m values, with optional values such as ruleset and plugin
ls      List profiles
21
Default EC Profile
The default profile provides a basic erasure coding configuration which will function in almost any cluster
Uses minimum practical values k=2, m=1
Is written over 3 OSDs
  Two data shards
  One recovery shard
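For the two-data-shards-plus-one-recovery-shard layout, the idea of a recovery shard can be shown with plain XOR parity. This is a deliberately simplified stand-in: it is not the Reed-Solomon technique (reed_sol_van) that the jerasure plugin uses, and every name in the sketch is hypothetical.

```python
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length shards."""
    if len(a) != len(b):
        raise ValueError("shards must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def encode_2_1(d0: bytes, d1: bytes) -> List[bytes]:
    """Return [data shard 0, data shard 1, recovery shard]."""
    return [d0, d1, xor_bytes(d0, d1)]


def recover_2_1(shards: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild the single missing shard (marked as None) from the other two."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(shards) != 3 or len(missing) != 1:
        raise ValueError("expected 3 shards with exactly one missing")
    a, b = [s for s in shards if s is not None]
    rebuilt = list(shards)
    rebuilt[missing[0]] = xor_bytes(a, b)  # XOR of any two gives the third
    return rebuilt


stored = encode_2_1(b"AB", b"CD")
# Simulate losing the OSD that held the first data shard.
restored = recover_2_1([None, stored[1], stored[2]])
```

Whichever one of the three shards is lost, XOR-ing the two survivors reproduces it, which matches the slide's point that the default profile tolerates the loss of any one OSD.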
22
Setting a custom EC Profile
Use the ceph osd erasure-code-profile set command with the following options:
  Profile name
  K: number of data shards
  M: number of recovery shards (the number of failed units that can be tolerated)
  ruleset-failure-domain = CRUSH bucket level for failure
    OSD, host, rack etc. as defined in the CRUSH map
Example with k=8, m=2 and failure at host level:
ceph osd erasure-code-profile set example-profile k=8 m=2 ruleset-failure-domain=host
23
Command: ceph osd pool create
Syntax: ceph osd pool create <name> <pg> erasure <profile>
Same command used to create standard replication pools, but with the addition of the erasure option and a profile name (if no profile is named, the default profile is used)
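The profile and pool commands are normally run in that order: define the profile, then create the pool that uses it. The sketch below only assembles the two argument lists from the syntax shown on the slides and does not run anything. The helper names, the pool name `ecpool` and the placement group count of 128 are hypothetical, and the `ruleset-failure-domain` key is copied from the slide as-is, so check it against the release you are running.

```python
from typing import List


def profile_set_cmd(name: str, k: int, m: int, failure_domain: str) -> List[str]:
    """Build the argv for 'ceph osd erasure-code-profile set' as shown above."""
    return [
        "ceph", "osd", "erasure-code-profile", "set", name,
        f"k={k}", f"m={m}",
        f"ruleset-failure-domain={failure_domain}",  # key name taken from the slide
    ]


def pool_create_cmd(pool: str, pg: int, profile: str) -> List[str]:
    """Build the argv for 'ceph osd pool create <name> <pg> erasure <profile>'."""
    return ["ceph", "osd", "pool", "create", pool, str(pg), "erasure", profile]


# Step 1: the k=8, m=2 example profile from the slides.
step1 = profile_set_cmd("example-profile", 8, 2, "host")
# Step 2: a pool using that profile ('ecpool' and 128 are made-up values).
step2 = pool_create_cmd("ecpool", 128, "example-profile")
```

On a host with cluster access, each list could be passed to `subprocess.run(..., check=True)`. Because a pool's profile cannot be changed later, it is worth reviewing the assembled commands before running them.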
24
Section 7 Exercises