
Device-mapper RAID 4/5 target (dm-raid45) Linux-Kongress Nürnberg 2006 Heinz Mauelshagen Consulting Development Engineer Linux-Kongress 2006.


1 Device-mapper RAID 4/5 target (dm-raid45)
Linux-Kongress Nürnberg 2006
Heinz Mauelshagen, Consulting Development Engineer

2 Top
- Requirements
- Device-Mapper architecture/features
- dm-raid45 architecture/features
- Architecture overview
- Mapping table syntax
- dmsetup tool
- Project Status/Summary
- URLs

3 Requirements (1)
- Device-Mapper runtime
- dm-raid45:
  - Group N devices into RAID sets (N>2) so that each set can survive a single disk failure
  - Selectable allocation algorithms for data and parity (left/right, symmetric/asymmetric)
  - Stripe cache to cache data and parity
  - Calculate parity chunks using an xor algorithm
  - Recover parity/data in case of a single device failure
  - Initialize sets
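The xor parity and single-failure recovery requirements can be illustrated with a minimal Python sketch. It is not dm-raid45 code; the function names `xor_chunks` and `recover_chunk` and the tiny 4-byte chunks are assumptions made for illustration only.

```python
from functools import reduce

def xor_chunks(chunks):
    """Bytewise XOR of equally sized chunks (hypothetical helper)."""
    size = len(chunks[0])
    if any(len(c) != size for c in chunks):
        raise ValueError("all chunks must have the same size")
    # zip(*chunks) yields one tuple of byte values per byte position
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

def recover_chunk(surviving_chunks):
    """Rebuild the one missing chunk of a stripe from all surviving chunks.

    Because parity = D0 ^ D1 ^ ..., XOR-ing every surviving chunk
    (data and parity alike) yields the missing one.
    """
    return xor_chunks(surviving_chunks)

# A stripe with two data chunks and one parity chunk (3-device set)
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40"]
parity = xor_chunks(data)

# Simulate losing the device that holds data[1]
rebuilt = recover_chunk([data[0], parity])
assert rebuilt == data[1]
```

The same call recovers a lost parity chunk: pass all data chunks of the stripe and the result is the parity.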

4 Requirements (2)
- Target features (continued...):
  - Support failed device replacement
  - Selectable device to initialize
  - Selectable parity device with RAID 4
  - Provide RAID device/set properties
  - Provide device events (e.g. disk failure)
  - Decoupled IO size from chunk size

5 Data/Parity allocation (1)
RAID 5, left asymmetric algorithm (D = data chunk, P = parity chunk)
One row per device, one column per stripe:

  Device 0:  D0   D2   P45  ...
  Device 1:  D1   P23  D4   ...
  Device 2:  P01  D3   D5   ...

6 Data/Parity allocation (2)
RAID 5, right asymmetric algorithm (D = data chunk, P = parity chunk)
One row per device, one column per stripe:

  Device 0:  P01  D2   D4   ...
  Device 1:  D0   P23  D5   ...
  Device 2:  D1   D3   P45  ...

7 Data/Parity allocation (3)
RAID 5, left symmetric algorithm (D = data chunk, P = parity chunk)
One row per device, one column per stripe:

  Device 0:  D0   D3   P45  ...
  Device 1:  D1   P23  D4   ...
  Device 2:  P01  D2   D5   ...

8 Data/Parity allocation (4)
RAID 5, right symmetric algorithm (D = data chunk, P = parity chunk)
One row per device, one column per stripe:

  Device 0:  P01  D3   D4   ...
  Device 1:  D0   P23  D5   ...
  Device 2:  D1   D2   P45  ...

9 Data/Parity allocation (5)
RAID 4, dedicated selectable parity device (D = data chunk, P = parity chunk)
One row per device, one column per stripe:

  Device 0:  D0   D2   D4   ...
  Device 1:  D1   D3   D5   ...
  Device 2:  P01  P23  P45  ...

10 Data/Parity allocation (6)
RAID 5, right symmetric algorithm with five devices (D = data chunk, P = parity chunk)
One row per device, one column per stripe:

  Device 0:  P0-3  D7    D10    D13     D16     ...
  Device 1:  D0    P4-7  D11    D14     D17     ...
  Device 2:  D1    D4    P8-11  D15     D18     ...
  Device 3:  D2    D5    D8     P12-15  D19     ...
  Device 4:  D3    D6    D9     D12     P16-19  ...
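The allocation tables on the preceding slides can be reproduced with a short Python sketch. This is a reconstruction from the tables, not code taken from dm-raid45: the function names are hypothetical, the algorithm names reuse the raid_type keywords, and for RAID 4 the parity device is assumed to be the last one (the target allows selecting it).

```python
LEFT = ("raid5_la", "raid5_ls")
RIGHT = ("raid5_ra", "raid5_rs")
SYMMETRIC = ("raid5_ls", "raid5_rs")

def parity_device(stripe, n_devs, algorithm):
    """Index of the device holding parity for the given stripe."""
    if algorithm == "raid4":
        return n_devs - 1              # assumption: last device is the parity device
    if algorithm in LEFT:
        return (n_devs - 1) - (stripe % n_devs)   # starts last, moves left
    if algorithm in RIGHT:
        return stripe % n_devs                    # starts first, moves right
    raise ValueError(f"unknown algorithm: {algorithm}")

def stripe_layout(stripe, n_devs, algorithm):
    """Return the chunk label stored on each device for one stripe."""
    p = parity_device(stripe, n_devs, algorithm)
    data_per_stripe = n_devs - 1
    first = stripe * data_per_stripe
    row = [None] * n_devs
    row[p] = "P"
    for i in range(data_per_stripe):
        if algorithm in SYMMETRIC:
            dev = (p + 1 + i) % n_devs     # data starts right after parity and wraps
        else:
            dev = i if i < p else i + 1    # data fills devices in order, skipping parity
        row[dev] = f"D{first + i}"
    return row

# Print the first three stripes of a 3-device left symmetric set
for s in range(3):
    print(s, stripe_layout(s, 3, "raid5_ls"))
```

Each returned list is one column of the tables above, indexed by device.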

11 Device-mapper architecture (1)
- Device-Mapper is the generic mapping runtime platform in the Linux kernel (since 2.5)
- Can be used by multiple applications which need block device mapping (e.g. dmraid, LVM2 or EVMS)
- Manages Mapped-Devices (create, remove, ...)
- Handles text-formatted Mapping-Tables (load, unload, reload)
- All table loading actions happen online to reflect changed layouts (e.g. an MD size change)
- Code to handle mappings is factored out into Mapping-Targets, e.g. for linear, striped, mirrored, snapshot, multipath, zero, error and crypt mappings
- Maps to arbitrary block devices (e.g. (i)SCSI)

12 Device-mapper architecture (2)
- (continued...)
- Mapping-Targets (e.g. dm-raid45) are dynamically loadable and register with the Device-Mapper core
- Mapped-Devices can be stacked in order to build complex mappings (e.g. to create a RAID50 set)
- More than 2 Terabytes per mapped device in Linux 2.6 (CONFIG_LBD)
- Comes with a user space library (libdm), to be used by Device/Volume Management applications (e.g. dmraid, lvm2), and a test tool, dmsetup
- The library creates nodes to access Mapped-Devices in /dev/mapper/

13 Device-mapper architecture (3)
- Examples of Device-Mapper tables which can be activated using the dmsetup tool
  - 0 1024 linear /dev/hde1 40
    1024 2048 striped 2 64 /dev/hde1 1064 /dev/hdf1 0
  - 0 1024 zero
    1024 1000 error
  - 0 83886080 mirror core 1 64 2 /dev/sda 0 /dev/sdb 0
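Each table line has the form "start length target [target arguments...]", with start and length given in sectors. A small Python sketch shows how such a table could be split into segments and sanity-checked before it is handed to dmsetup. The helper names are hypothetical, and the check assumes that segments must begin at sector 0 and follow each other without gaps.

```python
def parse_table(text):
    """Split a device-mapper table into (start, length, target, args) tuples."""
    segments = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"malformed table line: {line!r}")
        start, length, target = int(fields[0]), int(fields[1]), fields[2]
        segments.append((start, length, target, fields[3:]))
    return segments

def check_contiguous(segments):
    """Verify the segments adjoin; return the mapped device size in sectors."""
    expected = 0
    for start, length, target, _args in segments:
        if start != expected:
            raise ValueError(
                f"{target} segment starts at {start}, expected {expected}")
        expected = start + length
    return expected

table = """
0 1024 linear /dev/hde1 40
1024 2048 striped 2 64 /dev/hde1 1064 /dev/hdf1 0
"""
segments = parse_table(table)
print(check_contiguous(segments), "sectors")
```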

14 dm-raid45 architecture (1)
- The target consists of:
  - Constructor/destructor of RAID sets
  - A mapping function called by the Device-Mapper core for each io
  - Suspend/resume of RAID set resynchronization for table reloads
  - Device status/mapping table output
  - Stripe cache
  - Dirty log to keep track of dirty regions of the set
  - Region hash to keep track of the writes active per region, so that the region can be updated in the dirty log when that count transitions to/from 0, and to create quiesced regions (those without io) for recovery
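The interplay of region hash and dirty log can be sketched in a few lines of Python. This is a simplified model, not the kernel data structure: the class and method names are hypothetical, the dirty log is reduced to an in-memory set, and a region is assumed to become clean as soon as its last active write finishes.

```python
class RegionHash:
    """Counts active writes per region and mirrors 0 <-> 1 transitions
    into a stand-in dirty log."""

    def __init__(self, region_size):
        self.region_size = region_size   # in sectors
        self.active_writes = {}          # region number -> writes in flight
        self.dirty_log = set()           # stand-in for the core/disk dirty log

    def region_of(self, sector):
        return sector // self.region_size

    def write_started(self, sector):
        region = self.region_of(sector)
        count = self.active_writes.get(region, 0)
        if count == 0:
            self.dirty_log.add(region)   # transition from 0: mark dirty
        self.active_writes[region] = count + 1

    def write_finished(self, sector):
        region = self.region_of(sector)
        count = self.active_writes[region] - 1
        if count == 0:
            del self.active_writes[region]
            self.dirty_log.discard(region)   # transition to 0: mark clean
        else:
            self.active_writes[region] = count

    def is_quiesced(self, region):
        """A region without io in flight is a candidate for recovery."""
        return region not in self.active_writes

rh = RegionHash(region_size=8192)
rh.write_started(100)     # region 0 becomes dirty
rh.write_started(200)     # second write in region 0, log unchanged
rh.write_finished(100)
print(rh.dirty_log, rh.is_quiesced(0))
```

Only the first and the last write of a burst touch the log, which is the point of counting writes per region.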

15 dm-raid45 architecture (2)
- The target consists of (continued...):
  - Main IO thread (i.e. a kernel work queue)
  - Handling of stripes with end ios
  - Merging of write data into stripes
  - Calculation of parity chunks (N^^2 * PAGE_SIZE)
  - Flushing of stripes with dirty or not up-to-date chunks
  - Degraded mode handling (i.e. one device in the set is dead)

16 dm-raid45 architecture (3)
- Constructor
  - Parses the arguments of the mapping table line
  - Allocates memory for a RAID set structure, device structures and the stripe cache for the set
  - Starts a kernel thread (i.e. a work queue) to handle the IO on the set, which it takes from an input queue filled by the mapping function

17 dm-raid45 architecture (4)
- Destructor
  - Stops any ongoing recovery
  - Destroys the work queue
  - Frees the stripe cache, the device and RAID set structures

18 dm-raid45 architecture (5)
- Mapping function
  - Checks for read-aheads and rejects them in order to save stripe cache space
  - Takes io (i.e. a bio) from the Device-Mapper core and pushes it onto the input queue
  - Wakes the main thread to handle the IO

19 dm-raid45 architecture (6)
- Main IO thread
  - Updates the region hash and the dirty log
  - Starts recovery using the region hash and dirty log, which 'know' about areas to recover
  - Pops bios off the input queue and tries to associate them with available stripes (if none is available, pushes them back onto the input queue)
  - Calculates missing chunks in degraded mode
  - Ends io on all bios on up-to-date/clean stripes
  - Calculates parity chunks for dirty stripes
  - Starts IO on all non-up-to-date/dirty stripes
  - Recovers missing chunks in degraded mode
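The hand-off between the mapping function and the main IO thread can be modelled with a plain queue. The Python sketch below covers only the queueing steps (reject read-ahead, push, pop, push back when no stripe is available). Bios are represented as dicts, and `STRIPE_SECTORS`, `map_bio` and `worker_pass` are hypothetical names; the real target works on kernel bio structures and a work queue.

```python
from collections import deque

STRIPE_SECTORS = 128   # hypothetical amount of data sectors per stripe

def map_bio(input_queue, bio):
    """Mapping function: reject read-ahead, otherwise queue the bio."""
    if bio.get("readahead"):
        return False                 # saves stripe cache space
    input_queue.append(bio)
    return True                      # the caller would now wake the worker

def worker_pass(input_queue, cache, cache_size):
    """One pass of the main IO thread over the input queue.

    cache maps a stripe number to the bios attached to that stripe.
    Bios that cannot get a stripe go back onto the input queue.
    """
    handled, deferred = [], deque()
    while input_queue:
        bio = input_queue.popleft()
        stripe = bio["sector"] // STRIPE_SECTORS
        if stripe in cache or len(cache) < cache_size:
            cache.setdefault(stripe, []).append(bio)
            handled.append(bio)
        else:
            deferred.append(bio)     # no stripe available right now
    input_queue.extend(deferred)
    return handled

queue, cache = deque(), {}
for sector in (0, 130, 300):
    map_bio(queue, {"sector": sector})
map_bio(queue, {"sector": 64, "readahead": True})   # rejected

done = worker_pass(queue, cache, cache_size=2)
print(len(done), "handled,", len(queue), "deferred")
```

With a cache of two stripes the third bio stays queued until a stripe is released, which is the back-pressure the slide describes.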

20 Architecture Overview
[Diagram. Userspace: lvm2, dmraid, dmsetup, kpartx, ... on top of libdm. Kernelspace: dm ioctl, dm core, target (e.g. dm-raid45), lld (one lld marked "optionally").]
dm = device-mapper core
lld = low level driver

21 Mapping table syntax
start length raid45 dl_type dl_param [dl_parm...] raid_type \
    raid_params [raid_parm...] #_devices dev_to_init [dev_path offset]{3,}

  start       - start offset on mapped device
  length      - segment length on mapped device
  raid45      - keyword to select RAID4/5 mapping target
  dl_type     - dirty log (dl) type (i.e. core, disk)
  dl_param    - # of dl parameters (1-2 for 'core', 2-3 for 'disk')
  dl_parm     - dl device (only with 'disk' type), region size and optional sync/nosync keyword to synchronize the set completely or prevent synchronization at all
  raid_type   - raid4, raid5_la, raid5_ls, raid5_ra, raid5_rs
  raid_params - # of RAID parameters (0-5)
  raid_parm   - chunk size, amount of stripes, io size, recovery io size and recovery bandwidth
  #_devices   - number of RAID devices in set [3...N]
  dev_to_init - index of device to initialize [0...(N-1)]
  dev_path    - RAID device path
  offset      - offset into RAID device
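To make the field order concrete, here is a small Python sketch that assembles a raid45 table line from its parts and applies the limits listed above. The function name and its argument layout are hypothetical; only the field order, keywords and ranges come from the syntax description.

```python
RAID_TYPES = ("raid4", "raid5_la", "raid5_ls", "raid5_ra", "raid5_rs")
DL_PARM_RANGE = {"core": (1, 2), "disk": (2, 3)}   # allowed # of dl parameters

def raid45_table_line(start, length, dl_type, dl_parms,
                      raid_type, raid_parms, dev_to_init, devices):
    """Build one raid45 mapping table line.

    devices is a list of (dev_path, offset) pairs; the parameter counts
    (dl_param, raid_params, #_devices) are derived from the list lengths.
    """
    if dl_type not in DL_PARM_RANGE:
        raise ValueError(f"unknown dirty log type: {dl_type}")
    low, high = DL_PARM_RANGE[dl_type]
    if not low <= len(dl_parms) <= high:
        raise ValueError(f"'{dl_type}' log takes {low}-{high} parameters")
    if raid_type not in RAID_TYPES:
        raise ValueError(f"unknown raid type: {raid_type}")
    if len(raid_parms) > 5:
        raise ValueError("at most 5 RAID parameters")
    if len(devices) < 3:
        raise ValueError("a RAID set needs at least 3 devices")
    if not 0 <= dev_to_init < len(devices):
        raise ValueError("dev_to_init must index a device of the set")

    fields = [start, length, "raid45",
              dl_type, len(dl_parms), *dl_parms,
              raid_type, len(raid_parms), *raid_parms,
              len(devices), dev_to_init]
    for path, offset in devices:
        fields += [path, offset]
    return " ".join(str(f) for f in fields)

line = raid45_table_line(
    0, 286727968, "core", [8192, "nosync"], "raid5_la", [], 0,
    [("/dev/sdc1", 0), ("/dev/sdd1", 0), ("/dev/sde1", 0)])
print(line)
```

Deriving the counts from the list lengths avoids the most likely mistake when writing such lines by hand, a count that does not match the number of parameters that follow it.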

22 Mapping table examples
- A RAID set with ~136GB, core dirty log, 8192 sectors region size, synchronization prohibited, RAID5 left asymmetric mapping, 0 RAID parameters, 3 RAID devices with the first one to initialize (noop):

  0 286727968 raid45 core 2 8192 nosync raid5_la 0 3 0 /dev/sdc1 0 /dev/sdd1 0 /dev/sde1 0

- A RAID set with ~780GB, disk dirty log, 262144 sectors (128MB) region size, synchronization required, RAID5 right asymmetric mapping, 2 RAID parameters (64 sectors chunk size and 256 stripes in cache), 3 RAID devices with the last one to initialize:

  0 1562845536 raid45 disk 3 /dev/system/5disks_usb_log 262144 sync raid5_ra \
      2 64 256 3 2 /dev/sda 0 /dev/sdb 0 /dev/sdc 0

23 dmsetup tool
- dmsetup create mapped_device file
- dmsetup status mapped_device
- dmsetup remove mapped_device
- ...

E.g.:
  dmsetup create r5 r5_3disks
  mkfs -t xfs -f /dev/mapper/r5
  mount /dev/mapper/r5 /mnt/r5
  ...
  umount /mnt/r5
  dmsetup remove r5
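When the sequence above is scripted, it helps to build the commands first and inspect them before anything is executed. The Python sketch below is a dry run that only prints the commands. The function name is hypothetical, and the commands are exactly those of the example; actually running them requires root and destroys data on the underlying devices.

```python
import shlex

def lifecycle_commands(name, table_file, mount_point, fs_type="xfs"):
    """Commands to create, format, mount, unmount and remove a mapped device."""
    dev = f"/dev/mapper/{name}"
    return [
        ["dmsetup", "create", name, table_file],   # activate the table
        ["mkfs", "-t", fs_type, "-f", dev],        # put a filesystem on it
        ["mount", dev, mount_point],
        ["umount", mount_point],
        ["dmsetup", "remove", name],               # tear the mapping down
    ]

# Dry run: print instead of executing
for cmd in lifecycle_commands("r5", "r5_3disks", "/mnt/r5"):
    print(shlex.join(cmd))
```

To execute a step, each list could be passed to `subprocess.run(cmd, check=True)`; keeping the commands as lists avoids shell quoting problems with unusual device names.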

24 Status and Future Directions
- Device-Mapper in Linux 2.6 Mainline -> Distributors are shipping it
- dm-raid45 as patch on my people page
- Experimental support for dm-raid45 in dmraid
- Planned support for dm-raid45 in LVM2

25 Summary
- dm-raid45 adds a higher level of resilience to device-mapper by supporting redundant RAID4 and RAID5 mappings

26 URLs
- http://sources.redhat.com/dm (Device-Mapper tool+library)
- http://people.redhat.com/heinzm/sw/dm/dm-raid45/
- http://www.redhat.com/mailman/listinfo/dm-devel to subscribe to dm-devel@redhat.com
- http://www.redhat.com/mailman/listinfo/ataraid-list to subscribe to ataraid-list@redhat.com

27 Q&A

