iSCSI guides and suggestions. For most implementations
iSCSI: What it is
Internet Small Computer System Interface (iSCSI) is an IP-based storage networking standard for linking data storage facilities. It carries SCSI commands over IP networks, typically using TCP ports 860 and 3260. The ReadyDATA is the remote end.
iSCSI requires an initiator and a LUN.
An initiator can be software, which is what is needed to implement iSCSI with our devices. You can think of this software as a method to bus data from a remote target (LUN). Its code contains a kernel-resident device driver that uses the existing NIC and network stack to emulate SCSI.
An initiator can also be hardware (not used or covered in this document).
The remote end must also speak the iSCSI protocol.
The concept is that a computer running an initiator will initiate a connection to a host which has been set up with a LUN as target. These connections can be insecure, or secure using CHAP.
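As a rough illustration of what "secure using CHAP" means, the sketch below computes a CHAP response the way RFC 1994 defines it for the MD5 algorithm: a hash over the identifier byte, the shared secret, and the challenge. The secret, identifier, and function names are hypothetical, and this is not the ReadyDATA or any initiator's actual code; it only shows that the secret itself never crosses the network.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """Return MD5(identifier byte + secret + challenge), as in RFC 1994."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def target_verifies(identifier: int, secret: bytes,
                    challenge: bytes, response: bytes) -> bool:
    """The target recomputes the hash with its own copy of the secret."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


# Hypothetical values, for illustration only.
secret = b"example-shared-secret"
identifier = 1
challenge = os.urandom(16)  # the target sends a random challenge

# The initiator answers with a hash, never with the secret itself.
response = chap_response(identifier, secret, challenge)

print(target_verifies(identifier, secret, challenge, response))           # True
print(target_verifies(identifier, b"wrong-secret", challenge, response))  # False
```

Because the challenge is random for each login, a captured response cannot simply be replayed against a later challenge.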
iSCSI Dedup
Deduplication on iSCSI is strongly discouraged, as it will adversely affect your system's performance. The reason for this degradation is that iSCSI uses strictly 8k blocks instead of the variable block size ZFS uses (128k average). Keep in mind that memory overhead goes up in proportion to the decrease in block size. Deduplication requires creating a deduplication store, which uses a huge amount of RAM.
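The proportion can be made concrete with a little arithmetic. The sketch below counts how many blocks a dedup table would have to track for 1 TiB of data at the two block sizes quoted above; it assumes one table entry per block, which is a simplification, and the names are illustrative only.

```python
KIB = 1024
TIB = 1024 ** 4


def blocks_per_tib(block_size_bytes: int) -> int:
    """Number of blocks needed to hold 1 TiB at the given block size."""
    return TIB // block_size_bytes


iscsi_blocks = blocks_per_tib(8 * KIB)    # 8k blocks, as used for iSCSI
zfs_blocks = blocks_per_tib(128 * KIB)    # 128k, the ZFS average quoted above

print(iscsi_blocks)                # 134217728
print(zfs_blocks)                  # 8388608
print(iscsi_blocks // zfs_blocks)  # 16
```

Sixteen times as many blocks means sixteen times as many entries to keep track of for the same amount of data, which is why the small block size is so costly once dedupe is turned on.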
iSCSI uses a small block size (8k) so that it can be managed by many different file systems. Because it is more granular, other overlaying file systems will align well. This comes at the cost of an increased load on the CPU, and because the deduplication store tables consume a great amount of memory, deduplication should be avoided with iSCSI.
iSCSI Dedup
The numbers are not exact and will vary based on the data being handled in the LUN. A good rule of thumb is to double the normal dedup RAM requirement of 5GB per 1TB of data to more than 10GB of RAM for every 1TB of data.
If the LUN is created for use with ESX or other VMs, DO NOT USE DEDUPLICATION ON THE READYDATA; use the VM software's dedupe instead!
Some of the attributes that can be assigned to LUNs, like compression and deduplication, can work for or against you, particularly in a SAN implementation.
Visualizing DEDUPLICATION
Deduplication
In the next slides we will look at how the data is managed by the inodes and how it is written to the disks, both with Dedupe ON and OFF. Deduplication (dedup for short) is a method to remove redundant data blocks from a data set. Only unique blocks are written to disk; any block identical to an existing block is only referenced as a component of multiple data sets. This means the capacity of the disk is dedicated to unique and distinct blocks of data, while the inodes keep files structured correctly by using components shared across many inodes. The result is a dramatic increase in efficiency.
(Legend: the slides use one symbol for inodes and another for blocks; the symbols are not reproduced here.)
Let's look at two formatted drives...
(Figure: two formatted drives, with labels "These are our sectors" and "These are our inodes".)
Dedupe OFF...
As the data flows in, it is simply written as it comes in.
(Figure: memory gauge.)
Dedupe OFF...
All data is written to the array in its original form, and in this example our drive has filled up. We need to expand our volume to store more data. Note the low use of memory.
(Figure: memory gauge.)
Dedupe ON...
Only unique blocks are written to the array, leaving much, much more capacity available for more unique blocks. Because a reference to every unique block is kept in memory in order to identify repeating blocks, dedupe uses a lot of memory. A single stored block is referenced by every file that requires that particular block of data.
(Figure: memory gauge.)
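The behavior described for Dedupe ON can be mimicked with a toy block store. This is only a sketch for intuition, not how ZFS or the ReadyDATA implements deduplication: the class name, the tiny 4-byte block size, and the use of SHA-256 as the block fingerprint are all assumptions made for the example.

```python
import hashlib


class DedupStore:
    """Toy model: unique blocks stored once, files keep references to them."""

    def __init__(self, block_size: int = 4):
        self.block_size = block_size
        self.blocks = {}  # fingerprint -> block data (the "disk" plus lookup table)
        self.files = {}   # file name -> list of fingerprints (the "inode")

    def write(self, name: str, data: bytes) -> None:
        refs = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # Store the block only if this fingerprint has not been seen before.
            self.blocks.setdefault(digest, block)
            refs.append(digest)
        self.files[name] = refs

    def read(self, name: str) -> bytes:
        return b"".join(self.blocks[d] for d in self.files[name])


store = DedupStore(block_size=4)
store.write("a.txt", b"AAAABBBBAAAA")
store.write("b.txt", b"BBBBCCCC")

logical_blocks = sum(len(refs) for refs in store.files.values())
print(logical_blocks)       # 5 blocks referenced by the two files
print(len(store.blocks))    # 3 unique blocks actually stored
print(store.read("a.txt"))  # b'AAAABBBBAAAA'
```

The lookup table is the part that costs memory: it grows by one entry for every unique block, so the more unique data there is, the larger it becomes.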
(Figure: memory use compared side by side, Dedupe ON versus Dedupe OFF.)
Deduplication does use more compute cycles and a lot more RAM… A LOT of RAM!!! Particularly when there is lots and lots of unique data.
The ReadyDATA 5200 has 16GB of ECC (error correcting) memory. ReadyDATA uses ZFS, which has a max file size of 16 exabytes. The maximum number of files is about 2.8e+14, or 281,474,976,710,656 files. So you will not be running out of inodes any time soon, but memory is limited.
The amount of memory needed is around 5GB for every 1TB of data.
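To see how quickly 16GB runs out, the rules of thumb from this deck (about 5GB of RAM per 1TB of deduplicated data, doubled to 10GB per 1TB for an iSCSI LUN) can be put into a small estimator. The function names are hypothetical, the 10GB figure is treated as a lower bound since the deck says "more than", and the sketch ignores the memory the system needs for everything else.

```python
GB_PER_TB_DEFAULT = 5   # rule of thumb for normal dedup
GB_PER_TB_ISCSI = 10    # doubled for iSCSI LUNs (a lower bound)
SYSTEM_RAM_GB = 16      # ReadyDATA 5200


def dedup_ram_gb(data_tb: float, iscsi: bool = False) -> float:
    """Estimated RAM, in GB, for the dedup table of data_tb of data."""
    per_tb = GB_PER_TB_ISCSI if iscsi else GB_PER_TB_DEFAULT
    return data_tb * per_tb


def fits_in_ram(data_tb: float, iscsi: bool = False,
                ram_gb: float = SYSTEM_RAM_GB) -> bool:
    """True if the estimated dedup table alone fits in the given RAM."""
    return dedup_ram_gb(data_tb, iscsi) <= ram_gb


for tb in (1, 2, 4):
    print(tb, "TB:",
          dedup_ram_gb(tb), "GB, fits:", fits_in_ram(tb),
          "| iSCSI:", dedup_ram_gb(tb, iscsi=True), "GB, fits:",
          fits_in_ram(tb, iscsi=True))
```

By this estimate, 2TB of deduplicated iSCSI data already asks for more memory than the unit has, before anything else running on the system is counted.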
USE DEDUPLICATION WISELY
It is UNWISE to dedup an iSCSI target.
Thick or Thin LUN
The choice between a thin and a thick LUN has a definite impact on how memory is used if one decides to use dedupe on a LUN.
Thick: thick provisioning reserves the entered Size from the volume immediately.
Thin: thin provisioning does not reserve the entered Size from the volume at all, but once the target is mounted it reports the entered size to the host FS.
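The difference between the two provisioning modes can be modeled in a few lines. This is a toy sketch with invented names (Volume, create_lun), not the ReadyDATA interface: a thick LUN takes its full size out of the volume at creation time, while a thin LUN takes nothing up front, and both report the entered size to the host.

```python
class Volume:
    """Toy volume that tracks how much space LUNs have reserved."""

    def __init__(self, capacity_gb: int):
        self.capacity_gb = capacity_gb
        self.reserved_gb = 0

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.reserved_gb

    def create_lun(self, size_gb: int, thick: bool) -> dict:
        if thick:
            # Thick: reserve the entered size immediately.
            if size_gb > self.free_gb:
                raise ValueError("not enough free space for a thick LUN")
            self.reserved_gb += size_gb
        # Thin: reserve nothing now. Either way, the host sees the entered size.
        return {"reported_size_gb": size_gb, "thick": thick}


vol = Volume(capacity_gb=1000)

thick_lun = vol.create_lun(400, thick=True)
print(vol.free_gb)                    # 600

thin_lun = vol.create_lun(400, thick=False)
print(vol.free_gb)                    # 600, nothing reserved for the thin LUN
print(thin_lun["reported_size_gb"])   # 400
```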
Thick or Thin LUN
Thick provisioning, in terms of iSCSI, would create an enormous dedupe table in memory. If dedupe is absolutely desired on the LUN, create it as thin provisioned.
Also, if you decide to use dedupe after all, we strongly advise you to use a pair of read cache SSD drives.
ReadyDATA Disk Packs
To meet the differing needs of specific applications, ReadyDATA 5200 users can mix and match SATA, near-line SAS, SAS, and SSD drives within volumes to boost performance and provide flexible capacity.
NOTE:
ReadyDATA supports SATA, NL-SAS, SAS and SSD drives.
Optimize the ratio of performance to capacity by mixing drive types within a volume.
Only ReadyDATA disk packs from NETGEAR are recognized by ReadyDATA 5200.
ReadyDATA storage is available as a diskless chassis (RD5200) or in a pre-populated 12TB SATA configuration (RD521210). Only pre-certified disk packs from NETGEAR are compatible with ReadyDATA storage devices. For your convenience, NETGEAR offers a wide variety of disk types, capacities and speeds.
SATA, Nearline SAS, SAS (LFF), SSD (SFF)