Latest trends and technologies in Storage Networking By: Gururaja Nittur Advisor: Dr. Chung E Wang Second Reader: Dr. Du Zhang
Scope of the Project: Study the new technologies in the storage networking arena (Fibre Channel protocol; NAS, SAN and iSCSI; storage virtualization; high availability). Demonstrate high availability by writing a DMP (Dynamic Multi Pathing) driver for Solaris.
What is Storage Networking? “The practice of creating, installing, administering, or using networks whose primary purpose is the transfer of data between computer systems and storage elements and among storage elements”
Why Storage Networks? Explosive growth of business data: “The total amount of data being stored doubles every year. Also, more than 90% of companies today would fail to survive a catastrophic data loss. Businesses face a mission-critical need to protect, access, and manage their ever-growing volume of storage assets.” Internet and multimedia. High availability. Management complexity.
Why Storage Networks? (Contd..)
Fibre Channel (FC): A serial, high-speed data transfer technology. Open standard, defined by ANSI and OSI. Data rate up to 100 MB/sec (200 MB/sec full-duplex). Supports the most important higher-level protocols, such as IP, ATM and SCSI. Does not have its own command set, but facilitates data transfers between individual FC devices.
Parallel Transmission: A set of data signals is sent simultaneously over 8, 16 or even more wires. Problems with parallel transmission: data sent simultaneously over all the wires has to be received simultaneously as well. Total time = t + dt, where ‘t’ is the time taken for the signals to reach the receiver and ‘dt’ is the additional delay due to hardware inconsistencies. ‘dt’ increases with cable length, which forces a lower signaling frequency. Example: SCSI. Bus length limitations; maximum bus speed is limited (~40 MB/sec in Ultra SCSI); limited device count.
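The relation Total time = t + dt can be illustrated with a small sketch. The delay and skew constants below are made-up values chosen only for illustration, and the rule that each word must settle for dt before the next one is sent is a simplifying assumption, not a model of any real SCSI bus.

```python
# Illustrative only: the numeric constants are assumptions, not measured values.
PROPAGATION_NS_PER_M = 5.0   # assumed signal delay per metre of cable
SKEW_NS_PER_M = 0.5          # assumed worst-case wire-to-wire skew per metre


def total_time_ns(cable_m):
    """Total time = t + dt for one parallel word."""
    t = PROPAGATION_NS_PER_M * cable_m   # time for the signals to reach the receiver
    dt = SKEW_NS_PER_M * cable_m         # extra wait for the slowest wire
    return t + dt


def max_word_rate_mhz(cable_m):
    """Upper bound on the word rate if each word must settle for dt."""
    dt = SKEW_NS_PER_M * cable_m
    return 1000.0 / dt                   # 1/dt, converted from 1/ns to MHz


for length in (1.0, 3.0, 6.0):
    print(f"{length:4.1f} m: total {total_time_ns(length):5.1f} ns, "
          f"skew-limited rate {max_word_rate_mhz(length):7.1f} MHz")
```

Under these assumptions the skew-limited rate falls as the cable gets longer, which is the trade-off between bus length and bus speed described on this slide.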
Serial Transmission: Serial transmission uses a single cable. All signals are delayed the same and arrive at the receiver in the same order in which they were sent. Longer bus lengths are possible. Examples: SSA (Serial Storage Architecture by IBM), Fibre Channel.
Current Technology: DAS (Direct Attached Storage) using SCSI
Emerging Technologies: Network Attached Storage (NAS), Storage Area Networks (SAN), Storage over IP (iSCSI)
Network Attached Storage
Network Attached Storage: The storage device has a built-in network interface. A NAS unit can be plugged directly into the network to allow quick and easy access. Standard network protocols such as CIFS and NFS can be used to share data.
Network Attached Storage: The NAS engine is usually SCSI for low-end systems, for cost reasons, and Fibre Channel for enterprise systems. NAS is easy to install and relatively easy to maintain. The network is used exclusively for data transfer, causing additional overhead. Backup over the LAN adds significant overhead.
Storage Area Networks
Storage Area Networks: As much as 60% of the traffic on a standard corporate network is made up of housekeeping actions such as backup. The growth of Storage Area Networks has been fuelled significantly by the desire to get this housekeeping off the network. The primary interface for SAN infrastructure is Fibre Channel.
Storage Area Networks: SAN provides excellent performance and easier management. SAN implementations are expensive due to hardware costs. Better resource sharing could make up for the initial investment. SAN is very flexible in that more storage and servers can be added easily.
iSCSI: Motivation: Gigabit Ethernet. iSCSI is a draft standard protocol that encapsulates SCSI commands in TCP/IP packets. Can be used to build IP-based SANs.
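The idea of encapsulation can be sketched as follows. The READ(10) command block uses the standard 10-byte SCSI layout, but the wrapper header (a task tag plus a length field) is a hypothetical toy format invented for this illustration. It is not the real iSCSI PDU layout, and `encapsulate` and `decapsulate` are made-up helper names.

```python
import struct

READ_10 = 0x28  # SCSI READ(10) operation code


def build_read10_cdb(lba, blocks):
    """Build a 10-byte SCSI READ(10) command descriptor block."""
    # opcode, flags, 32-bit LBA, group number, 16-bit transfer length, control
    return struct.pack(">BBIBHB", READ_10, 0, lba, 0, blocks, 0)


def encapsulate(cdb, task_tag):
    """Wrap a CDB in a toy header (NOT the real iSCSI PDU layout)."""
    # hypothetical header: 32-bit task tag followed by 16-bit payload length
    header = struct.pack(">IH", task_tag, len(cdb))
    return header + cdb


def decapsulate(payload):
    """Recover the task tag and the CDB on the receiving side."""
    task_tag, length = struct.unpack(">IH", payload[:6])
    return task_tag, payload[6:6 + length]


# The initiator would write `payload` to a TCP connection; the target
# would read it back and hand the CDB to its SCSI layer.
payload = encapsulate(build_read10_cdb(lba=2048, blocks=8), task_tag=1)
tag, cdb = decapsulate(payload)
```

Because the SCSI command travels as ordinary TCP payload, any IP network can carry it, which is what makes IP-based SANs possible.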
Storage Virtualization: The research firm Gartner Group estimated that 80% of storage costs go toward managing the installed storage. Switch and array management becomes very difficult as the amount of storage hardware grows. Virtualization provides a logical view and eases management. Examples: Veritas Volume Manager, IBM Tivoli, etc.
Future of network storage: SAN islands connected by IP networks. Network Unified Storage (NUS). NAS + SAN on Gigabit Ethernet networks.
High Availability [Figure labels: Host, Disk; Single Point Failure vs. Multi Pathing]
Dynamic Multi Pathing: Increased disk availability. Load balancing. Identifies disks uniquely from different hosts.
Dynamic Multi Pathing [Figure: Host connected to Disk; /dev/rdsk entries: c1t1d0, c2t1d0, …, cnt1d0]
Implementation Details: Scan the disks listed in /dev/rdsk. If no UUID is present, generate a unique UUID and stamp it in the disk’s private region. Add the device to a hash table keyed on UUID. Load this table into the kernel and write the ioctls to update this information. Use an algorithm (currently round robin) to load balance the I/O requests. If a path is bad for more than five I/O attempts, mark it bad and exclude it from path selection.
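The steps above can be sketched in user-space Python. This is a minimal model under stated assumptions, not the Solaris driver itself: `Path` and `MultiPathTable` are hypothetical names, UUIDs are passed in rather than read from a disk's private region, and failures are counted in total rather than consecutively.

```python
from collections import defaultdict

MAX_FAILURES = 5  # a path is marked bad after more than five failed I/O attempts


class Path:
    def __init__(self, device):
        self.device = device   # a /dev/rdsk entry name, e.g. "c1t1d0"
        self.failures = 0
        self.bad = False


class MultiPathTable:
    """Groups device paths by disk UUID and selects paths round robin."""

    def __init__(self):
        self.paths = defaultdict(list)      # UUID -> list of Path
        self.next_index = defaultdict(int)  # UUID -> next round-robin slot

    def add_path(self, uuid, device):
        self.paths[uuid].append(Path(device))

    def select_path(self, uuid):
        """Return the next healthy path for the disk, or None if none is usable."""
        candidates = self.paths.get(uuid, [])
        for _ in range(len(candidates)):
            i = self.next_index[uuid] % len(candidates)
            self.next_index[uuid] = i + 1
            if not candidates[i].bad:
                return candidates[i]
        return None

    def report_failure(self, path):
        """Count a failed I/O and retire the path once the limit is exceeded."""
        path.failures += 1
        if path.failures > MAX_FAILURES:
            path.bad = True


table = MultiPathTable()
table.add_path("disk-uuid-1", "c1t1d0")
table.add_path("disk-uuid-1", "c2t1d0")
first = table.select_path("disk-uuid-1")   # alternates between the two paths
```

Keying the table on UUID is what lets two different /dev/rdsk entries be recognized as the same physical disk, so I/O can be spread across them and moved off a failed path.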
Implementation Details: User code: Read the /dev/rdsk directory and generate a hashed list of available disks. Load this list into the kernel. Also provide APIs to push newly added or removed disks. Kernel code: A filter driver to choose the best path, and ioctls to do the following: LOAD_DISKS, NEW_DISK, MODIFY_DISK, GET_DISK_HANDLE, …
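The split between user code and kernel code can be modelled with a small command dispatcher. Only the four command names come from the slide. Their arguments and behaviour below are assumptions made for illustration, and `DmpControl` is a hypothetical user-space stand-in for the kernel driver rather than real ioctl code.

```python
class DmpControl:
    """User-space model of the driver's control interface (semantics assumed)."""

    def __init__(self):
        self.disks = {}  # UUID -> list of device paths
        self.handlers = {
            "LOAD_DISKS": self._load_disks,
            "NEW_DISK": self._new_disk,
            "MODIFY_DISK": self._modify_disk,
            "GET_DISK_HANDLE": self._get_disk_handle,
        }

    def ioctl(self, command, **args):
        """Dispatch a command by name, as an ioctl entry point would."""
        if command not in self.handlers:
            raise ValueError("unknown command: " + command)
        return self.handlers[command](**args)

    def _load_disks(self, table):
        # assumed: replace the whole table with the list built by user code
        self.disks = {uuid: list(paths) for uuid, paths in table.items()}
        return len(self.disks)

    def _new_disk(self, uuid, paths):
        # assumed: register a disk discovered after the initial load
        self.disks[uuid] = list(paths)

    def _modify_disk(self, uuid, paths):
        # assumed: update the path list of an already known disk
        if uuid not in self.disks:
            raise KeyError(uuid)
        self.disks[uuid] = list(paths)

    def _get_disk_handle(self, uuid):
        # assumed placeholder: return the first known path for the disk
        return self.disks[uuid][0]


driver = DmpControl()
loaded = driver.ioctl("LOAD_DISKS", table={"disk-uuid-1": ["c1t1d0", "c2t1d0"]})
driver.ioctl("NEW_DISK", uuid="disk-uuid-2", paths=["c1t2d0"])
handle = driver.ioctl("GET_DISK_HANDLE", uuid="disk-uuid-1")
```

Keeping disk discovery in user code and path selection in the kernel means the kernel side only needs to accept table updates and answer path lookups.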
Questions??