Smart Storage and Linux
An EMC Perspective
Ric Wheeler
Why Smart Storage?
- Central control of critical data
  - One central resource to fail over to in disaster planning
  - Banks, trading floors and airlines want zero downtime
- Smart storage is shared by all hosts & OSes
  - Amortize the costs of high availability and disaster planning over all of your hosts
  - Use different OSes for different jobs (UNIX for the web, IBM mainframes for data processing)
  - Zero-time “transfer” from host to host when both are connected
  - Enables cluster file systems
Data Center Storage Systems
- Change the way you think of storage
  - Shared Connectivity Model
  - “Magic” Disks
  - Scales to new capacity
  - Storage that runs for years at a time
- Symmetrix case study
  - Symmetrix 8000 Architecture
  - Symmetrix Applications
- Data center class operating systems
Traditional Model of Connectivity
- Direct Connect
  - Disk attached directly to host
  - Private: the OS controls access and provides security
  - Storage I/O traffic only
    - Separate system used to support network I/O (networking, web browsing, NFS, etc.)
Shared Models of Connectivity
- VMS Cluster
  - Shared disk & partitions
  - Same OS on each node
  - Scales to dozens of nodes
- IBM Mainframes
  - Shared disk & partitions
  - Same OS on each node
  - Handful of nodes
- Network Disks
  - Shared disk/private partition
  - Same OS
  - Raw/block access via network
  - Handful of nodes
New Models of Connectivity
- Every host in a data center could be connected to the same storage system
  - Heterogeneous OS & data format (CKD & FBA)
  - Management challenge: no central authority to provide access control
[Diagram: shared storage connected to hosts running IRIX, DG/UX, FreeBSD, MVS, VMS, Linux, Solaris, HP-UX and NT]
Magic Disks
- Instant copy
  - Devices, files or databases
- Remote data mirroring
  - Metropolitan area
  - 100’s of kilometers
- 1000’s of virtual disks
  - Dynamic load balancing
- Behind the scenes backup
  - No host involved
Scalable Storage Systems
- Current systems support 10’s of terabytes
- Dozens of SCSI, Fibre Channel, ESCON channels per host
- Highly available (years of run time)
- Online code upgrades
- Potentially 100’s of hosts connected to the same device
- Support for chaining storage boxes together locally or remotely
Symmetrix Architecture
- 32 PowerPC 750-based “directors”
- Up to 32 GB of central “cache” for user data
- Support for SCSI, Fibre Channel, ESCON, …
- 384 drives (over 28 TB with 73 GB units)
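The "over 28 TB" figure can be checked from the drive count and drive size on the slide. The sketch below assumes decimal units (1 TB = 1000 GB) and reports raw capacity only; it does not account for mirroring, parity or spares, none of which the slide quantifies. The function name is illustrative.

```python
# Raw capacity check for the drive count and drive size quoted on the slide.
DRIVES = 384      # maximum drive count from the slide
DRIVE_GB = 73     # per-drive size from the slide, assumed to be decimal GB


def raw_capacity_tb(drives: int, drive_gb: int) -> float:
    """Return raw (unprotected) capacity in decimal terabytes."""
    return drives * drive_gb / 1000


if __name__ == "__main__":
    # 384 * 73 = 28,032 GB, which is just over 28 TB.
    print(f"{raw_capacity_tb(DRIVES, DRIVE_GB):.3f} TB raw")
```

Usable capacity would be lower once any protection scheme is applied.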
Linux Wish List: Lots of Devices
- Customers can use hundreds of targets and LUNs (logical volumes)
  - 128 SCSI devices per system is too few
- Better naming system to track lots of disks
- Persistence for “not ready” devices in the name space would help some of our features
- devfs solves some of this
  - Rational naming scheme
  - Potential for tons of disk devices (need SCSI driver work as well)
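One way to picture the naming wish is a table keyed by a stable device identifier, so that a disk keeps its name while it is "not ready" and gets the same name back when it returns. The sketch below is purely illustrative: `DeviceRegistry`, the `disk<N>` name format and the `serial-...` identifiers are hypothetical and are not devfs, kernel or EMC interfaces.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class DeviceRegistry:
    """Hypothetical name table: stable device id -> persistent name."""
    names: Dict[str, str] = field(default_factory=dict)   # stable id -> name
    ready: Dict[str, bool] = field(default_factory=dict)  # stable id -> state

    def register(self, stable_id: str, is_ready: bool = True) -> str:
        # A device seen before keeps its old name; a new one gets the next slot.
        if stable_id not in self.names:
            self.names[stable_id] = f"disk{len(self.names)}"
        self.ready[stable_id] = is_ready
        return self.names[stable_id]

    def mark_not_ready(self, stable_id: str) -> None:
        # The entry stays in the name space; only its state changes.
        self.ready[stable_id] = False


if __name__ == "__main__":
    reg = DeviceRegistry()
    first = reg.register("serial-A")
    reg.register("serial-B")
    reg.mark_not_ready("serial-A")          # device goes away temporarily
    assert reg.register("serial-A") == first  # same name when it comes back
```

The point of the design is that names depend on device identity rather than probe order, so a device dropping out does not renumber the others.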