A multi-tiered storage and data protection strategy
Carl Follstad
Manager, University Data Mgmt Services
Office of Information Technology
University of Minnesota
Copyright Carl Follstad, This work is the intellectual property of the author. Permission is granted for this material to be shared for non-commercial, educational purposes, provided that this copyright statement appears on the reproduced materials and notice is given that the copying is by permission of the author. To disseminate otherwise or to republish requires written permission from the author.
What UMN leadership and customers were saying... “Storage requirements will increase >50% a year. Year after year …” “Keep the data secure!” “Do it for less money than last year!” “Keep me HIPAA/GLB/S-OX/FERPA compliant!” “Oh … also fit it all into our DR/BC plan!”
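To put the ">50% a year" claim in perspective, here is a minimal compound-growth sketch. The 50 TB starting point is borrowed from the 2004 pilot size in the rollout timeline purely for illustration, and a flat 50% annual rate is an assumption; the function name is hypothetical.

```python
def projected_capacity_tb(start_tb: float, annual_growth: float, years: int) -> float:
    """Capacity after `years` of compound growth (annual_growth=0.5 means 50%)."""
    return start_tb * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    # Assumed inputs: 50 TB starting capacity, flat 50% growth per year.
    for year in range(6):
        print(f"year {year}: {projected_capacity_tb(50, 0.5, year):.1f} TB")
```

Under these assumptions capacity more than doubles every two years and grows roughly 7.6-fold in five, which is why "do it for less money than last year" is such a hard constraint.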
Cost-justifying Enterprise Storage
Used the most critical business systems (PeopleSoft, Library) to justify the initial storage investment
If successfully deployed, the hope was to leverage the investment to drive Enterprise Storage further into the business
To manage risk, we started with a pilot SAN in March 2004
Cost-justifying Enterprise Storage
Other business drivers:
• BC/DR
• Increased system availability
• Increased I/O performance and scalability
• Storage re-use
• Provides foundation for a storage classification model
• Provides foundation for a sophisticated data protection model
UMN SAN rollout timeline: 2004
March - 50TB SAN pilot, two data centers
July - Upgrade to use SAN directors (from switches)
Summer - Hook up several new application servers
Summer - Began doing SAN Snapshot backups
Sept - Add more Tier 3 storage for new applications
UMN SAN rollout timeline: 2005
Jan - Add two EMC DMX “Tier 1” arrays, move original SAN pilot applications to Tier 1 storage (PeopleSoft, library)
Jan - Add a third SAN “core” data center
Spring - Begin adding “edge” SAN equipment rooms to fabrics to service department SAN connectivity
Fall - Moved to dedicated EMC DMX arrays in three “core” sites
Winter - Rolled out the first of the IP-based storage offerings: NFS
UMN SAN rollout timeline: 2006
Jan - Roll out Tier 3 array (S-ATA)
Summer - Roll out campus-wide CIFS (NAS) service to complement NFS (centralize authentication on AD)
Summer - Roll out campus-wide storage collaboration tool (based on Xythos)
FINISHED WITH 3-TIERED PRIMARY STORAGE PLAN
Formalized primary storage tiers
Tier 1 - Targeted at highest-availability applications (99.999%)
Tier 2 - % availability, FC performance, mirroring and snapshot services available
Tier 3 - SUBSIDIZED OFFERING: Non-standard configs, ATA RAID protection, inexpensive but reliable. No DR mirroring, no Snapshots. “Best effort” SLE.
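As a rough aid for reading the availability targets, the sketch below converts an availability percentage into allowed downtime per year. Only the 99.999% figure comes from the slide; the other percentages are illustrative values, not the tiers' actual targets, and the function name is hypothetical.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_minutes_per_year(availability_pct: float) -> float:
    """Minutes of downtime per year permitted by an availability percentage."""
    return (1.0 - availability_pct / 100.0) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # 99.999 is the Tier 1 target from the slide; 99.9 and 99.99 are
    # illustrative comparison points only.
    for pct in (99.9, 99.99, 99.999):
        print(f"{pct}% -> {downtime_minutes_per_year(pct):.1f} min/year")
```

At 99.999%, the budget is a little over five minutes of downtime per year, which helps explain why Tier 1 is reserved for the most critical applications.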
Data classification (In other words: You have all the data now, how do you manage it effectively and economically?) Industry observation: data loses value the older it gets! If that is the case, how can we leverage Enterprise Storage to economically but reliably manage and protect data throughout its lifetime? That’s what all the ILM vendor-speak is about.
Sum of information in your Enterprise (chart):
Active data (<2%)
Aging data (10%)
Rarely used (“WORN”) data (85%+)
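One way to act on this breakdown is to classify data by how long ago it was last accessed. The sketch below is only an illustration: the slides give no cut-off ages, so the 30-day and 365-day thresholds and the function name are hypothetical assumptions.

```python
from datetime import date

# Hypothetical thresholds; the slides do not state any cut-off ages.
ACTIVE_MAX_DAYS = 30
AGING_MAX_DAYS = 365


def classify(last_access: date, today: date) -> str:
    """Return 'active', 'aging', or 'worn' based on days since last access."""
    age_days = (today - last_access).days
    if age_days <= ACTIVE_MAX_DAYS:
        return "active"
    if age_days <= AGING_MAX_DAYS:
        return "aging"
    return "worn"  # rarely used ("WORN") data


if __name__ == "__main__":
    today = date(2006, 1, 15)
    for last in (date(2006, 1, 1), date(2005, 6, 1), date(2004, 1, 1)):
        print(last, "->", classify(last, today))
```

A real classification policy would likely weigh business value and compliance requirements as well as age; last-access time is just the simplest signal to automate.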
A new thought paradigm
Enterprise storage: primary copy (of data)
  Probably tiered because there is a lot of it
DR storage: secondary copy (probably a mirror)
  Business needs determine sync or async mirror
Backup: protected copy of data (probably PIT)
  Business needs determine tape, disk, off site, on site
Data protection (aka “Backup”)
You can only screw it up!
Tie backups to Enterprise storage, leverage E.S. where possible (SAN backups, clone backups).
Make sure HSM/ILM software aligns with Backup processes.
Help your customers understand the difference between BACKUP and ARCHIVE.
Data protection (aka “Backup”)
Optimizing backup of Enterprise storage:
SAN-backup for all hosts >100GB
SAN Snapshot or SAN Clone as source of backups
Hot DB backups
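The size rule above can be expressed as a simple policy check. This is a sketch only: the 100 GB threshold comes from the slide, but the slide does not say how smaller hosts are backed up, so the "lan" fallback and the function name are assumptions.

```python
SAN_BACKUP_THRESHOLD_GB = 100  # from the slide: SAN backup for hosts >100GB


def backup_path(host_data_gb: float) -> str:
    """Return 'san' for hosts above the threshold, else 'lan'.

    The 'lan' fallback is an assumption; the slide only covers large hosts.
    """
    return "san" if host_data_gb > SAN_BACKUP_THRESHOLD_GB else "lan"


if __name__ == "__main__":
    for size_gb in (40, 100, 250):
        print(f"{size_gb} GB host -> {backup_path(size_gb)} backup")
```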
UMN Storage strategy: future
Late 2006:
Roll out Digital Archive, implement automated ILM tools
• HIPAA, FERPA, S-OX, G-L-B compliant
• Offers electronic “shredding” of data
• Replicated via IP over long distance
• Not backed up by traditional means
Roll out S.I.S. (single-instance storage) for Backup service
• The more data you manage, the more duplication there is
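Assuming S.I.S. here means single-instance storage, the idea can be sketched as a content-addressed store that keeps one copy of identical data no matter how many times it is backed up. This is a toy illustration, not the product UMN deployed; the class and method names are hypothetical.

```python
import hashlib


class SingleInstanceStore:
    """Toy content-addressed store: identical data is kept only once."""

    def __init__(self) -> None:
        self._blobs = {}        # SHA-256 hex digest -> stored bytes
        self.logical_bytes = 0  # total bytes submitted, duplicates included

    def put(self, data: bytes) -> str:
        """Store data, skipping the write if an identical copy exists."""
        digest = hashlib.sha256(data).hexdigest()
        self.logical_bytes += len(data)
        self._blobs.setdefault(digest, data)
        return digest

    def get(self, digest: str) -> bytes:
        """Fetch stored data by its digest."""
        return self._blobs[digest]

    @property
    def stored_bytes(self) -> int:
        """Bytes actually held after duplicate elimination."""
        return sum(len(blob) for blob in self._blobs.values())


if __name__ == "__main__":
    store = SingleInstanceStore()
    for payload in (b"report", b"report", b"other"):
        store.put(payload)
    print(store.logical_bytes, "bytes submitted,", store.stored_bytes, "bytes stored")
```

The gap between submitted and stored bytes widens as more hosts back up the same files, which is the point behind "the more data you manage, the more duplication there is."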
UMN Storage strategy: future
Late 2006:
Roll out iSCSI for SAN services over IP
• Used for connecting servers not in data centers
Closing thoughts The more data you manage centrally, the more economical it becomes. The more data you manage centrally, the more leverageable it is. Data consolidation DRIVES other consolidation: servers, backup, facilities.
Questions? Thank you! Carl Follstad