Sunday, May 24, 2015 Data Minimisation Managing Data Growth While Containing Cost and Carbon Footprint Ken Hall, Dimension Data
Agenda: Introductions; Today’s data management challenges; Energy efficiency in the data centre; What is Data Minimisation?; Online Active Archiving; Backup Data De-Duplication; Data Minimisation effects; Developing the business case; Questions & Answers
Dimension Data - ‘Data Centre & Storage Solutions’. Network Integration; Microsoft Solutions Infrastructure; Microsoft Solutions Application Integration; Security; Managed Services; Customer Interactive Solutions. Data Centre & Storage Solutions – Availability, Compliance & Optimisation: Storage Solutions (SAN, NAS, CAS); Virtualisation Solutions (DR, Server & Desktop Consolidation); Backup, Recovery & Archiving Solutions; Data Centre Environmentals (Power, Cooling & Rack Solutions). Key Technology Partners: APC, Cisco, EMC, HDS, HP, IBM, Microsoft, NetApp, Quantum, Symantec, Sun.
The Digital Universe is Rapidly Expanding. Ten-fold growth in five years: 173 exabytes rising to 1,773 exabytes. [Chart: Amount of Digital Information Created and Replicated Each Year, in exabytes.] Source: IDC White Paper, "The Diverse and Exploding Digital Universe," March 2008.
Typical DD Customer – Exponential Data Growth. Annual compound data growth of 65%. Backup regime: daily incremental and weekly full; 2-week retention on disk (3 fulls, 10 incrementals); 4-week retention on tape; 12 monthlies on tape, kept indefinitely. Pressures: having to squeeze more into the backup window; B2D requirement growing rapidly; backup media servers under pressure; network bandwidth constraints; tape infrastructure and handling costs increasing.
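The compound growth figure on this slide can be turned into a rough capacity projection. The sketch below applies the 65% annual rate to a hypothetical starting volume of 10 TB, which is an assumption and not a figure from the slide; the function names are illustrative only.

```python
import math

ANNUAL_GROWTH = 0.65  # 65% compound annual data growth, from the slide


def project_capacity(start_tb: float, years: int, growth: float = ANNUAL_GROWTH) -> float:
    """Return the projected data volume in TB after `years` of compound growth."""
    return start_tb * (1.0 + growth) ** years


def doubling_time_years(growth: float = ANNUAL_GROWTH) -> float:
    """Return how many years it takes for the data volume to double."""
    return math.log(2.0) / math.log(1.0 + growth)


if __name__ == "__main__":
    start_tb = 10.0  # assumption: illustrative starting volume, not from the slide
    for year in range(6):
        print(f"Year {year}: {project_capacity(start_tb, year):.1f} TB")
    print(f"Doubling time: {doubling_time_years():.2f} years")
```

At this rate the data volume doubles in well under two years, which is why the backup window, B2D capacity and tape handling all come under pressure at once.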
Coping with Information Growth in Today’s Economy. In 2009, IT budgets are flat or declining*. Escalating costs for primary storage; difficulty meeting backup and recovery windows; ensuring high availability of information; providing timely access to historical information. *“Global purchases of IT goods and services… will equal $1.66 trillion in 2009, declining by 3 percent after an 8 percent rise in 2008.” Global IT Market Outlook: 2009, Forrester Research, January 12, 2009.
Data Center Energy Use is Doubling. IT energy use has doubled since 2000 and will likely double again by 2011. Energy operating costs will soon exceed the cost of purchase for servers. Existing conservation technologies can reduce consumption to 2002 levels. [Chart: Comparison of Projected Electricity Use, 2007 to 2011; annual electricity use in billion kWh/year; series include historical energy use and the state-of-the-art scenario.] Source: EPA report to Congress.
Available Capabilities for Energy Efficiency: Improve Efficiency – Reduce Energy Consumption. Two levers: INCREASE UTILIZATION and REDUCE CAPACITY. Capabilities: Storage tiering; Virtual LUNs; File and tiering; Storage virtualisation; Large-capacity drives; Replication across storage tiers; Snaps; Clones; Compression; De-duplication; Archiving; Server virtualisation; Data migration; Storage consolidation; Virtual Provisioning; Flash drives; Optimisation algorithms; Automated discovery; Document management.
How can we manage exponential data growth while improving access to organisational data, containing data management and infrastructure costs, and reducing the data centre’s carbon footprint? Implement a Data Minimisation Strategy: online archiving of email and file systems; backup with data de-duplication.
Data Minimisation Elements. New Technologies and Services are Enablers across Primary Storage, Backup and Archive. Retention and compliance; data reduction; universal access; simplify management. Tier backup infrastructure; optimise media (B2D, VTL, de-dupe and tape); address security issues; simplify management. Identify candidates for archiving; classify and move; establish SLAs based on information class.
Data Minimisation – How it works. 1. Archive the inactive data before you perform the backup process: identify inactive data based on policies; automate the movement of the data to a lower-cost storage tier or dedicated archive platform, leaving stubs behind; items are retrieved from the online archive on user demand; back up the archive infrequently or never. 2. Back up the remaining data using resource-efficient data de-duplication: rapid ‘full backups’ (only the ‘sub-file’ changes are sent and stored on disk); minimal bandwidth (only a fraction of the typical 200% is sent over the wire); minimal storage consumption (only unique ‘sub-file’ blocks are stored). Protect more, with less, for longer.
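Step 1 above, identifying inactive data by policy, can be sketched as a simple age-based filter. This is a minimal illustration, not any product's policy engine: the `FileRecord` type, the `split_by_policy` function and the 180-day idle threshold are all assumptions introduced for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical metadata record for one file on primary storage."""
    path: str
    last_accessed: datetime
    size_bytes: int


def split_by_policy(files, now, max_idle_days=180):
    """Partition files into (active, inactive) by time since last access.

    Files not accessed within `max_idle_days` are archive candidates.
    The 180-day default is an assumed policy, not a value from the slides.
    """
    cutoff = now - timedelta(days=max_idle_days)
    active = [f for f in files if f.last_accessed >= cutoff]
    inactive = [f for f in files if f.last_accessed < cutoff]
    return active, inactive


if __name__ == "__main__":
    now = datetime(2009, 6, 1)
    files = [
        FileRecord("/data/report.doc", datetime(2009, 5, 20), 2_000_000),
        FileRecord("/data/old_plan.xls", datetime(2007, 3, 1), 5_000_000),
    ]
    active, inactive = split_by_policy(files, now)
    print("Back up:", [f.path for f in active])
    print("Archive:", [f.path for f in inactive])
```

Only the active set would then go through the de-duplicated backup in step 2; the inactive set is moved to the archive tier and replaced by stubs.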
Today: Energy-Efficient Storage Design. 1 TB of data on different capacity/performance drives: consume less energy by capacity. [Chart: 15K 73 GB: 6,096 kWh/yr; 15K 146 GB: 3,048 kWh/yr (50% less); 10K 300 GB: 1,434 kWh/yr (73% less); 7.2K 500 GB: 787 kWh/yr (87% less); 7.2K 1 TB: 393 kWh/yr (94% less). 73 GB Flash drive: 3,790 kWh/yr, 30x IOPS, 38% less energy.]
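The savings percentages on this chart can be checked against the kWh figures. The sketch below assumes the 15K 73 GB configuration (6,096 kWh/yr) is the baseline and that the kWh values pair with the drive types in order of increasing capacity; that pairing is inferred from the slide rather than stated on it.

```python
BASELINE_KWH = 6096  # 1 TB on 15K 73 GB drives, assumed to be the chart's baseline

# kWh per year to hold 1 TB; pairing of drive type to value is inferred
ENERGY_KWH_PER_YEAR = {
    "15K 146 GB": 3048,
    "10K 300 GB": 1434,
    "7.2K 500 GB": 787,
    "7.2K 1 TB": 393,
    "73 GB Flash": 3790,
}


def energy_reduction_pct(kwh: float, baseline: float = BASELINE_KWH) -> int:
    """Return the energy saving versus the baseline, as a whole percentage."""
    return round(100.0 * (1.0 - kwh / baseline))


if __name__ == "__main__":
    for drive, kwh in ENERGY_KWH_PER_YEAR.items():
        print(f"{drive}: {kwh} kWh/yr, {energy_reduction_pct(kwh)}% less than baseline")
```

The computed values agree with the slide's labels for most drive types; where a label differs by a few points, the slide was probably rounded or drawn from slightly different inputs.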
File System Archiving. Extract inactive, final-form data to an archive; enhance performance of production applications; reduce size of backup datasets; free up expensive Tier 1 disk; store archived data on high-density, low-cost, energy-efficient storage. [Diagram: Before: 10 TB on primary storage, full backup of 10 TB. After: 4 TB of active data stays on primary (production) storage and 6 TB of inactive data is extracted to an always-available active archive on secondary storage; back up 4 TB, active data only; the remaining primary capacity is reclaimed storage.]
Archiving. Mail archival automatically creates shortcuts to archived messages and attachments, and deletes the original attachments from the server. Space saved on server is typically 60–80%. [Diagram: Message Server (user’s inbox, holding shortcuts) and Archive Server (archive, holding the messages and attachments). Example messages: Message 1, Jan. 1, 2008, To: Rick, Subject: Question, with attachment; Message 2, Jan. 1, 2008, To: Ron, Subject: Update, with attachment; Message 3, Feb. 1, 2008, To: Bill, Subject: Training.]
Definition of De-duplication: “The process of detecting and identifying the unique data segments within a given set of information, enabling the elimination of redundancy when stored or moved.” [Diagram: Data Set 1, Data Set 2 and Data Set 3 pass through de-duplication. Before: total segments = 39. After: unique segments = 6.]
Data De-duplication: How it Works. Only unique data segments are backed up. First instance (segments A, B, C, D): unique data stored on disk, available for immediate recovery. Duplicate instance (A, B, C, D again): data already backed up, so only a unique ID pointer is stored (20 bytes). Modified instance (adds segment E): new data segment identified and backed up. [Diagram timeline: May 2007, June 2008.]
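The first, duplicate and modified instances on this slide can be reproduced with a toy segment store. This is a minimal sketch under stated assumptions, not how any particular backup product works: it uses fixed-size segments (real products may segment differently) and SHA-1 as the segment ID, chosen here only because its digest happens to be 20 bytes, matching the pointer size on the slide. `DedupeStore` is a hypothetical name.

```python
import hashlib


class DedupeStore:
    """Toy store that keeps one copy of each unique segment, keyed by its hash."""

    def __init__(self, segment_size: int = 4):
        self.segment_size = segment_size  # assumption: fixed-size segments
        self.segments = {}  # 20-byte digest -> segment bytes

    def backup(self, data: bytes):
        """Store `data`; return (pointer list, count of newly stored segments)."""
        pointers = []
        new_segments = 0
        for i in range(0, len(data), self.segment_size):
            segment = data[i:i + self.segment_size]
            digest = hashlib.sha1(segment).digest()  # 20-byte segment ID
            if digest not in self.segments:
                self.segments[digest] = segment  # unique segment: store it once
                new_segments += 1
            pointers.append(digest)  # duplicates cost only a pointer
        return pointers, new_segments

    def restore(self, pointers) -> bytes:
        """Rebuild the original data from its pointer list."""
        return b"".join(self.segments[p] for p in pointers)


if __name__ == "__main__":
    store = DedupeStore(segment_size=1)
    for label, data in [("first", b"ABCD"), ("duplicate", b"ABCD"), ("modified", b"BCDE")]:
        pointers, new = store.backup(data)
        print(f"{label}: {new} new segment(s), {len(store.segments)} stored in total")
        assert store.restore(pointers) == data
```

The duplicate instance adds no new segments and the modified instance adds only segment E, which is the behaviour the slide describes.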
Key Point – Data Minimisation requires a platform that doesn’t need to be backed up! Active Archiving (WORM disk): WORM delivers unique features for online archives, namely location independence, self-healing and management, guaranteed authenticity, and single-instancing. Online Archiving (Tier 3 disk): Tier 3 disk with SATA and NAS with ATA. Offline Archiving (tape): tape is best suited for offline archives. [Diagram labels: Customer Archival Requirements; Management Efficiency; Archiving Functionality.]
Data Minimisation Strategy - How it all fits together. Tier 1: Primary Storage. Tier 2: Secondary Storage. Tier 3: Archive, long-term retention on disk, 80% of data. Tier 4: Backup to disk (De-Dupe), quick recovery, optional 20%. Tier 5: Legacy, long-term retention on tape, optional 20%. [Diagram labels: Daily data backups; OH De-duped Data; Automated movement relative to age; Data backup; Static Data growth; Static Data growth; Tier 3 Data Growth; No management required.]
Quantified Results – Reduce Tier 1/2 with Archiving. Major reduction in expensive Tier 1/2 storage. Tier 3 archive storage minimised due to single instancing and compression. 73% reduction in power and cooling requirements for archived data.
Quantified Results – The Data Minimisation Leverage. Good Tier 4 savings with archiving or de-duplication. Excellent results by combining archiving and backup data de-duplication. 6x reduction in power and cooling requirements for B2D storage.
Quantified Results – Less Tape Infrastructure. Associated reduction in tape library slots, drives, management and handling. Power of combining archiving and de-duplication: 560 fewer LTO4 tapes in Year 3. Tape could be removed altogether: offsite replication and disk spin-down.
Data management cost comparison – Data Minimisation. Significant reduction of backup infrastructure and tape management: tape drives, tape licences, slots, library, backup server, tape media, offsite storage and recall costs, admin costs. 24 May 2015 © Copyright Dimension Data
Data Minimisation Assessment – Business Case. Current backup minimisation methods give you more efficient backups. However, they do not fix the cause of the problem, which is data growth. A combination of data archival, backup de-duplication and compression represents the most effective way to contain data within your environment. The assessment helps quantify the business case for archiving (or another appropriate solution) and includes a workshop to identify costs and issues.
Data Minimisation – Input Variables
Data Minimisation – Graphical View
Data Minimisation – Graphical View (Cont.)
Data minimisation strategy achieved by: archiving over 70% of data to a protected environment, which removed the need for that data to be backed up; minimising the impact of data backup via de-duplication and compression (reduction in data volume and backup data by 80%); minimising the impact of VMware on the environment through de-duplication; containing Tier 1 disk growth and spend; providing the most storage-efficient backup method possible today. Estimated savings of over 5 million dollars in 5 years. ‘My initial Sync took 12 hours, now I backup in 50 mins’ – Dimension Data Customer
Questions & Answers