SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO Disk and Tape Storage Cost Models Richard Moore & David Minor San Diego Supercomputer.

Presentation transcript:

SAN DIEGO SUPERCOMPUTER CENTER at the UNIVERSITY OF CALIFORNIA, SAN DIEGO
Disk and Tape Storage Cost Models
Richard Moore & David Minor
San Diego Supercomputer Center (SDSC), University of California San Diego
Presented by: Heba Saadeldeen

Objectives & Outline
- Realistic cost estimates and projections are critical for storage users/providers
- While much info is available on vendor hardware solutions, little info is available on integrated costs from the storage provider perspective
- Estimate costs for an at-scale provider to ‘store bits’

Outline
- Caveats
- SDSC’s Storage Infrastructure
- ‘Bit Storage’ Cost Estimates
  - Tape Archival Storage
  - Disk Storage
- Projections: with scale of storage facility and into the future
- Conclusions

Caveats on Cost Estimates
- Sustainable storage: annual cost w/ media/technology refresh & data migration
- Based on SDSC experience only
  - Includes UCSD’s indirect costs, which will vary by institution
  - Other providers may have a different cost structure
  - Based on SATA disk and enterprise-class tape systems
- Cannot be specific about vendor costs or burdening, but relative fractions are reasonable
- This is a snapshot as of Jan; costs will decline w/ time
- Paper focuses only on single-copy ‘bit storage’ costs
- ‘Bit storage’ is only a fraction of the cost to ‘preserve data’

A Three-Stage Model for a Digital Preservation Environment
- Stages: Ingest, Store, Use
- ‘Bit Storage’
  - Capacity: online (disk), archival (tape)
  - Single-copy reliability
  - Media/technology advances
  - Data migration
- Replication
  - Geographically distributed
  - System diversity

SDSC’s Storage Infrastructure

SDSC’s archive shows exponential growth w/ a consistent doubling period of ~15 months
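
A fixed doubling period implies a simple exponential projection. The sketch below is illustrative only: the function names are hypothetical, and the 5 PB starting volume is borrowed from the tape figure quoted elsewhere in the talk rather than from this slide.

```python
def archive_size_pb(initial_pb: float, months: float,
                    doubling_months: float = 15.0) -> float:
    """Projected archive size after `months`, assuming a constant doubling period."""
    return initial_pb * 2.0 ** (months / doubling_months)


def annual_growth_factor(doubling_months: float = 15.0) -> float:
    """Factor by which the archive grows in 12 months."""
    return 2.0 ** (12.0 / doubling_months)


if __name__ == "__main__":
    start_pb = 5.0  # assumption: starting volume chosen for illustration
    for years in (1, 2, 5):
        size = archive_size_pb(start_pb, 12 * years)
        print(f"after {years} yr: {size:.1f} PB")
    print(f"annual growth factor: {annual_growth_factor():.2f}")
```

With a 15-month doubling period the archive grows by roughly 74% per year, which is why migration and refresh costs have to be treated as recurring rather than one-off.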

Cost Elements of Bit Storage Estimates
SDSC’s cost estimates include:
- Annualized capital costs of the media (including disk controllers)
- Other annualized capital costs
  - Disk: file system servers, storage area network
  - Archive: tape libraries, tape drives, disk cache, file system servers
- Hardware maintenance and software licenses (annual)
- Facilities costs: space, utilities (annual)
- Labor to maintain & administer systems, migrate data (annual)
  - Disk: 3 FTEs to administer disk storage & SAN
  - Archive: 3 FTEs to administer archival systems

Annual costs normalized by:
- Total SATA disk deployed (~1.8 PB SATA)
- Current volume of data stored on tape (~5 PB)

Sustainable rate in $/TB/year, assumed to be long-term storage w/ migration costs
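
The normalization described above can be sketched as follows. This is a minimal illustration, not SDSC's actual model: the class and field names are hypothetical, straight-line annualization over a fixed lifetime is an assumption, and every dollar figure is a placeholder. Only the 3 FTEs and the ~1.8 PB disk capacity come from the slide.

```python
from dataclasses import dataclass


@dataclass
class StorageCosts:
    media_capital: float             # purchase price of media, $ (placeholder)
    other_capital: float             # servers, SAN, libraries, drives, $ (placeholder)
    capital_lifetime_years: float    # assumed straight-line refresh period
    maintenance_and_licenses: float  # $/yr (placeholder)
    facilities: float                # space and utilities, $/yr (placeholder)
    labor_ftes: float                # staff administering the system
    cost_per_fte: float              # fully burdened $/yr (placeholder)

    def total_per_year(self) -> float:
        """Annualized capital plus recurring annual costs."""
        capital = (self.media_capital + self.other_capital) / self.capital_lifetime_years
        labor = self.labor_ftes * self.cost_per_fte
        return capital + self.maintenance_and_licenses + self.facilities + labor


def cost_per_tb_year(costs: StorageCosts, capacity_tb: float) -> float:
    """Sustainable rate in $/TB/yr: annual cost divided by the normalizing capacity."""
    return costs.total_per_year() / capacity_tb


if __name__ == "__main__":
    example = StorageCosts(
        media_capital=1_000_000, other_capital=500_000, capital_lifetime_years=4,
        maintenance_and_licenses=100_000, facilities=75_000,
        labor_ftes=3, cost_per_fte=150_000,
    )
    # 1800 TB corresponds to the ~1.8 PB of SATA disk used as the disk normalizer.
    print(f"{cost_per_tb_year(example, 1800):.0f} $/TB/yr")
```

Because capital is spread over its refresh period, the resulting rate already includes the cost of replacing media and migrating data, which is what makes it a sustainable figure rather than a purchase price.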

Clarifications about their cost model
- Discounts are negotiated for capital purchases and maintenance
- Indirect burdening is included in these costs on various cost elements; these burdens will vary by institution
- Storage system costs are based on several large-scale purchases over the last 18 months; system costs will vary widely with timing, scale, and negotiations
- Complex sub-issues are not considered, such as resource costs associated with each transaction (read/write) and networking/bandwidth costs for users to upload/access data

Disk and tape storage cost elements
- Media cost is not the dominant cost (36%/20%)
- Additional capital infrastructure is required (15%/33%)
- Media + other capital is ~half the total cost (51%/53%)
- Labor costs are a significant cost (23%/20%)
- Facilities costs are modest (11%/5%)
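
The percentages above can be cross-checked with a few lines of arithmetic. The sketch assumes each pair is given in disk/tape order, as the slide title suggests, and treats whatever is left after the four listed elements as an unlisted remainder. Attributing that remainder to hardware maintenance and software licenses is an inference from the list of cost elements, not something the slide states.

```python
# Shares in percent, as quoted on the slide (assumed order: disk/tape).
SHARES = {
    "disk": {"media": 36, "other_capital": 15, "labor": 23, "facilities": 11},
    "tape": {"media": 20, "other_capital": 33, "labor": 20, "facilities": 5},
}


def capital_share(system: str) -> int:
    """Media plus other capital, in percent of total cost."""
    return SHARES[system]["media"] + SHARES[system]["other_capital"]


def unlisted_share(system: str) -> int:
    """Percent of total cost not covered by the four listed elements."""
    return 100 - sum(SHARES[system].values())


if __name__ == "__main__":
    for system in SHARES:
        print(system, "capital:", capital_share(system),
              "unlisted:", unlisted_share(system))
```

The capital sums reproduce the 51%/53% quoted on the slide, and the listed elements leave 15% (disk) and 22% (tape) unaccounted for.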

Disk/Tape Storage Cost Comparison: Relative Cost Elements

How do costs scale with the size of the storage infrastructure?
- Economies of scale are significant as one moves up to “at-scale” installations ($/TB/yr decreases)
  - Vendor negotiations on media, other capital, maintenance
  - Fully utilizing servers, infrastructure and personnel
- Once infrastructure is “at-scale”, economies of scale slow down and the cost ($/TB/yr) levels off with installation size
  - Media, supporting capital, maintenance, facilities costs: perhaps some weak economies of scale in these factors
  - Some “linear” costs occur in large quantum steps, e.g. hiring an additional administrator, larger servers to handle load
  - A portion of the cost elements (software licenses) are fixed with installation size => decreasing $/TB/yr for these elements
- So with “at-scale” installations, net $/TB/yr will level off and then slowly decline
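
The three kinds of cost behavior described above (fixed, linear, and stepwise) can be combined into a toy model. All parameter values below are hypothetical placeholders chosen only to show the shape of the curve; none are SDSC figures.

```python
import math


def cost_per_tb_year(capacity_tb: float,
                     fixed_per_year: float = 200_000.0,   # e.g. software licenses (placeholder)
                     linear_per_tb: float = 300.0,        # media, capital, facilities (placeholder)
                     cost_per_admin: float = 150_000.0,   # one administrator (placeholder)
                     tb_per_admin: float = 1000.0) -> float:
    """Toy $/TB/yr model: fixed costs + per-TB costs + staff added in whole steps."""
    admins = math.ceil(capacity_tb / tb_per_admin)  # hiring happens in quantum steps
    total = fixed_per_year + linear_per_tb * capacity_tb + admins * cost_per_admin
    return total / capacity_tb


if __name__ == "__main__":
    for tb in (100, 1000, 5000, 10000):
        print(f"{tb:>6} TB: {cost_per_tb_year(tb):8.0f} $/TB/yr")
```

With these placeholder values the rate falls steeply for small installations and then flattens toward the sum of the per-TB and per-administrator terms, which matches the qualitative picture on the slide.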

What about trends in the relative cost of disk/tape storage?
- Historical trends in media costs
  - Actual purchases over SDSC’s 20-year history indicate tape media cost/TB declines exponentially with halving time ~3 years
  - Apples-to-apples comparisons are harder for disk, but halving time is shorter
  - If these trends continue, expect costs to converge within a few years
- Even as costs converge, there may be good reasons to maintain a few large-scale centralized tape archives
  - Notion that there’s less risk to a tape cartridge than spinning disk
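
If both media costs decline exponentially, the time until they meet follows from setting r * 2^(-t/H_disk) = 2^(-t/H_tape), where r is today's disk-to-tape cost ratio. The sketch below solves for t. Only the ~3-year tape halving time comes from the slide; the disk halving time and the starting ratio are hypothetical inputs, since the slide says only that the disk halving time is shorter.

```python
import math


def years_to_converge(cost_ratio: float, disk_halving_years: float,
                      tape_halving_years: float = 3.0) -> float:
    """Years until disk and tape media cost/TB are equal.

    cost_ratio is today's disk cost divided by tape cost (must be > 1),
    and disk must halve faster than tape for the curves to cross.
    """
    if cost_ratio <= 1.0:
        raise ValueError("disk must currently cost more than tape")
    if disk_halving_years >= tape_halving_years:
        raise ValueError("disk halving time must be shorter than tape's")
    rate_gap = 1.0 / disk_halving_years - 1.0 / tape_halving_years
    return math.log2(cost_ratio) / rate_gap


if __name__ == "__main__":
    # Hypothetical inputs: disk media at 2x tape today, halving every 2 years.
    print(f"{years_to_converge(2.0, 2.0):.1f} years")
```

The result is sensitive to both assumed inputs, so it is best read as a way to explore scenarios rather than as a forecast.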

How will costs change in the future?
- Expect that exponential declines in media costs and other IT equipment will continue for a while
- Cost ($/TB/yr) will decline, but how much?
- Critical issue: which cost elements will scale with the declining media costs and which will not?
  - Most costs scale w/ media, but labor & facility costs may not scale well
- Cost elements that do not scale well w/ media will dominate future costs, even at the ‘bit storage’ level
  - And we expect that for the broader ‘storage’ costs beyond bit storage, e.g. file management, labor costs will dominate!
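
A small projection shows how the non-scaling elements come to dominate. The sketch uses the talk's tape figures as inputs (~$500/TB/yr total, labor 20% plus facilities 5%, media halving time ~3 years). It makes two strong simplifying assumptions that are not from the slides: labor and facilities stay flat per TB, and every other element declines at the media rate.

```python
def projected_cost(total_now: float, fixed_fraction: float,
                   years: float, halving_years: float) -> float:
    """$/TB/yr after `years`, with a flat part and a part that halves periodically."""
    fixed = total_now * fixed_fraction   # assumed not to scale (labor, facilities)
    scaling = total_now - fixed          # assumed to track media cost declines
    return scaling * 2.0 ** (-years / halving_years) + fixed


def fixed_share(total_now: float, fixed_fraction: float,
                years: float, halving_years: float) -> float:
    """Fraction of the projected cost made up of the non-scaling elements."""
    projected = projected_cost(total_now, fixed_fraction, years, halving_years)
    return total_now * fixed_fraction / projected


if __name__ == "__main__":
    for years in (0, 3, 6, 9):
        cost = projected_cost(500.0, 0.25, years, 3.0)
        share = fixed_share(500.0, 0.25, years, 3.0)
        print(f"year {years}: {cost:6.1f} $/TB/yr, labor+facilities {share:.0%}")
```

Under these assumptions the labor and facilities share rises from 25% to more than half of the total within two halving periods, even though the absolute cost keeps falling.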

Comparison with Commercial Services
- Many commercial companies are offering web-accessible storage services
- One example: Amazon S3 (aws.amazon.com/s3)
  - Cost structure (~April 2007): $1800/TB/yr storage + upload $100/TB + download $ /TB + put/get/list transaction fees
  - # of copies and media not specified, but speculate 2+ disk copies
  - Don’t know the capital/business model
  - No guarantees. From the AWS License Agreement: “Amazon and its affiliates are not responsible for any unauthorized access to, alteration of, or the deletion, destruction, damage, loss or failure to store any Content or other data which you submit in connection with your account.”
- SDSC cost estimates are “in the ballpark” w/ commercial services
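
The quoted fee structure can be turned into a rough annual estimate. In the sketch below the storage and upload prices are the ~April 2007 figures from the slide. The download price is missing from the transcript, so it is a required argument rather than a default, and transaction fees are left out because no rates are given. The function name is hypothetical.

```python
def s3_annual_cost(stored_tb: float, uploaded_tb: float, downloaded_tb: float,
                   download_price_per_tb: float,
                   storage_price_per_tb_year: float = 1800.0,
                   upload_price_per_tb: float = 100.0) -> float:
    """Rough yearly cost in $, excluding put/get/list transaction fees."""
    storage = stored_tb * storage_price_per_tb_year
    upload = uploaded_tb * upload_price_per_tb
    download = downloaded_tb * download_price_per_tb  # price must be supplied by the caller
    return storage + upload + download


if __name__ == "__main__":
    # Example: store 10 TB for a year, upload it once, download nothing.
    print(s3_annual_cost(10, 10, 0, download_price_per_tb=0.0))
```

For a store-once, read-rarely workload the storage term dominates, which makes the $1800/TB/yr rate the natural figure to compare with a provider's $/TB/yr estimate.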

Conclusions
- Initial caveat: bit storage costs are only a fraction of the total cost for ‘digital preservation’
  - Ingest and use phases not addressed
  - Only a portion of storage phase costs included
- SDSC’s sustainable single-copy ‘bit storage’ costs:
  - ~$500/TB/yr for tape storage
  - ~$1500/TB/yr for disk storage
- Media costs are ~30% of the integrated ‘bit storage’ costs and total capital is ~50% of costs for both tape and disk
- Costs ($/TB/yr) decrease, then flatten out and eventually slowly decline w/ scale of installation
- Costs will decline with time, but the critical issue is which elements do not scale w/ media/technology advances
- Disk/tape integrated costs are converging