Archiving and Storage Overview – Trends in Preservation and Archiving (Part 1)
PASIG Spring 2011
Raymond A. Clarke, Enterprise Storage Consultant, Oracle Corporation
Agenda: Challenges, Current Solutions to Meet Those Challenges, Clarity of the Issues, Industry Observations and Standards Practices, Key Take-Aways
The Challenge(s) – Preservation, retention and access of digitally stored information for sufficiently long periods of time (or indefinitely) in order to maximize information value while maintaining infrastructure cost efficiency.
– Long-term digital information is sensitive to issues that do not exist in a short-term or paper world, such as media and format obsolescence, bit rot, and loss of metadata.
– Interoperability for preservation, storage, and accessibility of long-term information has not yet been defined.
– Logical and physical data migration remain a very significant threat to long-term information preservation and retention.
– Thousands of silent films were made in the years prior to the introduction of sound, and between 80 and 90 percent of them have been lost forever, many because of the deterioration of the actual celluloid film.
– Many commercially oriented organizations face the serious challenge of economically preserving and maintaining access to a wide variety of digital content for dozens of years.
The Challenge(s), continued
– Intelligent management of many new and evolving data types and new and evolving media types (and tiers)
– Regulatory compliance
– Classification
– Enterprise Content Management consolidation
– To Cloud or Not to Cloud? Cloud computing is one of the most talked-about trends in IT, and the market for cloud services is exploding. IDC says cloud computing was a $17.4B market in 2009 and is expected to grow to $44.2B in 2013.
SNIA 100 Year Archive Task Force
– Over 80% report a need to retain information over 50 years, and 68% report a need of over 100 years
– Long-term generally means longer than 10 to 15 years
– Over 40% of respondents are keeping records over 10 years
– Database information is considered most at risk of loss
– 70% of respondents say they are ‘highly dissatisfied’ with their ability to read their retained information in 50 years
– Current practices are too manual, too prone to error and too costly
– Collaboration is recognized as necessary in order to define information retention requirements
Backup vs. HSM vs. Archive – What are we really talking about?
Applications | Data Protection | Tiered Storage Management | Information Archive
Purpose | Short-term protection of records for system recovery | Efficient management of physical storage assets | Long-term preservation of records for business, compliance, libraries
Data Type | Dynamic data in production | Should be data type agnostic, but some are not | Fixed content with ongoing, long-term integrity needs and value
Access Pattern | Entire volume or directory is restored after outage | Does not imply or necessarily provide seamless promotion of files, objects or tables upon request | Individual files, objects or tables are searched/queried and retrieved as needed
System Activity | Usually block based | Usually block based, but some do support file-based activity | Must be file, table and/or object based
Security | Strict access policies | Not a function | Must guarantee information integrity and security over extended periods of time
Fundamentally different approaches to data availability and storage infrastructure efficiency. All may be required in today’s enterprise environments.
Building a Terminology Bridge
Archive: the report advocates that IT practices adopt a usage of the term ‘archive’ that is more consistent with other departments within the organization. To the archival, preservation, and records management communities, an “archive” is a specialized repository with preservation services and attributes.
Preservation: managing information in today’s datacenter, with requirements to safeguard information assets for eDiscovery, litigation evidence, security, and regulatory compliance, requires that many classes of information be preserved from the time of creation. Preservation is a set of services that protect information; provide availability, integrity and authenticity controls; include security and confidentiality safeguards; and include an audit log, control of metadata, and other practices for each preservation object. The old IT practice of placing information into an archive when it becomes inactive or expired no longer works for compliance or litigation support, and only adds cost.
Authenticity: defined in a digital retention and preservation context as the practice of verifying that a digital object has not changed. Authenticity attempts to establish that an object is currently the same genuine object that it was originally, and to verify that it has not changed over time unless that change is known and authorized. Authenticity verification requires the use of metadata. The critical change for IT practices is that metadata is now very important and must be safeguarded with the same priority as the data.
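The authenticity check described on this slide can be sketched as a fixity comparison: a digest recorded in the object's metadata at ingest is recomputed later and compared. This is a minimal illustration, not something the presentation prescribes. The choice of SHA-256 and the names PreservationRecord, ingest and verify are assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class PreservationRecord:
    """Hypothetical metadata kept alongside a preserved object."""
    object_id: str
    sha256: str  # digest recorded at ingest time


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of the object's bytes."""
    return hashlib.sha256(data).hexdigest()


def ingest(object_id: str, data: bytes) -> PreservationRecord:
    """Record the digest when the object enters the archive."""
    return PreservationRecord(object_id=object_id, sha256=digest(data))


def verify(record: PreservationRecord, data: bytes) -> bool:
    """True if the object still matches the digest stored in its metadata."""
    return digest(data) == record.sha256


# Illustrative content only.
original = b"board minutes, 2011-03-01"
record = ingest("doc-001", original)
unchanged = verify(record, original)                # same bytes as at ingest
tampered = verify(record, original + b" (edited)")  # bytes altered after ingest
```

The check is only as trustworthy as the stored digest, which is why the record itself needs the same protection as the data it describes.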
Why Tiered Storage, now?
– Not a new concept. The fundamental concept has been around since the 1970s
What has changed?
– New tiers of storage (i.e. Flash and high capacity disk)
– Storage software has evolved
– Business application software now comes with embedded tiered-storage capabilities
– Optimized Solutions Offerings
Copyright © 2010 Oracle Corporation – Confidential
Tiered Storage Reduces IT Costs
– Software automation contains management costs
[Chart residue: “Tape $ $1/GB”]
The Cat Lives On…
– The cost ratio for a terabyte stored long-term on SATA disk versus LTO-4 tape is about 23:1
– For energy cost, it is about 290:1
Source: Bi-annual iNEMI Mass Storage Report for 2008. Clipper Notes, October 2008
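To see what the quoted ratios imply, the arithmetic can be sketched as below. The ratios 23:1 and 290:1 come from the slide. The tape cost inputs are placeholders invented for illustration, not figures from the cited report, and implied_disk_cost is a hypothetical helper.

```python
# Ratios quoted on the slide (SATA disk : LTO-4 tape, per terabyte stored long-term).
TOTAL_COST_RATIO = 23
ENERGY_COST_RATIO = 290


def implied_disk_cost(tape_cost: float, ratio: float) -> float:
    """Disk cost implied by a given tape cost and a disk:tape ratio."""
    return tape_cost * ratio


# Hypothetical inputs, chosen only to make the arithmetic concrete.
tape_total_per_tb = 100.0   # assumed total cost for tape, any currency
tape_energy_per_tb = 1.0    # assumed energy cost for tape, same currency

disk_total_per_tb = implied_disk_cost(tape_total_per_tb, TOTAL_COST_RATIO)
disk_energy_per_tb = implied_disk_cost(tape_energy_per_tb, ENERGY_COST_RATIO)

print(f"total:  tape {tape_total_per_tb:.2f} vs disk {disk_total_per_tb:.2f}")
print(f"energy: tape {tape_energy_per_tb:.2f} vs disk {disk_energy_per_tb:.2f}")
```

Whatever the real tape figures are, the energy ratio is more than ten times the total-cost ratio, so the gap is widest in power consumption.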
Oracle’s Solution to the Storage Challenge
Data/Information Lifecycle
– Highest Performance Storage Tier ($$$$): F5100
– High Performance Storage Tier ($$$): S6780 FC HDD
– Capacity Storage Tier ($$): S6780 SATA
– Nearline/Offline Storage Tier ($): SL8500
Data Management Options
– Oracle Sun Storage Archive Manager (SAM)
– Oracle Recovery Manager (RMAN)
– Oracle Secure Backup (OSB)
[Diagram labels: Tape Backup; Disk and Tape]
How does Archiving play?
– Archiving is a key underlying principle used to enable ECM
– Archiving is the movement or copying of data to tiers of storage, cost-effectively commensurate with the data’s changing access requirements
– Data moved is required for long-term reference and/or compliance
– Archiving is not backup (i.e. another copy of data for disaster recovery)
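The idea that data moves to a tier commensurate with its access requirements can be sketched as a simple age-based placement rule. The tier names follow the deck's lifecycle slide, but the thresholds and the choose_tier function are hypothetical. Real products such as SAM apply their own configurable policies, which this sketch does not attempt to reproduce.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds: (maximum time since last access, tier name).
# Ordered from hottest to coldest; the values are illustrative only.
TIER_POLICY = [
    (timedelta(days=7), "highest performance"),
    (timedelta(days=90), "high performance"),
    (timedelta(days=365), "capacity"),
]
COLDEST_TIER = "nearline/offline"


def choose_tier(last_access: datetime, now: datetime) -> str:
    """Pick the first tier whose age limit covers the time since last access."""
    age = now - last_access
    for max_age, tier in TIER_POLICY:
        if age <= max_age:
            return tier
    return COLDEST_TIER


now = datetime(2011, 4, 1)
recent = choose_tier(datetime(2011, 3, 30), now)  # accessed two days ago
stale = choose_tier(datetime(2008, 1, 1), now)    # untouched for over three years
```

Such a rule only decides where reference data should live. It is not a substitute for a backup copy kept for disaster recovery.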
ECM Technology Challenges
Scale & Extensibility
– Millions, billions, and even trillions of files need to be managed
– We’re talking about Petabytes, not merely Terabytes
– Sharable by multiple application clients
– Independently scalable and extensible in the dimensions of bandwidth, capacity and/or processing power
Manageability & Accessibility
– Need content awareness for efficiency
– Interoperability within large geographically distributed footprints
– Needs to be resilient to change, not just failure
Cost vs. Performance
– Requires overall lower Total Cost of Ownership
– Must be capable of preserving content for decades
Benefits of Archiving
– Storage consolidation off higher, more expensive tiers of storage
– Can provide consolidated resource management
– Streamlines backup and recovery of non-archive tiers
– Archived data can still be online and used by the source application via the same locality reference
Key Take-Aways
– Archive industry requirements continue to evolve and grow.
– Tape is important.
– Capacity needs continue to grow.
– Issues remain and input from this community is vital.
Oracle-Sun Archive/ECM Solutions: Complete, Open, & Integrated to the Next Level
[Diagram. Recoverable labels: Structured Data; Unstructured Data; SAP; Oracle UCM/URM/OSB; Video; Images; Mixed Workgroup; Archiver; Primary Database; Exadata; Primary Disk ST600, F5100; 7000 Unified Storage; NFS; Open Storage; SAM-QFS (Open Storage); SAM-QFS (Tiered Storage); Capacity SATA/FC Modular Disk; Tape Libraries and Virtual Tape; Off-Site Tape]
Thank you for your time and attention
Copyright © 2009, Oracle and/or its affiliates. All rights reserved.