Co-funded by the European Union under FP7-ICT. Co-ordinated by aparsen.eu. #APARSEN
Storage Solutions
The use case at the National Library of the Netherlands (KB)
Jeffrey van der Hoeven
APARSEN webinar, April 14th, 2014

Outline of talk
– About the National Library of the Netherlands (KB)
– Storage challenges: creating digital collections
– Storage solution
– Cost
– Future perspective
– Cloud storage: hot or not…

– Since 1798 / 248 FTE / 53M euro budget
– We preserve & give access to everything published in and about the Netherlands
– Central role in Dutch information infrastructure
– Kept safe: 6M physical publications / 18M digital publications
– Goal: everything digital in 2035

What we do
We give open access to:
– Newspaper pages online
– Online visits
– Parliamentary pages online
Figures: 2,1 million / 8 million / 4,6 million

Storage challenges: Creating digital collections

Storage share of digital collections (in GB)

Storage prospect at KB
– 2011: 0,5 PB & 500M files
– … PB & 1000M files
– Height comparison: a stack of 1.5 million CD-ROMs (1800m) against Burj Khalifa Dubai (828m), Empire State Building (443m) and Eiffel Tower (324m)

Challenges in (long-term) storage
– Volume (size and number of files)
– Type of data (structured / unstructured)
– Growth rate
– Availability vs preservation
– Cost per TB

Storage solution

IT & Storage at KB
– Two locations:
  – In-house = data centre for primary storage and computing
  – Off-site = for data back-up & archiving
– Hosting 230 servers (80 physical / 150 virtual)
– Managing 550 TB of data
– Managing +/- 500 million files:
  – PDF, TIFF, JPEG2000, JPEG, XML

Storage tiers (storage management)
– Gold: very fast, very expensive. Used for: indexing, databases. HW: SAN with HiPerf SAS disks; near future: SSD
– Silver: fast, expensive. Used for: web hosting, processing. HW: SAN with HiCap SAS disks
– Steel: slow (45 sec), sustainable. Used for: long-term archiving. HW: disk-based NAS with WORM
– Bronze: very slow (> 45 sec). Used for: back-up & restore, archiving. HW: LTO4/5 tape

Storage process & strategy
Diagram labels: Off-site back-up; Storage on-site; Bronze, Steel, Silver, Gold; Storage management; Platinum, Bronze; Selection, Digital processing, Access; Shared file system(s) / API; File system; DB; Stage 1, Stage 2, Stage 3, Stage 4, Stage 5

Storage cost
Source:

TCO storage
– Cost per Terabyte (TB) per year per storage tier
– TCO composed of several cost components, based on the whitepaper Four Principles for Reducing Total Cost of Ownership (Hitachi, 2011)
– In total 14 cost components included
– In 2014 the model was approved by the PWC accounting office
Referenced article: cost-of-ownership.pdf

– Hardware & software
– Maintenance
– Support
– Power & cooling
– Floor space
– Monitoring
– Waste & duplication
– Off-site locations
– Network

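As an illustration of how a per-tier TCO figure can be built up from cost components like the ones listed above, here is a minimal sketch in Python. It is not KB's actual model: the function name `tco_per_tb_per_year`, the euro amounts and the 200 TB capacity are all hypothetical placeholders, and only the component names come from the slide.

```python
# Minimal TCO sketch: yearly cost of all components divided by managed capacity.
# Component names follow the slide; every amount below is a made-up placeholder,
# NOT a KB figure.
yearly_cost_eur = {
    "hardware & software": 40_000.0,
    "maintenance": 10_000.0,
    "support": 8_000.0,
    "power & cooling": 12_000.0,
    "floor space": 5_000.0,
    "monitoring": 3_000.0,
    "waste & duplication": 7_000.0,
    "off-site locations": 9_000.0,
    "network": 6_000.0,
}


def tco_per_tb_per_year(component_costs, managed_tb):
    """Return the total yearly cost per managed TB for one storage tier."""
    if managed_tb <= 0:
        raise ValueError("managed_tb must be positive")
    return sum(component_costs.values()) / managed_tb


# Hypothetical tier holding 200 TB.
example_tco = tco_per_tb_per_year(yearly_cost_eur, managed_tb=200)
print(f"TCO: {example_tco:.2f} euro per TB per year")
```

Keeping the components in one dictionary per tier makes it easy to see which component dominates and to rerun the calculation when capacity grows.
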
KB TCO storage 2014 per TB per year
Tiers: Bronze / Steel / Silver / Gold
Values: € 1,036.- / € / € 1,046.- / € 4,858.-

KB TCO storage cost over years

KB vs storage providers (cloud)

Can we afford it in the future?
Recent developments *:
– Disk storage is becoming more popular in archiving.
– The physical limits of the hard disk drive seem to have been reached.
– Kryder’s law seems to be failing: disk storage density is not keeping pace with a yearly 30-40% increase.
– The market dominance of hard disk producers Seagate and Western Digital is risky, as prices might go up, especially in case of shortage.
Risk: storage costs can become a bottleneck for long-term preservation.
* David Rosenthal blog post, available at: http://blog.dshr.org/2012/12/talk-at-fall cni.html

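To make the Kryder’s law risk concrete, the sketch below projects the cost per TB after a number of years for different yearly density growth rates. It rests on a simplifying assumption that is not from the slides: cost per TB falls in inverse proportion to storage density. The function name `projected_cost_per_tb`, the starting cost of 1000 euro and the 10% "slow" rate are hypothetical; only the 30-40% range comes from the slide.

```python
# Sketch: how cost per TB would evolve if it fell in step with disk density.
# Assumption (not from the slides): cost per TB is inversely proportional
# to storage density, so a yearly growth g divides the cost by (1 + g).


def projected_cost_per_tb(start_cost, yearly_density_growth, years):
    """Return the projected cost per TB after `years` of compound growth."""
    if yearly_density_growth <= -1.0:
        raise ValueError("yearly_density_growth must be greater than -100%")
    return start_cost / (1.0 + yearly_density_growth) ** years


START_COST_EUR = 1000.0  # hypothetical starting cost per TB per year
HORIZON_YEARS = 10

# 0.40 and 0.30 are the range named on the slide; 0.10 is a hypothetical
# slower rate to show what happens when density growth stalls.
for growth in (0.40, 0.30, 0.10):
    cost = projected_cost_per_tb(START_COST_EUR, growth, HORIZON_YEARS)
    print(f"{growth:.0%} yearly growth: {cost:.2f} euro per TB after {HORIZON_YEARS} years")
```

Under this assumption, a slower growth rate leaves the cost per TB many times higher after ten years, which is why a growing collection can outrun the budget when density gains stall.
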
Storage in the cloud
Cloud storage: hot… or not?

Benefits of cloud storage
– Scalable
– Availability
– Pay per TB per month
– No need for own ICT infrastructure
– Less maintenance

However… in preservation terms:
– Is it sustainable?
– Who is responsible for the data?
– Which jurisdiction applies?
– What if I want to migrate to another cloud?
– Continuity: no money? No data!
Advice: be cautious about using the cloud for long-term storage.
Read on:

Thank you! Questions?
Jeffrey DOT vanderhoeven AT kb DOT nl