Published by Isaac Suarez. Modified over 11 years ago.
1
Cory Snavely
Library IT Core Services manager
University of Michigan
September 2010
2
www.hathitrust.org
HathiTrust project profile
Launched October 2008
29 member institutions and growing
Primarily Google-scanned materials, but also other sources
6.7 million volumes, 350 pages average
250 terabytes in two US instances
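The profile figures above imply a rough scale that is easy to work out. The following is a back-of-the-envelope sketch only: the slide does not say whether 250 terabytes is the size of each instance or the combined total, so the hypothetical `copies` parameter covers both readings, and decimal units (1 TB = 1,000,000 MB) are an assumption.

```python
# Back-of-the-envelope estimates from the profile slide.
# Assumptions (not stated in the slides): decimal units, and that
# `copies` is the number of full copies counted in the 250 TB figure.

VOLUMES = 6_700_000      # "6.7 million volumes"
AVG_PAGES = 350          # "350 pages average"
TOTAL_TB = 250           # "250 terabytes in two US instances"


def estimates(copies=1, volumes=VOLUMES, avg_pages=AVG_PAGES, total_tb=TOTAL_TB):
    """Return rough totals; copies=1 reads 250 TB as per-instance,
    copies=2 reads it as the combined size of both instances."""
    tb_per_copy = total_tb / copies
    mb_per_volume = tb_per_copy * 1_000_000 / volumes  # TB -> MB
    return {
        "total_pages": volumes * avg_pages,
        "tb_per_copy": tb_per_copy,
        "mb_per_volume": mb_per_volume,
    }


if __name__ == "__main__":
    for copies in (1, 2):
        print(copies, estimates(copies))
```

Under either reading the collection holds about 2.3 billion page images; only the per-volume storage estimate changes.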
3
Material and Data Flow
[Diagram; labels: Google or other scanning project, network or media delivery, ingest, storage @UM, sync, storage @IU, catalog, rights database, web, index]
4
Content Growth
5
Content Distribution Over Time * As of July 25, 2010
6
What do I worry about?
Yesterday's worry… is a non-issue due to… …but today's worry is
Managing too many separate devices
Block/file virtualization
Storage system software reliability and change management
What if I have to fsck this hulking beast?
Non-volatile journals and online integrity checks
Bit rot, misdirected writes, …
Online error detection and repair
Trend is obvious, but not necessarily bad
External error detection may be impossible
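The worry about bit rot and misdirected writes can be illustrated with an application-level fixity check, one common way to detect silent corruption from outside the storage system. This is a minimal sketch, assuming a manifest of SHA-256 digests recorded at ingest time; the function names are hypothetical, and it is not presented as the mechanism HathiTrust or any storage vendor actually uses.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large files need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Record a digest for every file under root (hypothetical ingest step)."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Re-read every file and report those that are missing or changed."""
    root = Path(root)
    problems = {}
    for rel, expected in manifest.items():
        path = root / rel
        if not path.is_file():
            problems[rel] = "missing"
        elif sha256_of(path) != expected:
            problems[rel] = "checksum mismatch"
    return problems
```

A check like this only detects damage; repair still needs a known-good copy, such as a second instance. It also has to re-read every byte, which is costly at hundreds of terabytes and is part of why online detection and repair inside the storage system is attractive.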
7
What's the Data Integrity Roadmap?
Not all systems provide integrity features
It's time for the data integrity model of systems to be a primary purchase criterion
SNIA Data Integrity and Long Term Retention Technical Working Groups may help to surface minimum standards or common approaches; can anyone speak to progress?
8
Questions?
Cory Snavely
csnavely@umich.edu
www.hathitrust.org