Archiving What is it and why should it be important to me? John Shaw Director, Publishing Technologies SAGE Publications, U.S.


1 Archiving: What is it and why should it be important to me?
John Shaw, Director, Publishing Technologies, SAGE Publications, U.S.

2
I. Archiving Overview
II. Types of Archives
III. A SAGE Example
IV. Risks, Questions, and More Questions

3 Archiving Part I: Archiving Overview

4 What is an Archive?
- An authoritative collection
- Preserved and professionally managed in perpetuity
- History, institutional commitment & policy, integrity re: preservation
- “…information needed for society’s memory.” "Schellenberg in Cyberspace," American Archivist 61:2 (Fall 1998), pp. 309-327.
- Preservation first

5 What is a Repository?
- “A place where things can be stored and maintained; a storehouse.” [Society of American Archivists Glossary]
- “Depository” means the same
  - also: a library that receives government documents for public access
- Not all repositories are archives

6 Why Care?
“Preserving information for decades or even centuries has proved important. Shang dynasty (12th century BC) Chinese astronomers inscribed eclipse observations on ‘oracle bones’ (animal bones and tortoise shells). About 3200 years later researchers used these records, together with one from 1302 BC, to estimate that the accumulated clock error was just over 7 hours, and from this derived a value for the viscosity of the Earth's mantle as it rebounds from the weight of the glaciers.”

7 Why Care?
“These timescales of many decades, even centuries, contrast with the typical 5-year lifetime for computing hardware and digital media”
“A Fresh Look at the Reliability of Long-term Digital Storage.” Baker, Mary, et al. EuroSys '06, April 18-21, 2006

8 Why Care?
Preservation: Digital information is impermanent
- Publisher: Safety
  - to ensure ongoing availability of your content
- Your library customers: Custodianship
  - to ensure continuity of the record of scientific progress
- Very long view: epistemology, history of science and culture

9 What Should be Preserved?
- Scholarly content
- Research materials
- Web-based, digitally born content

10 How e-Archives Differ
- Mission: collection v. preservation
- Access control, dark v. light
- Deposits
  - Why: voluntary v. mandated
  - Who: author v. publisher
  - What: manuscripts v. final work
  - When: backfile v. current content
- Future format migration
- Rights transfer
- Costs

11 Archiving Part II: Types of Archives

12 Types of Archives:
- National archives
- Institutional repositories
- Community-based archives
- Product solution archives

13 Types of Archives: National
- Dutch national library: Koninklijke Bibliotheek (KB)
- British Library
- NIH – PubMedCentral?
  - “NIH’s digital repository for biomedical research”
- Library of Congress?

14 KB: Dutch National Library
- Mission: Legal deposit library
  - “…collect, catalogue and preserve all publications appearing in the Netherlands.”
  - Capable of ingesting 60,000 articles/day
- Deposits: Source files from publishers
  - Automated, strict
- Costs?
- Access Control:
  - Local patron access
  - Publisher sets remote access rules

15 KB: Dutch National Library
- Migration: Preservation research leader
  - Committed to format migration
- Archiving agreements with:
  - OUP, SAGE, Blackwell, Elsevier, Kluwer Academic, etc.

16 The British Library Legal Deposit Pilot
- Mission: Legal deposit library
  - UK-published (to start)
- Pilot: Legal deposit for e-journals
  - 23 volunteer publishers
  - Secure infrastructure
  - Uses DigiTool by Ex Libris
  - Shared with the other UK legal deposit libraries
  - To “scope and test” ingest, storage, retrieval
- Cost?

17 The British Library: Preservation and Migration
- BL’s future for managing digital assets
  - preserve any type of digital material in perpetuity
- Migration
  - ensure that users can view the material with contemporary applications
  - preserve the original look-and-feel where possible
- Access Control
  - “appropriate permissions”

18 PMC: US National Library of Medicine Journal Archive
- Mission: Make research more accessible
  - Free full-text archive of 230 journals
- Deposit: publishers submit source files
- Migration
- Access Control
- Cost?

19 PMC: Depository for NIH-Funded Research Articles
- Authors of NIH-funded articles “encouraged” to deposit final manuscript
  - “After all modifications due to …peer review”
  - MS Word, PDF, etc.
  - With supplementary information
- Publisher can replace with published version
- To be required soon?

20 Library of Congress
- National Digital Information Infrastructure and Preservation Program (NDIIPP) – formed in 2000
- Members: National Library of Medicine, the National Agricultural Library, the National Institute of Standards and Technology, the Research Libraries Group, the OCLC Online Computer Library Center, and the Council on Library and Information Resources
- Preliminary investigation and software development phase
- Primarily e-journal deposit
- Future …???

21 Types of Archives: Institutional
- University with expansive focus
  - Stanford Digital Repository
- Automated
  - LOCKSS

22 Stanford Digital Repository
- Stanford Univ. Libraries initiative
- Digital preservation serving
  - Stanford University
  - Broader academic community
  - Publishers
- Principles: Trust, Security, Transparency
- Costs?

23 LOCKSS
- Technology to preserve local library collection
- Automated, self-correcting cache servers
- Requires LOCKSS server at library
- Requires publisher participation
- Builds collection of all resources which the institution licenses
- Goes online to users if data source becomes unavailable
- Provides access to static “HTML images” of source
- Costs
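The phrase "self-correcting cache servers" can be illustrated with a toy sketch. The code below is not the LOCKSS protocol; it only assumes the general idea that several caches hold copies of the same item, compare content hashes, and replace a copy that disagrees with the majority. The function name `audit_and_repair` and the cache names are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one cached copy."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(copies: dict[str, bytes]) -> list[str]:
    """Compare all copies by hash and overwrite minority copies in place.

    `copies` maps a (hypothetical) cache name to the bytes it holds.
    Returns the names of the caches that were repaired.
    """
    votes = Counter(digest(content) for content in copies.values())
    winning_digest, count = votes.most_common(1)[0]
    # Without a strict majority there is no trustworthy reference copy.
    if count <= len(copies) // 2:
        raise ValueError("no majority among copies; cannot repair")
    good = next(c for c in copies.values() if digest(c) == winning_digest)
    repaired = []
    for name, content in list(copies.items()):
        if digest(content) != winning_digest:
            copies[name] = good
            repaired.append(name)
    return repaired


if __name__ == "__main__":
    caches = {
        "library_a": b"article text",
        "library_b": b"article text",
        "library_c": b"article te\x00t",  # simulated bit rot
    }
    print(audit_and_repair(caches))
```

A real system would also have to handle caches that are offline or deliberately dishonest, which this sketch ignores.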

24 Types of Archives: Product Solution
- Non-profit organization
  - Portico

25 Portico
- Mission: scholarly preservation
  - Standalone archive
  - Initiated by JSTOR, with grant funding
- Deposits: source files from publisher
- Migration: planned
- Costs
  - Publishers: annual fee of $250 to $75,000, based on annual revenue
  - Libraries: annual fee of $1,500 to $24,000, based on Library Materials Expenditure

26 Portico: Access Control
- Member libraries get access:
  - “when specific trigger events occur, and when titles are no longer available from the publisher or other source.”
- Trigger events include:
  - Publisher stops operations
  - Publisher ceases to publish a title
  - Publisher no longer offers back issues
  - Catastrophic and sustained failure of a publisher’s delivery platform
- Can also fulfill “perpetual access” subscription obligations

27 Types of Archives: Community
- Community based and openly run
  - CLOCKSS

28 CLOCKSS (Controlled LOCKSS)
- Long-term global archiving solution
- Community-managed, failsafe repository for scholarly content
- Serves libraries & publishers in the event of a long-term business interruption
- Publisher participation is voluntary
- A small number of library participants maintain the archive on behalf of the larger community
  - Libraries preserve member publisher content whether they subscribe or not
- Release only after a trigger event
  - Release is a collaborative decision among publishers, libraries, and societies
- “Cost sharing” for system, not access
- Costs?

29 Summary Table
Archive   Agency   Primary Mission   Data          Access Control   Migration
KB        Gov’t    Preserv           Pub           Twilight         Yes
BL        Gov’t    Preserv           Pub           ?                Yes
Portico   Ind.     Failsafe          Pub           Dark             Yes
PMC       Gov’t    Access            Pub, Author   Light            Yes
LoC       Gov’t    Preserv           Pub           ?                ?
SDR       Inst.    Preserv           Pub           Twilight         Yes
LOCKSS    Inst.    Failsafe          Pub           Dark             -
CLOCKSS   Comm.    Failsafe          Pub           Dark             -

30 Summary: How Repositories Differ
- Stated purpose
- Dark v. light
- Complete backfile v. current only
- Deposits
  - Who: author v. publisher
  - What: manuscripts v. final work
  - Why: voluntary v. mandated
- Rights transfer
- Access control
- Costs

31 Archiving Part III: A SAGE Example

32 Why Archive?
- SAGE’s commitment to customers and partners
- Critical to society arrangements
- Essential for new e-sales (consortia + single institutions) – Perpetual access
- Business continuity
- Long-term preservation
- We are not archiving experts!

33 Where to Archive?
- Dutch KB
- CLOCKSS
- LOCKSS
- Portico
- Library of Congress
- British Library

34 How to Archive?
- Provide details of digital availability
- Provide sample of content
- Provide details of content format (DTD)
- Send all backfile for loading
- Set up content flow for ongoing content

35 SAGE Experience with Dutch KB
- Contract and negotiation
- Contact with technical team
- Delivery of samples and details of scope
- Follow-up questions
- Visit KB – Find out what’s happening
- Delivery of back content
- Delivery of ongoing issues
- Ongoing issue discrepancies

36 Archiving Part IV: Questions, Questions and More Questions

37 Measurements of Success
- Who is overseeing the archiving process and governance?
- Compliance?
- Accuracy and legitimacy?
- Financial stability?

38 Resources
- Archiving should be done by librarians and archivists, period. Gordon Tibbitts, Blackwell Publishing. April 4, 2006, UKSG
- Portico - http://www.portico.org/
- LOCKSS - http://lockss.stanford.edu
- CLOCKSS - http://www.lockss.org/clockss/Home
- KB E-Depot - http://www.kb.nl/index-en.html
- Digital Archiving at the national library of the Netherlands - http://www-5.ibm.com/be/pdf/en/events/nextlevel/presentation_kb_den_haag_edepot_ibm_brussels_v03.pdf
- “A Fresh Look at the Reliability of Long-term Digital Storage.” Baker, Mary, et al. EuroSys '06, April 18-21, 2006
- Digital Archives & Repositories: Why should I care? – Bernard Hecker, HighWire Press, Publishers Meeting, October 2004
- Archive Overview – Bernard Hecker, HighWire Press, Publishers Meeting, April 2006
- Trusted Digital Repositories: Attributes and Responsibilities. An RLG-OCLC Report. © 2002 Research Libraries Group
- British Library: Project: JCLD Pilot Project in Anticipation of E-Journals, June 2005 Simon Inger
Note: Presentation based on Digital Archives & Repositories: Why should I care? – Bernard Hecker, HighWire Press, Publishers Meeting, October 2004; Archive Overview. Bernard Hecker, HighWire Press, Publishers Meeting, April 2006; Archiving: A SAGE Example. John Shaw. Publishers Meeting, April 2006

39 Thank You!
Contact info: John.Shaw@sagepub.com www.sagepub.com

