Digital Libraries and e-Archiving at CERN: Challenges and Solutions for the Scientific Community, “First” (Tim Smith)

1 Digital Libraries and e-Archiving: Tim Smith (1/19)
Digital Libraries and e-Archiving at CERN
Challenges and Solutions for the Scientific Community
“First”, 28th September 2006
Tim Smith, CERN/IT

2 Why Such A Hot Topic?
• Software: ...
• National repositories: ...
• National strategies: ...
• International initiatives: The European Library ...
• Conferences: ECDL, iPres, ...
• Industry: Google Scholar / Book
• WWW + Google + Internet Archive
• Not enough?
• Data ≠ Information ≠ Knowledge

3 Scholarly Communication
• Author: Manuscript preparation
• Publisher: Copy editing, Consistency, Conventions, Refereeing, Publication, Dissemination
• Library: Subscription, Collection management, Classification, Cataloguing, Indexing, Reference retrieval, Archival, Search, Access
• Reader: Library/Journal Subscription, Communities, Find
• WWW
• Digital Library

4 Digital Library Services
• Aggregation, Collection, Conversion, Organisation, Enrichment, Stamping, Watermarking, Indexing, Ranking, Clustering, Classifying
• > 100 sources
• Expose CERN-authored material

5 Open Access
• Scholarly publication ≠ trade publication
• Signatory of Berlin Declaration
• Author grants
  - free, irrevocable, worldwide, perpetual right of access, …
• Store in repository
  - Unrestricted distribution, interoperability, long-term archiving, …

6 Digital Age Services
• Thus far, changed form not function
  - Reproduced paper chain
• Take advantage of native digital services
• Collaboration
  - Comments, reviews, baskets
• Immediacy
  - Email alerts, RSS feeds
• Intensive tasks
  - Keyword & citation extraction
  - Full text indexing & ranking
  - Conversion services: multiple download formats
• Flexible formats
  - Remove constraints of print versions
  - Internationalisation
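The "Full text indexing & ranking" service listed above can be illustrated with a minimal sketch. This is not the CDS implementation: the document ids, the sample texts, and the simple TF-IDF-style weighting are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Build an inverted index: term -> Counter(doc_id -> term frequency)."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term][doc_id] += 1
    return index


def search(index, n_docs, query):
    """Rank documents by summed (term frequency x inverse document frequency)."""
    scores = Counter()
    for term in tokenize(query):
        postings = index.get(term)
        if not postings:
            continue  # term appears in no document
        idf = math.log(1 + n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    return [doc_id for doc_id, _ in scores.most_common()]


# Hypothetical records, for illustration only.
DOCS = {
    "a": "higgs boson search at the LHC",
    "b": "detector calibration for the LHC",
    "c": "higgs higgs decay channels",
}
INDEX = build_index(DOCS)
```

Calling `search(INDEX, len(DOCS), "higgs")` ranks document `c` ahead of `a` because the term occurs there more often. A production service would add stemming, stop words, and a persistent index.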

7 Internationalisation

8 Connections and Statistics

9 Reviews and Comments

10 Key Word Extraction

11 Digital Age Processes
• Thus far, same actors and processes
  - Print medium was difficult to produce, distribute, archive, duplicate
  - Not so for electronic media!
• Publishers' role: certification and dissemination
  - How to get in (digital world)
  - Authority, Authenticity, Quality
  - Exploring new forms of peer review
• Open Access publishing: CERN initiative
  - Author-pay model
  - Break the vicious circle: Tenure / grant allocation

12 Advocacy and Coverage
• Legal deposit
  - Natural focal point: everything passed through publisher/printer
• Encouraging / promoting deposit
  - CERN publishing policy: deposit in eArchive
• Harvesting
  - CDS missing submissions
• Theoretical papers: close to 100%
• Experimental papers: average, about 70%
• Instrumentation papers: only 30%

13 Digital Age Content
• Multimedia
  - CPU intensive services: web download format preparation from masters
• Data behind the publication
  - Experimental data sets
  - Log books
• Institutional information
  - Multimedia records of the experiment life-cycle
  - Financial, social, etc.
• Dissemination of unfinished, unrefereed work

14 Video Archives
EGEE Interview: Bob Jones
• 0120kbps (2439 kb)
• 0480kbps (9814 kb)
• 1000kbps (20702 kb)
• 2000kbps (40092 kb)
• Multirate120 1000kbps (32977 kb)
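As a rough consistency check on the file sizes listed for this clip, the sketch below derives the implied playing time of each single-rate encoding. It assumes that "kb" means kilobytes and "kbps" means kilobits per second; the slide does not state the units, so the results are only indicative. The multirate file is left out because it has no single bitrate.

```python
# (bitrate in kbit/s, file size in "kb") as listed on the slide.
# Assumption: "kb" is kilobytes, so size * 8 gives kilobits.
ENCODINGS = [(120, 2439), (480, 9814), (1000, 20702), (2000, 40092)]


def duration_seconds(bitrate_kbps, size_kilobytes):
    """Implied playing time: total kilobits divided by kilobits per second."""
    return size_kilobytes * 8 / bitrate_kbps


durations = [duration_seconds(rate, size) for rate, size in ENCODINGS]

for (rate, _), seconds in zip(ENCODINGS, durations):
    print(f"{rate:>5} kbps -> {seconds:6.1f} s")
```

Under these assumptions all four encodings imply a clip of between about 160 and 166 seconds, which is what one would expect if they were all prepared from the same master.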

15 CDS Content and Usage
• Entries: 875,000
  - Articles, preprints, theses: 718,000
  - Books and proceedings: 57,000
  - Talks (slides, videos): 14,000
  - Periodicals: 3,000
  - Multimedia items (photos, clips): 30,000
  - Archived items: 54,000
• Full texts: 450,000
• Collections: 570
• Distinct users per month: 26,000
• Searches per month: 270,000
• 70% non-CERN

16 Not “born-digital”
• Multimedia archive project
• Metadata: key to retrieval
• Photo-caption project (retirees)
• Open reel Audio (1950s), U-matic (1970s), Beta SP (1980s), VHS (1980s)

17 Digitisation for Preservation
• Deposit in Digital Library
  - Improve access
  - Halt deterioration of objects
• Archiving of knowledge to preserve perennial access
  - Institutional archives
  - Subject archives
• Digital preservation needs
  - Strategies
  - Certification
  - Networks of backups
  - Storage model

18 Perpetual Access
• Active curation
  - Used to be largely passive until conservation work required
• Technology obsolescence
  - Not always possible to create exact digital copy or replicate appearance
  - Changing media or file format
• Need to verify integrity, authenticity, reliability
  - Audit trails and checksums: to eliminate transcription errors (or deliberate changes)
  - Associated metadata
  - Digital object and metadata encapsulation: ISO 14721 OAIS
• Multiple copies for security
  - Across different administrations: Los Alamos declass reps
  - LOCKSS and CLOCKSS
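The checksum idea on this slide can be sketched as a small fixity check: record a digest for every archived file, then recompute the digests later and report anything that changed or disappeared. This is an illustrative sketch only, not CERN's procedure. The choice of SHA-256, the manifest layout, and the file name `report.txt` are assumptions.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file under root (by relative path) to its checksum."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def verify(root, manifest):
    """Return the relative paths that are missing or whose checksum changed."""
    damaged = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            damaged.append(name)
    return damaged


def demo():
    """Record a manifest, alter one byte of the file, and verify again."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "report.txt").write_bytes(b"original content")
        manifest = build_manifest(root)
        before = verify(root, manifest)
        (root / "report.txt").write_bytes(b"0riginal content")  # simulated corruption
        after = verify(root, manifest)
    return before, after
```

A checksum only detects change; it cannot repair it. That is why the slide pairs verification with multiple copies: a copy that fails the check can be replaced from one that still passes.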

19 Outlook
• CERN is implementing solutions to manage 100s of PBs of LHC data
• CERN’s knowledge is being amassed in a Digital Library which is “safe on a 10yr timescale”
  - DB migration, redundancy, backups
• Long term preservation (100yr timescale) is an unsolved problem, but lots of initiatives
  - Bringing together IT specialists, librarians, archivists, museum curators, (authors)...
