Realizing the Dream of a Global Digital Library in High-Energy Physics Annette Holtkamp, Salvatore Mele, Tibor Simko, Tim Smith CERN, Geneva DML 2010 –

Realizing the Dream of a Global Digital Library in High-Energy Physics Annette Holtkamp, Salvatore Mele, Tibor Simko, Tim Smith CERN, Geneva DML 2010 – Paris 7 Jul 2010

2 HEP community
- closely-knit community
  - 20-30k active researchers publishing 10k articles
  - large collaborations (up to 5000 members)
  - very international (even small author groups)
  - authors = readers
- rapid information exchange essential
  - mailing of preprints since the 1960s
  - long OA tradition
  - >90% of HEP journal articles on arXiv
- dominance of community-based information systems
  - arXiv
  - SPIRES

3 Dominance of community services
From a 2007 survey of 2,000 physicists. Gentil-Beccot et al., "Information Resources in High-Energy Physics: Surveying the Present Landscape and Charting the Future Course", J.Am.Soc.Inf.Sci. 60: , 2009, arXiv:

4 SPIRES (1974-)
- network of databases
  - HEP literature, conferences, institutions, experiments, hepnames, jobs
- SLAC, DESY, Fermilab collaboration
- SPIRES-HEP
  - metadata for 850k objects, ~800 new records per week
  - preprints, journal articles, conference contributions, books, grey literature
  - since 1974, web server since 1991
  - 100k searches/day
- high data quality, manually curated, comprehensive coverage
- high acceptance, user involvement
But:
- outdated technology from the 1970s

5 Invenio (2002-)
- digital multimedia library system
- platform for CERN Document Server (CDS)
- powerful search engine
  - Google-like speed for up to 5M records
  - combined metadata, reference and fulltext search
- flexible metadata (MARCXML, multimedia)
- personalization and collaborative features
- modular architecture
- Apache/Python/MySQL
- GNU General Public Licence
  - ~30 instances worldwide
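To make the "flexible metadata (MARCXML)" point concrete, here is a minimal sketch of reading fields from a MARCXML record using only the Python standard library. The sample record and the helper name `subfield_values` are invented for illustration and are not taken from Invenio; the namespace and the tags 100 (main author) and 245 (title) follow the MARC 21 standard.

```python
import xml.etree.ElementTree as ET

# Namespace of the MARC 21 "slim" XML schema.
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Invented sample record, for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Doe, J.</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">An example title</subfield>
  </datafield>
</record>"""


def subfield_values(record_xml, tag, code):
    """Return the text of every subfield `code` inside datafields with `tag`."""
    root = ET.fromstring(record_xml)
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text for el in root.findall(path, MARC_NS)]


titles = subfield_values(SAMPLE, "245", "a")
authors = subfield_values(SAMPLE, "100", "a")
print(titles, authors)
```

Because every field is addressed by tag and subfield code, new kinds of metadata can be added to a record without changing the reading code.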

6 ingestion

7 dissemination

8 run by (2007-)

9 INSPIRE development
- 2007: inception, feasibility study
- 2008: user-level functionalities
  - data conversion
  - citation analysis, search syntax, output formats…
- 2009: cataloguing functionalities
  - metadata maintenance and enrichment tools
- 2010: workflow
  - harvesting, cataloguing…
- April 2010: public beta version

10 Bibliographic Content
- SPIRES content (plus part of CDS): journal articles, conference proceedings, preprints, experimental notes, theses
- going beyond SPIRES: conference slides, multimedia, software, high-level research data…
- going back before 1974
- more material from neighboring disciplines (astrophysics, nuclear physics, mathematics…) that is cited by core HEP articles

11 "Fulltext" repository
- all freely accessible articles
  - esp. "endangered" material
- access-restricted articles
  - "hidden archive"
  - first agreements with Springer and APS
- historical material
  - scanning of old preprint series
- beyond articles
  - slides, multimedia, software, wikis…
  - independent citable objects

12 INSPIRE features I
- advanced search functionality
  - Google-like freetext search
  - complex second-order searches
Example: find the most influential HEP core papers that cite the Hitchin article "Generalized Calabi-Yau manifolds" but don't cite any papers by Polchinski:
refersto:reportnumber:math/ collection:core cited:100->9999 NOT refersto:author:Polchinski
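The second-order search above combines four clauses, and a small helper makes that structure explicit. This is only a sketch: the function name `second_order_query` is hypothetical, and the report number passed in the usage line is a made-up placeholder, not the identifier of the Hitchin article. The operators (`refersto:`, `collection:`, `cited:`, `NOT`) are copied from the slide's example.

```python
def second_order_query(cites_report, collection, min_cites, max_cites, exclude_author):
    """Assemble a search string in the syntax shown on the slide.

    Finds records in `collection` that cite the report `cites_report`,
    have between `min_cites` and `max_cites` citations, and do not cite
    any paper by `exclude_author`.
    """
    parts = [
        f"refersto:reportnumber:{cites_report}",
        f"collection:{collection}",
        f"cited:{min_cites}->{max_cites}",
        f"NOT refersto:author:{exclude_author}",
    ]
    return " ".join(parts)


# "math/0000000" is a placeholder report number, used for illustration only.
query = second_order_query("math/0000000", "core", 100, 9999, "Polchinski")
print(query)
```

Here "second-order" means that `refersto:` filters on properties of the papers a record cites, not on the record itself.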

13 INSPIRE features II
- detailed record pages
  - abstract, keywords, references, citations, fulltext, figures
  - various export formats
- comprehensive author pages
  - affiliation history, coauthors, frequent keywords, article classification, citation summary
- citation analysis
  - cited by, co-cited with, self-citations, citation history
- taxonomy-based classification
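The citation-analysis notions listed above (cited by, co-cited with, self-citations) can be illustrated on a toy citation graph. This is a sketch under stated assumptions: the paper identifiers, author labels and function names are invented, and the slides do not describe how INSPIRE actually computes these quantities. A self-citation is taken here to be a citation from a paper that shares at least one author with the cited paper.

```python
from collections import Counter

# Toy citation graph: paper -> set of papers it cites (invented identifiers).
CITES = {
    "A": {"C", "D"},
    "B": {"C", "D", "E"},
    "C": {"E"},
    "D": set(),
    "E": set(),
}
# Toy author sets per paper (invented labels).
AUTHORS = {"A": {"x"}, "B": {"y"}, "C": {"x"}, "D": {"z"}, "E": {"y"}}


def cited_by(paper):
    """Papers whose reference list contains `paper`."""
    return sorted(p for p, refs in CITES.items() if paper in refs)


def co_cited_with(paper):
    """Count how often other papers appear in the same reference list as `paper`."""
    counts = Counter()
    for refs in CITES.values():
        if paper in refs:
            counts.update(refs - {paper})
    return counts


def self_citations(paper):
    """Citing papers that share at least one author with `paper`."""
    return sorted(p for p in cited_by(paper) if AUTHORS[p] & AUTHORS[paper])


print(cited_by("C"), dict(co_cited_with("C")), self_citations("C"))
```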

14 HEP taxonomy
Hierarchical structure of all important HEP concepts (e.g. dynamical symmetry breaking), providing
- synonyms (dynamically broken)
- related terms (spontaneous symmetry breaking)
- broader/narrower terms (symmetry breaking)
- definitions
- subject areas (high-energy physics – theory)
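One way to picture a taxonomy entry is as a record with one field per relation named on the slide. The sketch below uses a Python dataclass; the class name `Concept` and its field names are assumptions made for illustration and say nothing about the format INSPIRE uses. Treating "symmetry breaking" as the broader term of "dynamical symmetry breaking" is also an inference from the slide's example.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One taxonomy entry; field names are illustrative, not INSPIRE's schema."""
    label: str
    synonyms: list = field(default_factory=list)
    related: list = field(default_factory=list)
    broader: list = field(default_factory=list)
    narrower: list = field(default_factory=list)
    definition: str = ""
    subject_areas: list = field(default_factory=list)


# Entry built from the example terms on the slide.
dsb = Concept(
    label="dynamical symmetry breaking",
    synonyms=["dynamically broken"],
    related=["spontaneous symmetry breaking"],
    broader=["symmetry breaking"],  # assumed to be the broader term
    subject_areas=["high-energy physics - theory"],
)
print(dsb.label, "->", dsb.broader)
```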

15 Taxonomy applications
- fast automatic generation of keywords
  - enabling e.g. prompt alerts
  - manually curated afterwards
- automatic selection of HEP-relevant articles
  - no more delay in border areas caused by manual selection
- improved search algorithm (planned)
  - a search for "SUSY" will also find "supersymmetry"
  - narrow/broaden search
- user tagging (planned)
  - improve INSPIRE-generated classification
  - improve taxonomy
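The planned synonym-aware search could work roughly as in this sketch, which expands a search term with its taxonomy synonyms before querying. The synonym table, the function name `expand_query` and the `OR` syntax are all assumptions for illustration; the slides do not say how the feature will be implemented.

```python
# Tiny stand-in for the taxonomy's synonym relation (illustrative only).
SYNONYMS = {
    "susy": ["supersymmetry"],
}


def expand_query(term):
    """Return a query that matches the term or any of its known synonyms."""
    variants = [term] + SYNONYMS.get(term.lower(), [])
    return " OR ".join(f'"{v}"' for v in variants)


print(expand_query("SUSY"))
```

A term without synonyms is returned unchanged apart from quoting, so the expansion can be applied to every query term.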

16 Author identification
- INSPIRE author id
  - compatible with other identification schemes
  - active participation in ORCID
- author disambiguation
  - using e.g. lab ids, affiliation history, coauthors and more
  - INSPIRE-ids already assigned
- automatic association of papers with authors
  - using info on affiliations, coauthors, research topics, from publishers
  - example: G. Chen: 963 docs, 21 real authors, only 22 docs not assigned, 97.2% success rate
  - INSPIRE-id part of author lists of large collaborations
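The idea of separating papers that carry the same name string into distinct authors can be sketched with a simple clustering rule. This is not the INSPIRE algorithm, which the slides do not describe; it is a minimal illustration that links two papers when they share a coauthor or an affiliation and then groups linked papers with union-find. The function name and the sample data are invented.

```python
def cluster_signatures(papers):
    """Group papers with the same author name string into likely authors.

    papers: list of dicts with 'id', 'coauthors' (set) and 'affiliation' (str).
    Two papers are linked if they share a coauthor or have the same affiliation.
    """
    parent = list(range(len(papers)))

    def find(i):
        # Follow parent links to the cluster representative (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(papers)):
        for j in range(i + 1, len(papers)):
            a, b = papers[i], papers[j]
            if a["coauthors"] & b["coauthors"] or a["affiliation"] == b["affiliation"]:
                parent[find(i)] = find(j)

    clusters = {}
    for i, p in enumerate(papers):
        clusters.setdefault(find(i), []).append(p["id"])
    return sorted(clusters.values())


# Invented sample: four papers signed with the same name string.
papers = [
    {"id": "p1", "coauthors": {"Li"}, "affiliation": "Lab A"},
    {"id": "p2", "coauthors": {"Li", "Wu"}, "affiliation": "Lab B"},
    {"id": "p3", "coauthors": {"Roy"}, "affiliation": "Lab C"},
    {"id": "p4", "coauthors": {"Wu"}, "affiliation": "Lab D"},
]
print(cluster_signatures(papers))
```

Linking is transitive: p1 and p4 share nothing directly, but both connect to p2, so all three end up with the same presumed author.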

17 Coming sooner…
- personalization
  - personal accounts, bookshelves, display formats, alerts, RSS feeds
  - collaborative tools, user groups
- claim my papers
- user tagging
- fulltext search
  - snippet display
- plot extraction, figure caption search
  - captions in TeX, display via jsMath, TeX symbols searchable
- user submission
  - paper-centric (articles, supplementary material) and beyond

18 … or later
- innovative metrics
- semantic analysis
- content indexing of plots and tables
- recommender systems
  - combining citations, keywords, fulltext, usage pattern data...
- open API for 3rd party tools and searching
- object aggregation (OAI-ORE)
- OAIS standards for long-term document preservation

19 Partnerships
- researchers
  - user tagging, user submission
  - improved correction interfaces
  - feedback driving future developments
- information providers
  - close alliance with arXiv
  - data exchange with publishers/databases
  - standardized author identities
- neighboring fields
  - open harvesting and searching
  - ADS (SAO/NASA Astrophysics Data System)
  - DML!