Subject Repositories European collaboration in the international context 28-29 January 2010 Workshop Technical infrastructure & interoperability Benoit.

Subject Repositories: European collaboration in the international context
28-29 January 2010
Workshop: Technical infrastructure & interoperability
Benoit Pauwels, Université Libre de Bruxelles, Belgium

Workshop plan
Theme 1: The Economists Online network of data providers
- General infrastructure of the EO solution
- DIDL/MODS: the EO metadata exchange format
- RDF/XML admin file: decentralized administration
- Enrichment of metadata
Theme 2: Economists Online and RePEc
- Pulling metadata from RePEc
- Pushing metadata to RePEc
- Contribute to LogEc
- Use CitEc

Workshop plan
Theme (45’)
- Introduction (BP, 20’)
- 3 topics for brainstorming (breakout groups, 10’)
- Breakout groups reporting back (all, 15’)

Theme 1: The Economists Online network of data providers
- General infrastructure of the EO solution
- DIDL/MODS: the EO metadata exchange format
- RDF/XML admin file: decentralized administration
- Enrichment of metadata

General infrastructure of the EO solution
[Architecture diagram. Labels: Metadata, Logs and Objects at the data providers (reached over OAI-PMH and HTTP); Meresco (Harvester, Crawler, Metadata, Lucene); Exporter engine (homemade, FOSS) with SRU, RePEc, OAI-PMH and RSS outputs; EO portal (homemade, FOSS); other portals.]

Exchange formats in the EO infrastructure
[Same architecture diagram, annotated with the exchange formats.]
- Metadata exchange format: DIDL / MODS, NEEO specs
- Usage metadata exchange format: SWUP, OFI Comm Profile

Technical decisions

Desired EO functionality                                  Technical decision
Facetted search & find experience                         Normalized/normalizable metadata
APA formatted citations                                   Granular metadata
Publication list per author                               Unambiguous identification of authors
Full text indexing/searching                              Unambiguous links to full texts
Enrichment of metadata (JEL, datasets, citations, ReDIF)  Extensible metadata format

Metadata exchange format: DIDL
- XML container structure that can hold semantically distinct metadata
  - descriptive metadata
  - object files (by-ref)
  - splash page
  - enriched metadata
    - JEL
    - full text (by-ref)
    - datasets (by-ref)
    - [ references ]
    - RePEc handle and metadata (by-ref)
- Based on existing container structure defined by SurfShare
- "info:eu-repo" vocabularies (objectfile accessRights, version, ...)

Metadata exchange format: MODS
- Granular descriptive metadata: MODS (3.2)
  - Based on existing metadata structure defined by SurfShare
  - "info:eu-repo" vocabularies (publication type, ...)
- Unambiguous identification of authors: DAI (Digital Author Identifier)
  - National or institution-unique persistent identifier
- Solutions not specific to the NEEO project; continuous aim of standardization at a level that surpasses the project

EO data model
- Publication is described as a complex (compound) object
  - persistent identifier
- Aggregation of 3 types of components
  - descriptiveMetadata (MODS)
  - objectFiles
  - humanStartPage
- Extensible: additional items can be stored within the complex object
- MODS contains the Digital Author Identifier (DAI) of the EO author

DIDL[1]
  Item[1]
    Descriptor/Identifier (persistent identifier)
    Descriptor/modified
    Item[1..∞] (of type descriptiveMetadata)
      Descriptor/type («descriptiveMetadata»)
      Component/Resource: representation by value (XML)
    Item[0..∞] (of type objectFile)
      Descriptor/type («objectFile»)
      Component/Resource: representation by ref. (URL)
    Item[0..1] (of type humanStartPage)
      Descriptor/type («humanStartPage»)
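The cardinalities in the data model above can be made concrete with a small sketch. This is not the NEEO schema itself: the class and attribute names are hypothetical, and only the rules visible on the slide are checked (one persistent identifier, at least one descriptiveMetadata item, any number of objectFile items, at most one humanStartPage).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DescriptiveMetadata:
    mods_xml: str  # MODS record, carried by value (inline XML)


@dataclass
class ObjectFile:
    url: str  # object file, carried by reference (URL)


@dataclass
class Publication:
    """Hypothetical in-memory view of one compound object (DIDL Item[1])."""
    identifier: str                                   # persistent identifier
    descriptive_metadata: List[DescriptiveMetadata]   # Item[1..n]
    object_files: List[ObjectFile] = field(default_factory=list)  # Item[0..n]
    human_start_page: Optional[str] = None            # Item[0..1]

    def problems(self) -> List[str]:
        """Return the cardinality rules from the slide that are violated."""
        found = []
        if not self.identifier:
            found.append("missing persistent identifier")
        if len(self.descriptive_metadata) < 1:
            found.append("at least one descriptiveMetadata item is required")
        return found


if __name__ == "__main__":
    pub = Publication(
        identifier="hdl:2013/941",  # illustrative value only
        descriptive_metadata=[DescriptiveMetadata("<mods>...</mods>")],
        object_files=[ObjectFile("http://repository.example.org/file.pdf")],
    )
    print(pub.problems())
```

Modelling humanStartPage as an optional single value makes the "0..1" rule impossible to violate by construction, so only the two remaining rules need a runtime check.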

Metadata exchange format: implementations in NEEO
- DIDL application profile
- MODS application profile
- Vocabularies in DIDL and MODS
- Technical guidelines for project partners
- Solutions: home-made or with external support
  - ARNO: home-made
  - DSpace: home-made, AtMire
  - EPrints: home-made, ECS-University of Southampton
  - Fedora: METS/MODS -> DIDL/MODS
  - DigiTool: METS/MARC -> DIDL/MODS

Decentralized registry service
- XML-RDF file: FOAF + NEEO-specific vocabulary
- Maintained by each data provider on a local web server
  - information on the institution: name, description, ...
  - OAI baseURL + OAI sets to harvest
  - EO authors: photograph, full name, affiliation, DAI
- Fetched over HTTP GET and validated by the EO Gateway at regular intervals
  - Automated harvesting process
  - Made visible through the portal
- New partner
  - Create admin file
  - Ask for registration at economistsonline@uvt.nl, declaring the location of the admin file and validating it
  - If valid, you're in
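A minimal sketch of what the gateway-side read of such an admin file could look like. The FOAF and RDF namespaces are the standard ones; the `neeo` namespace URI, the element names `oaiBaseURL`, `oaiSet` and `dai`, and the sample values are all assumptions made for illustration, since the slides do not show the actual NEEO vocabulary.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "neeo": "http://example.org/neeo#",  # hypothetical namespace
}

# Hypothetical admin file; a real one would be fetched over HTTP GET.
SAMPLE_ADMIN_FILE = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:neeo="http://example.org/neeo#">
  <foaf:Organization>
    <foaf:name>Example University</foaf:name>
    <neeo:oaiBaseURL>http://repository.example.org/oai</neeo:oaiBaseURL>
    <neeo:oaiSet>economics</neeo:oaiSet>
    <foaf:member>
      <foaf:Person>
        <foaf:name>Jane Doe</foaf:name>
        <neeo:dai>dai-0001</neeo:dai>
      </foaf:Person>
    </foaf:member>
  </foaf:Organization>
</rdf:RDF>
""".strip()


def parse_admin_file(xml_text):
    """Extract institution, harvesting and author data; raise if invalid."""
    root = ET.fromstring(xml_text)
    org = root.find("foaf:Organization", NS)
    if org is None:
        raise ValueError("admin file has no foaf:Organization")
    name = org.findtext("foaf:name", default="", namespaces=NS)
    base_url = org.findtext("neeo:oaiBaseURL", default="", namespaces=NS)
    if not name or not base_url:
        raise ValueError("institution name and OAI baseURL are required")
    authors = [
        {
            "name": person.findtext("foaf:name", default="", namespaces=NS),
            "dai": person.findtext("neeo:dai", default="", namespaces=NS),
        }
        for person in org.findall("foaf:member/foaf:Person", NS)
    ]
    return {
        "institution": name,
        "oai_base_url": base_url,
        "oai_sets": [s.text for s in org.findall("neeo:oaiSet", NS)],
        "authors": authors,
    }


if __name__ == "__main__":
    print(parse_admin_file(SAMPLE_ADMIN_FILE))
```

Because each partner hosts its own file, a failed parse only affects that partner's harvest; the gateway can keep the last valid copy and retry at the next interval.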

EO infrastructure with enrichment service
[Architecture diagram. Labels: Metadata, Logs and Objects at the data providers (reached over OAI-PMH and HTTP); Meresco (Harvester, Crawler, Metadata, Lucene); Enrichment service, connected over SRU and OAI-PMH; Exporter engine (homemade, FOSS) with SRU, RePEc, OAI-PMH and RSS/Atom outputs; EO portal (homemade, FOSS); other portals.]

Metadata enrichment
- "Automated" enrichment: JEL, full text
  - ES (enrichment service) gets records to be enriched from EO, over SRU
    - Based on date of request for enrichment of certain type and version
    - Based on flag set in EO record
  - ES creates enrichment record(s)
  - ES makes enrichment records available to EO, over OAI-PMH
  - EO harvests enrichment records from ES and integrates them into the original record
  - EO reuses enrichment information in its services: index & present
- "Manual" enrichment: datasets
  - Partner enters permalink of publication on DVN platform
  - EO PMH-harvests DDI from DVN, and stores by-ref information
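The automated round trip described above can be sketched with plain dictionaries. The SRU and OAI-PMH transport is left out entirely; the record layout, the `enrich_requested` flag and the function names are hypothetical and only illustrate the select, enrich and integrate steps.

```python
def select_for_enrichment(records, kind):
    """Step 1: pick the EO records flagged for enrichment of the given kind."""
    return [r for r in records if kind in r.get("enrich_requested", set())]


def make_enrichment_record(record, kind, value):
    """Step 2: the ES produces a separate enrichment record for one target."""
    return {"target": record["id"], "kind": kind, "value": value}


def integrate(records, enrichment_records):
    """Step 3: EO merges harvested enrichment records into the originals."""
    by_id = {r["id"]: r for r in records}
    for enrichment in enrichment_records:
        target = by_id.get(enrichment["target"])
        if target is None:
            continue  # enrichment for a record EO no longer holds
        target.setdefault("enriched", {})[enrichment["kind"]] = enrichment["value"]
        # Clear the flag so the record is not selected again.
        target.get("enrich_requested", set()).discard(enrichment["kind"])
    return records


if __name__ == "__main__":
    eo_records = [
        {"id": "pub-1", "enrich_requested": {"jel"}},
        {"id": "pub-2"},
    ]
    todo = select_for_enrichment(eo_records, "jel")
    # "O31" is only an illustrative JEL code.
    harvested = [make_enrichment_record(r, "jel", ["O31"]) for r in todo]
    print(integrate(eo_records, harvested))
```

Keeping the enrichment as a separate record until integration mirrors the slide: the ES never writes into EO directly, EO pulls and merges.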

Enriched publication (LinkedData / SemanticWeb / ORE ready)
[Diagram: IR / ES feeding EO; components shown include PDF, HTML and TXT object files, a dataset (DDI) and a review.]

DIDL[1]
  Item[1]
    Descriptor/Identifier (persistent identifier)
    Descriptor/modified
    Item[1..∞] (of type descriptiveMetadata)
    Item[0..∞] (of type objectFile)
    Item[0..1] (of type humanStartPage)
    Item[0..∞] (of type text)
    Item[0..∞] (of type enrichedMetadata)
    Item[0..∞] (of type dataset)
      Descriptor/Identifier (persistent identifier)
      Descriptor/modified
      Item[1..∞] (of type descriptiveMetadata)
      Item[0..∞] (of type objectFile)
    Item[0..∞] (of type review)

Theme 1: The Economists Online network of data providers (breakout groups)
BO Group 1: DIDL/MODS
- Scalable? Implementation by 100s of partners
- Local experiences from existing partners: implementation issues you want to share?
- Can this become a standard for exchange of metadata of IR contained publications? Where does this stand next to (flavours of) DC, SWAP, ...?
BO Group 2: XML Admin file
- DAI?
BO Group 3: Enrichment model
- Extensibility: vocabulary for semantics of components
- Manual enrichment: need for enriched submission form, making it easy for people to make enriched publications
- Automated (JEL, full text): sustainable?

Workshop plan
Theme 2: Economists Online and RePEc
- Pulling metadata from RePEc
- Pushing metadata to RePEc
- Contribute to LogEc
- Use CitEc

RePEc model
- RePEc archives contain RePEc series, which contain working papers, articles, books, book chapters, software
- Manually maintained by research centres, journal publishers, university departments all over the world
- +/- 900 archives, more than 4000 series
- ReDIF metadata format
- Network accessible over FTP or HTTP
- Aggregation by RePEc services: EconPapers, IDEAS
- Central PMH-accessible aggregated archive of AMF formatted metadata

RePEc model: example ReDIF record

Template-type: ReDIF-Paper 1.0
Author-Name: Capron, Henri
Author-Email: hcapron@ulb.ac.be
Author-Name: Meeusen, Wim
Author-Email: wim.meeusen@ua.ac.be
Author-Name: Dumont, Michel
Author-Person: pdu51
Author-Name: Cincera, Michele
Author-Person: pci5
Title: National innovation systems: pilot study of the Belgian innovation system
Creation-Date: 1998
Publication-Status: Published as a report for the Belgian Federal Office for Scientific, Technical and Cultural Affairs (OSTC)
File-URL: http://bib17.ulb.ac.be:8080/dspace/bitstream/2013/941/1/mc-0048.pdf
File-Format: application/pdf
Handle: RePEc:dul:ecoulb:2013-941
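A ReDIF record like the one above is a sequence of "Field-Name: value" lines, with author attributes clustered after each Author-Name. The following is a minimal reading sketch under those assumptions, not a complete ReDIF parser: it handles one record, treats indented lines as continuations, and ignores everything else the ReDIF specification defines. The function names are hypothetical.

```python
SAMPLE = """\
Template-type: ReDIF-Paper 1.0
Author-Name: Capron, Henri
Author-Email: hcapron@ulb.ac.be
Author-Name: Meeusen, Wim
Author-Email: wim.meeusen@ua.ac.be
Author-Name: Dumont, Michel
Author-Person: pdu51
Author-Name: Cincera, Michele
Author-Person: pci5
Title: National innovation systems: pilot study of the Belgian innovation system
Creation-Date: 1998
File-URL: http://bib17.ulb.ac.be:8080/dspace/bitstream/2013/941/1/mc-0048.pdf
File-Format: application/pdf
Handle: RePEc:dul:ecoulb:2013-941
"""


def parse_redif(text):
    """Return the record as an ordered list of (field, value) pairs."""
    fields = []
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t" and fields:
            # Indented line: continuation of the previous field's value.
            key, value = fields[-1]
            fields[-1] = (key, value + " " + line.strip())
            continue
        # Split on the first colon only; values may contain colons (URLs).
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError("not a ReDIF field line: %r" % line)
        fields.append((key.strip(), value.strip()))
    return fields


def group_authors(fields):
    """Cluster Author-* fields: each Author-Name starts a new author."""
    authors = []
    for key, value in fields:
        if key == "Author-Name":
            authors.append({"Name": value})
        elif key.startswith("Author-") and authors:
            authors[-1][key[len("Author-"):]] = value
    return authors


if __name__ == "__main__":
    record = parse_redif(SAMPLE)
    for author in group_authors(record):
        print(author)
```

A list of pairs is used instead of a dictionary because fields such as Author-Name repeat and their order carries meaning.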

RePEc model compared to IR model
Very similar, BUT:
- RePEc model
  - Harvests only from "official" publisher repositories
  - Therefore: 1 work exists once in RePEc, and it is guaranteed to be the one and only "official" manifestation of the work
- IR model
  - Holds publications for which the institution is typically not the publisher
  - 1 work: 1 official manifestation + multiple author manifestations
  - One work can exist:
    - in one or more repositories
    - as different publication types
    - with different descriptive metadata
    - with different object files attached
    - with different object file metadata
- Pushing and pulling metadata records from RePEc and IR into one system is bound to raise problems

Pull metadata from RePEc
- EO harvests AMF formatted metadata records from http://oai.repec.openlib.org/
- Overlap!! The same records are harvested from the IR and from RePEc
- Solution: the XML Admin file contains the directive <not-from-repec-series>
  - Specifies which RePEc series do not need to be harvested from RePEc, since they are already delivered through the IR
- BUT:
  - The IR contains articles produced by its authors
  - These articles are also contained in a journal's RePEc series
  - Overlap in EO cannot be avoided
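The series-level exclusion can be sketched as a filter on RePEc handles. It assumes handles of the form RePEc:archive:series:item (as in the ReDIF example earlier in the deck) and assumes that <not-from-repec-series> lists series handles such as RePEc:dul:ecoulb; the slides do not show the directive's actual value format, and the function names are hypothetical.

```python
def series_of(handle):
    """Return the series part of a RePEc handle, e.g. 'RePEc:dul:ecoulb'."""
    parts = handle.split(":")
    if len(parts) < 4 or parts[0] != "RePEc":
        raise ValueError("unexpected RePEc handle: %r" % handle)
    return ":".join(parts[:3])


def filter_repec_records(repec_records, not_from_repec_series):
    """Drop RePEc records whose series the partner already delivers via its IR."""
    excluded = set(not_from_repec_series)
    return [r for r in repec_records if series_of(r["handle"]) not in excluded]


if __name__ == "__main__":
    harvested = [
        {"handle": "RePEc:dul:ecoulb:2013-941"},
        {"handle": "RePEc:aah:aarhec:1987-21"},
    ]
    kept = filter_repec_records(harvested, ["RePEc:dul:ecoulb"])
    print([r["handle"] for r in kept])
```

As the slide notes, this only removes overlap within a partner's own series; an article that also sits in a journal's RePEc series still arrives twice.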

Push metadata to RePEc
- EO sets up a "RePEc:ner" archive, containing ReDIF-X formatted records
- ReDIF-X
  - All records are delivered as "ReDIF-Paper", but with extra fields denoting the "real" publication status and version of the text
- Overlap!! Most institutions already maintain RePEc series: these records must not be pushed by EO
- The XML Admin file controls which series to feed into this "ner" archive
  - <feed-repec>: boolean, to feed or not to feed
  - <feed-repec-series>
    - If not given: all records with full text that are not working papers are mapped to one series for that institution
    - RePEc series corresponds to the OAI setSpec of the DIDL/MODS record
- BUT: the IR-inherent problem of multiple copies/versions is pushed to RePEc
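One possible reading of the two directives as a decision function is sketched below. The record fields, the `workingPaper` type value and the default series name are hypothetical, and the interpretation that a listed series is matched against the record's OAI setSpecs is an assumption drawn from the last sub-bullet, not something the slides spell out.

```python
def target_series(record, feed_repec, feed_repec_series=None,
                  default_series="default"):
    """Return the series in the 'ner' archive to feed, or None to skip."""
    if not feed_repec:
        return None  # <feed-repec> is false: nothing is pushed
    if feed_repec_series:
        # Series listed explicitly: push only records in a matching OAI set.
        for set_spec in record.get("set_specs", []):
            if set_spec in feed_repec_series:
                return set_spec
        return None
    # No series given: full-text records that are not working papers
    # all go to one series for the institution.
    if record.get("has_fulltext") and record.get("type") != "workingPaper":
        return default_series
    return None


if __name__ == "__main__":
    article = {"type": "article", "has_fulltext": True, "set_specs": ["ecoulb"]}
    print(target_series(article, feed_repec=True))
    print(target_series(article, feed_repec=True, feed_repec_series=["ecoulb"]))
    print(target_series(article, feed_repec=False))
```

Working papers are excluded in the default case because, per the slide, most institutions already deliver those to RePEc through their own series.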

Push metadata to RePEc: example ReDIF-X record

Template-type: ReDIF-Paper 1.0
Title: Block investments and the race for corporate control in Belgium
Author-Name: Chapelle, Ariane
Language: en
Note: info:eu-repo/semantics/published
X-PublishedAs-Type: article
X-PublishedAs-Article-Year: 2004
X-PublishedAs-Article-Journal: Corporate Ownership & Control
X-PublishedAs-Article-Volume: 2
X-PublishedAs-Article-Issue: 1
Order-URL: http://dipot.ulb.ac.be:8080/dspace/handle/2013/9943
File-URL: http://dipot.ulb.ac.be:8080/dspace/bitstream/2013/9943/1/ac-0007.pdf
File-Format: application/pdf
File-Version: authorVersion
Handle: RePEc:ulb:ecoulb:2013/9943
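A sketch of how an exporter might write such a record. The field names are taken from the example above; the input dictionary layout and the function name are hypothetical, and only the fields shown on the slide are emitted.

```python
def to_redif_x(pub):
    """Serialize one publication as a ReDIF-Paper record with X- fields."""
    fields = [("Template-type", "ReDIF-Paper 1.0"), ("Title", pub["title"])]
    fields += [("Author-Name", name) for name in pub["authors"]]
    published = pub.get("published_as")
    if published:
        kind = published["type"]  # e.g. "article"
        fields.append(("X-PublishedAs-Type", kind))
        prefix = "X-PublishedAs-%s-" % kind.capitalize()
        for key in ("year", "journal", "volume", "issue"):
            if key in published:
                fields.append((prefix + key.capitalize(), str(published[key])))
    fields += [
        ("File-URL", pub["file_url"]),
        ("File-Format", pub["file_format"]),
        ("File-Version", pub["file_version"]),
        ("Handle", pub["handle"]),
    ]
    return "\n".join("%s: %s" % (key, value) for key, value in fields)


if __name__ == "__main__":
    print(to_redif_x({
        "title": "Block investments and the race for corporate control in Belgium",
        "authors": ["Chapelle, Ariane"],
        "published_as": {
            "type": "article", "year": 2004,
            "journal": "Corporate Ownership & Control",
            "volume": 2, "issue": 1,
        },
        "file_url": "http://dipot.ulb.ac.be:8080/dspace/bitstream/2013/9943/1/ac-0007.pdf",
        "file_format": "application/pdf",
        "file_version": "authorVersion",
        "handle": "RePEc:ulb:ecoulb:2013/9943",
    }))
```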

LogEc
- Aim: track abstract views and download clicks of publications presented through RePEc services (EconPapers, IDEAS, ... Economists Online)
- NOT: tracking of usage at the level of the archives
  - Downloads of publications contained in RePEc archives that are initiated through Google do not show up in LogEc
- How:
  - EO logs abstract views and download clicks of object files
  - On a monthly basis, EO transforms these log entries into the requested LogEc format, using "rstat.pl"

2009-10 EconomistsOnline
RePEc:aah:aarhec:1987-21
a: 65.55.207.69 66.235.124.10
d: 66.235.124.10

- The RePEc handle of the publication is necessary. Therefore, EO partners that deliver content to RePEc directly (content that EO harvests from the IR rather than from RePEc) must include the RePEc handle in the DIDL/MODS record
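The monthly transformation can be sketched as a grouping step. This does not reproduce rstat.pl; it only mimics the sample shown on the slide, and it assumes a layout of one header line (month and service name), then per handle one line with the handle, one "a:" line with the IP addresses of abstract views and one "d:" line with those of downloads. The event tuple layout and function name are hypothetical.

```python
from collections import defaultdict


def to_logec_report(month, service, events):
    """Group (handle, kind, ip) click events into a LogEc-style report.

    kind is "a" for an abstract view or "d" for a download.
    """
    per_handle = defaultdict(lambda: {"a": [], "d": []})
    for handle, kind, ip in events:
        if kind not in ("a", "d"):
            raise ValueError("unknown event kind: %r" % kind)
        per_handle[handle][kind].append(ip)

    lines = ["%s %s" % (month, service)]
    for handle in sorted(per_handle):
        lines.append(handle)
        for kind in ("a", "d"):
            ips = per_handle[handle][kind]
            if ips:
                lines.append("%s: %s" % (kind, " ".join(ips)))
    return "\n".join(lines)


if __name__ == "__main__":
    clicks = [
        ("RePEc:aah:aarhec:1987-21", "a", "65.55.207.69"),
        ("RePEc:aah:aarhec:1987-21", "a", "66.235.124.10"),
        ("RePEc:aah:aarhec:1987-21", "d", "66.235.124.10"),
    ]
    print(to_logec_report("2009-10", "EconomistsOnline", clicks))
```

The grouping key is the RePEc handle, which is why records harvested from an IR need to carry that handle before their usage can be reported.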

LogEc: carrying the RePEc handle in the EO record
[Diagram: RePEc (AMF metadata, RePEc handle) linked to the EO DIDL record.]

DIDL[1]
  Item[1]
    Descriptor/Identifier (persistent identifier)
    Descriptor/modified
    Item[1..∞] (of type descriptiveMetadata)
    Item[0..∞] (of type objectFile)
    Item[0..1] (of type humanStartPage)
    Item[0..∞] (of type descriptiveMetadata): RePEc (AMF metadata), byRef
      RePEc handle
      Descriptor/modified

CitEc
- Aim: citation analysis for RePEc publications
- How:
  - Analyze text: extract and parse the list of references from publications
  - References are checked for availability in RePEc
  - Cites: references to other RePEc publications
  - Textual references
  - CitedBy
  - Co-citations
- EO publications (from our IRs) are pushed to RePEc and therefore pass through the CitEc processing
- EO has access to the resulting CitEc data, and presents it through the EO portal (not yet, will be in Feb 2010)
- The RePEc handle of the publication is necessary. Therefore, EO partners that deliver content to RePEc directly (content that EO harvests from the IR rather than from RePEc) must include the RePEc handle in the DIDL/MODS record

Theme 2: Economists Online and RePEc (breakout groups)
BO Group 1: Push/pull to/from RePEc
- ReDIF-X data structure
- Duplicates; different versions of identical publication
BO Group 2: Publishing models
- Advantages/disadvantages of RePEc publishing model as opposed to IR publishing model
- Push the two models together? Do we need to foresee specific services in the gateway or portal to make these two live together in peace?
BO Group 3: Future RePEc/EO services
- What services should EO and RePEc jointly be looking at in the future in the interest of the economics researcher?