Connecting Finnish publication data to OpenAIRE - case VIRTA

Slides:



Advertisements
Similar presentations
The REPOX system Nuno Freire -
Advertisements

Christiane Stock Emmanuelle Rocklin Aurélie Cordier
IST Humboldt University Berlin, Germany – Computer and Media Service – Electronic Publishing Group Birgit Matthaei, 4th Sept. 2003, Bath,
OAForum – September 2003 Muriel Foulonneau Open Archives Initiatives Protocol for Metadata Harvesting Practices for the cultural heritage sector Muriel.
Pure Silver Reusing and Repurposing Bibliographic Data in a Current Research Information System and Institutional Repository 15 September.
Copying Archives Project Group Members: Mushashu Lumpa Ngoni Munyaradzi.
OpenAIRE & OA in H2020 Open Access Infrastructure for Research In Europe Inge Van Nieuwerburgh Gwen Franck.
CNRIS CNRIS 2.0 Challenges for a new generation of Research Information Systems.
SSTIC’s OpenAccess related activities in international context Ľubomír BILSKÝ Bratislava, 25 March 2015 Slovak Scientific and Technical Information Center.
The Open Archives Initiative Simeon Warner (Cornell University) Symposium on “Scholarly Publishing and Archiving on the Web”, University.
Debbie Campbell Director Collaborative Services National Library of Australia Electronic Resources Australia Annual Forum Sydney 10 July 2012 Trove’s Application.
An Electronic Journal Impact Study: The Factors that Change when an Academic Library Migrates from Print Carol Hansen Montgomery, Ph.D. Dean of Libraries,
Metadata: Its Functions in Knowledge Representation for Digital Collections 1 Summary.
Federated Networks of Open Access Repositories in Mexico and Latin America Rosalina Vázquez Tapia, Autonomous University of San Luis Potosí.
Grey Literature, E-Repositories and Evaluation of Academic & Research Institutes. The case study of BPI e-repository Maria V. Kitsiou - Head Librarian,
Using CERIF-based CRIS to support the academic and research community: emerging services in Greece Nikos Houssos National Documentation Centre / National.
National CRIS development in Finland Aija Kaitera Research Administration University of Helsinki.
The DNER - a national digital library Andy Powell ZIG Meeting, York October 2001 UKOLN, University of Bath UKOLN is funded by Resource:
1 Schema Registries Steven Hughes, Lou Reich, Dan Crichton NASA 21 October 2015.
BIEN Confederated DB (S) Analytical DB(s) Heterogeneous source database(s) of Plots/Specimens/Occurrences Synonymy Names Reference taxonomy *** *** Feedback.
Digital Collection of TUT Library
Accessing a national digital library: an architecture for the UK DNER Andy Powell ELAG 2001, Prague 7 June 2001 UKOLN, University of Bath
Caltech CODA CODA: Collection of Digital Archives Caltech Scholarly Communication.
From small beginnings: Developing collection level description Mapping the Information Landscape Showcase day British Library Conference Centre, London,25.
1.Registration block send request of registration to super peer via PRP. Process re-registration will be done at specific period to info availability of.
Patterns in caBIG Baris E. Suzek 12/21/2009. What is a Pattern? Design pattern “A general reusable solution to a commonly occurring problem in software.
 All the 16 Finnish universities enter publications to their own publication registers and submit the information to the Finnish Ministry of Education.
Metadata V1 By Dick M.A. Schaap – technical coordinator Oostende, June 08.
ΕΚΤ Access to Knowledge ΕΚΤ Access to Knowledge R&D Statistics Information System: An Interoperability Tail between CERIF and SDMX Dimitris Karaiskos Dimitrios.
THE NATIONAL LIBRARY OF FINLAND Towards reliable data – counting the Finnish Open Access publications CRIS 2016, St Andrews, June 10, 2016 Jyrki Ilva.
INTRODUCTION TO BIBLIOMETRICS 1. History Terminology Uses 2.
19th international symposium on Theses and Dissertations Data and Dissertations July 2016, Lille, France Dr. Jamal Alsalmi Sultan Qaboos University.
Towards integrating European research information
Current Research Information SysTem In Norway
VIRTA Publication Information Service
Head of Publishing, University of Jyväskylä
Jordan PIŠČANC, University of Trieste
OpenAIRE in 8 Minutes Tony Ross-Hellauer State and University Library,
Even minor integrations can deliver great value – A case study
Building A Repository for Digital Objects
Proposal for piloting the VIRTA publication information service at the European level Janne Pölönen, Hanna-Mari Puuska and Gunnar Sivertsen.
Information modeling and infrastructures for metadata
Peter Shepherd COUNTER March 2012
The OpenAIRE infrastructure
An Open Knowledge & Research Information Infrastructure
Towards connecting geospatial information and statistical standards in statistical production: two cases from Statistics Finland Workshop on Integrating.
Say, “S” (as) = semantics – and mean it
European VIRTA pilot – current situation
Accessing a national digital library: an architecture for the UK DNER
European Publication Information Service - a Pilot Project
European VIRTA pilot – eurooppalaisen julkaisutietovirran pilotointi
European Publication Information Service - a Pilot Project
VI-SEEM Data Repository
OpenAIRE Services for Funders
Outline Pursue Interoperability: Digital Libraries
Managing ETDs with Associated Complex Digital Objects
Introduction to Implementing an Institutional Repository
Grey Literature Repositories and CRIS in a SOA Environment
SISAI STATISTICAL INFORMATION SYSTEMS ARCHITECTURE AND INTEGRATION
Metadata for research outputs management Part 2
Open Access in the humanities research in Finland and Sweden
YTY − an integrated production system for business statistics
Open Access Project Officer
University of Jyväskylä Faculty of Information Technology
Tech introduction.
Track Your Research Impact
Research Information =Descriptive infromation, metadata, on e.g. publications, research data, projects, researchers, research groups and organizations.
Managing the Institutional Repository for OA Khawulile Radebe: Librarian: Repository Administrator & Metadata.
SDMX IT building blocks
Presentation transcript:

Connecting Finnish publication data to OpenAIRE - case VIRTA Joonas Nikkanen, Project Manager, CSC - IT Center for Science, https://orcid.org/0000-0002-5036-6444 Making sense of Open Science jungle - 12.3.2019 Helsinki

Agenda Background on VIRTA: VIRTA Publication Information Service Context: Finnish publication metadata from OpenAIRE’s point of view Case: Implementation of OpenAIRE Guidelines for CRIS Managers

VIRTA Publication Information Service

Publication information collection in Finland Ministry of Education and Culture collects bibliographic information on scientific publications annually from 14 universities and 5 university hospital districts (since 2011) 23 universities of applied sciences (since 2012) 12 state research institutes (gradually since 2014) Used as criteria in performance-based funding model of higher education institutions For universities 13 % of core funding (~200 mill. euros) is allocated via publication points Publication points are calculated based on publication types and their level (evaluated by national scholarly panels - Publication Forum - http://www.julkaisufoorumi.fi/en) Publication type Level 3 Level 2 Level 1 Level 0 Peer-reviewed monograph (C1) 16 12 4 0.4 Peer-reviewed article in journal (A1-2) 3 1 0.1 Peer-reviewed article in book (A3) Peer-reviewed article in proceedings (A4) Peer-reviewed edited work (C2) Not-peer-reviewed monographs Not-peer-reviewed articles

Use of publication information in Finland Organizations provide a copy of their publication information to VIRTA making it a national data warehouse (or data hub) for other services to use the publication metadata In total some 60 000 publications per year = books, journal articles, conference papers, non-scholarly publications, dissertations, artistic publications All scientific fields are covered The data can be… …examined on bibliographic level: http://www.juuli.fi/ …pivoted on statistical level: www.vipunen.fi/en-gb/ …queried via authenticated REST API (XML, JSON) and OAI-PMH API (Dublin Core, XML) 24.4.2019

Annual publication data National funding model Vipunen Juuli Statistical portal for analysis Portal for examining publications +60k publications annually Academy of Finland 14 universities, 5 univ. hosp. Funding calls and reporting Pure / Converis / SoleCRIS VIRTA Publication Information System 23 universities of applied sci. +350k publications total Organisations’ systems Mostly JUSTUS / some manually Validating / de-duplicating / REST and OAI-PMH APIs available For master data purposes 12 state research institutes JUSTUS 2019 Varies from Pure to manual 2019 For prefilling co-publications Research Information Hub OpenAIRE Wider linking and interoperability National aggregator

Finnish publication metadata from OpenAIRE’s point of view

Finnish repositories supporting OpenAIRE harvesting Includes The repositories from 7 universities (+ 2 as sub-repositories) Common repository for universities of applied sciences (Theseus) One research institution (VTT) (+ 4 as sub-repositories) Self-archived (green OA) publications Theses Missing Repositories from 5 universities + most research institutions Publications not archived in repositories, e.g. Aaltodoc repository 2011-2017: 3049 publications AaltoCRIS 2011-2017: 35 886 publications Aalto publications in VIRTA 2011-2017: 27 257 publications 24.4.2019

Survey for Finnish OpenAIRE providers by Finnish NOAD in 2018 The survey was carried out by the Finnish NOAD, the University of Helsinki Responses from 7 (out of 9) current repositories that are harvested by OpenAIRE and 7 non-OpenAIRE repositories 6/7 current providers mentioned the work on data models and supporting harvestable API endpoint to be the biggest issues regarding OpenAIRE In some cases repositories are not connected to CRIS systems DSpace (used by most repositories) needs work (on publication forms etc.) to be compliant with OpenAIRE specifications Harvested metadata is rather poor after mapping (repositories include richer metadata) 6/7 of non-OpenAIRE harvested repositories are planning to implement OpenAIRE support Half of them to be harvestable in 2018/2019, others have no schedule yet 5/7 mention the technical implementation and work on metadata to be too resource intensive to yet have support for OpenAIRE spesifications Summary of results (in Finnish) at: https://blogs.helsinki.fi/openaire2020/2018/08/23/kansallisesta-koordinaatiosta-toivotaan-tukea-julkaisuarkistoille-openaire-kyselyn-tulosten-yhteenveto/ 24.4.2019

Implementation of OpenAIRE Guidelines for CRIS Managers 24.4.2019

Why? To expose more of the Finnish research publications to OpenAIRE Repositories only include a fraction of publications Some institutional repositories are missing OpenAIRE integration Centralized solution - to save time and resources Implementation and possible updates to OpenAIRE specifications has to be done to one system only Does not limit possible individual implementations in Finnish CRIS systems Better quality and more complete metadata for OpenAIRE Future additions to VIRTA data model can be harvested to OpenAire as well 2018 - Q1 2018 - Q2 2018 - Q3 2018 - Q4 2019 - Q1 2019 - Q2 Planning Presenting to organizations Mapping data models OAI-PMH endpoint validating OpenAIRE beta First harvests to OpenAIRE? Procedures for CERIF-XML Permissions from organizations OAI-PMH work 24.4.2019

Guidelines for CRIS Managers “The Guidelines provide orientation for CRIS managers to expose their metadata in a way that is compatible with the OpenAIRE infrastructure. By implementing the Guidelines, CRIS managers support the inclusion and therefore the reuse of metadata in their systems within the OpenAIRE infrastructure.” OpenAIRE Guidelines for CRIS Managers version 1.1. released in June 2018

Research organisations Research Information Hub Universities Universities of applied sciences Vipunen Statistical data VIRTA University hospitals Research institutions GET Juuli Examine metadata Source systems GET Load XML files to SA (SSIS) Publication data (CTRIS, DW) Validate data (SQL procedure) Research organisations Organisations transfer data to Publication XML Schema (eg. via CSV-XML tool) Find Jufo_IDs (C#) GET CRIS / master data use Find co-publications (C#) GET JUSTUS Find duplicates (C#) Metadata prefill SFTP Interface XML files synced APIs (REST, OAI-PMH) Send validation email to organisation Publication Data (XML) GET/POST PUT/DELETE Powershell script Transfer data to DW (SQL Procedure) Academy of Finland Applications / reporting Create VIRTA-XML (SQL Procedure) GET JUSTUS Database copy via SSH Create CERIF-XML (SQL Procedure) OpenAIRE Publication data (SQL) Create Dublin Core (SQL Procedure) Metadata harvesting Update API tables (SQL Procedure) Database connection Research Information Hub Update reports (SQL Procedure) Linking to data, projects etc. Ready To do

Data models Many similarities between VIRTA and CERIF data models on publications and what elements are included Key differences: Publication type classification Case of IDs as national aggregator Open access classification National classifications (e.g. field of science) Chosen not to be included in mapping: Publication forum levels (national scholarly panels) Artistic publications 24.4.2019

Data models VIRTA - CERIF mapping table available: https://wiki.eduuni.fi/pages/viewpage.action?pageId=80941717 VIRTA data model for reference: https://tietomallit.suomi.fi/model/julkaisu/ 24.4.2019

Lessons learned Implementation process highly dependent on source system architecture / technologies Following documentation is straightforward, but further support and best practices would be useful for implementers Aiming for high quality metadata equals more work and more complex mapping (+ upkeep) 24.4.2019

Thank you! 24.4.2019

Joonas Nikkanen Project Manager Research Information Management and Interoperability Tel. +358 50 381 80 92 linkedin.com/in/joonas-nikkanen 24.4.2019