CERIF 1.5 Tutorial, November 5th, 2012, euroCRIS Membership Meeting, Madrid, Spain
CERIF 1.5 entities: cfExpertiseAndSkills, cfEquipment, cfFunding, cfFacility, cfService, cfCitation, cfEvent, cfLanguage, cfCurrency, cfCountry, cfCurriculumVitae, cfPrize, cfQualification, cfGeographicBoundingBox, cfPostalAddress, cfElectronicAddress, cfPerson, cfProject, cfOrganisationUnit, cfResultPatent, cfResultPublication, cfResultProduct, cfIndicator, cfMeasurement, cfFederatedIdentifier
Slides Author Brigitte Jörg M.A. Information Science Information Systems, Business Economics CERIF Support Project – National Co-ordinator Innovation Support Center, UKOLN, University of Bath, Bath, UK CERIF TG Leader, Board Member euroCRIS, non-profit organization registered in the Netherlands Contact:
Introduction of the Speaker Jan Dvořák euroCRIS CERIF TG Deputy Lead CRIS2012 (Prague, June 2012) Organizer Charles University in Prague Faculty of Arts –Institute of Information Studies & Librarianship InfoScience Praha Research & Development & Innovation Information System (the national CRIS for CZ) Contact:
What is Research Information? Information about: Researchers Organisations (Research-performing, Funding) Funding Programmes, Calls Projects (Proposed, Ongoing, Completed) Publications, Patents, Data, Products Facilities, Equipment, Services Addresses, Geographic Bindings, Languages And their Relationships
Who needs Research Information?
Stakeholders: Funding Organisations, Researchers, Research Organisations, Decision Makers, Project Managers, Publishers, Enterprises, Intermediaries / Brokers, Media, Education, General Public, Libraries
What they need it for:
– visibility, finding collaborations, competitors, CV generation
– performance, strategic decisions, priorities, comparisons
– integration of relevant findings into lectures and training
– finding research results of potential market or innovative value
– distribution and communication
– information and education, interest
– finding reviewers, editors
– distribution of programs, evaluation of results, finding reviewers
– finding information for participation in projects, partnerships, usage of results
– integration and interoperability, strategic management
– overview of ongoing activities
– Libraries: acquisition, dissemination
Kinds of questions we want to support
– How many articles has author X published in 2011 as a first author?
– How many times have articles by author X been cited by the end of the previous year?
– Did author X publish with authors from outside the institution?
– In how many FP7 projects does/did organisation Z participate?
– How many publications have resulted from project Y?
– How many people have been employed in the course of FP7 projects from the 1st call in the New Member States?
– How many PhD students have participated in national research projects in country C? In which countries have they earned their master's degrees?
– How many women have been involved in FP7 projects?
– How often have articles in journal A been requested in 2010?
– How many articles have been published in field B?
Common European Research Information Format
The CERIF Evolution
Origins: EU Working Group on Research Databases; Workshop. Similar ideas: UN/UNESCO, OECD, CODATA.
CERIF 91 (entity: PROJECT)
PROJECT record: Acronym: ERGO; Participants: Keith Jeffery, Anne Asserson, many more; Organisations: Rutherford Appleton, University of Bergen, …
– Networking of DBs
– Exchange of Records
– EC Recommendation to Member States
CERIF 2000 Model (entities: CLASSIFICATION, RESULTS, EQUIPMENT, PROJECT, OrgUnit, PERSON, EXPERTISE, Roles)
– Data Model
– Multilinguality
– Controlled Vocabulary
– Roles / Types
– User-driven
– EC Recommendation to Member States
CERIF 2006 / 2008 Model (parts: Base, 2nd Level, Language, Semantics, Link)
– Data Model
– Model Normalization
– Robust/Consistent Structure
– Extensible Structure
– Semantic Layer
– XML Exchange Specification
– Elaboration on Publication
– CERIF Core Semantics
CERIF 1.3 (parts: Base, 2nd Level, Semantics, Language, Link, Infrastructure)
2nd Level entities: Measurement, GEO, Citation, CV, Prize, Qualification, ExpertiseAndSkills, Equipment, Facility, Funding, Service, ElectronicAddress, PostalAddress, Country, Currency, Language, Event, Metrics, Indicator, Measurement
– Data Model
– Infrastructure: Facility, Equipment, Service
– Measurement & Indicator
– Entities and Link Tables
– Geographic Bounding Box
– CERIF 1.3 Vocabulary: UUIDs, Terms, Schemes
CERIF 1.4 (XML): new XML format
CERIF 1.5: Federated Identifiers
Formal Semantics + Linked Data
Common European Research Information Format
CERIF is an EU Recommendation to Member States.
The European Commission (EC) has authorised euroCRIS to maintain and develop CERIF and its usage.
Model Levels
Conceptual Level (Specification): concepts relevant for the research domain and their relationships
Logical Level (ER Model): entities and their relationships
Physical Level (Database Scripts): data definition commands for the database
Semantic Layer (Declared Semantics): a formalized controlled vocabulary describing a general contextual semantics of the research domain, in line with the conceptual, logical and machine description
Concepts shown: Equipment, Project, Organisation, Service, Funding, Patent, Skills, CV, Product, Event, Person, Publication, Classification (Semantics)
SQL Script: CREATE Table cfPers; CREATE Table cfProj; CREATE Table cfOrgUnit
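The physical level can be pictured with a small sketch. Only the table names cfPers, cfProj and cfOrgUnit come from the slide; the column names and types below are assumptions made for illustration and are not the official CERIF SQL scripts. SQLite is used only because it ships with Python.

```python
import sqlite3

# Illustrative DDL: table names are from the slide, every column name and
# type is an assumption, not a copy of the official CERIF SQL scripts.
DDL = """
CREATE TABLE cfPers    (cfPersId TEXT PRIMARY KEY, cfGender TEXT, cfBirthdate TEXT, cfURI TEXT);
CREATE TABLE cfProj    (cfProjId TEXT PRIMARY KEY, cfAcro TEXT, cfStartDate TEXT, cfEndDate TEXT, cfURI TEXT);
CREATE TABLE cfOrgUnit (cfOrgUnitId TEXT PRIMARY KEY, cfAcro TEXT, cfHeadcount INTEGER, cfTurn REAL, cfCurrCode TEXT, cfURI TEXT);
"""

# Create the three base tables in a throwaway in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript(DDL)

# List the tables that now exist, to confirm the script ran.
tables = sorted(
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
)
print(tables)
```

The point of the sketch is the layering: the same concepts (person, project, organisation unit) appear at the conceptual and logical levels and end up as plain tables at the physical level.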
CERIF Model Structure (Views) CERIF Entity Types Base Entities Result Entities Infrastructure Entities 2nd Level Entities Link Entities CERIF Features Multiple Language Semantics Measures & Indicators Geographic Bounding Box
CERIF Base Entities
CERIF Base Entities Person ID URI Gender FirstNames OtherNames FamilyNames NameVariants ResearchInterest Keywords Project ID URI Acronym StartDate EndDate Title Abstract Keywords OrganisationUnit ID URI Acronym Name HeadCount CurrencyCode Turnover ResearchActivity Keywords
CERIF Base Entities
cfPerson: cfID, cfURI, cfGender, cfBirthdate; names: cfFamilyNames, cfFirstNames, cfOtherNames, cfNameVariants; multilingual: cfDescription, cfKeywords
cfProject: cfID, cfURI, cfAcronym, cfStartDate, cfEndDate; multilingual: cfTitle, cfAbstract, cfKeywords
cfOrganisationUnit: cfID, cfURI, cfAcronym, cfHeadCount, cfCurrencyCode, cfTurnover; multilingual: cfName, cfDescription, cfKeywords
CERIF Result Entities
CERIF Result Entities ResultProduct ID URI ResultPublication ID URI Title Subtitle Abstract Bibl. Note PublicationDate TotalPages StartPage EndPage Keywords ResultPatent ID URI PatentNumber Title CountryCode RegistrationDate ApprovalDate Description Keywords
CERIF Result Entities cfResultPublication cfID cfURI cfNumber PublicationDate cfStartPage cfEndPage cfTotalPages cfEdition cfSeries cfIssue cfVolume cfISBN cfISSN cfResultPatent cfID cfURI cfPatentNumber cfCountryCode cfRegistrationDate cfApprovalDate cfTitle cfAbstract cfKeywords cfSubtitle cfVersionInfo cfBibliographicNote cfAbbreviation cfDescription cfKeywords cfName cfResultProduct cfID cfURI cfVersionInfo cfAbstract cfKeywords cfName
CERIF Infrastructure Entities Equipment Facility Service
CERIF Infrastructure Entities Facility ID Acronym URI Title Description Keywords Service ID Acronym URI Title Description Keywords Equipment ID Acronym URI Title Description Keywords Equipment Facility Service
CERIF Infrastructure Entities
cfFacility: cfID, cfURI, cfAcronym; multilingual: cfName, cfDescription, cfKeywords
cfEquipment: cfID, cfURI, cfAcronym; multilingual: cfName, cfDescription, cfKeywords
cfService: cfID, cfURI, cfAcronym; multilingual: cfName, cfDescription, cfKeywords
Measuring Impact in CERIF (MICE) MICE, a JISC-funded project coordinated by Richard Gartner, King's College London, UK
CERIF Measurement & Indicator
cfMeasureIdentifier, cfCountInteger, cfCountIntegerChange, cfValueFloatingPoint, cfCountFloatingPointChange, cfValueJudgementalNumeric, cfValueJudgementalNumericChange, cfValueJudgementalText, cfValueJudgementalTextChange, cfURI
Is an Aggregation Entity
Measurement & Indicator (some examples)
Extract from the MICE List of Indicators (Indicator, Measurement):
– economic and commercial economic
– impact on business
  » improving performance of existing businesses: increased turnover, time savings, reduced costs
  » new products/processes: Creating numbers of new products/services commercialising success measures
Measurement & Indicator (some examples)
cfIndicator: cfIndicID=00123
cfMeasurement: cfMeasID=, cfValueFloat=X
cfOrganisation_Measurement: cfOrgUnitID=01234, cfMeasID=, cfClassID=turnover, cfClassSchemeID=ImpactOnBusiness, cfStartDate=, cfEndDate=
cfMeasurement: cfMeasID=, cfCount=Z
cfProduct_Measurement: cfResultProductID=, cfMeasID=, cfClassID=new-2010, cfClassSchemeID=ImpactOnBusiness, cfStartDate=, cfEndDate=
Measurement & Indicator (some examples)
cfIndicator: cfIndicID=00123
cfMeasurement: cfMeasID=, cfValueFloat=X
cfOrganisation_Measurement: cfOrgUnitID=01234, cfMeasID=012345, cfClassID=turnover, cfClassSchemeID=ImpactOnBusiness, cfStartDate=, cfEndDate=
cfMeasurement: cfMeasID=, cfValueFloat=Y
cfMeasurement_Measurement: cfMeasID1=, cfMeasID2=, cfClassID=increasedTurnover, cfClassSchemeID=ImpactOnBusiness, cfStartDate=, cfEndDate=
Derived value: X-Y
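The X-Y relation on this slide can be sketched in a few lines. This is a minimal illustration, assuming two turnover measurements joined by a measurement-to-measurement link classified as increasedTurnover; the class names, identifiers and figures are hypothetical and are not CERIF data.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    meas_id: str        # stands in for cfMeasID (hypothetical values below)
    value_float: float  # stands in for cfValueFloat


@dataclass
class MeasurementLink:
    meas_id1: str         # stands in for cfMeasID1 (the later value, X)
    meas_id2: str         # stands in for cfMeasID2 (the earlier value, Y)
    class_id: str         # role of the link, e.g. increasedTurnover
    class_scheme_id: str  # scheme the role belongs to


def linked_difference(link, measurements):
    """Return X - Y for the two measurements referenced by the link."""
    by_id = {m.meas_id: m for m in measurements}
    return by_id[link.meas_id1].value_float - by_id[link.meas_id2].value_float


# Made-up figures purely for illustration.
measurements = [Measurement("m-later", 1200.0), Measurement("m-earlier", 1000.0)]
link = MeasurementLink("m-later", "m-earlier", "increasedTurnover", "ImpactOnBusiness")
increase = linked_difference(link, measurements)
print(increase)
```

Keeping the two raw values as separate measurements and expressing "increased turnover" as a classified link between them means the derived figure can always be recomputed from recorded data.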
CERIF Federated Identifiers
ResultPublication
– DOI
– WoS Accession Number
Person
– Social Security Number
– Staff Id in HR system
– Author identifier: ORCID, ScopusID
Organisation
– VAT Identification Number
– Internal Code
CERIF – Generic Entity Structure Generic Identifier URI Attributes Multilingual Entities Relationships (Links)
Some CERIF Link Entities
Citation, CV, Prize, Qualification, ExpertiseAndSkills, Equipment, Facility, Funding, Service, ElectronicAddress, PostalAddress, Country, Currency, Language, Event, Metrics, Indicator, Measurement, Geographic Bounding Box
Some CERIF Link Entities
Example link roles: role=author; role=principal investigator; role=research assistant; role=deliverable; role=author's affiliation; role=coordinator
CERIF – Generic Link Entity Structure
Generic; Applied; Contextual Roles; Semantic Layer; Valid Time Range; Vocabulary
Binary, Time-based Links with Semantics
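A binary, time-based link with a role can be sketched as a small record plus a date filter. This is an illustration under assumptions: the field names, the scheme name "person-project-roles", the identifiers and the dates are all hypothetical; only the idea (two ends, a role from the semantic layer, a valid time range) comes from the slide.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Link:
    source_id: str        # first linked entity
    target_id: str        # second linked entity
    class_id: str         # role term taken from the semantic layer
    class_scheme_id: str  # vocabulary (scheme) the role term belongs to
    start_date: date      # start of the valid time range
    end_date: date        # end of the valid time range


def roles_on(links, source_id, target_id, day):
    """Roles that connect the two entities and are valid on the given day."""
    return sorted(
        link.class_id
        for link in links
        if link.source_id == source_id
        and link.target_id == target_id
        and link.start_date <= day <= link.end_date
    )


# Hypothetical data: the same person holds two roles in one project over time.
links = [
    Link("person-A", "project-P", "research assistant", "person-project-roles",
         date(2008, 1, 1), date(2009, 12, 31)),
    Link("person-A", "project-P", "principal investigator", "person-project-roles",
         date(2010, 1, 1), date(2012, 12, 31)),
]
roles_2011 = roles_on(links, "person-A", "project-P", date(2011, 6, 1))
print(roles_2011)
```

Because the role and the time range live on the link rather than on either entity, a change of role is recorded as a new link and the history stays queryable.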
CERIF Modularisation
Entities: OrganisationUnit, Project, Funding, ResultPublication
Semantic Layer:
SCHEMA 1: Role X, Role Y, Role Z
SCHEMA 2: Role A, Role C, Role B
SCHEMA 3: Role A, Role C, Role B
Result_Publication Instance Diagram (slide by Keith Jeffery)
Nodes: Person A, Publication X, OrgUnit O, OrgUnit M, OrgUnit N, Project P
Link roles: member, employee, part of, owns IPR, author, project leader
CERIF Example (Person)
CERIF Example (Project)
CERIF Semantic Layer
Captures any schema or structure:
– Flat Lists
– Thesauri
– Classification Systems (e.g. SKOS, ...)
– Taxonomies
– Ontologies
Open / extensible in all directions:
– New Schemas
– New Concepts / Terms
– New Relationships
Enables management of:
– Roles / Types
– Semantics
– Subject Headings
– Archiving (time component)
Allows simple mappings between schemes
CERIF Semantic Layer (Declared Semantics) Recursion is-a maps-to is-part-of Is-broader-term Scheme-Assignment Time-based
CERIF Semantic Layer (Declared Semantics)
CERIF / SKOS
Class / Concept
ClassScheme / ConceptScheme
class-class / broadMatch
class-class / broader
class-class / broaderTransitive
class-class / hasTopConcept
class-class / mappingRelation
generic (open set) / explicit (defined sets)
Joerg, B.; Jeffery, K.G.; Van Grootel, G. (2011): Towards a Sharable Research Vocabulary – A Model-driven Approach; MTSR 2011, Izmir, Turkey.
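The contrast between one generic class-class link and several explicit SKOS properties can be sketched as follows. This is an illustration only: the terms and the tuple layout are hypothetical, and the traversal is an ordinary graph walk that derives what SKOS would call broaderTransitive from links whose role is "broader".

```python
from collections import defaultdict

# One generic link shape (class, relation, class); the relation is itself
# just a term, so new relation kinds need no new structure.
# All terms below are hypothetical examples.
class_class = [
    ("journal article", "broader", "article"),
    ("article", "broader", "publication"),
    ("book", "broader", "publication"),
    ("article", "maps-to", "external:Article"),
]


def broader_transitive(term, links):
    """Collect every class reachable from term by following 'broader' links."""
    parents = defaultdict(set)
    for child, relation, parent in links:
        if relation == "broader":
            parents[child].add(parent)
    seen, stack = set(), [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


ancestors = broader_transitive("journal article", class_class)
print(sorted(ancestors))
```

The "maps-to" row is ignored by the walk, which shows how mapping links and hierarchy links can share one table and still be told apart by their role.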
CERIF Federated Identifiers
ResultPublication
– DOI
– WoS Accession Number
Person
– Social Security Number
– Staff Id in HR system
– Author identifier: ORCID, ScopusID
Organisation
– VAT Identification Number
– Internal Code
Classification
– External Code
CERIF Federated Identifiers
Records the “tag” by which an object is known elsewhere
For any Base, Result, Infrastructure, or 2nd Level entity
Connected to a Service representing the issuer of the identifier
– Usually an information system
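A federated identifier can be pictured as a record tying a local object to the tag an external system uses for it, together with the service that issued the tag. The sketch below is a minimal illustration; the field names, the service identifiers and all values are hypothetical placeholders, not the CERIF attribute names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FederatedIdentifier:
    entity_type: str        # kind of object, e.g. a person or a publication
    local_id: str           # identifier inside this system
    fed_id: str             # the tag by which the object is known elsewhere
    issuer_service_id: str  # service representing the issuer of the tag


def resolve(identifiers, issuer_service_id, fed_id):
    """Find the local object that a given issuer knows under fed_id."""
    for ident in identifiers:
        if ident.issuer_service_id == issuer_service_id and ident.fed_id == fed_id:
            return ident.entity_type, ident.local_id
    return None


# Placeholder values only; no real identifiers are used.
identifiers = [
    FederatedIdentifier("person", "pers-001", "author-tag-1", "service-author-registry"),
    FederatedIdentifier("person", "pers-001", "staff-4711", "service-hr-system"),
    FederatedIdentifier("publication", "publ-042", "doi-tag-1", "service-doi-registry"),
]
match = resolve(identifiers, "service-hr-system", "staff-4711")
print(match)
```

One object may carry several such records, one per issuing system, which is what lets the same person be matched across an HR system and an author registry.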
CERIF XML 1.5 Interchange Format
– For point-to-point interchange
– XML namespace
– XML Schema
– Based on the ER model
CERIF 1.5 XML Interchange Format (example project record; XML markup lost in extraction)
Project: internal-project-identifier; ACRO; The title of the project; The goals of the project
Project classification: infrastructure-project-uuid; -project-types-scheme-uuid
Project to organisation unit link: orgunit-1-identifier; coordinator-uuid; orgunit-project-roles-scheme-uuid; from-datetime; to-datetime
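The values on this slide can be arranged into an XML record along the following lines. This is a sketch under stated assumptions: the element names (cfProj, cfProjId, cfAcro, cfTitle, cfAbstr, cfProj_Class, cfProj_OrgUnit and so on) follow the CERIF naming style from memory and should be checked against the published CERIF 1.5 XML Schema; the namespace and any attributes are left out because the slide text does not preserve them.

```python
import xml.etree.ElementTree as ET

# Element names are assumptions in the CERIF naming style, not verified
# against the official schema. Namespace and attributes are omitted.
cerif = ET.Element("CERIF")
proj = ET.SubElement(cerif, "cfProj")
ET.SubElement(proj, "cfProjId").text = "internal-project-identifier"
ET.SubElement(proj, "cfAcro").text = "ACRO"
ET.SubElement(proj, "cfTitle").text = "The title of the project"
ET.SubElement(proj, "cfAbstr").text = "The goals of the project"

# Classification of the project (its type). The scheme value is truncated
# in the source slide and is reproduced as-is.
proj_class = ET.SubElement(proj, "cfProj_Class")
ET.SubElement(proj_class, "cfClassId").text = "infrastructure-project-uuid"
ET.SubElement(proj_class, "cfClassSchemeId").text = "-project-types-scheme-uuid"

# Time-based link to an organisation unit, with the role given as a class.
proj_org = ET.SubElement(proj, "cfProj_OrgUnit")
for tag, value in [
    ("cfOrgUnitId", "orgunit-1-identifier"),
    ("cfClassId", "coordinator-uuid"),
    ("cfClassSchemeId", "orgunit-project-roles-scheme-uuid"),
    ("cfStartDate", "from-datetime"),
    ("cfEndDate", "to-datetime"),
]:
    ET.SubElement(proj_org, tag).text = value

xml_text = ET.tostring(cerif, encoding="unicode")
print(xml_text)
```

The nesting mirrors the ER model: the base entity carries its own attributes, while classifications and links to other entities appear as embedded link records with a role and a time range.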
CERIF 1.5 Release
– CERIF Model Introduction and Specification: coming
– CERIF XML Data Exchange Format Specification: coming
– CERIF Formal Semantics (Vocabulary): ✓
– CERIF SQL Scripts: ✓
– CERIF XML Schemas: ✓
– CERIF XML Examples: ✓
– CERIF Semantics (Excel): ✓
What is a CRIS?
Current Research Information System = CRIS: an integrated approach towards managing research information
– Research Information: information about Researchers, Organisations (Research-performing, Funding), Funding Programmes, Calls, Projects, …
– Current: of current interest, not necessarily ongoing
– System: driven by Concepts, Model, Implementation (Information System)
CERIF
CRIS and Repositories at an institution (slide by Keith Jeffery)
CRIS: Research Context [projects, persons, organisational units, funding, products, patents, publications, facilities, equipment, events]
OA Repository (hypermedia): Documents
e-Research repository: Datasets and Software
Connections: OAI-PMH; various protocols; CERIF
End-User
Ongoing Activities towards CERIF
– Model Cleaning
– Cross-TG Activities: Linked Open Data TG, Institutional Repositories TG, Architectures TG, Indicators TG, Best Practice TG
– Cooperation with CASRAI, VIVO
Strategic Partnerships International Council for Science; Commission on Data Access European Association of Research Managers and Administrators All European Academies
Ongoing Activities: REF, HUNCRIS, SK CRIS
Members beyond Europe: Australia, Canada, China, Iran, Israel, Malaysia, Mexico, South Korea, U.S.
What makes CERIF shine
Right level of abstraction
Normalized model
– Record data only once
– Reference rather than copy
Versatile Semantic Layer
Time-based relationships
Clean design, regular structure
www.euroCRIS.org