Annotating All Knowledge


1 Annotating All Knowledge
Overview of the Berlin Community Meeting, October 25, 2017

2 Berlin Community Meeting
Hosted by the CoKo Foundation and Hypothesis. Planned together with Europe PMC, ContentMine, PaperHive, and Pundit. Hackathon sponsored by Europe PMC.

3 Berlin Community Meeting
Diversity Reuse

4 Our Vision Within three years, most scholarly works -- books, articles and other digital media, new and old -- will come with the capability for readers to create, share, and discover annotations from colleagues, authors, friends and experts around the globe. This technology will be open source, federated, and based on open standards. Just like the Web.

5 Our Plan Partner with the world’s greatest platforms, publishers and libraries to design, deploy, and then curate an open annotation layer atop all digital books, articles and other online resources.

6 Annotating All Knowledge
hypothes.is/annotating-all-knowledge Over 70 of the leading academic publishers, platforms and libraries are bringing web annotation to the world’s scholarship over the next several years.

7 Open annotation and its benefits
Annotation allows precision in citation and critique. Web annotation technology is open source and based on an emerging web standard (w3.org/annotation). It needs no special hardware, just a browser and an Internet connection. Group modes enable both private and public collaboration. It supports a variety of formats, such as HTML, PDF, EPUB, and images, and is a clear candidate for inclusion as an altmetric.
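To make the standard mentioned above concrete, a minimal annotation in the W3C Web Annotation data model can be sketched as JSON (built here in Python). The target URL and quoted text are illustrative placeholders, not real resources.

```python
import json

# A minimal annotation following the W3C Web Annotation data model
# (w3.org/annotation). The target URL and quote are placeholders.
annotation = {
    "@context": "http://www.w3.org/ns/anno.jsonld",
    "type": "Annotation",
    "body": {
        "type": "TextualBody",
        "value": "This claim needs a citation.",
        "format": "text/plain",
    },
    "target": {
        "source": "https://example.org/article/123",  # placeholder document
        "selector": {
            # Anchors the annotation to an exact quoted passage,
            # which is what gives annotation its precision in citation.
            "type": "TextQuoteSelector",
            "exact": "annotation allows precision in citation",
        },
    },
}

print(json.dumps(annotation, indent=2))
```

Because the model is a plain JSON-LD document, any client that speaks the standard can render or exchange it, which is what makes the layer federated rather than tied to one vendor.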

8 The ask Would you agree to the following?
We share the vision for how this can benefit scholarship. We will explore how to incorporate web annotations into our [platform, publication, workflow, community]. We will collaborate openly with others in doing so. You can mention our participation to others.

9 FAIR Annotations Maryann Martone

10 The FAIR Guiding Principles for scientific data management and stewardship
High-level principles to make data: Findable, Accessible, Interoperable, Re-usable. Mark D. Wilkinson et al., "The FAIR Guiding Principles for scientific data management and stewardship", Scientific Data 3 (2016). DOI: 10.1038/sdata.2016.18

11 FAIR and annotations? Annotations are data and should be FAIR
Annotations make data FAIR by adding searchable metadata and links

12 Findable F1. (meta)data are assigned a globally unique and persistent identifier F2. data are described with rich metadata F3. metadata clearly and explicitly include the identifier of the data it describes F4. (meta)data are registered or indexed in a searchable resource

13 Accessible A1. (meta)data are retrievable by their identifier using a standardized communications protocol A1.1 the protocol is open, free, and universally implementable A1.2 the protocol allows for an authentication and authorization procedure, where necessary A2. metadata are accessible, even when the data are no longer available

14 Interoperable I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation. I2. (meta)data use vocabularies that follow FAIR principles I3. (meta)data include qualified references to other (meta)data

15 Re-usable R1. meta(data) are richly described with a plurality of accurate and relevant attributes R1.1. (meta)data are released with a clear and accessible data usage license R1.2. (meta)data are associated with detailed provenance R1.3. (meta)data meet domain-relevant community standards
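As a sketch of what the checklist above could mean for a single annotation record, the following maps each FAIR requirement to a metadata field. All identifiers, field names, and values are illustrative placeholders, not a prescribed schema.

```python
# Hypothetical FAIR annotation record; every value is a placeholder.
fair_annotation = {
    # F1/F4: globally unique, persistent, indexable identifier
    "id": "https://doi.org/10.xxxx/annotation-example",
    # F2/F3: rich metadata that explicitly names the data it describes
    "target": "https://doi.org/10.xxxx/dataset-example",
    "created": "2017-10-25T10:00:00Z",
    "creator": "https://orcid.org/0000-0000-0000-0000",  # placeholder ORCID
    # I1/I2: a formal, shared vocabulary (here, the Web Annotation context)
    "vocabulary": "http://www.w3.org/ns/anno.jsonld",
    # R1.1: clear, accessible usage license
    "license": "https://creativecommons.org/publicdomain/zero/1.0/",
    # R1.2: detailed provenance of how the annotation was made
    "provenance": {"generator": "hypothes.is", "method": "manual"},
}

# A1: the identifier is retrievable over a standard protocol (HTTPS)
print(fair_annotation["id"])
```

Checking a record against fields like these is one way to operationalize the claim that annotations are themselves data and should be FAIR.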

16 Kick off Meeting: Force2016 in Portland
70+ attendees came together to: Explore technical opportunities and challenges Explore publishers' opportunities and challenges Converge on a definition of interoperability Determine use cases that are in scope Identify next steps

17 Putting annotation FAIRness into practice
Francesca Di Donato Maryann introduced us to the FAIR principles, focusing on how they apply to annotation. There are many issues to be addressed, and I will start with a couple of examples based on my specific experience with Arts and Humanities scholars. You can consider these examples as use cases. I work very often with teams of scholars. They create digital libraries where they publish critical editions. Their typical workflow is transcribing and encoding texts into TEI-XML and using Pundit to annotate them semantically. There are two recurrent elements I want to stress here. First (not specifically related to annotation): they do not use the same workflow for their own work. They compose their articles mainly as Word files and export them to PDF for the publisher version, taking no care at all about standards for data FAIRness and so on: they write in Word, convert to PDF, and stop. The new technologies have not radically changed the way they work, and in particular they do not think about the next generations of scholars. Second: very often, they ask us to write Pundit-made annotations back into the XML source document. So the document is still seen as the "final product"; they feel their work is better credited this way. There are many implications in both behaviours, but what I'd like to stress here is that data FAIRness does not concern only, or even mainly, technology: specific actions need to be taken to change the way science is performed.

18 Main challenges With this objective, the GO FAIR initiative was launched as a follow-up to the European Open Science Cloud (EOSC).

19 Main challenges “The majority of the challenges to reach a functional European Open Science Cloud are social rather than technical” *Realising the European Open Science Cloud. First report and recommendations of the Commission High Level Expert Group on the European Open Science Cloud

20 GO FAIR GO FAIR is a bottom-up initiative to start working in a trusted environment where partners can deposit, find, access, exchange and reuse each other's data, workflows and other research objects. In practice, the GO FAIR implementation approach is based on three interactive processes/pillars:

21 GO-CHANGE GO-TRAIN GO-BUILD
The GO FAIR initiative has 3 main processes: GO-CHANGE (culture change: Open Science promotion, reward systems), GO-TRAIN (education/training: MOOCs, SPOCs, wizards, certification), and GO-BUILD (technical implementation: FAIR data and services, technical infrastructure). The first pillar, GO-CHANGE, calls for a cultural change in which open science and the principles of data findability, accessibility, interoperability and reusability become the common way of conducting science. The aim of the second pillar, GO-TRAIN, is to have core certified data experts, and in each Member State at least one certified institute per discipline to support the implementation of data stewardship. The last pillar, GO-BUILD, addresses the need for interoperable and federated data infrastructures and the harmonization of standards, protocols and services, enabling all researchers to deposit, access and analyse scientific data across disciplines.

22 GO FAIR Implementation Network on annotation
Possible actions: communication, advocacy, training, building. Hypothes.is, the Coalition and Pundit are supporting the creation of an implementation network on annotation. In practice, this means adopting the GO FAIR approach to establish three intertwining processes: Change, Train and Build. Some actions could include the following activities: implementing the AAKC website as a connecting point for federated events and initiatives; specific local but streamed AAKC initiatives aiming at (1) changing research practices, with calls for ideas and brainstorming events on annotation and (labelled) citations, annotation and peer reviewing (the Hypothes.is experiment), and annotation "status" in specific use cases, plus video lectures; (2) training, through webinars and MOOCs (such as the OpenAIRE webinars); and (3) perhaps an annual world event with local FAIR-annotation initiatives (as in the case of Open Access Week).

23 Annotation for Literature Data Integration
Force 11 Berlin, October 24th 2017

24 Curation makes data FAIR
Deposition
Tools to assist reuse
Collaborative enterprise, community standards
Citation, reproducibility
Analyse, curate and integrate
Classification
Share with other data providers

25 Data resources at EMBL-EBI
Literature & ontologies: Experimental Factor Ontology, Gene Ontology, BioStudies, Europe PMC
Chemical biology: ChEBI, ChEMBL, SureChEMBL
Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank
Gene, protein & metabolite expression: Expression Atlas, MetaboLights, PRIDE, RNAcentral
Protein sequences, families & motifs: InterPro, Pfam, UniProt
Genes, genomes & variation: Ensembl, Ensembl Genomes, GWAS Catalog, Metagenomics portal
Systems: BioModels, BioSamples, Enzyme Portal, IntAct, Reactome
Molecular archives: European Nucleotide Archive, European Variation Archive, European Genome-phenome Archive, ArrayExpress

26 Europe PMC as a Platform
Text and data mining community.
Providing deep links to data

27 SciLite Annotations in Europe PMC

28 Linking annotations in SciLite
Link-out Sentence level Link-back

29 2. Sharing automated annotations
[Diagram: annotations shared between Platform A and Platform B, with professional community curation by curators; processes of mapping, triage, feedback and crosslinking; infrastructural components: content, engineering, metadata]

30 Annotations API
API users can retrieve annotations by: annotation type (gene, disease, organism); PMCID/PMID; provider; specific entity, e.g. "heart attack" or "human"; license (CC0, CC-BY, CC-BY-NC, plus the "OA subset"). Will be used in the F11 Hackathon.
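A query against the Europe PMC Annotations API along the lines described above could be built like this. The endpoint and parameter names follow the public API, but should be verified against the current Europe PMC documentation, and the PMCID is just an example.

```python
from urllib.parse import urlencode

# Europe PMC Annotations API endpoint for fetching annotations per article;
# verify against current documentation before relying on it.
BASE = "https://www.ebi.ac.uk/europepmc/annotations_api/annotationsByArticleIds"

def annotations_url(pmcid, annotation_type, provider=None, fmt="JSON"):
    """Build a request URL retrieving annotations of one type for one article."""
    params = {
        "articleIds": f"PMC:{pmcid}",   # PMCID/PMID of the article
        "type": annotation_type,        # e.g. Genes, Diseases, Organisms
        "format": fmt,                  # JSON or XML response
    }
    if provider:
        params["provider"] = provider   # restrict to one annotation provider
    return f"{BASE}?{urlencode(params)}"

url = annotations_url("3905736", "Diseases", provider="Europe PMC")
print(url)
```

Fetching that URL with any HTTP client would return the sentence-level annotations, including the link-out/link-back anchors that SciLite renders.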

31 Future Annotations Possibilities
Feedback on false negatives Layering human annotations Public Groups such as curators (UX needed!) Sharing annotations across different platforms Co-development/co-outreach opportunities welcome!

32 Europe PMC is supported by:

33 Hi! Jennifer Lin, PhD Director of Product Management jlin@crossref.org
orcid.org/

34 FAIR is fair. Apply to annotations?
Discussion is critical for validation and reproducibility of results. Enable tracking of the evolution of scholarly claims through the lineage of expert discussion. Make the full review history of published results transparent. (Provide credit to contributors.)

35 Annotations as publication metadata

36 Publishers: how to deposit
Directly into article’s metadata as standard part of content registration process: As part of references AND/OR As part of relations assertions (structured metadata)

37 But what if publisher is not aware?
Event Data collects activities surrounding publications with DOIs (nearly 100 million publications). Annotations are important events! The Crossref Event Data event stream contains Hypothesis annotations right now. Interested in integrating more data sources.
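A lookup of Hypothesis annotation events for a given DOI via Crossref Event Data could be sketched as follows. The endpoint and parameters follow the public Event Data API, but verify them against current Crossref documentation; the DOI and contact address are placeholders.

```python
from urllib.parse import urlencode

# Crossref Event Data query API; check current Crossref documentation
# for the up-to-date endpoint and parameter names.
BASE = "https://api.eventdata.crossref.org/v1/events"

def hypothesis_events_url(doi, mailto="you@example.org"):
    """Build a query for Hypothesis annotation events about one publication."""
    params = {
        "source": "hypothesis",               # only Hypothesis annotation events
        "obj-id": f"https://doi.org/{doi}",   # the annotated publication
        "mailto": mailto,                     # contact address for polite use
    }
    return f"{BASE}?{urlencode(params)}"

url = hypothesis_events_url("10.1000/example")  # placeholder DOI
print(url)
```

This is how a platform can discover annotation activity on its content even when the annotating publisher or tool never notified it directly.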

38 Crossref & DataCite APIs
[Diagram: Event Data and the Crossref & DataCite APIs]

39 Making annotations fully FAIR
Annotations are important scholarly contributions and can be considered a form of peer review. They are already being registered by some publishers, but support is insufficient. A NEW content type is available, dedicated to reviews (including pre-/post-publication annotations). Register annotations (assign a DOI) as content in Crossref. Through Event Data, track the online activity of annotations as autonomous/independent scholarly objects.

40 Berlin Community Meeting
Annotations as part of workflow Collaboration between Open Source Tool Providers Business models and pathway to sustainability

41 Any questions? FAIR Panel: 10/26 at 10:30 in Galerie
Curathon: 12:00-1:30, Room 5. For more information on the Annotating All Knowledge Coalition:

