International Congress of Entomology, Orlando

Presentation transcript:

International Congress of Entomology, Orlando
Terry Catapano
September 26, 2016
Notes: Add in Plazi and the idea of the treatment server

Extracting Linked Open Data from Taxonomic Publications
Terry Catapano
Plazi, New York (http://plazi.org)

Plazi
500,000,000+ printed pages
5,000 journals with taxonomic content
1,900,000 species described
20,000,000+ species treatments
BUT: The facts are hidden
  Incomplete digitization
  Publications are not semantically enhanced
  Data are not linked
  Most data are not open
Plazi solution: Linking through taxonomic treatments

Taxonomic Treatment
Formica obsoleta Linnaeus, 1758: 580
name, description, distribution
Treatment: a well-defined part of an article that defines the particular usage of a scientific name by an authority at a given time (one or more pages in a publication).
Linnaeus has to be credited for both the Latin binomen and the treatment.
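
The citation on this slide, "Formica obsoleta Linnaeus, 1758: 580", carries the pieces the definition names: a scientific name, an authority, a time, and a page. A minimal sketch of splitting such a string into those parts follows. The `TreatmentCitation` class and the regular expression are hypothetical illustrations that assume a simple "Genus species Authority, year: page" layout; they are not Plazi's parser and will not cover real-world citation variants.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TreatmentCitation:
    """Hypothetical container for the parts of a simple treatment citation."""
    genus: str
    species: str
    authority: str
    year: int
    page: str


# Assumed layout: "Genus species Authority, YYYY: page"
_CITATION = re.compile(
    r"^(?P<genus>[A-Z][a-z]+)\s+(?P<species>[a-z]+)\s+"
    r"(?P<authority>.+?),\s*(?P<year>\d{4}):\s*(?P<page>\S+)$"
)


def parse_treatment_citation(text: str) -> TreatmentCitation:
    """Split a citation string into name, authority, year, and page."""
    match = _CITATION.match(text.strip())
    if match is None:
        raise ValueError(f"not a simple binomial citation: {text!r}")
    return TreatmentCitation(
        genus=match["genus"],
        species=match["species"],
        authority=match["authority"],
        year=int(match["year"]),
        page=match["page"],
    )


citation = parse_treatment_citation("Formica obsoleta Linnaeus, 1758: 580")
print(citation)
```

Keeping authority, year, and page alongside the binomen reflects the point of the slide: the same name used by a different authority or at a different time is a different treatment.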

Plazi TreatmentBank: Million Treatment Goal
For 1M treatments, at minimum:
Identify with HTTP URIs of the form http://treatment.plazi.org/id/[UUID]
Metadata
  Taxon Concept
  Publication Information
Representations
  HTML
  RDF
  [XML]
Exports
  Biodiversity Literature Repository
  EOL
  GBIF
  WikiData
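
The identifier scheme above can be sketched as a pair of helpers that build and take apart a treatment URI. The base URI comes from the slide; the function names are hypothetical, and the slide does not say how the UUID is written out (hyphenated or not, upper or lower case), so this sketch simply uses Python's default hyphenated lowercase form as an assumption.

```python
import uuid
from typing import Optional

# Base taken from the slide: http://treatment.plazi.org/id/[UUID]
TREATMENT_BASE = "http://treatment.plazi.org/id/"


def mint_treatment_uri(identifier: Optional[uuid.UUID] = None) -> str:
    """Build a treatment URI; a random UUID is generated if none is given.

    Assumption: the UUID is rendered in Python's default textual form.
    """
    if identifier is None:
        identifier = uuid.uuid4()
    return TREATMENT_BASE + str(identifier)


def parse_treatment_uri(uri: str) -> uuid.UUID:
    """Recover the UUID from a treatment URI, rejecting other URIs."""
    if not uri.startswith(TREATMENT_BASE):
        raise ValueError(f"not a treatment URI: {uri!r}")
    # uuid.UUID accepts both hyphenated and plain 32-digit hex strings.
    return uuid.UUID(uri[len(TREATMENT_BASE):])


example = uuid.UUID("12345678-1234-5678-1234-567812345678")
uri = mint_treatment_uri(example)
print(uri)
print(parse_treatment_uri(uri) == example)
```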

Plazi: Output
Publications (year of publishing): 17,799 (2016: 3,500)
Taxonomic treatments: 165,692 (50,000)
Observation records: 53,099
Observation records geo-referenced: 23,068
Taxonomic names: 152,641 (45,000)
Bibliographic references: > 800,000
RDF triples: > 100 million (1M)
Scientific illustrations: > 140,000 (being uploaded to BLR/Zenodo)

TreatmentBank

Plazi: TreatmentBank: Treatment stubs: adding content

Sources
Legacy publications
  Print → Digitization → Text Capture → XML
  Digitized → Text Capture → XML
  Born Digital → Text Extraction → XML
Prospective publishing
  Pensoft Journals: TaxPub XML → RDF

Plazi conversion workflow (TreatmentBank)
find → scan → text extraction → markup → store

Daily Automated Processing of New Taxa
Notes: I am afraid I am going to lose some of you at some point, but I will try to bring you all back together at the end of the talk.

TreatmentBank: HTML Representation

Treatment Text: XML Representation

XML Representation: Text Markup and Enhancement
Treatments
Treatment Sections
Features of interest
  Taxon Names
  Treatment Citations
  Material Citations (e.g., specimens)
  Bibliographic References (w/ citation)
  Figures (w/ citation)
  Tables (w/ citation)
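
To make the markup levels above concrete, here is a minimal sketch that reads a small marked-up treatment and lists its sections, taxon names, and material citations. The element and attribute names (`treatment`, `section`, `taxonName`, `materialCitation`, `type`) and the sample content are hypothetical stand-ins chosen for illustration; they are not claimed to match Plazi's actual XML schema or TaxPub.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the structure mirrors the slide's list,
# not any real schema.
SAMPLE = """
<treatment>
  <section type="nomenclature">
    <taxonName rank="species">Formica obsoleta</taxonName> Linnaeus, 1758: 580
  </section>
  <section type="materials_examined">
    <materialCitation>placeholder specimen record</materialCitation>
  </section>
</treatment>
"""


def summarize_treatment(xml_text: str) -> dict:
    """Collect section types, taxon names, and a material-citation count."""
    root = ET.fromstring(xml_text.strip())
    return {
        "sections": [s.get("type") for s in root.iter("section")],
        "taxon_names": [n.text for n in root.iter("taxonName")],
        "material_citations": len(list(root.iter("materialCitation"))),
    }


print(summarize_treatment(SAMPLE))
```

Once names and citations are tagged inline like this, they can be pulled out of the text without re-parsing the prose, which is what makes the later RDF export possible.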

Nomenclature Section and Taxon Name

Treatment Citation

Material Citation

TreatmentBank: online editing: material citation

Semantic XML Publishing: TaxPub

Treatment Data: RDF Representation

Treatment Data
Published in → Publication
Defines → Taxon Concept [1 and only 1]
Cites → Treatments/Taxon Concepts
Cites → Material (Specimens)
hasInformation → Information Item
  [Text] Content
  Data
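
The relationships on this slide can be sketched as subject-predicate-object triples, with a check for the one stated constraint: a treatment defines exactly one taxon concept. The predicate names and the `ex:` identifiers below are hypothetical labels for illustration, not terms from the Treatment Ontology.

```python
from typing import Iterable, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical predicate labels mirroring the slide's relationships.
PUBLISHED_IN = "publishedIn"
DEFINES = "definesTaxonConcept"
CITES = "cites"
HAS_INFORMATION = "hasInformation"


def defines_exactly_one(triples: Iterable[Triple], treatment: str) -> bool:
    """Check the '1 and only 1' rule: one defined taxon concept per treatment."""
    concepts = {
        t.obj for t in triples
        if t.subject == treatment and t.predicate == DEFINES
    }
    return len(concepts) == 1


graph = [
    Triple("ex:treatment1", PUBLISHED_IN, "ex:publication1"),
    Triple("ex:treatment1", DEFINES, "ex:taxonConcept1"),
    Triple("ex:treatment1", CITES, "ex:treatment0"),
    Triple("ex:treatment1", CITES, "ex:specimen1"),
    Triple("ex:treatment1", HAS_INFORMATION, "ex:description1"),
]

print(defines_exactly_one(graph, "ex:treatment1"))
```

A treatment may cite any number of other treatments and specimens, so only the "defines" relationship is checked for cardinality.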

Treatment Data: Vocabularies, Ontologies, and Identifiers
Treatment Ontology: https://github.com/plazi/treatmentontologies
OBKMS Ontology (Viktor Senderov/Pensoft)
Plazi TreatmentBank HTTP URIs
Publication:
  Dublin Core, SPAR (FABIO, PRO, FRBR)
  DOI (CrossRef, Zenodo/DataCite)
  ORCID, ISSN
Taxon Concept:
  DarwinCore, DarwinCoreSW
  ZooBank HTTP URIs
  ORCID, ResearcherID, Collection Codes, Repository IDs
Citations: CiTO
Information Item: SPM (+ EOL SPM extensions)
Data: Trait Ontologies; SDD, etc…
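
Mixing several vocabularies usually means writing terms as short prefixed names and expanding them to full IRIs. A minimal sketch of that expansion follows for four of the vocabularies named on the slide. The prefix labels are conventional choices, the namespace IRIs are the commonly published ones and should be checked against each vocabulary's own documentation, and the Treatment Ontology is left out because the slide gives only its repository URL, not its namespace.

```python
# Commonly used namespace IRIs (an assumption to verify against each
# vocabulary's documentation before relying on them).
PREFIXES = {
    "dcterms": "http://purl.org/dc/terms/",
    "cito": "http://purl.org/spar/cito/",
    "fabio": "http://purl.org/spar/fabio/",
    "dwc": "http://rs.tdwg.org/dwc/terms/",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'cito:cites' to a full IRI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in PREFIXES:
        raise KeyError(f"unknown prefix in {curie!r}")
    return PREFIXES[prefix] + local


for term in ("dcterms:title", "cito:cites", "fabio:JournalArticle",
             "dwc:scientificName"):
    print(term, "->", expand(term))
```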

Treatment Data: Taxon Concept

Treatment Data: Publication Information

Biodiversity Literature Repository: DOIs for Legacy Literature
Access, archive, DOI

Treatment Data: Treatment Citations

TreatmentBank: Taxonomic data: linking treatments

Treatment Data: Material Citations

Treatment Data: Specimen Data Analysis and Outputs

Treatment Data: Other Information Content

Thank you!
Terry Catapano
catapano@plazi.org