Implementation of TaxPub, a JATS extension for domain-specific markup in taxonomy: the experience of a biodiversity publisher Lyubomir Penev, Terry Catapano,

Presentation transcript:

Implementation of TaxPub, a JATS extension for domain-specific markup in taxonomy: the experience of a biodiversity publisher
Lyubomir Penev, Terry Catapano, Donat Agosti, Teodor Georgiev, Guido Sautter, Pavel Stoev
JATS-Con, Oct 2012
Plazi

This presentation will focus on:
- Implementation of TaxPub, an extension to the general NLM JATS DTD for taxonomy publishing
- Semantic tagging of and enhancements to published texts
- Dissemination of published information to aggregators
- Current and future development of TaxPub

Quick facts about Plazi
- Plazi was founded in 2008 as a Swiss-based NGO with members in Switzerland, Germany, the US and Iran
- Plazi is a research-based think tank whose mission is to promote open access to scientific content
- Plazi has four pillars: legal advice, technical solutions (e.g., TaxPub), maintenance of a treatment repository, and advocacy
- Plazi GmbH was founded in 2012 as a service SME, owned by Plazi, to provide document conversion services and consultation
- Funding comes from public donors (e.g., the EU) and private donors
- Clients are global

Context
- Conservation: global biodiversity crisis. Species loss is increasing, but there are no tools to measure and document it
- Science: ca. 1.8M species described, ca. 8M expected
- Scientific publications: ca. 17,000 species described per annum; ca. 100,000 redescriptions per annum -> rich content, highly fragmented across more than 2,500 journals and books -> difficult access
- Solution: Open Access and semantically enhanced publications allow immediate registration of new taxa and dissemination of content -> TaxPub JATS/DTD

This presentation will focus on:
- Implementation of TaxPub, an extension to the general NLM JATS DTD for taxonomy publishing
- Semantic tagging of and enhancements to published texts
- Dissemination of published information to aggregators
- Current and future development of TaxPub

TaxPub
- Lightweight extension of the Blue DTD
- Described at JATS-Con 2010: “TaxPub: An Extension of the NLM/NCBI Journal Publishing DTD for Taxonomic Descriptions”
- Treatments (i.e., species descriptions)
- Domain-specific content:
  - taxonomic names
  - references to specimens
  - descriptions of morphological features
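As a rough illustration of what treatment markup makes possible, the following Python sketch pulls taxon names out of a TaxPub-like fragment. The element names (`tp:taxon-treatment`, `tp:nomenclature`, `tp:taxon-name`, `tp:taxon-name-part`, `tp:taxon-status`), the `taxon-name-part-type` attribute and the namespace URI are assumptions and should be checked against the actual TaxPub extension files. The fragment itself is invented for the example and is far simpler than a real article.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the TaxPub extension (verify against the schema).
TP_NS = "http://www.plazi.org/taxpub"

# Hypothetical, simplified fragment; real articles carry far more markup.
SAMPLE = f"""
<sec xmlns:tp="{TP_NS}">
  <tp:taxon-treatment>
    <tp:nomenclature>
      <tp:taxon-name>
        <tp:taxon-name-part taxon-name-part-type="genus">Platyscelio</tp:taxon-name-part>
        <tp:taxon-name-part taxon-name-part-type="species">mzantsi</tp:taxon-name-part>
      </tp:taxon-name>
      <tp:taxon-status>sp. n.</tp:taxon-status>
    </tp:nomenclature>
  </tp:taxon-treatment>
</sec>
"""


def treatment_names(xml_text):
    """Return one {name-part-type: text} dict per treatment in xml_text."""
    root = ET.fromstring(xml_text)
    ns = {"tp": TP_NS}
    names = []
    for treatment in root.iterfind(".//tp:taxon-treatment", ns):
        parts = treatment.iterfind(
            "tp:nomenclature/tp:taxon-name/tp:taxon-name-part", ns
        )
        names.append(
            {p.get("taxon-name-part-type"): (p.text or "").strip() for p in parts}
        )
    return names


print(treatment_names(SAMPLE))
# [{'genus': 'Platyscelio', 'species': 'mzantsi'}]
```

Because the name parts are tagged individually, an aggregator can index genus and species separately without any text mining.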

Platyscelio mzantsi urn:lsid:zoobank.org:act:D084EF F-916F- 2C8CDE23E29B urn:lsid:biosci.ohio-state.edu:osuc_concepts: Taekul & Johnson sp. n....
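The identifiers on this slide are Life Science Identifiers (LSIDs), which follow the general pattern `urn:lsid:<authority>:<namespace>:<object>` with an optional trailing revision. A minimal sketch of how a consumer might split and sanity-check such an identifier is given below. The class and function names are hypothetical, and the all-zero object identifier is a placeholder, not a real ZooBank record.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text):
    """Split an LSID into its parts, rejecting malformed or truncated input."""
    parts = text.strip().split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if not all(parts[2:5]):
        # Authority, namespace and object identifier must all be non-empty.
        raise ValueError(f"incomplete LSID: {text!r}")
    return Lsid(*parts[2:])


# Placeholder object identifier, used for illustration only.
example = parse_lsid("urn:lsid:zoobank.org:act:00000000-0000-0000-0000-000000000000")
print(example.authority, example.namespace)
```

An identifier whose object part is missing, such as one that stops after the namespace, raises `ValueError` instead of being passed on to a registry.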

Holotype worker. King Saud Museum of Arthropods (KSMA), College of Food and Agriculture Sciences, King Saud University, Riyadh, Kingdom of Saudi Arabia. SAUDI ARABIA, Al Bahah province, Amadan forest, Al Mandaq governorate, 20°12'N, 41°13'E, 1881 m.a.s.l. 19.V.2010 (M. R. Sharaf & A. S. Aldawood Leg.);
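One reason to mark up material citations like the one above is that the coordinates can then be converted into machine-readable values. The sketch below assumes coordinates written as degrees and minutes with a hemisphere letter, as on this slide, and maps them to the Darwin Core terms `decimalLatitude` and `decimalLongitude`. The function names and the regular expression are illustrative, not part of any Pensoft or Plazi tool.

```python
import re

# Degrees, minutes and hemisphere, e.g. 20°12'N (seconds are not handled).
_COORD = re.compile(r"(\d{1,3})°\s*(\d{1,2})'\s*([NSEW])")

CITATION = (
    "SAUDI ARABIA, Al Bahah province, Amadan forest, Al Mandaq governorate, "
    "20°12'N, 41°13'E, 1881 m.a.s.l. 19.V.2010"
)


def to_decimal(token):
    """Convert one degrees-and-minutes token to signed decimal degrees."""
    match = _COORD.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"unrecognised coordinate: {token!r}")
    degrees, minutes, hemisphere = int(match[1]), int(match[2]), match[3]
    value = degrees + minutes / 60
    # Southern and western hemispheres are negative by convention.
    return -value if hemisphere in "SW" else value


def citation_coordinates(citation):
    """Return Darwin Core style coordinate terms found in a citation string."""
    terms = {}
    for match in _COORD.finditer(citation):
        key = "decimalLatitude" if match[3] in "NS" else "decimalLongitude"
        terms[key] = round(to_decimal(match[0]), 4)
    return terms


print(citation_coordinates(CITATION))
# {'decimalLatitude': 20.2, 'decimalLongitude': 41.2167}
```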

TaxPub: Recent and Future Developments
- Largely stable
- Greenfication
- Interest from journals:
  - European Journal of Taxonomy
  - Zootaxa (via EOL)
- Markup of morphological descriptions

Spreading shrub; stems erect, greenish Leaves deciduous early in summer (particularly when infected with Diseasomyces), oblong, apex obtuse, glabrous or weakly hirsute; stipules sharply pointed, 3,2mm wide, color black or darkish brown, extremely rarely yellow, often shallowly joined around the node; spines stout.

TaxPub: Challenges
- Maintenance
  - SourceForge
  - Volunteer effort, little time, no funding…
  - Supported by Plazi
- Documentation
  - Comments with ad hoc markup in extension files
  - Converted to HTML by NCBI tool
  - Maintained at Species-ID wiki

Quick facts about Pensoft & ZooKeys
- Pensoft founded in 1992: more than 700 books published; two offices, in Sofia and Moscow; 16 employees
- ZooKeys launched in July 2008 as the first mandatory Open Access journal in taxonomy; 205 issues, 20,000 pages IN FOUR YEARS
- All new taxa registered in ZooBank and supplied to EOL, Plazi and the Species-ID wiki
- CrossRef member; covered by ISI and Scopus; indexed in Zoological Record, DOAJ, CABI Abstracts, Google Scholar; archived in PubMedCentral and CLOCKSS
- Pensoft Journal System: XML-based online editorial system; publishing services offered to society and institutional journals

ZooKeys growth

The XML landscape for legacy and prospective taxonomic literature
[Diagram labels: PROSPECTIVE PUBLISHING | HISTORICAL LITERATURE; automated submission, peer-review; Pensoft mark-up tool; TaxPub XML schema; Plazi's GoldenGATE editor; TaxonX, taXMLit schemas; marked-up publications; unified marked-up final output (taxon treatments, keys, images, localities); PDF, HTML and XML archiving; wiki (Species-ID, Wikispecies, Wikipedia); indexing (IPNI, ZooBank, MycoBank, GNA); aggregators (EOL, GBIF); content management systems & repositories (e.g., EOL, GBIF, Scratchpads); electronic archives, data centers; END USERS]

Four stages of the XML-based editorial workflow
- SUBMISSION: XML-tagged or non-tagged manuscripts?
- PEER-REVIEW/EDITORIAL PROCESS: the technical challenges of XML mark-up
- PUBLICATION: different publishing formats, and to whom are they addressed?
- DISSEMINATION: how to provide maximum distribution of published information

But why mark up? Is it really needed? Who will be using it?
Descriptions, images, occurrences, nomenclature, literature
Plazi

What does XML give readers that the usual PDF does not?

Semantic enhancements to published texts

Archiving in PubMedCentral

Automated export of species descriptions to the Encyclopedia of Life (EOL) via XML mark-up

Automated harvesting and deposition of taxon treatments in Plazi

Export of content to the Wiki environment

Species descriptions on Wikispecies and Wikimedia Commons

The Future of TaxPub and its implementations
- More Semantic Web enhancements!
- Pensoft Writing Tool (PWT): a collaborative article writing platform
- Community-based and open peer-review process
- Biodiversity Data Journal will publish any kind of “small data”: checklists, nomenclatural acts, taxon treatments

The collaborative article authoring tool

Why is the Biodiversity Data Journal needed?

Publishing and sharing of primary data
[Figure labels: Primary data; RE-USE of CONTENT. Drawings: Slavena Peneva]

Biodiversity Data Journal
- All data matters: NO lower or upper limit on manuscript size!
- ALL within a single online collaborative platform, including the writing of the manuscript!
- Collaborative article authoring tool
- Community peer review with “open” and “public” options, on top of conventional peer review
- Online editorial process and version control
- Standard-compliant (Darwin Core, Dublin Core, NLM JATS, etc.)
- Pre-defined article templates compliant with the biological Codes

Life cycle of data published in the BDJ
[Diagram labels: BIODIVERSITY MANUSCRIPT (occurrence data, genome data, image galleries, morphometric data, environmental data, phylogenetic data, any other data); XML MARK UP; structured text (data!); ARTICLES; occurrence data, taxon names, taxon treatments, bibliographies; Plazi, BHL, Wiki, COL]

The lessons learned
The main difficulties are caused by:
- The specificity of the domain (e.g., taxon names, synonyms, instability of nomenclature, lack of a global LSID infrastructure, etc.)
- Mark-up of occurrence data (certainly a great challenge)
- Cost efficiency of the mark-up process
- Sociological barriers: the majority of authors are not willing to change their writing habits; most are still not aware of the tremendous advantages of Web 2.0 technologies
- Most small taxonomy publishers (and some bigger ones) have no experience in XML-based editorial workflows, or they simply cannot afford them

“Semi-automatically generated semantic, enhanced e-publications are the only way to describe the missing 10 M species, and to deal with an increasing flood of data.” Donat Agosti
It is not easy, but it is exciting, and it is possible only through Open Access!