Published by Kristian Stewart. Modified over 9 years ago.
1
Implementation of TaxPub, a JATS extension for domain-specific markup in taxonomy: the experience of a biodiversity publisher. Lyubomir Penev, Terry Catapano, Donat Agosti, Teodor Georgiev, Guido Sautter, Pavel Stoev. JATS-Con, 16-17 Oct 2012. Plazi
2
This presentation will focus on: implementation of TaxPub, an extension to the general NLM JATS DTD for taxonomy publishing; semantic tagging of and enhancements to published texts; dissemination of published information to aggregators; current and future development of TaxPub.
3
Quick facts about Plazi. Plazi founded in 2008: Swiss-based NGO with members in Switzerland, Germany, US and Iran. Plazi is a research-based think tank with the mission to promote the idea of open access to scientific content. Plazi has four pillars: legal advice, technical solutions (e.g., TaxPub), maintenance of a treatment repository, advocacy. Plazi GmbH founded in 2012 as a service SME owned by Plazi to provide document conversion services and consultation. Funding from public donors (e.g., EU) and private donors. Clients are global.
4
Context. Conservation: global biodiversity crisis; increasing loss of species, but no tools to measure and document it. Science: ca. 1.8M species described, ca. 8M expected. Scientific publications: ca. 17,000 species described per annum; ca. 100,000 redescriptions per annum -> rich content, highly fragmented, with over 2,500 journals and books involved -> difficult access. Solution: Open Access and semantically enhanced publications allow immediate registration of new taxa and dissemination of content -> TaxPub JATS/DTD.
5
This presentation will focus on: implementation of TaxPub, an extension to the general NLM JATS DTD for taxonomy publishing; semantic tagging of and enhancements to published texts; dissemination of published information to aggregators; current and future development of TaxPub.
6
TaxPub. Lightweight extension of Blue DTD. Described at JATS-Con 2010: “TaxPub: An Extension of the NLM/NCBI Journal Publishing DTD for Taxonomic Descriptions” (http://www.ncbi.nlm.nih.gov/books/NBK47081/). Treatments (i.e., species descriptions) [element names lost in extraction]. Domain-specific content [element names lost in extraction]: taxonomic names; references to specimens; descriptions of morphological features.
7
Platyscelio mzantsi urn:lsid:zoobank.org:act:D084EF48-4736-444F-916F-2C8CDE23E29B urn:lsid:biosci.ohio-state.edu:osuc_concepts:242617 Taekul & Johnson sp. n....
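The slide above shows only the text content of a tagged nomenclature block; the tags themselves were lost. As a minimal sketch of how such a block might be assembled, the Python below builds a TaxPub-style element tree. The namespace URI, the element names (`tp:nomenclature`, `tp:taxon-name`, `tp:taxon-name-part`, `tp:taxon-authority`, `tp:taxon-status`, `object-id`) and the `taxon-name-part-type` attribute are recalled from the TaxPub schema and should be checked against it; the `content-type` labels `zoobank` and `osuc` and the helper `build_nomenclature` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: TaxPub namespace URI as recalled from the schema; verify before use.
TP = "http://www.plazi.org/taxpub"
ET.register_namespace("tp", TP)


def build_nomenclature(genus, species, authority, status, lsids):
    """Return a tp:nomenclature element for one taxon name (hypothetical helper)."""
    nom = ET.Element(f"{{{TP}}}nomenclature")
    name = ET.SubElement(nom, f"{{{TP}}}taxon-name")
    # One name part per rank, typed by attribute.
    for part_type, text in (("genus", genus), ("species", species)):
        part = ET.SubElement(name, f"{{{TP}}}taxon-name-part")
        part.set("taxon-name-part-type", part_type)
        part.text = text
    # Attach each identifier; the content-type labels are illustrative only.
    for kind, lsid in lsids.items():
        oid = ET.SubElement(name, "object-id")
        oid.set("content-type", kind)
        oid.text = lsid
    ET.SubElement(nom, f"{{{TP}}}taxon-authority").text = authority
    ET.SubElement(nom, f"{{{TP}}}taxon-status").text = status
    return nom


nomenclature = build_nomenclature(
    "Platyscelio",
    "mzantsi",
    "Taekul & Johnson",
    "sp. n.",
    {
        "zoobank": "urn:lsid:zoobank.org:act:D084EF48-4736-444F-916F-2C8CDE23E29B",
        "osuc": "urn:lsid:biosci.ohio-state.edu:osuc_concepts:242617",
    },
)
xml_text = ET.tostring(nomenclature, encoding="unicode")
print(xml_text)
```

Typing the name parts separately is what lets an aggregator tell genus from species without re-parsing the printed name.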
8
Holotype worker. King Saud Museum of Arthropods (KSMA), College of Food and Agriculture Sciences, King Saud University, Riyadh, Kingdom of Saudi Arabia. SAUDI ARABIA, Al Bahah province, Amadan forest, Al Mandaq governorate, 20°12'N, 41°13'E, 1881 m.a.s.l. 19.V.2010 (M. R. Sharaf & A. S. Aldawood Leg.);
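Marking up a material citation like this one usually means turning printed values into machine-readable ones. As a minimal sketch, assuming the target is signed decimal degrees of the kind Darwin Core's `decimalLatitude` and `decimalLongitude` terms expect, the Python below converts the degrees-and-minutes coordinates from the citation. The pattern handles only the `DD°MM'H` form seen on the slide, and the function names are hypothetical.

```python
import re

# Matches degree-minute coordinates such as 20°12'N or 41°13'E (no seconds).
DM_PATTERN = re.compile(r"(\d{1,3})°(\d{1,2})'([NSEW])")


def dm_to_decimal(token):
    """Convert one degrees-minutes token with hemisphere to signed decimal degrees."""
    match = DM_PATTERN.fullmatch(token.strip())
    if match is None:
        raise ValueError(f"unrecognised coordinate: {token!r}")
    degrees, minutes, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60
    # Southern and western hemispheres are negative by convention.
    return -value if hemisphere in "SW" else value


def parse_lat_lon(text):
    """Return the first latitude and first longitude found in free text."""
    tokens = [m.group(0) for m in DM_PATTERN.finditer(text)]
    lat = next(t for t in tokens if t[-1] in "NS")
    lon = next(t for t in tokens if t[-1] in "EW")
    return dm_to_decimal(lat), dm_to_decimal(lon)


citation = "Al Mandaq governorate, 20°12'N, 41°13'E, 1881 m.a.s.l. 19.V.2010"
latitude, longitude = parse_lat_lon(citation)
print(round(latitude, 4), round(longitude, 4))
```

Real citations vary widely in coordinate notation, which is one reason the deck later calls markup of occurrence data "a great challenge".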
9
TaxPub: Recent and Future Developments. Largely stable; Greenfication; interest from journals: European Journal of Taxonomy, Zootaxa (via EOL); markup of morphological descriptions.
10
Spreading shrub; stems erect, greenish (http://ontology.org/plant/stem-color, http://ontology.org/plant/greenish). Leaves deciduous early in summer (particularly when infected with Diseasomyces), oblong, apex obtuse, glabrous or weakly hirsute; stipules sharply pointed, 3,2mm wide, (http://ontology.org/plant/stipule-color) black or darkish brown, extremely rarely yellow, often shallowly joined around the node; spines stout.
11
TaxPub: Challenges. Maintenance: SourceForge; volunteer effort, little time, no funding…; supported by Plazi. Documentation: comments with ad hoc markup in extension files; converted to HTML by NCBI tool; maintained at Species-ID wiki.
14
Quick facts about Pensoft & ZooKeys. Pensoft founded in 1992: more than 700 books published; two offices in Sofia and Moscow; 16 employees. ZooKeys launched in July 2008 as the first mandatory Open Access journal in taxonomy; 205 issues, 20,000 pages IN FOUR YEARS. All new taxa registered in ZooBank and supplied to EOL, Plazi and the wiki Species-ID. CrossRef member; ISI and Scopus covered; indexed in Zoological Record, DOAJ, CABI Abstracts, Google Scholar; archived in PubMedCentral and CLOCKSS. Pensoft Journal System: XML-based online editorial system; publishing services offered to society and institutional journals.
15
ZooKeys growth
16
[Diagram] The XML landscape for legacy and prospective taxonomic literature. PROSPECTIVE PUBLISHING: automated submission; peer review; Pensoft mark-up tool; TaxPub XML schema. HISTORICAL LITERATURE: Plazi's GoldenGATE editor; TaxonX, taXMLit schemas. Both feed marked-up publications and a unified marked-up final output (taxon treatments, keys, images, localities), which goes to: PDF, HTML and XML archiving; wikis (Species-ID, Wikispecies, Wikipedia); indexing (IPNI, ZooBank, MycoBank, GNA); aggregators (EOL, GBIF); content management systems and repositories (e.g., EOL, GBIF, Scratchpads); electronic archives and data centers; END USERS.
17
Four stages of the XML-based editorial workflow. SUBMISSION: XML-tagged or non-tagged manuscripts? PEER-REVIEW/EDITORIAL PROCESS: the technical challenges of XML markup. PUBLICATION: different publishing formats, and to whom are they addressed? DISSEMINATION: how to achieve maximum distribution of published information.
18
But why mark up? Is it really needed? Who will be using it? Descriptions; Images; Occurrences; Nomenclature; Literature; Plazi
19
What does XML give readers that the usual PDF does not?
20
Semantic enhancements to published texts
23
Archiving in PubMedCentral
24
Automated export of species descriptions to Encyclopedia of Life (EOL) XML MARK UP
25
Automated harvesting and deposition of taxon treatments in Plazi
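The deck does not describe how the harvesting is implemented. As a rough illustration of why tagged treatments make it possible at all, the sketch below pulls every treatment out of a TaxPub-style article and returns its name and sections. The namespace URI and element names are recalled from the TaxPub schema and should be verified; the sample article, its description text, and the function `harvest_treatments` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: TaxPub namespace URI as recalled from the schema; verify before use.
TP = "http://www.plazi.org/taxpub"
NS = {"tp": TP}

# Invented sample article; the description text is placeholder content.
SAMPLE_ARTICLE = """<article xmlns:tp="http://www.plazi.org/taxpub">
  <body>
    <sec><title>Introduction</title><p>Not a treatment.</p></sec>
    <tp:taxon-treatment>
      <tp:nomenclature>
        <tp:taxon-name>
          <tp:taxon-name-part taxon-name-part-type="genus">Platyscelio</tp:taxon-name-part>
          <tp:taxon-name-part taxon-name-part-type="species">mzantsi</tp:taxon-name-part>
        </tp:taxon-name>
      </tp:nomenclature>
      <tp:treatment-sec sec-type="description"><p>Placeholder   description.</p></tp:treatment-sec>
    </tp:taxon-treatment>
  </body>
</article>"""


def harvest_treatments(article_xml):
    """Return one record per tp:taxon-treatment found anywhere in the article."""
    root = ET.fromstring(article_xml)
    records = []
    for treatment in root.iter(f"{{{TP}}}taxon-treatment"):
        parts = treatment.findall(
            "tp:nomenclature/tp:taxon-name/tp:taxon-name-part", NS
        )
        name = " ".join(p.text.strip() for p in parts if p.text)
        # Flatten each section to whitespace-normalised plain text.
        sections = {
            sec.get("sec-type", "unknown"): " ".join("".join(sec.itertext()).split())
            for sec in treatment.findall("tp:treatment-sec", NS)
        }
        records.append({"name": name, "sections": sections})
    return records


records = harvest_treatments(SAMPLE_ARTICLE)
print(records)
```

Because the treatment boundaries are explicit in the markup, the harvester can ignore the introduction and other ordinary sections without any text heuristics.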
26
Export of content to the Wiki environment
27
Species descriptions on Wikispecies and Wikimedia Commons
29
The Future of TaxPub and its implementations. More Semantic Web enhancements! Pensoft Writing Tool (PWT): a collaborative article writing platform. Community-based and open peer review process. Biodiversity Data Journal will publish any kind of “small data”: checklists, nomenclatural acts, taxon treatments.
30
The collaborative article authoring tool
34
Why is the Biodiversity Data Journal needed?
35
Primary data. Publishing and sharing of primary data; RE-USE of CONTENT. (Drawings: Slavena Peneva)
36
Biodiversity Data Journal. All data matters: NO lower or upper limit of manuscript size! ALL within a single online collaborative platform, including the writing of the manuscript! Collaborative article authoring tool. Community peer review with “open” and “public” options, on top of conventional peer review. Online editorial process and version control. Standard-compliant (Darwin Core, Dublin Core, NLM JATS, etc.). Pre-defined biological Code-compliant article templates.
37
Life cycle of data published in the BDJ. BIODIVERSITY MANUSCRIPT: occurrence data; genome data; image galleries; morphometric data; environmental data; phylogenetic data; any other data. XML MARK UP -> structured text (data!). ARTICLES: occurrence data; taxon names; taxon treatments; bibliographies. Plazi; BHL; Wiki; COL.
38
The lessons learned. The main difficulties are caused by: the specificity of the domain (e.g., taxon names, synonyms, instability of nomenclature, lack of global LSID infrastructure, etc.); markup of occurrence data (certainly a great challenge); cost efficiency of the markup process; sociological barriers: the majority of authors are not willing to change their writing habits, and most are still not aware of the tremendous advantages of Web 2.0 technologies; most small taxonomy publishers (and some bigger ones) have no experience in XML-based editorial workflows or simply can't afford them.
39
“Semi-automatically generated semantic, enhanced e-publications are the only way to describe the missing 10 M species, and to deal with an increasing flood of data.” Donat Agosti. It is not easy, but... it is exciting... however possible only through Open Access!