Implementation of TaxPub, a JATS extension for domain-specific markup in taxonomy: the experience of a biodiversity publisher Lyubomir Penev, Terry Catapano,


1 Implementation of TaxPub, a JATS extension for domain-specific markup in taxonomy: the experience of a biodiversity publisher Lyubomir Penev, Terry Catapano, Donat Agosti, Teodor Georgiev, Guido Sautter, Pavel Stoev JATS-Con, 16 - 17 Oct 2012 Plazi

2 This presentation will focus on: • Implementation of TaxPub, an extension to the general NLM JATS DTD for taxonomy publishing • Semantic tagging of and enhancements to published texts • Dissemination of published information to aggregators • Current and future development of TaxPub

3 Quick facts about Plazi • Plazi founded in 2008: Swiss-based NGO with members in Switzerland, Germany, US and Iran • Plazi is a research-based think tank with the mission to promote the idea of open access to scientific content • Plazi has four pillars: legal advice, technical solutions (e.g., TaxPub), maintenance of a treatment repository, advocacy • Plazi GmbH founded in 2012 as a service SME owned by Plazi to provide document conversion services and consultation • Funding from public donors, e.g., EU, and private • Clients are global

4 Context • Conservation: global biodiversity crisis; increasing loss of species, but no tools to measure and document it • Science: ca. 1.8M species described, ca. 8M expected • Scientific publications: ca. 17,000 species described per annum; ca. 100,000 redescriptions per annum -> rich content, highly fragmented, with over 2,500 journals and books involved -> difficult access • Solution: Open Access and semantically enhanced publications allow immediate registration of new taxa and dissemination of content -> TaxPub JATS/DTD

5 This presentation will focus on: • Implementation of TaxPub, an extension to the general NLM JATS DTD for taxonomy publishing • Semantic tagging of and enhancements to published texts • Dissemination of published information to aggregators • Current and future development of TaxPub

6 TaxPub • Lightweight extension of Blue DTD • Described at JATS-Con 2010: “TaxPub: An Extension of the NLM/NCBI Journal Publishing DTD for Taxonomic Descriptions” (http://www.ncbi.nlm.nih.gov/books/NBK47081/) • Treatments (i.e., species descriptions) • Domain specific content: taxonomic names; references to specimens; descriptions of morphological features

7 Platyscelio mzantsi urn:lsid:zoobank.org:act:D084EF48-4736-444F-916F-2C8CDE23E29B urn:lsid:biosci.ohio-state.edu:osuc_concepts:242617 Taekul & Johnson sp. n....
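
The XML tags of the example on slide 7 did not survive in this transcript; only the text content remains. As a rough illustration of how such a nomenclature block might be assembled, here is a minimal Python sketch. The element names (tp:nomenclature, tp:taxon-name, tp:taxon-name-part, object-id, tp:taxon-authority, tp:taxon-status), the attribute names and values, and the namespace URI are assumptions drawn from the general shape of TaxPub articles, not a transcription of the slide; the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed TaxPub namespace URI; verify against the schema actually in use.
TP = "http://www.plazi.org/taxpub"
ET.register_namespace("tp", TP)


def tp(tag: str) -> str:
    """Return the namespace-qualified name of a TaxPub element."""
    return f"{{{TP}}}{tag}"


def build_nomenclature(genus, species, lsids, authority, status):
    """Assemble a nomenclature block (element names are assumptions)."""
    nomenclature = ET.Element(tp("nomenclature"))
    name = ET.SubElement(nomenclature, tp("taxon-name"))
    for rank, value in (("genus", genus), ("species", species)):
        part = ET.SubElement(name, tp("taxon-name-part"))
        part.set("taxon-name-part-type", rank)
        part.text = value
    for lsid in lsids:
        # object-id is treated here as a plain JATS element (no tp: prefix).
        object_id = ET.SubElement(name, "object-id")
        object_id.set("content-type", "lsid")  # hypothetical attribute value
        object_id.text = lsid
    ET.SubElement(nomenclature, tp("taxon-authority")).text = authority
    ET.SubElement(nomenclature, tp("taxon-status")).text = status
    return nomenclature


# Text content taken from slide 7.
element = build_nomenclature(
    genus="Platyscelio",
    species="mzantsi",
    lsids=[
        "urn:lsid:zoobank.org:act:D084EF48-4736-444F-916F-2C8CDE23E29B",
        "urn:lsid:biosci.ohio-state.edu:osuc_concepts:242617",
    ],
    authority="Taekul & Johnson",
    status="sp. n.",
)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

Building the tree through an XML library rather than by string concatenation means characters such as the ampersand in the authority string are escaped automatically.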

8 Holotype worker. King Saud Museum of Arthropods (KSMA), College of Food and Agriculture Sciences, King Saud University, Riyadh, Kingdom of Saudi Arabia. SAUDI ARABIA, Al Bahah province, Amadan forest, Al Mandaq governorate, 20°12'N, 41°13'E, 1881 m.a.s.l. 19.V.2010 (M. R. Sharaf & A. S. Aldawood Leg.);
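
A material citation like the one on slide 8 becomes reusable occurrence data once its parts are tagged. The sketch below shows one possible way to pull coordinates and elevation out of the citation text with regular expressions. The patterns only cover the degree-and-minute notation seen on this slide, the function names are hypothetical, and the output keys borrow Darwin Core term names purely for illustration; this is not the tooling described in the presentation.

```python
import re

# Locality portion of the material citation on slide 8.
CITATION = (
    "SAUDI ARABIA, Al Bahah province, Amadan forest, Al Mandaq governorate, "
    "20°12'N, 41°13'E, 1881 m.a.s.l. 19.V.2010"
)

# Degrees and minutes followed by a hemisphere letter, e.g. 20°12'N.
COORD = re.compile(r"(\d{1,3})°(\d{1,2})'([NSEW])")
# Elevation written as "<number> m.a.s.l."
ELEVATION = re.compile(r"(\d+)\s*m\.a\.s\.l\.")


def to_decimal(degrees: str, minutes: str, hemisphere: str) -> float:
    """Convert degrees and minutes to signed decimal degrees."""
    value = int(degrees) + int(minutes) / 60
    return -value if hemisphere in "SW" else value


def parse_citation(text: str) -> dict:
    """Extract coordinates and elevation from a citation string."""
    record = {}
    for degrees, minutes, hemisphere in COORD.findall(text):
        key = "decimalLatitude" if hemisphere in "NS" else "decimalLongitude"
        record[key] = round(to_decimal(degrees, minutes, hemisphere), 4)
    match = ELEVATION.search(text)
    if match:
        record["minimumElevationInMeters"] = int(match.group(1))
    return record


record = parse_citation(CITATION)
print(record)
```

Real citations vary widely in notation (seconds, decimal minutes, different elevation abbreviations), which is one reason the markup of occurrence data is a hard problem.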

9 TaxPub: Recent and Future Developments • Largely stable • Greenfication • Interest from journals: • European Journal of Taxonomy • Zootaxa (via EOL) • Markup of morphological descriptions

10 Spreading shrub; stems erect, greenish http://ontology.org/plant/stem-color http://ontology.org/plant/greenish. Leaves deciduous early in summer (particularly when infected with Diseasomyces), oblong, apex obtuse, glabrous or weakly hirsute; stipules sharply pointed, 3,2mm wide, http://ontology.org/plant/stipule-color black or darkish brown, extremely rarely yellow, often shallowly joined around the node; spines stout.

11 TaxPub: Challenges • Maintenance • SourceForge • Volunteer effort, little time, no funding… • Supported by Plazi • Documentation • Comments with ad hoc markup in extension files • Converted to HTML by NCBI Tool • Maintained at Species-ID wiki

12

13

14 Quick facts about Pensoft & ZooKeys • Pensoft founded in 1992: more than 700 books published; two offices in Sofia and Moscow; 16 employees • ZooKeys launched in July 2008 as the first mandatory Open Access journal in taxonomy; 205 issues, 20,000 pages IN FOUR YEARS • All new taxa registered in ZooBank and supplied to EOL, Plazi and the wiki Species-ID • CrossRef member, ISI and Scopus covered, indexed in Zoological Record, DOAJ, CABI Abstracts, Google Scholar; archived in PubMedCentral and CLOCKSS • Pensoft Journal System – XML-based online editorial system; publishing services offered to society and institutional journals

15 ZooKeys growth

16 The XML landscape for legacy and prospective taxonomic literature. Diagram labels: PROSPECTIVE PUBLISHING | HISTORICAL LITERATURE; Automated submission; peer-review; PENSOFT MARK UP tool; TaxPub XML schema; PLAZI's GoldenGATE editor; TaxonX, taXMLit schemas; Marked up publications; Unified marked up final output (taxon treatments, keys, images, localities); PDF, HTML and XML archiving; Content management systems & repositories (e.g., EOL, GBIF, SCRATCHPADS); WIKI (Species-ID, Wikispecies, Wikipedia); Indexing (IPNI, ZooBank, MycoBank, GNA); Aggregators (EOL, GBIF); Electronic archives; Data Centers; END USERS

17 Four stages of the XML-based editorial workflow • SUBMISSION: XML-tagged or non-tagged manuscripts? • PEER-REVIEW/EDITORIAL PROCESS: The technical challenges of the XML mark up • PUBLICATION: Different publishing formats and to whom are they addressed? • DISSEMINATION: How to provide a maximum distribution of published information

18 But why mark up? Is it really needed? Who will be using it? Descriptions; Images; Occurrences; Nomenclature; Literature; Plazi

19 What does XML give readers beyond what the usual PDF does?

20 Semantic enhancements to published texts

21

22

23 Archiving in PubMedCentral

24 Automated export of species descriptions to Encyclopedia of Life (EOL) XML MARK UP

25 Automated harvesting and deposition of taxon treatments in Plazi

26 Export of content to the Wiki environment

27 Species descriptions on Wikispecies and Wikimedia Commons

28

29 The Future of TaxPub and its implementations More semantic Web Enhancements! Pensoft Writing Tool (PWT) – a collaborative article writing platform Community-based and open peer review process Biodiversity Data Journal will publish any kind of “small data”: checklists, nomenclatural acts, taxon treatments

30 The collaborative article authoring tool

31

32

33

34 Why is the Biodiversity Data Journal needed?

35 Primary data Drawings: Slavena Peneva Publishing and sharing of primary data RE-USE of CONTENT

36 Biodiversity Data Journal • All data matters: NO lower or upper limit of manuscript size! • ALL within a single online collaborative platform, including the writing of the manuscript! • Collaborative article authoring tool • Community peer review with “open” and “public” options, on top of conventional peer review • Online editorial process and version control • Standard-compliant (Darwin Core, Dublin Core, NLM JATS, etc.) • Pre-defined biological Code-compliant article templates

37 Life cycle of data published in the BDJ. Diagram labels: BIODIVERSITY MANUSCRIPT; Occurrence data; Genome data; Image galleries; Morphometric data; Environmental data; Phylogenetic data; Any other data; XML MARK UP; Structured text (data!); ARTICLES; Occurrence data; Taxon names; Taxon treatments; Plazi; BHL; Wiki; COL; Bibliographies

38 The lessons learned. The main difficulties are caused by: • The specificity of the domain (e.g., taxon names, synonyms, instability of nomenclature, lack of global LSID infrastructure, etc.) • Mark up of occurrence data (certainly a great challenge) • Cost efficiency of the markup process • Sociological barriers: the majority of authors are not willing to change their writing habits; most are still not aware of the tremendous advantages of Web 2.0 technologies • Most small taxonomy publishers (and some bigger ones) have no experience in XML-based editorial workflows or they simply can’t afford it

39 “Semi-automatically generated semantic, enhanced e-publications are the only way to describe the missing 10 M species, and to deal with an increasing flood of data.” Donat Agosti. It is not easy, but... it is exciting... however possible only through Open Access!

