TaxPub: An Extension of JATS for Taxonomic Descriptions
Terry Catapano, Plazi
Leiden, Netherlands, 2013-02-14

JATS
Journal Article Tag Suite
Formerly NLM/NCBI Journal Archiving and Publishing Tag Suite
Version 1, 2002
PubMed Central
Widely adopted by STM publishers
Now NISO JATS ANSI/NISO Z

JATS DTD Spectrum
Legacy/Loose
Prospective/Strict
Archiving (Green) DTD
Publishing (Blue) DTD
Now also Authoring and Book DTDs
Offers extensibility features

Taxonomic Descriptions
“Treatment”
Discussion of the features/distribution of a related group of organisms, “taxon”
Formal conventions: ICZN, ICBN, etc.
Frequently parts of publications
Cited as discrete objects
200+ year history

Linnaeus, Systema Naturae, 10th Edition,

Taekul, C., N. F. Johnson, L. Masner, A. Polaszek and Rajmohana K World species of the genus Platyscelio Kieffer (Hymenoptera, Platygastridae). ZooKeys 50:

Treatment Components
Nomenclature
  o Name
  o Authority
  o Status, etc…
Description
Materials Examined
  o Specimens
      - Collection
      - Deposit
Diagnosis, Distribution, Etymology, Key, etc…
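The component list above can be read as a simple nested data structure. The following is a minimal Python sketch of that reading; the class and field names are illustrative assumptions rather than TaxPub element names, and the sample values are taken from the Nixonia masneri example used later in the talk.

```python
from dataclasses import dataclass, field

# Illustrative model only: class and field names are assumptions,
# chosen to mirror the slide's component list, not the TaxPub DTD.

@dataclass
class Nomenclature:
    name: str
    authority: str = ""
    status: str = ""

@dataclass
class Specimen:
    collection: str = ""  # collecting details for the specimen
    deposit: str = ""     # where the specimen is deposited

@dataclass
class Treatment:
    nomenclature: Nomenclature
    description: str = ""
    materials_examined: list = field(default_factory=list)  # Specimen items
    sections: dict = field(default_factory=dict)  # Diagnosis, Distribution, Etymology, Key, ...

treatment = Treatment(
    nomenclature=Nomenclature("Nixonia masneri", "van Noort & Johnson", "sp. n."),
    sections={"Etymology": "Named in honour of Lubomír Masner"},
)
print(treatment.nomenclature.name, treatment.nomenclature.status)
```

Keeping nomenclature as the only mandatory field reflects the idea that a treatment is anchored on a name, while the other sections vary from publication to publication.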

Background: TaxonX
NSF/DFG Funded Project
Extraction of species data from taxonomic literature of ants
TaxonX schema for markup of corpus
c. 500 publications; c. 11,000 treatments
Development continued by Plazi

Legacy Literature: Challenges
Text accuracy
Formal/Editorial Variety
Condensed Information
Loose schema, higher costs of application

New Literature: Rationale
Matt Yoder et al., Development of the Hymenoptera Anatomy Ontology: Implications for Systematics and Literature Mark-up

TaxPub
Extension of Publishing (“Blue”) DTD
Parsimony: largely rely on base DTD
“tp:” namespace
Available throughout
  o : scientific names
  o : morphology
  o : specimens; gene sequences
Within
  o + subelements

A further undescribed Nixonia species related to N. lamorali emerged from processing of samples collected in Kogelberg Biosphere Reserve (50km east of Cape Town). This species may usurp N. gigas...

, regularized form of name
object-id: identifier(s) for name
  o semantics of xlink
semantics for name components
  o string
  o use URIs: here terms from Darwin Core vocabulary (
N. lamorali
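As a rough illustration of the name markup described above, the Python sketch below tags the abbreviated name "N. lamorali" and recovers its regularized form. The namespace URI, the element names (`taxon-name`, `taxon-name-part`), and the `type` and `reg` attributes are assumptions made for this example and may not match the TaxPub DTD; the two Darwin Core term URIs (`genus`, `specificEpithet`) are real terms from that vocabulary.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and element/attribute names, for illustration only.
TP_NS = "http://example.org/taxpub"
DWC = "http://rs.tdwg.org/dwc/terms/"  # Darwin Core terms namespace

SAMPLE = f"""<p xmlns:tp="{TP_NS}">A further undescribed Nixonia species related to
<tp:taxon-name>
  <tp:taxon-name-part type="{DWC}genus" reg="Nixonia">N.</tp:taxon-name-part>
  <tp:taxon-name-part type="{DWC}specificEpithet" reg="lamorali">lamorali</tp:taxon-name-part>
</tp:taxon-name> emerged from processing of samples.</p>"""

def regularized_names(xml_text):
    """Return the regularized form of each tagged scientific name."""
    root = ET.fromstring(xml_text)
    names = []
    for name in root.iter(f"{{{TP_NS}}}taxon-name"):
        parts = name.findall(f"{{{TP_NS}}}taxon-name-part")
        # Prefer the regularized value; fall back to the printed text.
        names.append(" ".join(p.get("reg") or (p.text or "").strip() for p in parts))
    return names

print(regularized_names(SAMPLE))
```

Typing each name component with a vocabulary URI rather than a free string lets downstream tools treat "N." and "Nixonia" as the same genus without guessing from the printed abbreviation.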

Relatively undeveloped
Modeling of descriptions challenging
  o complex, if formal, natural language
Segment text
  o Delineate components
  o Normalize/Annotate
  o

... Length 7.0 mm ; completely black, tarsi lighter (figs. 2A, B); wings infuscate throughout, brownish tarsi lighter...

Spreading shrub; stems erect, greenish Leaves deciduous early in summer (particularly when infected with Diseasomyces), oblong, apex obtuse, glabrous or weakly hirsute; stipules sharply pointed, 3,2mm wide, black or darkish brown, extremely rarely yellow, often shallowly joined around the node; spines stout.

: how, when collected
  o : where collected
: current location

, con't
1 male, South Africa Western Cape Langberg Farm, (3 km 270° W Langebaanweg) 32°58.461’S 18°07.344’E 12–19 Mar 2003, S. van Noort, Malaise trap, LW02-N2-M175, Sand Plain Fynbos, SAM-HYM-P030184, OSUC ), ( SAMC )
tp:location:
    - URI (Darwin Core)
    - string
named-content: all other components
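One practical reason to tag the collecting location in a specimen citation is that its coordinates can then be normalized. The Python sketch below converts the degrees and decimal minutes printed in the citation above into signed decimal degrees; the function name and the regular expression are assumptions for illustration, not part of TaxPub.

```python
import re

# Degrees plus decimal minutes followed by a hemisphere letter, e.g. 32°58.461’S.
# The bearing "270° W" in the citation is skipped because no minutes follow the degree sign.
COORD = re.compile(r"(\d+)°(\d+(?:\.\d+)?)[’']([NSEW])")

def to_decimal_degrees(text):
    """Return signed decimal degrees for every coordinate found in text."""
    values = []
    for degrees, minutes, hemisphere in COORD.findall(text):
        value = int(degrees) + float(minutes) / 60
        if hemisphere in "SW":
            value = -value  # south and west are negative by convention
        values.append(round(value, 5))
    return values

citation = ("Langberg Farm, (3 km 270° W Langebaanweg) "
            "32°58.461’S 18°07.344’E 12–19 Mar 2003, S. van Noort, Malaise trap")
latitude, longitude = to_decimal_degrees(citation)
print(latitude, longitude)
```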

tp:treatment and Sub-Elements

  o bibliographic metadata for treatments
  o standalone treatments
: required
  o : required
  o other elements...

Nixonia masneri van Noort & Johnson sp. n. Figures 1A–F

Nixonia Masner, 1958, 101 Original description. Type: Nixonia pretiosa Masner, by monotypy and original designation. For subsequent taxonomic literature see Johnson (1992) or The Genera of Platygastroidea of the World ( hymenoptera/platygastroidea ).

, con't
Type material: Holotype...
Diagnosis: Most similar to...
Etymology: Named in honour of Lubomír Masner,...
Distribution and habitat association: Currently only known from two widely spaced localities....
Description...

Keys
Identify subordinate taxa within higher taxon (e.g., species in genus)
No model in TaxPub
Use existing JATS table model
Use or

Keys, con't
Key to species of Nixonia
Online interactive key...
1 Third antennal segment shorter than, or subequal to, second antennal segment 2
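Since keys are carried by the existing JATS table model rather than a dedicated TaxPub structure, one lead of a key can be sketched as a table row. The Python example below builds such a table with the standard library; the three-cell row layout (couplet number, statement, next couplet) and the `content-type="key"` value are assumptions for illustration, while `table-wrap`, `caption`, `title`, `table`, `tbody`, `tr` and `td` are standard JATS table elements.

```python
import xml.etree.ElementTree as ET

def key_to_table(title, leads):
    """Build a JATS-style table-wrap from (couplet, statement, target) leads."""
    # The "key" content-type value is an assumption, not a TaxPub requirement.
    wrap = ET.Element("table-wrap", {"content-type": "key"})
    caption = ET.SubElement(wrap, "caption")
    ET.SubElement(caption, "title").text = title
    tbody = ET.SubElement(ET.SubElement(wrap, "table"), "tbody")
    for couplet, statement, target in leads:
        row = ET.SubElement(tbody, "tr")
        for cell in (couplet, statement, target):
            ET.SubElement(row, "td").text = str(cell)
    return wrap

leads = [
    (1, "Third antennal segment shorter than, or subequal to, second antennal segment", 2),
]
table = key_to_table("Key to species of Nixonia", leads)
print(ET.tostring(table, encoding="unicode"))
```

Reusing the table model keeps key markup valid against the base DTD, at the cost of leaving the couplet logic implicit in cell positions.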

Future Work
Extension to “Green”/Archiving DTD
  o For legacy literature
Descriptive data (i.e., keys, characters, states, etc...)
Tools
XSLT stylesheets for rendition/proofing
XSLT stylesheets for conversion to external formats
Development of supporting vocabularies
Schematron for profiling
Stand-alone validator
Implementations
EJT
Smithsonian
Zootaxa