Plazi: Prospects for Markup of Legacy and New Taxonomic Literature Terry Catapano TDWG Fremantle, WA October 21, 2008.

1 Plazi: Prospects for Markup of Legacy and New Taxonomic Literature Terry Catapano TDWG Fremantle, WA October 21, 2008

2 NSF/DFG Grant (AMNH/University of Karlsruhe). XML markup of taxonomic publications for extraction of: treatments; scientific names; morphological characters; distribution data; collection locales/events. For: open access; submission to databases; retrieval; ontology development.
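
As a rough illustration of the kind of extraction this slide lists, the sketch below pulls scientific names and collection localities out of a marked-up document. The element names (`treatment`, `nomenclature`, `name`, `materials`, `locality`) and the sample content are hypothetical; they are not taken from TaxonX or any other schema named in the talk.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: element names and content are illustrative only.
SAMPLE = """
<document>
  <treatment>
    <nomenclature><name rank="species">Aus bus</name></nomenclature>
    <materials><locality>Fremantle</locality><locality>Perth</locality></materials>
  </treatment>
  <treatment>
    <nomenclature><name rank="species">Aus cus</name></nomenclature>
    <materials><locality>Karlsruhe</locality></materials>
  </treatment>
</document>
"""

def extract_treatments(xml_text):
    """Return one dict per treatment with its scientific name and localities."""
    root = ET.fromstring(xml_text)
    records = []
    for treatment in root.iter("treatment"):
        name = treatment.find("./nomenclature/name")
        records.append({
            "name": name.text if name is not None else None,
            "localities": [loc.text for loc in treatment.iter("locality")],
        })
    return records

records = extract_treatments(SAMPLE)
for record in records:
    print(record["name"], record["localities"])
```

Once the text is marked up this way, each treatment becomes a record that could be submitted to a database or indexed for retrieval without re-reading the prose.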

3 Markup Languages. Provide a grammar to define document types. Delineate & identify document elements (atoms) in text. Syntax: structural relationships between elements (parent/child, cardinality, ordinality, id/idref, key/keyref). Beyond the PDF.
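
Two of the structural relationships named on this slide, parent/child and id/idref, can be sketched with Python's standard XML parser. The document below is invented for illustration, and the check is done by hand here; a validating parser working from a DTD or schema would normally enforce it.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: citations point at references through id/idref.
SAMPLE = """
<article>
  <body>
    <p>See <cite idref="ref1"/> and <cite idref="ref9"/>.</p>
  </body>
  <references>
    <ref id="ref1">Author A. 2008.</ref>
    <ref id="ref2">Author B. 2007.</ref>
  </references>
</article>
"""

def child_tags(xml_text):
    """Parent/child: list the tags of the root element's direct children."""
    root = ET.fromstring(xml_text)
    return [child.tag for child in root]

def dangling_idrefs(xml_text):
    """id/idref: return idref values that match no id in the document."""
    root = ET.fromstring(xml_text)
    ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    refs = {el.get("idref") for el in root.iter() if el.get("idref") is not None}
    return sorted(refs - ids)

print(child_tags(SAMPLE))
print(dangling_idrefs(SAMPLE))
```

The second function reports `ref9`, a citation with no matching reference, which is the sort of inconsistency a document grammar is meant to rule out.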


5 TaxonX schema; GoldenGate Editor; 250 docs/7,500 treatments; DSpace-based Digital Object Repository (handles); SRS; TAPIR (specimen data); Species Profile Model/RDF (descriptive data).

6 Wildly heterogeneous. Requires lax structuring of documents. Need for regularization: requires editorial policy (reproduction: text of work or text of document). Defers much work of interoperability. Benefits: treatments + names, subsections, localities, bibliographic references; extraction & representation in other services. Costs: GoldenGate configured for testbed: 3 minutes per page; $5 per page(?).
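
The per-page figures on this slide can be scaled to a whole corpus with simple arithmetic. The sketch below assumes the slide's 3 minutes and $5 per page (the latter marked with a question mark on the slide itself); the page count of 1,000 is a made-up input, since the talk gives document and treatment counts but no page total.

```python
def estimate_markup_cost(pages, minutes_per_page=3, dollars_per_page=5):
    """Scale per-page markup effort to a corpus.

    The defaults come from the slide; `pages` is supplied by the caller.
    Returns (hours of markup time, total cost in dollars).
    """
    hours = pages * minutes_per_page / 60
    dollars = pages * dollars_per_page
    return hours, dollars

# Hypothetical corpus size, chosen only to show the calculation.
hours, dollars = estimate_markup_cost(1000)
print(f"{hours:.0f} hours, ${dollars}")
```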

7 New Literature. Different markup activity: prospective, not retrospective. Better cost/benefit ratio? Strict modeling for consistent documents/data. Increased regularization. Increased sharing, re-use. Decreased costs (potentially): application, QC, adoption.

8 TDWG Vocabularies supply many concepts. NLM Journal Archiving and Interchange Tag Suite: DTDs for markup of journal articles; Archiving, Publishing, Authoring, other modules possible; wide adoption by publishers and aggregators, LOC; actively maintained. Module for taxonomic treatments in Publishing.

9 Inherit generic features from existing Tag Set: bibliographic references; tables; linking supporting material/data (xlink); linking to graphic and media objects (xlink). Treatments: treatment sections; scientific names, geographic names, characters/states; specimens and other materials citations.
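
The xlink-based linking mentioned on this slide amounts to a namespaced `href` attribute on the linking element. The sketch below collects such links from a small invented document; the XLink namespace URI is the standard one, but the element names and file names are illustrative and are not claimed to match the NLM tag set.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"

# Hypothetical article: element and file names are illustrative only.
SAMPLE = """
<article xmlns:xlink="http://www.w3.org/1999/xlink">
  <graphic xlink:href="fig1.png"/>
  <supplement xlink:href="data.csv"/>
  <p>No link here.</p>
</article>
"""

def linked_resources(xml_text):
    """Return (element tag, link target) for every element with xlink:href."""
    root = ET.fromstring(xml_text)
    # ElementTree exposes namespaced attributes as "{namespace-uri}localname".
    key = f"{{{XLINK_NS}}}href"
    return [(el.tag, el.get(key)) for el in root.iter() if el.get(key) is not None]

links = linked_resources(SAMPLE)
for tag, target in links:
    print(tag, target)
```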

10 Plazi: NLM conversion of Zootaxa and PLOS One articles. Apply markup at earliest stage possible. Develop tools to assist (probably easier than for “pure” legacy literature). Extend codes and structures to handle electronic publication. Shifts “illustrated narrative” to complex digital objects: METS, OAI-ORE, MPEG-21/DIDL.

11 [Diagram; labels: Text, Materials, Description, Treatment, Image, Data, Nomenclature]

12 Linked Data. Machines > Documents > Data. Open documents, free data. Reduced costs of use/re-use (e.g., SPM for EOL). Broaden scope of application. Accelerate velocity of information exchange.
