The PLAZI Markup System. Donat Agosti, Terry Catapano, Robert “Bob” Morris, Guido Sautter. Universität Karlsruhe (TH), Research University – founded 1825.

1 The PLAZI Markup System
Donat Agosti, Terry Catapano, Robert “Bob” Morris, Guido Sautter
Universität Karlsruhe (TH), Research University – founded 1825

2 The PLAZI Markup System (overview diagram)
Components:
–GoldenGATE Document Editor: document markup, external referencing
–PLAZI Server: XML & PDF storage, treatment server
–PLAZI Search Portal: search portal, TAPIR provider, RSS feed
–External Data Sources: taxonomic data sources & web services
Data exchanged between components: marked-up documents; queries; treatments, detail data, PDF document Handles; links, materials citations; taxon LSIDs, GeoData; new taxon names

3 The PLAZI Server
GoldenGATE Search & Retrieval Server (SRS)
–Extracts individual treatments from XML documents
–Stores and indexes treatments
–Based on independent, pluggable indexers: taxonomic names (TN), materials citations (MC), document metadata (MD), full text (FT)
–Serves treatments or indexed details
DSpace
–Stores PDF and XML documents
–Issues Handles for documents
[Diagram labels: Web Service; SRS; Indexers (TN, MC, MD, FT); PostgreSQL; File System; Document Management Data; Index Data; XML Documents]
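The idea of splitting a document into treatments and passing each one to pluggable indexers can be illustrated roughly as follows. This is only a sketch, not the SRS implementation: the element names (`treatment`, `taxonName`), the `id` attribute, the indexer functions, and the placeholder taxon names are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical markup; element and attribute names are illustrative only.
SAMPLE = """<document>
  <treatment id="t1"><taxonName>Aus bus</taxonName> one worker examined</treatment>
  <treatment id="t2"><taxonName>Aus cus</taxonName> no material listed</treatment>
</document>"""


def taxon_name_indexer(treatment):
    """Yield the taxon names marked up inside one treatment."""
    for element in treatment.iter("taxonName"):
        if element.text:
            yield element.text.strip()


def full_text_indexer(treatment):
    """Yield every lower-cased word of the treatment's text content."""
    yield from " ".join(treatment.itertext()).lower().split()


# Registry of pluggable indexers (hypothetical names).
INDEXERS = {"taxonName": taxon_name_indexer, "fullText": full_text_indexer}


def index_document(xml_text, indexers=INDEXERS):
    """Extract treatments and map each index key to the treatment ids containing it."""
    index = {name: defaultdict(set) for name in indexers}
    root = ET.fromstring(xml_text)
    for treatment in root.iter("treatment"):
        treatment_id = treatment.get("id")
        for name, indexer in indexers.items():
            for key in indexer(treatment):
                index[name][key].add(treatment_id)
    return index


index = index_document(SAMPLE)
print(sorted(index["taxonName"]["Aus bus"]))
print(sorted(index["fullText"]["aus"]))
```

Adding another index (for example for materials citations) would then only mean registering one more function in `INDEXERS`.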

4 The PLAZI Markup System (overview diagram, repeated from slide 2)

5 The PLAZI Search Portal
Series of Java Servlets running in Apache Tomcat
Front-end for SRS Web Service
Linker plug-ins create hyperlinks to other web sites
HTML-based search portal for humans
–Search treatments & index data
–Links submitting new search queries
–Links to external data sources (e.g. HNS, GoogleMaps)
–Links to PDF document & XML versions of treatments
XML document access in various XML schemas
TAPIR provider
–Taxonomic names
–Materials citations
RSS feed for new treatments
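An RSS feed for new treatments could look roughly like the output of the sketch below. It only shows the general shape of an RSS 2.0 document; the function name `build_rss`, the dictionary keys, the example.org URLs, and the placeholder taxon name are assumptions, and the portal itself is built from Java servlets, not Python.

```python
import xml.etree.ElementTree as ET


def build_rss(channel_title, channel_link, treatments):
    """Build a minimal RSS 2.0 feed with one <item> per treatment."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = "Newly added treatments"
    for treatment in treatments:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = treatment["taxon"]
        ET.SubElement(item, "link").text = treatment["url"]
    return ET.tostring(rss, encoding="unicode")


# Hypothetical data; the URL is a placeholder, not a real portal address.
feed = build_rss(
    "New treatments",
    "http://example.org/portal",
    [{"taxon": "Aus bus", "url": "http://example.org/portal/treatment/t1"}],
)
print(feed)
```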

6 The PLAZI Search Portal [screenshot; visible text: Probolomyrmex tani]

7 The PLAZI Markup System (overview diagram, repeated from slide 2)

8 The GoldenGATE Editor
Java-based editor for semi-automated document markup
Extensible through plug-in mechanism
Independent of specific XML schema
Element-level XML editing (XML syntax is generated)
Flexible display for clear view on all detail levels
Existing plug-ins provide broad spectrum of functionality:
–NLP-based markup generation
   Regular expressions, gazetteers, GATE JAPE
   Homegrown and third-party NLP components
   Import of data from external sources (e.g. LSIDs)
–Specialized document views for correcting NLP results
–Markup transformation & filtering
–IO components for different data formats & storage locations (e.g. for uploading XML documents to PLAZI server)
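Markup generation from regular expressions and gazetteers can be pictured with the small sketch below: a pattern proposes candidate binomials, a gazetteer of known genus names filters them, and the XML syntax is generated around the accepted matches. The pattern, the `taxonName` element, the gazetteer content, and the placeholder name "Aus bus" are assumptions for illustration; GoldenGATE's actual plug-ins are not shown here.

```python
import re
from xml.sax.saxutils import escape

# Hypothetical gazetteer of known genus names (placeholder entry).
GENUS_GAZETTEER = {"Aus"}

# Candidate binomial: capitalized word followed by a lower-case word.
BINOMIAL = re.compile(r"\b([A-Z][a-z]+) ([a-z]{2,})\b")


def tag_taxon_names(text, genera=GENUS_GAZETTEER):
    """Wrap gazetteer-confirmed binomials in <taxonName> tags, escaping all other text."""
    pieces = []
    position = 0
    for match in BINOMIAL.finditer(text):
        if match.group(1) not in genera:
            continue  # regex candidate rejected by the gazetteer
        pieces.append(escape(text[position:match.start()]))
        pieces.append("<taxonName>%s</taxonName>" % escape(match.group(0)))
        position = match.end()
    pieces.append(escape(text[position:]))
    return "".join(pieces)


print(tag_taxon_names("The ant Aus bus was found & Then collected."))
# prints: The ant <taxonName>Aus bus</taxonName> was found &amp; Then collected.
```

Candidates such as "The ant" match the pattern but are dropped by the gazetteer, which is why results of this kind still benefit from a manual correction step.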

9 The GoldenGATE Editor

10 The PLAZI Markup System (overview diagram, repeated from slide 2)

11 The External Data Sources
Hymenoptera Name Server (HNS)
–Retrieve LSIDs for taxon names
–Enter new taxon names in HNS database
Further LSID sources: ZooBank, Index Fungorum
GBIF pulls materials citations via TAPIR
EOL pulls treatments via TAPIR (to start soon)
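LSIDs are URNs of the form `urn:lsid:<authority>:<namespace>:<object>[:<revision>]`. A minimal sketch of validating and splitting such an identifier is given below; the `Lsid` type, the `parse_lsid` helper, and the example identifier are hypothetical, and no name server is contacted.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(value):
    """Split an LSID string into its parts, raising ValueError if it is malformed."""
    parts = value.strip().split(":")
    if len(parts) not in (5, 6):
        raise ValueError("expected 5 or 6 colon-separated parts: %r" % value)
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError("not an LSID URN: %r" % value)
    if not all(parts[2:5]):
        raise ValueError("authority, namespace and object must be non-empty: %r" % value)
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return Lsid(parts[2], parts[3], parts[4], revision)


# Hypothetical identifier, not issued by any real authority.
print(parse_lsid("urn:lsid:example.org:taxname:12345"))
```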

12 Outlook
Tighter integration of GoldenGATE editor with server
–Load plug-ins from server
   → Easier update distribution
–Upload documents directly after OCR
–Host documents at server throughout markup
   → Users can share markup work (experts do LSIDs, etc.)
   → Treatments available in search portal as soon as marked up
–Auto-distribute documents to different storage locations
–Run automated markup generation on server side
–Get corrections from community via online feedback forms
Other extensions of GoldenGATE editor
–Simplified, more flexible plug-in architecture
–Extensible user interface

13 Thank you! Questions?
Donat Agosti, agosti@amnh.org
Terry Catapano, catapanoth@gmail.com
Robert “Bob” Morris, ram@cs.umb.edu
Guido Sautter, sautter@ipd.uka.de
Universität Karlsruhe (TH), Research University – founded 1825
PLAZI homepage: http://plazi.org
PLAZI search portal: http://plazi.org:8080/GgSRS
GoldenGATE homepage: http://idaho.ipd.uka.de/GoldenGATE

14 The GoldenGATE Editor V3
–Plug-in GUI extensions (hideable)
–Simplified, more flexible architecture
–Pre-OCR page images for correcting OCR errors
–Document navigator for finding stuff more quickly

