The PLAZI Markup System
Donat Agosti, Terry Catapano, Robert "Bob" Morris, Guido Sautter
Universität Karlsruhe (TH), Research University – founded 1825



The PLAZI Markup System
[Architecture diagram]
Components:
– GoldenGATE Document Editor: document markup, external referencing
– PLAZI Server: XML & PDF storage, treatment server
– PLAZI Search Portal: search portal, TAPIR provider, RSS feed
– External Data Sources: taxonomic data sources & web services
Data exchanged between components (arrow labels):
– Marked-up documents
– Queries
– Treatments, detail data, PDF document handles
– Links, materials citations
– Taxon LSIDs, geodata
– New taxon names

The PLAZI Server
GoldenGATE Search & Retrieval Server (SRS)
– Extracts individual treatments from XML documents
– Stores and indexes treatments
– Based on independent, pluggable Indexers
  Taxonomic names
  Materials citations
  Document meta data
  Full text
– Serves treatments or indexed details
DSpace
– Stores PDF and XML documents
– Issues Handles for documents
[Diagram labels: Web Service; SRS; Indexers (TN, MC, MD, FT); PostgreSQL; File System; Document Management Data; Index Data; XML Documents]
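The idea of independent, pluggable Indexers can be illustrated with a minimal Java sketch. It is not the actual SRS code: the names `TreatmentIndexer`, `TaxonNameIndexer`, `FullTextIndexer` and `IndexerRegistry`, and the `<taxonomicName>` element, are assumptions made for illustration only. Each indexer sees the same treatment XML and returns its own index entries, so indexers can be added or removed without touching the others.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical plug-in contract; NOT the real GoldenGATE SRS API.
interface TreatmentIndexer {
    String name();                            // short key such as "TN" or "FT"
    List<String> index(String treatmentXml);  // index entries for one treatment
}

// Illustrative indexer: collects the text of <taxonomicName> elements
// (the element name is an assumption).
class TaxonNameIndexer implements TreatmentIndexer {
    private static final Pattern NAME =
            Pattern.compile("<taxonomicName>([^<]+)</taxonomicName>");

    public String name() { return "TN"; }

    public List<String> index(String treatmentXml) {
        List<String> names = new ArrayList<>();
        Matcher m = NAME.matcher(treatmentXml);
        while (m.find()) {
            names.add(m.group(1).trim());
        }
        return names;
    }
}

// Illustrative full-text indexer: strips tags and lower-cases the words.
class FullTextIndexer implements TreatmentIndexer {
    public String name() { return "FT"; }

    public List<String> index(String treatmentXml) {
        List<String> words = new ArrayList<>();
        String text = treatmentXml.replaceAll("<[^>]+>", " ").trim();
        if (text.isEmpty()) {
            return words;
        }
        for (String word : text.split("\\s+")) {
            words.add(word.toLowerCase());
        }
        return words;
    }
}

// Runs every registered indexer over a treatment, keyed by indexer name.
class IndexerRegistry {
    private final List<TreatmentIndexer> indexers = new ArrayList<>();

    void register(TreatmentIndexer indexer) { indexers.add(indexer); }

    Map<String, List<String>> indexTreatment(String treatmentXml) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (TreatmentIndexer indexer : indexers) {
            result.put(indexer.name(), indexer.index(treatmentXml));
        }
        return result;
    }
}
```

Registering a `TaxonNameIndexer` and a `FullTextIndexer` and calling `indexTreatment` on a treatment yields one entry list per indexer. A real server would parse the XML properly and persist the entries instead of returning them in memory.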


The PLAZI Search Portal
Series of Java Servlets running in Apache Tomcat
Front-end for SRS Web Service
Linker plug-ins create hyperlinks to other web sites
HTML based search portal for humans
– Search treatments & index data
– Links submitting new search queries
– Links to external data sources (e.g. HNS, GoogleMaps)
– Links to PDF document & XML versions of treatments
XML document access in various XML schemas
TAPIR provider
– Taxonomic names
– Materials citations
RSS feed for new treatments
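A Linker plug-in of the kind listed above could look roughly like the following Java sketch. The slides do not show the real plug-in interface, so `Linker`, `NameServerLinker` and `LinkRenderer` are hypothetical names, and the `example.org` URL is a placeholder rather than a real service endpoint. The sketch only shows the general pattern: a plug-in turns a taxon name into a URL, and the portal renders it as an HTML hyperlink.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical plug-in contract; the portal's real linker interface is not shown in the slides.
interface Linker {
    String label();                    // text shown to the user
    String urlFor(String taxonName);   // target URL for a taxon name
}

// Illustrative linker pointing at a placeholder name-server URL.
class NameServerLinker implements Linker {
    // Placeholder base URL, NOT a real endpoint.
    private static final String BASE = "https://example.org/nameserver?name=";

    public String label() { return "Name server"; }

    public String urlFor(String taxonName) {
        // URL-encode the name so spaces and special characters are safe in a query string.
        return BASE + URLEncoder.encode(taxonName, StandardCharsets.UTF_8);
    }
}

// Renders a linker's result as an HTML anchor, escaping markup characters.
class LinkRenderer {
    static String escapeHtml(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    static String render(Linker linker, String taxonName) {
        return "<a href=\"" + escapeHtml(linker.urlFor(taxonName)) + "\">"
                + escapeHtml(linker.label() + ": " + taxonName) + "</a>";
    }
}
```

Calling `LinkRenderer.render(new NameServerLinker(), "Probolomyrmex tani")` produces a single `<a>` element. Further linkers, for example one that maps coordinates to a map service, would implement the same interface.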

The PLAZI Search Portal
[Screenshot of the search portal; example taxon: Probolomyrmex tani]


The GoldenGATE Editor
Java-based editor for semi-automated document markup
Extensible through plug-in mechanism
Independent of specific XML schema
Element-level XML editing (XML syntax is generated)
Flexible display for clear view on all detail levels
Existing plug-ins provide broad spectrum of functionality:
– NLP-based markup generation
  Regular expressions, gazetteers, GATE JAPE
  Homegrown and third-party NLP components
  Import of data from external sources (e.g. LSIDs)
– Specialized document views for correcting NLP results
– Markup transformation & filtering
– IO components for different data formats & storage locations (e.g. for uploading XML documents to PLAZI server)
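To make "regular expressions, gazetteers" concrete, here is a small Java sketch of how the two can be combined for markup generation. It is not GoldenGATE's implementation: the class `TaxonNameTagger`, the `<taxonomicName>` element and the very simple binomial pattern are assumptions for illustration. The regular expression proposes candidates (capitalized word followed by a lower-case word), and the gazetteer of known genus names decides which candidates are actually marked up.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative tagger; class name, element name and pattern are assumptions.
class TaxonNameTagger {
    // Candidate binomial: capitalized word followed by a lower-case word.
    private static final Pattern BINOMIAL =
            Pattern.compile("\\b([A-Z][a-z]+) ([a-z]+)\\b");

    private final Set<String> genera;  // gazetteer of known genus names

    TaxonNameTagger(Set<String> genera) {
        this.genera = genera;
    }

    // Wraps every candidate whose first word is in the gazetteer in a
    // <taxonomicName> element; all other text is copied unchanged.
    String tag(String text) {
        Matcher m = BINOMIAL.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String replacement = genera.contains(m.group(1))
                    ? "<taxonomicName>" + m.group() + "</taxonomicName>"
                    : m.group();
            // quoteReplacement keeps '$' and '\' in the text from being
            // interpreted as group references.
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With a gazetteer containing only `Probolomyrmex`, a phrase such as "We examined" is matched by the pattern but left untouched, while "Probolomyrmex tani" is wrapped in markup. A user would still review the result, which is the "semi-automated" part of the workflow.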



The External Data Sources
Hymenoptera Name Server (HNS)
– Retrieve LSIDs for taxon names
– Enter new taxon names in HNS database
Further LSID sources: ZooBank, Index Fungorum
GBIF pulls materials citations via TAPIR
EOL pulls treatments via TAPIR (to start soon)
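LSIDs retrieved from a name server follow the general form `urn:lsid:<authority>:<namespace>:<object>[:<revision>]`. The Java sketch below shows one way such an identifier could be checked and split before it is attached to a taxon name. The class `Lsid` is a hypothetical helper, not part of GoldenGATE or HNS, and the identifier used in the usage note is made up.

```java
import java.util.Optional;

// Hypothetical helper for illustration; not part of GoldenGATE or HNS.
final class Lsid {
    final String authority;
    final String namespace;
    final String objectId;
    private final String revision;  // null when the LSID has no revision part

    private Lsid(String authority, String namespace, String objectId, String revision) {
        this.authority = authority;
        this.namespace = namespace;
        this.objectId = objectId;
        this.revision = revision;
    }

    Optional<String> revision() {
        return Optional.ofNullable(revision);
    }

    // Parses "urn:lsid:authority:namespace:object[:revision]".
    // Throws IllegalArgumentException for anything else.
    static Lsid parse(String value) {
        String[] parts = value.split(":", -1);  // -1 keeps empty trailing parts
        boolean wellFormed = (parts.length == 5 || parts.length == 6)
                && parts[0].equalsIgnoreCase("urn")
                && parts[1].equalsIgnoreCase("lsid");
        if (wellFormed) {
            for (int i = 2; i < parts.length; i++) {
                if (parts[i].isEmpty()) {
                    wellFormed = false;
                }
            }
        }
        if (!wellFormed) {
            throw new IllegalArgumentException("Not an LSID: " + value);
        }
        return new Lsid(parts[2], parts[3], parts[4],
                parts.length == 6 ? parts[5] : null);
    }
}
```

For example, `Lsid.parse("urn:lsid:example.org:taxname:12345")` yields the authority `example.org`, the namespace `taxname` and the object identifier `12345`, with no revision.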

Outlook
Tighter integration of GoldenGATE editor with server
– Load plug-ins from server
  → Easier update distribution
– Upload documents directly after OCR
– Host documents at server throughout markup
  → Users can share markup work (experts do LSIDs, etc.)
  → Treatments available in search portal as soon as marked up
– Auto-distribute documents to different storage locations
– Run automated markup generation on server side
– Get corrections from community via online feedback forms
Other extensions of GoldenGATE editor
– Simplified, more flexible plug-in architecture
– Extensible user interface

Thank you! Questions?
Donat Agosti, Terry Catapano, Robert "Bob" Morris, Guido Sautter
PLAZI homepage
PLAZI search portal
GoldenGATE homepage

The GoldenGATE Editor V3
Plug-in GUI extensions (hideable)
Simplified, more flexible architecture
Pre-OCR page images for correcting OCR errors
Document navigator for finding stuff more quickly