Teamware: A Collaborative, Web-based Annotation Environment
Kalina Bontcheva, Milan Agatonovic
University of Sheffield NLP
GATE Summer School, July 27-31, 2009
Hands-on Preparation
Go to the FIG’09 Wiki
Under Resources, open the Teamware lecture
Click on the link to the Teamware install
Log in using your user name (from your reg. pack): -annotator
Click on the link “Annotation Editor” to download and prepare the software for our first hands-on session
When it opens, leave it as is until we need it
Outline
Why Teamware?
What is Teamware?
Teamware for annotation
Teamware for quality assurance and curation
Teamware for defining workflows, running automatic services, and managing annotation projects
Outlook
From Annotation Tools to Collaborative Annotation Workflows
We have lots and lots of tools and algorithms for annotation; what we need is:
1. methodological instead of purely technological
2. multi-role instead of single-role
3. assistive instead of autonomous
4. service-oriented, not monolithic
5. usable by non-specialists
GATE Teamware:
Research users in several EU projects
External users at IRF and Matrixware
Interest from other commercial users as well
GATE Teamware: Annotation Workflows on the Web
GATE Teamware is:
□ Collaborative, social, Web 2.0, with behaviour mining using machine learning
□ Parallel and distributed (using web services)
□ Scalable (via service replication)
□ Workflow-based, with business process integration
Teamware Layer Cake
Executive Layer: Workflow Management; Authentication and User Management
Services Layer: GATE Document Service; GATE Annotation Services; GATE Ontology Service; GATE Machine Learning API
User Interface Layer:
  Manual Annotation User Interface: Schema Annotation UI; Ontology Annotation UI
  Data Curation User Interface: Annotation Diff UI; ANNIC UI; Document Browser
  Language Engineer User Interface: GATE Developer UI
Division of Labour: A Multi-role Methodology
(Human) Annotators: labour has to be cheap!
  Bootstrap the annotation process with JAPE rules or mixed-initiative learning
Curators (or super-annotators)
  Reconcile differences between annotators, using IAA, AnnDiff, and the curator UI
Manager
  Define annotation guidelines and schemas
  Choose relevant automatic services for pre-processing
  Toolset includes performance benchmarking, progress monitoring tools, and small linguistic customisations
  Define workflows, manage annotators, liaise with language engineers and sys admins
Sys admin
  Set up the Teamware system, users, etc.
Language engineer
  Uses GATE Developer to create bespoke services and deploy them online (a minimal GATE Embedded sketch follows below)
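The language engineer's role is the one closest to code. Purely for illustration, here is a minimal GATE Embedded sketch of the kind of pipeline such an engineer might build and test in GATE Developer before deploying it as an annotation service; the install path and the ANNIE_with_defaults.gapp location are assumptions based on a default GATE 5-era installation, and none of this is Teamware-specific.

// A minimal GATE Embedded sketch: load the stock ANNIE pipeline and run it over
// one document. Paths are assumptions based on a default GATE install; adjust
// them for your setup.
import java.io.File;

import gate.Corpus;
import gate.CorpusController;
import gate.Document;
import gate.Factory;
import gate.Gate;
import gate.util.persistence.PersistenceManager;

public class RunAnniePipeline {
    public static void main(String[] args) throws Exception {
        // Hypothetical install directory; point this at your own GATE home.
        Gate.setGateHome(new File("/path/to/gate"));
        Gate.init();

        // Load ANNIE from the saved application state shipped with GATE
        // (location assumed from a default GATE 5-era install).
        File annieGapp = new File(new File(Gate.getPluginsHome(), "ANNIE"),
                                  "ANNIE_with_defaults.gapp");
        CorpusController annie =
            (CorpusController) PersistenceManager.loadObjectFromFile(annieGapp);

        // Build a one-document corpus and run the pipeline over it.
        Corpus corpus = Factory.newCorpus("demo corpus");
        Document doc = Factory.newDocument("John Smith works for Acme Corp in Sheffield.");
        corpus.add(doc);
        annie.setCorpus(corpus);
        annie.execute();

        // The named-entity annotations now live in the document's default annotation set.
        System.out.println(doc.getAnnotations().get("Person"));
    }
}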
Teamware: Manual Annotation Tool
Manual Annotation Process
The annotator logs into Teamware
Clicks on “Open Annotation Editor”
Requests an annotation task (first button)
Annotates the assigned document
When done, presses the “Finish task” button
To save work and return to the task later, press the “Save” button, then close the UI. The next time a task is requested, the same document will be assigned, so it can be finished
Depending on the project setup, it may be possible to reject a document and ask for another one to annotate (“Reject” button)
Hands-on
Open a web browser and go to Teamware
Log in using your user name (from your reg. pack): -annotator
Open the annotation UI
Try requesting tasks, editing annotations, saving your work, asking for another task, etc.
This is what Teamware looks like to a human annotator
Teamware for Curators
Still being developed, so the UI is in transition
Identify whether there are differences between annotators, using IAA
Inspect the differences in detail, using AnnDiff
Edit and reconcile the differences if required
A new curator UI in Teamware is under development; the functionality is currently available in GATE Developer
IAA: Do my annotators agree?
IAA: Results
IAA: Recap
IAA for IE tasks, such as named entity recognition, should be measured using F-measure across all annotators
For classification tasks, use Kappa to measure IAA
For details, see the evaluation lecture and the GATE user guide (a small worked sketch of both measures follows below)
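To make the two measures concrete, here is a small, self-contained Java sketch with made-up toy data. It is not Teamware's or GATE's own IAA code (in practice you would use the built-in IAA and AnnDiff tools); it just shows a strict pairwise F1 over entity spans and Cohen's kappa over document labels.

// Illustrative only: plain-Java versions of the two agreement measures above.
import java.util.*;

public class IaaSketch {

    // Strict F1 between one annotator's spans (key) and another's (response):
    // a response counts as correct only if start, end and type all match.
    static double f1(Set<String> key, Set<String> response) {
        Set<String> correct = new HashSet<>(key);
        correct.retainAll(response);
        double p = response.isEmpty() ? 0 : (double) correct.size() / response.size();
        double r = key.isEmpty() ? 0 : (double) correct.size() / key.size();
        return (p + r == 0) ? 0 : 2 * p * r / (p + r);
    }

    // Cohen's kappa for a classification task with two annotators.
    static double kappa(String[] a, String[] b) {
        Map<String, Integer> countA = new HashMap<>(), countB = new HashMap<>();
        int agree = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i].equals(b[i])) agree++;
            countA.merge(a[i], 1, Integer::sum);
            countB.merge(b[i], 1, Integer::sum);
        }
        double po = (double) agree / a.length;   // observed agreement
        double pe = 0;                           // agreement expected by chance
        for (String label : countA.keySet())
            pe += (countA.get(label) / (double) a.length)
                * (countB.getOrDefault(label, 0) / (double) a.length);
        return (po - pe) / (1 - pe);
    }

    public static void main(String[] args) {
        // Spans encoded as "start:end:Type" for the strict comparison (toy data).
        Set<String> annotator1 = new HashSet<>(Arrays.asList("0:10:Person", "15:25:Organization"));
        Set<String> annotator2 = new HashSet<>(Arrays.asList("0:10:Person", "30:40:Location"));
        System.out.println("Pairwise F1 = " + f1(annotator1, annotator2));   // 0.5

        String[] labelsA = {"spam", "ham", "ham", "spam", "ham"};
        String[] labelsB = {"spam", "ham", "spam", "spam", "ham"};
        System.out.println("Cohen's kappa = " + kappa(labelsA, labelsB));    // ~0.615
    }
}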
AnnDiff: Finding the differences
Where are these in Teamware?
Only visible to curators and their managers
Resources / Documents menu
Select the corpus being worked on
Iterate through each document
Run IAA and AnnDiff as required
Try it for yourself:
Log in as -curator
Corpus: annie-demo
The first or second document
Forthcoming Curator Facilities
A corpus-level view of IAA
An extended AnnDiff allowing easy reconciliation of the differences between two annotators
Currently prototyped in GATE Developer; will be made available in Teamware soon
New AnnDiff in Developer
Beyond Pair-wise Reconciliation
AnnDiff only handles two sets of annotations at a time; we often need more!
Towards an in-place, content-based reconciliation interface
Current UI Prototype
Teamware for Managers
Defining workflows
Running annotation projects
Tracking progress
Teamware Workflows
The whole process is controlled by a workflow manager
A workflow may be simple:
  Give the document to a human annotator
  An information curator checks a sample of documents for quality control
or more complex (see the sketch after this list):
  Invoke one or more web services to produce automatic annotations
  Pass each document to two annotators
  An information curator quickly checks the level of agreement between the annotators and reconciles any differences
  The annotated documents are used to train an ML model
  When the model is good enough, it starts making suggestions to the annotators
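To make the control flow of the more complex variant concrete, here is an entirely hypothetical Java sketch. None of these interfaces, method names, or the 0.8 agreement threshold come from Teamware; real Teamware workflows are defined through the web-based wizard shown on the following slides, not written as code.

// Hypothetical pseudo-workflow, only to illustrate the steps listed above.
import java.util.List;

interface AnnotationService { void preAnnotate(String docId); }      // e.g. ANNIE exposed as a web service
interface Annotator        { void annotate(String docId); }
interface Curator          { double checkAgreement(String docId); void reconcile(String docId); }
interface Learner          { void train(List<String> docIds); boolean goodEnough(); void suggestFor(String docId); }

class ComplexWorkflowSketch {
    static void run(List<String> corpus, AnnotationService svc,
                    Annotator a1, Annotator a2, Curator curator, Learner ml) {
        for (String doc : corpus) {
            svc.preAnnotate(doc);                       // 1. automatic pre-annotation
            if (ml.goodEnough()) ml.suggestFor(doc);    // 5. suggestions once the model is trained
            a1.annotate(doc);                           // 2. two annotators per document
            a2.annotate(doc);
            if (curator.checkAgreement(doc) < 0.8)      // 3. curator checks agreement (threshold made up)
                curator.reconcile(doc);
            ml.train(corpus.subList(0, corpus.indexOf(doc) + 1)); // 4. retrain on what is annotated so far
        }
    }
}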
Workflow Templates
Defining New Workflows
Select Projects / WF Templates
This opens the workflow wizard
Choose which services you want to run
Choose whether you want manual annotation, how many annotators per document, etc.
Setting up a Manual Annotation Project
Upload the schemas (an example schema follows below)
Upload the documents
Define the workflow template
Run the project, choosing the corpus, the annotators, the curators, etc.
DEMO!
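For reference, the uploaded schemas are ordinary GATE annotation schemas. The example below is adapted from the schema format documented in the GATE user guide (consult the guide for the authoritative syntax); the Person type and its gender feature are just an illustration, not something Teamware requires.

<?xml version="1.0"?>
<!-- Example annotation schema in the GATE schema format: a Person annotation
     type with an optional gender feature restricted to two values. -->
<schema xmlns="http://www.w3.org/2000/10/XMLSchema">
  <element name="Person">
    <complexType>
      <attribute name="gender" use="optional">
        <simpleType>
          <restriction base="string">
            <enumeration value="male"/>
            <enumeration value="female"/>
          </restriction>
        </simpleType>
      </attribute>
    </complexType>
  </element>
</schema>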
Setting up an Automatic Annotation Project
Configure the web service(s)
Define the workflow template
Run the project, choosing the corpus
DEMO!
Semi-automatic Projects
Just combine the two sets of steps
Teamware: Monitoring Project Progress
Outlook
Teamware is still under active development
Many features are subject to change
If you would like further information, or would like to try Teamware with your data for a particular project, please contact Hamish and Kalina