Advanced JAPE Mark A. Greenwood

University of Sheffield NLP
Recap
- Installed and run GATE
- Understand the idea of
  - LR – Language Resources
  - PR – Processing Resources
- ANNIE
  - Understand the goals of information extraction
  - Loaded ANNIE into GATE
  - Constructed one or more gazetteer lists
- Created JAPE rules with simple RHS

Overview
- Simple RHS Limitations
- The RHS API
- Accessing Annotations and Features
- Adding New Annotations
- Hands-On

Simple RHS Limitations
- The simple RHS of a JAPE rule can only add simple annotations and features
  - Feature values are hard coded or can be copied from annotations matched by the LHS
- You may need more complex processing
  - Removing temporary annotations
  - Building complex features
  - ...
- Fortunately the RHS of a rule can consist of arbitrary Java code, so the possibilities are endless!

The RHS API
Java code provided as a RHS is used as the body of this method:

    public void doit(Document doc, Map bindings,
                     AnnotationSet annotations,
                     AnnotationSet inputAS, AnnotationSet outputAS,
                     Ontology ontology) throws JapeException

This provides easy access to the document, rule bindings and annotations.
Do not use the annotations parameter: it is deprecated. Work with inputAS and outputAS instead.

Accessing Annotations and Features
- Each labelled section of the LHS results in an AnnotationSet
- These annotation sets can be retrieved from the bindings map

    AnnotationSet set = (AnnotationSet)bindings.get("labelname");

Accessing Annotations and Features
- When writing complex JAPE you will often need to access annotation features
- All features of an annotation are stored in a map

    FeatureMap map = annotation.getFeatures();

- Each feature is accessed by name

    Object obj = map.get("featurename");
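Since a feature lookup hands back a plain Object, and nothing at all when the feature is missing, RHS code usually needs a null check before converting the value. The following is a minimal sketch of that pattern. It uses an ordinary java.util.HashMap as a stand-in for a GATE FeatureMap so that it runs without GATE on the classpath; the class name FeatureLookupSketch and the helper featureAsString are hypothetical and not part of GATE.

```java
import java.util.HashMap;
import java.util.Map;

public class FeatureLookupSketch {

    // Hypothetical helper: returns the named feature as a String,
    // or the supplied fallback when the feature is not present.
    public static String featureAsString(Map<Object, Object> features,
                                         String name, String fallback) {
        Object value = features.get(name);   // null when the feature is absent
        return value == null ? fallback : value.toString();
    }

    public static void main(String[] args) {
        // Stand-in for annotation.getFeatures(); a real FeatureMap is used the same way.
        Map<Object, Object> features = new HashMap<>();
        features.put("majorType", "change");
        features.put("minorType", "Changes-up");

        System.out.println(featureAsString(features, "minorType", "unknown")); // prints Changes-up
        System.out.println(featureAsString(features, "missing", "unknown"));   // prints unknown
    }
}
```

Inside a real RHS the same check could be applied directly to the map returned by getFeatures().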

Adding New Annotations
- New annotations should always be created in outputAS
- To create an annotation you need
  - The annotation name (its type)
  - The start and end offsets
  - A FeatureMap instance (can be empty)

    outputAS.add(start, end, label, features);

Shorthand Notation for Java RHS
Where a Java block refers to a single left-hand-side binding, JAPE provides a shorthand notation:

    Rule: RemoveDoneFlag
    (
      {Instance.flag == "done"}
    ):inst
    -->
    :inst{
      Annotation theInstance = (Annotation)instAnnots.iterator().next();
      theInstance.getFeatures().remove("flag");
    }

Shorthand Notation for Java RHS
- A label :<label> on a Java block creates a local variable <label>Annots within the Java block, which is the AnnotationSet bound to that label (for example, :inst gives instAnnots)
- The Java code in the block is only executed if there is at least one annotation bound to the label
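The "only executed if there is at least one annotation bound" behaviour amounts to a null-and-empty check on the bound set before the block runs. The sketch below illustrates that check in plain Java. It is an assumption-laden illustration rather than the code JAPE generates: the bindings map holds lists of strings in place of GATE AnnotationSet objects, and the class LabelGuardSketch and method shouldRun are hypothetical names.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LabelGuardSketch {

    // Hypothetical guard: true only when the label is bound to a non-empty set.
    public static boolean shouldRun(Map<String, ? extends Collection<?>> bindings,
                                    String label) {
        Collection<?> bound = bindings.get(label);
        return bound != null && !bound.isEmpty();
    }

    public static void main(String[] args) {
        // Stand-in for the rule bindings; real bindings map labels to AnnotationSets.
        Map<String, List<String>> bindings = new HashMap<>();
        bindings.put("inst", Collections.singletonList("Instance#1"));
        bindings.put("empty", Collections.<String>emptyList());

        System.out.println(shouldRun(bindings, "inst"));    // prints true
        System.out.println(shouldRun(bindings, "empty"));   // prints false
        System.out.println(shouldRun(bindings, "unbound")); // prints false
    }
}
```

A full Java RHS that reads from the bindings map directly gets no such guard for free, so the same check is worth writing by hand when a label sits on an optional part of the pattern.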

Hands On: Extending the IE Example
- In the previous JAPE session you wrote a rule to annotate phrases such as
  - Whitbread shares closed up 2p at 645p.
- Annotating the phrase is useful, but it contains more information that would be worth extracting as features
  - Starting price
  - Change in price
  - Closing price

Hands On: Extending the IE Example
You will need to
- Extract the closing price and the change
  - Assume they are always in pence, so you can get the value by removing the trailing 'p'
- Get the minorType of the Lookup
- Calculate the starting price
- Create a new annotation with these values as features
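The arithmetic behind these steps can be tried out in plain Java before it is embedded in a JAPE rule. The sketch below strips the trailing 'p' from a price string and derives the starting price from the closing price, the change and the direction. The class SharePriceSketch and its two methods are hypothetical helper names used for illustration; they are not part of GATE.

```java
public class SharePriceSketch {

    // Parses a price in pence such as "645p" by dropping the trailing 'p'.
    public static int parsePence(String text) {
        String trimmed = text.trim();
        if (!trimmed.endsWith("p")) {
            throw new IllegalArgumentException("Expected a price in pence: " + text);
        }
        return Integer.parseInt(trimmed.substring(0, trimmed.length() - 1));
    }

    // A share that rose started below its closing price; one that fell started above it.
    public static int startingPrice(int closing, int delta, boolean rise) {
        return rise ? closing - delta : closing + delta;
    }

    public static void main(String[] args) {
        // "Whitbread shares closed up 2p at 645p."
        int delta = parsePence("2p");
        int closing = parsePence("645p");
        System.out.println(startingPrice(closing, delta, true) + "p"); // prints 643p
    }
}
```

In the rule itself the price strings come from the document content under the matched annotations, and the direction comes from the minorType feature of the Lookup.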

Your Turn!
Feel free to refer to the User Guide and to ask for help.

Hands On: Extending the IE Example

    Phase: Shares
    Input: Token Organization Lookup Money
    Options: control = appelt

    Rule: ShareChange
    (
      {Organization}
      ({Token})[0,3]
      ({Lookup.majorType == "change"}):lookup
      ({Token})[0,3]
      ({Money}):delta
      {Token.string == "at"}
      ({Money}):closing
    ):change
    -->
    {
      try {
        // The whole matched phrase, plus the two Money annotations inside it
        AnnotationSet change = (AnnotationSet)bindings.get("change");
        Annotation delta = ((AnnotationSet)bindings.get("delta")).iterator().next();
        Annotation closing = ((AnnotationSet)bindings.get("closing")).iterator().next();

        // The Lookup's minorType says whether the price went up or down;
        // the constant comes first so a missing feature cannot cause a NullPointerException
        boolean rise = "Changes-up".equals(
          ((AnnotationSet)bindings.get("lookup")).iterator().next()
            .getFeatures().get("minorType"));

        // Read each price from the document text, stopping one character
        // short of the end offset to drop the trailing 'p'
        int deltaValue = Integer.parseInt(doc.getContent().getContent(
          delta.getStartNode().getOffset(),
          delta.getEndNode().getOffset()-1).toString());
        int closingValue = Integer.parseInt(doc.getContent().getContent(
          closing.getStartNode().getOffset(),
          closing.getEndNode().getOffset()-1).toString());

        // A rise means the share started lower than it closed
        int startValue = (rise ? closingValue - deltaValue : closingValue + deltaValue);

        FeatureMap features = Factory.newFeatureMap();
        features.put("rule", "ShareChange");
        features.put("opening", startValue+"p");
        features.put("change", deltaValue+"p");
        features.put("closing", closingValue+"p");
        features.put("direction", (rise ? "up" : "down"));

        // Create the new annotation over the whole matched phrase
        outputAS.add(change.firstNode(), change.lastNode(), "ShareChange", features);
      }
      catch (Exception e) {
        // ignore this for now
      }
    }