©2012 Paula Matuszek GATE and ANNIE Information taken primarily from the GATE user manual, gate.ac.uk/sale/tao, and GATE training materials,


1 ©2012 Paula Matuszek
GATE and ANNIE
Information taken primarily from the GATE user manual, gate.ac.uk/sale/tao, and GATE training materials, http://gate.ac.uk/wiki/training-materials-2011.html
CSC 9010: Text Mining Applications, Fall 2012
Lab: Modifying ANNIE
Dr. Paula Matuszek
Paula.Matuszek@gmail.com

2 ANNIE Applications
• Most of the ANNIE components are domain-independent and can be used in any application.
• Typically we need to modify two things for a new application:
  – gazetteers
  – JAPE transducers

3 Gazetteers
• These are pretty simple to modify:
  – Modify within GATE using the gazetteer editor
  – Modify the files directly
  – If you add a gazetteer list, be sure to modify lists.def as well.
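As a minimal sketch of the last point, assume a new list file called philly_universities.lst (a hypothetical name) placed next to ANNIE's other gazetteer lists. A list file holds one entry per line:

```
Villanova University
University of Pennsylvania
UPenn
```

The list is then registered by adding one line to lists.def in the form file:majorType:minorType. The organization and university values here are assumptions chosen for illustration:

```
philly_universities.lst:organization:university
```

With this in place, each match in the text gets a Lookup annotation carrying those majorType and minorType features, which JAPE rules can then test.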

4 JAPE Rules
• Much more complex. JAPE is a full-fledged rule-based system.
• We will spend the lab working through a JAPE tutorial.
• Go to http://gate.ac.uk/sale/talks/gate-course-may11/track-1/ and download
  – the presentation from module 3: module-3-jape.{pdf|ppt}
  – the data files from modules 2 and 3: module-2-hands-on.zip and jape-hands-on.zip
• As I get to the exercises, try them.

5 JAPE grammars
• A semantic tagger consists of a set of rule-based JAPE grammars run sequentially
• JAPE is a pattern-matching language
• The LHS of each rule contains patterns to be matched
• The RHS contains details of annotations (and optionally features) to be created
• More complex rules can also be created
Source (slides 5-8): http://gate.ac.uk/sale/talks/annie-tutorial.ppt

6 LHS of the rule
• LHS is expressed in terms of existing annotations, and optionally features and their values
• Any annotation to be used must be included in the input header
• Any annotation not included in the input header will be ignored (e.g. whitespace)
• Each annotation is enclosed in curly braces
• Each pattern to be matched is enclosed in round brackets and has a label attached
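The points above can be sketched as a small phase file. This is an illustrative example, not taken from the tutorial: the phase name, rule name, label and the University annotation type are all hypothetical. It assumes the ANNIE tokeniser has already run, so Token annotations with string and orth features exist.

```jape
// Hypothetical phase: names below are illustrative only.
Phase: UniversityName
// Input header: only Token annotations are visible to the rules here,
// so SpaceToken (whitespace) between the words is ignored.
Input: Token
Options: control = appelt

Rule: UniversityOf
(
  // Each annotation constraint sits in curly braces.
  {Token.string == "University"}
  {Token.string == "of"}
  {Token.orth == upperInitial}
// The whole pattern is in round brackets and carries the label "univ".
):univ
-->
:univ.University = {rule = "UniversityOf"}
```

Because SpaceToken is left out of the Input line, the three Token constraints match consecutive words even though whitespace sits between them in the text.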

7 RHS of the rule
• LHS and RHS are separated by -->
• Label matches that on the LHS
• Annotation to be created follows the label

  ({Annotation1}):match
  -->
  :match.NE = {feature1 = value1, feature2 = value2}
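A slightly fuller sketch of a right-hand side follows, assuming a gazetteer list whose Lookup annotations carry minorType "university" (an assumption, as are the phase, rule and annotation names). It also copies a feature value from the matched Lookup onto the new annotation.

```jape
// Hypothetical phase: names below are illustrative only.
Phase: UniversityLookup
Input: Lookup
Options: control = appelt

Rule: UniversityFromGazetteer
(
  {Lookup.minorType == "university"}
):match
-->
// The label "match" ties the new annotation to the span matched on the LHS.
// "rule" is a literal feature value; "kind" is copied from the Lookup.
:match.University = {rule = "UniversityFromGazetteer", kind = :match.Lookup.minorType}
```

Recording the rule name as a feature is a common convention that makes it easier to see which rule created an annotation when debugging.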

8 Using phases
• Grammars usually consist of several phases, run sequentially
• Only one rule within a single phase can fire
• Temporary annotations may be created in early phases and used as input for later phases
• Annotations from earlier phases may need to be combined or modified
• A definition phase (conventionally called main.jape) lists the phases to be used, in order
• Only the definition phase needs to be loaded
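A definition phase can be sketched as follows. The grammar name and the two phase names are hypothetical; each listed name is assumed to refer to a .jape file of the same name in the same directory as main.jape.

```jape
// main.jape: hypothetical definition phase listing the phases in run order.
MultiPhase: UniversityGrammar
Phases:
university_name
university_lookup
```

Loading this one file into a JAPE transducer pulls in the listed phases in that order, so a later phase can use annotations created by an earlier one.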

9 On to the JAPE tutorial

10 Assignment 7
• Assignment 7 will be to modify the JAPE tutorial files to extract a bit of information about universities in the Philadelphia area. You should be able to add the university annotation to occurrences of:
  – Villanova University
  – Villanova (in the right context!)
  – 'Nova
  – University of Pennsylvania
  – Penn
  – UPenn

