
Fex Feature Extractor - v2

Topics
– Vocabulary
– Syntax of the scripting language
  – Feature functions
  – Operators
– Examples
  – POS tagging
– Input Formats

Vocabulary
– example – a list of active records for which Fex produces a single SNoW example; usually a sentence.
– record – a single position in an example (sentence). Contains a list of fields, each of which holds a different piece of information, e.g. NLP: word, tag; vision: color, etc.
– raw input to Fex – a list of valid examples (raw sentences, tagged corpora, etc.).
– Fex's output – lexical features are written to the lexicon file; their corresponding numeric IDs are written to the example file.
– feature function – a relation among one or more records.
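To make these terms concrete, here is a minimal Python sketch (an illustration only, not part of Fex) of a record, an example, and a unary feature function; the field names word and tag are assumptions:

    # Hypothetical illustration of the Fex vocabulary: a record is one position
    # in a sentence, an example is a list of records, and a feature function
    # maps records to string features that are later numbered in the lexicon.

    def make_record(word, tag):
        # A record holds one field per kind of information (here: word and tag).
        return {"word": word, "tag": tag}

    # An example (observation): one record per token of the sentence.
    example = [make_record(w, t) for t, w in
               [("DET", "The"), ("NN", "dog"), ("V", "is"), ("JJ", "mad")]]

    def word_feature(record):
        # A unary feature function over a single record, analogous to Fex's "w".
        return "w[%s]" % record["word"]

    print([word_feature(r) for r in example])
    # ['w[The]', 'w[dog]', 'w[is]', 'w[mad]']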

Example: Feature Functions

Script Syntax
A Fex script file contains a list of definitions, each of which will rewrite the given observation into a set of active features.
Definition format (terms in parentheses are optional):
target (inc) (loc): FeatureFunc ([left, right])
– target – target index or word. To treat each record in the observation as a target, use -1; this is a macro for "all words".
– inc – include the target word instead of a placeholder (*) in some features.
– loc – generate features with location relative to the target.

– FeatureFunc – a feature function defined in terms of certain unary and n-ary relations, and operators.
– left – left offset of the scope for generating features. Negative values are to the left of the target, positive to the right.
– right – right offset of the scope.
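As a rough illustration of this definition format (an assumed grammar, not Fex's actual parser), a definition line could be split into its parts like this:

    import re

    # Hypothetical parser for one script definition of the form
    #   target (inc) (loc): FeatureFunc ([left, right])
    DEF_RE = re.compile(
        r"^\s*(?P<target>\S+)"          # target index or word (-1 = all records)
        r"(?P<flags>(\s+(inc|loc))*)"   # optional inc / loc flags
        r"\s*:\s*(?P<func>[^\[]+?)"     # feature function expression
        r"\s*(\[(?P<left>-?\d+)\s*,\s*(?P<right>-?\d+)\])?\s*$"  # optional scope
    )

    def parse_definition(line):
        m = DEF_RE.match(line)
        if not m:
            raise ValueError("not a valid definition: %r" % line)
        flags = m.group("flags").split()
        return {
            "target": m.group("target"),
            "inc": "inc" in flags,
            "loc": "loc" in flags,
            "func": m.group("func").strip(),
            "scope": (int(m.group("left")), int(m.group("right")))
                     if m.group("left") is not None else None,
        }

    print(parse_definition("-1 loc: t [-2,2]"))
    # {'target': '-1', 'inc': False, 'loc': True, 'func': 't', 'scope': (-2, 2)}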

Basic Feature Functions (type – Fex notation – interpretation – output to lexicon):
– Label – lab – produces a label feature – lab[target word]
– Label – lab(t) – produces a label feature from the tag – lab[target tag]
– Word – w – active if the word(s) in the current record are within scope – w[current word]
– Tag (POS) – t – active if the tag(s) in the current record are within scope – t[current tag]
– Vowel – v – active if the word(s) in the current record begin with a vowel – v[initial vowel]
– Prefix – pre – active if the word(s) in the current record begin with a prefix from a given list – pre[active prefix]

– Suffix – suf – active if the word(s) in the current record end with a suffix from a given list – suf[active suffix]
– Baseline – base – active if a baseline tag from a prepared list exists for the word(s) in the current record – base[baseline tag]
– Lemma – lem – active if a lemma from the WordNet database exists for the word(s) in the current record – lem[active lemma]
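To make a few of these functions concrete, here is a small Python sketch (an illustration, not Fex's implementation) of vowel, prefix, and suffix features; the prefix and suffix lists are made-up examples:

    # Hypothetical versions of the v, pre, and suf feature functions.
    PREFIXES = ["un", "re", "dis"]      # assumed example prefix list
    SUFFIXES = ["ing", "ed", "ly"]      # assumed example suffix list

    def vowel_feature(word):
        # Active only when the word begins with a vowel.
        if word and word[0].lower() in "aeiou":
            return "v[%s]" % word[0]
        return None

    def prefix_feature(word):
        # Active only when the word begins with a prefix from the list.
        for p in PREFIXES:
            if word.lower().startswith(p):
                return "pre[%s]" % p
        return None

    def suffix_feature(word):
        # Active only when the word ends with a suffix from the list.
        for s in SUFFIXES:
            if word.lower().endswith(s):
                return "suf[%s]" % s
        return None

    print(vowel_feature("is"), prefix_feature("undo"), suffix_feature("quickly"))
    # v[i] pre[un] suf[ly]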

Example
Sentence = "(DET The) (NN dog) (V is) (JJ mad)"

Method 1
Script Def       Output to lexicon            Output to example file
dog: w [-1,1]    10001 w[The]   10002 w[is]   10001, 10002, 10003, 10004:
dog: t [1,2]     10003 t[V]     10004 t[JJ]

Method 2
Script Def       Output to lexicon            Output to example file
-1: lab          1 lab[…]                     1, 10001, 10002, 10003, 10004:
-1: w [-1,1]     10001 w[The]   10002 w[is]
-1: t [1,2]      10003 t[V]     10004 t[JJ]
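The following Python sketch (written to my reading of method 1, not Fex itself) extracts the word features in a [-1,1] window and the tag features in a [1,2] window around the target "dog", assigning lexicon IDs from 10001 upward as features are first seen:

    sentence = [("DET", "The"), ("NN", "dog"), ("V", "is"), ("JJ", "mad")]
    target = 1                                # index of the target word "dog"
    lexicon, next_id = {}, 10001

    def feature_id(feat):
        # Assign a new numeric ID the first time a lexical feature is seen.
        global next_id
        if feat not in lexicon:
            lexicon[feat] = next_id
            next_id += 1
        return lexicon[feat]

    def window_features(kind, left, right):
        # kind "w" reads the word field, kind "t" reads the tag field.
        feats = []
        for off in range(left, right + 1):
            pos = target + off
            if off == 0 or not 0 <= pos < len(sentence):
                continue        # skip the target itself and out-of-range positions
            tag, word = sentence[pos]
            feats.append("%s[%s]" % (kind, word if kind == "w" else tag))
        return feats

    active = window_features("w", -1, 1) + window_features("t", 1, 2)
    ids = [feature_id(f) for f in active]
    print(lexicon)  # {'w[The]': 10001, 'w[is]': 10002, 't[V]': 10003, 't[JJ]': 10004}
    print("%s:" % ", ".join(str(i) for i in ids))   # 10001, 10002, 10003, 10004: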

Operators & Complex Functions
Sentence = "(DET The) (NN dog) (V is) (JJ mad)"
– (X) operator – indicates that a feature is active without any specific instantiation.
  Script Def: dog: v(X) [-1,1]    Output to Lexicon: v[]
– (x=y) operator – creates an active feature iff the active instantiation matches the given argument.
  Script Def: dog: w(x=is)    Output to Lexicon: w[is]

Operators & Complex Functions
Sentence = "(DET The) (NN dog) (V is) (JJ mad)"
– & operator – conjoins two features, producing a new feature which is active iff the record fulfills both constituent features.
  Script Def: dog: w&t [-1,-1]    Output to Lexicon: w[The]&t[DET]
– | operator – disjunction of two features: outputs a feature for each term of the disjunction that is active in the current record.
  Script Def: dog: w|t [-1,-1]    Output to Lexicon: w[The]  t[DET]

Operators & Complex Functions
Sentence = "(DET The) (NN dog) (V is) (JJ mad)"
– coloc function – consecutive feature function: takes two or more features as arguments and produces a consecutive collocation over two or more records. The order of the arguments is preserved in the active feature.
  Script Def: mad: coloc(w, t) [-3,-1]    Output to Lexicon: 10001 w[The]-t[NN]   10002 w[dog]-t[V]
– scoloc function – sparse consecutive feature function: operates like coloc, except that active collocations need not be consecutive. However, the order of the arguments is still preserved in determining whether a feature is active.
  Script Def: mad: scoloc(w,t) [-3,-1]    Output to Lexicon: 10001 w[The]-t[NN]   10002 w[dog]-t[V]   10003 w[The]-t[V]
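A small Python sketch (illustrative only, with assumed semantics) of the difference: coloc pairs feature functions over consecutive records in the window, while scoloc pairs them over any records in the window as long as their order is preserved:

    from itertools import combinations

    # Records in the [-3,-1] window before the target "mad".
    window = [("DET", "The"), ("NN", "dog"), ("V", "is")]

    def w(rec): return "w[%s]" % rec[1]
    def t(rec): return "t[%s]" % rec[0]

    def coloc(funcs, records):
        # Consecutive collocations: apply the functions to adjacent records.
        n = len(funcs)
        return ["-".join(f(r) for f, r in zip(funcs, records[i:i + n]))
                for i in range(len(records) - n + 1)]

    def scoloc(funcs, records):
        # Sparse collocations: any records in the window, order preserved.
        n = len(funcs)
        return ["-".join(f(r) for f, r in zip(funcs, chosen))
                for chosen in combinations(records, n)]

    print(coloc([w, t], window))   # ['w[The]-t[NN]', 'w[dog]-t[V]']
    print(scoloc([w, t], window))  # ['w[The]-t[NN]', 'w[The]-t[V]', 'w[dog]-t[V]']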

Example: POS tagging
Useful features for POS tagging:
– The preceding word is tagged c.
– The following word is tagged c.
– The word two before is tagged c.
– The word two after is tagged c.
– The preceding word is tagged c and the following word is tagged t.
– The preceding word is tagged c and the word two before is tagged t.
– The following word is tagged c and the word two after is tagged t.
– The current word is w.
– The most probable part of speech for the current word is c.

Given the sentence:
– (t1 The) (t2 dog) (t3 ran) (t4 very) (t5 quickly)
the following Fex script will produce the features from the last slide:
-1: lab(t)
-1 loc: t [-2,2]
-1: coloc(t,t,t) [-2,2]
-1 inc: w [0,0]
-1: base [0,0]
To do POS tagging, an example needs to be generated for each word in the observation.

For the third word, "ran", the script produces the following output:
Script                      Lexicon Output
-1: lab(t)                  1 lab[t3]
-1 loc: t [-2,2]            10001 t[t1_*]   10002 t[t2_*]   10003 t[*_t4]   10004 t[*_t5]
-1: coloc(t,t,t) [-2,2]     10005 t[t1]-t[t2]-*   10006 t[t2]-*-t[t4]   10007 *-t[t4]-t[t5]
-1 inc: w [0,0]             10008 w[ran]
-1: base [0,0]              10009 base[V]
And an example in the example file:
1, 10001, 10002, 10003, 10004, 10005, 10006, 10007, 10008, 10009:
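A rough Python sketch of how the location-marked tag features and the placeholder collocations for "ran" could be generated (this reflects my reading of the notation, not Fex's code):

    tags = ["t1", "t2", "t3", "t4", "t5"]     # tags of "The dog ran very quickly"
    target = 2                                # index of "ran"

    # t [-2,2] with loc: one feature per neighbouring tag, marked with which
    # side of the target it falls on ("tag_*" = before, "*_tag" = after).
    loc_feats = []
    for off in range(-2, 3):
        pos = target + off
        if off == 0 or not 0 <= pos < len(tags):
            continue
        loc_feats.append("t[%s_*]" % tags[pos] if off < 0 else "t[*_%s]" % tags[pos])

    # coloc(t,t,t) [-2,2]: consecutive tag trigrams over the window, with the
    # target position replaced by the placeholder "*".
    window = ["t[%s]" % tags[target + off] if off != 0 else "*"
              for off in range(-2, 3) if 0 <= target + off < len(tags)]
    trigrams = ["-".join(window[i:i + 3]) for i in range(len(window) - 2)]

    print(loc_feats)  # ['t[t1_*]', 't[t2_*]', 't[*_t4]', 't[*_t5]']
    print(trigrams)   # ['t[t1]-t[t2]-*', 't[t2]-*-t[t4]', '*-t[t4]-t[t5]']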

Input Formats
Fex can presently accept data in two formats:
– w1 w2 w3 w4 …
– (t1 w1) (t2 w2) (t3 w3) (t4 w4) …
– w1 (t2 w2) (t3 t3a; w3) (t4; w4 w4a) …

Input Formats
Fex can presently accept data in two formats:
– Old format: w1 (t2 w2) (t3 t3a; w3) (t4; w4 w4a)
– New format (ILK), one record per line:
  I-NP NNP Pierre NOFUNC Vinken
  I-NP NNP Vinken NP-SBJ join
  O COMMA COMMA NOFUNC Vinken
  I-NP CD 61 NOFUNC years
  I-NP NNS years NP old
  I-ADJP JJ old ADJP Vinken
  O COMMA COMMA NOFUNC Vinken
  I-VP MD will NOFUNC join 8
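For illustration, the parenthesised old format could be read into (tag, word) records with a sketch like the following (an assumption about the format, not Fex's parser; the extra ";"-separated fields are ignored):

    import re

    # Tokens are either bare words ("w1") or "(tag word)" groups.
    TOKEN_RE = re.compile(r"\(([^()]+)\)|(\S+)")

    def parse_old_format(line):
        records = []
        for group, bare in TOKEN_RE.findall(line):
            if bare:
                records.append((None, bare))      # untagged word
            else:
                tag, word = group.split(None, 1)  # "(tag word)"
                records.append((tag, word))
        return records

    print(parse_old_format("(DET The) (NN dog) (V is) (JJ mad)"))
    # [('DET', 'The'), ('NN', 'dog'), ('V', 'is'), ('JJ', 'mad')]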

Using Fex (command line)
fex [options] script-file lexicon-file corpus-file example-file
Options:
– -t <target-file>: target file
  – do not leave any empty lines in the file!
  – each target on a separate line
– -r: test mode – does not create new features
– -h, -I – creates a histogram of active features

Using Fex (command line)
Target file = targ:
  dog
  cat
Script file = script:
  -1 : lab
  -1 : w [-1,-1]
  -1 : t [-1,-1]
Corpus file = corpus:
  (DET The) (NN dog) (V is) (JJ mad)
Lexicon file = lexicon
Example file = example
fex –t targ script lexicon corpus example

SNoW

Sparse Networks Of Winnows
[Architecture diagram: target nodes (e.g. "say", "join") linked to basic features, knowledge-enriched features, and complex features through a constant feature mapping and learning.]

Word representation

Restrictions on the learning approach
– Multi-class
– Variable number of features
  – per class
  – per example
– Efficient learning
– Efficient evaluation

SNoW
– A network of threshold gates.
– Target nodes represent class labels.
– Input nodes (features) and links are allocated in a data-driven way (on the order of 10^5 input features for many target nodes).
– Each sub-network (target node) is learned autonomously as a function of the features.
– A presented example is positive for one network and negative for the others (depending on the algorithm).
– Allocation of nodes (features) and links is data-driven: a link between feature f_i and target t_j is created only when f_i was active together with target t_j.
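For intuition, here is a condensed Python sketch of the data-driven allocation and per-target learning just described. It uses a Winnow-style multiplicative update with assumed parameters (promotion 1.5, demotion 0.5, threshold 1.0); it is an illustration of the idea, not the SNoW implementation:

    class TargetNode:
        # One autonomous sub-network: sparse weights over the features that
        # have been active together with this target.
        def __init__(self, alpha=1.5, beta=0.5, theta=1.0):
            self.w = {}
            self.alpha, self.beta, self.theta = alpha, beta, theta

        def activation(self, feats):
            return sum(self.w.get(f, 0.0) for f in feats)

        def update(self, feats, positive):
            if positive:
                for f in feats:
                    # Data-driven allocation: a link is created the first time
                    # a feature is active together with this target.
                    self.w.setdefault(f, 1.0)
            fired = self.activation(feats) >= self.theta
            if positive and not fired:
                for f in feats:                   # promote on a missed positive
                    if f in self.w:
                        self.w[f] *= self.alpha
            elif not positive and fired:
                for f in feats:                   # demote on a false positive
                    if f in self.w:
                        self.w[f] *= self.beta

    nodes = {}                                    # one target node per class label

    def train_example(label, feats):
        nodes.setdefault(label, TargetNode())
        # The example is positive for its own label's network, negative for the others.
        for t, node in nodes.items():
            node.update(feats, positive=(t == label))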

Word prediction using SNoW
– Target nodes: each word in the set of candidate words is a target node.
– Input nodes: an input node for feature f_i is allocated only if f_i was active with some target.
– Decision task: we need to choose one target among all possible candidates.
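In code, the decision step is a winner-take-all choice over the candidates' activations; the numbers below are invented for illustration:

    # Hypothetical activations of the candidate-word target nodes for one example.
    activations = {"run": 0.42, "ran": 1.37, "runs": 0.88}

    # Winner-take-all decision: predict the candidate with the highest activation.
    prediction = max(activations, key=activations.get)
    print(prediction)   # ran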

SNoW (Command line)
snow –train –I inputfile –F networkfile [-ABcdePrsTvW]
snow –test –I inputfile –F networkfile [-bEloRvw]
Architecture:
– Winnow: -W [α, β, θ, init weight] :targets
– Perceptron: -P [η, θ, init weight] :targets
– NB: -B :targets

SNoW parameters (training)
– -d < … | rel >: discarding method
– -e <threshold>: eligibility threshold
– -r <cycles>: number of training cycles
Output modes:
– -c <interval>: interval for network snapshots
– -v <level>: level of detail for the output to the screen

SNoW parameters (testing)
– -b <value>: smoothing for NB
– -w <value>: smoothing for Winnow and Perceptron
Output modes:
– -E <file>: error file
– -o <level>: level of detail for the output
– -R <file>: results file (default: stdout)

File Format (Example file)
6, 10034, 10141, 10151, 10158, 10179:
177, 10034, 10035, 10047:
With weights:
6, 10034(1), 10141(1.5), 10151(0.4), 10158(2), 10179(0.1):
177, 10034(2), 10035(4), 10047(0.6):
Only active features appear in an example!
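A small Python sketch (illustrative, assuming the comma-separated layout above is complete) for reading such lines into a label and (feature ID, weight) pairs, with a default weight of 1 when none is given:

    import re

    FEATURE_RE = re.compile(r"(\d+)(?:\(([\d.]+)\))?")

    def parse_example_line(line):
        # "label, f1, f2(w2), ... :"  ->  (label, [(feature_id, weight), ...])
        body = line.strip().rstrip(":")
        matches = [FEATURE_RE.fullmatch(tok.strip()) for tok in body.split(",")]
        parsed = [(int(m.group(1)), float(m.group(2)) if m.group(2) else 1.0)
                  for m in matches]
        return parsed[0][0], parsed[1:]

    print(parse_example_line("6, 10034(1), 10141(1.5), 10151(0.4):"))
    # (6, [(10034, 1.0), (10141, 1.5), (10151, 0.4)])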

File Format (Network file)
NB:
  target naivebayes : 0 : : : 0 : :
Winnow:
  target winnow : 0 : : : 0 : :
Perceptron:
  target perceptron : 0 : : : 0 : :

File Format (Error file)
Algorithms: Perceptron: (1, 30, 0.05)
Targets: 3, 53, 73

Ex: 8   Prediction: 3   Label: 53
  3:  :  *
  73:
Ex: 15  Prediction: 3   Label: 73
  3:  :  *
  53: