Applications of Sequence Learning CMPT 825 Mashaal A. Memon
What We Know of Sequence Learning
Part-of-speech (POS) tagging is a sequence learning problem.
3 approaches to solving the problem:
- Noisy-channel
- Classification
- Rule-based
What We Know About POS Tagging
A part of speech (POS) explains not what the word is, but how it is used.
Problem: Which POS does each word represent?
Tags: POS tags (e.g. NN = noun, VB = verb)
Training: Word sequences with corresponding POS tags.
Input: Word sequences.
What We Know About POS Tagging Continued…
Examples:
Anoop  is   a   great  professor  .
NN     VBZ  DT  JJ     NN         .

I    am   kissing  butt  right  now  .
PRP  VBP  VBG      NN    RB     RB   .
What Is My Point?
Other interesting and important problems can be represented as tagging problems.
The same three approaches can be used.
4 such applications will be briefly introduced:
- Chunking
- Named Entity Recognition
- Cascaded Chunking
- Word Segmentation
(1) Chunking
A chunk is a syntactically correlated group of words in a sentence (e.g. noun phrase, verb phrase).
Problem: Which type of chunk does each word or group of words belong to?
Note: Chunks of the same type can sometimes sit right next to each other, so the tags must mark where each chunk begins.
(1) Chunking Continued…
Noun-Phrase (NP) Chunking
Only look for noun phrase chunks.
Tags:
- B = beginning of noun phrase
- I = in noun phrase
- O = other
Training: Word sequences with corresponding POS and NP tags.
Input: Word sequences and POS tags.
(1) Chunking Continued…
Noun-Phrase (NP) Chunking
Examples:
The  student  talked  to  Anoop  .
B    I        O       O   B      O

The  guy  he  talked  to  was  smelly  .
B    I    B   O       O   O    O       O
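A minimal Python sketch of how NP spans map onto B/I/O tags, using the second example above. The function name spans_to_bio and the (start, end) span format are assumptions made for illustration, not part of any particular toolkit.

```python
def spans_to_bio(n_tokens, np_spans):
    """Turn (start, end) noun-phrase spans (end exclusive) into B/I/O tags."""
    tags = ["O"] * n_tokens          # everything is "other" until a span covers it
    for start, end in np_spans:
        tags[start] = "B"            # first word of the noun phrase
        for i in range(start + 1, end):
            tags[i] = "I"            # remaining words of the same noun phrase
    return tags


words = "The guy he talked to was smelly .".split()
np_spans = [(0, 2), (2, 3)]          # "The guy" and "he" (hypothetical span input)
tags = spans_to_bio(len(words), np_spans)
print(" ".join(tags))                # B I B O O O O O
```

Note that "The guy" and "he" are two noun phrases side by side. Without the B tag on "he", the tags would read B I I and the two chunks would merge into one.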
(1) Chunking Continued…
General Chunking
Look for other syntactic constructs as well as noun phrases.
Tags:
- B or I prefix to each chunk type
- chunk types (NP = noun phrase, VP = verb phrase, PP = prepositional phrase, O = other)
Training: Word sequences with corresponding POS and chunk tags.
Input: Word sequences and POS tags.
(1) Chunking Continued…
General Chunking
Examples:
Anoop  should  give  me    an    A+    .
B-NP   B-VP    I-VP  B-NP  B-NP  I-NP  O

His   presentation  is    boring  me    to    death  .
B-NP  I-NP          B-VP  I-VP    B-NP  B-PP  B-NP   O
(2) Named Entity Recognition
A named entity is a phrase that contains names of persons, organizations or locations.
Problem: Does a word or group of words represent a named entity or not?
Tags:
- B or I prefix to each NE type
- NE types (PER = person, ORG = organization, LOC = location, O = other)
Training: Word sequences with corresponding POS and NE tags. Sometimes lists of NE data are used (Cheating!!)
Input: Word sequences with POS tags.
(2) Named Entity Recognition Continued…
Examples:
The  United  States  of     America
O    B-LOC   I-LOC   I-LOC  I-LOC

has  an  intelligent  leader  in  D.C.
O    O   O            O       O   B-LOC

,  Dick   Cheney  of  Halliburton  .
O  B-PER  I-PER   O   B-ORG        O
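A minimal Python sketch of reading the entities back out of a tagged sequence, using the example above. The function name bio_to_spans is an assumption for illustration. The same decoding would work for general chunk tags such as B-NP and I-VP, since both use the B/I prefix scheme.

```python
def bio_to_spans(words, tags):
    """Group B-X / I-X tags into (type, phrase) pairs; O tokens are skipped."""
    spans = []
    current_type, current_words = None, []
    for word, tag in zip(words, tags):
        prefix, _, label = tag.partition("-")   # "B-LOC" -> ("B", "-", "LOC")
        if prefix == "I" and label == current_type:
            current_words.append(word)          # continue the open entity
            continue
        if current_type is not None:            # close whatever was open
            spans.append((current_type, " ".join(current_words)))
        if prefix in ("B", "I"):                # a stray I-X is treated as a start
            current_type, current_words = label, [word]
        else:                                   # O: nothing is open
            current_type, current_words = None, []
    if current_type is not None:                # entity running to the end
        spans.append((current_type, " ".join(current_words)))
    return spans


words = ("The United States of America has an intelligent leader in D.C. "
         ", Dick Cheney of Halliburton .").split()
tags = ("O B-LOC I-LOC I-LOC I-LOC O O O O O B-LOC "
        "O B-PER I-PER O B-ORG O").split()
entities = bio_to_spans(words, tags)
for ne_type, phrase in entities:
    print(ne_type, phrase)
```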
(3) Cascaded Chunking
Cascaded chunking gives us back the parse tree of the sentence.
Think of it as a chunker that takes the initial input and then keeps working on its OWN output until an iteration makes no more changes.
Difference: Chunks may contain other chunks and POS tags.
(3) Cascaded Chunking Continued…
CHUNKER (W = {w1..wn}, T = {t1..tn}) → T’ = {t’1..t’n};

CASCADE (W = {w1..wn}, T = {t1..tn}) {
    OutputBefore = {Ø};
    OutputAfter = CHUNKER (W, T);
    while (OutputBefore != OutputAfter) do {
        OutputBefore = OutputAfter;
        OutputAfter = CHUNKER (W, OutputBefore);
        /* Output result of current iteration */
    }
}
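A runnable Python sketch of the CASCADE loop above. The chunker here is a toy stand-in: the RULES table is a hypothetical three-rule grammar invented for this illustration, and the word sequence W is dropped because these toy rules only look at labels. A real chunker would be one of the three learned or hand-built approaches.

```python
# Hypothetical toy grammar: each rule rewrites a pair of labels as one parent label.
RULES = {("DT", "NN"): "NP", ("VBZ", "JJ"): "VP", ("NP", "VP"): "S"}


def chunker(labels):
    """One pass: scan left to right, replacing each matching pair with its parent."""
    out, i = [], 0
    while i < len(labels):
        pair = tuple(labels[i:i + 2])
        if pair in RULES:
            out.append(RULES[pair])   # the two labels become one chunk
            i += 2
        else:
            out.append(labels[i])     # no rule applies; copy the label through
            i += 1
    return out


def cascade(labels):
    """Re-run the chunker on its own output until a pass changes nothing."""
    levels = [list(labels)]
    while True:
        after = chunker(levels[-1])
        if after == levels[-1]:       # fixed point reached
            return levels
        levels.append(after)


levels = cascade("DT NN VBZ JJ".split())
for level in levels:
    print(" ".join(level))
# DT NN VBZ JJ
# NP VP
# S
```

Each printed line is the output of one iteration. Stacked together, the lines are the levels of a parse tree, read bottom-up.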
(3) Cascaded Chunking Continued…
Example:
Words:      The effort to establish such a conclusion is unnecessary .
POS:        DT NN TO VB PDT DT NN VBZ JJ .
Pass 1:     DT NP IP VP PDT DT NP AP
Pass 2:     DP CP DP CP
...
Last pass:  S
Chunking is an intermediate step to a full parse.
(4) Word Segmentation
When written, some languages like Chinese don’t have obvious word boundaries.
Problem: Does a character or group of characters form a single word?
Tags:
- B = beginning of word
- I = in word
Training: Character sequences with corresponding WS tags.
Input: Character sequences.
(4) Word Segmentation Continued…
Example:
參賽者並未參加任何賓大語料之競賽
B I I B I B I B I B B B I B B I
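A minimal Python sketch that turns the B/I tags in the example above back into words. The function name tags_to_words is an assumption for illustration.

```python
def tags_to_words(chars, tags):
    """Start a new word at every B tag; append the character on every I tag."""
    words = []
    for ch, tag in zip(chars, tags):
        if tag == "B" or not words:   # also start a word if the sequence opens with I
            words.append(ch)
        else:
            words[-1] += ch
    return words


chars = "參賽者並未參加任何賓大語料之競賽"
tags = "B I I B I B I B I B B B I B B I".split()
words = tags_to_words(chars, tags)
print(" ".join(words))   # 參賽者 並未 參加 任何 賓 大 語料 之 競賽
```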
Conclusion
The problems differ in their goals, but because they share the same tagging representation, they can all be solved with the same approaches.
We all LOVE sequence learning
THE END
Questions?!
References
Manning, C. D. and H. Schütze. Foundations of Statistical Natural Language Processing. 1999.
CoNLL shared task on Chunking 2000. Website: (http://cnts.uia.ac.be/conll2000/chunking/)
CoNLL shared task on NER 2003. Website: (http://cnts.uia.ac.be/conll2003/ner/)
CoNLL shared task on NER 2002. Website: (http://cnts.uia.ac.be/conll2002/ner/)
Abney, S. Parsing By Chunks. In Journal of Psychological Research, 18(1), 1989.
Chinese Word Segmentation Bakeoff 2003. Website: (http://www.sighan.org/bakeoff2003)