Sequence Classification: Chunking
  Shallow Processing Techniques for NLP
  Ling570
  November 28, 2011
Chunking
Roadmap
  Chunking
    Definition
    Motivation
    Challenges
    Approach
What is Chunking?
  Form of partial (shallow) parsing
  Extracts major syntactic units, but not full parse trees
  Task: identify and classify
    Flat, non-overlapping segments of a sentence
    Basic non-recursive phrases
    Correspond to major POS
    May ignore some categories, e.g. base NP chunking
  Create simple bracketing
    [NP The morning flight] [PP from] [NP Denver] [VP has arrived]
    [NP The morning flight] from [NP Denver] has arrived
Why Chunking?
  Used when full parse unnecessary
    Or infeasible or impossible (when?)
  Extraction of subcategorization frames
    Identify verb arguments
    e.g. VP NP, VP NP NP, VP NP to NP
  Information extraction: who did what to whom
  Summarization: base information, remove modifiers
  Information retrieval: restrict indexing to base NPs
Processing Example
  Tokenization: The  morning  flight  from  Denver  has  arrived
  POS tagging:  DT   JJ       N       PREP  NNP     AUX  V
  Chunking:     NP                    PP    NP      VP
  Extraction:   NP NP VP etc.
Approaches
  Finite-state approaches
    Grammatical rules in FSTs
    Cascade to produce more complex structure
  Machine learning
    Similar to POS tagging
Finite-State Rule-Based Chunking
  Hand-crafted rules model phrases
    Typically application-specific
  Left-to-right longest match (Abney 1996)
    Start at beginning of sentence
    Find longest matching rule
    Greedy approach, not guaranteed optimal
Finite-State Rule-Based Chunking
  Chunk rules:
    Cannot contain recursion
      NP -> Det Nominal: Okay
      Nominal -> Nominal PP: Not okay
  Examples:
    NP -> (Det) Noun* Noun
    NP -> Proper-Noun
    VP -> Verb
    VP -> Aux Verb
  Consider: Time flies like an arrow
    Is this what we want?
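The longest-match strategy with the four example rules can be sketched in a few lines. This is a minimal illustration, not Abney's system: the function name `chunk`, the `RULES` table, and the simplified tag set (Det, Noun, Proper-Noun, Aux, Verb, Prep) are assumptions made for the example, and regular expressions over the tag string stand in for hand-built finite-state machines.

```python
import re

# Hypothetical rule table: each rule is (chunk label, regex over POS tags).
# Every tag in the encoded string is followed by one space, so a pattern
# always consumes whole tags.
RULES = [
    ("NP", re.compile(r"(?:Det )?(?:Noun )*Noun ")),  # NP -> (Det) Noun* Noun
    ("NP", re.compile(r"Proper-Noun ")),              # NP -> Proper-Noun
    ("VP", re.compile(r"Aux Verb ")),                 # VP -> Aux Verb
    ("VP", re.compile(r"Verb ")),                     # VP -> Verb
]


def chunk(tagged):
    """Greedy left-to-right longest-match chunking.

    tagged: list of (word, pos) pairs.
    Returns a list of (label, words) pairs; label is None for a word
    that no rule covers.
    """
    tags = [pos for _, pos in tagged]
    text = "".join(t + " " for t in tags)
    # Character offset at which each token's tag starts in `text`.
    starts = [0]
    for t in tags:
        starts.append(starts[-1] + len(t) + 1)

    chunks, i = [], 0
    while i < len(tagged):
        best_label, best_len = None, 0
        for label, pattern in RULES:
            m = pattern.match(text, starts[i])
            if m:
                n_tokens = m.group().count(" ")  # one space per matched tag
                if n_tokens > best_len:
                    best_label, best_len = label, n_tokens
        if best_label is None:
            chunks.append((None, [tagged[i][0]]))  # leave the word unchunked
            i += 1
        else:
            chunks.append((best_label, [w for w, _ in tagged[i:i + best_len]]))
            i += best_len
    return chunks
```

With "Time flies like an arrow" tagged Noun Noun Verb Det Noun, this sketch returns [NP Time flies] [VP like] [NP an arrow]; tagged Noun Verb Prep Det Noun, it returns [NP Time] [VP flies] like [NP an arrow]. The chunker commits to whatever the tagger gave it, which is one reading of the question the slide asks.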
Cascading FSTs
  Richer partial parsing
    Pass output of FST to next FST
  Approach:
    First stage: Base phrase chunking
    Next stage: Larger constituents (e.g. PPs, VPs)
    Highest stage: Sentences
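The cascade idea, where each stage consumes the labels produced by the previous one, might be sketched as follows. Everything here is illustrative: `apply_stage`, the rule format, and the two toy rules (PP -> Prep NP, S -> NP PP VP) are assumptions for the example, and plain sequence matching stands in for compiled transducers.

```python
def apply_stage(nodes, rules):
    """One cascade level: group label sequences into larger constituents.

    nodes: list of (label, children) pairs from the previous level.
    rules: list of (new_label, rhs), where rhs is a tuple of labels.
    Scans left to right and applies the longest rule matching at each position.
    """
    labels = [label for label, _ in nodes]
    out, i = [], 0
    while i < len(nodes):
        best = None
        for new_label, rhs in rules:
            if tuple(labels[i:i + len(rhs)]) == rhs:
                if best is None or len(rhs) > len(best[1]):
                    best = (new_label, rhs)
        if best is None:
            out.append(nodes[i])  # pass the node through unchanged
            i += 1
        else:
            out.append((best[0], nodes[i:i + len(best[1])]))
            i += len(best[1])
    return out


# First stage output (base phrases), written by hand for this example.
base = [
    ("NP", ["The", "morning", "flight"]),
    ("Prep", ["from"]),
    ("NP", ["Denver"]),
    ("VP", ["has", "arrived"]),
]
stage2 = apply_stage(base, [("PP", ("Prep", "NP"))])       # larger constituents
stage3 = apply_stage(stage2, [("S", ("NP", "PP", "VP"))])  # sentence level
```

Each stage stays non-recursive; depth comes only from stacking stages, so the number of levels bounds the depth of the structure that can be built.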
Example
Chunking by Classification
  Model chunking as task similar to POS tagging
  Instance: tokens
  Labels: simultaneously encode segmentation & identification
    IOB (or BIO) tagging (also BIOE or BIOSE)
    Segment: B(eginning), I(nternal), O(utside)
    Identity: phrase category: NP, VP, PP, etc.
  The   morning  flight  from  Denver  has   arrived
  NP-B  NP-I     NP-I    PP-B  NP-B    VP-B  VP-I
  NP-B  NP-I     NP-I          NP-B
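To make the encoding concrete, here is a small sketch that converts between chunk brackets and per-token tags, using the slide's category-first notation (NP-B, NP-I). The function names and the (label, words) chunk representation are assumptions for the example; words outside every chunk carry label None and receive the tag O.

```python
def chunks_to_tags(chunks):
    """Encode a list of (label, words) chunks as one tag per token."""
    tags = []
    for label, words in chunks:
        for k in range(len(words)):
            if label is None:
                tags.append("O")                       # outside any chunk
            else:
                tags.append(f"{label}-{'B' if k == 0 else 'I'}")
    return tags


def tags_to_chunks(words, tags):
    """Decode per-token tags back into (label, words) chunks."""
    chunks = []
    for word, tag in zip(words, tags):
        if tag == "O":
            chunks.append((None, [word]))
            continue
        label, position = tag.rsplit("-", 1)
        if position == "I" and chunks and chunks[-1][0] == label:
            chunks[-1][1].append(word)                 # continue the open chunk
        else:
            chunks.append((label, [word]))             # B, or a stray I treated as a start
    return chunks


words = ["The", "morning", "flight", "from", "Denver", "has", "arrived"]
full = [("NP", words[0:3]), ("PP", ["from"]), ("NP", ["Denver"]), ("VP", ["has", "arrived"])]
base_np = [("NP", words[0:3]), (None, ["from"]), ("NP", ["Denver"]), (None, ["has"]), (None, ["arrived"])]

print(chunks_to_tags(full))     # full chunking
print(chunks_to_tags(base_np))  # base NP chunking only
```

Because every token gets exactly one tag, segmentation and labeling reduce to a single per-token classification decision, which is what lets a POS-tagging-style learner be reused.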
Features for Chunking
  What are good features?
    Chunk tags of the 2 preceding words
    Words for 2 preceding, current, 2 following
    Parts of speech for 2 preceding, current, 2 following
  Vector includes those features + true label
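The feature window above could be extracted roughly as follows. This is a sketch under assumptions: the function name, the feature key names, and the "<PAD>" marker for positions beyond the sentence boundary are all choices made for the example, not part of any particular toolkit.

```python
def extract_features(words, pos_tags, chunk_tags, i):
    """Feature dict for token i.

    Uses words and POS tags in a window of two tokens on either side,
    plus the chunk tags already assigned to the two preceding tokens.
    """
    feats = {}
    for off in (-2, -1, 0, 1, 2):
        j = i + off
        inside = 0 <= j < len(words)
        feats[f"w[{off:+d}]"] = words[j] if inside else "<PAD>"
        feats[f"pos[{off:+d}]"] = pos_tags[j] if inside else "<PAD>"
    for off in (-2, -1):
        j = i + off
        feats[f"chunk[{off:+d}]"] = chunk_tags[j] if j >= 0 else "<PAD>"
    return feats


words = ["The", "morning", "flight", "from", "Denver", "has", "arrived"]
pos_tags = ["DT", "JJ", "N", "PREP", "NNP", "AUX", "V"]
gold = ["NP-B", "NP-I", "NP-I", "PP-B", "NP-B", "VP-B", "VP-I"]

# One training instance per token: (features, true label).
instances = [(extract_features(words, pos_tags, gold, i), gold[i])
             for i in range(len(words))]
```

At training time the preceding chunk tags come from the gold standard, as here; at test time they would have to be the classifier's own earlier predictions.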
Chunking as Classification Example
Evaluation
  System: output of automatic tagging
  Gold standard: true tags
    Typically extracted from parsed treebank
  Precision: # correct chunks / # system chunks
  Recall: # correct chunks / # gold chunks
  F-measure: F1 balances precision & recall
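A sketch of chunk-level scoring under these definitions, assuming tags in the category-first form used earlier (NP-B, NP-I, O) and counting a system chunk as correct only when its label and both boundaries match a gold chunk exactly. The function names are hypothetical, and F1 is computed as the harmonic mean 2PR / (P + R).

```python
def spans(tags):
    """Set of (label, start, end) chunk spans encoded by a tag sequence."""
    found = set()
    start, label = None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last chunk
        cat, _, pos = tag.rpartition("-")         # "NP-B" -> ("NP", "-", "B")
        if pos == "I" and cat == label:
            continue                              # current chunk keeps going
        if label is not None:
            found.add((label, start, i))          # close the open chunk
        start, label = (i, cat) if pos != "O" else (None, None)
    return found


def evaluate(system_tags, gold_tags):
    """Return (precision, recall, F1) over whole chunks, not single tags."""
    sys_spans, gold_spans = spans(system_tags), spans(gold_tags)
    correct = len(sys_spans & gold_spans)
    precision = correct / len(sys_spans) if sys_spans else 0.0
    recall = correct / len(gold_spans) if gold_spans else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


gold = ["NP-B", "NP-I", "NP-I", "PP-B", "NP-B", "VP-B", "VP-I"]
system = ["NP-B", "NP-I", "NP-B", "PP-B", "NP-B", "VP-B", "VP-I"]
p, r, f1 = evaluate(system, gold)
```

In this example a single wrong tag splits the first NP in two, so the system proposes five chunks of which three match the four gold chunks: one tag error costs both precision and recall, which is why chunk-level scores run lower than per-tag accuracy.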
State-of-the-Art
  Base NP chunking: 0.96
  Complex phrases:
    Learning:
      Most learners achieve similar results
    Rule-based:
  Limiting factors:
    POS tagging accuracy
    Inconsistent labeling (parse tree extraction)
    Conjunctions
      Late departures and arrivals are common in winter
      Late departures and cancellations are common in winter