1
A Fast Finite-state Relaxation Method for Enforcing Global Constraints on Sequence Decoding
Roy Tromble & Jason Eisner
Johns Hopkins University
2
Seminar – Friday, April 1
Speaker: Monty Hall
Location: Auditorium #1
“Let’s Make a Dilemma”
Monty Hall will host a discussion of his famous paradox.
(One role per string; one string per role.)
We know what the labels should look like!
Agreement:
– Named Entity Recognition (Finkel et al., ACL 2005)
– Seminar announcements (Finkel et al., ACL 2005)
Label structure:
– Bibliography parsing (Peng & McCallum, HLT-NAACL 2004)
– Semantic Role Labeling (Roth & Yih, ICML 2005)
3
[Chart: sequence modeling quality vs. decoding runtime, comparing local models, global constraints, and finite-state constraint relaxation]
Exploit the quality of the local models!
4
Semantic Role Labeling
Label each argument to a verb:
– Six core argument types (A0–A5)
CoNLL-2004 shared task:
– Penn Treebank section 20
– 4305 propositions
Follow Roth & Yih (ICML 2005)
[Sales for the quarter]A1 rose [to $ 1.63 billion]A4 [from $ 1.47 billion]A3
5
Encoding constraints as finite-state automata
6
Roth & Yih’s constraints as FSAs
NO DUPLICATE ARGUMENTS: each argument type (A0, A1, ...) can label at most one sub-sequence of the input.
[^A0]*A0*[^A0]*
[^A1]*A1*[^A1]*
...
7
Roth & Yih’s constraints as FSAs
AT LEAST ONE ARGUMENT: the label sequence must contain at least one label that is not O.
O*[^O]?*   (here ? matches any single label)
Regular expressions on any sequences: grep for sequence models
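The “grep for sequence models” idea can be illustrated with ordinary regular expressions over a label string. A minimal sketch; the one-character label encoding and the function names are my own, not from the talk:

```python
# Two of Roth & Yih's hard constraints as regexes over a label string.
import re

# Encode a labeling one character per token: '0'-'5' for A0-A5, 'O' for outside.
labeling = "11110O4O3O"

def no_duplicates(seq):
    """NO DUPLICATE ARGUMENTS: each type labels at most one contiguous run."""
    return all(re.fullmatch(f"[^{t}]*{t}*[^{t}]*", seq) for t in "012345")

def at_least_one(seq):
    """AT LEAST ONE ARGUMENT: some label is not O."""
    return re.search("[^O]", seq) is not None

print(no_duplicates(labeling), at_least_one(labeling))  # True True
print(no_duplicates("1O1"))                             # False: two A1 runs
```

Testing a candidate labeling against a constraint is just pattern matching; the decoding question is how to find the *best* labeling that passes every such test.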
8
Roth & Yih’s constraints as FSAs
DISALLOW ARGUMENTS: only allow argument types that are compatible with the proposition’s verb.
9
Roth & Yih’s constraints as FSAs
KNOWN VERB POSITION: the proposition’s verb must be labeled O.
10
Roth & Yih’s constraints as FSAs
ARGUMENT CANDIDATES: certain sub-sequences must receive a single label.
(Any constraints on bounded-length sequences.)
11
Roth & Yih’s local model as a lattice
“Soft constraints” or “features”
A unigram model!
12
A brute-force FSA decoder:
Sentence → local model lattice → intersect with global constraints → decode → labeling
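As a concrete toy rendering of this pipeline, the sketch below intersects every constraint automaton with a unigram-scored lattice up front and Viterbi-decodes the product. The DFA encoding, tiny label set, and scores are my own illustrative assumptions; the real system used weighted finite-state machinery, not Python dicts:

```python
# Brute-force intersect-then-decode over a tiny label set.
LABELS = ["O", "A0", "A1"]

def no_dup(label):
    """DFA for NO DUPLICATE <label>: at most one contiguous run of it.
    States: 0 = before the run, 1 = inside it, 2 = after it."""
    trans = {}
    for lab in LABELS:
        trans[(0, lab)] = 1 if lab == label else 0
        trans[(1, lab)] = 1 if lab == label else 2
        if lab != label:
            trans[(2, lab)] = 2   # no transition for a second run: reject
    return (0, trans, {0, 1, 2})  # (start, transitions, accepting states)

def decode(scores, dfas):
    """Viterbi over the product of all constraint DFAs and the unigram scores.
    scores[i][lab] is the local model's score for labeling token i with lab."""
    best = {tuple(d[0] for d in dfas): (0.0, [])}
    for pos in scores:
        nxt = {}
        for state, (sc, path) in best.items():
            for lab, s in pos.items():
                try:
                    new = tuple(d[1][(q, lab)] for q, d in zip(state, dfas))
                except KeyError:
                    continue      # some constraint rejects this label here
                if new not in nxt or sc + s > nxt[new][0]:
                    nxt[new] = (sc + s, path + [lab])
        best = nxt
    ok = [v for q, v in best.items()
          if all(s in d[2] for s, d in zip(q, dfas))]
    return max(ok)[1] if ok else None

scores = [{"O": 0.0, "A0": 1.0, "A1": 0.1},
          {"O": 1.0, "A0": 0.2, "A1": 0.1},
          {"O": 0.0, "A0": 1.0, "A1": 0.9}]
# Unconstrained argmax would be A0 O A0, which duplicates A0.
print(decode(scores, [no_dup("A0"), no_dup("A1")]))  # ['A0', 'O', 'A1']
```

Intersecting everything up front is exactly what blows up: with k three-state no-duplicate automata, the joint state space grows like 3^k, as the next slides illustrate.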
13
NO DUPLICATE A0
14
NO DUPLICATE A0, A1
15
NO DUPLICATE A0, A1, A2
16
NO DUPLICATE ARGUMENTS
Any approach would blow up in the worst case: satisfying global constraints is NP-hard.
17
Handling an NP-hard problem
Roth & Yih (ICML 2005): express path decoding and global constraints as an integer linear program (ILP).
Apply an ILP solver:
– Relax the ILP to a (real-valued) LP.
– Apply a polynomial-time LP solver.
– Branch and bound to find the optimal integer solution.
18
The ILP solver doesn’t know it’s labeling sequences
Path constraints:
– State 0: outflow ≤ 1; state 3: inflow ≤ 1
– States 1 & 2: outflow = inflow
At least one argument:
– Arcs labeled O: flow ≤ 1
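To make the flow encoding concrete, here is a hedged toy version of such an ILP, assuming SciPy ≥ 1.9 for its MILP support. The two-token lattice and scores are invented; with one node per token boundary, the flow constraints reduce to “pick exactly one label-arc per position”:

```python
# A toy Roth & Yih-style ILP: one 0/1 variable per lattice arc (position,
# label); flow conservation -> one label per position; AT LEAST ONE
# ARGUMENT -> cap the total flow on O-labeled arcs.
import numpy as np
from scipy.optimize import linprog

labels = ["O", "A0", "A1"]
scores = np.array([[1.0, 0.2, 0.1],     # token 0: O is locally best
                   [0.9, 0.3, 0.2]])    # token 1: O is locally best
n, m = scores.shape
c = -scores.ravel()                      # linprog minimizes

# Flow/path constraints: exactly one label per position.
A_eq = np.zeros((n, n * m))
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1
b_eq = np.ones(n)

# AT LEAST ONE ARGUMENT: total flow on O arcs <= n - 1.
A_ub = np.zeros((1, n * m))
A_ub[0, ::m] = 1                         # the O column at every position
b_ub = [n - 1]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=(0, 1), integrality=np.ones(n * m), method="highs")
picked = [labels[j] for j in res.x.reshape(n, m).argmax(axis=1)]
print(picked)  # ['O', 'A0']: the constraint forces one non-O label
```

Nothing in this encoding tells the solver the variables form a path through a lattice, which is the inefficiency the finite-state approach avoids.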
19
Maybe we can fix the brute-force decoder?
20
Local model usually violated no constraints
21
Most constraints were rarely violated
22
Finite-state constraint relaxation
Local models already capture much of the structure, so relax the constraints instead:
– Find the best path using a linear-time decoding algorithm.
– Apply only those global constraints that the path violates.
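A hedged end-to-end sketch of this loop follows, with a toy label set and invented scores; the paper’s implementation uses the FSA toolkit’s weighted automata, not Python dicts:

```python
# Finite-state constraint relaxation: decode with the local model alone,
# then intersect in only the constraint DFAs the best path violates.
LABELS = ["O", "A0", "A1"]

def no_dup(label):
    """DFA allowing at most one contiguous run of `label`."""
    trans = {}
    for lab in LABELS:
        trans[(0, lab)] = 1 if lab == label else 0
        trans[(1, lab)] = 1 if lab == label else 2
        if lab != label:
            trans[(2, lab)] = 2
    return (0, trans, {0, 1, 2})  # (start, transitions, accepting states)

def accepts(dfa, path):
    q, trans, final = dfa
    for lab in path:
        if (q, lab) not in trans:
            return False
        q = trans[(q, lab)]
    return q in final

def viterbi(scores, dfas):
    """Best path through the product of `dfas`, scored by the unigram model."""
    best = {tuple(d[0] for d in dfas): (0.0, [])}
    for pos in scores:
        nxt = {}
        for state, (sc, path) in best.items():
            for lab, s in pos.items():
                try:
                    new = tuple(d[1][(q, lab)] for q, d in zip(state, dfas))
                except KeyError:
                    continue
                if new not in nxt or sc + s > nxt[new][0]:
                    nxt[new] = (sc + s, path + [lab])
        best = nxt
    ok = [v for q, v in best.items() if all(s in d[2] for s, d in zip(q, dfas))]
    return max(ok)[1]

def relax_decode(scores, constraints):
    active = []                        # constraints intersected so far
    while True:
        path = viterbi(scores, active)
        violated = [c for c in constraints
                    if c not in active and not accepts(c, path)]
        if not violated:
            return path                # satisfies all constraints: optimal
        active += violated             # intersect only what was violated

scores = [{"O": 0.0, "A0": 1.0, "A1": 0.1},
          {"O": 1.0, "A0": 0.2, "A1": 0.1},
          {"O": 0.0, "A0": 1.0, "A1": 0.9}]
print(relax_decode(scores, [no_dup("A0"), no_dup("A1")]))  # ['A0', 'O', 'A1']
```

In this run only no_dup("A0") is ever intersected; no_dup("A1") is merely tested against finished paths, so it never enlarges the lattice. That is the source of the speedup.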
23
Brute-force algorithm:
Sentence → local model lattice → intersect with global constraints → decode → labeling
24
Constraint relaxation algorithm:
Sentence → local model lattice → decode → labeling → test against the global constraints C1, C2, C3, ...
– Violated constraints? Yes: intersect them into the lattice and decode again.
– No: done. The labeling is optimal, and constraints that were never violated were never intersected!
25
Finite-state constraint relaxation is faster than the ILP solver
State-of-the-art implementations:
– Xpress-MP for ILP
– FSA (Kanthak & Ney, ACL 2004) for constraint relaxation
Why?
26
No sentences required more than a few iterations. Many took only one iteration even though two constraints were violated.
27
Buy one, get one free
Sales for the quarter rose to $ 1.63 billion from $ 1.47 billion.  (spans: A1, A4, A3, A1)
28
Lattices remained small
[Chart: arcs at each iteration vs. arcs in the brute-force lattice, for examples that required 5 intersections]
29
Take-home message
Global constraints usually aren’t doing much work for you:
– Under local models, typical examples violate only a small number of them.
They shouldn’t have to slow you down so much, even though they’re NP-hard in the worst case:
– Figure out dynamically which ones need to be applied.
30
Future work
– General soft constraints (we discuss binary soft constraints in the paper).
– Choosing the order in which to test and apply constraints, e.g. by reinforcement learning.
– k-best decoding
31
Thanks to Scott Yih for providing both data and runtime, and to Stephan Kanthak for FSA.