Published by Lesley Dulcie Clarke. Modified over 8 years ago.
1 Global Inference via Linear Programming Formulation
Presenter: Natalia Prytkova
Tutor: Maximilian Dylla
14.07.2011
2 Outline
Motivation
Naïve Algorithm
LP Formulation
– Constraints
– Objective Function
Applications of LP
Experiments
Discussion
3 Inference with Classifiers
Recognize entities
Recognize relations
Inference
4 Example: Book, Author
6 Properties of Extracted Items
BalletWrittenBy (Ballet, Composer)
BookWrittenBy (Book, Author)
Entity types: Ballet, Composer, Book, Author
7 Properties of Extracted Items
BalletWrittenBy (Ballet, Composer)
BookWrittenBy (Book, Author)
ShownInTheater (Ballet, Theater)
GraduatedFrom (Composer, Conservatory)
BookPublishedBy (Book, Publisher)
MemberOfUnion (Author, WritersUnion)
Entity types: Ballet, Composer, Theater, Book, Author, WritersUnion, Conservatory, Publisher
8 Example: BalletWrittenBy, Ballet, Composer
10 Properties of Extracted Items
Many relation types
Many entity types
Entity and relation labels are mutually dependent
13 Key Idea
Recognize entities
Recognize relations
Inference
14 Naïve Algorithm
15 Naïve Algorithm
P(Book BalletWrittenBy Composer) = 0.07
P(Book BalletWrittenBy Author) = 0.07
P(Book BookWrittenBy Composer) = 0.12
P(Book BookWrittenBy Author) = 0.03
P(Ballet BalletWrittenBy Composer) = 0.28
P(Ballet BalletWrittenBy Author) = 0.28
P(Ballet BookWrittenBy Composer) = 0.12
P(Ballet BookWrittenBy Author) = 0.12
…
16 Naïve Algorithm
P(Book BalletWrittenBy Composer) = 0.07
P(Book BalletWrittenBy Author) = 0.07
P(Book BookWrittenBy Composer) = 0.12
P(Book BookWrittenBy Author) = 0.03
P(Ballet BalletWrittenBy Composer) = 0.28
P(Ballet BalletWrittenBy Author) = 0.28
P(Ballet BookWrittenBy Composer) = 0.12
P(Ballet BookWrittenBy Author) = 0.12
…
n entities: O(n^2) binary relations
l labels: l^(n^2) assignments
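The enumeration behind the naïve algorithm can be sketched in a few lines of Python. This is only an illustration: the variable names (`E1`, `E2`, `R12`) and the probabilities are hypothetical and are not the numbers shown on the slide, and the joint score is assumed to be a plain product of independent classifier outputs.

```python
from itertools import product

# Hypothetical classifier outputs; these are NOT the numbers from the slides.
probs = {
    "E1": {"Book": 0.2, "Ballet": 0.8},
    "E2": {"Composer": 0.6, "Author": 0.4},
    "R12": {"BalletWrittenBy": 0.7, "BookWrittenBy": 0.3},
}

def enumerate_assignments(probs):
    """Yield (assignment, score) for every joint labeling of all variables."""
    names = list(probs)
    for labels in product(*(probs[name] for name in names)):
        score = 1.0
        for name, label in zip(names, labels):
            score *= probs[name][label]  # assumes independent classifier scores
        yield dict(zip(names, labels)), score

scored = list(enumerate_assignments(probs))
best, best_score = max(scored, key=lambda pair: pair[1])
print(len(scored), best)
```

With three variables of two labels each there are only 2^3 = 8 candidates, but every additional variable multiplies the count by its number of labels, which is why exhaustive enumeration stops being practical very quickly.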
18 Some Useful Properties
Relations impose restrictions on entities
Each entity or relation can be labeled with only one label
Relations can be directed (BookWrittenBy) or undirected (SpouseOf)
20 Key Idea
Obtain a set of possible labels for entities/relations
Optimize the global decision given a set of constraints
21 Definitions
Sentence S
– Linked list of words and entities
– Boundaries of entities are given, e.g. "Piotr Ilyich Tchaikovsky" is one entity
Entity ε
– Observed variables
Relation
– Binary relations between entities
Class
– Predefined sets of entity and relation labels
22 Constraints
Indicator variables
23 Constraints
24 Constraints
Each entity or relation can be labeled with only one label
The assignment to each entity or relation variable is consistent with the assignments to its neighboring variables
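The two constraints can be illustrated with 0/1 indicator variables. The formulas from the slides did not survive in this transcript, so the sketch below is an assumption about their shape: `x[(variable, label)]` is 1 when the variable takes that label, and the names `label_sets`, `relations` and `signatures` are hypothetical helpers introduced only for this example.

```python
# Hypothetical toy problem: two entities and one relation between them.
label_sets = {
    "E1": ["Book", "Ballet"],
    "E2": ["Composer", "Author"],
    "R12": ["BalletWrittenBy", "BookWrittenBy"],
}
relations = {"R12": ("E1", "E2")}  # relation variable -> (first arg, second arg)
signatures = {                     # relation label -> required argument types
    "BalletWrittenBy": ("Ballet", "Composer"),
    "BookWrittenBy": ("Book", "Author"),
}

def to_indicators(assignment):
    """Turn {variable: label} into 0/1 indicator variables x[(variable, label)]."""
    return {(var, lab): int(assignment[var] == lab)
            for var, labs in label_sets.items() for lab in labs}

def exactly_one_label(x):
    """Constraint 1: the indicators of each variable sum to exactly 1."""
    return all(sum(x[(var, lab)] for lab in labs) == 1
               for var, labs in label_sets.items())

def consistent_with_neighbors(x):
    """Constraint 2: an active relation label forces its argument types."""
    for rel_var, (arg1, arg2) in relations.items():
        for rel_label, (type1, type2) in signatures.items():
            if x[(rel_var, rel_label)] == 1:
                if x.get((arg1, type1), 0) != 1 or x.get((arg2, type2), 0) != 1:
                    return False
    return True

good = to_indicators({"E1": "Ballet", "E2": "Composer", "R12": "BalletWrittenBy"})
bad = to_indicators({"E1": "Book", "E2": "Composer", "R12": "BalletWrittenBy"})
print(consistent_with_neighbors(good), consistent_with_neighbors(bad))
```

In an actual ILP these checks would be written as linear equalities over the same indicator variables and handed to a solver; here they are plain Python predicates so the example stays dependency-free.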
25 Objective Function
Assignment cost
– e.g.
– Cost of deviating from the assignments given by classifiers
Constraint cost
– e.g.
– Cost of breaking constraints between two neighboring entities
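A minimal sketch of such an objective, under stated assumptions: the assignment cost is taken to be the negative log of the classifier probability, the constraint cost is infinite for a violated relation signature (a hard constraint), and brute-force search stands in for an ILP solver. The probabilities and the names `probs`, `relations` and `signatures` are hypothetical.

```python
import math
from itertools import product

# Hypothetical classifier outputs, chosen so the local best labels conflict.
probs = {
    "E1": {"Book": 0.55, "Ballet": 0.45},
    "E2": {"Composer": 0.9, "Author": 0.1},
    "R12": {"BalletWrittenBy": 0.8, "BookWrittenBy": 0.2},
}
relations = {"R12": ("E1", "E2")}
signatures = {
    "BalletWrittenBy": ("Ballet", "Composer"),
    "BookWrittenBy": ("Book", "Author"),
}

def assignment_cost(assignment):
    """Assumed form: sum of -log p over the chosen labels."""
    return sum(-math.log(probs[var][lab]) for var, lab in assignment.items())

def constraint_cost(assignment, penalty=math.inf):
    """Assumed form: a fixed penalty per violated argument type."""
    total = 0.0
    for rel_var, (arg1, arg2) in relations.items():
        type1, type2 = signatures[assignment[rel_var]]
        if assignment[arg1] != type1:
            total += penalty
        if assignment[arg2] != type2:
            total += penalty
    return total

def best_assignment():
    """Brute-force minimizer; a real system would call an (I)LP solver."""
    names = list(probs)
    candidates = (dict(zip(names, labels))
                  for labels in product(*(probs[n] for n in names)))
    return min(candidates,
               key=lambda a: assignment_cost(a) + constraint_cost(a))

local = {var: max(dist, key=dist.get) for var, dist in probs.items()}
print("local :", local)
print("global:", best_assignment())
```

Taken one variable at a time, the classifiers prefer Book for `E1`, which contradicts the BalletWrittenBy relation; minimizing the combined cost relabels `E1` as Ballet because that is the cheapest assignment that violates no constraint.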
27 Useful Property
ILP is NP-hard in general, but some instances can be solved in polynomial time.
29 Viterbi
Shortest path
30 Viterbi
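The link between Viterbi decoding and shortest paths can be made concrete: if every transition and emission probability is turned into a cost of -log p, the most probable state sequence is the cheapest path through the trellis. The sketch below is a generic illustration, not the formulation from the slides; the two-state toy model and all of its numbers are hypothetical, and every probability is assumed to be non-zero so the logarithm is defined.

```python
import math

def viterbi_shortest_path(states, start_p, trans_p, emit_p, observations):
    """Return (best state sequence, its probability) via min-cost path search."""
    # dist[s]: cost of the cheapest path that ends in state s at this position.
    dist = {s: -math.log(start_p[s] * emit_p[s][observations[0]]) for s in states}
    back = []  # back[t][s]: predecessor of s on the cheapest path
    for obs in observations[1:]:
        new_dist, pointers = {}, {}
        for s in states:
            prev = min(states, key=lambda p: dist[p] - math.log(trans_p[p][s]))
            new_dist[s] = (dist[prev] - math.log(trans_p[prev][s])
                           - math.log(emit_p[s][obs]))
            pointers[s] = prev
        back.append(pointers)
        dist = new_dist
    last = min(states, key=dist.get)
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, math.exp(-dist[last])

# Hypothetical toy model.
states = ("Healthy", "Fever")
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
           "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit_p = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
          "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}

path, prob = viterbi_shortest_path(states, start_p, trans_p, emit_p,
                                   ("normal", "cold", "dizzy"))
print(path, prob)
```

Because a shortest-path problem can itself be written as a linear program, a decoder of this kind fits the same optimization framework as the entity and relation constraints.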
31 Phrase Identification
35 Experiments
[Figure: five pipeline configurations built from E, R and I boxes, labeled E -> R, E R, Separate, R -> E and Omniscient]
36 Experiments
37 Experiments
5 336 entities
19 048 pairs of entities
1 437 sentences
Running time < 30 sec on a Pentium III 800 MHz
39 Discussion
Guarantees optimality
Supports correct decisions by imposing limitations
LP solvers are available
Not scalable
– CPLEX accepts at most 2^31 variables and constraints, i.e. about 46 000 entities
– student edition accepts only 500, i.e. about 20 entities
No feedback to extractors
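The entity limits quoted above are consistent with a simple back-of-the-envelope check. The sketch assumes that the number of variables grows roughly as n^2 for n entities (one per pair of entities, ignoring the factor for the number of labels); that growth rate is an assumption made here to reproduce the figures, not something stated on the slide.

```python
import math

def max_entities(variable_limit):
    """Largest n with n * n <= variable_limit, assuming about n^2 variables."""
    return math.isqrt(variable_limit)

print(max_entities(2**31))  # 46340
print(max_entities(500))    # 22
```

Both results land close to the rounded figures on the slide (about 46 000 and about 20 entities).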
40 References
Dan Roth and Wen-tau Yih: A Linear Programming Formulation for Global Inference in Natural Language Tasks. CoNLL 2004.
Dan Roth and Wen-tau Yih: Global Inference for Entity and Relation Identification via a Linear Programming Formulation. In: Introduction to Statistical Relational Learning, 2007.