Stochastic Inversion Transduction Grammars and Bilingual Parsing of Parallel Corpora
Dekai Wu
Presented by David Goss-Grubbs

Inversion Transduction Grammars
- What they are
- How they work
- What they can do

What an ITG is
- A formalism for modeling bilingual sentence pairs
- A transducer: it defines a relation between sentences
- Conceived as generating pairs of sentences rather than translating one into the other

Grammar Scope
- Not intended to relate a sentence to all and only its correct translations
- Used to extract useful information from parallel corpora
- ITGs overgenerate wildly

How ITGs work
- A subset of ‘context-free syntax-directed transduction grammars’
- A simple transduction grammar is just a CFG whose terminals are pairs of symbols (or singletons)
- In an ITG, for any given rule, the order of the constituents in one language may be the reverse of their order in the other

Notation
- Square brackets [ ] enclose the right-hand side of a rule when constituent order is the same in both languages
- Angle brackets ⟨ ⟩ are used when the order is reversed

Yoda Speak
S → ⟨SubjAux VP⟩
SubjAux → [NP Aux]
VP → ‘begun’ / ‘begunY’
NP → ‘the-clone-war’ / ‘the-clone-warY’
Aux → ‘has’ / ‘hasY’

English: ‘the-clone-war has begun’
Yoda: ‘begunY the-clone-warY hasY’
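As a sketch of how such a grammar generates both sentences at once, here is the slide's grammar transcribed into Python (the rule encoding and function names are my own):

```python
# Nonterminal rules carry an orientation ("[]" straight, "<>" inverted)
# and a list of children; lexical rules carry an English/Yoda terminal couple.
rules = {
    "S": ("<>", ["SubjAux", "VP"]),    # inverted: Yoda reverses these
    "SubjAux": ("[]", ["NP", "Aux"]),  # straight: same order in both outputs
    "VP": ("begun", "begunY"),
    "NP": ("the-clone-war", "the-clone-warY"),
    "Aux": ("has", "hasY"),
}

def generate(sym):
    """Expand sym, producing the English and Yoda word lists simultaneously."""
    rhs = rules[sym]
    if isinstance(rhs[1], list):           # nonterminal rule
        orient, kids = rhs
        parts = [generate(k) for k in kids]
        eng = [w for e, _ in parts for w in e]
        if orient == "<>":                 # inverted rule: reverse the
            parts = parts[::-1]            # children's order in language 2
        yoda = [w for _, y in parts for w in y]
        return eng, yoda
    return [rhs[0]], [rhs[1]]              # terminal couple

print(generate("S"))
```

`generate("S")` yields ‘the-clone-war has begun’ on the English side and ‘begunY the-clone-warY hasY’ on the Yoda side, with the single inverted rule at S doing all the reordering.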

Yoda Speak (parse tree)
[S ⟨ [SubjAux [NP the-clone-war] [Aux has] ] [VP begun] ⟩ ]

Normal Form
For any ITG there is a weakly equivalent grammar in normal form, where every right-hand side is one of:
- A terminal couple
- A terminal singleton
- A pair of nonterminals with straight orientation
- A pair of nonterminals with inverted orientation

Expressiveness Limits
- Not every way of matching the two sentences is possible: ‘Obi (has) not won victory’ cannot be matched with ‘notY victoryY ObiY wonY’
- This is a good thing: we only have to consider a subset of the possible matchings
- The percentage of possibilities eliminated grows rapidly with the number of tokens
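That growth can be made concrete. Word-level matchings expressible by a binary-branching ITG correspond to the separable permutations, which (a standard characterization, not from the slides) are exactly the permutations avoiding the patterns 2413 and 3142. A brute-force count:

```python
import math
from itertools import combinations, permutations

def itg_reachable(p):
    """True iff permutation p avoids the patterns 2413 and 3142,
    i.e. is expressible by a binary ITG (a separable permutation)."""
    for idx in combinations(range(len(p)), 4):
        q = [p[i] for i in idx]
        r = tuple(sorted(q).index(x) for x in q)  # pattern of the 4 picks
        if r in ((1, 3, 0, 2), (2, 0, 3, 1)):     # 2413 or 3142, 0-based
            return False
    return True

for n in range(1, 7):
    ok = sum(itg_reachable(p) for p in permutations(range(n)))
    print(f"n={n}: {ok} of {math.factorial(n)} matchings reachable")
```

For n = 4 this rules out 2 of the 24 matchings, and the excluded share grows quickly: 90 of 120 remain at n = 5, and only 394 of 720 at n = 6.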

Stochastic ITGs
- A probability can be attached to each rewrite rule
- The probabilities of all rules with a given left-hand side must sum to 1
- An SITG yields the most probable matching parse for a sentence pair

Parsing with an SITG: the chart
- Five dimensions: start and stop indices for sentence 1, start and stop indices for sentence 2, and the nonterminal category
- Each cell stores the probability of the most likely parse covering the corresponding substrings, rooted in that category

Parsing with an SITG: the algorithm
- Initialize the cells corresponding to terminals using a translation lexicon
- For each remaining cell, find the most probable way of deriving that category: multiply the rule probability by the probabilities of the two constituents
- Store that probability along with the orientation of the rule
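The steps above can be sketched in Python for a one-nonterminal, bracketing-style grammar. The lexicon and rule probabilities below are invented illustration values, and singleton rules are omitted for brevity:

```python
# Toy translation lexicon: probability of a terminal couple e/f (made up).
lexicon = {
    ("the-clone-war", "the-clone-warY"): 0.9,
    ("has", "hasY"): 0.9,
    ("begun", "begunY"): 0.9,
}
p_straight, p_inverted = 0.4, 0.3  # A -> [A A] and A -> <A A> (assumed values)

def parse(e, f):
    """Viterbi chart over cells (s, t, u, v) = spans e[s:t] and f[u:v].
    Each cell holds (best probability, backpointer)."""
    best = {}
    # Initialization: terminal couples from the translation lexicon.
    for s in range(len(e)):
        for u in range(len(f)):
            p = lexicon.get((e[s], f[u]), 0.0)
            if p > 0.0:
                best[(s, s + 1, u, u + 1)] = (p, "lex")
    # Recursion: build longer spans from pairs of adjacent sub-spans.
    for le in range(1, len(e) + 1):          # language-1 span length
        for lf in range(1, len(f) + 1):      # language-2 span length
            for s in range(len(e) - le + 1):
                for u in range(len(f) - lf + 1):
                    t, v = s + le, u + lf
                    for m in range(s + 1, t):      # language-1 split point
                        for w in range(u + 1, v):  # language-2 split point
                            # Straight: [A A] pairs left-with-left.
                            a, b = best.get((s, m, u, w)), best.get((m, t, w, v))
                            if a and b:
                                p = p_straight * a[0] * b[0]
                                if p > best.get((s, t, u, v), (0.0,))[0]:
                                    best[(s, t, u, v)] = (p, ("[]", m, w))
                            # Inverted: <A A> pairs left-with-right.
                            a, b = best.get((s, m, w, v)), best.get((m, t, u, w))
                            if a and b:
                                p = p_inverted * a[0] * b[0]
                                if p > best.get((s, t, u, v), (0.0,))[0]:
                                    best[(s, t, u, v)] = (p, ("<>", m, w))
    return best.get((0, len(e), 0, len(f)))
```

On the Yoda pair, `parse('the-clone-war has begun'.split(), 'begunY the-clone-warY hasY'.split())` finds a single parse, inverted at the top, with probability 0.3 · 0.4 · 0.9³.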

What you can do with an ITG
- Segmentation
- Bracketing
- Alignment
- Bilingual constraint transfer

Segmentation
- Several words might go together to make a single lexical entry; several characters might go together to make a single word
- “sandwiches there” vs. “sand which is there”
- A segmentation may make sense in the monolingual case, but not in the bilingual case

Segmentation
Change the parsing algorithm:
- Allow the initialization step to find strings of any length in the translation lexicon
- The recursive step stores the most probable way of building each constituent, whether it came from the lexicon or from rules
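A sketch of the changed initialization step (the lexicon format, names, and probabilities are my own):

```python
def init_chart(e, f, lexicon):
    """Seed the chart with every substring pair found in the translation
    lexicon, not just single-token couples, so competing segmentations
    are resolved inside the same parse."""
    best = {}
    for s in range(len(e)):
        for t in range(s + 1, len(e) + 1):
            for u in range(len(f)):
                for v in range(u + 1, len(f) + 1):
                    p = lexicon.get((tuple(e[s:t]), tuple(f[u:v])))
                    if p is not None and p > best.get((s, t, u, v), 0.0):
                        best[(s, t, u, v)] = p
    return best
```

With a lexicon entry pairing the three tokens ‘sand which is’ with the single token ‘sandwiches’, this seeds the multi-word cell (0, 3, 0, 1) directly, letting the parser choose that segmentation when it wins on probability.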

Bracketing
How do you assign structure to a sentence when no grammar is available?
- Get a parallel corpus pairing the language with some other language
- Get a reasonable translation lexicon
- Parse it with a bracketing transduction grammar

Bracketing Transduction Grammars
- A minimal, generic ITG with just one nonterminal
- Rules: A → [A A], A → ⟨A A⟩, plus terminal couples and singletons
- The important probabilities are on the rules that rewrite to terminal couples, taken from the translation lexicon
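A minimal sketch of such a grammar's parameter set (all probability values are invented; the constraint checked is the sum-to-one requirement from the earlier slide):

```python
# One nonterminal, A. All of its rule probabilities form one distribution.
p_straight = 0.3                     # A -> [A A]
p_inverted = 0.2                     # A -> <A A>
lexical_couples = {                  # A -> e/f, from the translation lexicon
    ("has", "hasY"): 0.2,
    ("begun", "begunY"): 0.2,
}
singletons = {("the-clone-war", None): 0.1}  # A -> e with no counterpart

total = (p_straight + p_inverted
         + sum(lexical_couples.values()) + sum(singletons.values()))
assert abs(total - 1.0) < 1e-12      # rules with left-hand side A sum to 1
```

Because the structural rules are so generic, everything the grammar "knows" lives in the lexical-couple probabilities.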

Bracketing Transduction Grammars
- Given lexical translation probabilities, only a subset of the otherwise possible bracketings remain available
- ‘Obi not won victory’ / ‘not victory Obi won’ is a limiting case: no bracketings at all
- ‘the-clone-war has begun’ has two bracketings; paired with ‘begun the-clone-war has’, it has just one

Bracketing with Singletons
- Singletons don’t help in bracketing
- Depending on the language, Wu uniformly attaches them to the left or to the right
- They are then ‘pushed down’ using yield-preserving transformations, e.g. [x [A B]] = [[x A] B]

Avoiding Arbitrary Choices
- Some arrangements leave us with an arbitrary choice: ‘create a-perimeter around-the-survivors’ / ‘around-the-survivors a-perimeter create’
- Both ⟨A ⟨B C⟩⟩ and ⟨⟨A B⟩ C⟩ derive this pair
- Use a more complex bracketing grammar that makes such cases left-branching, then fix them in post-processing to the flat ⟨A B C⟩

Bracketing Experiment
- Tested on 2,000 English-Chinese sentence pairs
- Sentence pairs not adequately covered by the translation lexicon were rejected
- 80% bracket precision for English, 78% bracket precision for Chinese

Alignment
Alignments (phrasal or lexical) are a natural byproduct of bilingual parsing. Unlike ‘parse-parse-match’ methods, this approach:
- Doesn’t require a fancy grammar for both languages
- Guarantees compatibility between the two parses
- Has a principled way of choosing between possible alignments
- Provides a more reasonable ‘distortion penalty’

Bilingual Constraint Transfer
- A high-quality parse for one language can be leveraged to get structure for the other
- Alter the parsing algorithm to allow only constituents that are consistent with the parse that already exists for the well-known language
- This works for any sort of constraint supplied for the well-known language
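One way to implement that restriction (a sketch; the half-open span representation is my own) is to admit a chart cell only if its span in the well-parsed language does not cross any bracket of the existing parse:

```python
def compatible(span, known_spans):
    """True iff half-open interval `span` nests with every constituent of
    the known parse, i.e. crosses no existing bracket."""
    s, t = span
    for a, b in known_spans:
        # Crossing: the intervals overlap but neither contains the other.
        if s < a < t < b or a < s < b < t:
            return False
    return True
```

For example, with known constituents [(0, 2), (2, 3), (0, 3)], the candidate span (1, 3) is rejected because it crosses (0, 2), while (1, 2) nests inside (0, 2) and is kept. The same predicate can filter on any span-shaped constraint, not just full parses.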