Montague Grammar and MT
Chris Brew, The Ohio State University
http://www.purl.org/NET/cbrew.htm
Machine Translation and Montague Grammar
- Great paper by Jan Landsbergen, in Readings in Machine Translation
- The place of linguistics in MT
- What is the essence of Montague Grammar?
- How can we use it (the essence) in MT?
- The subset problem
- How does this look today?
Possible translations
- It must be defined clearly what the correct sentences of the source and target languages are.
  - Linguistic theory provides the means to do this, in the form of grammars with associated compositional semantics.
  - Landsbergen suggests a Montague(-inspired) grammar.
- If the input is a correct source-language sentence, the output should be a correct target-language sentence.
  - This is a condition on the design of the translation system. Landsbergen sketches one approach.
- There must be some definition of the information content that the source and target sentences should have in common.
  - This is a call to arms for translation theory. No good solution is currently available.
Best translations
- It must be defined clearly what the correct sentences of the source and target languages are.
  - This defines the search space of possible inputs and outputs.
- If the input is a correct source-language sentence, the output should be the best corresponding target-language sentence.
  - The system will be evaluated on its treatment of correct sentences; robustness with respect to incorrect input is not required.
- It could be that there are three sentences e, f and e' such that f is the best translation of e but e' is the best translation of f.
  - 'Best translation' is not a symmetric relation.
- By contrast, 'possible translation' is symmetric. In addition, if we have three languages E, F and G, we have transitivity: composing the possible E-F translations with the possible F-G translations gives the possible E-G translations. (A toy sketch of this contrast follows.)
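Here is a minimal Python sketch of the contrast. All sentences, relations, and scores are invented for illustration; nothing here comes from Landsbergen's paper.

```python
# A toy illustration of the contrast: 'possible translation' composes
# transitively, while 'best translation' picks an argmax under an
# asymmetric score. All data here is invented.

def compose(rel_ef, rel_fg):
    """Compose two possible-translation relations: e ~ g iff some f links them."""
    return {(e, g) for (e, f1) in rel_ef for (f2, g) in rel_fg if f1 == f2}

# possible translations between toy languages E-F and F-G
possible_ef = {("e1", "f1"), ("e1", "f2"), ("e2", "f2")}
possible_fg = {("f1", "g1"), ("f2", "g2")}
print(compose(possible_ef, possible_fg))
# {('e1', 'g1'), ('e1', 'g2'), ('e2', 'g2')}: transitivity holds

def best(source, relation, score):
    """Best translation: the highest-scoring candidate for the source."""
    candidates = [t for (s, t) in relation if s == source]
    return max(candidates, key=lambda t: score[(source, t)])

# an asymmetric score: f1 is the best translation of e1,
# but e2 (not e1) is the best translation of f1
score = {("e1", "f1"): 0.9, ("e1", "f2"): 0.4,
         ("f1", "e1"): 0.3, ("f1", "e2"): 0.8}
print(best("e1", {("e1", "f1"), ("e1", "f2")}, score))  # f1
print(best("f1", {("f1", "e1"), ("f1", "e2")}, score))  # e2
```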
Comparing MT systems
- It is possible to reason theoretically about systems that at least aspire to Landsbergen's principles.
- There are no obvious grammatical or semantic criteria for evaluating systems when the output is not even a correct sentence of the target language.
- Linguists should specify the possible translations.
- Engineers (or linguists wearing hard hats) should worry about robustness and translation selection.
  - The robustness part might need to appeal to world knowledge, discourse history, knowledge of the task, and other extralinguistic factors.
The essence of Montague Grammar
- There is a set of basic expressions with meanings.
- Rules are pairs of a syntactic and a semantic rule, where the syntactic and the semantic rules work in lock-step (the rule-to-rule hypothesis).
- Either the semantic rules are operators that build up the semantic value directly (Montagovian),
- or the semantic rules build up an expression in some logic, and that expression is then interpreted by the rules of the logic to produce a standardized semantic value (genuine Montague).
(A minimal sketch of the rule-to-rule idea follows.)
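Here is a minimal sketch of the rule-to-rule idea, with a toy lexicon and two invented rules of my own; strings stand in for terms of the logic.

```python
# A minimal sketch of the rule-to-rule hypothesis: each syntactic rule is
# paired with a semantic operation, and the two apply in lock-step.
# Lexicon and rules are toy examples, not from any actual grammar.

# basic expressions with meanings
LEXICON = {
    "John": "john",
    "Mary": "mary",
    "loves": lambda obj: (lambda subj: f"love({subj},{obj})"),
}

# rule pairs: the syntactic side builds a tree, the semantic side a meaning
def rule_vp(v, np):
    """VP -> V NP, paired with semantics: apply the verb to the object."""
    return ("VP", v[0], np[0]), v[1](np[1])

def rule_s(np, vp):
    """S -> NP VP, paired with semantics: apply the VP to the subject."""
    return ("S", np[0], vp[0]), vp[1](np[1])

# building "John loves Mary": each (tree, meaning) pair grows together
john = ("NP", LEXICON["John"])
mary = ("NP", LEXICON["Mary"])
loves = ("V", LEXICON["loves"])
vp = rule_vp(loves, mary)   # (('VP', 'V', 'NP'), <partially applied meaning>)
s = rule_s(john, vp)
print(s[0], s[1])           # ('S', 'NP', ('VP', 'V', 'NP')) love(john,mary)
```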
Landsbergen's system
- M-grammars have surface trees (S-trees).
- S-PARSER is standard technology; it generates a parse forest of S-trees.
- M-PARSER scans the results of S-PARSER, applying a series of analytical rules that rewrite the S-trees.
- M-PARSER is very powerful, and builds up semantic values as it goes.
- The result of M-PARSER is a semantic tree that is easy to transfer.
(A schematic sketch of this two-stage pipeline follows.)
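A schematic sketch of the two-stage pipeline. The tree encoding, the single analytical rule, and the output format are all invented stand-ins; Landsbergen's actual system is far richer than this.

```python
# A schematic sketch of the M-grammar pipeline (all types and rules here
# are invented stand-ins): S-PARSER yields a forest of surface trees;
# M-PARSER applies analytical rules that rewrite S-trees while building
# up transferable semantic values.

def s_parser(sentence):
    # stand-in for standard parsing technology: a forest of S-trees,
    # each tree encoded as a nested tuple (label, children...)
    return [("S", ("NP", "John"), ("VP", ("V", "sleeps")))]

def analytical_rule(s_tree):
    # a toy analytical rule: rewrite S -> NP VP into a semantic tree
    if s_tree[0] == "S":
        (_, (_, subj), (_, (_, verb))) = s_tree
        return (verb, subj)        # e.g. ('sleeps', 'John'), ready for transfer
    return None

def m_parser(forest, rules):
    return [sem for tree in forest for rule in rules
            if (sem := rule(tree)) is not None]

print(m_parser(s_parser("John sleeps"), [analytical_rule]))
# [('sleeps', 'John')]
```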
The subset problem
- Montague grammars translate natural language into subsets of intensional logic (IL).
- There is no guarantee that the subset will be the same for every language.
- Without extra cleverness, the only sentences that can be translated are those whose meanings fall in the intersection of the source-language IL subset and the target-language IL subset.
Isomorphic grammars
- To avoid the subset problem, impose the constraint that:
  - for every syntactic rule in one language there is a corresponding syntactic rule in every other language, and the associated meaning operation is the same across the board;
  - for every basic expression, there is a corresponding one in every other language.
- This is a really heavy constraint on grammar writers, and it isn't clear how to satisfy it. (A toy sketch of the constraint follows.)
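A toy illustration of the isomorphism constraint; both grammar fragments are my own invented examples, not Landsbergen's.

```python
# Two isomorphic grammar fragments: English and Dutch share rule names
# and meaning operations, and corresponding basic expressions get the
# same meaning. Only the surface forms differ.

MEANING_OPS = {
    # one meaning operation per rule, shared across the board
    "S -> NP VP": lambda np, vp: vp(np),
}

LEXICON_EN = {"John": "john", "sleeps": lambda x: f"sleep({x})"}
LEXICON_NL = {"Jan":  "john", "slaapt": lambda x: f"sleep({x})"}

# building 'John sleeps' and 'Jan slaapt' with the same rule and
# corresponding basic expressions yields the same logical term:
op = MEANING_OPS["S -> NP VP"]
print(op(LEXICON_EN["John"], LEXICON_EN["sleeps"]))  # sleep(john)
print(op(LEXICON_NL["Jan"],  LEXICON_NL["slaapt"]))  # sleep(john)
```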
Grammar writing
- A set of compositional rules R is written for handling a particular phenomenon in language L; a corresponding set of rules R' is written for handling the corresponding phenomenon in language L' (Landsbergen, p. 250).
- Grammar development proceeds in parallel. You test by ensuring that R covers the relevant expressions of L and that R' covers the relevant expressions of L'.
- The most important practical difference between this and other approaches is probably that the grammars are written with translation in mind.
The claim
- If you do this grammar-writing co-ordination, you can get away without worrying about the subset problem.
- Montague grammar may be way too complicated, but if Dutch "geloven" works the same way as English "believe", you can, in that case, get away with the same theoretically insufficient representation on both sides.
- You might be able to control the consequences of putting extra (non-truth-functional) control information into the IL by doing this on a case-by-case basis, in order to co-ordinate specific phenomena. (DANGER)
How does this look today?
- Practical experience with broad-coverage grammars: we now know that broad-coverage grammars produce large numbers of analyses, most of them crazy.
- It definitely pays to do some kind of probabilistic parse selection, even if you have a good broad-coverage grammar.
- If your goal is to do well on existing parsing metrics, it works well to learn the grammar from a treebank.
The linguistic question
- Given a tree, tell me how to make a score for the tree out of smaller components.
Given a tree
- Tell me how to break it down into smaller components:
  - small enough that the components are common enough for statistics over them to be reliable,
  - but large enough that the crucial relationships between the parts of the tree have a chance of coming through.
- Probabilistic context-free grammars are (slightly?) too coarse-grained, so we adjust them in ways that bring out more of the crucial relationships: add parents, grandparents, head words, and other clever stuff. (A sketch of parent annotation follows.)
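A small sketch of one such adjustment, parent annotation: each category is split by its parent label, so that, for example, an NP under S is distinguished from an NP under VP. The tree encoding is an invented toy.

```python
# Parent annotation for a PCFG: split each category by its parent label,
# making the grammar finer-grained without changing the trees themselves.

def parent_annotate(tree, parent="TOP"):
    """tree is (label, child, ...) for internal nodes, or a string for words."""
    if isinstance(tree, str):
        return tree
    label, *children = tree
    new_label = f"{label}^{parent}"
    return (new_label, *[parent_annotate(c, parent=label) for c in children])

t = ("S", ("NP", "John"), ("VP", ("V", "saw"), ("NP", "Mary")))
print(parent_annotate(t))
# ('S^TOP', ('NP^S', 'John'), ('VP^S', ('V^VP', 'saw'), ('NP^VP', 'Mary')))
```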
Given a translation pair
- Tell me how to break it down into smaller components:
  - small enough that the components are common enough for statistics over them to be reliable,
  - but large enough that the crucial relationships between the parts of the pair have a chance of coming through.
- A language model for the target language is standard technology.
- IBM Models 1, 2, 3, 4 and 5 handle the SL-TL correspondence, and are clearly very coarse-grained.
- How do we adjust things so that more of the crucial relationships come through? How should we think about translation pairs? (A sketch of the simplest of these models follows.)
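A compact sketch of EM training for IBM Model 1, the simplest of the five. The two-sentence corpus is a standard toy example; real training uses large corpora and more careful bookkeeping.

```python
# IBM Model 1 trained with EM on a toy corpus: learn word-translation
# probabilities t(f|e) from sentence pairs with no alignments given.

from collections import defaultdict

corpus = [(["the", "house"], ["das", "haus"]),
          (["the", "book"],  ["das", "buch"])]   # (english, foreign) pairs

t = defaultdict(lambda: 0.25)           # uniform initialization of t(f|e)

for _ in range(10):                     # EM iterations
    count = defaultdict(float)          # expected counts c(f, e)
    total = defaultdict(float)          # expected counts c(e)
    for es, fs in corpus:
        for f in fs:                    # E-step: distribute each f over all e
            norm = sum(t[(f, e)] for e in es)
            for e in es:
                frac = t[(f, e)] / norm
                count[(f, e)] += frac
                total[e] += frac
    for (f, e), c in count.items():     # M-step: renormalize
        t[(f, e)] = c / total[e]

print(round(t[("haus", "house")], 3))   # near 1: 'haus' aligns with 'house'
print(round(t[("das", "the")], 3))      # near 1: 'das' aligns with 'the'
```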
Errorfulness
- The Penn Treebank (PTB) is smallish and somewhat errorful. This imposes practical limits on the complexity of models: the more detail you ask for, the less likely your training procedure is to provide it in reliable form.
- Hand-written grammars blur the distinction between ungrammaticality and lack of coverage. It is therefore dangerous for components that use grammars to give too much weight to the grammar's claims about ungrammaticality.
- Even when the grammar fails to provide a complete analysis, it could provide useful partial results.
Errorfulness
- Current word-aligned corpora are tiny, but do at least exist. Presumably they too are errorful.
- Unsupervised learning via EM has dominated the field, because nothing better is available. The pseudo-annotation that EM hallucinates is very errorful.
- The complexity of models is limited by the need to do EM and by the difficulty of working with errorful annotation.
- It is dangerous for the system to believe hard-and-fast things about intertranslatability.
Coverage
- To score well, it usually pays to guess, even if:
  - the question seems so stupid that no sensible answer is possible;
  - your answer would be little better than a random guess.
- Statistical parsers build up models of grammar that always make a guess.
- The models learn from the whole of the data. They might be designed to learn linguistic things, but they can and do implicitly learn non-linguistic things that turn out to help.
Coverage
- To score well, it usually pays to guess, even if:
  - the question seems so stupid that no sensible answer is possible;
  - your answer would be little better than a random guess.
- Brown-style MT systems have good coverage and not-bad probabilistic models. They too learn from the whole of the data.
- Their design is shaped partly by the need to model linguistic things (e.g. word-order variation) and partly by accidental success in modeling other factors that we don't understand yet.
Conclusions
- There is a clear parallel between Landsbergen's notion of intertranslatability and Montague's notion of grammaticality.
- Arguably, statistical parsers succeed because they relax the notion of grammaticality, allowing them to handle misfires in the grammar smoothly.
- Coincidentally, they end up robust to other difficulties, including weaknesses in the statistical models and the training data.
Conclusions
- There is a clear parallel between Landsbergen's notion of intertranslatability and Montague's notion of grammaticality.
- Arguably, MT systems succeed because they relax the notion of intertranslatability (or simply fail to have such a notion at all).
- Coincidentally, this makes them robust to failings in the statistical modeling, the data, and the procedures for data augmentation.
- That said, it would be nice to have explicit semantics in MT systems.