Published by Melvin Allen. Modified over 9 years ago.
1
Week 9: resources for globalisation
Finish spell checkers
Machine Translation (MT)
The ‘decoding’ paradigm
Ambiguity
Translation models
Interlingua and First Order Predicate Calculus
Human involvement
Historical note
2
Spelling dictionaries
Implementing a spelling identification and correction algorithm
3
Spelling dictionaries
Implementing a spelling identification and correction algorithm
STAGE 1: compare each string in the document with a list of legal strings; if there is no corresponding string in the list, mark it as misspelled
STAGE 2: generate a list of candidates
  Apply any single transformation to the typo string
  Filter the list by checking against a dictionary
STAGE 3: assign probability values to each candidate in the list
STAGE 4: select the best candidate
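Stages 1 and 2 can be sketched roughly as follows. This is a minimal illustration, not the lecture's own implementation: the word list `DICTIONARY` is a hypothetical toy, and the four "single transformations" are assumed to be deletion, transposition, substitution and insertion of one character.

```python
import string

# Hypothetical toy word list; a real checker would load a full dictionary.
DICTIONARY = {"the", "then", "than", "they", "he", "tea", "cat", "act"}


def is_misspelled(token, dictionary=DICTIONARY):
    """STAGE 1: a string is flagged if it is not in the list of legal strings."""
    return token.lower() not in dictionary


def single_edits(typo):
    """STAGE 2a: every string reachable from the typo by one transformation."""
    letters = string.ascii_lowercase
    splits = [(typo[:i], typo[i:]) for i in range(len(typo) + 1)]
    deletions = [a + b[1:] for a, b in splits if b]
    transpositions = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    substitutions = [a + ch + b[1:] for a, b in splits if b for ch in letters]
    insertions = [a + ch + b for a, b in splits for ch in letters]
    return set(deletions + transpositions + substitutions + insertions)


def candidates(typo, dictionary=DICTIONARY):
    """STAGE 2b: keep only the generated strings that are real words."""
    return sorted(single_edits(typo) & dictionary)


print(candidates("teh"))  # ['tea', 'the']
```

With this toy list, 'teh' yields 'the' (one transposition) and 'tea' (one substitution); ranking the two is left to STAGE 3.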
4
Spelling dictionaries
STAGE 3
Prior probability: given all the words in English, is this candidate more likely to be what the typist meant than that candidate?
  P(c) = count(c) / N, where count(c) is the number of times candidate c occurs in a corpus and N is the number of words in that corpus
Likelihood: given the possible errors, or transformations, how likely is it that a particular error has operated on candidate c to produce the typo t?
  P(t|c), calculated using a corpus of errors, or transformations
Bayesian rule: take the product of the prior probability and the likelihood
  P(c) × P(t|c)
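The ranking in STAGE 3 can be sketched as below. All numbers are invented for illustration: `CORPUS` is a hypothetical eight-word corpus, and `LIKELIHOOD` stands in for values that would really be estimated from a corpus of errors.

```python
from collections import Counter

# Hypothetical toy corpus; the prior is estimated from word counts.
CORPUS = "the the the the tea the tea ten".split()
COUNTS = Counter(CORPUS)
N = len(CORPUS)

# Hypothetical likelihoods P(t|c), keyed by (typo, candidate).
LIKELIHOOD = {("teh", "the"): 0.02, ("teh", "tea"): 0.001}


def prior(candidate):
    """P(c): relative frequency of the candidate in the corpus."""
    return COUNTS[candidate] / N


def score(typo, candidate):
    """P(c) x P(t|c); unseen (typo, candidate) pairs score zero here."""
    return prior(candidate) * LIKELIHOOD.get((typo, candidate), 0.0)


def best_candidate(typo, candidate_list):
    """Select the candidate with the highest score."""
    return max(candidate_list, key=lambda c: score(typo, c))


print(best_candidate("teh", ["tea", "the"]))  # the
```

A real system would smooth both estimates so that unseen words and unseen errors do not receive a probability of exactly zero.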
5
Spelling dictionaries: non-word errors
Implementing a spelling identification and correction algorithm
STAGE 1: identify misspelled words
STAGE 2: generate list of candidates
STAGE 3a: rank candidates by probability
STAGE 3b: select best candidate
Implement: noisy channel model, Bayesian rule
6
Resources for Globalisation: Machine translation
7
The ‘decoding’ paradigm Assumes one-to-one relation between source symbol and target symbol
8
Resources for Globalisation: Machine translation
The ‘decoding’ paradigm
Assumes one-to-one relation between source symbol and target symbol
  one-to-many (homonymy)
9
Resources for Globalisation: Machine translation
The ‘decoding’ paradigm
Assumes one-to-one relation between source symbol and target symbol
  one-to-many (homonymy)
  one-to-many (hypernym → hyponyms)
10
Resources for Globalisation: Machine translation
The ‘decoding’ paradigm
Assumes one-to-one relation between source symbol and target symbol
  one-to-many (homonymy)
  one-to-many (hypernym → hyponyms)
  many-to-one (hyponyms → hypernym)
11
Machine translation
The ‘decoding’ paradigm
one-to-many (homonymy)
  bank → Ufer, Bank (German)
12
Machine translation
The ‘decoding’ paradigm
one-to-many (homonymy)
one-to-many (hypernym → hyponyms):
  brother → otooto, oniisan (Japanese)
  blue → синий, голубой (Russian)
many-to-one (hyponyms → hypernym)
13
Machine translation
The ‘decoding’ paradigm
one-to-many (homonymy)
one-to-many (hypernym → hyponyms)
many-to-one (hyponyms → hypernym):
  hill, mountain → Berg (German)
  learn, teach → leren (Dutch)
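A small sketch of why the one-to-one assumption breaks down. The lexicon `EN_DE` reuses the English-German pairs from the slides; the function `decode` and the `{a|b}` notation for an unresolved choice are hypothetical and only for illustration.

```python
# English -> German entries taken from the examples above.
EN_DE = {
    "bank": ["Ufer", "Bank"],  # one-to-many: symbol substitution cannot choose
    "hill": ["Berg"],          # many-to-one: both source words map to 'Berg'
    "mountain": ["Berg"],
}


def decode(words, lexicon):
    """Word-for-word substitution; ambiguous entries are left unresolved."""
    output = []
    for word in words:
        options = lexicon.get(word, [word])  # unknown words pass through
        if len(options) == 1:
            output.append(options[0])
        else:
            output.append("{" + "|".join(options) + "}")
    return output


print(decode(["hill", "bank"], EN_DE))  # ['Berg', '{Ufer|Bank}']
```

The many-to-one case decodes without trouble in this direction, but the distinction between 'hill' and 'mountain' is lost and cannot be recovered when translating back.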
14
Machine translation and globalisation
Ambiguity
“The possibility of interpreting an expression in two or more distinct ways” (Collins English Dictionary)
Example: ‘I made her duck’
15
Machine translation
Ambiguity
The challenge of translation depends on the level of ambiguity that arises
This depends on the closeness of the source and target languages with respect to the following:
  vocabulary: homonyms
  grammar: structural ambiguity
  conceptual structure: specificity ambiguity, lexical gaps
16
Machine translation Pragmatic approach
17
Machine translation
Pragmatic approach
Aim for a rough translation, or ‘gist’ translation
  Used for multi-lingual information retrieval
18
Machine translation
Pragmatic approach
Aim for a rough translation, or ‘gist’ translation
  Used for multi-lingual information retrieval
Involve human translators in the process: computer-aided translation
19
Machine translation
Translation models
Transfer model
English: ‘the dog bit my friend’
Hindi: kutte-ne mere dost ko-kata
Gloss: dog my friend bit
20
Machine translation
Translation models
Transfer model
Alter the grammatical structure of the source language to make it adhere to the grammatical structure of the target language
Use transformation rules
  Analysis process (source)
  Transfer process (‘bridge’)
  Generation process (target)
Problem: each source-target pair will need its own unique set of transformation rules
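The three processes can be sketched with the dog example, where English orders subject, verb, object and the Hindi gloss orders subject, object, verb. Everything here is a toy under stated assumptions: the input is assumed to arrive already segmented into three chunks, `transfer_svo_to_sov` is a single hypothetical transformation rule, and the target lexicon is left empty so the output is the English gloss rather than real Hindi.

```python
def analyse(chunks):
    """Analysis (source): label pre-segmented chunks, assumed to be in SVO order."""
    subject, verb, obj = chunks
    return {"subject": subject, "verb": verb, "object": obj}


def transfer_svo_to_sov(structure):
    """Transfer ('bridge'): one rule that reorders SVO into SOV."""
    return [structure["subject"], structure["object"], structure["verb"]]


def generate(ordered_chunks, lexicon):
    """Generation (target): substitute target words where the lexicon has them."""
    return " ".join(lexicon.get(chunk, chunk) for chunk in ordered_chunks)


structure = analyse(("dog", "bit", "my friend"))
print(generate(transfer_svo_to_sov(structure), lexicon={}))  # dog my friend bit
```

The rule only serves this one language pair and direction, which is the problem noted above: another pair needs its own transfer step.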
21
Machine translation
Translation models
Inter-lingua model
Extract the meaning from the source string
Give it a language-independent representation, i.e. an interlingua
Translation process takes the interlingua as its input
Multiple translation processes take the same input for multiple target language outputs
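A rough count shows why a shared representation is attractive. The arithmetic assumes one rule set per ordered (source, target) pair for the transfer model, and one analyser plus one generator per language for the interlingua model; the function names are hypothetical.

```python
def transfer_rule_sets(n_languages):
    """Transfer model: one rule set for every ordered source-target pair."""
    return n_languages * (n_languages - 1)


def interlingua_modules(n_languages):
    """Interlingua model: one analyser and one generator per language."""
    return 2 * n_languages


for n in (2, 5, 10):
    print(n, transfer_rule_sets(n), interlingua_modules(n))
```

For 10 languages this gives 90 transfer rule sets against 20 interlingua modules, under the assumptions stated above.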
22
Machine translation
Translation models
What is the inter-lingua?
For words, some sort of semantic analysis, e.g.
  (GO, BY-FOOT)       Russian: идти   English: go
  (GO, BY-TRANSPORT)  Russian: ехать  English: go
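The GO example can be sketched as a lookup through a shared concept table. The table `GENERATORS` holds only the four word forms from the slide; the function names and the idea of running the generator table in reverse as an analyser are assumptions made for this illustration.

```python
# Interlingua concept -> word form, per language (entries from the GO example).
GENERATORS = {
    "ru": {("GO", "BY-FOOT"): "идти", ("GO", "BY-TRANSPORT"): "ехать"},
    "en": {("GO", "BY-FOOT"): "go", ("GO", "BY-TRANSPORT"): "go"},
}


def analyse(word, source):
    """Return every interlingua concept that the source word can express."""
    return [c for c, form in GENERATORS[source].items() if form == word]


def generate(concept, target):
    """Return the target-language word for one interlingua concept."""
    return GENERATORS[target][concept]


def translate(word, source, target):
    """Translate via the interlingua; more than one result signals ambiguity."""
    return sorted({generate(c, target) for c in analyse(word, source)})


print(translate("идти", "ru", "en"))  # ['go']
print(translate("go", "en", "ru"))    # ['ехать', 'идти']
```

Russian to English is unambiguous, while English 'go' maps to two concepts, so the analyser needs context to decide between them.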
23
Machine translation and globalisation
Translation models
What is the inter-lingua?
For sentences, a logical language, e.g. First Order Predicate Calculus
24
Meaning representation
Goal:
1. The semantic representation must give you a one-to-one mapping to non-linguistic knowledge of the world
2. The representation must be expressive, i.e. handle different types of data
25
Meaning representation
First Order Predicate Calculus
Computationally tractable
Objects (terms)
Properties of objects
Relations amongst objects
Predicate-argument structure
Large composite representations
Logical connectives
26
Meaning representation
First Order Predicate Calculus
Object: referred to uniquely by a term
  constant, e.g. SurreyUniversity
  function, e.g. LocationOf(SurreyUniversity)
  variable
27
Meaning representation
First Order Predicate Calculus
Relations amongst objects
Predicates: “symbols that refer to, or name, the relations that hold among some fixed number of objects” (J & M)
  Educates(SurreyUniversity, Citizens): two-place predicate
28
Meaning representation
First Order Predicate Calculus
Relations amongst objects
Predicates: can specify the category of an object
  University(SurreyUniversity): one-place predicate
29
Meaning representation First Order Predicate Calculus properties / parts of objects functions: LocationOf(SurreyUniversity)
30
Meaning representation
First Order Predicate Calculus
Composite representations through predicates and functions:
  Near(LocationOf(SurreyUniversity), LocationOf(Cathedral))
31
Meaning representation
First Order Predicate Calculus
Logical connectives combine basic representations to form larger, more complex representations
  e.g. the ∧ operator = ‘and’
32
Meaning representation
First Order Predicate Calculus
Logical connectives combine basic representations to form larger, more complex representations
  Educates(SurreyUniversity, Citizens) ∧ ¬ Remunerates(SurreyUniversity, Staff)
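The pieces of First Order Predicate Calculus introduced so far (constants, functions, predicates, and the connectives ∧ and ¬) can be sketched as small data structures evaluated against a toy world model. The model is hypothetical: the location names `CampusSite` and `CathedralSite` and the set `FACTS` are invented for illustration, variables and quantifiers are left out, and any fact not listed is treated as false.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Func:   # function term, e.g. LocationOf(SurreyUniversity)
    name: str
    args: tuple


@dataclass(frozen=True)
class Pred:   # predicate applied to terms, e.g. University(SurreyUniversity)
    name: str
    args: tuple


@dataclass(frozen=True)
class Not:    # logical negation
    body: object


@dataclass(frozen=True)
class And:    # logical conjunction
    left: object
    right: object


# Hypothetical toy world model; constants are plain strings.
FUNCTIONS = {
    ("LocationOf", ("SurreyUniversity",)): "CampusSite",
    ("LocationOf", ("Cathedral",)): "CathedralSite",
}
FACTS = {
    ("University", ("SurreyUniversity",)),
    ("Educates", ("SurreyUniversity", "Citizens")),
    ("Near", ("CampusSite", "CathedralSite")),
}


def eval_term(term):
    """Resolve a term to the object it denotes."""
    if isinstance(term, Func):
        return FUNCTIONS[(term.name, tuple(eval_term(a) for a in term.args))]
    return term  # a constant denotes itself


def holds(formula):
    """Evaluate a formula against FACTS; unlisted facts count as false."""
    if isinstance(formula, Pred):
        return (formula.name, tuple(eval_term(a) for a in formula.args)) in FACTS
    if isinstance(formula, Not):
        return not holds(formula.body)
    if isinstance(formula, And):
        return holds(formula.left) and holds(formula.right)
    raise TypeError(f"unsupported formula: {formula!r}")


near = Pred("Near", (Func("LocationOf", ("SurreyUniversity",)),
                     Func("LocationOf", ("Cathedral",))))
composite = And(Pred("Educates", ("SurreyUniversity", "Citizens")),
                Not(Pred("Remunerates", ("SurreyUniversity", "Staff"))))
print(holds(near), holds(composite))  # True True
```

The composite formula comes out true only because `Remunerates` is absent from the toy facts, which is a property of this model and not a claim about the world.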
33
Machine translation and globalisation: change of priorities
1954: IBM and Georgetown University, first MT demo
  Goal: ‘perfect’ translation
1966: Automatic Language Processing Advisory Committee (ALPAC) report: damning of that goal
Post-ALPAC
  Goal: rough translation, involve human element
Current situation: online translation, e.g. Babel Fish, descendant of SYSTRAN, whose goal was rough translation
Journal of Machine Translation
34
Next week
Globalisation as an industry
SDL and the SDLX-TRADOS globalisation application