Jason Ji Computer Systems Laboratory


1 Natural Language Processing: Using Machine Translation in Creation of a German-English Translator
Jason Ji, Computer Systems Laboratory, 2004-2005

2 Machine Translation
- a field that has been around for decades
- several methods to solve the problem
- none of them resemble human methods
- this program attempts to use human methods to translate

3 Direct Approach
- original translation strategy
- translate each word directly by one-to-one dictionary look-up
- then perform some local reordering
- doesn't consider semantic information
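
As a rough illustration of the direct approach, here is a minimal Python sketch of one-to-one dictionary look-up with no reordering. The dictionary entries and the name `direct_translate` are assumptions made up for this example; they are not taken from the project.

```python
# Toy one-to-one dictionary; the entries are illustrative assumptions.
DIRECT_DICT = {
    "i": "ich",
    "see": "sehen",
    "the": "der",
    "dog": "Hund",
}


def direct_translate(sentence: str) -> str:
    """Translate word by word, with no reordering and no semantic information."""
    words = sentence.lower().rstrip(".").split()
    # Words missing from the dictionary are passed through unchanged.
    return " ".join(DIRECT_DICT.get(word, word) for word in words)


print(direct_translate("I see the dog."))
```

The result, "ich sehen der Hund", leaves the verb unconjugated and the article in the wrong case, which is the kind of error that look-up alone cannot fix.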

4 Indirect Approach
- Interlingua and Transfer Approaches
- translate from source language to some intermediary, unnatural language, including semantic information, etc.
- translate from intermediary to the target language
- other methods: knowledge-based, etc.; more complicated, less human-like

5 Theory
- no current translation method is 100% effective
- no current translation method closely resembles human approach
- humans can be 100% effective translators
- therefore, use human approach with machines to have more effective translations?

6 Overview of Method
- separate input string into individual words
- first look-up: a list that maps each word to its part of speech
- second look-up: each part-of-speech-specific list maps each word to its translation and semantic information
  - past tense forms, irregular conjugations, etc.
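
The two look-ups can be sketched in Python as follows. This is only an illustration: the in-memory tables stand in for the project's text files, and the table contents, field names, and function names (`look_up`, `analyze`) are assumptions.

```python
# First look-up: word -> part of speech (stands in for the master word list).
POS_LIST = {"i": "pronoun", "see": "verb", "the": "article", "dog": "noun"}

# Second look-up: one table per part of speech,
# mapping word -> (translation, semantic information).
POS_TABLES = {
    "pronoun": {"i": ("ich", {"person": 1, "number": "sg"})},
    "verb": {"see": ("sehen", {"type": "strong"})},
    "article": {"the": ("der", {"definite": True})},
    "noun": {"dog": ("Hund", {"gender": "m", "plural": "Hunde"})},
}


def look_up(word):
    """Return (part of speech, translation, semantic info) for one word."""
    key = word.lower()
    pos = POS_LIST[key]                       # first look-up
    translation, info = POS_TABLES[pos][key]  # second look-up
    return pos, translation, info


def analyze(sentence):
    """Separate the input string into words and look each one up."""
    return [look_up(word) for word in sentence.rstrip(".").split()]


for entry in analyze("I see the dog."):
    print(entry)
```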

7 Development
- Assumptions:
  - article must precede noun
  - preposition must be followed by a noun
  - verb must be preceded by a noun
- look-up: find word in list.txt, then redirect to other text files
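
The word-order assumptions can be checked against a sequence of part-of-speech tags. The sketch below is a hypothetical helper, not the project's code. It takes the article and preposition rules literally (the very next word must be a noun) and, as an assumption, lets a pronoun count as the subject before a verb, since the deck's own example "I see the dog" starts with one.

```python
def check_assumptions(tags):
    """Return a list of violated assumptions for a sequence of POS tags."""
    problems = []
    for i, tag in enumerate(tags):
        prev_tag = tags[i - 1] if i > 0 else None
        next_tag = tags[i + 1] if i + 1 < len(tags) else None
        if tag == "article" and next_tag != "noun":
            problems.append(f"article at position {i} is not followed by a noun")
        if tag == "preposition" and next_tag != "noun":
            problems.append(f"preposition at position {i} is not followed by a noun")
        # Assumption: a pronoun is accepted as the subject in front of a verb.
        if tag == "verb" and prev_tag not in ("noun", "pronoun"):
            problems.append(f"verb at position {i} is not preceded by a noun")
    return problems


print(check_assumptions(["pronoun", "verb", "article", "noun"]))  # no violations
print(check_assumptions(["verb", "article"]))                     # two violations
```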

10 Development
- line of semantic information is chopped up
- various subclass objects (Noun, Verb, etc.) are created
- pronouns are created in nominative case
- articles are created unidentified
- verbs are created in infinitive form
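
A possible shape for these objects, sketched in Python with dataclasses. The class fields and the pipe-separated line format `pos|english|german|extra` are hypothetical; the slides do not show the actual file format or class definitions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Word:
    english: str
    german: str


@dataclass
class Noun(Word):
    gender: str = "m"


@dataclass
class Pronoun(Word):
    case: str = "nominative"      # pronouns start out in nominative case


@dataclass
class Article(Word):
    gender: Optional[str] = None  # articles start out unidentified
    case: Optional[str] = None


@dataclass
class Verb(Word):
    form: str = "infinitive"      # verbs start out in infinitive form


def build(line: str) -> Word:
    """Chop a hypothetical 'pos|english|german|extra' line into an object."""
    pos, english, german, extra = line.split("|")
    if pos == "noun":
        return Noun(english, german, gender=extra)
    if pos == "pronoun":
        return Pronoun(english, german)
    if pos == "article":
        return Article(english, german)
    if pos == "verb":
        return Verb(english, german)
    raise ValueError(f"unknown part of speech: {pos}")


print(build("noun|dog|Hund|m"))
print(build("verb|see|sehen|"))
```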

11 Development
- Correct article genders and cases
  - for article in pos x, check noun in pos x+1
  - check case of nearestModifier()
- Correct verb conjugations
  - for verb in pos x, check subject in pos x-1
  - search in verblist, find weak or strong
  - weak: follow conjugation pattern; strong: read in conjugations from list
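
Both corrections reduce to table look-ups once the case, gender, and subject are known. The Python sketch below uses the standard German definite-article and present-tense weak-verb endings; the function names and the single strong-verb entry are assumptions. The case is passed in directly, so the slide's `nearestModifier()` is not reproduced here, and weak stems that need an extra "e" (such as "arbeiten") are not handled.

```python
# Definite article by (case, gender); "pl" stands for plural.
DEFINITE_ARTICLE = {
    ("nominative", "m"): "der", ("nominative", "f"): "die",
    ("nominative", "n"): "das", ("nominative", "pl"): "die",
    ("accusative", "m"): "den", ("accusative", "f"): "die",
    ("accusative", "n"): "das", ("accusative", "pl"): "die",
    ("dative", "m"): "dem", ("dative", "f"): "der",
    ("dative", "n"): "dem", ("dative", "pl"): "den",
}

# Present-tense endings for weak verbs by (person, number).
WEAK_ENDINGS = {
    (1, "sg"): "e", (2, "sg"): "st", (3, "sg"): "t",
    (1, "pl"): "en", (2, "pl"): "t", (3, "pl"): "en",
}

# Strong verbs are read from a list instead of following the pattern.
STRONG_FORMS = {
    "sehen": {
        (1, "sg"): "sehe", (2, "sg"): "siehst", (3, "sg"): "sieht",
        (1, "pl"): "sehen", (2, "pl"): "seht", (3, "pl"): "sehen",
    },
}


def correct_article(case: str, gender: str) -> str:
    """Pick the definite article for the noun that follows it."""
    return DEFINITE_ARTICLE[(case, gender)]


def conjugate(infinitive: str, person: int, number: str) -> str:
    """Conjugate for the subject: list look-up if strong, pattern if weak."""
    if infinitive in STRONG_FORMS:
        return STRONG_FORMS[infinitive][(person, number)]
    stem = infinitive[:-2] if infinitive.endswith("en") else infinitive[:-1]
    return stem + WEAK_ENDINGS[(person, number)]


print(conjugate("sehen", 1, "sg"), correct_article("accusative", "m"))
```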

12 Development
- Correct pronoun cases
  - for pronoun in pos x, check verb or preposition in pos x-1
- append all corrected Strings together and display in text field
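
A minimal Python sketch of the pronoun step and the final assembly. The tables hold only a few standard German forms, and the names `GOVERNED_CASE`, `correct_pronoun`, and `assemble` are assumptions for illustration.

```python
from typing import List, Optional

PRONOUN_FORMS = {
    "ich": {"nominative": "ich", "accusative": "mich", "dative": "mir"},
    "er": {"nominative": "er", "accusative": "ihn", "dative": "ihm"},
}

# Case required by the verb or preposition in front of the pronoun.
GOVERNED_CASE = {
    "sehen": "accusative",
    "helfen": "dative",
    "mit": "dative",
    "für": "accusative",
}


def correct_pronoun(pronoun: str, previous: Optional[str]) -> str:
    """Put the pronoun at pos x into the case demanded by the word at pos x-1."""
    # With no governing word in front, the pronoun stays nominative.
    case = GOVERNED_CASE.get(previous, "nominative")
    return PRONOUN_FORMS[pronoun][case]


def assemble(words: List[str]) -> str:
    """Append all corrected strings together into one sentence."""
    return " ".join(words) + "."


print(correct_pronoun("ich", "helfen"))
print(assemble(["Ich", "sehe", "den", "Hund"]))
```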

13 Results/Conclusion
- I see the dog / Ich sehe den Hund.
  - correct for pronoun, present-tense verb conjugation, direct-object case correction
- The cats help the dogs / Die Katzen helfen den Hunden.
  - correct for nominative pluralizations, verb conjugation, and dative pluralization due to verb

14 Results/Conclusion
- The cats are the dogs / RUNTIME ERRORS
  - fails with irregular verbs in English, including “to be”
- The cats ate the pie / Die Katzen essen die Torte.
  - fails with past tense verbs
  - recognizes a past tense verb, but does not correct

15 Results/Conclusion
- succeeds in limited goals
- not practical or applicable in anything
- highly fragile: runtime errors for basically anything that doesn’t follow the exact form
- inefficient: list.txt with 53 words was 4KB; a list of 1,000,000 words would be 75.5MB
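
The 75.5MB figure appears to be a straight linear extrapolation from the 53-word file. The short Python check below reproduces it, assuming 4KB means 4,000 bytes, 1MB means 1,000,000 bytes, and the bytes per word stay constant; the function name is made up for this example.

```python
def projected_size_mb(sample_words: int, sample_bytes: int, target_words: int) -> float:
    """Scale a sample file size linearly to a larger word count."""
    bytes_per_word = sample_bytes / sample_words
    return bytes_per_word * target_words / 1_000_000


print(round(projected_size_mb(53, 4_000, 1_000_000), 1))
```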

