An Investigation of Statistical Machine Translation (Spanish to English) Raghav Bashyal.

1 An Investigation of Statistical Machine Translation (Spanish to English) Raghav Bashyal

2 SMT: Statistical Machine Translation. Possible only through computers. Global audience. Use of statistical techniques to produce natural translations.

3 Kevin Knight's Book. SMT has two parts. The second part, N-grams, is simple. The first part, the alignment portion, is difficult. After many long projects, I made my own algorithm.

4 Before that, an introduction to the characters. NLTK – simplifies the input of corpora. Corpora – hold text. N-Grams – sequences of n words; counting them gives the frequency of a phrase.
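As a rough illustration of what "the frequency of a phrase" means for two-word phrases, the sketch below counts bigrams in a tiny corpus. It uses plain Python rather than NLTK to stay self-contained; the function name `bigram_counts` and the toy sentences are assumptions made for this example, not material from the presentation.

```python
from collections import Counter

def bigram_counts(sentences):
    """Count adjacent word pairs (bigrams) across a list of sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.lower().split()
        # zip pairs each word with its successor: (w1, w2), (w2, w3), ...
        counts.update(zip(words, words[1:]))
    return counts

# Toy corpus, invented for illustration only.
toy_corpus = ["the monkey eats", "the monkey sleeps", "a monkey eats"]
counts = bigram_counts(toy_corpus)
```

Here `counts[("the", "monkey")]` is the number of times "the monkey" appears in the toy corpus.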

5 Algorithm
1. Match
a. Take a small Spanish input
b. Look through the corpus to find instances of the input
c. Collect the Spanish sentences in which this input was found, as well as the English translation right below each sentence
d. Compare the English sentences to discover similar words
e. Find the most common similar words and find permutations of them
2. Check
a. Gather bi-gram values for each permutation using the bigram calculator
b. Calculate the probabilities for each permutation with Knight’s formula
c. Return the most probable permutation as the most likely simple translation
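The Match and Check steps can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the presenter's implementation: the parallel corpus `PARALLEL` is invented, all function names are hypothetical, and because the slides do not spell out "Knight's formula", the sketch assumes it is the usual bigram language-model score (a product of conditional bigram probabilities), here with add-one smoothing. For brevity the bigram counts are taken from the matched English sentences only; a real system would more likely use a larger English corpus.

```python
from collections import Counter
from itertools import permutations

# Hypothetical parallel corpus: (Spanish sentence, English translation) pairs.
PARALLEL = [
    ("el mono come fruta", "the monkey eats fruit"),
    ("veo el mono", "i see the monkey"),
    ("el mono duerme", "the monkey sleeps"),
]

def match(phrase, corpus, top_n=2):
    """Step 1: gather the English sides of sentences containing the phrase,
    then pick the words shared by the most of those sentences."""
    padded = f" {phrase} "
    english = [en for es, en in corpus if padded in f" {es} "]
    # Count each word once per sentence so the tally reflects shared words.
    shared = Counter(w for en in english for w in set(en.split()))
    common = [w for w, _ in shared.most_common(top_n)]
    return english, common

def bigram_probability(words, bigrams, unigrams, vocab_size):
    """Assumed scoring: product of add-one smoothed P(cur | prev)."""
    prob = 1.0
    for prev, cur in zip(words, words[1:]):
        prob *= (bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size)
    return prob

def check(candidates, english):
    """Step 2: score every ordering of the candidate words, keep the best."""
    bigrams, unigrams = Counter(), Counter()
    for en in english:
        ws = en.split()
        unigrams.update(ws)
        bigrams.update(zip(ws, ws[1:]))
    vocab_size = len(unigrams)
    return max(
        permutations(candidates),
        key=lambda p: bigram_probability(p, bigrams, unigrams, vocab_size),
    )

def translate(phrase, corpus=PARALLEL):
    english, common = match(phrase, corpus)
    return " ".join(check(common, english))
```

With this toy corpus, `translate("el mono")` selects "the" and "monkey" as the shared words and prefers the ordering "the monkey", since that bigram occurs in the matched sentences and "monkey the" does not.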

6 Development. Simple – goal was to translate. Corpora – functional “cosas” and “monkey”

7 Results. It works! “el mono” = “the monkey”. Deeper understanding of SMT’s power (Google Translate). Expand, elaborate upon algorithm.

