KantanNeural™ LQR Experiment
Evolution of Machine Translation
- Rule-Based MT (1970)
- Phrase-Based Statistical MT (2002)
- Neural MT – the emergence of AI (2016)
(Timeline chart: translation quality rising with each generation.)
Scientific Rigour: Experiment Setup
- Identical training data sets
- Identical test and reference sets
- Automated scores used: F-Measure, TER, BLEU (see the scoring sketch below)
- Native-speaking, professional reviewers
- NMT: KantanNeural™ engines, GPU processors
- SMT: KantanMT engines, CPU processors
- Translation evaluation: KantanLQR™
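Since the comparison leans on these automated metrics, a minimal scoring sketch may help. The snippet below uses the sacrebleu library for BLEU and TER, with a simple unigram F1 standing in for KantanMT's F-Measure (whose exact definition is not given here); the sample sentences and the F1 approximation are illustrative assumptions, not the experiment's actual pipeline.

```python
# Minimal scoring sketch: BLEU and TER via sacrebleu, plus a simple
# unigram F1 as a stand-in for KantanMT's F-Measure (the exact
# definition is not published here, so this approximation is assumed).
from collections import Counter

import sacrebleu  # pip install sacrebleu


def unigram_f1(hypothesis: str, reference: str) -> float:
    """Harmonic mean of unigram precision and recall (illustrative only)."""
    hyp, ref = Counter(hypothesis.split()), Counter(reference.split())
    overlap = sum((hyp & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical engine output scored against the shared reference set.
hypotheses = ["the contract shall terminate on 31 december"]
references = [["the contract terminates on 31 december"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
ter = sacrebleu.corpus_ter(hypotheses, references)
f1 = unigram_f1(hypotheses[0], references[0][0])
print(f"BLEU={bleu.score:.2f}  TER={ter.score:.2f}  F-Measure~{100 * f1:.2f}")
```

Both engines are scored against the same reference set, which is what makes the per-arc numbers in the tables that follow directly comparable.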
Training Corpora

Language Arc                  | Parallel Sentences | TWC         | UWC     | Domain(s)
English->German               | 8,820,562          | 110,150,238 | 859,167 | Legal/Medical
English->Chinese (Simplified) | 6,522,064          | 84,426,931  | 956,864 | Legal/Technical
English->Japanese             | 8,545,366          | 87,252,129  | 676,244 | -
English->Italian              | 2,756,185          | 35,295,535  | 765,930 | Medical
English->Spanish              | 3,681,332          | 44,917,538  | 952,089 | Legal
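TWC and UWC are presumably total and unique word counts of the training corpus. Below is a sketch of how such statistics could be derived from one side of a parallel corpus; the file name and whitespace tokenisation are assumptions (a real pipeline would tokenise per language, which matters especially for Chinese and Japanese).

```python
# Sketch: corpus statistics like those in the table above -- parallel
# sentence count, total word count (TWC) and unique word count (UWC).
# File name and whitespace tokenisation are illustrative assumptions.

def corpus_stats(path: str) -> tuple[int, int, int]:
    sentences, total_words, vocab = 0, 0, set()
    with open(path, encoding="utf-8") as f:
        for line in f:
            tokens = line.split()
            if not tokens:
                continue
            sentences += 1
            total_words += len(tokens)
            vocab.update(tokens)
    return sentences, total_words, len(vocab)


if __name__ == "__main__":
    n_sent, twc, uwc = corpus_stats("train.en")  # hypothetical file
    print(f"Parallel sentences: {n_sent:,}  TWC: {twc:,}  UWC: {uwc:,}")
```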
Training: Time

Language Arc                  | SMT (hours) | NMT (hours)
English->German               | 18          | 92
English->Chinese (Simplified) | 6           | 10
English->Japanese             | 9           | 68
English->Italian              | 8           | 83
English->Spanish              | -           | 71
Training: Automated Scores

SMT
Language Arc                  | F-Measure | BLEU  | TER   | Time (hours)
English->German               | 62.00     | 54.08 | 54.31 | 18
English->Chinese (Simplified) | 77.16     | 45.36 | 46.85 | 6
English->Japanese             | 80.04     | 63.27 | 43.77 | 9
English->Italian              | 69.74     | 56.98 | 42.54 | 8
English->Spanish              | 71.53     | 54.78 | 41.87 | -

NMT
Language Arc                  | F-Measure | BLEU  | TER   | Perplexity | Time (hours)
English->German               | 62.53     | 47.53 | 53.41 | 3.02       | 92
English->Chinese (Simplified) | 71.85     | 39.39 | 47.01 | 2.00       | 10
English->Japanese             | 69.51     | 40.55 | 49.46 | 1.89       | 68
English->Italian              | 64.88     | 42.00 | 48.73 | 2.70       | 83
English->Spanish              | 69.41     | 49.24 | 44.89 | 2.59       | 71
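Perplexity is reported for the NMT engines only; it is presumably per-word model perplexity, i.e. the exponentiated average negative log-likelihood of held-out tokens, as sketched below.

```latex
% Per-word perplexity of a model p over N held-out tokens
\mathrm{PPL} = \exp\!\left( -\frac{1}{N} \sum_{i=1}^{N} \log p(w_i \mid w_{<i}) \right)
```

Lower is better: a perplexity of 2-3, as in the table, means the model is on average about as uncertain as a choice among two to three equally likely next words. Note also the pattern that motivates the human evaluation: NMT trails SMT on BLEU for every arc, which the conclusions revisit.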
Professional Reviewers
Ranking
(A/B ranking results charts.)
Conclusions
- Comparative study of identical SMT and NMT engines in a commercial setting
- Identical training scenarios
- Compared automated scores
- Conducted A/B testing
- Analysed results
Conclusions: Translation Quality
- As determined by native-speaking, professional translators, NMT ranked higher than SMT in all cases.
- We observed that 48% of NMT translations with lower BLEU scores were nevertheless ranked higher than their higher-BLEU-scoring SMT counterparts (see the sketch after this list).
- Automated scores: BLEU is susceptible to under-estimating NMT quality.
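The 48% figure is a discordance rate between BLEU and human ranking. Below is a sketch of how such a rate could be computed from paired segment results; the record layout is a hypothetical stand-in, not KantanLQR™'s actual export format.

```python
# Sketch: share of segments where the NMT output scored lower on BLEU
# than the SMT output yet was ranked higher by the human reviewers.
# The record layout is a hypothetical stand-in for a KantanLQR export.

records = [
    # (nmt_bleu, smt_bleu, nmt_rank, smt_rank) -- rank 1 = best
    (41.2, 55.0, 1, 2),
    (38.7, 52.3, 2, 1),
    (44.1, 50.9, 1, 2),
]

# Segments where NMT had the lower BLEU score...
lower_bleu_nmt = [r for r in records if r[0] < r[1]]
# ...but the human reviewers still preferred the NMT output.
discordant = [r for r in lower_bleu_nmt if r[2] < r[3]]

rate = 100 * len(discordant) / len(lower_bleu_nmt)
print(f"{rate:.0f}% of lower-BLEU NMT segments out-ranked their SMT counterparts")
```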
Thank You