Tri-gram + LanguageTool


1 Tri-gram + LanguageTool
Manually tuned results
David Ling

2 Tri-gram + LanguageTool: HSMC script 1
Highlights:
Green: flagged by the tri-gram detector
Blue: flagged by LanguageTool
More errors are recalled when the two detectors are combined.
LanguageTool is better on SVG.
The tri-gram detector is better on word usage.
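The combination described on this slide can be pictured as a union of the positions each detector flags. The sketch below is only an illustration: `combine_flags` and the token positions are hypothetical, and it does not call LanguageTool or reproduce the author's detector.

```python
def combine_flags(trigram_flags, languagetool_flags):
    """Union of flagged token positions, recording which detector flagged each.

    Both arguments are iterables of token positions (hypothetical inputs).
    """
    combined = {}
    for pos in trigram_flags:
        combined.setdefault(pos, set()).add("trigram")        # shown in green
    for pos in languagetool_flags:
        combined.setdefault(pos, set()).add("languagetool")   # shown in blue
    return dict(sorted(combined.items()))


# Made-up positions: the union covers more positions than either detector alone.
flags = combine_flags({3, 7}, {7, 12})
```

Taking the union raises recall at the cost of also inheriting the false positives of both detectors.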

3 Tri-gram + LanguageTool: HSMC script 1
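Four kinds of scores are used in tri-gram detection:
Normalized frequency: $s_2 = \frac{\mathrm{count}(x_1, x_2, x_3)}{\mathrm{count}(x_1)\,\mathrm{count}(x_2)\,\mathrm{count}(x_3)}$
Interpolation of normalized frequency: $s_3 = \lambda_1 s_2 + \lambda_2 \frac{p(x_1)\,p(x_2 \mid x_1)\,p(x_3 \mid x_1)}{\mathrm{count}(x_1)\,\mathrm{count}(x_2)\,\mathrm{count}(x_3)}$
Verb inflection when the tri-gram frequency is low: $s_4 = \frac{\mathrm{count}(x_1, x_2', x_3)}{\mathrm{count}(x_1, x_2, x_3)}$, where $x_2'$ is another inflection of the verb $x_2$
Product of s2 and s4: $s_5 = \frac{1}{s_2} \times s_4$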
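A minimal sketch of how the count-based scores might be computed from raw counts. The toy corpus, the function names, and the handling of zero counts are assumptions made for illustration; s3 is left out because it needs probability estimates and weights that the slides do not specify.

```python
from collections import Counter


def count_ngrams(tokens):
    """Return unigram and tri-gram counts for a token list."""
    unigrams = Counter(tokens)
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    return unigrams, trigrams


def s2(tri, unigrams, trigrams):
    """Normalized frequency: tri-gram count over the product of unigram counts."""
    x1, x2, x3 = tri
    denom = unigrams[x1] * unigrams[x2] * unigrams[x3]
    return trigrams[tri] / denom if denom else 0.0


def s4(tri, alt_middle, trigrams):
    """Count with the middle word replaced by another inflection, over the observed count."""
    x1, _, x3 = tri
    observed = trigrams[tri]
    if observed == 0:
        return None  # assumption: the ratio is undefined for unseen tri-grams
    return trigrams[(x1, alt_middle, x3)] / observed


def s5(tri, alt_middle, unigrams, trigrams):
    """(1 / s2) * s4; larger values suggest the alternative inflection fits better."""
    a = s2(tri, unigrams, trigrams)
    b = s4(tri, alt_middle, trigrams)
    if not a or b is None:
        return None
    return b / a


# Toy corpus (made up): "has done it" appears twice, "has did it" once.
tokens = "he has done it . he has done it . he has did it .".split()
unigrams, trigrams = count_ngrams(tokens)

s5_did = s5(("has", "did", "it"), "done", unigrams, trigrams)
s5_done = s5(("has", "done", "it"), "did", unigrams, trigrams)
```

In this toy corpus both tri-grams get the same s2, because "did" is itself rarer than "done"; only the inflection ratio s4 separates them, so `s5_did` comes out larger than `s5_done`.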

4 Tri-gram + LanguageTool: HSMC script 2
A problem with normalized scores: they fail to recall errors caused by misspelled or uncommon words.
For example, “with common asker”: $nf = \frac{\mathrm{freq}(\text{with common asker})}{\mathrm{freq}(\text{with})\,\mathrm{freq}(\text{common})\,\mathrm{freq}(\text{asker})}$
As freq(asker) is small, nf will be large.
Since “asker” is infrequent, it is reasonable for a tri-gram containing “asker” to be infrequent too => no error is flagged.
However, this can avoid false positives on uncommon words, e.g. special names.
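The effect can be seen with a small numeric sketch. All counts below are invented for illustration, and "with common sense" is a hypothetical comparison tri-gram that does not appear in the slides.

```python
def normalized_frequency(tri_count, c1, c2, c3):
    """Tri-gram count divided by the product of its three unigram counts."""
    return tri_count / (c1 * c2 * c3)


# Hypothetical counts: "with" = 1,000,000 and "common" = 200,000 in both cases.
# "with common sense": seen 50 times, "sense" is frequent (100,000).
nf_sense = normalized_frequency(50, 1_000_000, 200_000, 100_000)
# "with common asker": seen once, "asker" is rare (20).
nf_asker = normalized_frequency(1, 1_000_000, 200_000, 20)

# The rare third word shrinks the denominator, so the odd tri-gram scores higher
# and would fall on the "no error" side of any threshold that accepts nf_sense.
print(nf_sense, nf_asker)
```

With these made-up numbers the tri-gram seen only once still gets the larger normalized score, which is the recall problem the slide describes.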

5 Tri-gram + LanguageTool: HSMC script 3

6 Tri-gram + LanguageTool: HSMC script 4

7 LanguageTool’s neural network resolves done/did correctly here

