An Improved Hierarchical Word Sequence Language Model Using Word Association. 2015.11.26, Nara Institute of Science and Technology. Xiaoyi Wu, Yuji Matsumoto.


1 An Improved Hierarchical Word Sequence Language Model Using Word Association. 2015.11.26, Nara Institute of Science and Technology. Xiaoyi Wu, Yuji Matsumoto, Kevin Duh, Hiroyuki Shindo

2 Motivation: Continuous language models suffer from data sparsity. Because they learn only continuous sequences, statistics from a learned sequence such as "a man" cannot help estimate an unseen but closely related sequence such as "a selfish man".

3 Motivation: The training data "a selfish man" contains the discontinuous sequence "a ... man", but a continuous model cannot exploit it, so estimating P(a man) requires smoothing techniques. Even with 30 years' worth of newswire text, 1/3 of trigrams remain unseen (Allison et al., 2005).

4 HWS language model: For "as soon as possible", an n-gram model treats the phrase as a continuous, utterance-oriented sequence, while the HWS model treats it as a discontinuous, pattern-oriented structure.

5 Basic Idea of HWS: Patterns are discontinuous (a sentence is divided into several sections by patterns) – e.g. "x is a y of z". Patterns are hierarchical – "x is a y of z" → "x is y of z" → "x is z". Words are generated from certain positions of patterns (words depend on patterns).

6 Basic Idea of HWS: For the sentence "Tom is a boy of nine", the HWS structure is discontinuous and hierarchical, and each word depends on its pattern.

7 Proposed Approach (Frequency-based HWS Model): For the corpus sentence "Mrs. Allen is a senior editor of Insight magazine", the most frequent word, "of", splits the sentence into "Mrs. Allen is a senior editor" and "Insight magazine".

8 Proposed Approach (Frequency-based HWS Model): Each sub-sequence is then split recursively at its own most frequent word ("is" on the left side, "magazine" on the right side), leaving "Mrs. Allen", "a senior editor", and "Insight".

9 Proposed Approach (Original HWS Model): The same sentence is decomposed recursively into an HWS tree with "of" at the root and the remaining words ("is", "magazine", "Mrs.", "a", "Insight", "Allen", "editor", "senior") at lower levels.

10 Proposed Approach (Original HWS Model): From the HWS tree of "Mrs. Allen is a senior editor of Insight magazine", the following bigram tuples are extracted: ($, of), (of, a), (a, is), (is, Mrs.), (Mrs., Allen), (a, senior), (senior, editor), (of, magazine), (magazine, insight)
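The recursive split-at-the-most-frequent-word construction above can be sketched as follows. This is a minimal illustration, not the authors' implementation; the corpus frequencies in `freq` are hypothetical values chosen only so that the toy run reproduces the tuples listed on the slide.

```python
def hws_bigrams(words, freq, parent="$"):
    """Extract HWS bigram tuples (parent, child) from one sentence.

    The most frequent word in a span becomes the root of that span;
    the span is then split at the root and both halves are processed
    recursively, with the root as their parent.
    """
    if not words:
        return []
    i = max(range(len(words)), key=lambda j: freq[words[j]])
    root = words[i]
    return ([(parent, root)]
            + hws_bigrams(words[:i], freq, parent=root)
            + hws_bigrams(words[i + 1:], freq, parent=root))

# Hypothetical corpus frequencies (illustration only).
freq = {"of": 100, "a": 90, "is": 80, "magazine": 40, "insight": 30,
        "mrs.": 20, "senior": 15, "editor": 12, "allen": 10}
sentence = "mrs. allen is a senior editor of insight magazine".split()
print(hws_bigrams(sentence, freq))
# → [('$', 'of'), ('of', 'a'), ('a', 'is'), ('is', 'mrs.'), ('mrs.', 'allen'),
#    ('a', 'senior'), ('senior', 'editor'), ('of', 'magazine'), ('magazine', 'insight')]
```

Note that the extracted pairs are exactly the (parent, child) edges of the HWS tree, so an n-size HWS model conditions each word on its ancestors rather than on its linear neighbors.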

11 Advantage of HWS: Discontinuity. For "as soon as possible", an n-gram model captures only the continuous word sequence, while HWS captures the discontinuous pattern "as ... as".

12 Word Association Based HWS: For "too much to handle", the frequency-based HWS selects root words by raw frequency, while the word-association-based HWS selects them by a word association score.
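The association score used in the experiments later in the talk is the Dice coefficient, Dice(x, y) = 2·C(x, y) / (C(x) + C(y)). A minimal sketch, with made-up counts purely for illustration:

```python
from collections import Counter

def dice(x, y, unigram, pair):
    """Dice coefficient between words x and y:
    2 * C(x, y) / (C(x) + C(y))."""
    denom = unigram[x] + unigram[y]
    return 2.0 * pair[(x, y)] / denom if denom else 0.0

# Hypothetical counts (illustration only).
unigram = Counter({"too": 50, "much": 40, "to": 200, "handle": 10})
pair = Counter({("too", "much"): 30, ("to", "handle"): 8})
print(round(dice("too", "much", unigram, pair), 3))  # → 0.667
```

Unlike raw frequency, the score is high only when the two words co-occur much more often than chance, which is why it favors collocations such as "too much" over merely frequent words such as "to".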

13 Extra Techniques 1/2: Directionalization. For "as soon as possible", an n-gram model uses one-side generation: ($, $, as), ($, as, soon), (as, soon, as), (soon, as, possible). HWS uses double-side generation: ($, $, as), ($, as, as), (as, as, soon), (as, as, possible).
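The double-side HWS trigrams above fall out of the tree construction by conditioning each node on its two nearest ancestors. This is a sketch under the same hypothetical-frequency assumption as before (ties at the maximum go to the leftmost word):

```python
def hws_trigrams(words, freq, ctx=("$", "$")):
    """Double-side HWS generation: each node is predicted from its
    two nearest ancestors in the HWS tree; the root's context is ($, $)."""
    if not words:
        return []
    i = max(range(len(words)), key=lambda j: freq[words[j]])
    root = words[i]
    child_ctx = (ctx[1], root)  # shift context: grandparent drops out
    return ([ctx + (root,)]
            + hws_trigrams(words[:i], freq, child_ctx)
            + hws_trigrams(words[i + 1:], freq, child_ctx))

# Hypothetical frequencies; "as" is the most frequent word (illustration only).
freq = {"as": 10, "soon": 2, "possible": 1}
print(hws_trigrams("as soon as possible".split(), freq))
# → [('$', '$', 'as'), ('$', 'as', 'as'), ('as', 'as', 'soon'), ('as', 'as', 'possible')]
```

Note how both "soon" and "possible" share the context (as, as): the discontinuous pattern "as ... as" is preserved even though the words between its two halves vary.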

14 Extra Techniques 1/2: Directionalization. Since HWS generates words on both sides of a root, each context word is labeled with the direction of generation (L or R): ($, as, as), (as, as, soon), (as, as, possible) becomes ($-R, as-R, as), (as-R, as-L, soon), (as-R, as-R, possible).

15 Extra Techniques 2/2: Unification. When constructing an HWS structure, each word in a sentence (e.g. a repeated "the") is counted only once.

16 Intrinsic Experiments
Training data – British National Corpus (449,755 sentences, 10 million words)
Test data – English Gigaword Corpus (44,702 sentences, 1 million words)
Preprocessing – NLTK tokenizer, lowercasing
Word association score – Dice coefficient
Smoothing methods – MKN (Modified Kneser-Ney) (Chen & Goodman, 1999); GLM (Generalized Language Model) (Pickhardt et al., 2014)
Evaluation measures – Perplexity; Coverage (|TR∩TE| / |TE|); Usage (|TR∩TE| / |TR|)
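The coverage and usage measures above compare the set of sequence types extracted from the training data (TR) with those extracted from the test data (TE). A minimal sketch with made-up trigram types (illustration only):

```python
def coverage_and_usage(train_seqs, test_seqs):
    """Coverage = |TR ∩ TE| / |TE|; Usage = |TR ∩ TE| / |TR|,
    where TR and TE are the sets of distinct sequences (types)
    extracted from the training and test data respectively."""
    tr, te = set(train_seqs), set(test_seqs)
    shared = tr & te
    return len(shared) / len(te), len(shared) / len(tr)

# Toy example with hypothetical trigram types.
tr = [("a", "b", "c"), ("b", "c", "d"), ("c", "d", "e"), ("x", "y", "z")]
te = [("b", "c", "d"), ("c", "d", "e"), ("p", "q", "r")]
coverage, usage = coverage_and_usage(tr, te)
print(coverage, usage)  # 2 of 3 test types covered; 2 of 4 training types used
```

High coverage means few unseen test sequences; high usage means the model's learned sequences are not wasted, so together they indicate how well the extracted sequence types generalize.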

17 Evaluation

18 Extrinsic Experimental Settings
Training data – TED Talks French-English parallel corpus (139,761 sentence pairs)
Test data – TED Talks French-English parallel corpus (1,617 sentence pairs)
Translation toolkit – Moses
Evaluation measures – BLEU (Papineni et al., 2002); METEOR (Banerjee & Lavie, 2005); TER (Snover et al., 2006)

19 Extrinsic Evaluation

20 Conclusions: We proposed an improved hierarchical word sequence language model using word association, together with two extra techniques (directionalization and unification). The proposed model models natural language more precisely than the original FB-HWS and performs better in both intrinsic and extrinsic experiments. Source code is available at https://github.com/aisophie/HWS

21 Thank you for your attention!

