
1 Compact WFSA-based Language Model and Its Application in Statistical Machine Translation
Xiaoyin Fu, Wei Wei, Shixiang Lu, Dengfeng Ke, Bo Xu
Interactive Digital Media Technology Research Center, CASIA

2 Outline: Task, Problems, Solution, Our Approach, Results, Conclusion

3 Outline: Task, Problems, Solution, Our Approach, Results, Conclusion

4 Task: N-gram Language Model
– assigns probabilities to strings of words or tokens
– let w_L denote a string of L tokens over a fixed vocabulary
– smoothing techniques: back-off
– Define: the back-off probability (a standard form is sketched below)
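
The "Define" item above pointed at an equation that was an image on the original slide and is missing from this transcript. As a hedged placeholder, the standard back-off definition such models use is shown below (a reconstruction, not necessarily the slide's exact notation): P̂ is the discounted probability, c(·) an n-gram count, and α(·) the back-off weight.

```latex
% Standard back-off n-gram probability (reconstruction; the slide's own
% equation did not survive the transcript).
P\bigl(w_i \mid w_{i-n+1}^{\,i-1}\bigr) =
\begin{cases}
  \hat{P}\bigl(w_i \mid w_{i-n+1}^{\,i-1}\bigr), & c\bigl(w_{i-n+1}^{\,i}\bigr) > 0,\\[4pt]
  \alpha\bigl(w_{i-n+1}^{\,i-1}\bigr)\, P\bigl(w_i \mid w_{i-n+2}^{\,i-1}\bigr), & \text{otherwise.}
\end{cases}
```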

5 Outline: Task, Problems, Solution, Our Approach, Results, Conclusion

6 Problems: querying in the trie structure leads to useless queries
– problems in the forward query
– problems in the back-off query
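
To make the problems concrete, here is a minimal sketch (hypothetical trie layout and names, not the paper's code) of the conventional per-n-gram back-off query: every call re-walks the context prefix from the root, and every miss shortens the context and walks it again, which is where the useless work comes from.

```python
# Minimal sketch of a conventional back-off query over a trie (hypothetical
# layout, for illustration only).

class TrieNode:
    def __init__(self, logprob=float("-inf"), backoff=0.0):
        self.logprob = logprob    # log P of the n-gram ending at this node
        self.backoff = backoff    # back-off weight of this context
        self.children = {}        # next word -> TrieNode

def query(root, context, word):
    """Score log P(word | context) with standard back-off. Every call
    restarts at the root and re-walks the prefix; every miss shortens the
    context and walks it yet again (the repeated, useless queries)."""
    penalty = 0.0
    while True:
        node = root
        for w in context:               # forward query: walk the prefix again
            node = node.children.get(w)
            if node is None:
                break
        if node is not None and word in node.children:
            return penalty + node.children[word].logprob
        if node is not None:
            penalty += node.backoff     # accumulate the back-off weight
        if not context:                 # reached the unigram level
            return penalty + root.children.get(word, TrieNode()).logprob
        context = context[1:]           # back-off query: drop the oldest word
```

The stateful WFSA view described on the following slides avoids exactly this restart-from-the-root pattern.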

7 Outline: Task, Problems, Solution, Our Approach, Results, Conclusion

8 Solution: another point of view
– view querying as a continuous process rather than a random procedure
– Benefit: speeds up the forward query and the back-off query
– Goal: fast and compact

9 Outline: Task, Problems, Solution, Our Approach, Results, Conclusion

10 Our Approaches: Fast WFSA – Definition
A 5-tuple M = (Q, Σ, I, F, δ):
– Q: a set of states
– I: a set of initial states
– F: a set of final states
– Σ: an alphabet representing the input and output labels
– δ ⊆ Q × (Σ ∪ {ε}) × Q: a transition relation

11 Our Approaches: Fast WFSA – Example
A 5-tuple M = (Q, Σ, I, F, δ):
– Q: a set of states
– I: a set of initial states
– F: a set of final states
– Σ: an alphabet representing the input and output labels
– δ ⊆ Q × (Σ ∪ {ε}) × Q: a transition relation
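
A minimal sketch of the 5-tuple as a data structure, with a weight attached to each transition; the class and field names are my assumptions for illustration, not the paper's implementation.

```python
# Sketch of the 5-tuple M = (Q, Sigma, I, F, delta) with weighted transitions.
from dataclasses import dataclass, field

EPSILON = None  # label of a spontaneous (input-free) transition

@dataclass
class WFSA:
    Q: set             # states
    Sigma: set         # alphabet of input/output labels
    I: set             # initial states
    F: set             # final states
    # delta maps (state, label) to a list of (next_state, weight);
    # label may be EPSILON for spontaneous transitions.
    delta: dict = field(default_factory=dict)

    def step(self, state, label):
        """Weighted successors of `state` on `label`."""
        return self.delta.get((state, label), [])

# Toy example: two states and one weighted transition on the word "the".
m = WFSA(Q={0, 1}, Sigma={"the"}, I={0}, F={1}, delta={(0, "the"): [(1, -0.3)]})
print(m.step(0, "the"))   # [(1, -0.3)]
```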

12 Our Approaches: Compact Trie – Sorted Array

13 Our Approaches: Compact Trie – Sorted Array – Link Index
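
A sketch of the "sorted array" plus "link index" idea: the children of every node are stored contiguously and sorted by word id, the link index records where each node's child range begins, and a child is found by binary search instead of pointer chasing. The array contents and names below are toy assumptions, not the paper's layout.

```python
# Compact trie as flat, sorted arrays (toy data for illustration).
import bisect

words      = [0, 1, 3, 7, 2, 5]   # word id stored at each node; each sibling range is sorted
child_base = [1, 4, 6, 6, 6, 6]   # link index: first child of node i
child_end  = [4, 6, 6, 6, 6, 6]   # one past the last child of node i

def find_child(node, word_id):
    """Binary-search the sorted sibling range of `node` for `word_id`."""
    lo, hi = child_base[node], child_end[node]
    pos = bisect.bisect_left(words, word_id, lo, hi)
    if pos < hi and words[pos] == word_id:
        return pos            # node id of the child
    return None               # not found: the caller backs off

print(find_child(0, 3))       # 2
print(find_child(0, 4))       # None
```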

14 Our Approaches: WFSA-based LM – Trie structure
Note:
– T_f is triggered by an input word (it corresponds to a forward query) and moves toward the leaves
– T_b is triggered spontaneously, without any input, and carries out back-off queries
Mapping to the WFSA:
– Q: the nodes of the trie
– I: the root of the trie
– F: every node of the trie except the root
– Σ: the alphabet of the input sentences
– δ: the forward transitions T_f and the roll-back transitions T_b

15 Our Approaches: WFSA-based LM

16 Our Approaches: WFSA-based LM – node fields: Probability, Back-off, Index; plus a Roll-back index

17 Our Approaches: WFSA-based LM – node fields: Probability, Back-off, Index, Roll-back index – Cross layer
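
Putting the fields named on slides 15-17 together, a hypothetical per-node record could look like the sketch below; the paper's actual binary layout is not given in this transcript.

```python
# Hypothetical per-node record for the WFSA-based LM (field names follow the
# slide text; sizes and packing are not specified in this transcript).
from dataclasses import dataclass

@dataclass
class WfsaNode:
    prob: float         # log probability of the n-gram ending at this node
    backoff: float      # back-off weight of this context
    child_index: int    # where this node's children start (forward transition T_f)
    rollback: int       # roll-back index: target of T_b, possibly in another
                        # trie layer ("cross layer"), so a failed forward query
                        # jumps straight to the back-off context
```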

18-26 Our Approaches: Query Method
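
The Query Method slides above are figure-only in this transcript. The sketch below reconstructs, from the T_f / T_b description on slide 14 and the node record above, how a stateful query could proceed: keep the reached state between consecutive words, follow a forward transition when the next word is a child, and otherwise take roll-back transitions while accumulating back-off weights. Names and data layout are assumptions for illustration.

```python
# Stateful WFSA-style query (reconstruction, not the paper's code).
from collections import namedtuple

Node = namedtuple("Node", "prob backoff rollback")  # mirrors the record sketch above
UNK_LOGPROB = -99.0                                  # placeholder for unknown words

def score(nodes, children, state, word):
    """Return (log P(word | history), next_state). The caller keeps next_state
    for the following word, so consecutive queries form a continuous process
    instead of independent walks from the root."""
    penalty = 0.0
    while True:
        nxt = children.get(state, {}).get(word)
        if nxt is not None:                   # forward transition T_f
            return penalty + nodes[nxt].prob, nxt
        if nodes[state].rollback == state:    # at the root and still no match
            return penalty + UNK_LOGPROB, state
        penalty += nodes[state].backoff       # spontaneous roll-back T_b
        state = nodes[state].rollback

# Toy usage: state 0 is the root, state 1 is the unigram "the".
nodes = {0: Node(0.0, 0.0, 0), 1: Node(-1.2, -0.5, 0)}
children = {0: {"the": 1}}
print(score(nodes, children, 0, "the"))   # (-1.2, 1)
print(score(nodes, children, 1, "cat"))   # rolls back to the root, scores as unknown
```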

27 Our Approaches: State Transitions

28 Our Approaches: Query LM

29 Our Approaches: For HPB SMT
– For a source sentence there is a huge number of LM queries (tens of millions)
– Most of these queries are repetitive
– Remedy: a hash cache

30 Our Approaches: For HPB SMT – Hash cache
– Small and fast: 24-bit hash, 16M entries
– Simple operations: additive and bitwise operations
– Hash clear: the cache is cleared for each sentence
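
A minimal sketch of such a per-sentence cache, with the 24-bit hash, 16M-entry table, and additive plus bitwise operations mentioned on the slide; the exact hash function and table layout are assumptions, not the paper's code (a real implementation would use flat C arrays rather than Python lists).

```python
# Per-sentence n-gram query cache: 24-bit hash, 16M entries, lossy on collision.

HASH_BITS = 24
TABLE_SIZE = 1 << HASH_BITS          # 16M entries
MASK = TABLE_SIZE - 1

def ngram_hash(word_ids):
    """Cheap hash of an n-gram using additive and bitwise steps, folded to 24 bits."""
    h = 0
    for w in word_ids:
        h = ((h << 5) + h + w) & 0xFFFFFFFF   # additive step (h*33 + w via shift-and-add)
        h ^= h >> 13                           # bitwise mixing step
    return h & MASK

class QueryCache:
    def __init__(self):
        self.keys = [None] * TABLE_SIZE
        self.vals = [0.0] * TABLE_SIZE

    def clear(self):
        """Reset the cache; called once per source sentence."""
        self.keys = [None] * TABLE_SIZE

    def lookup(self, ngram):
        i = ngram_hash(ngram)
        if self.keys[i] == ngram:
            return self.vals[i]          # repetitive query served from the cache
        return None

    def store(self, ngram, logprob):
        i = ngram_hash(ngram)
        self.keys[i] = ngram             # overwrite on collision: the cache is lossy
        self.vals[i] = logprob
```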

31 Outline: Task, Problems, Solution, Our Approach, Results, Conclusion

32 Results: Setup
LM toolkit: SRILM
Decoder: hierarchical phrase-based translation system
Test data: IWSLT-07 (489 sentences) and NIST-06 (1664 sentences)
Training data:
Tasks      Model    Parallel sentences   Chinese words   English words
IWSLT-07   TM [1]   0.38M                3.0M            3.1M
           LM [2]   1.3M                 ——              15.2M
NIST-06    TM [3]   3.4M                 64M             70M
           LM [4]   14.3M                ——              377M
[1] The parallel corpus of BTEC (Basic Traveling Expression Corpus) and CJK (China-Japan-Korea corpus)
[2] The English corpus of BTEC + CJK + CWMT2008
[3] LDC2002E18, LDC2002T01, LDC2003E07, LDC2003E14, LDC2003T17, LDC2004T07, LDC2004T08, LDC2005T06, LDC2005T10, LDC2005T34, LDC2006T04, LDC2007T09
[4] LDC2007T07

33 Results: Storage Space
– The storage size increases by about 35%
– The increase is linear in the number of trie nodes
– This is acceptable
Comparison of LM size between SRILM and WFSA:
Tasks      n-grams   SRILM (MB)   WFSA (MB)   Δ (%)
IWSLT-07   4         65.7         89.1        35.6
           5         89.8         119.5       33.1
NIST-06    4         860.3        1190.4      38.4
           5         998.5        1339.7      34.2

34 Results: Query Speed
– WFSA: query time reduced by about 60% for 4-grams and 70% for 5-grams
– WFSA+cache: speeds up queries by about 75%
n-grams   Method        IWSLT-07 (s)   NIST-06 (s)
4         SRILM         163            15433
          WFSA          70             6251
          WFSA+cache    42             3907
5         SRILM         261            25172
          WFSA          85             7944
          WFSA+cache    59             6128

35 Results: Analysis
Repetitive queries and back-off queries in SMT (4-gram):
– back-off queries are widespread
– most of the queries are repetitive
– so the WFSA-based LM can speed up queries effectively
Tasks      Back-off   Repetitive
IWSLT-07   60.5%      95.5%
NIST-06    60.3%      96.4%

36 Outline: Task, Problems, Solution, Our Approach, Results, Conclusion

37 Conclusion
– A faster WFSA-based LM: faster forward queries and faster back-off queries
– A compact WFSA-based LM: trie structure
– A simple caching technique for the SMT system
– Other fields: speech recognition, information retrieval

38 Thanks!

