
1 Japanese Dependency Analysis using Cascaded Chunking Taku Kudo 工藤 拓 Yuji Matsumoto 松本 裕治 Nara Institute of Science and Technology, JAPAN

2 Motivation
Kudo, Matsumoto 2000 (VLC):
- Presented a state-of-the-art Japanese dependency parser using SVMs (89.09% on the standard dataset)
- Demonstrated the high generalization performance and feature-selection ability of SVMs
Problems:
- Not scalable: 2 weeks of training on 7,958 sentences; hard to train with larger data
- Slow in parsing: 2-3 sec./sentence, too slow for actual NL applications

3 Goal
Improve the scalability and the parsing efficiency without losing accuracy!
How?
- Apply the cascaded chunking model to dependency parsing and to the selection of training examples
- Reduce the number of times SVMs are consulted during parsing
- Reduce the number of negative examples learned

4 Outline
- Japanese dependency analysis
- Two models: probabilistic model (previous) and cascaded chunking model (new!)
- Features used for training and classification
- Experiments and results
- Conclusion and future work

5 Japanese Dependency Analysis (1/2)
Analysis of relationships between phrasal units called bunsetsu (segments), comparable to base phrases in English
Two constraints:
- Each segment modifies one of the segments to its right (Japanese is a head-final language)
- Dependencies do not cross each other
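
A minimal sketch (an assumed helper, not from the slides) that checks a candidate analysis against these two constraints:

```python
# heads[i] is the index of the segment that segment i modifies;
# the final segment has no head (None).
def is_valid(heads):
    n = len(heads)
    if heads[n - 1] is not None:
        return False                      # the final segment modifies nothing
    for i in range(n - 1):
        h = heads[i]
        if h is None or not (i < h <= n - 1):
            return False                  # must modify a segment to its right
    for i in range(n - 1):                # dependencies must not cross:
        for j in range(i + 1, n - 1):     # i < j < heads[i] < heads[j] is a cross
            if j < heads[i] < heads[j]:
                return False
    return True

# 私は / 彼女と / 京都に / 行きます: the first three segments all modify the verb.
assert is_valid([3, 3, 3, None])
assert not is_valid([2, 3, 1, None])      # segment 3 would modify leftward
```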

6 Japanese Dependency Analysis (2/2)
Raw text: 私は彼女と京都に行きます (I go to Kyoto with her.)
↓ Morphological analysis and bunsetsu identification
私は / 彼女と / 京都に / 行きます (I-top / with her / to Kyoto-loc / go)
↓ Dependency analysis
私は / 彼女と / 京都に / 行きます, with each of the first three segments depending on 行きます

7 Probabilistic Model
Input: 私は1 / 彼女と2 / 京都に3 / 行きます4 (I-top / with her / to Kyoto-loc / go)
1. Build a dependency matrix with ME, DT, or SVMs: each cell (modifier, modifiee) holds how probable it is that one segment modifies another
2. Search for the optimal dependencies that maximize the sentence probability using CYK or chart parsing
Output: 私は1 / 彼女と2 / 京都に3 / 行きます4 with its dependency structure
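
A toy exhaustive decoder illustrating step 2 (a sketch only; the slides use CYK/chart search, not brute force). It reuses the hypothetical is_valid helper from the earlier sketch; prob[(i, j)] is the estimated probability that segment i modifies segment j:

```python
from itertools import product

def best_parse(prob, n):
    best_p, best_heads = -1.0, None
    for choice in product(*[range(i + 1, n) for i in range(n - 1)]):
        heads = list(choice) + [None]
        if not is_valid(heads):           # enforce head-final, non-crossing
            continue
        p = 1.0
        for i in range(n - 1):
            p *= prob[(i, heads[i])]      # sentence probability = product of arcs
        if p > best_p:
            best_p, best_heads = p, heads
    return best_heads
```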

8 Problems of the Probabilistic Model (1/2)
Selection of training examples: all candidate pairs of two segments are used
- Pairs in a dependency relation → positive
- Pairs not in a dependency relation → negative
This straightforward selection requires a total of n(n-1)/2 training examples per sentence (where n is the number of segments in the sentence)
This makes it difficult to combine the probabilistic model with SVMs, whose training cost is polynomial in the number of examples
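
A one-liner (illustrative, not from the paper) showing the quadratic growth of candidate pairs:

```python
# Every segment i may pair with any segment j to its right.
def candidate_pairs(n):
    return [(i, j) for i in range(n - 1) for j in range(i + 1, n)]

assert len(candidate_pairs(10)) == 10 * 9 // 2   # n(n-1)/2 examples per sentence
```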

9 Problems of the Probabilistic Model (2/2)
O(n³) parsing time is necessary with CYK or chart parsing
Even if beam search is applied, O(n²) parsing time is always necessary, since building the dependency matrix alone requires scoring every segment pair
The classification cost of SVMs is much higher than that of other ML algorithms such as ME and DT

10 Cascaded Chunking Model
Based on cascaded chunking for English parsing [Abney 1991]
Parses a sentence deterministically, deciding only whether the current segment modifies the segment on its immediate right-hand side (as sketched below)
Training examples are extracted using this algorithm itself
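
A minimal sketch of that deterministic loop in Python. The classify callback stands in for the trained SVM of the later slides, and the retirement rule (a D-tagged segment is dropped once nothing to its left can still modify it) is one consistent reading of the traces on slides 11-12, not the paper's literal pseudocode:

```python
def parse(segments, classify):
    active = list(range(len(segments)))    # indices still being chunked
    heads = {}                             # fixed dependencies: modifier -> modifiee
    while len(active) > 1:
        # classify(segments, active, i): does active[i] modify active[i + 1]?
        tags = [classify(segments, active, i) for i in range(len(active) - 1)]
        tags.append("O")                   # the last segment never modifies
        kept = []
        for i, s in enumerate(active):
            if tags[i] == "D" and (i == 0 or tags[i - 1] == "O"):
                heads[s] = active[i + 1]   # fix the dependency, retire s
            else:
                kept.append(s)
        if len(kept) == len(active):       # no progress: stop (sketch-only guard)
            break
        active = kept
    return heads
```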

11 Example: Training Phase
Annotated sentence: 彼は1 彼女の2 温かい3 真心に4 感動した。5
(gloss: he / her / warm / heart / be moved; "He was moved by her warm heart.")
Pairs of tag (D or O) and context (features) are stored as training data for the SVMs; each tag is decided by the annotated corpus (see the extraction sketch below):
彼は1 彼女の2 温かい3 真心に4 感動した。5 → O O D D O
彼は1 彼女の2 真心に4 感動した。5 → O D D O
彼は1 真心に4 感動した。5 → O D O
彼は1 感動した。5 → D O
感動した。5 → finish
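
A sketch of the training-example extraction: run the same loop, but read each tag off the gold annotation and store (tag, features) pairs for SVM training. The features(segments, active, i) extractor is hypothetical here (cf. slide 14):

```python
def extract_training_data(segments, gold_heads, features):
    data, active = [], list(range(len(segments)))
    while len(active) > 1:
        tags = []
        for i in range(len(active) - 1):
            tag = "D" if gold_heads[active[i]] == active[i + 1] else "O"
            data.append((tag, features(segments, active, i)))
            tags.append(tag)
        tags.append("O")
        # same retirement rule as in the parsing sketch above
        active = [s for i, s in enumerate(active)
                  if not (tags[i] == "D" and (i == 0 or tags[i - 1] == "O"))]
    return data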

12 Example: Test Phase
Test sentence: 彼は1 彼女の2 温かい3 真心に4 感動した。5 ("He was moved by her warm heart.")
Each tag is decided by the SVMs built in the training phase; parsing proceeds exactly as in training:
彼は1 彼女の2 温かい3 真心に4 感動した。5 → O O D D O
彼は1 彼女の2 真心に4 感動した。5 → O D D O
彼は1 真心に4 感動した。5 → O D O
彼は1 感動した。5 → D O
感動した。5 → finish

13 Advantages of the Cascaded Chunking Model
Simple and efficient:
- Probabilistic model: O(n²) classifier invocations vs. cascaded chunking: O(n²) in the worst case, but lower in practice (close to O(n)), since most segments modify the segment on their immediate right-hand side
- The number of training examples is much smaller
Independent of the ML algorithm:
- Can be combined with any ML algorithm that works as a binary classifier
- Probabilities of dependency are not necessary

14 Features
Example: 彼の1 友人は2 この本を3 持っている4 女性を5 探している6
(gloss: his / friend-top / this book-acc / have / lady-acc / be looking for; "His friend is looking for a lady who has this book.")
Static features:
- Modifier/modifiee: head and functional word (surface, POS, POS subcategory, inflection type, inflection form), brackets, quotations, punctuation, position
- Between segments: distance, case particles, brackets, quotations, punctuation
Dynamic features [Kudo, Matsumoto 2000]:
- A, B: static features of the functional word
- C: static features of the head word
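
An illustrative (hypothetical) static-feature extractor for one modifier/modifiee pair; the real feature set also covers POS subcategories, inflection type/form, brackets, quotations, position, and the dynamic A/B/C features, and the distance bucketing below is a common convention assumed here, not taken from the slides:

```python
def static_features(bunsetsu, i, j):
    mod, mde = bunsetsu[i], bunsetsu[j]   # dicts describing each segment
    dist = j - i
    return [
        f"mod_head:{mod['head_surface']}", f"mod_head_pos:{mod['head_pos']}",
        f"mod_func:{mod['func_surface']}", f"mod_func_pos:{mod['func_pos']}",
        f"mde_head:{mde['head_surface']}", f"mde_head_pos:{mde['head_pos']}",
        f"mde_func:{mde['func_surface']}", f"mde_func_pos:{mde['func_pos']}",
        f"dist:{'1' if dist == 1 else '2-5' if dist <= 5 else '6+'}",
    ]
```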

15 Experimental Setting
Corpus: Kyoto University Corpus 2.0/3.0
Standard data set: training 7,958 sentences / test 1,246 sentences (same data as [Uchimoto et al. 98; Kudo, Matsumoto 00])
Large data set: 2-fold cross-validation using all 38,383 sentences
Kernel function: 3rd-degree polynomial
Evaluation: dependency accuracy and sentence accuracy
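
The slide names the kernel only briefly; a 3rd-degree polynomial kernel conventionally has the following form (the +1 bias term is the usual convention, assumed here):

```latex
K(\mathbf{x}_i, \mathbf{x}_j) = (\mathbf{x}_i \cdot \mathbf{x}_j + 1)^3
```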

16 Results

Data set                       Standard                           Large
Model                      Cascaded Chunking  Probabilistic  Cascaded Chunking  Probabilistic
Dependency Acc. (%)        89.29              89.09          90.04              N/A
Sentence Acc. (%)          47.53              46.17          53.16              N/A
# of training sentences    7,956              7,956          19,191             19,191
# of training examples     110,355            459,105        251,254            1,074,316
Training time (hours)      8                  336            48                 N/A
Parsing time (sec./sent.)  0.5                2.1            0.7                N/A

17 Effect of Dynamic Features (1/2)

18 Effect of Dynamic Features (2/2)
Difference from the model with all dynamic features:

Deleted dynamic feature type   Dependency Acc.   Sentence Acc.
A                              -0.28 %           -0.89 %
B                              -0.10 %           -0.89 %
C                              -0.28 %           -0.56 %
AB                             -0.33 %           -1.21 %
AC                             -0.55 %           -0.97 %
BC                             -0.54 %           -1.61 %
ABC                            -0.58 %           -2.34 %

(Example sentence and A/B/C positions as on slide 14.)

19 Probabilistic vs. Cascaded Chunking (1/2)
Example: 彼は1 この本を2 持っている3 女性を4 探している5
(gloss: he-top / this book-acc / have / lady-acc / be looking for; "He is looking for a lady who has this book.")
The probabilistic model uses all candidate dependency pairs as training data, e.g.:
- Positive: この本を2 → 持っている3
- Negative: この本を2 → 探している5
Probabilistic models thus commit to a number of unnecessary training examples (see the sketch below)
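
A small sketch of the contrast for this example sentence (gold_heads maps modifier to modifiee; an illustration, not code from the paper):

```python
gold_heads = {1: 5, 2: 3, 3: 4, 4: 5}
prob_pairs = [(i, j) for i in range(1, 5) for j in range(i + 1, 6)]
print(len(prob_pairs))   # 10 candidate pairs, including この本を(2) -> 探している(5)
# Cascaded chunking never asks whether この本を(2) modifies 探している(5):
# once the dependency 2 -> 3 is fixed, segment 2 is retired from consideration.
```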

20 Probabilistic vs. Cascaded Chunking (2/2)
Probabilistic:
- Strategy: maximize the sentence probability
- Merit: can see all candidates of dependency
- Demerit: not efficient; commits to unnecessary training examples
Cascaded chunking:
- Strategy: shift-reduce, deterministic
- Merit: simple, efficient, and scalable; as accurate as the probabilistic model
- Demerit: cannot see all (posterior) candidates of dependency

21 Conclusion
- A new Japanese dependency parser using a cascaded chunking model
- It outperforms the previous probabilistic model with respect to accuracy, efficiency, and scalability
- Dynamic features contribute significantly to improving performance

22 Future Work
- Coordinate structure analysis: coordinate structures frequently appear in long Japanese sentences and make analysis hard
- Use of posterior context: sentences such as the following are hard to parse using only the cascaded chunking model:
僕の 母の ダイヤの 指輪 (my mother's diamond ring)

23 Comparison with Related Work

                     Model                      Training corpus (# sentences)   Acc. (%)
Our model            Cascaded chunking + SVMs   Kyoto Univ. (19,191)            90.46
                                                Kyoto Univ. (7,956)             89.29
Kudo et al. 00       Prob. + SVMs               Kyoto Univ. (7,956)             89.09
Uchimoto et al. 00   Prob. + ME                 Kyoto Univ. (7,956)             87.93
Kanayama et al. 00   Prob. + ME + HPSG          EDR (192,778)                   88.55
Haruno et al. 98     Prob. + DT + Boosting      EDR (50,000)                    85.03
Fujio et al. 98      Prob. + ML                 EDR (190,000)                   86.67

24 Support Vector Machines [Vapnik]
Maximize the margin $d = 2/\|\mathbf{w}\|$:
Min.: $\frac{1}{2}\|\mathbf{w}\|^2$   s.t.: $y_i(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1$
Soft margin:
Min.: $\frac{1}{2}\|\mathbf{w}\|^2 + C\sum_i \xi_i$   s.t.: $y_i(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1 - \xi_i,\ \xi_i \ge 0$
Kernel function: $K(\mathbf{x}_i, \mathbf{x}_j) = (\mathbf{x}_i \cdot \mathbf{x}_j + 1)^p$ (here $p = 3$)
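
A hedged sketch of training the binary D/O classifier with an off-the-shelf SVM; scikit-learn stands in for the SVM implementation used in the paper, and training_data is the (tag, features) list from the extraction sketch on slide 11:

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import SVC

y = [tag for tag, feats in training_data]
X = DictVectorizer().fit_transform(
    [{f: 1 for f in feats} for _, feats in training_data])  # binary features

clf = SVC(kernel="poly", degree=3, coef0=1.0)  # ~ (x . x' + 1)^3, up to gamma scaling
clf.fit(X, y)
print(clf.predict(X[:1]))                      # -> ["D"] or ["O"] for a context
```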

