Japanese Dependency Analysis using Cascaded Chunking Taku Kudo 工藤 拓 Yuji Matsumoto 松本 裕治 Nara Institute of Science and Technology, JAPAN



Motivation
Kudo, Matsumoto 2000 (VLC): presented a state-of-the-art Japanese dependency parser using SVMs (89.09% dependency accuracy on the standard dataset), showing the high generalization performance and feature selection abilities of SVMs.
Problems:
- Not scalable: 2 weeks of training using 7,958 sentences; hard to train with larger data
- Slow in parsing: 2-3 sec./sentence; too slow for actual NL applications

Goal
Improve the scalability and the parsing efficiency without losing accuracy!
How?
- Apply the cascaded chunking model to dependency parsing and to the selection of training examples
- Reduce the number of times the SVMs are consulted during parsing
- Reduce the number of negative examples learned

Outline
- Japanese dependency analysis
- Two models: probabilistic model (previous) and cascaded chunking model (new!)
- Features used for training and classification
- Experiments and results
- Conclusion and future work

Japanese Dependency Analysis (1/2)
Analysis of the relationships between phrasal units called bunsetsu (segments), comparable to base phrases in English.
Two constraints:
- Each segment modifies one of the segments to its right (Japanese is a head-final language)
- Dependencies do not cross each other
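The two constraints are easy to state as code. A minimal sketch of a validity check for a candidate dependency structure (representation is my own: `heads[i]` is the index of the segment that segment i modifies, and the final segment is the root with no head):

```python
def valid_dependencies(heads):
    # heads[i] = index of the segment that segment i modifies;
    # the last segment of the sentence is the root and has no entry.
    n = len(heads)
    for i, h in enumerate(heads):
        if h <= i:                            # constraint 1: heads lie to the right
            return False
    for i in range(n):
        for j in range(i + 1, n):
            if j < heads[i] < heads[j]:       # constraint 2: arcs must not cross
                return False
    return True
```

For the example sentence 私は / 彼女と / 京都に / 行きます, all three modifiers attach to the final verb, so `valid_dependencies([3, 3, 3])` holds, while a structure with arcs 0→2 and 1→3 is rejected as crossing.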

Japanese Dependency Analysis (2/2)
Raw text: 私は彼女と京都に行きます (I go to Kyoto with her.)
Morphological analysis and bunsetsu identification: 私は / 彼女と / 京都に / 行きます (I-top / with her / to Kyoto-loc / go)
Dependency analysis: 私は / 彼女と / 京都に / 行きます

Probabilistic Model
Input: 私は1 / 彼女と2 / 京都に3 / 行きます4 (I-top / with her / to Kyoto-loc / go)
1. Build a dependency matrix (modifier x modifiee) with ME, DT, or SVMs: how probable it is that one segment modifies another.
2. Search for the optimal dependencies that maximize the sentence probability using CYK or chart parsing.
Output: 私は1 / 彼女と2 / 京都に3 / 行きます4

Problems of the Probabilistic Model (1/2)
Selection of training examples: all candidate pairs of two segments are used —
- dependency relation → positive
- no dependency relation → negative
This straightforward selection requires a total of n(n-1)/2 training examples per sentence (where n is the number of segments in the sentence).
This makes it difficult to combine the probabilistic model with SVMs, whose training has polynomial computational cost.
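The n(n-1)/2 growth is easy to illustrate: every (modifier, modifiee) pair with the modifiee somewhere to the right is one training example, so long sentences blow up quickly.

```python
def candidate_pairs(n):
    # In the probabilistic model, every ordered pair (i, j) with i < j
    # is a training example: n(n-1)/2 per sentence of n segments.
    return n * (n - 1) // 2

print(candidate_pairs(4))    # 4 segments -> 6 candidate pairs
print(candidate_pairs(30))   # a long 30-segment sentence -> 435 pairs
```

Most of these pairs are negative examples, which is exactly the waste the cascaded chunking model avoids.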

Problems of the Probabilistic Model (2/2)
O(n³) parsing time is necessary with CYK or chart parsing.
Even if beam search is applied, O(n²) parsing time is always necessary.
The classification cost of SVMs is much higher than that of other ML algorithms such as ME and DT.

Cascaded Chunking Model
Based on cascaded chunking for English parsing [Abney 1991].
Parses a sentence deterministically, deciding only whether the current segment modifies the segment on its immediate right-hand side.
Training examples are extracted using this algorithm itself.

Example: Training Phase
Annotated sentence: 彼は1 彼女の2 温かい3 真心に4 感動した。5 (He / her / warm / heart / be moved — He was moved by her warm heart.)
Each pass tags every segment D (modifies its right neighbour in the current list) or O (does not); the tag is decided by the annotated corpus, and chunked modifiers are removed:
彼は1 彼女の2 温かい3 真心に4 感動した。5 → O O D D O
彼は1 彼女の2 真心に4 感動した。5 → O D D O
彼は1 真心に4 感動した。5 → O D O
彼は1 感動した。5 → D O
感動した。5 → finish
Pairs of tag (D or O) and context (features) are stored as training data for the SVMs.

Example: Test Phase
Test sentence: 彼は1 彼女の2 温かい3 真心に4 感動した。5 (He / her / warm / heart / be moved — He was moved by her warm heart.)
The same passes are made, but now each tag is decided by the SVMs built in the training phase:
彼は1 彼女の2 温かい3 真心に4 感動した。5 → O O D D O
彼は1 彼女の2 真心に4 感動した。5 → O D D O
彼は1 真心に4 感動した。5 → O D O
彼は1 感動した。5 → D O
感動した。5 → finish
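The deterministic loop shown in the passes above can be sketched as follows. This is my reading of the worked example, not code from the paper: `modifies(i, j)` stands in for the trained SVM classifier, and the removal rule (a D-tagged segment is chunked away once its left neighbour is tagged O, since the head-final and non-crossing constraints then guarantee nothing can still modify it) is inferred from which segments disappear between passes.

```python
def cascaded_chunking_parse(n_segments, modifies):
    """Deterministic cascaded-chunking loop (a sketch).  `modifies(i, j)`
    is any binary classifier, e.g. the trained SVMs, deciding whether
    segment i modifies segment j."""
    active = list(range(n_segments))   # segments whose head is still undecided
    heads = {}
    while len(active) > 1:
        # Tag each segment D/O against its right neighbour in the current list.
        tags = [modifies(active[k], active[k + 1])
                for k in range(len(active) - 1)]
        survivors, progressed = [], False
        for k, seg in enumerate(active):
            is_d = k < len(tags) and tags[k]
            left_is_o = (k == 0) or not tags[k - 1]
            if is_d and left_is_o:
                # Head finalised; no remaining segment can still modify `seg`.
                heads[seg] = active[k + 1]
                progressed = True
            else:
                survivors.append(seg)
        if not progressed:
            # Safeguard for an all-O pass: attach everything to the last segment.
            for seg in survivors[:-1]:
                heads[seg] = survivors[-1]
            break
        active = survivors
    return heads
```

Replaying the example with the gold decisions as a stand-in classifier reproduces the annotated heads (彼は→感動した, 彼女の→真心に, 温かい→真心に, 真心に→感動した) in exactly the four passes shown on the slide.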

Advantages of the Cascaded Chunking Model
Simple and efficient:
- Prob.: O(n²) classifier invocations vs. cascaded chunking: O(n²) in the worst case but close to O(n) in practice, since most segments modify the segment on their immediate right-hand side
- The number of training examples is much smaller
Independent of the ML algorithm:
- Can be combined with any ML algorithm that works as a binary classifier
- Probabilities of dependency are not necessary

Features
Example: 彼の1 友人は2 この本を3 持っている4 女性を5 探している6 (His friend-top / this book-acc / have / lady-acc / be looking for — His friend is looking for a lady who has this book.) Modifier: この本を3; modifiee: 持っている4. Modify or not?
Static features:
- Modifier/modifiee: head/functional word (surface, POS, POS subcategory, inflection type, inflection form), brackets, quotations, punctuation, position
- Between segments: distance, case particles, brackets, quotations, punctuation
Dynamic features [Kudo, Matsumoto 2000]:
- A, B: static features of the functional word
- C: static features of the head word
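A toy sketch of static feature extraction for one modifier/modifiee pair. The segment field names (`head`, `func`, `pos`) and the distance buckets are my own illustrative assumptions; the actual feature set also includes POS subcategories, inflection, brackets, quotations, punctuation, position, and case particles between the segments.

```python
def static_pair_features(segments, i, j):
    """Extract a small subset of static features for the candidate
    dependency 'segment i modifies segment j' (hypothetical field names)."""
    mod, mee = segments[i], segments[j]
    if j - i == 1:
        distance = "1"
    elif j - i <= 5:
        distance = "2-5"
    else:
        distance = "6+"
    return {
        "mod_head": mod["head"], "mod_func": mod["func"], "mod_pos": mod["pos"],
        "mee_head": mee["head"], "mee_func": mee["func"], "mee_pos": mee["pos"],
        "distance": distance,
    }
```

Each such dictionary would be binarized into a sparse feature vector and paired with a D/O tag to form one SVM training example.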

Experimental Setting
Corpus: Kyoto University Corpus 2.0/3.0
- Standard data set — training: 7,958 sentences / test: 1,246 sentences (same data as [Uchimoto et al. 98; Kudo, Matsumoto 00])
- Large data set — 2-fold cross-validation using all 38,383 sentences
Kernel function: 3rd-degree polynomial
Evaluation methods: dependency accuracy and sentence accuracy
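The 3rd-degree polynomial kernel is what lets the SVM implicitly consider combinations of up to three of the sparse binary features at once. A minimal sketch of the kernel itself (the +1 constant term is an assumption; the experiments used an SVM package, not this toy function):

```python
def poly_kernel(x, z, degree=3):
    # K(x, z) = (x . z + 1) ** degree
    dot = sum(a * b for a, b in zip(x, z))
    return (dot + 1) ** degree

# Two binary feature vectors sharing exactly one active feature:
print(poly_kernel([1, 1, 0], [1, 0, 1]))   # (1 + 1) ** 3 = 8
```

Expanding (x·z + 1)³ yields terms for all single features, pairs, and triples shared by the two vectors, which is why feature combinations need not be enumerated explicitly.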

Results

                           Standard                        Large
Model                      Cascaded Chunking  Prob.        Cascaded Chunking  Prob.
Dependency Acc. (%)        89.29              89.09        90.46              N/A
Sentence Acc. (%)          —                  —            —                  N/A
# of training sentences    7,956              7,956        19,191             19,191
# of training examples     110,—              —            —,254              1,074,316
Training time (hours)      —                  —            —                  N/A
Parsing time (sec./sent.)  —                  —            —                  N/A

(Several cells of the original table did not survive the transcript; the probabilistic model could not be trained on the large data set.)

Effect of Dynamic Features (1/2)

Effect of Dynamic Features (2/2)
Difference from the model with all dynamic features, per deleted feature type:

Deleted features   Dependency Acc.   Sentence Acc.
A                  —                 —
B                  -0.10%            —
C                  —                 —
AB                 —                 —
AC                 —                 —
BC                 —                 —
ABC                —                 —

Example: 彼の1 友人は2 この本を3 持っている4 女性を5 探している6 (His friend is looking for a lady who has this book.) Modifier: この本を3; modifiee: 持っている4.

Probabilistic vs. Cascaded Chunking (1/2)
彼は1 この本を2 持っている3 女性を4 探している5 (He-top / this book-acc / have / lady-acc / be looking for — He is looking for a lady who has this book.)
The probabilistic model uses all candidate dependency relations as training data, so it commits to a number of unnecessary examples:
- Positive: この本を2 → 持っている3
- Negative (unnecessary): この本を2 → 探している5

Probabilistic vs. Cascaded Chunking (2/2)

          Probabilistic                       Cascaded Chunking
Strategy  Maximize sentence probability       Shift-reduce, deterministic
Merit     Can see all candidates of           Simple, efficient, and scalable;
          dependency                          as accurate as the prob. model
Demerit   Not efficient; commits to           Cannot see all the (posterior)
          unnecessary training examples       candidates of dependency

Conclusion
- A new Japanese dependency parser using a cascaded chunking model
- It outperforms the previous probabilistic model in accuracy, efficiency, and scalability
- Dynamic features contribute significantly to the performance

Future Work
- Coordinate structure analysis: coordinate structures frequently appear in long Japanese sentences and make analysis hard
- Use of posterior context: the following phrase is hard to parse using the cascaded chunking model alone —
僕の 母の ダイヤの 指輪 (my mother's diamond ring)

Comparison with Related Work

                    Model                      Training Corpus (# of sentences)   Acc. (%)
Our model           Cascaded chunking + SVMs   Kyoto Univ. (19,191)               90.46
Our model           Cascaded chunking + SVMs   Kyoto Univ. (7,956)                89.29
Kudo et al. 00      Prob. + SVMs               Kyoto Univ. (7,956)                89.09
Uchimoto et al. 00  Prob. + ME                 Kyoto Univ. (7,956)                87.93
Kanayama et al. 00  Prob. + ME + HPSG          EDR (192,778)                      88.55
Haruno et al. 98    Prob. + DT + Boosting      EDR (50,000)                       85.03
Fujio et al. 98     Prob. + ML                 EDR (190,000)                      86.67

Support Vector Machines [Vapnik]
Maximize the margin $d$:
Min.: $\frac{1}{2}\|\mathbf{w}\|^2$
s.t.: $y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1$
Soft margin — Min.: $\frac{1}{2}\|\mathbf{w}\|^2 + C\sum_i \xi_i$, s.t.: $y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1 - \xi_i,\ \xi_i \geq 0$
Kernel function: $K(\mathbf{x}_i, \mathbf{x}_j) = \phi(\mathbf{x}_i) \cdot \phi(\mathbf{x}_j)$