Named Entity Recognition Based on Bilingual Co-training
Li Yegang, lyg8256@bit.edu.cn
School of Computer, BIT
Outline
– Introduction and Related Work
– Bilingual Co-training
– Corrective NE Projection Annotation
– Experimental Results
1. Introduction
Related work
– Das et al. (2011) built a tagger for a foreign language using parallel English-foreign data, a high-quality NER tagger for English, and annotations projected onto the foreign language.
– Burkett et al. (2010) used parallel data to improve existing monolingual taggers and other analyzers in both languages.
Introduction
Disadvantages
– Current NE alignment methods are not accurate enough, so considerable noise can be introduced during the word alignment stage.
– Manual annotation is usually available only for a few limited domains, which hurts statistical supervised learning methods.
Introduction
Complementarity
(a) Result from the Chinese name tagger: 金庸新 小说
(b) Result from the English name tagger: the new novels of Jin Yong
(c) Name tagging after bilingual co-training: 金庸 新小说
Both "金庸新" and "金庸" can be PER names in Chinese, but the English translation "Jin Yong" indicates that "金庸" is more likely to be the PER name than "金庸新".
Introduction
Complementarity
(a) Result from the English name tagger: The captain of a ferry boat who works on 〈PER〉Lake Constance〈/PER〉...
(b) Result from the Chinese name tagger: 在〈LOC〉康斯坦茨湖〈/LOC〉工作的一艘渡船的船长...
(c) Name tagging after bilingual co-training: The captain of a ferry boat who works on 〈LOC〉Lake Constance〈/LOC〉...
"Lake" in English can be the last word of either a PER or a LOC name, while its Chinese translation "康斯坦茨湖" indicates that "Lake Constance" is more likely to be a LOC name.
2. Bilingual Co-training
Co-training
– Starting with a set of labeled data, co-training algorithms attempt to increase the amount of annotated data using large amounts of unlabeled data. The process may continue for several iterations.
Bilingual Co-training
– The parallel Chinese-English sentences are treated as two weaker, roughly independent views of NE identity.
Bilingual Co-training
Algorithm of Bilingual Co-training
Given:
– a set Ls of labeled source-language examples
– a set Lt of labeled target-language examples
– a set Us of unlabeled source-language examples
– a set Ut of unlabeled target-language examples
Algorithm of Bilingual Co-training
1. Classifiers
– Use Ls to train the classifier Classifier(s)
– Use Lt to train the classifier Classifier(t)
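Only this first step is spelled out above; as a minimal sketch, the loop below assumes the standard co-training recipe (tag the unlabeled data, keep confident analyses, project them into the other language, retrain). The callables train, project, and select_confident, and the tag method, are hypothetical stand-ins, not names from the deck.

```python
# A minimal sketch of bilingual co-training, assuming the standard
# co-training recipe; train, project, and select_confident are
# hypothetical callables supplied by the caller.

def bilingual_cotrain(Ls, Lt, Us, Ut, train, project, select_confident,
                      iterations=10):
    clf_s = clf_t = None
    for _ in range(iterations):
        # Step 1 (from the slide): train one classifier per language.
        clf_s = train(Ls)
        clf_t = train(Lt)

        # Tag the unlabeled halves of the parallel corpus.
        tagged_s = [clf_s.tag(x) for x in Us]
        tagged_t = [clf_t.tag(x) for x in Ut]

        # Keep only high-confidence analyses, project them across the
        # word alignment, and grow the other side's labeled set.
        Lt = Lt + [project(x, "s2t") for x in select_confident(tagged_s)]
        Ls = Ls + [project(x, "t2s") for x in select_confident(tagged_t)]
    return clf_s, clf_t
```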
3. Corrective NE Projection Annotation
Projection NE Candidates
– For each word in the source-language NE, we find all possible projection words in the target language through the word alignment and take these projection words as "seed" data. With an open-ended window around each seed, all word sequences located within the window are considered candidate projection NEs; their lengths range from 1 to the empirically determined window length. To select the best candidate, the NE alignment model discussed below is applied (see the sketch after this slide).
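A sketch of the candidate-generation step under stated assumptions: the word alignment is represented as a map from source word positions to target word positions, and the window size, which the slide only calls "empirically determined", is set to a placeholder value.

```python
# Sketch of projection-candidate generation. WINDOW stands in for the
# empirically determined window size mentioned on the slide.

WINDOW = 3  # assumed value

def projection_candidates(src_ne_positions, alignment, tgt_words):
    """alignment: dict mapping a source word index to target word indices."""
    # Every target word aligned to some word of the source NE is a seed.
    seeds = {j for i in src_ne_positions for j in alignment.get(i, [])}
    candidates = set()
    for s in seeds:
        lo = max(0, s - WINDOW)
        hi = min(len(tgt_words), s + WINDOW + 1)
        # Every contiguous word sequence inside the window is a candidate.
        for start in range(lo, hi):
            for end in range(start + 1, hi + 1):
                candidates.add((start, end, tuple(tgt_words[start:end])))
    return candidates
```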
3. Corrective NE Projection Annotation
NE Alignment Model
– Features: a translation feature, a co-occurrence feature for the source and target NEs, and a feature for the lengths of the NE pair.
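The deck does not show how the three features are combined; a common choice, stated here purely as an assumption, is a log-linear score over feature functions f_k with weights λ_k, maximized over the candidates from the previous step:

```latex
\hat{e} = \arg\max_{e \in \mathcal{C}(c)} \; \sum_{k} \lambda_k \, f_k(c, e)
```

where c is the source NE and \mathcal{C}(c) its set of candidate projections.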
3. Corrective NE Projection Annotation
Translation Feature (Brown et al., 1993)
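The slide's formula did not survive the scrape; since it cites Brown et al. (1993), an IBM Model 1 style translation score is a plausible reconstruction, given here only as an assumption:

```latex
f_{\mathrm{trans}}(c, e) = \log \left[ \frac{1}{(l_c + 1)^{l_e}} \prod_{i=1}^{l_e} \sum_{j=0}^{l_c} t(e_i \mid c_j) \right]
```

where l_c and l_e are the word lengths of the Chinese and English NEs, t(e_i \mid c_j) is a lexical translation probability, and c_0 is the null word.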
3. Corrective NE Projection Annotation
Co-occurrence Feature
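This formula is likewise not preserved; one standard co-occurrence measure, stated as an assumption, is the Dice coefficient over counts from the parallel corpus:

```latex
f_{\mathrm{cooc}}(c, e) = \frac{2 \cdot \mathrm{count}(c, e)}{\mathrm{count}(c) + \mathrm{count}(e)}
```

where count(c, e) is the number of sentence pairs in which c and e co-occur.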
3. Corrective NE Projection Annotation
Length Feature (Church, 1993)
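The length-feature formula is also missing; length-based alignment models in the cited tradition typically score the normalized difference of the two lengths under a normal distribution, given here as an assumption:

```latex
\delta(c, e) = \frac{l_e - r \, l_c}{\sqrt{l_c \, \sigma^2}}, \qquad f_{\mathrm{len}}(c, e) = \log P\big(\delta(c, e)\big)
```

where r and \sigma^2 are the mean and variance of the English-to-Chinese length ratio, estimated from the bitext.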
Experimental Results
Baseline F-measure (%) of NE tagging

NE Type    Chinese (F)    English (F)
PER        89.59          81.22
LOC        88.48          80.43
ORG        84.54          79.69
ALL        87.17          80.37
Experimental Results
Co-training model F-measure (%) of NE tagging

NE Type    Chinese (F)    English (F)
PER        90.86          82.31
LOC        89.53          82.01
ORG        85.71          80.42
ALL        88.28          81.76
Thank you!