Published by Chester Clarke. Modified over 8 years ago.
1
An Adaptive Learning with an Application to Chinese Homophone Disambiguation, by Yue-shi Lee. International Journal of Computer Processing of Oriental Languages, Vol. 15, No. 3 (2002), 245-260.
2
Introduction
Contextual language processing: given a specified corpus, find a plausible probabilistic model that correctly translates the perceived syllable sequence into characters, words, or sentences (in the maximum-likelihood sense).
Two main kinds of error affect correctness:
  Modeling error: weakness of the language model. The remedy is to enhance or refine the weak model.
  Estimation error: a small training corpus, or a run-time context domain that varies.
3
Introduction (cont.)
Estimation error, by cause:
  Small training corpus: smoothing, class-based, and similarity-based models statically adjust the unreliable probabilities.
  Varying run-time context domain:
    Cache-based model: captures short-term fluctuations in word frequencies; effective for frequent words and repeated sentences.
    Multiple language models: define a "current time context" and import several similar texts from distinct fields for comparison.
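As an illustration of the cache-based idea above, here is a minimal sketch that interpolates a static unigram model with the relative frequencies of recently seen words. It is a generic textbook-style construction, not code from the paper; the class name `CacheUnigramModel` and the parameters `cache_size` (assumed positive) and `cache_weight` are hypothetical.

```python
from collections import Counter, deque


class CacheUnigramModel:
    """Interpolates a static unigram model with a cache of recent words."""

    def __init__(self, static_probs, cache_size=200, cache_weight=0.3):
        self.static_probs = static_probs       # word -> static probability
        self.cache = deque(maxlen=cache_size)  # most recent words only
        self.counts = Counter()                # counts of words in the cache
        self.cache_weight = cache_weight       # hypothetical mixing weight

    def observe(self, word):
        # Forget the oldest word's count before the deque drops it.
        if len(self.cache) == self.cache.maxlen:
            oldest = self.cache[0]
            self.counts[oldest] -= 1
            if self.counts[oldest] == 0:
                del self.counts[oldest]
        self.cache.append(word)
        self.counts[word] += 1

    def prob(self, word):
        p_static = self.static_probs.get(word, 0.0)
        if not self.cache:
            return p_static  # nothing observed yet: static model only
        p_cache = self.counts[word] / len(self.cache)
        return (1 - self.cache_weight) * p_static + self.cache_weight * p_cache
```

A word that was just repeated gets a higher probability than the static model alone would give it, and the boost fades as the word leaves the cache.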
4
Introduction (cont.)
Author's work:
  Goal: reduce the estimation error when the context varies.
  Method: an adaptive algorithm that adjusts the statistical information as the context changes.
  Application: Chinese homophone disambiguation.
5
Language Model
Facts about Chinese: 1,300 syllables, 13,094 commonly used characters, and 100,000 words.
Disambiguation is important because many different characters share the same syllable.
6
Language Model (cont.)
Definition of disambiguation: translate a sequence of syllables into the sequence of characters (a sentence) with the correct meaning.
Modeling:
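One common way to formalize this task is the standard bigram formulation below. It is assumed here for illustration and is not recovered from the slide; the symbols S = s_1 ... s_n (syllables) and C = c_1 ... c_n (characters) are notation introduced for this sketch.

```latex
\hat{C} = \arg\max_{C} P(C \mid S)
        = \arg\max_{C} P(S \mid C)\, P(C),
\qquad
P(C) \approx P(c_1) \prod_{i=2}^{n} P(c_i \mid c_{i-1})
```

If each candidate character is taken to have exactly the pronunciation s_i, then P(S | C) is the same for every admissible C, and the search reduces to maximizing P(C) over the characters that share each syllable.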
7
Language Model (cont.)
Simplification:
8
Language Model (cont.)
Stored context statistical information: a 2-D matrix and a 1-D vector.
Contribution of the author's work: adapt the stored context statistics according to fed-back estimation errors when the context varies.
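To make the role of the stored statistics concrete, here is a minimal sketch of decoding with them. It assumes the 2-D matrix holds character bigram probabilities and the 1-D vector holds unigram probabilities, which the slide does not state explicitly. The function name `disambiguate`, the `floor` value for unseen events, and all numbers in the example are invented for illustration.

```python
import math


def disambiguate(syllables, candidates, unigram, bigram, floor=1e-6):
    """Return the most probable character sequence for a syllable sequence.

    candidates: syllable -> list of characters sharing that syllable
    unigram:    character -> P(c)             (the 1-D vector)
    bigram:     (prev, cur) -> P(cur | prev)  (the 2-D matrix, stored sparsely)
    floor:      hypothetical probability assigned to unseen events
    """
    if not syllables:
        return []
    # best[c] = (log-probability of the best path ending in c, that path)
    best = {
        c: (math.log(unigram.get(c, floor)), [c])
        for c in candidates[syllables[0]]
    }
    for syl in syllables[1:]:
        step = {}
        for cur in candidates[syl]:
            # Viterbi step: keep only the best predecessor for each candidate.
            score, path = max(
                (prev_score + math.log(bigram.get((prev, cur), floor)), prev_path)
                for prev, (prev_score, prev_path) in best.items()
            )
            step[cur] = (score, path + [cur])
        best = step
    return max(best.values())[1]


# Toy example with made-up probabilities.
candidates = {"gong": ["工", "公"], "shi": ["是", "式"]}
unigram = {"工": 0.6, "公": 0.4}
bigram = {("公", "式"): 0.8, ("工", "是"): 0.1}
print(disambiguate(["gong", "shi"], candidates, unigram, bigram))  # ['公', '式']
```

In the toy data the bigram evidence for 公式 outweighs the higher unigram probability of 工, which is the kind of contextual decision the stored statistics are meant to support.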
9
Adaptive Learning Model
Detecting the components that need to be adjusted
10
Adaptive Learning Model (cont.)
Detecting the components that need to be adjusted (cont.)
11
Adaptive Learning Model (cont.)
Modeling and learning algorithm:
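The general idea of error-driven adaptation can be sketched with a perceptron-style update. This is a simpler stand-in chosen for illustration and is not the author's algorithm, which the reviewer describes as MLP-based. The function names, the additive `scores` table, and the `rate` parameter are all hypothetical.

```python
from collections import Counter


def bigrams(chars):
    """Count adjacent character pairs in a sequence."""
    return Counter(zip(chars, chars[1:]))


def adapt(scores, predicted, reference, rate=0.1):
    """Adjust bigram scores in place after one decoded sentence.

    scores:    dict mapping (prev_char, cur_char) -> adjustable score
    predicted: characters chosen by the decoder
    reference: correct characters for the same syllables
    rate:      hypothetical learning rate
    Returns the pairs that were adjusted, with their new scores.
    """
    # Pairs only in the reference get a positive count, pairs only in the
    # wrong prediction get a negative one; shared pairs cancel to zero.
    delta = bigrams(reference)
    delta.subtract(bigrams(predicted))
    changed = {}
    for pair, diff in delta.items():
        if diff != 0:
            scores[pair] = scores.get(pair, 0.0) + rate * diff
            changed[pair] = scores[pair]
    return changed
```

Only the components involved in an error are touched: when the prediction matches the reference, every difference is zero and the scores are left as they were. The same update can be applied sentence by sentence, which also suits incremental training.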
12
Experimental Results
The data come in two parts, drawn from two different contexts:
  Part 1 has 123 sentences and 1,057 characters.
  Part 2 has 123 sentences and 977 characters.
Two stages of experiments:
  Stage 1: Part 1 as learning data, Part 2 as testing data.
  Stage 2: Part 2 as learning data, Part 1 as testing data.
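The two-stage protocol above can be sketched as a small harness that swaps the roles of the two parts. Character-level accuracy is assumed as the metric here, since the slides do not name one, and `train` and `evaluate` are hypothetical callables supplied by the caller.

```python
def character_accuracy(predicted, reference):
    """Fraction of reference characters reproduced at the same position."""
    total = sum(len(ref) for ref in reference)
    correct = sum(
        p == r
        for pred, ref in zip(predicted, reference)
        for p, r in zip(pred, ref)
    )
    return correct / total if total else 0.0


def two_stage(part1, part2, train, evaluate):
    """Run both directions: learn on one part, test on the other."""
    return {
        "stage1": evaluate(train(part1), part2),
        "stage2": evaluate(train(part2), part1),
    }
```

Running both directions shows whether adaptation helps regardless of which context the statistics were first learned from.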
13
Experimental Results
14
Experimental Results (cont.)
15
Conclusion
The author proposes an adaptive learning algorithm for task adaptation that best fits the run-time context domain in Chinese homophone disambiguation. The algorithm is also suitable for incremental training.
Personal comment (1): The author proposed an MLP-based method. However, the learning problem as modeled appears to be essentially linear, so I think the MLP is unnecessary.
Personal comment (2): I do not think the experimental results are good enough; however, the idea of adaptive learning in natural language processing is interesting.
Link: from the Inspec database, available on the MSU library website.