Automatic Key Term Extraction from Spoken Course Lectures
Yun-Nung (Vivian) Chen, Yu Huang, Sheng-Yi Kong, Lin-Shan Lee
National Taiwan University, Taiwan
Definition
- A key term has higher term frequency and carries the core content of a document.
- Two types: keywords and key phrases.
- Advantages: useful for indexing and retrieval, and for capturing the relations between key terms and segments of documents.
Introduction
Example terms from a speech-processing lecture: acoustic model, language model, HMM, n-gram, phone, hidden Markov model, bigram.
Target: extract key terms from course lectures.
Automatic Key Term Extraction
System flow: speech signal from an archive of spoken documents → ASR → ASR transcriptions → branching entropy (phrase identification) → feature extraction → learning methods (1) K-means exemplar, (2) AdaBoost, (3) neural network → key terms (e.g., entropy, acoustic model, ...).
- First, branching entropy is used to identify phrases.
- Then, learning methods extract key terms from these candidates using a set of features.
Branching Entropy
How to decide the boundary of a phrase?
- "hidden" is almost always followed by the same word ("Markov").
- "hidden Markov" is almost always followed by the same word ("model").
- "hidden Markov model" is preceded and followed by many different words (e.g., preceded by is/of/in, followed by represent/is/can).
- Branching entropy is defined to detect such possible boundaries.
Branching Entropy: Definition of Right Branching Entropy
For a candidate pattern X with children x_i (the distinct words observed immediately after X), the probability of child x_i is $P(x_i \mid X) = f(x_i)/f(X)$, where $f(\cdot)$ is the corpus frequency. The right branching entropy of X is
$H_r(X) = -\sum_i P(x_i \mid X) \log P(x_i \mid X)$.
Branching Entropy: Decision of the Right Boundary
Inside a phrase the right branching entropy stays low; it jumps once the phrase is complete. The right boundary is therefore located between X and its child x_i where the entropy increases, i.e., where $H_r(x_i) > H_r(X)$ (e.g., right after "hidden Markov model").
Branching Entropy: Decision of the Left Boundary
Symmetrically, the left branching entropy $H_l$ is computed on the reversed word sequence (model Markov hidden), and the left boundary is located between X and x_i where $H_l$ increases. Both quantities are implemented with a PAT tree.
Branching Entropy: Implementation in the PAT Tree
Each node X of the PAT tree stores the frequency of its word sequence, and its children x_i are the observed continuations, so $P(x_i \mid X)$ and the branching entropies come directly from the node counts. Example: X = "hidden Markov", with children x_1 = "hidden Markov model" and x_2 = "hidden Markov chain" (other continuations include state, distribution, variable).
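The following minimal Python sketch illustrates the idea using plain n-gram follower counts in place of a real PAT tree; the toy corpus and all names are illustrative, not taken from the paper.

```python
import math
from collections import defaultdict

def build_follower_counts(corpus, max_len=4):
    """For every word n-gram X up to max_len, count the words observed
    immediately after it (a PAT tree stores the same statistics compactly)."""
    counts = defaultdict(lambda: defaultdict(int))
    for sent in corpus:
        for i in range(len(sent)):
            for n in range(1, max_len + 1):
                if i + n < len(sent):
                    counts[tuple(sent[i:i + n])][sent[i + n]] += 1
    return counts

def right_branching_entropy(counts, prefix):
    """H_r(X) = -sum_i P(x_i|X) log P(x_i|X) over the children x_i of X."""
    followers = counts.get(prefix)
    if not followers:
        return 0.0
    total = sum(followers.values())
    return -sum(c / total * math.log(c / total) for c in followers.values())

# Toy corpus: the entropy stays at zero inside "hidden markov model"
# and jumps after "model", marking the right boundary of the phrase.
corpus = [
    "hidden markov model is a statistical model".split(),
    "the hidden markov model can represent speech".split(),
    "a hidden markov model is used in recognition".split(),
]
counts = build_follower_counts(corpus)
for n in (1, 2, 3):
    prefix = tuple("hidden markov model".split()[:n])
    print(prefix, round(right_branching_entropy(counts, prefix), 3))
```

The loop prints 0.0 for "hidden" and "hidden markov" (each is always followed by the same word) and a positive value for "hidden markov model", so the boundary is placed where the entropy rises; the left boundary is found the same way on reversed sentences.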
Automatic Key Term Extraction: Feature Extraction
Prosodic, lexical, and semantic features are extracted for each candidate term.
Feature Extraction: Prosodic Features
Computed for the first occurrence of each candidate term:

Feature Name | Feature Description
Duration (I-IV) | normalized duration (max, min, mean, range)
Pitch (I-IV) | F0 (max, min, mean, range)
Energy (I-IV) | energy (max, min, mean, range)

- Speakers tend to use longer duration to emphasize key terms; the duration of each phone (e.g., phone "a") is normalized by the average duration of that phone, and four values (max, min, mean, range) are taken over the term.
- Higher pitch may signal significant information.
- Higher energy emphasizes important information.
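As a sketch of the duration features, assuming phone-level alignments and per-phone average durations are available from the ASR system (all inputs below are hypothetical):

```python
def duration_features(phone_durations, avg_phone_duration):
    """Duration (I-IV): max, min, mean, range of the normalized phone
    durations over one occurrence of a candidate term. Each phone's
    duration is normalized by the average duration of that phone."""
    norm = [d / avg_phone_duration[p] for p, d in phone_durations]
    return {
        "dur_max": max(norm),
        "dur_min": min(norm),
        "dur_mean": sum(norm) / len(norm),
        "dur_range": max(norm) - min(norm),
    }

# Hypothetical alignment of one term occurrence: (phone, seconds).
print(duration_features([("a", 0.14), ("k", 0.06), ("u", 0.11)],
                        {"a": 0.10, "k": 0.05, "u": 0.10}))
```

Pitch (I-IV) and Energy (I-IV) apply the same four statistics to the F0 and energy contours of the term.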
Feature Extraction: Lexical Features
Well-known lexical features computed for each candidate term:

Feature Name | Feature Description
TF | term frequency
IDF | inverse document frequency
TFIDF | TF x IDF
PoS | the PoS tag
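A minimal sketch of these lexical features over the document collection; the exact TF/IDF weighting variant used in the paper is not specified on the slide:

```python
import math
from collections import Counter

def lexical_features(docs):
    """docs: list of token lists, one per document. Returns per-document
    TF, IDF, and TFIDF for every term the document contains."""
    n_docs = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))                      # document frequency
    idf = {t: math.log(n_docs / df[t]) for t in df}
    return [{t: {"tf": tf_t, "idf": idf[t], "tfidf": tf_t * idf[t]}
             for t, tf_t in Counter(doc).items()}
            for doc in docs]
```

The PoS tag of each candidate term would come from a part-of-speech tagger run on the transcriptions.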
Feature Extraction: Semantic Features
Based on Probabilistic Latent Semantic Analysis (PLSA), which relates documents D_i, latent topics T_k, and terms t_j. Key terms tend to focus on limited topics, so a term's latent topic distribution indicates how key-term-like it is: a key term concentrates its probability on a few topics, while a non-key term spreads it over many.

Feature Name | Feature Description
LTP (I-III) | Latent Topic Probability (mean, variance, standard deviation)
LTS (I-III) | Latent Topic Significance (mean, variance, standard deviation)
LTE | term entropy for latent topics

- Latent Topic Probability: statistics of the term's probability distribution over the latent topics.
- Latent Topic Significance: the within-topic to out-of-topic frequency ratio of the term for each topic.
- Latent Topic Entropy: the entropy of the term's latent topic distribution, $LTE(t_j) = -\sum_k P(T_k \mid t_j) \log P(T_k \mid t_j)$; key terms have lower LTE, non-key terms have higher LTE.
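A short sketch of the latent topic entropy, given a term's distribution over latent topics, e.g., P(T_k|t_j) estimated from a trained PLSA model (the distributions below are made up):

```python
import math

def latent_topic_entropy(p_topics):
    """LTE(t_j) = -sum_k P(T_k|t_j) log P(T_k|t_j): low when the term
    concentrates on a few topics (key-term-like), high when it spreads."""
    return -sum(p * math.log(p) for p in p_topics if p > 0)

print(latent_topic_entropy([0.85, 0.05, 0.05, 0.05]))  # focused: low LTE
print(latent_topic_entropy([0.25, 0.25, 0.25, 0.25]))  # spread: high LTE
```

The LTP and LTS features similarly take the mean, variance, and standard deviation over the corresponding per-topic values.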
38
Phrase Identification Key Term Extraction Automatic Key Term Extraction 37 Key Term Extraction, National Taiwan University Archive of spoken documents Branching Entropy Feature Extraction Learning Methods 1)K-means Exemplar 2)AdaBoost 3)Neural Network ASR speech signal ASR trans Key terms entropy acoustic model : Using unsupervised and supervised approaches to extract key terms
39
Learning Methods: Unsupervised Learning (K-means Exemplar)
- Transform each candidate term into a vector in LTS (Latent Topic Significance) space, so that terms in the same cluster focus on a single topic.
- Run K-means.
- Take the term at the centroid of each cluster as a key term: it represents the topic, and the other candidate terms in the cluster are related to it.
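A sketch of the exemplar selection using scikit-learn's K-means; treating the term nearest each centroid as the cluster's key term is one natural reading of the slide, and the number of clusters is set to the desired number of key terms:

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_exemplars(terms, lts_vectors, n_key_terms):
    """Cluster candidate terms in LTS space and return, per cluster,
    the term closest to the centroid (the exemplar key term)."""
    X = np.asarray(lts_vectors)
    km = KMeans(n_clusters=n_key_terms, n_init=10, random_state=0).fit(X)
    exemplars = []
    for k, center in enumerate(km.cluster_centers_):
        members = np.where(km.labels_ == k)[0]
        best = members[np.argmin(np.linalg.norm(X[members] - center, axis=1))]
        exemplars.append(terms[best])
    return exemplars
```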
Learning Methods: Supervised Learning
- Adaptive Boosting (AdaBoost)
- Neural network
Both automatically adjust the weights of the features to produce a key-term classifier.
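A minimal supervised sketch with off-the-shelf scikit-learn classifiers standing in for the paper's AdaBoost and neural network (whose exact configurations are not given on the slides); X holds the prosodic + lexical + semantic feature vector of each candidate term, and y marks whether it is a reference key term:

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.neural_network import MLPClassifier

def train_key_term_classifiers(X, y):
    """Fit both classifiers; each learns its own weighting of the features."""
    ada = AdaBoostClassifier(n_estimators=100).fit(X, y)
    mlp = MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000).fit(X, y)
    return ada, mlp
```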
Experiments: Corpus
- NTU lecture corpus: Mandarin Chinese with embedded English words, single speaker, 45.2 hours.
- Example: 我們的 solution 是 Viterbi algorithm ("Our solution is the Viterbi algorithm").
Experiments: ASR Accuracy
- Acoustic model (AM): bilingual, trained on out-of-domain corpora as background and adapted with some data from the target speaker.
- Language model (LM): adaptive trigram interpolation with an in-domain corpus.

Language | Mandarin | English | Overall
Char Acc (%) | 78.15 | 53.44 | 76.26
Experiments: Reference Key Terms
- Annotations were collected from 61 students who had taken the course.
- If the k-th annotator labeled N_k key terms, each of them received a score of 1/N_k, and all other terms received 0.
- Terms are ranked by the sum of the scores given by all annotators.
- The top N terms of the list are chosen, where N is the average N_k: N = 154 key terms (59 key phrases and 95 keywords).
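A sketch of this aggregation, assuming the 1/N_k per-annotator weighting described above:

```python
from collections import defaultdict

def reference_key_terms(annotations):
    """annotations: one set of labeled key terms per annotator. Each of
    the N_k terms from annotator k scores 1/N_k; terms are ranked by the
    summed score and the top N kept, N being the average N_k."""
    scores = defaultdict(float)
    for labeled in annotations:
        for term in labeled:
            scores[term] += 1.0 / len(labeled)
    n = round(sum(len(a) for a in annotations) / len(annotations))
    return sorted(scores, key=scores.get, reverse=True)[:n]
```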
Experiments: Evaluation
- Unsupervised learning: the number of extracted key terms is set to N.
- Supervised learning: 3-fold cross-validation.
Experiments: Feature Effectiveness
Neural network for keywords from ASR transcriptions; F-measure (%): prosodic (Pr) 20.78, lexical (Lx) 42.86, semantic (Sm) 35.63, Pr + Lx 48.15, Pr + Lx + Sm 56.55.
- Each feature set alone gives an F1 between 20% and 42%.
- Prosodic and lexical features are additive.
- All three feature sets are useful.
Experiments: Overall Performance
F-measure (%) on manual and ASR transcriptions:

Approach | Manual | ASR
Conventional TFIDF (stop-word removal, PoS filtering, w/o branching entropy) | 23.38 | 20.78
TFIDF with branching entropy | 51.95 | 43.51
K-means exemplar | 55.84 | 52.60
AdaBoost (AB) | 62.39 | 57.68
Neural network (NN) | 67.31 | 62.70

- Branching entropy performs well.
- K-means exemplar outperforms TFIDF, and supervised approaches are better than unsupervised ones.
- Performance on ASR transcriptions is slightly worse than on manual transcriptions but reasonable; supervised learning with a neural network gives the best results.
Conclusion
- We proposed a new approach to extracting key terms from course lectures.
- Performance is improved by identifying phrases with branching entropy and by combining prosodic, lexical, and semantic features.
- The results are encouraging.
Thanks
We thank the reviewers for their valuable comments.
NTU Virtual Instructor: http://speech.ee.ntu.edu.tw/~RA/lecture