LREC Combining Multiple Models for Speech Information Retrieval Muath Alzghool and Diana Inkpen University of Ottawa Canada
Presentation Outline
- Task: Speech Information Retrieval
- Data: MALACH collection (Oard et al., 2004)
- System description
- Model fusion
- Experiments using model fusion
- Results of the cross-language experiments
- Results of manual keywords and summaries
- Conclusion and future work
The MALACH collection
- Used in the Cross-Language Speech Retrieval (CLSR) task at the Cross-Language Evaluation Forum (CLEF).
- "Documents" (segments) from 272 interviews with Holocaust survivors, totaling 589 hours of speech: ASR transcripts with a word error rate of 25-38%.
- Additional metadata: automatically-assigned keywords, manually-assigned keywords, and a manual 3-sentence summary.
- A set of 63 training topics and 33 test topics, created in English from actual user requests and translated into Czech, German, French, and Spanish by native speakers.
- Relevance judgments were generated using standard pooling.
Segments
- VHF[IntCode]-[SegId].[SequenceNum]
- Interviewee name(s) and birthdate
- Full name of every person mentioned
- Thesaurus keywords assigned to the segment
- 3-sentence segment summary
- ASR transcript produced in 2004
- ASR transcript produced in 2006
- Thesaurus keywords from a kNN classifier
- Thesaurus keywords from a second kNN classifier
Example topic (English)
1159 Child survivors in Sweden
Describe survival mechanisms of children born between 1930 and 1933 who spent the war in concentration camps or in hiding and who presently live in Sweden. The relevant material should describe the circumstances and inner resources of the surviving children. The relevant material also describes how the wartime experience affected their post-war adult life.
Example topic (French)
1159 Les enfants survivants en Suède
Descriptions des mécanismes de survie des enfants nés entre 1930 et 1933 qui ont passé la guerre en camps de concentration ou cachés et qui vivent actuellement en Suède. Les documents recherchés devront décrire les circonstances ainsi que les ressources propres aux enfants ayant survécu. Les documents en question devront décrire également l'influence que l'expérience de la guerre a eu sur leur vie d'adulte après la guerre.
System Description
- SMART: Vector Space Model (VSM).
- Terrier: Divergence From Randomness (DFR) models.
- Two query expansion methods: (1) based on a thesaurus (novel technique); (2) blind relevance feedback (12 terms from the top 15 documents), based on the Bose-Einstein 1 model (Bo1 from Terrier).
- Model fusion: sum of normalized weighted similarity scores (novel way to compute the weights).
- Combined output of 7 machine translation tools.
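To make the blind relevance feedback step concrete, here is a minimal sketch of Bo1 (Bose-Einstein 1) term scoring in the spirit of Terrier's query expansion: candidate terms from the top-ranked documents are scored against their expected frequency in the whole collection, and the best 12 are added to the query. The function name, data layout, and toy statistics are our own illustration, not the actual system code.

```python
import math
from collections import Counter

def bo1_expansion(feedback_docs, coll_freq, num_docs, n_terms=12):
    """Rank candidate expansion terms with the Bo1 model.

    feedback_docs: list of token lists (the top-ranked documents).
    coll_freq: term -> frequency in the whole collection.
    num_docs: number of documents in the collection.
    """
    # Term frequency within the pseudo-relevant feedback set.
    tfx = Counter(t for doc in feedback_docs for t in doc)
    scores = {}
    for term, f in tfx.items():
        # Expected frequency of the term under a randomness model.
        pn = coll_freq.get(term, 1) / num_docs
        scores[term] = f * math.log2((1 + pn) / pn) + math.log2(1 + pn)
    # Keep the highest-scoring terms as query expansion candidates.
    return sorted(scores, key=scores.get, reverse=True)[:n_terms]
```

Terms that are frequent in the feedback documents but rare in the collection get the highest Bo1 scores, which is the intended informativeness signal.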
Model Fusion
Combine the results of different retrieval strategies from SMART (14 runs) and Terrier (1 run). Each technique retrieves a different set of relevant documents; therefore combining the results can produce a better ranking than any individual technique.
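The fusion scheme above (a weighted sum of normalized similarity scores across runs) can be sketched as follows. This is an illustrative implementation under the assumption of min-max score normalization; the slide does not specify the normalization, and the per-run weights (computed from training performance in the actual system) are taken here as given inputs.

```python
def normalize(scores):
    """Min-max normalize one run's document scores to [0, 1] (assumed scheme)."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def fuse(runs, weights):
    """Weighted sum of normalized similarity scores across retrieval runs.

    runs: list of dicts mapping document id -> similarity score.
    weights: one weight per run (e.g. derived from training-set performance).
    Returns documents sorted by fused score, best first.
    """
    fused = {}
    for run, w in zip(runs, weights):
        for doc, s in normalize(run).items():
            fused[doc] = fused.get(doc, 0.0) + w * s
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)
```

A document retrieved by several runs accumulates score from each, so agreement between retrieval strategies is rewarded even when no single run ranks it first.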
Experiments using Model Fusion
- Applied the data fusion methods to the 14 runs produced by SMART and the one run produced by Terrier.
- The % change is given with respect to the run with the better performance in each combination on the training data.
- Model fusion improves performance (MAP and Recall) on the test data:
  - Monolingual (English): 6.5% improvement (not statistically significant).
  - Cross-language (French): 21.7% improvement (statistically significant).
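Since the experiments are reported in terms of MAP, here is a short reference implementation of average precision and its mean over topics, for readers unfamiliar with the metric. This is the standard textbook definition, not code from the system described here.

```python
def average_precision(ranked, relevant):
    """AP: mean of precision values at the ranks where relevant docs appear."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs_by_topic, qrels):
    """MAP: average of per-topic AP over all judged topics."""
    return sum(average_precision(runs_by_topic[q], qrels[q])
               for q in qrels) / len(qrels)
```

For example, ranking ["a", "b", "c"] against relevant set {"a", "c"} gives AP = (1/1 + 2/3) / 2 = 5/6.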
Experiments using Model Fusion (MAP)
Weighting schemes evaluated: nnc.ntc, nnn.ntn, ntn.nnn, ltn.ntn, atn.ntn, In(exp)C (per-scheme scores not recoverable from this slide).

                   Manual En       Auto-English    Auto-French     Auto-Spanish
                   Train   Test    Train   Test    Train   Test    Train   Test
Fusion % change    1.0%    0.9%    0.0%    6.5%    0.4%    21.7%   -2.2%   10.0%
Fusion % change    0.2%    0.1%    0.6%    5.1%    3.6%    19.1%   0.7%    0.8%
Experiments using Model Fusion (Recall)
Weighting schemes evaluated: nnc.ntc, nnn.ntn, ntn.nnn, ltn.ntn, atn.ntn, In(exp)C (per-scheme scores not recoverable from this slide).

                   Manual En       Auto-English    Auto-French     Auto-Spanish
                   Train   Test    Train   Test    Train   Test    Train   Test
Fusion % change    0.3%    0.5%    0.3%    1.0%    0.6%    -1.0%   0.1%    2.4%
Fusion % change    0.3%    0.0%    0.8%    1.2%    -0.7%   -5.5%   -0.7%   -1.5%
Results of the cross-language experiments
- The cross-language results for French are very close to the monolingual (English) results on the training data (the difference is not significant), but not on the test data (the difference is significant).
- The difference between the cross-language results for Spanish and monolingual (English) is significant on the training data but not on the test data.
(Table: results for English, French, and Spanish on training and test sets; values not recoverable from this slide.)
Results of manual keywords and summaries
- Experiments on manual keywords and manual summaries showed large improvements compared to Auto-English.
- Our results (for manual and automatic runs) are the highest to date on this data collection in CLEF CLSR.
(Table: Manual English vs. Auto-English on training and test sets; values not recoverable from this slide.)
Conclusion and future work
- Model fusion improves retrieval significantly for some experiments (Auto-French) and not significantly for others (Auto-English).
- The idea of using multiple translations proved to be good (based on previous experiments).
- Future work: investigate more methods of model fusion; remove or correct some of the speech recognition errors in the ASR content words.