AQUAINT Herbert Gish and Owen Kimball June 11, 2002 Answer Spotting.

Presentation transcript:

AQUAINT Herbert Gish and Owen Kimball June 11, 2002 Answer Spotting

2 AQUAINT Project Goals
Primary Objectives
– Develop answer-spotting technology to provide analysts with the best answers available from a spontaneous speech database
– Develop the application for multiple languages and with potentially limited resources
   • Low-density languages
Application Features
– Explain to the user the basis for decisions
– Export semantic components of the answer to a multi-media system
– Account for variability in resources in extracting information
– Enable rapid deployment in new languages

3 AQUAINT BBN Approach
Topic-dependent language models
Semantic-category class grammars
Unsupervised training methods

4 AQUAINT Query Formation
A query for the conversational speech domain addresses:
– Topic
   • Domains of interest to the analyst
– Semantic categories/classes
   • Key topic components
– Keywords/Phrases
   • The elements of the semantic categories
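
The three-part query described on this slide (topic, semantic categories, keywords/phrases) could be represented roughly as follows. This is a minimal sketch, not the BBN system's data model: the class names `Query` and `SemanticCategory` and the method `keyword_index` are hypothetical, and the example entries are taken from the "Buying a car" topic in this deck.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SemanticCategory:
    """A key topic component plus the keywords/phrases that instantiate it."""
    name: str
    keywords: List[str] = field(default_factory=list)


@dataclass
class Query:
    """Hypothetical query container: one topic and its semantic categories."""
    topic: str
    categories: List[SemanticCategory] = field(default_factory=list)

    def keyword_index(self) -> Dict[str, str]:
        """Map each lower-cased keyword/phrase to its category name."""
        return {
            kw.lower(): cat.name
            for cat in self.categories
            for kw in cat.keywords
        }


query = Query(
    topic="Buying a car",
    categories=[
        SemanticCategory("Make/Manufacturer", ["BMW", "Ford", "Saab"]),
        SemanticCategory("Model", ["Taurus", "Passat"]),
    ],
)
index = query.keyword_index()
print(index["ford"])
```

A flat keyword-to-category index like this would only support exact word and phrase spotting; the deck's approach generalizes beyond it by modeling categories inside the language model.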

5 AQUAINT Answer Spotting Approach
Recognize semantic-category-specific language activity
– Generalization of word and phrase spotting
Integrate the topic search for best answers into the speech recognition process
– Use topic-relevant language model(s) to select relevant data
– Incorporate semantic classification of words or phrases into the language model used in recognition
– Requires minimal resources and provides the best performance
Train topic classifiers without supervision
Post-process speech recognition output to put together the semantic components of the answer

6 AQUAINT Choice of Corpora
Desired Corpora Features
– Spontaneous (telephone) speech
– Conversations between people
– Consistent query formation and answer representation from data
Selected Corpora
– Switchboard
   • Spontaneous telephone conversations between strangers
   • Topic-driven conversations
   • Abundant amounts of transcribed data
– Callhome
   • Spontaneous telephone conversations between family members
   • Corpora available in multiple languages: Spanish, Mandarin and Arabic

7 AQUAINT Query Formation for Switchboard
Topics
– Selected 5 diverse topics
– Topic descriptions: Buying a car, Credit cards, News media, Vacation spots, Music
– Amount of data for each topic varies from 30 to 60 conversations
Semantic Categories/Classes
– For each topic, defined a set of semantic categories
– At least 5 categories per topic were picked
– Manual annotation of semantic categories; no syntactic information used in annotation
User-Defined Keywords/Phrases

8 AQUAINT Topic: Buying a Car
What kind of car do you think you might buy next? What sorts of things will enter your decision? See if your requirements and the other caller’s requirements are similar.

Semantic Categories    Example words/phrases
Make/Manufacturer      BMW, Ford, Saab
Model                  Taurus, Passat
Class                  Van, wagon, sports car
Year/Timeline          Used, new, 1968, couple of years old
Place of origin        American, Japanese, European

9 AQUAINT Topic: News Media
Discuss how you and the caller keep up with current events.

Semantic Categories          Example words/phrases
Media Types                  Radio, television (TV), newspaper (paper), magazine, written, tabloid
Television Channels          CNN, ABC, Channel Five
Newspaper/Magazine           Dallas Morning Herald, Newsweek
Radio Channels               NPR, Christian radio station, KRLD, CNN, ninety point one
Shows (TV & radio)           Headline News, Sixty Minutes
Anchor Names/Personalities   Peter Jennings, Bruce Williams

10 AQUAINT System Architecture

11 AQUAINT System Components
Recognizer
– State-of-the-art Byblos system
– Real-time or near real-time performance
Topic Identifier
– Parallel language model structure in the recognizer that separates query-topic data from non-topic data
– Topic & text integrator that uses language model information and word confidences to filter relevant text
Category Identifier
– Categories integrated into the language model, or
– Use a separate component, for example, Identifinder
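
The topic identifier's use of "language model information and word confidences" can be illustrated with a toy scorer. This is only a sketch under stated assumptions: it uses add-one-smoothed unigram models and a confidence-weighted log-likelihood ratio, which is a stand-in technique and not the parallel language model structure inside Byblos. The function names `train_unigram` and `topic_score`, the toy training sentences, and the confidence values are all hypothetical.

```python
import math
from collections import Counter


def train_unigram(texts, vocab, alpha=1.0):
    """Add-alpha smoothed unigram probabilities over a fixed vocabulary."""
    counts = Counter(w for t in texts for w in t.lower().split())
    total = sum(counts[w] for w in vocab) + alpha * len(vocab)
    return {w: (counts[w] + alpha) / total for w in vocab}


def topic_score(words_with_conf, topic_lm, background_lm):
    """Confidence-weighted mean log-likelihood ratio; > 0 favours the topic."""
    num = 0.0
    den = 0.0
    for word, conf in words_with_conf:
        w = word.lower()
        if w in topic_lm:  # out-of-vocabulary words are skipped
            num += conf * math.log(topic_lm[w] / background_lm[w])
            den += conf
    return num / den if den > 0 else 0.0


# Toy data (hypothetical), standing in for topic and non-topic transcripts.
topic_texts = ["i want to buy a new car", "the ford dealer sold me a used car"]
background_texts = ["we watch the news on television",
                    "i read the paper every morning"]
vocab = {w for t in topic_texts + background_texts for w in t.split()}

topic_lm = train_unigram(topic_texts, vocab)
background_lm = train_unigram(background_texts, vocab)

# Each recognized word carries a (hypothetical) recognizer confidence.
on_topic = [("car", 0.9), ("dealer", 0.8), ("the", 0.5)]
off_topic = [("news", 0.9), ("television", 0.7)]
print(topic_score(on_topic, topic_lm, background_lm) > 0)
print(topic_score(off_topic, topic_lm, background_lm) < 0)
```

Weighting by confidence means that words the recognizer is unsure about contribute less to the topic decision, which is one plausible reading of how word confidences could help filter relevant text.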

12 AQUAINT An Alternate Configuration
Recognizer employs a standard n-gram language model
Topics identified after recognition
Semantic content extracted after recognition
Can provide a baseline for the original configuration
The choice for low WER recognition
[Diagram labels: Speech Recognizer, Semantic, Parallel Language Model, (Identifinder), Topic Id, Information]

13 AQUAINT Category Identification - Identifinder
[Diagram labels: Start of sentence, End of sentence, Semantic Class 1, Semantic Class N, Not-a-Semantic-Class]
1. Identifinder is an HMM with internal states defined by the semantic classes and a single “not-a-semantic-class” state.
2. The state generates words conditioned on the previous state as well as the previous word.
3. Word features can also be used in addition to the word identity.
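
The HMM view of category identification can be sketched with a small Viterbi decoder. This is a deliberately simplified illustration and not Identifinder itself: emissions here depend only on the current state, whereas the slide says the real model also conditions on the previous word and can use word features. The state names, the probability tables, and the function `viterbi` are hypothetical, hand-set values for demonstration.

```python
import math

# OTHER plays the role of the single not-a-semantic-class state.
STATES = ["MAKE", "CLASS", "OTHER"]


def viterbi(words, start_p, trans_p, emit_p, floor=1e-6):
    """Return the most likely state sequence for a non-empty word list."""
    # Each column maps state -> (best log score ending here, backpointer).
    best = [{
        s: (math.log(start_p[s])
            + math.log(emit_p[s].get(words[0], floor)), None)
        for s in STATES
    }]
    for w in words[1:]:
        column = {}
        for s in STATES:
            prev, score = max(
                ((p, best[-1][p][0] + math.log(trans_p[p][s]))
                 for p in STATES),
                key=lambda x: x[1],
            )
            column[s] = (score + math.log(emit_p[s].get(w, floor)), prev)
        best.append(column)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: best[-1][s][0])
    path = [state]
    for column in reversed(best[1:]):
        state = column[state][1]
        path.append(state)
    return list(reversed(path))


# Hand-set toy parameters (hypothetical).
start_p = {"MAKE": 0.1, "CLASS": 0.1, "OTHER": 0.8}
trans_p = {s: {"MAKE": 0.1, "CLASS": 0.1, "OTHER": 0.8} for s in STATES}
emit_p = {
    "MAKE": {"ford": 0.5, "bmw": 0.5},
    "CLASS": {"van": 0.5, "wagon": 0.5},
    "OTHER": {"i": 0.2, "bought": 0.2, "a": 0.2, "the": 0.2, "drive": 0.2},
}

labels = viterbi("i bought a ford van".split(), start_p, trans_p, emit_p)
print(labels)
```

Unseen words fall back to a small floor probability in every state, so with these toy tables they tend to be labeled OTHER, whose prior and transition mass is largest.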

14 AQUAINT Portability Across Languages
Use Callhome corpora for testing system capabilities
– Callhome English has conversations between family members
– Topics range from family events to immigration issues
Callhome is available in multiple languages
– Languages that can be tested include Spanish, Mandarin and Arabic
– Limited data and linguistic resources are available in these languages, posing additional technical challenges

15 AQUAINT Two Modes of Operation
Significant Resources
– Moderate WER (30%-40%) LVCSR available
– Hundreds (nominally) of hours of transcribed data
– Large phonetic dictionary
Limited Resources
– Few hours of transcribed data in the domain of interest
– Dictionary limited to training

16 AQUAINT Using Limited Resources
Investigate effect of variations in data on various system components
– Impact of reduced number of manually annotated conversations on category identification
   • Use word clustering on other available text resources to find words that fit into the semantic classes of interest
   • Use relevance feedback techniques, where the user provides feedback that can be used to adapt system response
– Impact of reduced transcriptions for acoustic/language modeling on recognition performance
   • Use auto-transcription techniques if additional audio data is available
   • Use available newspaper & broadcast news text to augment language modeling performance

17 AQUAINT Using Limited Resources
Building a system with limited data resources and/or linguistic expertise
– Enabling rapid deployment in new languages where linguistic resources (for example, word pronunciation dictionary or word transcriptions) are limited
– Annotating topics and semantic categories in a new language where transcriptions are limited

18 AQUAINT Progress Overview
Annotation completed
– Semantic categories
Integrated recognizer-semantics
– Language model still being developed
Baseline system (LVCSR/Identifinder) implemented
– Initial experiments measuring performance of Identifinder on semantic categories
Topic classification in the limited data regime
– Topic classification with approximately 4 hours of training
– Technology has been transferred to a government agency
Classification performance with diffuse topics

19 AQUAINT Initial Experiments
Finding the Semantic Categories
– Employed recognition followed by Identifinder
– Real-time Byblos recognizer trained on Switchboard; WER 38%
– Trained Identifinder with annotated data from the 3 topics
– Evaluated on manual transcriptions and after decoding
Finding Topics with Limited Resources

20 AQUAINT Finding Semantic Categories cont.
Comparison of performance on manually transcribed speech and after speech recognition

Topic          F-measure (Manually Transcribed)   F-measure (After Speech Recognition)
Credit cards   77.28                              60.44
Buying a car   85.64                              65.49
News media     73.62                              59.22
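
For readers unfamiliar with the metric, the sketch below shows how an F-measure is computed and how much of it is lost after recognition on the reported figures. The slide does not say which F-measure variant was used; the balanced form (harmonic mean of precision and recall) is assumed here. The function names and the `REPORTED` table layout are hypothetical; the numbers are copied from the slide.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# F-measures (in percent) as reported on the slide:
# topic -> (manually transcribed, after speech recognition)
REPORTED = {
    "Credit cards": (77.28, 60.44),
    "Buying a car": (85.64, 65.49),
    "News media": (73.62, 59.22),
}


def relative_drop(manual: float, recognized: float) -> float:
    """Fraction of the manual-transcript F-measure lost after recognition."""
    return (manual - recognized) / manual


drops = {t: relative_drop(m, r) for t, (m, r) in REPORTED.items()}
for topic, drop in drops.items():
    print(f"{topic}: {drop:.1%} relative drop in F-measure")
```

On the reported numbers the relative drop works out to roughly 20 to 24 percent per topic, with a recognizer operating at 38% WER.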

21 AQUAINT Discussion
Initial results show that semantic categories can survive recognition errors
Experiments with limited training are giving very encouraging results
Need to integrate language model into the recognizer
Explore semantic categories in the limited data situation
Provide confidence measures for keywords/phrases
Investigate other methods for characterizing performance
Characterize performance as a function of word error rate
Start work on demo system
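
Characterizing performance as a function of word error rate requires computing WER in the first place. The sketch below uses the standard definition (substitutions + deletions + insertions, divided by the number of reference words) via word-level edit distance. The function name `word_error_rate` and the example sentences are hypothetical, and real scoring tools usually also normalize text before alignment.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,             # deletion of a reference word
                curr[j - 1] + 1,         # insertion of a hypothesis word
                prev[j - 1] + (r != h),  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


wer = word_error_rate("i bought a ford van", "i bought ford fan")
print(f"WER = {wer:.0%}")
```

In this toy example one word is deleted and one is substituted against a five-word reference. Pairing such WER values with the per-topic F-measures would give the performance-versus-WER characterization listed above.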