DIVINES – Speech Rec. and Intrinsic Variation W.S., May 20, 2006, Richard Rose. DIVINES SRIV Workshop: The Influence of Word Detection Variability on IR Performance.

The Influence of Word Detection Variability on IR Performance in Automatic Audio Indexing of Course Lectures
DIVINES SRIV Workshop, Saturday May 20, 2006
Richard Rose (1), Renato Rispoli (1), and Jon Arrowood (2)
(1) McGill University, Dept. of ECE, Montreal, QC, Canada
(2) Nexidia Inc., Atlanta, GA, USA

Indexing Audio Lectures
Existing multimedia resources have the potential to make recorded university lectures and seminars accessible online to a wider audience.
It is important that the audio lectures be searchable, but human annotation of large corpora is expensive.
Automatic Speech Recognition (ASR) based tools can be used to facilitate search of the untranscribed audio material.

An Audio Search Tool for Course Lectures
[Screenshot of the user interface, developed by Nexidia. Labeled elements: text query term; retrieved segments from lecture audio files (click to listen to an audio segment); synchronized presentation slides.]

Audio Indexing of Lectures - Motivation
Goal – Provide disabled and non-disabled students and scholars access to a large collection (thousands of hours) of audio lectures and seminars
Multimedia – Permit synchronization and interpretation of audio with lecture slides and video content
Challenges – Large variability in dialect, speaking style, recording conditions, and task domain

Issues in Audio Indexing
Acoustic – Extraction of query terms from audio
  – Must be extremely fast during search (>>1,000 x real-time)
Information Retrieval (IR) – Definition of relevance measure
  – Score query against hypothesized audio segment
Task Domain – Definition of the notion of relevance
  – When does a relevant segment begin and end?
Evaluation Metrics
  – Acoustic: ASR word error rate, keyword detection performance
  – IR: Precision / recall of relevant segments
  – Task Domain: Increase in productivity for the target user community

Audio Indexing Task Domains
Several techniques have been applied to indexing of spoken audio in several task domains:
[Rose, 1991]:
  – Task: Topic spotting from conversational speech
  – Method: Keyword spotting
[Foote et al, 1997]:
  – Task: Retrieval of multimedia mail messages (video mail browser)
  – Method: Phone lattice based open vocabulary indexing
[Garofolo, 2000]:
  – Task: Spoken Document Retrieval (SDR) from Broadcast News
  – Method: Large vocabulary continuous speech recognition (LVCSR)
Course Lectures:
  – How to define a topic of interest?
  – How to segment a continuous lecture by topic?
  – How to define query terms and extract them from audio?

A Preliminary Study of Audio Indexing
Phone Lattice-Based Search Engine
  – Off-line Lattice Generation (50 x real-time): Obtain phonetic lattice from utterance
  – Search (100,000 x real-time): Submit text based keyword queries, obtain phonetic expansion, find best match in phone lattice
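As a rough illustration of the search step (text query, phonetic expansion, match against phones), here is a toy Python sketch. It is not the Nexidia engine: it assumes a single best phone string rather than a lattice, a tiny hypothetical pronunciation dictionary, and plain edit distance in place of acoustic scores. The names `PRONUNCIATIONS`, `edit_distance`, and `search_phones` are made up for this example.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical one-entry pronunciation dictionary (assumption for illustration).
PRONUNCIATIONS: Dict[str, List[str]] = {
    "dielectric": "d ay l eh k t r ih k".split(),
}


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two phone sequences."""
    prev = list(range(len(b) + 1))
    for i, pa in enumerate(a, 1):
        cur = [i]
        for j, pb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (pa != pb)))  # substitution or match
        prev = cur
    return prev[-1]


def search_phones(query: str, phones: Sequence[str],
                  max_errors: int = 1) -> List[Tuple[int, int]]:
    """Return (start_index, distance) for windows that approximately match.

    Simplification: slides a fixed-length window over a 1-best phone string
    instead of searching a phone lattice with posterior scores.
    """
    target = PRONUNCIATIONS[query]  # phonetic expansion of the text query
    n = len(target)
    hits = []
    for start in range(len(phones) - n + 1):
        dist = edit_distance(target, phones[start:start + n])
        if dist <= max_errors:
            hits.append((start, dist))
    return hits


if __name__ == "__main__":
    decoded = "dh ax d iy l eh k t r ih k".split()
    print(search_phones("dielectric", decoded, max_errors=1))
```

Allowing one phone error lets the query still match audio decoded with the alternative first vowel, which is the kind of pronunciation mismatch the later slides examine.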

Evaluating Information Retrieval Performance
Database – Twelve hours of lectures from McGill ECE Photonics Course (Prof. Andrew Kirk)
Domain Experts – Course TAs
Target Domain – Example questions taken from course material
  – Sample question: “Explain the modal properties of a conducting waveguide from the point of view of destructive and constructive interference”
Relevance Labeling
  – Domain experts identify lecture segments that are relevant to the question
  – A lecture segment is the audio that overlaps a given lecture slide

Relevance Measure
Given an audio segment k of known length in seconds, and a query containing a set of query terms:
  – Obtain the hypothesized occurrences of each query term i, each with an acoustic posterior score
  – Combine the weighted posterior scores to obtain a measure of relevance for segment k with respect to the query
[Figure: audio segment k, showing the hypothesized occurrences of term i and the acoustic scores for query term i]

Relevance Measure - Normalization
There are two normalization components:
  – Acoustic Confidence Normalization: function of the average Figure of Merit (FOM) observed for the query term
      FOM: average of the detection probability over a range of false alarm rates
  – Document Length Normalization: estimate of the number of words in audio segment k
      Relies on an estimate of the speaking rate in words/sec.
Relevance Measure: [equation not preserved in this transcript]
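The slide's equation did not survive in this transcript, so the following Python sketch only shows one way the described pieces could fit together. It is not the authors' formula. Assumptions: each term's summed posterior scores are weighted by that term's FOM expressed as a fraction, and the total is divided by an estimated word count (speaking rate times segment length). The function name, argument names, and the default rate of 2.5 words/sec are hypothetical.

```python
from typing import Dict, List


def relevance(posteriors_by_term: Dict[str, List[float]],
              fom_by_term: Dict[str, float],
              segment_seconds: float,
              words_per_second: float = 2.5) -> float:
    """Illustrative relevance score of one audio segment for one query.

    posteriors_by_term: acoustic posterior scores of the hypothesized
        occurrences of each query term inside the segment.
    fom_by_term: average Figure of Merit per term as a fraction in [0, 1],
        used here directly as the acoustic confidence weight (assumption).
    """
    if segment_seconds <= 0 or words_per_second <= 0:
        raise ValueError("segment length and speaking rate must be positive")

    # Document length normalization: estimated number of words in the segment.
    estimated_words = words_per_second * segment_seconds

    total = 0.0
    for term, posteriors in posteriors_by_term.items():
        weight = fom_by_term[term]        # acoustic confidence normalization
        total += weight * sum(posteriors)
    return total / estimated_words


if __name__ == "__main__":
    score = relevance(
        {"waveguide": [0.9, 0.6], "modal": [0.5]},
        {"waveguide": 0.8, "modal": 0.5},
        segment_seconds=60.0,
    )
    print(round(score, 6))
```

Weighting by FOM discounts terms that the word spotter detects unreliably, and dividing by the estimated word count keeps long segments from scoring highly just because they contain more speech.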

Acoustic Variability
Impact of length of phonetic baseform on word detection performance
Word duration in phones: effect of word length on detection performance
Figure of Merit (FOM): average probability of detection over the range from 0 to 10 false alarms per keyword per hour
Figure of Merit vs. Baseform Length:
  Baseform Phones       FOM (%)
  5 or less             [missing]
  [missing] or more     73.03
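The FOM definition above can be sketched in Python as follows. This is a simplified version for a single keyword, assuming that the detection probability is sampled at each successive false alarm up to 10 per hour and then averaged, without the interpolation term that some scoring tools add. The function name and input layout are hypothetical.

```python
from typing import List, Tuple


def figure_of_merit(detections: List[Tuple[float, bool]],
                    num_true: int,
                    hours: float,
                    max_fa_rate: float = 10.0) -> float:
    """Average detection probability from 0 to max_fa_rate FA/KW/HR.

    detections: (score, is_hit) pairs for one keyword's putative hits.
    num_true:   number of true occurrences of the keyword in the audio.
    hours:      duration of the searched audio in hours.
    """
    n_slots = round(max_fa_rate * hours)  # false alarms allowed in this audio
    if num_true <= 0 or n_slots <= 0:
        raise ValueError("need at least one true occurrence and one FA slot")

    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    hits = 0
    pd_at_false_alarm: List[float] = []
    for _score, is_hit in ranked:
        if is_hit:
            hits += 1
        else:
            # Detection probability just before this false alarm is admitted.
            pd_at_false_alarm.append(hits / num_true)
            if len(pd_at_false_alarm) == n_slots:
                break

    # Fewer false alarms than slots: detection stays at its final level.
    while len(pd_at_false_alarm) < n_slots:
        pd_at_false_alarm.append(hits / num_true)

    return sum(pd_at_false_alarm) / n_slots


if __name__ == "__main__":
    dets = [(0.9, True), (0.8, False), (0.7, True),
            (0.6, True), (0.5, False), (0.4, True)]
    print(figure_of_merit(dets, num_true=4, hours=0.2))
```

Multiplying the result by 100 gives a percentage comparable in form to the FOM (%) column.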

Acoustic Variability
Impact of accuracy of phonetic baseforms on word spotting performance
Word pronunciation: comparison of 2 phonetic expansions of the word “dielectric”
  – d iy l eh k t r ih k
  – d ay l eh k t r ih k
[Figure: Prob. of Detection vs. False Alarms per Keyword per Hour (FA/KW/HR) for the two expansions]

IR Performance
Define a relevance metric based on the normalized frequency of occurrence of keywords chosen by domain experts
Rank segments of messages based on the relevance metric
Plot results: % of queries with at least one relevant document in the top R ranks
  Rank (R)     Text         Speech
  5            75%          58.33%
  [missing]    [missing]    66.67%
  [missing]    [missing]    75%
  [missing]    [missing]    83.33%
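The metric reported here, the percentage of queries with at least one relevant document in the top R ranks, can be computed with a short Python sketch like the one below. The function name, the data layout, and the segment identifiers are hypothetical and only illustrate the calculation.

```python
from typing import Dict, List, Set


def success_at_rank(rankings: Dict[str, List[str]],
                    relevant: Dict[str, Set[str]],
                    r: int) -> float:
    """Percentage of queries with >= 1 relevant segment in the top r ranks.

    rankings: query id -> segment ids ordered by decreasing relevance score.
    relevant: query id -> segment ids labeled relevant by domain experts.
    """
    if not rankings:
        raise ValueError("no queries to evaluate")
    successes = sum(
        1
        for query, ranked in rankings.items()
        if any(seg in relevant.get(query, set()) for seg in ranked[:r])
    )
    return 100.0 * successes / len(rankings)


if __name__ == "__main__":
    rankings = {
        "q1": ["s3", "s1", "s7"],
        "q2": ["s2", "s5", "s9"],
        "q3": ["s4", "s8", "s6"],
    }
    relevant = {"q1": {"s1"}, "q2": {"s9"}, "q3": {"s0"}}
    for r in (1, 2, 3):
        print(r, round(success_at_rank(rankings, relevant, r), 2))
```

Running the same function on rankings produced from text transcripts and from speech detections would yield the two columns being compared.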