SAFIRE: Situational Awareness for Firefighters
Using Acoustic Signal for Enhancing Situational Awareness in SAFIRE
Dmitri V. Kalashnikov

Presentation transcript:

High-level Overview & Vision
Types of Acoustic Analysis
− Human Speech: who spoke to whom, about what, from where, and when
− Ambient Sounds: explosions, loud sounds, screaming, etc.
− Physiological Events: cough, gag, excited state of speaker, slurring, …
− Other features: too loud, too quiet for too long, …
Diagram stages: Acoustic Capture, Acoustic Analysis, SA Applications
Diagram labels: Speech, Voice, Amb. Noise, Processing
SA Applications: Alerts, Conversation Monitoring & Playback, Image & Video Tagging, Spatial Messaging, Localization via Speech

SA Apps
Alerts
Purpose: alerts the IC when certain events happen
– Capture firefighter conversations
– E.g., if a conversation mentions “victim”, an alert is raised
Conversation Monitoring & Playback
Purpose: allows the IC to quickly locate and play back speech blocks that might contain critical info, by visualizing multiple firefighter conversations.
Image & Video Tagging
Purpose: allows firefighters to capture images of a crisis site and annotate them with important tags using a speech interface. The images are then triaged to the IC for analysis.
Spatial Messaging
Purpose: allows firefighters to leave spatial messages via a speech interface
– “This room is clear”
– Anyone walking into this room will get the message.
Localization via Speech
Purpose: creates an additional firefighter localization capability
– GPS does not work well indoors
– E.g., “I’m near room 101 on the 4th floor”

Core Challenge (for ongoing projects)
Recognition quality bottleneck
– Poor recognition quality in noisy & realistic environments
Example: Speech “This is a bad sentence” → Speech Recognizer → Output “This is a bed sun tan”

Different Goals of ASR & SA Applications
Recognition
– Example: “This is a bad sentence” is recognized as “This is a bed sun tan”
– Quality Metric: Word Error Rate (WER)
Acoustic Tagging & Retrieval
– Diagram labels: Query, Retrieval Algo, DB, Retrieve correctly
– Quality Metric: Precision, recall, F-measure of returned images and activated triggers
It is possible to build a good retrieval system on uncertain data.
A high WER does not imply low retrieval & SA quality.
Observe: errors in words that are not in triggers do not matter
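The two kinds of metric on this slide can be compared on the slide's own example. The following is a minimal sketch, not the authors' evaluation code: WER is computed as word-level edit distance divided by reference length, and the trigger scenario (the utterance about a victim and its misrecognized version) is a hypothetical illustration of the "errors outside triggers do not matter" observation.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref, hyp = reference.lower().split(), hypothesis.lower().split()
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)


def precision_recall_f1(returned, relevant):
    """Set-based precision, recall, and F-measure (F1)."""
    returned, relevant = set(returned), set(relevant)
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


# The slide's example: two substitutions and one insertion over five words.
wer = word_error_rate("This is a bad sentence", "This is a bed sun tan")

# Hypothetical trigger check: the trigger word "victim" survives recognition
# even though other words in the utterance are wrong.
triggers = {"victim"}
spoken = "we found a victim in the hallway".split()
recognized = "we fond a victim in the whole way".split()
fired = triggers & set(recognized)
expected = triggers & set(spoken)
p, r, f = precision_recall_f1(fired, expected)
```

In this sketch the recognizer output has a nonzero WER, yet the trigger-level precision and recall are both perfect because the misrecognized words are not trigger words.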

Research Techniques for Enhancing Quality
Semantics
Idea: use past data to derive models of how content has been annotated in the past.
– Use N-best lists
– Correlation analysis
– Probabilistic model based on Max Entropy
– Speed optimization techniques
Combining Recognizers
Idea: combining the results of multiple recognizers can improve recognition quality.
– Analyze the recognizers' mutual behavior on past data
– Build a probabilistic model for combining them
Retrieval
Idea: (1) Use the fact that quality metrics are application dependent. (2) Develop algorithms for retrieval given uncertainty.
– Use the given probabilistic representation
– Derive a methodology for optimal retrieval

Approach to Building SA Applications
Diagram labels: Utterances, N-best lists coming from the speech recognizer, Probabilistic DB
Recognizers offer alternatives: the “N-best list”
Tradeoff between representations: high precision with low recall vs. high recall with low precision
Choose a representation that maximizes the performance of the application (e.g., maximizes precision and recall)
Key Issue: accurately estimate P(W in utterance), for all W in Q
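The precision/recall tradeoff named on this slide can be illustrated with a toy probabilistic store. This is only a sketch under stated assumptions: the `prob_db` layout, the `retrieve` function, the threshold rule, and all probabilities are hypothetical and are not taken from the SAFIRE system.

```python
# Hypothetical probabilistic DB: utterance id -> {word: P(word in utterance)}
prob_db = {
    "u1": {"victim": 0.9, "room": 0.6},
    "u2": {"victim": 0.3, "clear": 0.8},
    "u3": {"victim": 0.05, "stairs": 0.7},
}
# Hypothetical ground truth: utterances that really mention "victim"
truth = {"u1", "u2"}


def retrieve(db, word, threshold):
    """Return utterance ids whose estimated P(word in utterance) >= threshold."""
    return {uid for uid, probs in db.items() if probs.get(word, 0.0) >= threshold}


def precision_recall(returned, relevant):
    """Set-based precision and recall of a retrieved set."""
    hits = len(returned & relevant)
    precision = hits / len(returned) if returned else 1.0
    recall = hits / len(relevant) if relevant else 1.0
    return precision, recall


# A strict threshold behaves like keeping only confident hypotheses;
# a loose threshold behaves like keeping every alternative in the N-best list.
strict = retrieve(prob_db, "victim", 0.5)
loose = retrieve(prob_db, "victim", 0.01)
strict_pr = precision_recall(strict, truth)
loose_pr = precision_recall(loose, truth)
```

With these made-up numbers the strict threshold gives high precision with low recall, and the loose one gives the reverse, which is why the quality of the estimate of P(W in utterance) matters so much.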

Estimating P(W in Utterance): Learning
Convert the confidence levels output by the recognizer into probabilities
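The slide does not say how confidence levels are converted into probabilities. One common way to do this kind of conversion is histogram binning on labeled past data, sketched below; the technique choice, the function names, and the sample data are all assumptions made for illustration.

```python
def fit_binned_calibration(confidences, was_correct, n_bins=5):
    """Map recognizer confidence in [0, 1] to the empirical accuracy of its bin."""
    totals = [0] * n_bins
    correct = [0] * n_bins
    for conf, ok in zip(confidences, was_correct):
        b = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        totals[b] += 1
        correct[b] += int(ok)
    # Bins with no training data fall back to the bin midpoint.
    return [correct[b] / totals[b] if totals[b] else (b + 0.5) / n_bins
            for b in range(n_bins)]


def to_probability(conf, table):
    """Look up the calibrated probability for a new confidence value."""
    n_bins = len(table)
    return table[min(int(conf * n_bins), n_bins - 1)]


# Hypothetical past data: recognizer confidence and whether the word was right.
past_conf = [0.95, 0.92, 0.90, 0.85, 0.30, 0.25, 0.10, 0.15]
past_ok = [True, True, False, True, False, True, False, False]
table = fit_binned_calibration(past_conf, past_ok, n_bins=2)
p_high = to_probability(0.99, table)
p_low = to_probability(0.20, table)
```

The resulting table replaces a raw confidence score with the fraction of past words in the same confidence range that were actually correct.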

Estimating P(W): Combining Recognizers
Exploit multiple recognizers to estimate the probability
Diagram labels: … Merging …

Estimating P(W): Using Semantics
Exploit semantics

One SA Application in More Detail
Types of Acoustic Analysis
− Human Speech: who spoke to whom, about what, from where, and when
− Ambient Sounds: explosions, loud sounds, screaming, etc.
− Physiological Events: cough, gag, excited state of speaker, slurring, …
− Other features: too loud, too quiet for too long, …
Diagram stages: Acoustic Capture, Acoustic Analysis, SA Applications
Diagram labels: Speech, Voice, Amb. Noise, Processing
SA Applications: Conversation Monitoring & Playback, Spatial Messaging, Localization via Speech, Alerts, Image & Video Tagging

Purpose of Image Tagging
– Take a picture of an incident
– Speak tags (e.g., “Chemical spill nitric acid”)
– Apply the speech recognizer, which will suggest alternatives for each utterance (N-best list)
– Disambiguate among the choices by using a semantic model of how these words have been used in the past

Challenge
Challenge: the correctness of the tags depends on the quality of the speech recognizer!
Diagram (Tagging Images Using Speech): USER, Speech & Image, Speech Recognizer, N-best lists, Disambiguator, Semantic Knowledge, Image & Tags, Image Database, Interface for image retrieval

Overview of Solution
N-best lists
Enumerating Possible Sequences
− Smart (greedy) enumerator of possible tag sequences
Computing Score for Each Sequence
1. Co-occurrence based score
2. Probabilistic score
− Using Max Entropy & Lidstone’s Estimation
Choosing Sequence (with the highest score)
Detecting NULLs (i.e., ground truth tag not present in N-best list)
Results (a sequence of tags)

Probabilistic Score (Max Entropy)
Lidstone’s Estimation
“Good” estimates of P for short sequences w_1, w_2, …, w_K
– P(w_i) ← marginals
– P(w_i, w_j) ← pairwise joints, for many/most
– P(w_i, w_j, w_k) ← triples, for very few
Maximum Entropy (ME)
– Estimates joint P()
– From known smaller joint P()
– “No assumptions”/uniformity for unknown P()
– Optimization problem
– Computationally expensive
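Lidstone's estimation is additive smoothing: an event with count c out of N observations over V possible outcomes gets probability (c + λ) / (N + λV). The sketch below applies it to tag marginals. The past annotations, the choice λ = 0.5, and the decision to count tag occurrences across all images are assumptions made for illustration, not details from the slides.

```python
from collections import Counter


def lidstone(count, total, n_outcomes, lam=0.5):
    """Lidstone estimate: (count + lam) / (total + lam * n_outcomes)."""
    return (count + lam) / (total + lam * n_outcomes)


# Hypothetical past annotations: each image is a set of tags.
past = [{"fire", "smoke"}, {"fire", "truck"}, {"smoke"}, {"fire", "smoke", "truck"}]
vocab = sorted(set().union(*past))
tag_counts = Counter(tag for tags in past for tag in tags)
n_tokens = sum(tag_counts.values())

# Smoothed marginals P(w_i) over the observed vocabulary.
marginals = {w: lidstone(tag_counts[w], n_tokens, len(vocab)) for w in vocab}

# A tag never seen in the past still gets a small nonzero probability
# once it is counted as one more possible outcome.
p_unseen = lidstone(0, n_tokens, len(vocab) + 1)
```

The same formula extends to pairwise joints and triples by counting co-occurring tag pairs or triples instead of single tags.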

Correlation Score
Jaccard Similarity
Correlation Graph: Direct Correlation, Indirect Correlation
– Base Correlation Matrix B, where B_ij = c(w_i, w_j)
– Indirect Correlation Matrices: B_2 = B^2, …, B_k = B^k
– General Correlations Matrix: considers correlations of various sizes
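A small sketch of the base and indirect correlation matrices follows. It assumes that c(w_i, w_j) is the Jaccard similarity of the sets of images annotated with each tag and that the diagonal of B is zero; both are assumptions, as are the tag names and image ids. The slide does not specify how the powers are combined into the general matrix, so that step is left out.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def matmul(x, y):
    """Plain square-matrix product, to keep the sketch dependency-free."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def correlation_matrices(images_by_tag, tags, max_k=2):
    """Return [B, B^2, ..., B^max_k] with B_ij = Jaccard of the tags' image sets."""
    # Assumption: self-correlation is set to 0 so powers capture paths via other tags.
    base = [[0.0 if i == j else jaccard(images_by_tag[a], images_by_tag[b])
             for j, b in enumerate(tags)] for i, a in enumerate(tags)]
    powers = [base]
    for _ in range(max_k - 1):
        powers.append(matmul(powers[-1], base))
    return powers


# Hypothetical data: "fire" and "hose" never co-occur, but both co-occur with "smoke".
tags = ["fire", "smoke", "hose"]
images_by_tag = {"fire": {1, 2, 3}, "smoke": {2, 3, 4}, "hose": {4, 5}}
B, B2 = correlation_matrices(images_by_tag, tags, max_k=2)
direct_fire_hose = B[0][2]
indirect_fire_hose = B2[0][2]
```

Here the direct correlation between "fire" and "hose" is zero, while the B^2 entry is positive because the two tags are linked through "smoke".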

Branch and Bound Method
Motivation
– Computing ME is expensive
– Enumerating N^K sequences is exponential
– How to scale? Branch and Bound Method!
Two logical parts
1. Searching part: how to move in the most promising “direction” of the search
2. Bounding part: how to bound the search space and prune away unnecessary searches
Complete Search Tree
− Only the necessary part of it will be built/considered
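The searching and bounding parts can be sketched as a best-first search over tag prefixes. This is not the authors' algorithm: the slides score sequences with ME, whereas this sketch substitutes a cheaper score (the sum of pairwise correlations) so that an optimistic upper bound is easy to state. The function names, the N-best lists, and the correlation values are hypothetical.

```python
import heapq
from itertools import combinations


def sequence_score(seq, corr):
    """Sum of pairwise correlations between all tags in the sequence."""
    return sum(corr.get(frozenset(p), 0.0) for p in combinations(seq, 2))


def branch_and_bound(nbest, corr):
    """Pick one tag per N-best list so the pairwise-correlation score is maximal."""
    k = len(nbest)

    def pair(a, b):
        return corr.get(frozenset((a, b)), 0.0)

    # Optimistic value of every pair of still-open positions.
    best_pair = {(t, u): max(pair(a, b) for a in nbest[t] for b in nbest[u])
                 for t, u in combinations(range(k), 2)}

    def upper_bound(prefix):
        d = len(prefix)
        bound = sequence_score(prefix, corr)
        for t in range(d, k):
            # Best link from position t back to the tags already fixed.
            bound += max(sum(pair(w, c) for w in prefix) for c in nbest[t])
            # Best link from position t to each later open position.
            bound += sum(best_pair[(t, u)] for u in range(t + 1, k))
        return bound

    best_seq, best_score = None, float("-inf")
    heap = [(-upper_bound(()), ())]  # max-heap on the bound via negation
    expanded = 0
    while heap:
        neg_bound, prefix = heapq.heappop(heap)
        if -neg_bound <= best_score:
            break  # bounding part: nothing left can beat the incumbent
        expanded += 1
        if len(prefix) == k:
            best_seq, best_score = prefix, sequence_score(prefix, corr)
            continue
        # Searching part: children are explored in order of their bound.
        for cand in nbest[len(prefix)]:
            child = prefix + (cand,)
            heapq.heappush(heap, (-upper_bound(child), child))
    return best_seq, best_score, expanded


# Hypothetical N-best lists for three spoken tags and made-up correlations.
nbest = [["chemical", "comical"], ["spill", "spell"], ["acid", "asset"]]
corr = {
    frozenset(("chemical", "spill")): 0.8,
    frozenset(("chemical", "acid")): 0.6,
    frozenset(("spill", "acid")): 0.5,
    frozenset(("comical", "spell")): 0.3,
}
best_seq, best_score, expanded = branch_and_bound(nbest, corr)
```

Because the bound never underestimates the best completion of a prefix, the first complete sequence popped from the heap is optimal for this score, and the remaining branches are pruned without being expanded.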

Experiments
Dataset: 60,000 annotated images from Flickr. Split: 80% training + 20% test
Experiment 1:
– Use the Dragon recognizer to generate N-best lists for 120 images from the test data
– Noise levels are varied by introducing white Gaussian noise through a speaker
The figure shows a significant quality improvement from using the semantics-based approach.

Experiment 2
Quality of annotation vs. size of N-best lists
Tradeoff
– With increasing N (size of list): greater chance that the ground truth is present in the list.
– However, there are more options to disambiguate among (more uncertainty)

Experiment 3: Correlation of ME & CM scores
– Strong correlation
The figure shows how often the top-1 sequence according to the ME score is contained among the top M sequences according to the CM score

Experiment 4: Quality of BB Algorithm

Experiment 5: Quality on a Larger Dataset

Experiment 6: Speedup of BB Algorithm

Experiment 7: Multi-model Case

Progress