Visualizing Audio for Anomaly Detection
Mark Hasegawa-Johnson, Camille Goudeseune, Hank Kaczmarski, Thomas Huang
University of Illinois at Urbana-Champaign
Research Goal: Guide audio analysts to anomalies
- Large dataset: audio, with rare anomalies
- Cheap to record, expensive to play back
- GUI: listen 10,000x faster
- Robots are poor listeners, but good servants
Anomalies shoot down poor theories
Feature: audio interval → numbers
Visualization: numbers → rendering
Two kinds of features:
- Audio-based (e.g., the spectrogram)
- Model-based (e.g., the hypotheses H_null and H_threatening)
Audio-Based Features: Transformed acoustic data
- Pitch, formants
- Anomaly salience score
- Waveform: A = f(t)
- Spectrogram: A = f(t, Hz) (see the sketch after this list)
- Correlogram: A = f(t, Hz, fundamental period)
- Rate-scale representation: A = f(t, Hz, bandwidth, ΔHz)
- Wavelets: multiscale
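A minimal sketch of one audio-based feature, the spectrogram A = f(t, Hz), using SciPy; the 25 ms window and 10 ms hop are illustrative assumptions, not settings from this work.

```python
import numpy as np
from scipy.signal import spectrogram

def spectrogram_feature(audio, sample_rate_hz, window_s=0.025, hop_s=0.010):
    """Map an audio interval (1-D array) to log magnitudes A = f(t, Hz)."""
    nperseg = int(window_s * sample_rate_hz)
    noverlap = nperseg - int(hop_s * sample_rate_hz)
    freqs_hz, times_s, power = spectrogram(
        audio, fs=sample_rate_hz, nperseg=nperseg, noverlap=noverlap)
    # Log compression keeps quiet anomalies visible next to loud background.
    return times_s, freqs_hz, np.log(power + 1e-12)
```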
Model-Based Features: Log-likelihood features
- Display how well a hypothesis fits the data
- Let the analyst intuit a threshold of fitness
- Density models: mixture Gaussians trained by EM, Parzen windows, one-class SVMs (see the sketch after this list)
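A minimal sketch of a log-likelihood feature under a mixture-Gaussian model, using scikit-learn's EM-trained GaussianMixture; the component count is an illustrative assumption.

```python
from sklearn.mixture import GaussianMixture

def loglik_feature(background_frames, test_frames, n_components=8):
    """Train a mixture Gaussian (by EM) on background audio frames, then
    score new frames: low log-likelihood = poor fit = candidate anomaly."""
    gmm = GaussianMixture(n_components=n_components).fit(background_frames)
    return gmm.score_samples(test_frames)  # per-frame log-likelihood
```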
Model-Based Features: Log-likelihood ratios (LLR)
- Needed because a single mixture Gaussian misclassifies ordinary outliers as anomalies
- The LLR between H_threatening and H_null separates anomalies of interest from mere outliers (see the sketch after this list)
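A minimal sketch of the LLR feature, assuming two mixture Gaussians trained as above, one per hypothesis; the hypothesis names follow the slides, the training details are assumptions.

```python
def llr_feature(gmm_threatening, gmm_null, frames):
    """LLR = log p(x | H_threatening) - log p(x | H_null), per frame.

    High LLR marks frames that fit the threat model better than background,
    not frames that merely fit the background model poorly."""
    return gmm_threatening.score_samples(frames) - gmm_null.score_samples(frames)
```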
Too many features!
- Evaluate each feature with Kullback-Leibler divergence (see the sketch after this list)
- Combine features with AdaBoost + SVM + HMM
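A minimal sketch of scoring one feature by KL divergence between its distribution on anomalous frames and on background frames; histogram binning and smoothing constants are illustrative assumptions.

```python
import numpy as np

def kl_score(feature_on_anomalies, feature_on_background, n_bins=64):
    """D_KL(P_anomaly || P_background); larger = more discriminative feature."""
    lo = min(feature_on_anomalies.min(), feature_on_background.min())
    hi = max(feature_on_anomalies.max(), feature_on_background.max())
    if hi == lo:
        hi = lo + 1.0  # degenerate feature: all values equal
    p, _ = np.histogram(feature_on_anomalies, bins=n_bins, range=(lo, hi))
    q, _ = np.histogram(feature_on_background, bins=n_bins, range=(lo, hi))
    p = p + 1e-9   # smooth away empty bins
    q = q + 1e-9
    p = p / p.sum()
    q = q / q.sum()
    return float(np.sum(p * np.log(p / q)))
```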
Two Interactive Testbeds
- Vary features
- Vary anomalies
- Vary background audio
- Vary how the model is trained
- Vary the mapping from features to HSV color (one possible mapping is sketched after this list)
- Anomalousness "bubbles up"
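One plausible feature-to-HSV mapping, as a minimal sketch: anomalousness drives hue from blue toward red at full saturation and value. The testbeds deliberately vary this mapping, so this particular choice is an assumption.

```python
import colorsys

def anomaly_to_rgb(score, score_min, score_max):
    """Map an anomaly score to RGB via HSV: blue = calm, red = anomalous."""
    span = (score_max - score_min) or 1.0
    x = min(max((score - score_min) / span, 0.0), 1.0)
    hue = (1.0 - x) * 2.0 / 3.0  # hue 2/3 is blue, hue 0 is red
    return colorsys.hsv_to_rgb(hue, 1.0, 1.0)
```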
Multi-day audio timeline
1000 μphones = The Milliphone
Human Subject Protocols
- Tutorial
- Training with immediate feedback
- Measure how fast subjects find x% of anomalies (scoring sketched after this list)
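A minimal sketch of the speed metric, assuming the experiment logs the time of each correct find; the fraction x is left as a parameter, since the slides leave it unspecified.

```python
def time_to_find_fraction(find_times_s, n_anomalies_total, x):
    """Seconds until a subject has found fraction x of all planted anomalies;
    None if the subject never reached that fraction."""
    needed = int(round(x * n_anomalies_total))
    times = sorted(find_times_s)
    return times[needed - 1] if 0 < needed <= len(times) else None
```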
Influence on FODAVA
- Guide, don't replace, human analysts
- Guide them with zoomable features
- Features from transformed data (audio-based)
- Features from fitting hypotheses (model-based)
Developing FODAVA
- Make "big" audio accessible
- Audio is hard, but its concepts generalize
- Fast interactive exploration of time series, long (timeline) or wide (milliphone)