G. Valenzise*, L. Gerosa, M. Tagliasacchi*, F. Antonacci*, A. Sarti*
IEEE Int. Conf. on Advanced Video and Signal-based Surveillance, 2007
* Dipartimento di Elettronica e Informazione, Politecnico di Milano

Scream and Gunshot Detection and Localization for Audio-Surveillance Systems
AVSS 2007, September 5, 2007

• Description of the problem
• System Overview
• Classification
  ◦ GMM
  ◦ Feature extraction
  ◦ Feature selection
  ◦ Experimental results
• Localization
  ◦ Time Delay Estimation
  ◦ Source Localization
  ◦ Experimental results

• Increasing need for safety in public places (e.g. squares):
  ◦ High degree of criminality
  ◦ Large number of video cameras installed
• Aid the human operators of video-surveillance systems by using the audio signal to detect and localize anomalous events (e.g. gunshots, screams) and to steer a video camera

• Large set of descriptors,
  ◦ including innovative ones such as autocorrelation roll-off, decrease, slope
• Exhaustive analysis of the feature selection process,
  ◦ formulation of a hybrid approach integrating different techniques proposed in the literature
• Improved algorithm for GMM training
  ◦ Figueiredo-Jain instead of the classical EM algorithm
• Proposal of a method to zoom the camera
  ◦ based on the localization confidence

Figure: autocorrelation filtered in the frequency range 1000-2500 Hz

Figure: filter-based feature vector construction followed by wrapper-based selection

• From the full set of L features, we want a vector of l features (l < L) with:
  ◦ Similar discrimination power
  ◦ Lower computational cost
  ◦ Resistance to overfitting
• Hybrid two-step method:
  ◦ A heuristic algorithm constructs feature vectors of different sizes (2 ≤ l ≤ L) using a separability measure (filter approach)
  ◦ The vector dimension is chosen by evaluating validation performance with a GMM classifier (wrapper approach)
• Higher performance than pure filter methods, with lower computational complexity than pure wrapper approaches
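The two-step idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the slide names neither the separability measure nor the heuristic, so the sketch assumes a per-feature Fisher ratio and simple ranking, and it uses a nearest-class-mean classifier as a stand-in for the GMM in the wrapper step. All function names are hypothetical.

```python
from statistics import mean, pvariance

def fisher_ratio(values_a, values_b):
    """Two-class separability of one feature: squared mean gap over summed variances."""
    spread = pvariance(values_a) + pvariance(values_b)
    return (mean(values_a) - mean(values_b)) ** 2 / (spread + 1e-12)

def build_nested_subsets(X, y):
    """Filter step (assumed heuristic): rank features, return subsets of size 2..L."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append(fisher_ratio(a, b))
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return [ranked[:size] for size in range(2, n_features + 1)]

def nearest_mean_accuracy(X_train, y_train, X_val, y_val, subset):
    """Stand-in classifier (NOT a GMM): assign each sample to the closest class mean."""
    means = {}
    for label in set(y_train):
        rows = [row for row, l in zip(X_train, y_train) if l == label]
        means[label] = [mean(r[j] for r in rows) for j in subset]
    correct = 0
    for row, label in zip(X_val, y_val):
        def dist(c):
            return sum((row[j] - m) ** 2 for j, m in zip(subset, means[c]))
        correct += min(means, key=dist) == label
    return correct / len(y_val)

def select_features(X_train, y_train, X_val, y_val):
    """Wrapper step: keep the candidate subset with the best validation accuracy.
    Ties go to the smallest subset because max() returns the first maximum."""
    candidates = build_nested_subsets(X_train, y_train)
    return max(
        candidates,
        key=lambda s: nearest_mean_accuracy(X_train, y_train, X_val, y_val, s),
    )
```

Only L - 1 candidate vectors are scored by the classifier, instead of every possible subset, which is where the saving over a pure wrapper search comes from.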

• A class is represented by a weighted sum of multivariate normal distributions in an l-dimensional space
• Training: estimate the most probable mixture given a dataset
  ◦ find the mixture that maximizes the likelihood of the training data
• Classification: label a new sample
  ◦ assign the sample to the class that maximizes the likelihood of that datum
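The classification rule can be illustrated with a short sketch. It assumes diagonal covariance matrices to keep the code short (the slide does not state the covariance structure), and the two class models are hypothetical, hand-set values rather than trained mixtures.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log-density of a normal distribution with diagonal covariance."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """log p(x | class) for a mixture given as (weight, mean, variance) triples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def classify(x, class_models):
    """Return the label whose mixture gives the highest likelihood for x."""
    return max(class_models, key=lambda label: gmm_log_likelihood(x, class_models[label]))

# Hypothetical models in a 2-dimensional feature space (illustrative values only).
models = {
    "scream": [(0.6, [2.0, 2.0], [0.5, 0.5]), (0.4, [3.0, 1.0], [0.3, 0.3])],
    "noise": [(1.0, [0.0, 0.0], [1.0, 1.0])],
}
print(classify([2.2, 1.8], models))   # scream
print(classify([-0.3, 0.4], models))  # noise
```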

• GMM training is classically carried out by means of the Expectation-Maximization (EM) algorithm
• Drawbacks of EM:
  ◦ Sensitivity to initialization (initial parameters and number of components)
  ◦ Risk of singular solutions (when the number of components is chosen too high)
• Figueiredo-Jain (FJ) algorithm (2002)
  ◦ starts from a high number of components
  ◦ "annihilates" components if they are not supported by the data (MML information-theoretic criterion)
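The annihilation mechanism comes from the mixing-weight update in Figueiredo and Jain's paper: each component's accumulated responsibility is reduced by half the number of parameters per component, clipped at zero, and the weights are renormalized. The sketch below shows only that update, not the full component-wise EM loop; the function names and the example numbers are hypothetical.

```python
def params_full_covariance(dim):
    """Free parameters of one Gaussian component: mean plus symmetric covariance."""
    return dim + dim * (dim + 1) // 2

def fj_weight_update(responsibility_sums, params_per_component):
    """Mixing weights after the MML-penalized update.

    responsibility_sums[m] is the sum over all samples of the posterior
    probability of component m. A weight of exactly 0.0 means the component
    is annihilated.
    """
    penalty = params_per_component / 2.0
    support = [max(0.0, s - penalty) for s in responsibility_sums]
    total = sum(support)
    if total == 0.0:
        raise ValueError("every component was annihilated")
    return [s / total for s in support]

# 100 samples in 2 dimensions; the third component explains only 2 of them.
weights = fj_weight_update([60.0, 38.0, 2.0], params_full_covariance(2))
print(weights)
```

In this example the third component collects less responsibility than the penalty of 2.5, so its weight drops to zero while the other two share the total.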

• Three metrics were used to evaluate classification performance

Figure panels: test sets at 0 dB, 5 dB, 10 dB, 15 dB and 20 dB SNR

• Consider a T-shaped microphone array
• The center microphone is taken as the reference
• The localization problem can be split into two tasks:
  ◦ Estimate the Time Differences of Arrival (TDOAs) between each microphone and the reference microphone
  ◦ Estimate the source location from the TDOAs

• Time delays are estimated with the ML-GCC estimator: the delay estimate is the lag that maximizes the Generalized Cross Correlation (GCC) function, which is built from
  ◦ the cross spectrum of the two microphone signals,
  ◦ the Discrete Fourier Transform (DFT) of each signal,
  ◦ the Magnitude Squared Coherence function between x_i and x_0
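For reference, one common way to write a GCC estimator with maximum-likelihood weighting, following the formulation of Knapp and Carter, is given below. The notation is an assumption, since the slide's own symbols are not preserved in this transcript: $\hat{\tau}_i$ is the delay estimate for microphone $i$, $X_i(k)$ and $X_0(k)$ are the $N$-point DFTs of $x_i$ and $x_0$, $S_{i0}(k)$ is the cross spectrum, and $\gamma_{i0}(k)$ is the coherence.

```latex
\begin{align}
  \hat{\tau}_i &= \arg\max_{\tau} R_{i0}(\tau), \\
  R_{i0}(\tau) &= \sum_{k=0}^{N-1} \Psi(k)\, S_{i0}(k)\, e^{\,j 2\pi k \tau / N}, \\
  S_{i0}(k) &= E\{ X_i(k)\, X_0^{*}(k) \}, \\
  \Psi(k) &= \frac{|\gamma_{i0}(k)|^{2}}{|S_{i0}(k)|\,\bigl(1 - |\gamma_{i0}(k)|^{2}\bigr)}, \\
  |\gamma_{i0}(k)|^{2} &= \frac{|S_{i0}(k)|^{2}}{S_{ii}(k)\, S_{00}(k)}.
\end{align}
```

The weighting $\Psi(k)$ emphasizes frequency bins where the two signals are strongly coherent and suppresses bins dominated by noise.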

• Acoustic model of the audio signal received at a pair of microphones
• The Time Delay Estimation (TDE) problem consists in estimating the delay τ

Figure: waveform of the Generalized Cross Correlation (GCC) signal
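A minimal sketch of delay estimation by generalized cross correlation follows. To stay short it swaps in the simpler PHAT weighting (unit-magnitude cross spectrum) for the ML weighting, which would need a coherence estimate averaged over several frames. It uses a direct O(N²) DFT from the standard library, and the function names and test signal are hypothetical.

```python
import cmath
import random

def dft(x):
    """Direct discrete Fourier transform (fine for short frames)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def gcc_phat_delay(x_i, x_0, max_lag):
    """Delay of x_i relative to x_0, in samples, from the PHAT-weighted GCC peak."""
    n = len(x_0)
    cross = [a * b.conjugate() for a, b in zip(dft(x_i), dft(x_0))]
    # PHAT weighting: keep only the phase of the cross spectrum.
    weighted = [c / abs(c) if abs(c) > 1e-12 else 0.0 for c in cross]

    def gcc(lag):
        return sum(
            w * cmath.exp(2j * cmath.pi * k * lag / n) for k, w in enumerate(weighted)
        ).real

    return max(range(-max_lag, max_lag + 1), key=gcc)

# Synthetic check: the second signal is the first one delayed by 3 samples
# (circularly, because the DFT implies circular correlation).
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(64)]
delayed = reference[-3:] + reference[:-3]
print(gcc_phat_delay(delayed, reference, max_lag=8))  # 3
```

Dividing the returned lag by the sampling frequency gives the delay in seconds.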

• We used the Linear-Correction Least Squares algorithm:
  ◦ given the spherical error function, we solve a linear least-squares problem subject to the range constraint
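For reference, the constrained problem is commonly written as follows in the spherical least-squares literature. The notation is an assumption, since the slide's formulas are not preserved here: the reference microphone sits at the origin, $\mathbf{r}_i$ is the position of microphone $i$ with $R_i = \|\mathbf{r}_i\|$, $d_i$ is the range difference (TDOA times the speed of sound), $\mathbf{r}_s$ is the source position and $R_s = \|\mathbf{r}_s\|$.

```latex
\begin{align}
  \mathbf{e}_{sp}(\boldsymbol{\theta}) &= \mathbf{A}\boldsymbol{\theta} - \mathbf{b},
  \qquad
  \boldsymbol{\theta} = \begin{bmatrix} \mathbf{r}_s \\ R_s \end{bmatrix}, \\
  \mathbf{A} &= \begin{bmatrix} \mathbf{r}_1^{T} & d_1 \\ \vdots & \vdots \\ \mathbf{r}_M^{T} & d_M \end{bmatrix},
  \qquad
  \mathbf{b} = \frac{1}{2} \begin{bmatrix} R_1^{2} - d_1^{2} \\ \vdots \\ R_M^{2} - d_M^{2} \end{bmatrix}, \\
  \hat{\boldsymbol{\theta}} &= \arg\min_{\boldsymbol{\theta}}
  (\mathbf{A}\boldsymbol{\theta} - \mathbf{b})^{T} (\mathbf{A}\boldsymbol{\theta} - \mathbf{b})
  \quad \text{subject to} \quad
  \boldsymbol{\theta}^{T} \boldsymbol{\Sigma}\, \boldsymbol{\theta} = 0,
  \qquad
  \boldsymbol{\Sigma} = \operatorname{diag}(1, 1, 1, -1).
\end{align}
```

The constraint states that the last entry of $\boldsymbol{\theta}$ must equal the norm of the first three, that is, $R_s = \|\mathbf{r}_s\|$ in three dimensions.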

Figure: Linear-Correction Least Squares localization (Huang & Benesty, 2004)
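The sketch below covers only the unconstrained spherical least-squares step in two dimensions. It does not perform the linear correction that enforces the range constraint, so it is a starting point rather than the full algorithm. The microphone coordinates, the source position and the helper names are hypothetical, and the range differences are noise-free.

```python
import math

def solve_linear(M, v):
    """Solve M z = v by Gaussian elimination with partial pivoting."""
    n = len(v)
    aug = [row[:] + [v[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (aug[r][n] - sum(aug[r][c] * z[c] for c in range(r + 1, n))) / aug[r][r]
    return z

def spherical_ls(mics, range_diffs):
    """Least squares on A*theta = b with theta = [x, y, R_s]; reference mic at origin."""
    A = [[mx, my, d] for (mx, my), d in zip(mics, range_diffs)]
    b = [0.5 * (mx * mx + my * my - d * d) for (mx, my), d in zip(mics, range_diffs)]
    idx = range(3)
    AtA = [[sum(row[i] * row[j] for row in A) for j in idx] for i in idx]
    Atb = [sum(row[i] * bi for row, bi in zip(A, b)) for i in idx]
    return solve_linear(AtA, Atb)  # normal equations

# Hypothetical T-shaped array (metres); the reference microphone is at (0, 0).
mics = [(-0.3, 0.0), (0.3, 0.0), (0.0, -0.3)]
source = (1.0, 2.0)
range_diffs = [math.dist(m, source) - math.dist((0.0, 0.0), source) for m in mics]
x, y, r_s = spherical_ls(mics, range_diffs)
print(round(x, 6), round(y, 6), round(r_s, 6))
```

With noisy TDOAs the estimated `r_s` generally differs from the norm of `(x, y)`, which is the inconsistency that the linear-correction step removes.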

• SNR > threshold → small TDOA estimation errors around the true time delay
• SNR < threshold → large errors on TDOA estimation

• The combined system yields a precision of 93% and a false rejection rate of 5% at 10 dB SNR
• Hybrid feature selection makes it possible to select the most representative features effectively, with a reasonable computational effort

Future extensions:
• Fusion of multiple microphone arrays into a sensor network → increased range and precision

• M. Figueiredo and A. Jain, "Unsupervised learning of finite mixture models," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 3, pp. 381–396, 2002.
• C. Knapp and G. Carter, "The generalized correlation method for estimation of time delay," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 24, no. 4, pp. 320–327, 1976.
• J. Chen, Y. Huang, and J. Benesty, Audio Signal Processing for Next-Generation Multimedia Communication Systems. Kluwer, 2004, ch. 4-5.
• J. Ianniello, "Time delay estimation via cross-correlation in the presence of large estimation errors," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 30, no. 6, pp. 998–1003, 1982.
