
1 WP 1: Embodied Acoustic Sensing for Real-world Environments
Boaz Rafaely Final Review Meeting Erlangen, November 30, 2016

2 Review Meeting: Report on WP1
WP1 Tasks
T1.1 (M1-M27, BGU, FAU): Design of head-embodied anthropomorphic microphone array
T1.2 (M1-M27, FAU, UBER): Design of adaptive robomorphic microphone array
T1.3 (M13-M36, UBER, BGU, FAU): Active sensing through sensorimotor interaction
T1.4 (M1-M30, BGU): Sound field representation and analysis
T1.5 (M1-M27, INRIA, BGU): Audio-visual data alignment

3 Review Meeting: Report on WP1
Overview (block diagram): T1.1 Head array and T1.2 Robomorphic array feed T1.4 Pre-processing and spherical harmonics, together with T1.3 Robot motion and T1.5 Audio-visual alignment, providing input to WP2, WP3 and WP4.

4 T1.1 Design of head-embodied anthropomorphic microphone array
Objectives
- Develop a methodology for the design of microphone arrays embodied into a robot head.
- Apply the methodology to two benchmark designs for Nao.

Status: methodology developed and applied
- Effective rank as an objective design measure [1].
- Applied to Nao's head array; extended frequency range [2,3].

[1] V. Tourbabin and B. Rafaely, "Theoretical Framework for the Optimization of Microphone Array Configuration for Humanoid Robot Audition," IEEE Trans. Audio Speech Lang. Proc., Vol. 22(12), December 2014.
[2] V. Tourbabin and B. Rafaely, "Optimal Design of Microphone Array for Humanoid-Robot Audition (A)," ICR 2016, Israel.
[3] V. Tourbabin and B. Rafaely, "Design of Pseudo-Spherical Microphone Array with Extended Frequency Range for Robot Audition," 42nd Annual Conference on Acoustics (DAGA 2016), Aachen, Germany, March 2016.
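As a sketch of the effective-rank design measure of [1] (the entropy-based effective rank of Roy and Vetterli), the snippet below evaluates a free-field steering matrix for an illustrative 12-microphone array. The geometry, frequency and look directions are invented for illustration and are not the Nao head design, which includes head scattering.

```python
import numpy as np

def effective_rank(A):
    """Effective rank: exponential of the Shannon entropy of the
    normalized singular-value distribution (Roy & Vetterli, 2007)."""
    s = np.linalg.svd(A, compute_uv=False)
    p = s / s.sum()
    p = p[p > 1e-12]
    return float(np.exp(-np.sum(p * np.log(p))))

def steering_matrix(mic_pos, doas, freq, c=343.0):
    """Free-field steering matrix: one column per candidate DOA."""
    k = 2 * np.pi * freq / c                  # wavenumber
    return np.exp(1j * k * mic_pos @ doas.T)  # (mics x directions)

# Illustrative 12-microphone array on a 5 cm sphere (not the Nao head).
rng = np.random.default_rng(0)
mic_pos = rng.normal(size=(12, 3))
mic_pos *= 0.05 / np.linalg.norm(mic_pos, axis=1, keepdims=True)

# Candidate look directions on the horizontal plane.
az = np.linspace(0, 2 * np.pi, 72, endpoint=False)
doas = np.stack([np.cos(az), np.sin(az), np.zeros_like(az)], axis=1)

A = steering_matrix(mic_pos, doas, freq=2000.0)
print(f"effective rank at 2 kHz: {effective_rank(A):.2f}")
```

A configuration with higher effective rank over the speech band and designed look directions carries more independent spatial information, which is why it serves as the objective in the array optimization.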

5 T1.1 Design of head-embodied anthropomorphic microphone array
Planning
- M1-M6: Preliminary design, based mostly on existing hardware, serving as a baseline for partners.
- M7-M18: A first benchmark design, based on minimal changes to the existing head, focusing on frontal sources, with a limited number of microphones.
- M19-M27: A second design with a new head, focusing on sources in all directions, with an increased number of microphones.
- Audio data for the array collected and annotated for further use.

Practice
- Preliminary design: simulated Phase II Nao head.
- Benchmark 1: 12 MEMS microphones, external amplifiers.
- Benchmark 2: integrated amplifiers, improved mounting.
- Annotated audio delivered in D1.1.

6 T1.1 Design of head-embodied anthropomorphic microphone array
Quality measures

Indicator: Correlation
Method of measurement: Correlation between the proposed array information measure and the array performance measures of WNG and DOA estimation accuracy.
Target value: 0.75
Results: [T-ASLP 2014]

Indicator: Array information
Method of measurement: For the two benchmark designs, increase in effective rank relative to the previous design, over the speech frequency range and the designed look directions.
Target value: 3
Milestone / Deliverable: MS4, D1.2
Results: Benchmarks 1, 2: ER = 26 (+13)

7 T1.2 Design of adaptive robomorphic microphone array
Objectives
- Optimal configurations derived using optimisation techniques, such as evolutionary algorithms, on real-world data sets.
- The adaptive array geometries will be tailored to the algorithms developed in T2.1 and T2.2 for source localisation and signal extraction.
- Audio data for the array will be collected and annotated for further use.

Status: methodology developed and applied [FAU, UBER to complete]

[1] V. Tourbabin, H. Barfuss, B. Rafaely and W. Kellermann, "Enhanced robot audition by dynamic acoustic sensing in moving humanoids," IEEE ICASSP 2015, Brisbane, Australia, April 2015.
[2] papers

8 T1.2 Design of adaptive robomorphic microphone array
Quality measures [FAU, UBER to check]

Indicator: Array information
Method of measurement: Increase in effective rank due to the robomorphic array.
Target value: 2
Milestone / Deliverable: MS4, MS7, D2.1
Results: First results show an increased effective rank (dependent on frequency); further investigation is necessary.

Indicator: Aperture
Method of measurement: Overall relative increase in aperture compared to the head array.
Target value: 3
Results: Definite increase in relative aperture compared to the head array.

Indicator: Improvement of the new configuration over the previous best microphone configuration
Method of measurement: Localization precision and misdetection rate in sound source localisation (SSL); both criteria are evaluated in conjunction with algorithms from T2.2, T2.3 and T2.4.
Target value: 3 dB (NR); reduction of SSL angular error by a factor of 2
Results: NR not done yet.

Indicator: Duration of the sensor selection process
Method of measurement: Convergence time from the previous best microphone configuration to the new configuration.
Target value: 10 sec
Milestone / Deliverable: MS4, MS7, D1.2
Results: Currently > 10 s [IWAENC 2014]; depends on the acoustic environment (number of interfering sources, T60) as well as on the signal processing algorithm.

9 T1.3 Active sensing through sensorimotor interaction
Objectives [FAU, UBER review T1.3]
- Small, fast, human-like head movements can be used to enrich spatial sampling and spatial information, using data from T1.1.
- Large head rotations and body movements are used to increase spatial sampling, similarly to synthetic-aperture (SAR) arrays.
- Improvement due to active sensing is measured by improved spatial resolution of localization and improved robustness to noise.
- If an improvement on the order of 5 dB in robustness to noise is achieved by M27, it will be incorporated into the second benchmark.

10 T1.3 Active sensing through sensorimotor interaction
Methodology developed and applied
- Motion enhancement and motion compensation for DoA estimation [1,2].
- Head and robomorphic arrays shown to be complementary [3].
- Robot motion designed for both increased information and natural appearance [4].
- Robot motion and signal distortion analysed [5].

[1] V. Tourbabin and B. Rafaely, "Direction of arrival estimation using microphone array processing for moving humanoid robots," IEEE Trans. Audio Speech Lang. Proc., Vol. 23(11), November 2015.
[2] V. Tourbabin and B. Rafaely, "Utilizing motion in humanoid robots to enhance spatial information recorded by microphone arrays," Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2014), Nancy, France, May 2014.
[3] V. Tourbabin, H. Barfuss, B. Rafaely and W. Kellermann, "Enhanced robot audition by dynamic acoustic sensing in moving humanoids," IEEE ICASSP 2015, Brisbane, Australia, April 2015.
[4] S. Bodiroža, V. Tourbabin, G. Schillaci, J. Sheaffer, V. Hafner and B. Rafaely, "On Natural Robot Movements for Enriching Acoustic Information (A)," ICR 2016, Israel.
[5] V. Tourbabin and B. Rafaely, "Analysis of Distortion in Audio Signals Introduced by Microphone Motion," 24th European Signal Processing Conference (EUSIPCO 2016), Budapest, September 2016.
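To illustrate how motion can enrich spatial information, a minimal free-field sketch (not the actual method of [1-5]): steering-matrix snapshots taken at several head orientations are stacked, synthetic-aperture style, and the effective rank is compared against the static array. The 3-microphone geometry, frequency and rotation angles are all invented for illustration; the small microphone count is chosen deliberately so that the motion gain is visible.

```python
import numpy as np

def effective_rank(A):
    """exp of the entropy of the normalized singular values."""
    s = np.linalg.svd(A, compute_uv=False)
    p = s / s.sum()
    p = p[p > 1e-12]
    return float(np.exp(-np.sum(p * np.log(p))))

def steering(mic_pos, doas, freq, c=343.0):
    k = 2 * np.pi * freq / c
    return np.exp(1j * k * mic_pos @ doas.T)

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

# Deliberately small 3-microphone array on a 5 cm sphere (illustrative).
rng = np.random.default_rng(1)
mic_pos = rng.normal(size=(3, 3))
mic_pos *= 0.05 / np.linalg.norm(mic_pos, axis=1, keepdims=True)

# Candidate DOAs on the horizontal plane.
az = np.linspace(0, 2 * np.pi, 36, endpoint=False)
doas = np.stack([np.cos(az), np.sin(az), np.zeros_like(az)], axis=1)
f = 2500.0

# Static head: a single snapshot.
A_static = steering(mic_pos, doas, f)

# "Active sensing": stack snapshots from several head rotations,
# emulating the synthetic aperture built up by the motion.
angles = np.deg2rad([0, 30, 60, 90])
A_moving = np.vstack([steering(mic_pos @ rot_z(a).T, doas, f) for a in angles])

er_static = effective_rank(A_static)
er_moving = effective_rank(A_moving)
print(f"static: {er_static:.2f}, moving: {er_moving:.2f}")
```

The static array's effective rank is capped at the number of microphones, while the stacked snapshots sample additional phase centres and so span more of the direction space; this is the intuition behind the "increase in effective rank with active sensing" indicator below.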

11 T1.3 Active sensing through sensorimotor interaction
Quality measures

Indicator: Array information
Method of measurement: Increase in effective rank with active sensing relative to the static array, for 10 seconds of robot movements.
Target value: 2
Milestone / Deliverable: MS4, MS7, D2.1, D5.3
Results: [ICASSP 2015]

Indicator: Positions
Method of measurement: Accuracy of the available moving-robot position data.
Target value: 5 degrees
Milestone / Deliverable: MS4, D2.1
Results: Noise level 0.5 deg.

Indicator: Improvement in the robustness to noise
Method of measurement: Improvement in robustness to noise due to active sensing.
Target value: 5 dB
Results: ~15 dB gain at 20 dB SNR

Indicator: Improvement in the spatial resolution of localization due to active sensing
Method of measurement: Reduction of SSL angular error.
Target value: factor of 2
Results: factor of 3 [ICASSP 2015]

12 T1.4 Sound field representation and analysis
Objectives
- Spherical-harmonics transformation of the measured sound pressure, together with the specific HRTF dataset for the array, facilitates a sound field representation that is invariant to the effect of scattering from the head and body, to the position and orientation of the robot, and to the individual positions of the microphones.
- The generic spherical harmonics representation facilitates the use of a wealth of newly developed algorithms, as detailed in WP2.
- This task will use results of T1.1 and T1.3 and will provide necessary input for T1.5 and T2.4.
- Spherical harmonics transformations for the microphone array data will be developed for the two benchmark designs by M18 and M27.
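A minimal sketch of a least-squares spherical-harmonics transform: the pressure samples are modelled as p = Y a, and the coefficient vector a is recovered by least squares. This assumes a free field and real spherical harmonics up to order N = 1 only; the actual transform in this task additionally accounts for head scattering via the array's HRTF dataset. The microphone directions and coefficients are illustrative.

```python
import numpy as np

def real_sh_order1(dirs):
    """Real spherical harmonics up to order N=1, evaluated at unit
    vectors `dirs` (shape: points x 3). Returns (points x 4)."""
    x, y, z = dirs.T
    c0 = 0.5 / np.sqrt(np.pi)        # Y_0^0
    c1 = np.sqrt(3 / (4 * np.pi))    # Y_1^{-1}, Y_1^0, Y_1^1
    return np.stack([np.full_like(x, c0), c1 * y, c1 * z, c1 * x], axis=1)

def sh_transform(pressure, mic_dirs):
    """Least-squares spherical-harmonics transform: solve p = Y a."""
    Y = real_sh_order1(mic_dirs)
    a, *_ = np.linalg.lstsq(Y, pressure, rcond=None)
    return a

# Illustrative 12-microphone directions (random, not the benchmark layout).
rng = np.random.default_rng(2)
mic_dirs = rng.normal(size=(12, 3))
mic_dirs /= np.linalg.norm(mic_dirs, axis=1, keepdims=True)

# Synthesize a field that is exactly order-1, with known coefficients.
a_true = np.array([1.0, 0.3, -0.5, 0.2])
p = real_sh_order1(mic_dirs) @ a_true

a_hat = sh_transform(p, mic_dirs)
print(np.round(a_hat, 3))  # ≈ a_true
```

Because the coefficients live in the spherical-harmonics domain rather than at microphone positions, the same downstream algorithms (WP2) can be reused unchanged across array geometries, which is the invariance the objectives describe.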

13 T1.4 Sound field representation and analysis
Methodology developed and applied
- Spherical harmonics transformation for benchmarks 1 and 2; applied to the demo.
- Aliasing cancellation developed for improved spherical harmonics transformation, as part of beamforming [1] and plane-wave decomposition [2].
- Theory and simulation tools developed for spherical harmonics representations and transformations [3,4,5].
- Enhancement of measured spherical harmonics by interference masking [6].

[1] D. L. Alon and B. Rafaely, "Beamforming with Optimal Aliasing Cancellation in Spherical Microphone Arrays," IEEE Trans. Audio Speech Lang. Proc., Vol. 24(1), January 2016.
[2] D. L. Alon, J. Sheaffer and B. Rafaely, "Plane-Wave Decomposition with Aliasing Cancellation for Binaural Sound Reproduction," 139th Audio Eng. Soc. Convention (9449), New York, USA, October 2015.
[3] V. Tourbabin and B. Rafaely, "On the Consistent Use of Space and Time Conventions in Array Processing," Acta Acustica united with Acustica, Vol. 101(3), May 2015.
[4] J. Sheaffer, M. van Walstijn, B. Rafaely and K. Kowalczyk, "Binaural Reproduction of Finite Difference Simulations using Spherical Array Processing," IEEE Trans. Audio Speech Lang. Proc., Vol. 23(12), December 2015.
[5] J. Sheaffer, M. van Walstijn, B. Rafaely and K. Kowalczyk, "A Spherical Array Approach for Simulation of Binaural Impulse Responses using the Finite Difference Time Domain Method," Proceedings of Forum Acusticum, Krakow, Poland, September 2014.
[6] U. Abend and B. Rafaely, "Spatio-Spectral Masking for Spherical Array Beamforming," ICSEE 2016, Eilat, Israel.

14 T1.4 Sound field representation and analysis
Quality measures

Indicator: SNR
Method of measurement: Signal (65 dB) to noise (sensor noise) ratio per frequency, over the speech range, for each spherical harmonic.
Target value: 20 dB
Milestone / Deliverable: MS4, D1.2
Results: Benchmarks I, II: 20 dB up to 1800 Hz; 10 dB up to 3300 Hz

Indicator: Oversampling
Method of measurement: The total number of usable spherical harmonics, (N+1)^2 for order N, relative to the number of microphones.
Target value: 0.75
Results: 0.75 (order 2) up to 3000 Hz; 0.33 (order 1) up to 3300 Hz
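The oversampling figures follow directly from the (N+1)^2 harmonic count. Assuming the 12-microphone benchmark array described in T1.1:

```python
n_mics = 12  # benchmark head array (12 MEMS microphones)

# An order-N sound field is described by (N+1)^2 spherical harmonics;
# the oversampling ratio relates that count to the microphone count.
for order in (1, 2):
    usable = (order + 1) ** 2
    print(f"order {order}: {usable} harmonics, ratio {usable / n_mics:.2f}")
```

Order 2 gives 9/12 = 0.75 and order 1 gives 4/12 ≈ 0.33, matching the reported results for the two frequency ranges.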

15 T1.5 Audio-visual data alignment
Objectives [INRIA to complete T1.5]
- Sound field data is to be aligned with visual data recorded by a stereoscopic camera pair.
- Data-driven approach to calibration: broadband sound and light sources are excited simultaneously.
- A piecewise-linear regression model will be used to approximate the visual-to-auditory mapping.
- The method will be used to calibrate the two benchmark robots in M18 and M27.
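The piecewise-linear regression idea can be sketched in one dimension: partition the visual coordinate into intervals and fit an independent affine model per interval. The data here are a toy stand-in (a hypothetical normalized image azimuth mapped to a hypothetical auditory cue); the actual calibration uses stereoscopic image coordinates and jointly excited audio-visual sources.

```python
import numpy as np

def fit_piecewise_linear(x, y, edges):
    """Fit an affine model y ≈ a*x + b on each interval
    [edges[i], edges[i+1]). Returns a list of (a, b) per piece."""
    models = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        m = (x >= lo) & (x < hi)
        A = np.stack([x[m], np.ones(m.sum())], axis=1)
        coef, *_ = np.linalg.lstsq(A, y[m], rcond=None)
        models.append(coef)
    return models

def predict(models, edges, x):
    """Evaluate the piecewise model; out-of-range points use edge pieces."""
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1,
                  0, len(models) - 1)
    return np.array([models[i][0] * xi + models[i][1]
                     for i, xi in zip(idx, x)])

# Toy calibration data: a mildly nonlinear visual-to-auditory mapping
# observed with noise (both variables hypothetical).
rng = np.random.default_rng(3)
x = rng.uniform(-1, 1, 400)                            # "image azimuth"
y = np.sin(1.5 * x) + 0.01 * rng.normal(size=x.size)   # "auditory cue"

edges = np.linspace(-1, 1, 5)  # 4 linear pieces
models = fit_piecewise_linear(x, y, edges)
err = np.abs(predict(models, edges, x) - y).mean()
print(f"mean absolute fit error: {err:.3f}")
```

With enough pieces, the piecewise-linear model approximates a smooth nonlinear mapping arbitrarily well while each piece remains a trivial least-squares fit, which is what makes it attractive for data-driven calibration.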

16 T1.5 Audio-visual data alignment
Methodology developed and applied
Methods, results, related to objectives
[1] papers…

17 T1.5 Audio-visual data alignment
Quality measures
Indicator(s) Method(s) of measurement Target value Milestone / Deliverable Results

18 WP1 Deliverables and Milestones
D1.1 Annotated database for acoustic signal extraction, audio-visual localisation and recognition tasks [FAU]
D1.2 Microphone array design for humanoid robots [BGU]
MS3 Robot audition algorithms in generic EARS scenario [IMPERIAL]
MS4 Microphone array designs for first prototype [BGU]

