Harmony in Motion
Zohar Barzelay, Yoav Y. Schechner
Dept. Elect. Eng., Technion – Israel Institute of Technology
Ack: Einav Namer, Yael Waissman, ISF
Violin-guitar: raw
Violin: Detected and Recovered
Guitar: Detected and Recovered
Video features: track all, find the best
Finding an Audio-Visual Object (AVO)
Spatial matching: many “coincidences”
Corresponding images?
* Always: unmatched features
* Good image match: many “coincidences”
* Spatial edges
Spatial matching
* Feature-based
* Feature = significant change in space: edge, corner
* Maximize coincidences
* No need to match everything
Audio-Visual matching
* Feature-based
* Feature = significant change in time: temporal edge
* Maximize coincidences
* No need to match everything
Feature-based Cross-Modal Matching
Feature-based Cross-Modal Matching [plot: acceleration vs. time [frames]]
Feature-based Cross-Modal Matching: ‘Visual Onsets’ and ‘Audio Onsets’ [plots: binary (0/1) onset indicators vs. t; audio amplitude vs. t]
Audio-Visual Coincidences
Audio Pre-processing [figure: amplitude vs. t, transformed by F into a spectrogram (frequency vs. t), with energy vs. frequency]
Significant change in audio: Audio Onsets, the beginning of new sounds [figure: spectrogram (frequency vs. t) and its temporal derivative vs. t]
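A minimal sketch of the audio-onset idea on this slide, assuming a magnitude spectrogram from a plain NumPy short-time Fourier transform. Onsets are taken as frames where the positive temporal derivative, summed over frequency, is a significant local peak. The function names, frame sizes, and the mean-plus-k-standard-deviations threshold are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def spectrogram(x, n_fft=512, hop=256):
    """Magnitude spectrogram of a 1-D signal, shape (frames, frequency bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1))

def audio_onsets(spec, k=3.0):
    """Binary onset vector (one entry per frame).

    Assumed rule: a frame is an onset if the energy rise summed over
    frequency exceeds mean + k * std and is a local maximum in time.
    """
    # Keep only increases: a new sound adds energy, a decaying one does not.
    rise = np.maximum(np.diff(spec, axis=0), 0.0).sum(axis=1)
    rise = np.concatenate([[0.0], rise])  # align with frame indices
    threshold = rise.mean() + k * rise.std()
    onsets = np.zeros(len(rise), dtype=int)
    for t in range(1, len(rise) - 1):
        if rise[t] > threshold and rise[t] >= rise[t - 1] and rise[t] >= rise[t + 1]:
            onsets[t] = 1
    return onsets
```

For a signal that is silent and then starts a steady tone, the detected onset frames should cluster around the frame in which the tone begins.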
Handling pitch-drift
Handling pitch-drift [figure: spectrogram; directional derivative vs. non-directional derivative]
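The slide contrasts a directional with a non-directional derivative of the spectrogram but does not give a formula. The sketch below shows one plausible reading, stated here as an assumption: the non-directional derivative compares each bin with the same bin in the previous frame, while the directional variant compares it with the strongest bin within a few neighbouring frequencies in the previous frame, so that energy that merely drifted in pitch is not counted as a new sound. The names and the `max_shift` parameter are hypothetical.

```python
import numpy as np

def temporal_derivative(spec):
    """Non-directional: positive change along time at a fixed frequency bin."""
    d = np.zeros_like(spec, dtype=float)
    d[1:] = np.maximum(spec[1:] - spec[:-1], 0.0)
    return d

def directional_derivative(spec, max_shift=2):
    """Directional (assumed form): positive change relative to the strongest
    bin within +/- max_shift bins in the previous frame."""
    n_bins = spec.shape[1]
    padded = np.pad(spec, ((0, 0), (max_shift, max_shift)), mode="constant")
    # neighbourhood_max[t, f] = max of spec[t, f - max_shift : f + max_shift + 1]
    neighbourhood_max = np.max(
        np.stack([padded[:, s:s + n_bins] for s in range(2 * max_shift + 1)]),
        axis=0)
    d = np.zeros_like(spec, dtype=float)
    d[1:] = np.maximum(spec[1:] - neighbourhood_max[:-1], 0.0)
    return d
```

On a synthetic tone whose frequency rises by one bin per frame, the non-directional derivative responds in every frame, whereas the directional one responds only where the tone starts.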
Visual Matching [plots: binary (0/1) onset indicators vs. t]
Visual Matching [plots: amplitude vs. t]
Ranking Criterion: coincidences and inconsistencies [plots: binary (0/1) onset indicators vs. t]
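The ranking criterion can be illustrated with binary onset vectors. In this sketch a coincidence is a frame where both the visual and the audio onset indicators are 1, an inconsistency is a visual onset with no audio onset, and the score is their difference. That exact scoring rule and the function names are assumptions made for illustration; the slide names only the two quantities.

```python
import numpy as np

def match_score(visual_onsets, audio_onsets):
    """Coincidences minus inconsistencies for one visual feature."""
    v = np.asarray(visual_onsets, dtype=int)
    a = np.asarray(audio_onsets, dtype=int)
    coincidences = int(np.sum(v * a))           # visual onset with audio onset
    inconsistencies = int(np.sum(v * (1 - a)))  # visual onset without audio onset
    return coincidences - inconsistencies

def rank_features(visual_onset_matrix, audio_onsets):
    """Return feature indices sorted from best to worst score, plus the scores."""
    scores = [match_score(v, audio_onsets) for v in visual_onset_matrix]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores
```

Audio onsets that no visual feature explains do not lower the score, in line with the "no need to match everything" principle stated earlier in the talk.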
Residual Audio Onsets [plots: coincidences and Residual Onsets, binary (0/1) vs. t]
Sequential Object Detection [plots: amplitude vs. t; Residual Onsets, binary (0/1) vs. t]
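The two slides above suggest a greedy loop: pick the best-matching visual feature, remove the audio onsets it explains, and repeat on the residual onsets. A minimal sketch follows, under the assumption that the score is coincidences minus inconsistencies and that detection stops when the best score falls below a threshold. `max_objects` and `min_score` are hypothetical parameters.

```python
import numpy as np

def detect_objects(visual_onset_matrix, audio_onsets, max_objects=3, min_score=1):
    """Greedy sequential detection on binary onset vectors.

    Returns a list of (feature index, matched audio onsets) and the
    audio onsets left unexplained at the end.
    """
    V = np.asarray(visual_onset_matrix, dtype=int)
    residual = np.asarray(audio_onsets, dtype=int).copy()
    remaining = list(range(len(V)))
    detected = []
    while remaining and len(detected) < max_objects:
        # v * (2r - 1) adds +1 per coincidence and -1 per inconsistency.
        scores = {i: int(np.sum(V[i] * (2 * residual - 1))) for i in remaining}
        best = max(remaining, key=lambda i: scores[i])
        if scores[best] < min_score:
            break
        matched = V[best] * residual
        detected.append((best, matched))
        residual = residual - matched  # explained onsets leave the pool
        remaining.remove(best)
    return detected, residual
```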
Speech: raw
Speech A-B-C: Detected & Recovered
Speech 1-2-3: Detected & Recovered
Audio Isolation
Audio Pre-processing [figure: amplitude vs. t, transformed by F into a spectrogram (frequency vs. t), with energy vs. frequency]
Audio Isolation: Corresponding Onsets [figure: spectrogram, frequency vs. t]
Audio Isolation: Corresponding Onsets, Harmonic Sounds [figure: spectrogram, frequency vs. t]
Fourier representation [figure: amplitude vs. t, transformed by F into a spectrogram (energy vs. frequency) and phase vs. frequency]
Filtered audio [figure: filtered spectrogram (energy vs. frequency) combined with the old phase and transformed by F⁻¹ into amplitude vs. t]
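The isolation pipeline on these slides (transform, filter the spectrogram, reuse the old phase, invert) can be sketched with a NumPy-only short-time Fourier transform. The time-frequency mask is an input here; how it is built from the corresponding onsets and harmonic sounds is not reproduced. Function names, window choice, and frame sizes are assumptions for illustration.

```python
import numpy as np

def stft(x, n_fft=512, hop=128):
    """Complex short-time Fourier transform, shape (frames, bins)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)

def istft(Z, n_fft=512, hop=128):
    """Inverse transform by weighted overlap-add."""
    window = np.hanning(n_fft)
    out = np.zeros(n_fft + hop * (Z.shape[0] - 1))
    norm = np.zeros_like(out)
    frames = np.fft.irfft(Z, n=n_fft, axis=1)
    for i in range(Z.shape[0]):
        out[i * hop:i * hop + n_fft] += frames[i] * window
        norm[i * hop:i * hop + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)

def isolate(x, mask, n_fft=512, hop=128):
    """Filter the magnitude with `mask`, keep the original ("old") phase."""
    Z = stft(x, n_fft, hop)
    magnitude, phase = np.abs(Z), np.angle(Z)
    filtered = (magnitude * mask) * np.exp(1j * phase)
    return istft(filtered, n_fft, hop)
```

The mask may have shape (bins,) or (frames, bins). The first and last few samples are attenuated by the window, so comparisons are best made away from the signal edges.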
Limitations: Temporal Tolerance, ¼ sec [plots: binary (0/1) onset indicators vs. t]
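Temporal tolerance can be illustrated by widening each audio onset into a small window before counting coincidences, so that a visual onset a few frames early or late still matches. This is a sketch under that assumption; the window length in frames (`tol_frames`) is a hypothetical parameter that would be derived from a tolerance in seconds multiplied by the video frame rate.

```python
import numpy as np

def dilate_onsets(onsets, tol_frames):
    """Mark every frame within +/- tol_frames of an onset.

    Assumes the onset vector is at least 2 * tol_frames + 1 frames long.
    """
    onsets = np.asarray(onsets, dtype=int)
    kernel = np.ones(2 * tol_frames + 1, dtype=int)
    return (np.convolve(onsets, kernel, mode="same") > 0).astype(int)

def coincidences_with_tolerance(visual_onsets, audio_onsets, tol_frames):
    """Count visual onsets that fall within the tolerance of an audio onset."""
    v = np.asarray(visual_onsets, dtype=int)
    return int(np.sum(v * dilate_onsets(audio_onsets, tol_frames)))
```

A larger tolerance recovers slightly misaligned matches but also admits more accidental coincidences, which is why the slide lists it under limitations.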
Limitations: Audio Sparsity
* Time-Frequency overlap
* Sounds may overlap in time
* Onsets should not
[figure: overlapping audio onsets, frequency vs. t]
Detection Parameters
Visual Edges [plot: acceleration vs. time]
Feature-Detection:
– edge scale
– significance level
– pruning
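The three detection parameters can be made concrete with a small visual-onset detector that works on the trajectory of one tracked feature. This is only a sketch: `scale` sets the smoothing width applied to the acceleration, `significance` sets a threshold in standard deviations, and `min_gap` prunes onsets that are too close together. All three names, their defaults, and the specific rules are assumptions, not the detector used in the talk.

```python
import numpy as np

def visual_onsets(trajectory, scale=2, significance=2.0, min_gap=5):
    """Binary onset vector from a (frames, 2) image-plane trajectory."""
    p = np.asarray(trajectory, dtype=float)
    # Acceleration as the second temporal derivative of position.
    accel = np.gradient(np.gradient(p, axis=0), axis=0)
    # Edge scale: box-filter the acceleration over 2 * scale + 1 frames.
    kernel = np.ones(2 * scale + 1) / (2 * scale + 1)
    smooth = np.column_stack(
        [np.convolve(accel[:, d], kernel, mode="same") for d in range(p.shape[1])])
    strength = np.linalg.norm(smooth, axis=1)
    # Significance level: keep frames well above the typical strength.
    threshold = strength.mean() + significance * strength.std()
    candidates = np.flatnonzero(strength > threshold)
    # Pruning: visit candidates from strongest to weakest, enforce a minimum gap.
    onsets = np.zeros(len(p), dtype=int)
    kept = []
    for t in candidates[np.argsort(-strength[candidates], kind="stable")]:
        if all(abs(int(t) - k) >= min_gap for k in kept):
            kept.append(int(t))
            onsets[t] = 1
    return onsets
```

A feature that rests and then starts moving abruptly yields a single onset near the frame where the motion begins; a stationary feature yields none.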
Dual Violin
Feature-based Cross-Modal Association
* Features: Temporal Audio/Visual Edges.
* Simultaneous Objects + Sounds.
* A General Concept.