Published by Moses Singleton. Modified over 9 years ago.
1
Roland Goecke, Trent Lewis, Michael Wagner
Big ASC Meeting, 15-16 April 2010
2
What is Calibration?
- Calibration is not so much a data collection process, although that will happen to some extent as well.
- Rather, it is about ensuring that the hardware components and black box setup are all correct,
- or at least that the settings have been recorded for subsequent analysis.
- It occurs before the actual data recording takes place.
3
What is included?
- Equipment checking
  - Is the audio and video capturing software running?
  - Are the lights set up correctly?
- "Recording" of environmental settings
  - What is the light level?
  - What is the acoustic background noise level?
  - What are the distances between camera(s) and microphone(s)?
- "Recording" of subject calibration sequences
  - Face turning
  - Lip movements
4
Why is this Data Important?
- The calibration data is potentially fundamental to everyone who will use the corpus.
- To name just a few research areas that will pay particular attention to the calibration data:
  - Audio (A) and audio-visual (AV) speech recognition
  - A and AV speaker recognition
  - Biometrics (face recognition, face-voice recognition)
  - Speech perception / psycho-acoustics research
  - Speech and hearing research
5
Hardware and Software Requirements
- Normal recording equipment and software
- An additional light meter would be useful to measure the 'global' level of light in the recording environment
  - Do we need something similar for measuring the acoustic background noise?
- Swivel chair to seat the subject
  - Assists the capturing of the face/head from different angles
  - We want the subjects to turn with the chair, not just turn their heads
  - This is more accurate
- Masking tape to mark chair position, angles, etc.
- Metronome (for AV synchronisation)
6
Collection Process – Step 1
- 2-step process
- Step 1 – Record environment without subject
  - At the beginning of each session or, for sessions running over longer periods of time, once every hour in case the environmental conditions have changed
  - Audio and video recording of the recording environment without a subject present (30 s)
  - Audio and video recording of the metronome in the scene (30 s)
  - Measurement of location of light sources and distance to camera(s) (manual measurement)
  - Check camera output is being recorded
  - Check microphone output is being recorded
- Time: 5 min
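As a rough sketch of the Step 1 timing rule, the following Python lists the times at which the environment calibration would be repeated during a long session. The function name is hypothetical, and reading "once every hour" as a fixed interval counted from the session start is an assumption, not something the slides specify.

```python
from datetime import datetime, timedelta


def environment_calibration_times(session_start, session_end,
                                  interval=timedelta(hours=1)):
    """Return the times at which Step 1 (environment calibration) is due.

    Assumption: the first calibration happens at the session start and is
    repeated every `interval` for as long as the session is still running.
    """
    if session_end < session_start:
        raise ValueError("session_end must not be before session_start")
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")

    times = [session_start]          # always calibrate at the beginning
    t = session_start + interval
    while t < session_end:           # repeat only while the session lasts
        times.append(t)
        t += interval
    return times


if __name__ == "__main__":
    start = datetime(2010, 4, 15, 9, 0)
    end = datetime(2010, 4, 15, 11, 30)
    for t in environment_calibration_times(start, end):
        print(t.strftime("%H:%M"), "- record empty scene (30 s) and metronome (30 s)")
```

Each listed time corresponds to one full pass through Step 1, which the slide budgets at about 5 minutes.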
7
Collection Process – Step 2
- Step 2 – Person-specific calibration
  - At the beginning of each recording session with a subject
  - Sit the subject on the swivel chair. Measure the distances from camera(s) to subject and from microphone(s) to subject (manual measurement)
  - Turn the subject to 90° left. It is important that the subject turns their entire body on the swivel chair such that the face (nose?) points in the required direction. We will need markers on the floor as well as on the walls at 15° intervals to facilitate the correct turning of the subjects.
  - Turn the subject to every 15° position from -90° (left profile) to +90° (right profile), holding each position for 2 s
8
Collection Process – Step 2
- Let the subject face the camera frontally. Participants say the following two lip movement calibration sequences for 5 s each:
  - e o e o e o … (testing lip rounding)
  - ba ba ba … (testing vertical mouth opening)
- This is similar to what was done in the AVOZES corpus, where it turned out to be quite useful for understanding the range of lip movements a subject makes
- Other sequences are possible
- Time: 5 min
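The Step 2 sequence can be written out as a simple prompt schedule. This is only an illustrative sketch: the function names and prompt wording are hypothetical, and the durations cover the hold and speaking times named on the slides, not the time needed to rotate the chair or take the manual measurements.

```python
def turning_angles(start=-90, stop=90, step=15):
    """Angles in degrees from left profile (-90) to right profile (+90)."""
    return list(range(start, stop + 1, step))


def step2_schedule(hold_s=2.0, lip_s=5.0):
    """Return a list of (prompt, duration in seconds) for the subject calibration."""
    schedule = [(f"turn to {angle:+d} deg", hold_s) for angle in turning_angles()]
    # Lip movement sequences are recorded with the subject facing the camera.
    schedule.append(("face camera, say 'e o e o ...' (lip rounding)", lip_s))
    schedule.append(("face camera, say 'ba ba ba ...' (vertical mouth opening)", lip_s))
    return schedule


if __name__ == "__main__":
    plan = step2_schedule()
    for prompt, duration in plan:
        print(f"{duration:4.1f} s  {prompt}")
    print("total recorded time:", sum(d for _, d in plan), "s")
```

With 15° steps there are 13 head positions, so the held poses and the two lip sequences together account for well under a minute of the 5 minutes budgeted; the rest is available for seating, measuring and turning.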
9
Coding and Annotation
- No coding or annotation required as such
- We want to take note of the environmental conditions in which the recordings take place
  - Light level
  - Acoustic base level, i.e. the level when no one is talking: can it be determined sufficiently from the audio recorded without a subject? If so, no extra measurements are required here.
  - Distance of camera(s) to subject(s)
  - Distance of microphone(s) to subject(s), e.g. to chin or mouth
  - Location of light sources and distance to camera(s) or subject(s) (we may need a sketch of the recording environment for each location)
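One possible way to estimate the acoustic base level from the subject-free recording is the RMS level of the samples, as sketched below. The function name is hypothetical, and the sketch assumes the audio has already been decoded to floating-point samples in the range -1.0 to 1.0. The result is relative to digital full scale (dBFS), not an absolute sound pressure level, which would require a calibrated reference signal or a sound level meter.

```python
import math


def rms_dbfs(samples):
    """RMS level of `samples` in dB relative to a full-scale amplitude of 1.0.

    Assumption: samples are floats in [-1.0, 1.0]. With this reference a
    full-scale sine wave reads about -3 dBFS.
    """
    if not samples:
        raise ValueError("no samples given")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")         # digital silence
    return 10.0 * math.log10(mean_square)


if __name__ == "__main__":
    # Synthetic stand-in for 30 s of room noise: a constant low-level signal.
    quiet_room = [0.001] * 1000
    print(f"noise floor: {rms_dbfs(quiet_room):.1f} dBFS")
```

Comparing this value across sessions only makes sense if the microphone gain settings are unchanged or have been recorded along with the other environmental conditions.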