Identification of voices in disguised speech
Jessica Clark* & Paul Foulkes**
* University of York
** University of York & JP French Associates
pf11@york.ac.uk
IAFPA, Göteborg 2006
0.1 outline
experiment to test the ability of lay listeners to identify disguised familiar voices
voices were disguised artificially, as with commercially available voice changers
–pitch modified
0.2 structure
1. introduction
–rationale for experiment
2. experimental design
–speakers
–listeners
–Control condition
–Experimental conditions
3. results
4. discussion & conclusion
1. Introduction
technical speaker identification is the most frequent task for the forensic phonetician
lay identification is also common in legal cases
many previous studies have thus examined lay listeners' ability to identify voices, and the factors which affect that ability
1.1 previous studies
identification is not automatic or flawless
listeners can make errors even with highly familiar voices
–Ladefoged did not recognise his mother from a short sample (Ladefoged & Ladefoged 1980)
–flatmates scored only 68% with 10-second samples (Foulkes & Barron 2000)
1.1 previous studies
identification may be affected by (Bull & Clifford 1984):
–type of exposure (active/passive)
–length of sample
–nature of sample (phone, direct, shouting etc.)
–delay between exposure and test
–age of listener
–hearing ability
–sightedness
–natural variability across individual listeners
–specific features of voice
–degree of familiarity
–nature and extent of any disguise
1.2 degree of familiarity
all things equal, more familiar voices are easier to identify
e.g. Hollien, Majewski & Doherty (1982)
–listening tests with 10 male voices

listener group | N  | % correct (normal condition)
familiar       | 10 | 98
trained        | 47 | 40
unfamiliar     | 14 | 27
1.3 disguise
all things equal, disguised voices are harder to identify
e.g. Hollien, Majewski & Doherty (1982)
–various forms of disguise used

listener group | N  | % correct (normal) | % correct (disguised)
familiar       | 10 | 98                 | 79
trained        | 47 | 40                 | 21
unfamiliar     | 14 | 27                 | 18
machine approach (LTAS): 30
1.3 disguise
previous studies have examined various types of disguise
–whisper, pencils between teeth, hypernasality, dialect change, rate change, professional mimics
but little if any work on voice changers
–hardware based
–software based
–easily available
11 www.maplin.co.uk www.crimebusters911.com www.blazeaudio.com
1.3 disguise
in our study we chose not to use real voice changers, in favour of total control over effects
pitch shift chosen as a universal function
2. Experimental design
2.1 design outline
simple design: listeners asked to identify samples of familiar voices
Control condition: unmodified stimuli
4 Experimental conditions: modified stimuli
2.1 design outline
degree of familiarity known to affect rate of successful identification
thus we trained listeners to identify a group of speakers
–controls degree of familiarity
–all listeners had exactly the same exposure in terms of length & quality of samples
–identification task carried out under same conditions
2.2 speakers
4 male speakers
–16-18 years old
taken from IViE corpus (Grabe, Post & Nolan 2001)
–Leeds dialect (nearest to York)
–reading text of Cinderella story

IViE speaker | Experimental name
JP           | Edward
JW           | Matthew
MD           | Harry
RP           | David
2.2 speakers
training materials created for each speaker
–c. 90 seconds of Cinderella (302 words)
–edited out disfluencies, non-speech sounds, long pauses
–samples normalised for amplitude with Audacity 1.2.5
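For illustration only: the study used Audacity's built-in Normalize effect, not code, but a peak-normalisation step of the kind described could be scripted roughly as below. The -3 dBFS target and the numpy/soundfile libraries are assumptions, not details from the talk.

```python
# Sketch of peak amplitude normalisation, roughly what a "Normalize"
# effect does. The -3 dBFS target is an assumption, not from the study.
import numpy as np
import soundfile as sf  # assumed library; the authors used Audacity 1.2.5

def normalise_peak(in_path, out_path, target_dbfs=-3.0):
    y, sr = sf.read(in_path)
    peak = np.max(np.abs(y))
    if peak == 0:
        raise ValueError("silent file")
    target = 10 ** (target_dbfs / 20.0)   # -3 dBFS is about 0.708 linear
    sf.write(out_path, y * (target / peak), sr)
```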
2.3 listeners
36 listeners
variety of regional/social backgrounds
York residents
age range 19-55
10 male, 26 female
2.4 Control condition
procedure: 1. training phase, 2. break, 3. listening test
training phase: all 36 listeners
–4 voices * 90 seconds = c. 6 minutes
–presented by PowerPoint with speakers' names
–Toshiba laptop
–Aiwa A170 headphones
–individually in quiet room
2.4 Control condition
break: 10 minutes (all 36 listeners)
2.4 Control condition
listening test: all 36 listeners
–8 stimuli (2 per speaker)
–duration c. 10 seconds
–5 second gap between stimuli
–extracts from other parts of Cinderella story
–normalised for amplitude with Audacity 1.2.5
–answer sheet with names
2.5 Experimental conditions
4 Experimental conditions
listening tests same format as Control condition, but stimuli modified for pitch
Sound Forge 8.0
–pitch shift effect
–accuracy setting 'high'
–speech 1 mode
–preserved durations
2.5 Experimental conditions
(i) +8 semitones
(ii) +4 semitones
(iii) -4 semitones
(iv) -8 semitones
pitch shifts > 8 semitones unnatural and partly incomprehensible
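As a minimal sketch of the kind of duration-preserving pitch shift described (the study used Sound Forge 8.0, not this code): a shift of n semitones scales f0 by 2^(n/12), so ±8 semitones corresponds to factors of roughly 1.59 and 0.63. librosa is assumed here purely for illustration.

```python
# Duration-preserving pitch shift, analogous to the Sound Forge settings
# described in the talk. librosa is an assumed substitute, not the authors' tool.
import librosa
import soundfile as sf

def shift_pitch(in_path, out_path, n_semitones):
    y, sr = librosa.load(in_path, sr=None)  # keep the original sample rate
    # +8 semitones scales f0 by 2**(8/12) ~ 1.59; -8 semitones by ~ 0.63
    y_shifted = librosa.effects.pitch_shift(y, sr=sr, n_steps=n_semitones)
    sf.write(out_path, y_shifted, sr)

# e.g. the four experimental conditions: +8, +4, -4, -8 semitones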
2.5 Experimental conditions

listener group | N  | conditions (semitones)
A              | 18 | -8, +4
B              | 18 | -4, +8
2.5 Experimental conditions
listening test 16-92 days after Control test
–no clear effects for length of delay
same training as in Control condition
10 minute break
2 stimuli for familiarisation
8 experimental stimuli per condition
–consecutive runs for + and - stimuli
–order reversed for half of each group, but no effect
3. Results
3.1 Control condition
average correct identification = 4.8/8 (60%)
3.1 Control condition
individual scores ranged from 0 to 8
29/36 performed better than chance
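For context: with four names on the answer sheet and 8 stimuli, guessing gives an expected score of 8 * 1/4 = 2 (25%), against the observed average of 60%. The sketch below works out the chance distribution; reading "better than chance" as scoring above 2/8 is my assumption, not something stated on the slide.

```python
# Chance baseline for the identification test: 8 stimuli, 4 possible names
# each, so p = 0.25 per stimulus. The ">2 correct" criterion is an assumption.
from scipy.stats import binom

n, p = 8, 0.25
print("expected score under guessing:", n * p)        # 2.0 of 8 (25%)
print("P(score > 2 | guessing):", binom.sf(2, n, p))  # ~0.32
print("P(score >= 5 | guessing):", binom.sf(4, n, p)) # ~0.027
```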
3.2 Experimental conditions
** sig. lower than in Control (p < .005, Wilcoxon)
trend (n.s.) for higher scores in + conditions
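A minimal sketch of a paired Wilcoxon comparison of this kind (each listener contributes a Control score and an Experimental score); the arrays below are purely hypothetical placeholders, not the study's data.

```python
# Sketch of the within-listener (paired) comparison reported on the slide.
# Score arrays are hypothetical placeholders, not the study's results.
from scipy.stats import wilcoxon

control_scores      = [6, 5, 7, 4, 8, 5, 6, 3, 5, 7]   # hypothetical, out of 8
experimental_scores = [4, 3, 5, 2, 6, 4, 3, 2, 4, 5]   # hypothetical, out of 8

stat, p = wilcoxon(control_scores, experimental_scores)
print(f"Wilcoxon W = {stat}, p = {p:.4f}")
```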
variability in listener performance, especially at ±4 semitones
majority perform above chance, except at -8 semitones
3.3 variation by listener sex
women sig. better in Control (p = .008, Mann-Whitney)
–trend (n.s.) maintained in Experimental tests
–same pattern reported by Bull & Clifford (1984)
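The female/male comparison is between independent groups, hence Mann-Whitney rather than a paired test; a minimal sketch, again with hypothetical placeholder scores rather than the reported data.

```python
# Sketch of the between-group (female vs male) comparison on Control scores.
# Scores are hypothetical placeholders; the study had 26 women and 10 men.
from scipy.stats import mannwhitneyu

female_scores = [6, 7, 5, 8, 6, 5, 7, 4, 6, 5]   # hypothetical
male_scores   = [4, 5, 3, 6, 4, 2, 5, 3]          # hypothetical

u, p = mannwhitneyu(female_scores, male_scores, alternative="two-sided")
print(f"Mann-Whitney U = {u}, p = {p:.4f}")
```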
3.4 summary
as predicted, identification rates were lower with disguised voices
–lowest scores with most extreme form of disguise (±8 semitones)
identification rates slightly better when pitch shifted up than down
trend for women to perform better than men
variability across listeners
4. Discussion & conclusion
4. discussion & conclusion
tests reported here were not forensically realistic
results may be affected by e.g.
–degree of familiarity with voice
–content of sample (vocabulary, syntax etc.)
–conditions of exposure (stress etc.)
–specific form of artificial disguise (software or hardware system)
–combination of effects
4. discussion & conclusion
considerable variation in listeners' scores
–courts should not assume all witnesses are equally good at such tasks
–supports broader principle that lay witnesses should be tested on their ability to identify a voice
4. discussion & conclusion
but even marked disguise was not catastrophic for listeners
a broadly positive conclusion for lay speaker identification
–a reasonable chance of identifying familiar voices
4. discussion & conclusion
but a less positive conclusion with respect to the use of voice changers as a means of protecting vulnerable witnesses giving evidence
–more extreme forms of modification may affect intelligibility & naturalness
–less extreme forms of modification may leave the witness's voice recognisable
–different modifications for different voices?
4. discussion & conclusion
as ever… more work is needed
thanks
thanks to Peter French, Phil Harrison, Robin How
References
Bull, R. & Clifford, B. (1984) Earwitness voice recognition accuracy. In G. Wells & E. Loftus (eds.) Eyewitness Testimony: Psychological Perspectives. Cambridge: CUP. pp. 92-123.
Foulkes, P. & Barron, A. (2000) Telephone speaker recognition amongst members of a close social network. Forensic Linguistics 7: 181-198.
Grabe, E., Post, B. & Nolan, F. (2001) English intonation in the British Isles: the IViE corpus. Final report to UK ESRC R000 237145. www.phon.ox.ac.uk/IViE
Hollien, H., Majewski, W. & Doherty, E. (1982) Perceptual identification of voices under normal, stress and disguise speaking conditions. Journal of Phonetics 10: 139-148.
Ladefoged, P. & Ladefoged, J. (1980) The ability of listeners to identify voices. UCLA Working Papers in Phonetics 49: 43-51.