
1 Objective intelligibility assessment of pathological speakers
Catherine Middag, Gwen Van Nuffelen, Jean-Pierre Martens, Marc De Bodt
ELIS-DSSP, Sint-Pietersnieuwstraat 41, B-9000 Gent
SPACE Symposium, 05/02/09

2 Introduction
Intelligibility is a popular measure for pathological speech assessment.
Perceptual assessment is affected by non-speech information:
– familiarity with the speaker and the type of disorder
– use of linguistic context
Word intelligibility tests are designed to eliminate bias due to linguistic context.
Replacing the human listener by an automatic speech recognizer (ASR) can solve the other problems, but is the ASR sufficiently reliable?
– test case: automation of the Dutch Intelligibility Assessment (DIA)

3 Dutch Intelligibility Assessment (DIA)
50 isolated CVC words
intelligibility = percent phonemes correct
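The scoring rule on this slide can be sketched as follows. This is a minimal illustration, assuming each word is a fixed-length sequence of three phonemes and that a listener's response is compared with the target position by position. The word items and the function name `percent_phonemes_correct` are hypothetical and are not taken from the DIA material.

```python
def percent_phonemes_correct(targets, responses):
    """Percentage of target phonemes reproduced at the same position."""
    total = 0
    correct = 0
    for target, response in zip(targets, responses):
        total += len(target)
        # Position-wise comparison; assumes fixed-length CVC items.
        correct += sum(t == r for t, r in zip(target, response))
    return 100.0 * correct / total


# Hypothetical CVC items (not taken from the DIA word lists).
targets = [("p", "o", "l"), ("m", "A", "l"), ("b", "a", "s")]
responses = [("p", "o", "l"), ("n", "A", "l"), ("b", "e", "t")]

score = percent_phonemes_correct(targets, responses)
print(f"{score:.1f}% phonemes correct")
```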

4 How to apply ASR in the DIA?
Two approaches:
– let the ASR recognize the words and count the percentage of correct decisions
– let the ASR check how well the acoustics match the phonetic transcription of the target word (= alignment)
Our experience:
– intelligibility emerging from the first approach is insufficiently reliable
– therefore we developed a system based on alignment

5 System architecture: flow chart
acoustic feature sequence X_t + target speech transcription → Speech aligner → speaker features → Intelligibility Prediction Model → objective score

6 System architecture: flow chart
acoustic feature sequence X_t + target speech transcription → Speech aligner → speaker features → Intelligibility Prediction Model → objective score
Two systems:
– complex state-of-the-art HMM-based system (ASR-ESAT)
– simple system with a phonological layer (ASR-ELIS), which points more directly to articulatory problems

7 System architecture: flow chart
acoustic feature sequence X_t + target speech transcription → Speech aligner → speaker features → Intelligibility Prediction Model → objective score
Two feature sets:
– Phonemic features (patient has trouble pronouncing a certain phoneme)
– Phonological features (patient has problems with voicing, manner or place of articulation)

8 Extraction of phonemic features (PMF)
Speech aligner = ASR-ESAT
Frame  Phoneme  P(s_t|X_t)
1      #        0.7
2      #        0.5
3      /p/      0.4
4      /p/      0.8
5      /o/      0.6
6      /o/      0.8
7      /l/      0.6
8      #        0.3
Phonemic features:
#   : (0.7+0.5+0.3)/3
/p/ : (0.4+0.8)/2
/o/ : (0.6+0.8)/2
/l/ : 0.6
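The averaging on this slide can be reproduced with a few lines of Python. This is only a sketch of the arithmetic shown, using the eight frames from the slide. The function name `phonemic_features` is hypothetical, and how frames from different words are pooled in the actual system is not stated on the slide.

```python
from collections import defaultdict


def phonemic_features(frames):
    """Average P(s_t | X_t) over all frames aligned to each phoneme."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for phoneme, posterior in frames:
        sums[phoneme] += posterior
        counts[phoneme] += 1
    return {phoneme: sums[phoneme] / counts[phoneme] for phoneme in sums}


# (phoneme, P(s_t | X_t)) per frame, as listed on the slide.
frames = [
    ("#", 0.7), ("#", 0.5),
    ("/p/", 0.4), ("/p/", 0.8),
    ("/o/", 0.6), ("/o/", 0.8),
    ("/l/", 0.6),
    ("#", 0.3),
]

pmf = phonemic_features(frames)
for phoneme, value in pmf.items():
    print(phoneme, round(value, 3))
```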

9 Extraction of phonological features (PLF)
Speech aligner = ASR-ELIS
Columns: Frame, Phone, voiced P(K_1|X_t), back P(K_2|X_t), burst P(K_3|X_t)
1 # 0.1 0.2
2 # 0.1
3 /pcl/ 0.2 0.1
4 /p/ 0.2 0.6
5 /o/ 0.8 0.7 0.2
6 /o/ 0.6 0.9 0.0
7 /l/ 0.5 0.1
8 # 0.0
Phonological features:
Burst : 0.6
Back : (0.7+0.9)/2
Voiced : (0.8+0.6+0.5)/3

10 Extraction of phonological features (PLF)
Speech aligner = ASR-ELIS
Frame table as on slide 9.
Phonological features:
Not burst : (0.2+0.1+…
Not back : (0.1+0.1+…
Not voiced : (0.1+0.1+…

11 Extraction of phonological features (PLF)
Speech aligner = ASR-ELIS
Frame table as on slide 9.
Callout on the frame table: irrelevant features for these phones
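Slides 9 to 11 describe three cases per frame and feature: the feature should be on, should be off, or is irrelevant for the aligned phone. A minimal sketch of that bookkeeping follows. It uses only frames 1, 2, 5 and 6, with posteriors as far as they can be read from slides 9 and 10. The `canonical` table is an assumption made for illustration: in particular, treating burst as irrelevant for /o/ is not confirmed by the slides, and the function name `phonological_features` is hypothetical.

```python
def phonological_features(frames, canonical):
    """Return {(feature, should_be_on): mean posterior}.

    Frames where a feature is irrelevant for the aligned phone
    (canonical value None) do not contribute to that feature.
    """
    sums, counts = {}, {}
    for phone, posteriors in frames:
        for feature, p in posteriors.items():
            target = canonical[phone][feature]
            if target is None:  # irrelevant for this phone: skip
                continue
            key = (feature, target)
            sums[key] = sums.get(key, 0.0) + p
            counts[key] = counts.get(key, 0) + 1
    return {key: sums[key] / counts[key] for key in sums}


# Assumed canonical feature values (True = on, False = off, None = irrelevant).
canonical = {
    "#": {"voiced": False, "back": False, "burst": False},
    "/o/": {"voiced": True, "back": True, "burst": None},
}

# Frames 1, 2, 5 and 6 only; posteriors P(K_i | X_t) as read from the slides.
frames = [
    ("#", {"voiced": 0.1, "back": 0.1, "burst": 0.2}),
    ("#", {"voiced": 0.1, "back": 0.1, "burst": 0.1}),
    ("/o/", {"voiced": 0.8, "back": 0.7, "burst": 0.2}),
    ("/o/", {"voiced": 0.6, "back": 0.9, "burst": 0.0}),
]

plf = phonological_features(frames, canonical)
for (feature, is_on), value in sorted(plf.items()):
    label = feature if is_on else "not " + feature
    print(label, round(value, 3))
```

Because /l/ is left out here, the voiced average differs from the three-frame average on slide 9; the point of the sketch is the grouping into positive, negative and skipped cells.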

12 System architecture: flow chart
acoustic feature sequence X_t + target speech transcription → Speech aligner → speaker features → Intelligibility Prediction Model → objective score

13 Intelligibility prediction model (IPM)
Objective: map speaker features (PMF, PLF or combinations) to a speaker intelligibility score
Model training:
– train on DIA recordings
– pathological speakers (+ some normal control speakers)
Model type and size:
– limited number of pathological speakers
– high number of features
→ linear regression model
→ feature selection
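A linear regression model combined with feature selection can be sketched in plain Python as below. This is an illustration under stated assumptions, not the authors' procedure: the slide does not say which selection strategy or criterion was used, so the sketch uses greedy forward selection on the training residual sum of squares. All function names and the toy data are hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def fit(X, y, cols):
    """Least-squares weights (intercept first) using only the columns in cols."""
    rows = [[1.0] + [x[c] for c in cols] for x in X]
    k = len(cols) + 1
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(A, b)  # normal equations


def predict(w, x, cols):
    return w[0] + sum(wi * x[c] for wi, c in zip(w[1:], cols))


def rss(X, y, cols):
    """Residual sum of squares of the model fitted on the given columns."""
    w = fit(X, y, cols)
    return sum((predict(w, x, cols) - t) ** 2 for x, t in zip(X, y))


def forward_select(X, y, max_features):
    """Greedily add the feature that lowers the training RSS the most."""
    chosen, remaining = [], list(range(len(X[0])))
    while remaining and len(chosen) < max_features:
        best = min(remaining, key=lambda c: rss(X, y, chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
    return chosen


# Toy data: 6 hypothetical speakers, 3 features; the score depends on feature 1 only.
X = [[0.2, 0.9, 0.5], [0.7, 0.5, 0.1], [0.4, 0.7, 0.8],
     [0.9, 0.3, 0.4], [0.1, 0.8, 0.3], [0.6, 0.6, 0.9]]
y = [50 + 40 * x[1] for x in X]

selected = forward_select(X, y, max_features=1)
weights = fit(X, y, selected)
print(selected, [round(w, 3) for w in weights])
```

With few speakers and many features, a selection criterion measured on held-out speakers would normally be preferable to the training error used in this sketch.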

14 Reference material (DIA)
211 speakers:
– 51 normals
– 60 dysarthric
– 12 clefts
– 42 hearing impaired
– 37 with laryngectomy
– 7 with dysphonia
– 2 others
Pathological speakers: mean of 78.7%
Normals: mean of 93.3%
Few with very low score

15 Results: individual systems
Based on five-fold cross validation
Measure = Pearson Correlation Coefficient (PCC)
ELIS: PLF: PCC = 0.78
ESAT: PMF: PCC = 0.80
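The evaluation protocol named on this slide, five-fold cross validation scored with the Pearson correlation coefficient, can be sketched as follows. The sketch makes several assumptions that the slide does not specify: folds are assigned by speaker index modulo 5, the PCC is computed once over the pooled held-out predictions, and a one-feature line fit stands in for the real prediction model. The data are synthetic.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def cross_validated_predictions(x, y, folds=5):
    """Predict each speaker's score with a model trained on the other folds."""
    predictions = [None] * len(x)
    for k in range(folds):
        test = [i for i in range(len(x)) if i % folds == k]
        train = [i for i in range(len(x)) if i % folds != k]
        intercept, slope = fit_line([x[i] for i in train], [y[i] for i in train])
        for i in test:
            predictions[i] = intercept + slope * x[i]
    return predictions


# Synthetic data: 10 hypothetical speakers, one feature, noisy linear score.
x = [0.1 * i for i in range(1, 11)]
y = [20 + 80 * xi + (2 if i % 2 == 0 else -2) for i, xi in enumerate(x)]

pcc = pearson(cross_validated_predictions(x, y), y)
print(f"cross-validated PCC = {pcc:.3f}")
```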

16 Results: combined system
PMF + PLF: PCC = 0.86

17 Results: pathology-specific IPM
Instead of creating one general IPM, one can create IPMs for specific pathologies:
– still trained on all speakers (enough speakers)
– model selection based on the performance for speakers of that pathology (importance of features depends on type of disorder)
      Dysarthria  Laryngectomy  Hearing impairment
PCC   0.94        0.91          0.97
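The selection step described on this slide can be illustrated as follows: candidate models are all trained on every speaker, and the one kept for a pathology is the one whose predictions correlate best with the perceptual scores of that pathology's speakers. This is a sketch under assumptions. The candidate names, predictions and scores are made up, and the slide does not say how candidates are generated or compared in detail.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def select_for_pathology(candidates, scores, pathologies, target):
    """Pick the candidate with the highest PCC on speakers of one pathology.

    candidates maps a model name to its predictions for ALL speakers;
    only the speakers labelled `target` are used to compare models.
    """
    idx = [i for i, p in enumerate(pathologies) if p == target]
    reference = [scores[i] for i in idx]

    def subgroup_pcc(name):
        return pearson([candidates[name][i] for i in idx], reference)

    return max(candidates, key=subgroup_pcc)


# Hypothetical perceptual scores and predictions of two candidate models.
scores = [60, 70, 80, 90, 55, 75, 95]
pathologies = ["dysarthria"] * 4 + ["laryngectomy"] * 3
candidates = {
    "model_A": [61, 69, 81, 89, 80, 60, 70],
    "model_B": [80, 65, 90, 70, 56, 74, 96],
}

best_dysarthria = select_for_pathology(candidates, scores, pathologies, "dysarthria")
best_laryngectomy = select_for_pathology(candidates, scores, pathologies, "laryngectomy")
print(best_dysarthria, best_laryngectomy)
```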

18 Results: pathology-specific IPM
Dysarthria: 0.94 (red circles)
Dispersion of other speakers is increased
Largest deviations in the low intelligibility area:
– scarce data in that area
– can be solved by adding more weight to patients with very low intelligibility
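One way to give more weight to patients with very low intelligibility is weighted least squares. The sketch below shows the idea for a single feature. The threshold of 50 and the weight of 5 are arbitrary illustrative values, the data are made up, and the slides do not say which weighting scheme the authors had in mind.

```python
def weighted_fit_line(x, y, w):
    """Weighted least squares for y = intercept + slope * x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return my - slope * mx, slope


def low_score_weights(y, threshold=50.0, boost=5.0):
    """Hypothetical scheme: boost speakers whose score is below the threshold."""
    return [boost if yi < threshold else 1.0 for yi in y]


# Made-up data: one low-intelligibility speaker among three higher-scoring ones.
x = [0.0, 1.0, 2.0, 3.0]
y = [10.0, 80.0, 85.0, 90.0]

plain = weighted_fit_line(x, y, [1.0] * len(y))       # ordinary least squares
weighted = weighted_fit_line(x, y, low_score_weights(y))

plain_error = abs(plain[0] + plain[1] * x[0] - y[0])
weighted_error = abs(weighted[0] + weighted[1] * x[0] - y[0])
print(f"error on low-score speaker: {plain_error:.2f} -> {weighted_error:.2f}")
```

The weighted fit tracks the low-scoring speaker more closely, at the cost of a worse fit for the others.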

19 Development of DIA-tool
PMF and PLF can predict intelligibility of pathological speech:
– Combining PMF and PLF yields high PCCs: 0.86 for the general model and 0.91 or higher for the pathology-specific models
– PCCs for specific pathologies compete with subjective inter-rater agreements (0.91)
This opens up possibilities for the development of an automated version of the DIA (see demonstration later) based on PLF + PMF

20 New feature set: context-dependent phonological features (CD-PLF)
Until now:
– PMF: Does the patient have trouble pronouncing a certain phoneme?
– PLF: Does the patient have problems with voicing, manner or place of articulation?
New: Does the patient have problems with a desired change of voicing, manner or place of articulation?
→ CD-PLFs: how well is a change in PLF realized?

21 Extraction of context-dependent phonological features (CD-PLF)
Speech aligner = ASR-ELIS
Columns: Segment, Phone, voiced, burst, …
2 # 0.1 0.2
3 /pcl/ 0.2
4 /p/ 0.2 0.6
6 /o/ 0.6 0.1
7 /s/ 0.4 0.3
8 # 0.2 0.1
9 /m/ 0.7 0.3
10 /A/ 0.8 0.0
11 /l/ 0.6 0.1
12 # 0.1
CD-PLF features:
Voicing                Burst
Off, on, off : +0.6    Yes, no, no : +0.1
On, on, on : +0.8      No, no, no : +0.0
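The four contributions listed on this slide can be reproduced with a small sketch: for every segment with a left and a right neighbour, the posterior of the middle segment is filed under the triple of canonical feature values (previous, current, next). The posteriors come from the slide's table, and the on/off values follow the slide's own labels. Restricting the contexts to the phones /p o s/ and /m A l/, leaving out # and /pcl/, is an assumption of this sketch, and the names `seg` and `cd_plf` are hypothetical.

```python
def seg(phone, voiced, burst, p_voiced, p_burst):
    """One aligned segment: canonical feature values and measured posteriors."""
    return {
        "phone": phone,
        "target": {"voiced": voiced, "burst": burst},
        "posterior": {"voiced": p_voiced, "burst": p_burst},
    }


def cd_plf(words, features):
    """Mean posterior of the middle segment per (feature, context triple)."""
    grouped = {}
    for word in words:
        # Slide over (previous, current, next) segments inside each word.
        for prev, cur, nxt in zip(word, word[1:], word[2:]):
            for f in features:
                context = (prev["target"][f], cur["target"][f], nxt["target"][f])
                grouped.setdefault((f, context), []).append(cur["posterior"][f])
    return {key: sum(v) / len(v) for key, v in grouped.items()}


words = [
    [seg("/p/", False, True, 0.2, 0.6),
     seg("/o/", True, False, 0.6, 0.1),
     seg("/s/", False, False, 0.4, 0.3)],
    [seg("/m/", True, False, 0.7, 0.3),
     seg("/A/", True, False, 0.8, 0.0),
     seg("/l/", True, False, 0.6, 0.1)],
]

features = cd_plf(words, ["voiced", "burst"])
for (name, context), value in features.items():
    print(name, context, value)
```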

22 Results for CD-PLF
CD-PLFs alone compete with the previous best
PLF + PMF: 0.86
CD-PLF + PMF: 0.90 → new best!
Pathology-specific results for CD-PLF + PMF:
      Dysarthria  Laryngectomy  Hearing impairment
PCC   0.95        0.94          0.98

23 Conclusions and future work
PMF, PLF and CD-PLF can predict intelligibility of pathological speech
– CD-PLFs seem to play an important role: CD-PLF: PCC = 0.87; CD-PLF + PMF: PCC = 0.90
→ not the articulation pattern but the change in the articulation pattern matters?
– More research is needed before adding this feature set to the tool
High PCCs open up new possibilities for:
– more profound articulatory assessment, which is directly related to determination of appropriate therapy
– monitoring of effectiveness of chosen therapy → tool
– using more natural speech (words, phrases) in tests

24 Questions?

