Evaluating prosody prediction in synthesis with respect to Modern Greek prenuclear accents Elisabeth Chorianopoulou MSc in Speech and Language Processing.


1 Evaluating prosody prediction in synthesis with respect to Modern Greek prenuclear accents. Elisabeth Chorianopoulou, MSc in Speech and Language Processing, University of Edinburgh. Supervisor: Prof. D. R. Ladd. External Advisor: Robert Clark (CSTR).

2 Today's presentation: project's main goal; theoretical background; hypothesis; tools & methods; pilot experiment (design, results); future work.

3 Prosody prediction in modern TTS systems. Abstract level → Acoustics → Perception. Abstract level: prosodic structure, prominence, tune. Acoustic correlates and their perceptual counterparts: F0 (pitch), duration (rhythm), amplitude (loudness). Difficulties: the interaction of the correlates is not always clear; the text does not necessarily provide adequate information; speakers vary in both production and perception.

4 F0 prediction. Global F0 properties: declination, reset. Local F0 properties: contour shape, tonal targets, alignment. F0 predictors: syllable properties, word properties, rhythm, syntactic structure, information structure.
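
The two global properties named on this slide, declination and reset, can be illustrated with a minimal sketch. It assumes a linear declination line that restarts at every phrase break; the function name, the 120 Hz starting value and the -10 Hz/s slope are hypothetical illustration values, not figures from the presentation.

```python
def declination_baseline(times_s, phrase_starts_s, start_hz=120.0, slope_hz_per_s=-10.0):
    """Return a baseline F0 value (Hz) for each time point.

    The baseline falls linearly within a phrase (declination) and jumps
    back to start_hz at every phrase start (reset). All numbers are
    hypothetical illustration values.
    """
    starts = sorted(phrase_starts_s)
    if not starts:
        raise ValueError("at least one phrase start is required")
    baseline = []
    for t in times_s:
        # Most recent phrase start at or before t; fall back to the first one.
        current = max((s for s in starts if s <= t), default=starts[0])
        baseline.append(start_hz + slope_hz_per_s * (t - current))
    return baseline


if __name__ == "__main__":
    # Two phrases starting at 0.0 s and 2.0 s.
    print(declination_baseline([0.0, 1.0, 2.0, 2.5], [0.0, 2.0]))
```

Local properties such as tonal targets would then be superimposed on this baseline; a real system would more likely model declination on a logarithmic or semitone scale.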

5 Project's main goal. Intonational phonetics & phonology → prosody prediction in synthesis. Synthetic speech: insight into the role of tonal alignment. Naturalness judgements: effect, distribution. TTS system design?

6 Pre-nuclear accents. Prosodic units: IP (intonational phrase), iP (intermediate phrase). An iP contains one or more pitch accents. The final accent in an iP is the nuclear accent. All non-final accents are pre-nuclear.
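
The definition above amounts to a simple labelling rule, sketched here in Python. The function name is hypothetical, and treating the three accented words of a sentence as belonging to a single iP is an assumption made only for the example.

```python
def label_accents(accented_words):
    """Label the pitch accents of ONE intermediate phrase (iP).

    The last accent is nuclear; every earlier accent is pre-nuclear.
    """
    if not accented_words:
        return []
    labels = ["pre-nuclear"] * (len(accented_words) - 1) + ["nuclear"]
    return list(zip(accented_words, labels))


if __name__ == "__main__":
    # Assumption: all three accented words sit in the same iP.
    for word, label in label_accents(["anonimo", "grama", "anastatose"]):
        print(word, label)
```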

7 The case of Modern Greek (Arvaniti et al., 1998). Tonal targets: scaling & alignment. Modern Greek pre-nuclear accents have two tonal targets, an L and an H. Stability of the valley (F0 min) vs. variability of the peak (F0 max) → type of accent? (a) bitonal L*+H; (b) L* accent followed by an H phrase tone.

8 The case of Modern Greek (Arvaniti et al., 1998). [Diagram: H and L targets over the segmental string C0 V0 *C1 V1, with alignment offsets (+15 ms) and (-5 ms).] Tonal targets are independently aligned with specific points in the segmental string. The duration & slope of the F0 movement depend on segmental quality.
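
Independent alignment means that each target's time is computed from its own segmental anchor rather than from the other target. The sketch below assumes that the -5 ms offset belongs to the L, measured from the onset of the accented syllable, and that the +15 ms offset belongs to the H, measured from the onset of the first post-accentual vowel; this reading of the diagram, and all names in the code, are assumptions.

```python
from dataclasses import dataclass

# Offsets as labelled on the slide, in milliseconds.
L_OFFSET_MS = -5
H_OFFSET_MS = 15


@dataclass(frozen=True)
class AccentAnchors:
    """Segment boundaries (ms from utterance start) for one accent.

    Which boundary each target is measured from is an assumption here.
    """
    accented_syllable_onset_ms: int
    postaccentual_vowel_onset_ms: int


def tonal_target_times(anchors):
    """Return (L time, H time) in ms, each computed from its own anchor."""
    l_time = anchors.accented_syllable_onset_ms + L_OFFSET_MS
    h_time = anchors.postaccentual_vowel_onset_ms + H_OFFSET_MS
    return l_time, h_time


if __name__ == "__main__":
    # Hypothetical boundaries: accented syllable at 300 ms, next vowel at 520 ms.
    print(tonal_target_times(AccentAnchors(300, 520)))
```

Because the two anchors move with the segments, a longer accented syllable stretches the rise, which is one way to read the point that duration and slope depend on the segmental material.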

9 What does the project actually involve? Presuppose the validity of Arvaniti et al.'s findings. Apply them in synthetic speech (DEMOSTHeNES Speech Composer). Move the alignment points of both L and H (Praat). Run perceptual experiments (E-Prime).

10 Original hypothesis. Movements in alignment will not significantly influence the perception of naturalness. If perception is affected, late alignment of the F0 max is expected to have the greatest influence.

11 Test sentences. At least one unaccented syllable preceding the accented one. Accented vowel between nasals or laterals. At least two syllables before the following accent. Example sentence: Το ανώνυμο γράμμα την αναστάτωσε. (To ano*nimo gra*ma tin anasta*tose; "The anonymous letter upset her.")

12 DEMOSTHeNES. University of Athens, M-PIRO project. A modular system like Edinburgh's Festival (HRG, VSERVER, VCOM, VMOD). Prosody in DEMOSTHeNES: duration, pitch and amplitude are offered as VCOMs linked to the HRG. Current prosodic model: phrasing & lexical stress.

13 Output (Praat): F0 declination; reset at phrase breaks; limited pitch range; limited movements.

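14 Towards naturalness I. Apply the results of Arvaniti et al. to the default pitch contour of DEMOSTHeNES. [Diagram: H and L targets over the segmental string C0 V0 *C1 V1, with alignment offsets (-5 ms) and (+15 ms).] Applied not only to the first but also to the second stressed syllable.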

15 Output (Praat): F0 declination; same pitch range; more F0 movements.

16 Towards naturalness II: modifications in alignment. Targets are moved independently earlier or later than their normal alignment points: Early – Late, Late – Early, Normal – Late, etc. Size of the shift (L – H): 40 – 80 ms, 50 – 100 ms or 60 – 120 ms?
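
A minimal sketch of such a manipulation, assuming that "early" subtracts the step from a target's normal time, "late" adds it, and the two figures in each pair are the L step and the H step respectively. The function name, the example times and the ordering check are hypothetical additions for this sketch, not part of the described procedure.

```python
# Sign applied to the step for each alignment condition.
DIRECTION_SIGN = {"early": -1, "normal": 0, "late": 1}


def shift_targets(l_time_ms, h_time_ms, l_direction, h_direction, l_step_ms, h_step_ms):
    """Move the L and H targets independently; all times are in ms."""
    new_l = l_time_ms + DIRECTION_SIGN[l_direction] * l_step_ms
    new_h = h_time_ms + DIRECTION_SIGN[h_direction] * h_step_ms
    # Sanity check added for this sketch: the valley must still precede the peak.
    if new_l >= new_h:
        raise ValueError("shifted L target no longer precedes H target")
    return new_l, new_h


if __name__ == "__main__":
    # Hypothetical normal alignment: L at 295 ms, H at 535 ms.
    print(shift_targets(295, 535, "early", "late", 50, 100))   # Early L, Late H
    print(shift_targets(295, 535, "late", "early", 50, 100))   # Late L, Early H
```

The Late – Early condition squeezes the rise into a shorter interval, so with the largest steps and a short accented word the two targets could cross; the check makes that case explicit.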

17 Output: Early L (50 ms), Late H (100 ms).

18 Output: Late L (50 ms), Early H (100 ms).

19 Design of pilot perceptual experiment. 2 sentences: standard vs. modified alignment (N – N vs. Early – Late, Late – Early, Normal – Late). Naturalness judgement of pair comparisons. 12 native Greek speakers, students in Edinburgh. Aim: 40 – 80, 50 – 100 or 60 – 120 ms?

20 Results I

21 Results II

22 Future work. 10 sentences: standard vs. modified alignment (N – N vs. all possible combinations of Early, Normal and Late). Modifications by 40 – 80 and 60 – 120 ms. Native Greek speakers, Greece, July :-) Aim: patterns in the perception of naturalness?
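
One way to enumerate the planned conditions is sketched below. It assumes that every L and H combination other than the standard N – N counts as a modified version and that each is produced at both step sizes; the names are hypothetical, and the resulting count is a consequence of these assumptions rather than a figure from the presentation.

```python
from itertools import product

LEVELS = ("Early", "Normal", "Late")
# (L step, H step) in ms, as listed for the planned experiment.
STEP_PAIRS_MS = ((40, 80), (60, 120))


def modified_conditions():
    """Return (L level, H level, L step, H step) for every modified version."""
    conditions = []
    for l_level, h_level in product(LEVELS, repeat=2):
        if (l_level, h_level) == ("Normal", "Normal"):
            continue  # N - N is the standard version, not a modification
        for l_step, h_step in STEP_PAIRS_MS:
            conditions.append((l_level, h_level, l_step, h_step))
    return conditions


if __name__ == "__main__":
    conditions = modified_conditions()
    n_sentences = 10
    print(len(conditions), "modified versions per sentence")
    print(len(conditions) * n_sentences, "standard-vs-modified pairs in total")
```

Under these assumptions each sentence yields 16 modified versions, so the design may need pruning or splitting across listeners to keep a session manageable.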

23 The contribution of this project. Insight into the role of alignment in perceiving a synthetic utterance as natural. TTS system design. Results not restricted to Greek: evidence for segmental anchoring in other languages (studies of Dutch, German, English).

24 Sound files: DEMOSTHeNES; Arvaniti et al.; Early L (50 ms) – Late H (100 ms); Late L (50 ms) – Early H (100 ms).

