Interaction of the verbal, prosodic, and visual channels in speech understanding. A.A. Kibrik (Institute of Linguistics, Russian Academy of Sciences, and Lomonosov Moscow State University)


1 Interaction of the verbal, prosodic, and visual channels in speech understanding. A.A. Kibrik (Institute of Linguistics, Russian Academy of Sciences, and Lomonosov Moscow State University) aakibrik@gmail.com Yaroslavl, November 22, 2012

2 INTERACTION OF THE VERBAL, PROSODIC, AND VISUAL COMPONENTS in language understanding Andrej A. Kibrik (Institute of Linguistics RAN and Lomonosov Moscow State University) aakibrik@gmail.com Jaroslavl’ November 22, 2012

3 The mainstream linguistic approach
- Language consists of hierarchically organized segmental units, such as phonemes, morphemes, words, phrases, and sentences
- Linguistic form is thus equated with verbal form

4 However
- Apart from sound, there are other channels (or components) of communication, first of all vision (body language: gesture, facial expression, gaze, posture, etc.)
- Also, sound has prosodic, that is, non-verbal (non-segmental) aspects
- Imagine prosody-free talk
- or, vice versa, talk behind a wall

5 Communication channels
- The verbal component, prosody, and body language all count as distinct communication (or information) channels
- They all cooperate in getting the message from the speaker to the addressee
- This is what is sometimes called the multimodal approach
- Cf. Reformatskij 1963: How does the non-verbal “text” interact with the verbal text?

6 Multimodality
- “A multimodal approach assumes that the message is ‘spread across’ all the modes of communication. If this is so, then each mode is a partial bearer of the overall meaning of the message.” (Kress 2002)
- “Any use of language is inescapably multimodal” (Scollon 2006)
- “Unimpaired communication is, of course, inherently multimodal, with the speech content being modified by prosody and delivered in parallel with facial expression, gesture, posture, and a range of other nonverbal communication methods.” (Alm 2006)
- “Within biology, experimental psychology, and cognitive neuroscience, a separate rapidly growing literature has clarified that multisensory perception and integration cannot be predicted by studying the senses in isolation.” (Cohen and Oviatt 2006)

7 What is the contribution of different channels?
- Traditional approach of mainstream linguistics: the verbal channel is so central that prosody and the visual channel are at best downgraded to “paralinguistics”
- Applied psychology: it is often stated that (figures go back to Mehrabian 1971)
  - body language conveys 55% of information
  - prosody conveys 38% of information
  - the verbal component conveys 7% of information
- “Words may be what men use when all else fails” (Krejdlin 2002: 6)
- Who is right?

8 Relative contribution of three communication channels?
[Diagram: DISCOURSE divides into the vocal channels (verbal channel, prosodic channel) and the visual channel]

9 Experimental design
- Isolate the three communication channels
- Present a sample discourse in all possible variants (2^3 = 8)
- Present each of the eight variants to a group of subjects
- Assess the degree of understanding in each case
- Such assessment may lead to estimates of the contributions of communication channels
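The eight variants follow from switching each of the three channels on or off. A minimal Python sketch of that enumeration is given here; the identifiers `CHANNELS` and `conditions` are illustrative and do not come from the study materials.

```python
from itertools import product

# The three channels named in the talk; the identifiers are illustrative.
CHANNELS = ("verbal", "prosodic", "visual")

# Each channel is either absent (False) or present (True): 2**3 = 8 variants.
conditions = [
    tuple(name for name, present in zip(CHANNELS, flags) if present)
    for flags in product((False, True), repeat=len(CHANNELS))
]

# List the variants from fewest to most channels.
for channels in sorted(conditions, key=len):
    label = " + ".join(channels) if channels else "no channel (context only)"
    print(f"{len(channels)} channel(s): {label}")
```

The empty combination corresponds to the control group that sees only the context, and the full combination to the original material.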

10 Studies in this line of research
- Èl’bert 2006, year paper
- Èl’bert 2007, diploma thesis
- Reinterpreted and refined in Kibrik and Èl’bert 2008
- Molchanova 2008, year paper
- Molchanova 2009, year paper
- Molchanova 2010, diploma thesis
- Reinterpreted and refined in Kibrik 2011

11 Èl’bert 2007, Kibrik and Èl’bert 2008
- Russian TV serial “Tajny sledstvija” (“Mysteries of the investigation”)
- Experimental excerpt: 3 min. 20 sec.
- Preceded by an 8-minute context (starting from the beginning of the series)
- The excerpt consists entirely of a conversation, to ensure that we are testing the understanding of discourse rather than of the film in general
- The two vocal channels have been separated:
  - Verbal: running subtitles
  - Prosodic: superimposed filter creating the “behind a wall” effect
- Participants:
  - 99 participants, divided into 8 groups
  - Native speakers of Russian
  - Each group comprised 10 to 17 participants

12 Eight experimental groups
- Group 0: only the context excerpt
- Groups 1 (one communication channel):
  - Verbal: subtitles, temporally aligned
  - Prosodic: filtered sound
  - Visual: video
- Groups 2 (two communication channels):
  - Verbal + prosodic = original sound
  - Verbal + visual: subtitles and video
  - Prosodic + visual: filtered sound and video
- Group 3: original material

13 Group 3: original material

14 Verbal + visual

15 Visual + prosodic

16 Procedure
- The context and the experimental excerpts were shown to a group of subjects on a large screen
- Each subject was instructed to watch the context and the experimental excerpt and then answer a set of questions concerning the experimental excerpt alone
- The questionnaire was constructed in accordance with the accepted principles of test tasks (Panchenko 2000)
- 23 multiple-choice questions in the questionnaire
- A subject was supposed to choose only one answer out of four listed variants
- Sample question: What does Tamara Stepanovna offer Masha before the beginning of the conversation?
  - a. to take off her coat
  - b. to have a cup of tea
  - c. to have a seat
  - d. to have a drink
- The percentage of correct answers is used as an assessment of a subject’s degree of understanding
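The scoring rule, the percentage of correct answers per subject, can be sketched as follows. The function name, the answer key, and the responses are hypothetical and serve only to show the arithmetic.

```python
def percent_correct(answers, answer_key):
    """Return the share of correct answers, in percent."""
    if not answer_key or len(answers) != len(answer_key):
        raise ValueError("answers and answer_key must be non-empty and equally long")
    correct = sum(given == expected for given, expected in zip(answers, answer_key))
    return 100.0 * correct / len(answer_key)

# Hypothetical 4-question key and one subject's responses (options a-d).
answer_key = ["c", "a", "d", "b"]
answers = ["c", "b", "d", "b"]
print(percent_correct(answers, answer_key))  # 3 of 4 correct: 75.0
```

Averaging this score over the members of a group gives the group-level figures compared across conditions.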

17 Results
- All three channels are substantially informative
- Verbal > visual > prosodic
- Integration of visual and prosodic channels is difficult

18 Molchanova 2010
- “Contribution of information channels in understanding spoken discourse: methodological aspects”
- The following aspects of the prior study have been changed (improved):
  - Stimulus material
  - Prosodic channel
  - Verbal channel
  - Questionnaire
  - Interviewing procedure

19 Stimulus material: discourse type
- Shortcomings of movies
  - Plot facilitates guessing
  - Possible familiarity with the movie
  - Quasi-natural behavior of actors
- Solution: natural dialogue
  - Shared activity
  - Figure-guessing game
  - Can be filmed by one camera
  - Video clip: все 3 канала.avi (“all 3 channels”), 0:19 – 0:57
- Remaining problems
  - Hard to remember the sequence of events
  - Many events are similar

20 Stimulus material: speakers
- Shortcomings of the prior studies
  - Same-sex speakers are indistinguishable in the prosody-only version
- Solutions
  - Different sexes: F0 range is different
  - Additional features
    - Acquainted
    - Not close friends

21 Prosodic channel
- Shortcomings of the prosodic material as used in previous studies
  - Èl’bert 2007: noisy sound
  - Molchanova 2009: unnatural, “electronic” sound
- Solution:
  - Loudness is decreased radically at all frequencies except for the speaker’s average F0 frequency
  - This has led to the “behind the wall” (or “behind the glass”) effect
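The slides do not say which tool or filter settings produced this effect. Purely as an illustration of the idea (attenuate everything except a band around the speaker's average F0), here is a small pure-Python sketch on a synthetic two-tone signal. The function names, the band half-width of 30 Hz, and the attenuation factor of 0.01 are all assumptions.

```python
import cmath
import math

def dft(samples):
    """Naive discrete Fourier transform (O(n^2)); adequate for a short demo."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def inverse_dft(spectrum):
    """Inverse transform; returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]

def keep_f0_band(samples, sample_rate, f0, half_width=30.0, attenuation=0.01):
    """Scale down every frequency bin farther than half_width Hz from f0."""
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins k and n-k carry the same absolute frequency for a real signal.
        freq = min(k, n - k) * sample_rate / n
        if abs(freq - f0) > half_width:
            spectrum[k] *= attenuation
    return inverse_dft(spectrum)

# Synthetic stand-in for speech: a 200 Hz "F0" tone plus a 1000 Hz component.
sample_rate, n = 4000, 200
signal = [
    math.sin(2 * math.pi * 200 * t / sample_rate)
    + math.sin(2 * math.pi * 1000 * t / sample_rate)
    for t in range(n)
]
filtered = keep_f0_band(signal, sample_rate, f0=200.0)
print(max(abs(x) for x in signal), max(abs(x) for x in filtered))
```

Real speech has a time-varying F0, so an actual implementation would presumably work on recorded audio with a dedicated audio tool; the sketch only shows why the result keeps the melody while losing the segmental detail.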

22 Visual + prosodic

23 Verbal channel
- Shortcomings of subtitles
  - Hard to read without punctuation
  - Especially at the rate of speech
  - And especially in the “verbal + visual” condition
- Solution: spoken prosody-free signal
  - Each word in the transcript is replaced by an individually pronounced word
  - All words elicited in this way are glued together in the right order

24 Visual + verbal

25 Verbal channel
- Remaining problem
  - Unnatural input
    - No reduction
    - No intonation
    - etc.

26 Questionnaire
- Shortcomings of prior studies
  - Èl’bert 2007: gap between Group 0 (38.3%) and Group 3 (87.4%) is insufficient
- Solution
  - Testing stage
    - Identify trivial questions (high Group 0)
    - Identify unfortunate questions (low Group 3)
    - 30 → 17 questions
  - Group 0: 24.7% correct answers
  - Group 3: 91.2% correct answers
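The testing stage amounts to filtering questions by how the two extreme groups answer them. A hedged sketch of such a filter follows; the cut-off values, the function name, and the pilot figures are made up for illustration, since the slides do not state the actual thresholds.

```python
def select_questions(stats, max_group0=40.0, min_group3=70.0):
    """Keep questions that are neither trivial nor unfortunate.

    stats maps a question id to (Group 0 % correct, Group 3 % correct).
    Both thresholds are hypothetical; the slides do not give the cut-offs.
    """
    kept = []
    for question, (group0, group3) in stats.items():
        trivial = group0 > max_group0      # answerable from the context alone
        unfortunate = group3 < min_group3  # missed even with all three channels
        if not trivial and not unfortunate:
            kept.append(question)
    return kept

# Made-up pilot figures for illustration only.
pilot = {"q1": (25.0, 95.0), "q2": (80.0, 98.0), "q3": (20.0, 45.0)}
print(select_questions(pilot))  # ['q1']
```

Dropping both kinds of questions widens the gap between the context-only group and the full-material group, which is the improvement the slide reports.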

27 Interviewing procedure
- Shortcomings of prior studies
  - Participants of various ages and life experience
  - Multiple participants may affect each other’s performance
  - Need for a large room, loudspeakers, and a big screen
- Solutions
  - Control for age, gender, geographical origin, social status
  - Remote implementation
    - Stimulus materials on YouTube
    - Questionnaire in Google Docs
  - All participants are in similar conditions
  - Comfortable, adjustable conditions
  - No need for audio and video control in large rooms

28 Kibrik and Èl’bert 2008 vs. Molchanova 2010
- General picture is remarkably similar
  - All three channels are substantially informative
  - Verbal > visual > prosodic
- Visual + prosodic dip is even sharper
- Cleaner results
  - Two channels are much better than one channel
  - Verbal and visual channels integrate well

29 Normalized contribution of three channels
- Suppose the three channels are independent
- Sum up all percentages of individual channel contributions and normalize to 100%
- Identify normalized contribution

30 Normalized contribution of three channels

                          Kibrik and Èl’bert 2008   Molchanova 2010
Summed percentages        72+51+62=185              59+46+49=154
Normalized contributions
  Verbal                  72%:1.85≈39%              59%:1.54≈38%
  Prosodic                51%:1.85≈28%              46%:1.54≈30%
  Visual                  62%:1.85≈33%              49%:1.54≈32%
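The normalization is a simple rescaling of the single-channel percentages so that they sum to 100. A short sketch that reproduces the arithmetic from the slide's input figures is shown here; the function and variable names are illustrative.

```python
def normalize(contributions):
    """Rescale per-channel percentages so that they sum to 100."""
    total = sum(contributions.values())
    return {channel: 100.0 * value / total for channel, value in contributions.items()}

# Single-channel percentages of correct answers, as given on the slide.
studies = {
    "Kibrik and Èl'bert 2008": {"verbal": 72, "prosodic": 51, "visual": 62},
    "Molchanova 2010": {"verbal": 59, "prosodic": 46, "visual": 49},
}

for study, contributions in studies.items():
    shares = normalize(contributions)
    summary = ", ".join(f"{channel} {share:.1f}%" for channel, share in shares.items())
    print(f"{study}: {summary}")
```

The unrounded values agree with the slide's rounded figures to within one percentage point; the slide's figures appear to have been rounded so that each column sums to exactly 100%.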

31 Gender differences
- Molchanova 2010: gender advantages
- Percentages of correct answers

Condition            Men    Women   Advantage
Verbal only          59.1   69.9    Women: +10.7
Visual + prosodic    66.1   51.6    Men: +14.5

32 Conclusions
- All communication channels are highly significant, so the traditional linguistic viewpoint is erroneous
- The verbal channel is the leading one, so the viewpoint popular in applied psychology is erroneous
- Information from the prosodic and the visual channels is primarily used through integration with the verbal channel
- Very similar results have been attained in different studies, in spite of very different methodological details

33 Further questions
- Auditory or graphic presentation of the “verbal alone” channel?
- Optimal discourse type?
- …and: Other suggestions on this approach?

34 Thanks for your attention
[Diagram labels: verbal channel, visual channel, prosodic channel, language]

