Interaction of the verbal, prosodic, and visual channels in speech understanding

Jaroslavl’, 22 November 2012. A.A. Kibrik (Institute of Linguistics, Russian Academy of Sciences, and Lomonosov Moscow State University), aakibrik@gmail.com. English title: Interaction of the verbal, prosodic, and visual components in language understanding.

Presentation Transcript


  1. Interaction of the verbal, prosodic, and visual channels in speech understanding. Jaroslavl’, 22 November 2012. A.A. Kibrik (Institute of Linguistics, Russian Academy of Sciences, and Lomonosov Moscow State University) aakibrik@gmail.com

  2. INTERACTION OF THE VERBAL, PROSODIC, AND VISUAL COMPONENTS in language understanding Jaroslavl’ November 22, 2012 Andrej A. Kibrik (Institute of Linguistics RAN and Lomonosov Moscow State University) aakibrik@gmail.com

  3. The mainstream linguistic approach • Language consists of hierarchically organized segmental units, such as phonemes, morphemes, words, phrases, and sentences • Linguistic form is thus equated with verbal form

  4. However • Apart from sound, there are other channels (or components) of communication, first of all vision (body language: gesture, facial expression, gaze, posture, etc.) • Also, there are prosodic, that is, non-verbal (non-segmental), aspects of sound • Imagine prosody-free talk • or, vice versa, talk heard from behind a wall

  5. Communication channels • The verbal component, prosody, and body language all count as distinct communication (or information) channels • They all cooperate in getting the message from speaker to addressee • This is what is sometimes called the multimodal approach • Cf. Reformatskij 1963: How does the non-verbal “text” interact with the verbal text?

  6. Multimodality • “A multimodal approach assumes that the message is ‘spread across’ all the modes of communication. If this is so, then each mode is a partial bearer of the overall meaning of the message.” (Kress 2002) • “Any use of language is inescapably multimodal” (Scollon 2006) • “Unimpaired communication is, of course, inherently multimodal, with the speech content being modified by prosody and delivered in parallel with facial expression, gesture, posture, and a range of other nonverbal communication methods.” (Alm 2006) • “Within biology, experimental psychology, and cognitive neuroscience, a separate rapidly growing literature has clarified that multisensory perception and integration cannot be predicted by studying the senses in isolation.” (Cohen and Oviatt 2006)

  7. What is the contribution of different channels? • Traditional approach of mainstream linguistics: the verbal channel is so central that prosody and the visual channel are at best downgraded to “paralinguistics” • Applied psychology: it is often stated (the figures go back to Mehrabian 1971) that: • body language conveys 55% of information • prosody conveys 38% of information • the verbal component conveys 7% of information • “Words may be what men use when all else fails” (Krejdlin 2002: 6) • Who is right?

  8. Relative contribution of three communication channels? [Diagram: DISCOURSE divides into the vocal channels (verbal channel, prosodic channel) and the visual channel]

  9. Experimental design • Isolate the three communication channels • Present a sample discourse in all possible variants (2³ = 8) • Present each of the eight variants to a group of subjects • Assess the degree of understanding in each case • Such assessment may lead to estimates of the contributions of communication channels
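The count of eight variants follows from switching each of the three channels on or off independently. A minimal Python sketch of that enumeration; the function name and the channel labels are illustrative and do not come from the slides:

```python
from itertools import combinations

# Channel labels are illustrative names for the three channels on the slides.
CHANNELS = ("verbal", "prosodic", "visual")

def channel_conditions(channels=CHANNELS):
    """Return every subset of the channels, from none to all of them."""
    return [
        subset
        for size in range(len(channels) + 1)
        for subset in combinations(channels, size)
    ]

conditions = channel_conditions()
for condition in conditions:
    # The empty subset corresponds to a condition with no channel presented.
    print(condition if condition else "(no channel)")
```

The empty subset is the condition in which none of the three channels of the excerpt is available, and the full subset is the unmodified material.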

  10. Studies in this line of research • Èl’bert 2006, term paper • Èl’bert 2007, diploma thesis • Reinterpreted and refined in Kibrik and Èl’bert 2008 • Molchanova 2008, term paper • Molchanova 2009, term paper • Molchanova 2010, diploma thesis • Reinterpreted and refined in Kibrik 2011

  11. Èl’bert 2007, Kibrik and Èl’bert 2008 • Russian TV serial “Tajny sledstvija” (“Mysteries of the investigation”) • Experimental excerpt: 3 min. 20 sec. • Preceded by an 8-minute context segment (starting from the beginning of the series) • The excerpt consists entirely of a conversation, to ensure that we are testing the understanding of discourse rather than of the film in general • The two vocal channels were separated: • Verbal: running subtitles • Prosodic: superimposed filter creating the “behind a wall” effect • Participants: • 99 participants, divided into 8 groups • Native speakers of Russian • Each group comprised 10 to 17 participants

  12. Eight experimental groups • Group 0: only the context excerpt • Groups 1 (one communication channel) • Verbal: subtitles, temporally aligned • Prosodic: filtered sound • Visual: video • Groups 2 (two communication channels): • Verbal + prosodic = original sound • Verbal + visual: subtitles and video • Prosodic + visual: filtered sound and video • Group 3: original material

  13. Group 3: original material

  14. Verbal + visual

  15. Visual + prosodic

  16. Procedure • The context and the experimental excerpts were shown to a group of subjects on a large screen • Each subject was instructed to watch the context and the experimental excerpt and then answer a set of questions concerning the experimental excerpt alone • The questionnaire was constructed in accordance with the accepted principles of test design (Panchenko 2000) • 23 multiple-choice questions in the questionnaire • A subject was supposed to choose only one answer out of four listed variants • Sample question: What does Tamara Stepanovna offer Masha before the beginning of the conversation? • a. to take off her coat • b. to have a cup of tea • c. to have a seat • d. to have a drink • The percentage of correct answers is used as an assessment of a subject’s degree of understanding
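The scoring rule on this slide (percentage of correct answers per subject) can be sketched in Python as follows. The function names and the sample answers are hypothetical, and averaging over a group is an assumption about how group figures were obtained:

```python
def percent_correct(answers, answer_key):
    """Percentage of a subject's answers that match the answer key."""
    if len(answers) != len(answer_key):
        raise ValueError("answers and answer_key must have the same length")
    correct = sum(1 for given, expected in zip(answers, answer_key) if given == expected)
    return 100.0 * correct / len(answer_key)

def group_mean(scores):
    """Mean score of a group of subjects (assumed aggregation, not stated on the slide)."""
    return sum(scores) / len(scores)

# Hypothetical data: a four-question key and two subjects.
key = ["c", "a", "d", "d"]
subject_scores = [
    percent_correct(["c", "a", "b", "d"], key),
    percent_correct(["c", "a", "d", "d"], key),
]
print(subject_scores, group_mean(subject_scores))
```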

  17. Results • All three channels are substantially informative • Verbal > visual > prosodic • Integration of visual and prosodic channels is difficult

  18. Molchanova 2010 • “Contribution of information channels in understanding spoken discourse: methodological aspects” • The following aspects of the prior study have been changed (improved) • Stimulus material • Prosodic channel • Verbal channel • Questionnaire • Interviewing procedure

  19. Stimulus material: discourse type • Shortcomings of movies • Plot facilitates guessing • Possible familiarity with the movie • Quasi-natural behavior of actors • Solution: natural dialogue • Shared activity • Figure-guessing game • Can be filmed by one camera • Sample clip: все 3 канала.avi (“all 3 channels”), 0:19 – 0:57 • Remaining problems • Hard to remember the sequence of events • Many events are similar

  20. Stimulus material: speakers • Shortcomings of the prior studies • Same-sex speakers → indistinguishable in the prosody-only version • Solutions • Different sexes: F0 range is different • Additional features • Acquainted • Not close friends

  21. Prosodic channel • Shortcomings of the prosodic material as used in previous studies • Èl’bert 2007: noisy sound • Molchanova 2009: unnatural, “electronic” sound • Solution: • Loudness is decreased radically at all frequencies except for the speaker’s average F0 (fundamental frequency) • This produces the “behind the wall” (or “behind the glass”) effect
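The slides do not say how the filter was implemented. As a rough illustration of the idea only, the Python sketch below attenuates every frequency component outside a narrow band around a given average F0, using a naive discrete Fourier transform from the standard library. The function names, the band half-width, and the gain are all assumptions made for the example:

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(n^2)); fine for short demo signals."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def idft(spectrum):
    """Inverse of dft(); returns the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]

def attenuate_outside_f0_band(samples, sample_rate, f0, half_width=50.0, gain=0.01):
    """Scale down all components farther than half_width (Hz) from f0.

    half_width and gain are illustrative values, not taken from the study.
    """
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bins above n/2 mirror the lower bins, so fold them onto the same frequency.
        freq = min(k, n - k) * sample_rate / n
        if abs(freq - f0) > half_width:
            spectrum[k] *= gain
    return idft(spectrum)

# Demo signal: a 200 Hz tone (standing in for F0) plus a 1000 Hz tone.
rate, length = 8000, 200
signal = [
    math.sin(2 * math.pi * 200 * t / rate) + math.sin(2 * math.pi * 1000 * t / rate)
    for t in range(length)
]
filtered = attenuate_outside_f0_band(signal, rate, f0=200.0)
```

A real recording would need a fast transform and frame-by-frame processing; the sketch only shows the "keep the F0 band, suppress the rest" principle.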

  22. Visual + prosodic

  23. Verbal channel • Shortcomings of subtitles • Hard to read without punctuation • Especially at the rate of speech • And especially in the “verbal + visual” condition • Solution: spoken prosody-free signal • Each word in the transcript is replaced by an individually pronounced word • The words recorded in this way are concatenated in the original order

  24. Visual + verbal

  25. Verbal channel • Remaining problem • Unnatural input • No reduction • No intonation • etc.

  26. Questionnaire • Shortcomings of prior studies • Èl’bert 2007: gap between Group 0 (38.3%) and Group 3 (87.4%) is insufficient • Solution • Testing stage • Identify trivial questions (high Group 0) • Identify unfortunate questions (low Group 3) • 30 → 17 • Group 0: 24.7% correct answers • Group 3: 91.2% correct answers
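The testing stage amounts to filtering questions by how the two extreme groups answered them. A minimal Python sketch, in which the function name, the thresholds, and the sample rates are all hypothetical (the slides give no cut-off values):

```python
def select_questions(group0_rate, group3_rate, max_group0=50.0, min_group3=70.0):
    """Keep questions that are neither trivial nor unfortunate.

    group0_rate / group3_rate map a question id to the percentage of correct
    answers in Group 0 (context only) and Group 3 (original material).
    The two thresholds are illustrative assumptions.
    """
    return [
        question
        for question in group0_rate
        if group0_rate[question] <= max_group0      # not trivial
        and group3_rate[question] >= min_group3     # not unfortunate
    ]

# Hypothetical pilot data for three questions.
group0 = {"q1": 90.0, "q2": 20.0, "q3": 25.0}
group3 = {"q1": 95.0, "q2": 40.0, "q3": 92.0}
print(select_questions(group0, group3))
```

Here q1 is dropped as trivial (answerable from context alone) and q2 as unfortunate (hard even with the full material).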

  27. Interviewing procedure • Shortcomings of prior studies • Participants of various ages and life experience • Multiple participants may affect each other’s performance • Need for a large room, loudspeakers, and a big screen • Solutions • Control for age, gender, geographical origin, social status • Remote implementation • Stimulus materials at Youtube.com • Questionnaire in Google Docs • All participants are in similar conditions • Comfortable, adjustable conditions • No need for audio and video control in large rooms

  28. Kibrik and Èl’bert 2008 vs. Molchanova 2010 • General picture is remarkably similar • All three channels are substantially informative • Verbal > visual > prosodic • Visual + prosodic dip is even sharper • Cleaner results • Two channels are much better than one • Verbal and visual channels integrate well

  29. Normalized contribution of three channels • Suppose the three channels are independent • Sum up all percentages of individual channel contributions and normalize to 100% • Identify normalized contribution
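The normalization step can be sketched in Python as below. How an individual channel's contribution is measured (for example, whether a baseline is subtracted first) is not stated on the slide, so the function simply takes the contributions as given; its name and the input values are hypothetical and are not results from the studies:

```python
def normalized_contributions(contributions):
    """Rescale per-channel contributions so that they sum to 100%."""
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("contributions must sum to a positive value")
    return {channel: 100.0 * value / total for channel, value in contributions.items()}

# Hypothetical single-channel contributions in percentage points.
raw = {"verbal": 30.0, "visual": 20.0, "prosodic": 10.0}
for channel, share in normalized_contributions(raw).items():
    print(f"{channel}: {share:.1f}%")
```

Under the independence assumption, the normalized shares are directly comparable with the 55/38/7 split cited from applied psychology.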

  30. Normalized contribution of three channels

  31. Gender differences • Molchanova 2010: gender advantages • Percentages of correct answers

  32. Conclusions • All communication channels are highly significant → the traditional linguistic viewpoint is erroneous • The verbal channel is the leading one → the viewpoint popular in applied psychology is erroneous • Information from the prosodic and the visual channels is primarily used through integration with the verbal channel • Very similar results have been obtained in different studies, in spite of very different methodological details

  33. Further questions • Auditory or graphic presentation of the “verbal alone” channel? • Optimal discourse type? • …and: Other suggestions on this approach?

  34. Thanks for your attention [Diagram labels: language; verbal channel, prosodic channel, visual channel]
