Multimodal communication
~ Multimodal Communication ~


HOW TO: From raw data to data annotation

Raw data: video file

  • from video camera to the computer

  • TASX compatible format: AVI

    for help with this:

    AVZ (Audio-visuelles Zentrum)

    to be found on N6, N7

Editing the video file with VirtualDub

cutting the video / making selections

Editing the video file with VirtualDub

saving changes:

File: Save as AVI

select file name &


Extracting audio from video with VirtualDub

saving audio stream:

File: Save WAV

select file name &


Data output

  • digitized video file, e.g. 9-11.avi

  • digitized sound file, e.g. 9-11.wav

    Why extract the sound file from the video file?

    --> separate speech description in Praat (or other tools, for that matter)
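
Before loading the extracted sound file into Praat, it can help to confirm that the file is readable and has a plausible duration. The following is a minimal sketch using Python's standard `wave` module. It assumes an uncompressed PCM WAV file, and the helper name `describe_wav` is hypothetical, not part of any tool mentioned in these slides.

```python
import wave


def describe_wav(path):
    """Return basic properties of a PCM WAV file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        frame_count = wav.getnframes()
        sample_rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_rate_hz": sample_rate,
            "sample_width_bytes": wav.getsampwidth(),
            # Duration follows from the number of frames and the sample rate.
            "duration_s": frame_count / sample_rate,
        }
```

Calling `describe_wav("9-11.wav")` on the example file from this slide would return a dictionary with the channel count, sample rate, sample width and duration in seconds.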

Speech annotation with Praat: Individual steps

  • load .wav file into Praat,

  • set up TextGrid (annotation tiers), and

  • EDIT;

  • annotate speech file according to individual needs (granularity of segmentation, means of transcription, ...)

  • write TextGrid to text file (Ctrl-S; STRG-S on German keyboards), e.g. 9-11.TextGrid
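
The TextGrid written in the last step is a plain text file, so its tiers and intervals can be read by other programs. The sketch below shows one way this might be done in Python. It assumes Praat's long text format, interval tiers only (point tiers are skipped), and text that has already been decoded to a string. The function name `parse_textgrid` and the sample content are illustrative assumptions.

```python
import re

# Each tier starts with an "item [n]:" header in the long text format.
TIER_SPLIT_RE = re.compile(r'item \[\d+\]:')
TIER_HEAD_RE = re.compile(r'class = "IntervalTier"\s+name = "((?:[^"]|"")*)"')
# One interval: start time, end time and a quoted label ("" is an escaped quote).
INTERVAL_RE = re.compile(
    r'intervals \[\d+\]:\s+'
    r'xmin = ([-+.\deE]+)\s+'
    r'xmax = ([-+.\deE]+)\s+'
    r'text = "((?:[^"]|"")*)"'
)

# Made-up example content, not taken from a real recording.
SAMPLE = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 2.5
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "words"
        xmin = 0
        xmax = 2.5
        intervals: size = 2
        intervals [1]:
            xmin = 0
            xmax = 1.2
            text = "mom"
        intervals [2]:
            xmin = 1.2
            xmax = 2.5
            text = ""
'''


def parse_textgrid(text):
    """Return [(tier name, [(start, end, label), ...]), ...] for interval tiers."""
    tiers = []
    for chunk in TIER_SPLIT_RE.split(text)[1:]:
        head = TIER_HEAD_RE.search(chunk)
        if head is None:
            continue  # not an interval tier
        intervals = [
            (float(start), float(end), label.replace('""', '"'))
            for start, end, label in INTERVAL_RE.findall(chunk)
        ]
        tiers.append((head.group(1).replace('""', '"'), intervals))
    return tiers
```

Here `parse_textgrid(SAMPLE)` yields one tier named `words` with two intervals, the first labelled `mom` and the second empty.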

From Praat file to TASX file

applying a script to the Praat file to convert it from .TextGrid to .xml and so make it compatible with TASX

(e.g. 9-11.xml)
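
The conversion script itself is not reproduced in these slides. As a rough illustration of what such a conversion involves, the sketch below writes time-aligned labels to XML with Python's standard `xml.etree.ElementTree`. It assumes the annotation is already available as a list of (tier name, intervals) pairs. The element and attribute names (`corpus`, `session`, `layer`, `event`, `start`, `end`) are placeholders chosen for readability and should not be taken as the actual TASX schema.

```python
import xml.etree.ElementTree as ET


def tiers_to_xml(tiers, session_name):
    """Serialize [(tier name, [(start, end, label), ...]), ...] to an XML string.

    Element and attribute names are placeholders, NOT the real TASX schema.
    """
    root = ET.Element("corpus")
    session = ET.SubElement(root, "session", name=session_name)
    for tier_name, intervals in tiers:
        layer = ET.SubElement(session, "layer", name=tier_name)
        for start, end, label in intervals:
            # Each event carries two time stamps (seconds) and a text label.
            event = ET.SubElement(
                layer, "event", start=f"{start:.3f}", end=f"{end:.3f}"
            )
            event.text = label
    return ET.tostring(root, encoding="unicode")
```

A call such as `tiers_to_xml([("words", [(0.0, 1.2, "mom")])], "9-11")` returns the XML as a string, which could then be written to a file like 9-11.xml. A real conversion would have to follow the element names and metadata that TASX expects.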

Confused? Brief review of the previous steps

  • editing of video file (file.avi)

  • extraction of sound stream from video file (file.wav)

  • annotation of speech in Praat (or some other tool...) (file.TextGrid)

  • conversion of file.TextGrid into TASX compatible format (file.xml)
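
Since every step produces a file with the same base name and a different extension, it is easy to check how far a given recording has been processed. The following sketch, with the hypothetical helper `pipeline_status`, assumes that all files for one recording are kept in a single folder.

```python
from pathlib import Path

# Extensions from the review above, in processing order.
STAGES = [
    (".avi", "edited video file"),
    (".wav", "extracted sound stream"),
    (".TextGrid", "Praat speech annotation"),
    (".xml", "TASX-compatible annotation"),
]


def pipeline_status(directory, base_name):
    """Return [(file name, description, exists), ...] for one recording."""
    folder = Path(directory)
    return [
        (base_name + ext, description, (folder / (base_name + ext)).is_file())
        for ext, description in STAGES
    ]
```

For example, `pipeline_status(".", "9-11")` reports for each of 9-11.avi, 9-11.wav, 9-11.TextGrid and 9-11.xml whether the file is present in the current folder.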

What is TASX?

  • TASX - Time Aligned Signal data eXchange (XML based)

  • tool for the annotation of multimodal (audio & video) data (cf. Praat for audio only)

  • Source:

The TASX annotator

  • a TASX-annotated corpus consists of a set of sessions

  • each session holds an arbitrary number of descriptive tiers or layers

  • each layer consists of a set of separated events

  • each event holds some kind of textual information (label) and is linked to the primary audio or video data by means of two time stamps (marking the beginning and the end of an interval)
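
The hierarchy described above (corpus, sessions, layers, events) can be written down as a small data model. This is only a sketch in Python: the class and field names are assumptions made for illustration, and reading "separated" as "events on one layer do not overlap in time" is an interpretation, not something the slides state.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    start: float  # time stamp marking the beginning of the interval (seconds)
    end: float    # time stamp marking the end of the interval (seconds)
    label: str    # textual information attached to the interval

    def __post_init__(self):
        if self.end < self.start:
            raise ValueError("event ends before it starts")


@dataclass
class Layer:
    name: str
    events: List[Event] = field(default_factory=list)

    def events_are_separated(self):
        """True if no two events on this layer overlap (assumed meaning)."""
        ordered = sorted(self.events, key=lambda e: e.start)
        return all(a.end <= b.start for a, b in zip(ordered, ordered[1:]))


@dataclass
class Session:
    name: str
    layers: List[Layer] = field(default_factory=list)


@dataclass
class Corpus:
    sessions: List[Session] = field(default_factory=list)
```

A corpus with one session and one layer could then be built as `Corpus([Session("9-11", [Layer("words", [Event(0.0, 1.2, "mom")])])])`.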


corpus, plural corpora:

A collection of linguistic data, either compiled as written texts or as a transcription of recorded speech. The main purpose of a corpus is to verify a hypothesis about language - for example, to determine how the usage of a particular sound, word, or syntactic construction varies. [...]

(cf. Crystal, David. 1992. An Encyclopedic Dictionary of Language and Languages. Oxford, 85)

Interface of the TASX annotator: Illustration

tiers or layers

segments or events

labels: e.g. “mom”

Getting started... Part I

  • Start the TASX annotator.

  • Load primary video: File - Load primary video or Ctrl-W (.avi file)

  • Load audio file (.wav file, optional; loading an audio file causes the TASX annotator to generate an oscillogram)

  • Import speech annotation: File - Import from - TASX (.xml file)

  • If an annotation file (.tbf file) already exists, it can simply be loaded into the tool: File - Open (new format) or Ctrl-O

Getting started... Part II

  • File - Merge (new format) merges two separate annotation files

  • Add (an) extra tier(s): Tier - New tier or Shift-N

  • Rename extra tier(s):

    • Activate the tier to be renamed with a mouse click (it turns green)

    • Tier - Rename tier or Shift-R

    • Type in new tier name

  • Save complete file: File - Save as... (new format) or Ctrl-S

    resulting output file: file.tbf

Getting started... Part III

  • For further details, see the manual:


    And now, let’s get started!