
Beyond Multimedia Integration: corpora and annotations for cross-media decision mechanisms

Beyond Multimedia Integration: corpora and annotations for cross-media decision mechanisms. Katerina Pastra. Language Technology Applications, Institute for Language and Speech Processing, Athens, Greece. The Multimedia Integration context.


Presentation Transcript


  1. Beyond Multimedia Integration: corpora and annotations for cross-media decision mechanisms. Katerina Pastra, Language Technology Applications, Institute for Language and Speech Processing, Athens, Greece

  2. The Multimedia Integration context
     • MM integration mechanisms: ones that establish associations between medium-specific representations
     • Applications: MM dialogue, MM indexing etc.
     • Need: MM corpora annotated with such associations
     • Existing collections with such annotations: IBM (Lin et al. 2003), PASCAL visual object categorization challenge (Everingham et al. 2005), and some ad hoc created collections
     These provide a finite set of textual labels associated with the visual medium (keyframe/image or image region).
     OK for MM integration, but beyond?

  3. Overview
     • Beyond MM integration: cross-media decision mechanisms
       - notion and example application (cross-media indexing)
     • Corpora for cross-media mechanisms
       - corpus characteristics & the REVEAL THIS corpus
     • Annotations for cross-media mechanisms
       - cross-media relations
       - cross-media annotation types
       - annotation language requirements
       - MPEG-7 & EMMA: suitability for such annotations

  4. The notion of Cross-Media Decision Mechanisms
     Mechanisms that decide on the relation that holds between medium-specific pieces of information:
     • across documents (Boll et al. 1999)
     • within documents (Pastra & Piperidis 2006, EuroITV conf.)
     The mechanisms decide whether medium-specific pieces of information within the same multimedia document are:
     • associated (multimedia integration)
     • complementary
     • semantically compatible/incompatible
     [Diagram labels: equivalence, complementarity, independence]

  5. Cross-media Relation Examples
     • Equivalence: “the yellow taxi-boats…”
     • Non-essential complementarity: “…we are heading to Patmos…”
     • Essential complementarity: “…[pollution has taken its toll] on that..”
     • Independence: “…I have finally found a place that’s not overrun by tourists…”

  6. Cross-media relations
     • Equivalence: info expressed by different media refers to the same entity (object, state, event or property)
     • Complementarity: info in one medium is an (essential or not) complement of the info expressed in another
       - Essential complementarity: usually indicated through association signals (e.g. indexicals)
       - Non-essential complementarity: info in one medium is a modifier/adjunct of info expressed in another
     • Independence: each medium carries an independent (but coherent) part of the MM message
       - Incoherence is due to errors in medium-specific processing or to artistic/editorial reasons
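
The taxonomy on this slide can be written down as a small data type. The Python sketch below is illustrative only: the names `Relation`, `Complementarity`, `EXAMPLES` and `describe` are assumptions made for this example and do not come from the presentation. The example phrases are the ones quoted on the slides.

```python
from enum import Enum


class Relation(Enum):
    """Top-level cross-media relations named on the slide."""
    EQUIVALENCE = "equivalence"          # both media refer to the same entity
    COMPLEMENTARITY = "complementarity"  # one medium complements the other
    INDEPENDENCE = "independence"        # each medium carries its own part of the message


class Complementarity(Enum):
    """The two kinds of complementarity distinguished on the slide."""
    ESSENTIAL = "essential"          # usually signalled by indexicals
    NON_ESSENTIAL = "non-essential"  # modifier/adjunct of the other medium


# Phrases quoted in the presentation, labelled with the relation they illustrate.
EXAMPLES = [
    ("the yellow taxi-boats", Relation.EQUIVALENCE, None),
    ("we are heading to Patmos", Relation.COMPLEMENTARITY, Complementarity.NON_ESSENTIAL),
    ("[pollution has taken its toll] on that", Relation.COMPLEMENTARITY, Complementarity.ESSENTIAL),
    ("I have finally found a place that's not overrun by tourists", Relation.INDEPENDENCE, None),
]


def describe(relation, subtype=None):
    """Return a readable label; a subtype is only valid for complementarity."""
    if subtype is not None and relation is not Relation.COMPLEMENTARITY:
        raise ValueError("only complementarity has essential/non-essential subtypes")
    if subtype is None:
        return relation.value
    return f"{subtype.value} {relation.value}"


if __name__ == "__main__":
    for phrase, relation, subtype in EXAMPLES:
        print(f"{describe(relation, subtype)}: {phrase}")
```

Keeping the essential/non-essential distinction as a separate type makes it impossible to attach it to equivalence or independence by accident, which mirrors how the slide nests it under complementarity only.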

  7. Application example: a cross-media indexer’s decisions
     [Diagram labels: 1) Landscape – sea/coast; 2) Landscape – people; choice; and; or/and]
     • Equivalence: “the yellow taxi-boats…”
     • Independence: “…I have finally found a place that’s not overrun by tourists…”
     • Essential complementarity: “…[pollution has taken its toll] on this..”
     • Non-essential complementarity: “…we are heading to Patmos…”
     How shall we develop them?

  8. Corpora for x-media mechanisms
     Such a corpus should consist of:
     • multimedia documents (e.g. video, illustrated text)
     • multi-genre & multi-domain documents
     [Callouts: within-document algorithms! medium-specific processor needs!]
     The REVEAL THIS corpus:
     • Multilingual (EL-EN): part of it parallel, the rest comparable
     • Multimedia (MPEG-2 videos: TV programmes, DVD documentaries etc., but also radio and web text (UTF-8))
     • Multi-domain (Politics, Travel, News)
     • Multi-genre, to accommodate: read speech vs. spontaneous speech, face-rich vs. object-rich imagery, formal vs. colloquial language

  9. REVEAL THIS corpus specifics

  10. REVEAL THIS corpus specifics (2)

  11. Annotation types
      Equivalence:
      • Association – A(X,Y)
      • Partial Association – PA(X,Y), e.g. “coloured” + [image]
      Complementarity:
      • Association Signal – AS(X,Y)
      • Adjunct – AJ(X,Y)
      • Apposition – AP(X,Y), e.g. “the president” + [image]
      Independence:
      • Coherence – CH(X,Y)
      [Callout: “attribute” text vs. “value” image. One unit, do not associate as if type:token]
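
One way to make these annotation types concrete is a lookup from type code to the relation it falls under, plus a record for a single annotation. This is a minimal sketch, not the presentation's annotation format: `RELATION_OF`, `Annotation`, and the identifier `region-1` are hypothetical, and X and Y are treated as plain strings.

```python
from dataclasses import dataclass

# Type codes from the slide, grouped under the relation they are listed with.
RELATION_OF = {
    "A": "equivalence",       # Association
    "PA": "equivalence",      # Partial Association
    "AS": "complementarity",  # Association Signal
    "AJ": "complementarity",  # Adjunct
    "AP": "complementarity",  # Apposition
    "CH": "independence",     # Coherence
}


@dataclass(frozen=True)
class Annotation:
    """One annotation of the form TYPE(X,Y); x and y are hypothetical identifiers."""
    type_code: str
    x: str
    y: str

    def __post_init__(self):
        # Reject codes that are not among the six listed on the slide.
        if self.type_code not in RELATION_OF:
            raise ValueError(f"unknown annotation type: {self.type_code}")

    @property
    def relation(self):
        """Cross-media relation this annotation type belongs to."""
        return RELATION_OF[self.type_code]

    def __str__(self):
        return f"{self.type_code}({self.x},{self.y})"


if __name__ == "__main__":
    ann = Annotation("PA", "coloured", "region-1")
    print(ann, "->", ann.relation)
```

A closed dictionary of codes matches the requirement for a finite set of description elements: anything outside the six listed types is rejected when the annotation is created.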

  12. Annotation language requirements
      A markup language should allow for:
      • modular description of the structure of a MM document and of the media it consists of, to facilitate indication of relations between media with different levels of structural granularity (e.g. token-image region, token-set of keyframes etc.)
        MPEG-7 is ideally suited for such description
      • creation of a multimedia unit with re-defined properties in cases of essential conjunction of medium-specific pieces of info for forming a MM message (e.g. in essential complementarity cases…)
        EMMA is ideally suited for such a task
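
The first requirement, linking units of different structural granularity, can be illustrated with plain data types. The sketch below is not MPEG-7 or EMMA syntax; every class name, field, and value in it (`Token`, `ImageRegion`, `KeyframeSet`, `Link`, `granularity`, the pixel box) is an assumption introduced for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Token:
    """A word in the transcript or text, identified by its position."""
    text: str
    index: int


@dataclass(frozen=True)
class ImageRegion:
    """A rectangular region of one keyframe: (x, y, width, height) in pixels."""
    keyframe: int
    box: Tuple[int, int, int, int]


@dataclass(frozen=True)
class KeyframeSet:
    """A set of whole keyframes, e.g. the frames of one shot."""
    keyframes: Tuple[int, ...]


# Any structural unit that a cross-media link may start from or point to.
Anchor = Union[Token, ImageRegion, KeyframeSet]


@dataclass(frozen=True)
class Link:
    """A link between two units that may differ in granularity."""
    source: Anchor
    target: Anchor


def granularity(link):
    """Name the pairing, e.g. 'Token-ImageRegion' or 'Token-KeyframeSet'."""
    return f"{type(link.source).__name__}-{type(link.target).__name__}"


if __name__ == "__main__":
    token = Token("taxi-boats", 3)
    print(granularity(Link(token, ImageRegion(12, (40, 60, 200, 120)))))
    print(granularity(Link(token, KeyframeSet((12, 13, 14)))))
```

Describing each medium's units separately and linking them afterwards is the modularity the slide asks for: the same token can point to an image region in one annotation and to a set of keyframes in another.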

  13. MPEG-7 and EMMA
      MPEG-7: ISO standard for describing MM content:
      • low-level feature descriptors (e.g. colour, motion etc.)
      • high-level feature descriptors (object, event etc.)
      • structure descriptors
      • relations between structural units (e.g. spatial, temporal etc.)
      • textual annotations for each unit (e.g. controlled vocabulary etc.)
      EMMA: W3C standard (working draft) for describing the output of medium-specific processors and their integration in MM user input scenarios:
      • hook element ~ association signal
      • composite derivation → creation of MM unit, e.g. “destination” + pen pointing to Boston image region on screen

  14. Conclusions
      • within-document cross-media decision mechanisms
      • need multimedia, multi-genre, multi-domain corpora
      • annotated with a finite set of description elements
      • that will allow for indicating the cross-media relations (equivalence, complementarity, independence) that hold in the MM documents
      • using a markup language that combines the features of MPEG-7 and EMMA
      A timely cooperation between the two schemes?
