combining multiple representations on the trecvid search task n.
Download
Skip this Video
Loading SlideShow in 5 Seconds..
Combining Multiple Representations on the TRECVID Search Task PowerPoint Presentation
Download Presentation
Combining Multiple Representations on the TRECVID Search Task

Loading in 2 Seconds...

play fullscreen
1 / 26

Combining Multiple Representations on the TRECVID Search Task - PowerPoint PPT Presentation


  • 99 Views
  • Uploaded on

Combining Multiple Representations on the TRECVID Search Task. Arjen P. de Vries Thijs Westerveld Tzvetanka I. Ianeva. Introduction. Video Retrieval should take advantage of information from all available sources and modalities …but so far ASR best for almost any query

loader
I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
capcha
Download Presentation

PowerPoint Slideshow about 'Combining Multiple Representations on the TRECVID Search Task' - courtney


An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.


- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
combining multiple representations on the trecvid search task

Combining Multiple Representations on the TRECVID Search Task

Arjen P. de Vries

Thijs Westerveld

Tzvetanka I. Ianeva

ICASSP, May 21 2004

introduction
Introduction
  • Video Retrieval should take advantage of information from all available sources and modalities
    • …but so far ASR best for almost any query
  • LL11@TRECVID2003: Combining information sources
    • Different models/modalities
    • Multiple example images

ICASSP, May 21 2004

retrieval
Calculate conditional probabilities of observing query samples given each model in the collectionRetrieval

Models

P(Q|M1)

Query

P(Q|M2)

P(Q|M3)

P(Q|M4)

ICASSP, May 21 2004

static model
Indexing

Estimate a Gaussian Mixture Model from each keyframe (using EM)

Fixed number of components (C=8)

Feature vectors contain colour, texture, and position information from pixel blocks: <x,y,DCT>

Static Model

ICASSP, May 21 2004

dynamic model
Dynamic Model
  • Indexing:
    • GMM of multipleframes (N=29) around keyframe
    • Feature vectors extended with time-stamp in [0,1]: <x,y,t,DCT>

1

.5

0

ICASSP, May 21 2004

dynamic model1
Dynamic Model

ICASSP, May 21 2004

dynamic model advantages
Dynamic Model Advantages
  • More training data for models
  • Reduced dependency upon selecting appropriate keyframe
  • Some spatio-temporal aspects of shot are captured
    • (Dis-)appearance of objects

ICASSP, May 21 2004

experimental set up
Experimental Set-up
  • Build models for each shot
    • Static, Dynamic, Language
  • Build Queries from topics
    • Construct simple keyword text query
    • Select visual example
    • Rescale and compress example images to match video size and quality

ICASSP, May 21 2004

combining modalities
Combining Modalities
  • Independence assumption textual/visual
    • P(Qt,Qv|Shot) = P(Qt|LM) * P(Qv|GMM)
  • Combination works if both runs useful [CWI:TREC:2002]
  • Dynamic run moreuseful than static run

ICASSP, May 21 2004

slide12

Dow Jones Topic (120)

ICASSP, May 21 2004

dow jones topic 1201
Dow Jones Topic (120)

ICASSP, May 21 2004

slide15

Arafat topic (103)

ICASSP, May 21 2004

arafat topic 103
Arafat Topic (103)

ICASSP, May 21 2004

slide17

Baseball topic (102)

Basketball topic (101)

ICASSP, May 21 2004

basketball topic
Basketball Topic

ICASSP, May 21 2004

merging run results
Merging Run Results

ICASSP, May 21 2004

merging run results1
Combining (conflicting) examples difficult [CWI:TREC:2002]

Single example  Miss relevant shots

Round-Robin Merging

Merging Run Results

Combined

1

1

2

2

3

3

4

4

.

.

1

2

3

4

5

6

7

8

9

10

1

2

3

4

5

6

7

8

9

10

ICASSP, May 21 2004

merging run results2
Combining (conflicting) examples difficult [CWI:TREC:2002]

Single example  Miss relevant shots

Round-Robin Merging

Merging Run Results

Combined

1

1

2

2

3

3

4

4

.

.

1

2

3

4

5

6

7

8

9

10

1

2

3

4

5

6

7

8

9

10

ICASSP, May 21 2004

slide22

Flames (112)

ICASSP, May 21 2004

flames topic 112
Flames Topic (112)

ICASSP, May 21 2004

conclusions
Conclusions
  • For most topics, neither the static nor the dynamic visual model captures the user information need sufficiently…
  • …averaged over 25 topics however, it is better to use both modalities than ASR only

Working hypothesis: Matching against both modalities gives robustness

ICASSP, May 21 2004

conclusions1
Conclusions
  • Dynamic captures visual similarity better
    • Thanks to spatio-temporal aspects?
      • Experiments with full covariance matrix for <x,y,t>-dims
  • Static model of KF is too fragile
    • Dependency on single KF?
      • To be tested by ranking max(all I-frames in shot)
    • Not enough training data?

ICASSP, May 21 2004

conclusions2
Conclusions
  • Visual aspects of an information need are best captured by using multiple examples
  • Combining results for multiple (good) examples in round-robin fashion, each ranked on both modalities, gives near-best performance for almost all topics

ICASSP, May 21 2004