
Unsupervised Temporal Commonality Discovery
Wen-Sheng Chu, Feng Zhou and Fernando De la Torre
Robotics Institute, Carnegie Mellon University
July 9, 2013


Unsupervised Commonality Discovery in Images

(Chu’10, Mukherjee’11, Collins’12)

Where are the repeated patterns?


Unsupervised Commonality Discovery in Videos?

  • We name it Temporal Commonality Discovery (TCD).

  • Goal: Given two videos, discover common events in an unsupervised fashion.


TCD is hard!

1) No prior knowledge on commonalities

  • We do not know what, where, and how many commonalities exist in the videos.

    2) Exhaustive search is computationally prohibitive

  • E.g., two videos with 300 frames have >8,000,000,000 possible matches.

O(n) possible locations × O(n) possible lengths = O(n^2) possibilities/sequence

Another O(n^2) possibilities for the second sequence → O(n^4) candidate matches in total
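
As a quick check on the 8,000,000,000 figure above: with n = 300 frames per video there are on the order of n^2 subsequences per video, so the number of candidate pairs is roughly

\[
n^2 \times n^2 \;=\; n^4 \;=\; 300^4 \;=\; 8.1 \times 10^{9}.
\]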


Formulation

Integer programming!
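
A plausible way to write the TCD objective as an integer program (a sketch, not necessarily the paper's exact notation), with φ(·) a histogram of a subsequence, d(·,·) a histogram distance, and ℓ a minimal segment length:

\[
\min_{b_1,\, e_1,\, b_2,\, e_2}\;
d\big(\varphi(S_1[b_1\!:\!e_1]),\ \varphi(S_2[b_2\!:\!e_2])\big)
\quad \text{s.t.}\quad
e_i - b_i \ge \ell,\quad b_i, e_i \in \{1, \dots, n_i\},\quad i = 1, 2,
\]

where d can be, e.g., the L1 distance, the χ² distance, or the (negated) intersection similarity used later.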


Optimization: Interpretation


Complexity

Optimization: Naïve Search

Complexity
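
A minimal sketch of the naïve search, assuming hypothetical hist() and dist() helpers for the per-segment histogram and the histogram distance; it makes the O(n1^2 · n2^2) cost explicit:

```python
def naive_tcd(S1, S2, hist, dist, min_len=1):
    """Score every pair of subsequences exhaustively: O(n1^2 * n2^2) distance evaluations."""
    n1, n2 = len(S1), len(S2)
    best, best_d = None, float("inf")
    for b1 in range(n1):
        for e1 in range(b1 + min_len, n1 + 1):
            h1 = hist(S1[b1:e1])                      # histogram of subsequence 1
            for b2 in range(n2):
                for e2 in range(b2 + min_len, n2 + 1):
                    d = dist(h1, hist(S2[b2:e2]))     # distance between histograms
                    if d < best_d:
                        best, best_d = (b1, e1, b2, e2), d
    return best, best_d
```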


Optimization: Branch-and-Bound

  • Similar to the idea of ESS (Lampert’08), we search the space by splitting intervals.


Optimization: Branch-and-Bound

  • Bounding histogram bins
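
A sketch of the bounding idea for histogram bins, under the assumption (as in ESS) that each bin counts non-negative per-frame contributions, so bins grow monotonically with the interval: if the beginning index can range over [b_lo, b_hi] and the end index over [e_lo, e_hi], then every candidate interval contains [b_hi, e_lo] and is contained in [b_lo, e_hi], giving per-bin bounds

\[
h_k^{-} := h_k\big([\,b_{\mathrm{hi}},\, e_{\mathrm{lo}}\,]\big)
\;\le\; h_k(I) \;\le\;
h_k\big([\,b_{\mathrm{lo}},\, e_{\mathrm{hi}}\,]\big) =: h_k^{+}
\qquad \text{for every interval } I \text{ in the set},
\]

with the convention that the lower bound is 0 when b_hi > e_lo (the smallest interval is empty).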


Optimization: Branch-and-Bound

  • Bounding L1 distance:

  • Intersection similarity:

  • χ² distance:
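
As one concrete illustration of how such bounds can be built from the per-bin bounds above (a sketch, not necessarily the paper's exact bound), the L1 distance admits the bin-wise lower bound

\[
\sum_k \big|\, h_k(I_1) - g_k(I_2) \,\big|
\;\ge\;
\sum_k \max\!\big(0,\; h_k^{-} - g_k^{+},\; g_k^{-} - h_k^{+}\big),
\]

where h^± and g^± are the per-bin bounds for the two interval sets; analogous bounds can be written for the intersection similarity and the χ² distance.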


Searching Structure

State S = (Rectangle set; score)

(B1,E1,B2,E2; -105)

(B1,E1,B2,E2; -10)

(B1,E1,B2,E2; -50)

Unlikely search regions

(B1,E1,B2,E2; 32)

Priority queue

(sorted by bound scores)
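
A minimal Python sketch of this structure, assuming a state is a pair (bound score, rectangle), where the rectangle stores the (lo, hi) ranges for B1, E1, B2, E2; heapq keeps the lowest (most promising) bound on top. The interval ranges below are made-up placeholders; the scores mirror the ones shown on the slide.

```python
import heapq

# rectangle = ((B1_lo, B1_hi), (E1_lo, E1_hi), (B2_lo, B2_hi), (E2_lo, E2_hi))
queue = []
heapq.heappush(queue, (-105, ((0, 9), (10, 29), (0, 9), (10, 29))))   # most promising state
heapq.heappush(queue, (-50,  ((0, 4), (10, 29), (0, 9), (10, 29))))
heapq.heappush(queue, (32,   ((5, 9), (10, 29), (0, 9), (10, 29))))   # unlikely search region

bound, rect = heapq.heappop(queue)   # pops the top state, here the one with bound -105
```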


Algorithm

1. Pop out the top state

Top state

2. Split

(B1,E1,B2,E2; -105)

(B1,E1,B2,E2; -105)

(B1,E1,B2,E2; -50)

(B1,E1,B2,E2; 32)

Priority queue

(sorted by bound scores)


Algorithm

3. Compute bounding scores

4. Push back the split states

Top state

(B1,E1,B2,E2; -105)

(B1,E1,B2,E2; -50)

(B1,E’1,B2,E2; -76)

(B1,E1,B2,E2; 32)

Priority queue

(sorted by bound scores)

(B1,E’’1,B2,E2; -61)


Algorithm

  • The algorithm stops when the top state contains a unique rectangle.

(B1,E’1,B2,E2; -76)

Top state

(B1,E’’1,B2,E2; -61)

(B1,E1,B2,E2; -50)

Omit most of the search space with large distances

(B1,E1,B2,E2; 32)

Priority queue

(sorted by bound scores)
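
Putting the four steps together, a sketch of the main loop under the same assumptions: split() and lower_bound() are hypothetical helpers, where split() halves the widest of the four ranges and lower_bound() evaluates a bound such as the L1 bound sketched earlier.

```python
import heapq

def is_unique(rect):
    """True when every range has collapsed to a single value, i.e. a unique rectangle."""
    return all(lo == hi for lo, hi in rect)

def tcd_branch_and_bound(root_rect, lower_bound, split):
    queue = [(lower_bound(root_rect), root_rect)]
    while queue:
        bound, rect = heapq.heappop(queue)            # 1. pop out the top state
        if is_unique(rect):                           # stop: top state is a unique rectangle
            return rect, bound
        for child in split(rect):                     # 2. split the top state
            heapq.heappush(queue, (lower_bound(child), child))  # 3.-4. bound and push back
    return None, float("inf")
```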


Compare with Relevant Work

  • Differences between TCD and ESS [1] / STBB [2]

    • Different learning framework:

      • Unsupervised vs. Supervised

    • New bounding functions for TCD

  • Difference between TCD and [3]

    • Different objective:

      • Commonality Discovery vs. Temporal Clustering

        [1] “Efficient subwindow search: A branch and bound framework for object localization”, PAMI 2009.

        [2] “Discriminative video pattern search for efficient action detection”, PAMI 2011.

        [3] “Unsupervised discovery of facial events”, in CVPR 2010.


Experiment (1): Synthesized Sequence

Histograms of the discovered pair of subsequences


Experiment (2): Discover Common Facial Actions

  • RU-FACS dataset*

    • Interview videos with 29 subjects

    • 5000~8000 frames/video

    • Collected 100 segments containing smiling mouths (AU-12)

    • Evaluated in terms of average precision

* “Automatic recognition of facial actions in spontaneous expressions”, Journal of Multimedia 2006.


Experiment (2): Discover Common Facial Actions


Experiment (2): Speed Evaluation (number of evaluations of the distance function)

  • Parameter settings for Sliding Windows (SW)

  • Log of #evaluations:

  • Quality of discovered patterns:



Experiment (2): Discover Common Facial Actions

  • Compare with LCCS* on -distance

* “Frame-level temporal calibration of unsynchronized cameras by using Longest Consecutive Common Subsequence”, ICASSP 2009.


Experiment (3): Discover Multiple Common Human Motions

  • CMU-Mocap dataset:

    • http://mocap.cs.cmu.edu/

  • 15 sequences from Subject 86

  • 1200~2600 frames and up to 10 actions/seq

  • Exclude the comparison with SW because it needs >10^12 evaluations
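
A rough count consistent with this bullet, using the same n^4 argument as in the earlier 300-frame example, even for the shortest sequences here:

\[
n^4 \;\approx\; 1200^4 \;\approx\; 2.1 \times 10^{12} \;>\; 10^{12}.
\]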


Experiment (3): Discover Multiple Common Human Motions


Experiment (3): Discover Multiple Common Human Motions

  • Compare with LCCS* on -distance


Extension: Video Indexing

  • Goal: Given a query, find the best common subsequence in the target video

  • A straightforward extension:

Temporal search space
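
One way to read the straightforward extension (a sketch reusing the hypothetical tcd_branch_and_bound() above, with m query frames and n target frames): clamp the query interval so that only the target's (B2, E2) ranges remain free, which shrinks the temporal search space from O(n^4) to O(n^2).

```python
# The query is matched as a whole: B1 and E1 are clamped to its endpoints,
# so only the target interval (B2, E2) is searched.
root_rect = ((0, 0), (m - 1, m - 1),       # B1, E1 fixed to the full query
             (0, n - 1), (0, n - 1))       # B2, E2 range over the target video
best_rect, best_bound = tcd_branch_and_bound(root_rect, lower_bound, split)
```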


A Prototype for Video Indexing


Summary

  • We have introduced:

    • NEW! Problem, algorithm, and bounding functions

    • Able to discover temporal commonalities efficiently and effectively

  • Next:

    • Use submodular functions to improve speed

    • Explore more bounds for other metrics between temporal segments, such as dynamic time warping

    • More applications. E.g., video indexing, co-occurrence detection, irregularity detection, etc.


Questions?

[1] “Common Visual Pattern Discovery via Spatially Coherent Correspondences,” In CVPR 2010.

[2] “MOMI-cosegmentation: simultaneous segmentation of multiple objects among multiple images,” In ACCV 2010.

[3] “Scale invariant cosegmentation for image groups,” In CVPR 2011.

[4] “Random walks based multi-image segmentation: Quasiconvexity results and GPU-based solutions,” In CVPR 2012.

[5] “Frame-level temporal calibration of unsynchronized cameras by using Longest Consecutive Common Subsequence,” In ICASSP 2009.

[6] “Efficient ESS with submodular score functions,” In CVPR 2011.

http://humansensing.cs.cmu.edu/wschu/