On Detection of Multiple Object Instances using Hough Transforms

Olga Barinova, Moscow State University

Victor Lempitsky, University of Oxford

Pushmeet Kohli, Microsoft Research Cambridge

- Object detection → peak identification in Hough images
- Beyond lines:
  - Ballard 1983: other primitives
  - Lowe, ICCV 1999: object detection
  - Leibe & Schiele, BMVC 2003: object class detection
  - Last CVPR: Maji & Malik, Gall & Lempitsky, Gu et al. …

Category-level object detection

[Figure: example from Gall & Lempitsky, CVPR 2009]

- Identifying the peaks in Hough images is highly nontrivial when multiple objects are close to each other
- Postprocessing (e.g. non-maximum suppression) is usually used
- Our framework is similar to Hough transforms, but it does not require finding local maxima and it suppresses non-maxima automatically
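
For contrast, the conventional post-processing step mentioned above can be sketched as follows. This is only an illustration, not the code of any of the cited detectors: the function name `hough_peaks_nms`, the square suppression window, and the `threshold` and `radius` parameters are assumptions made for the example.

```python
def hough_peaks_nms(hough, threshold, radius):
    """Greedy peak picking with non-maximum suppression on a 2D Hough image.

    hough: list of rows (list of lists of numbers); it is not modified.
    Returns a list of (row, col, score) peaks, strongest first.
    """
    img = [row[:] for row in hough]  # work on a copy
    rows, cols = len(img), len(img[0])
    peaks = []
    while True:
        # Find the current global maximum of the (partly suppressed) image.
        best, r0, c0 = max(
            (img[r][c], r, c) for r in range(rows) for c in range(cols)
        )
        if best <= threshold:
            break
        peaks.append((r0, c0, best))
        # Suppress every cell inside the window around the accepted peak.
        for r in range(max(0, r0 - radius), min(rows, r0 + radius + 1)):
            for c in range(max(0, c0 - radius), min(cols, c0 + radius + 1)):
                img[r][c] = float("-inf")
    return peaks


# Two nearby peaks (columns 2 and 4) in a one-row Hough image.
image = [[0, 1, 5, 3, 4, 1, 0]]
wide = hough_peaks_nms(image, threshold=2, radius=2)
narrow = hough_peaks_nms(image, threshold=2, radius=1)
```

With the wide window the weaker peak at column 4 is suppressed together with its neighbour, while the narrow window keeps both. The result depends on a hand-tuned suppression radius, which is the difficulty the slide points at.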

[Figure: voting elements in the elements space vote for hypotheses 1, 2 and 3 in the Hough space]

x: labelling of voting elements

- $x_i$ = index of the hypothesis, if element $i$ votes for a hypothesis
- $x_i = 0$, if element $i$ votes for background

y: labelling of hypotheses (binary variables)

- $y_h = 1$, if the object is present
- $y_h = 0$, otherwise

[Figure: example labelling in the elements space and the Hough space. Voting elements: $x_1 = x_2 = x_3 = 1$, $x_4 = x_5 = x_6 = x_8 = 2$, $x_7 = 0$ (background). Hypotheses: $y_1 = 1$, $y_2 = 1$, $y_3 = 0$.]

Key idea: joint MAP inference in x and y
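
The two labellings can be written down as plain arrays. The sketch below encodes the example labelling from the slide (eight voting elements, three hypotheses). Two things are assumptions made for illustration: the dummy entry `y[0]` for the background label, and the helper name `is_consistent`.

```python
# x[i - 1] is the hypothesis index chosen by voting element i (0 = background).
x = [1, 1, 1, 2, 2, 2, 0, 2]

# y[h] is 1 if hypothesis h is switched on. y[0] is a dummy entry for the
# background label, so that hypothesis indices can be used directly.
y = [1, 1, 1, 0]


def is_consistent(x, y):
    """True if no element is assigned to a hypothesis that is switched off."""
    return all(h == 0 or y[h] == 1 for h in x)
```

Joint inference searches over pairs `(x, y)` that pass this check. Assigning an element to hypothesis 3 while `y[3] == 0` would make the pair inconsistent.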

Likelihood term

- Assume that, given the existing objects y and the hypothesis assignments x, the distributions of the descriptors of the voting elements are independent
- Less crude than the independence assumption implicitly made by the traditional Hough transform

Prior term

- Occam's razor (or MDL) prior penalizes the number of active hypotheses

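[Figure: graphical model linking voting elements and hypotheses]

Consistency term between voting element $i$ and hypothesis $h$:

- $-\infty$, if $x_i = h$ and $y_h = 0$
- $0$, otherwise

Term on each voting element: how likely it is that the voting element belongs to an object. It corresponds to the votes in the standard Hough transform, so training stays the same!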

“MDL” prior on each hypothesis $h$ (contribution to the log-posterior):

- $-\lambda$, if $y_h = 1$
- $0$, otherwise

Problem known as facility location

[Delong et al. CVPR 2010] (today’s poster) looks at facility location with a wider set of priors
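
Put together, the terms above define a score for any joint labelling. The following is a minimal sketch of that score. It assumes the log-posterior is the sum of the per-element Hough votes, the consistency term, and a penalty of λ per active hypothesis. The names `log_posterior`, `votes` and `lam`, and the layout of `votes` with column 0 for background, are assumptions made for the example.

```python
NEG_INF = float("-inf")


def log_posterior(x, y, votes, lam):
    """Unnormalised log-posterior of a joint labelling (x, y).

    x[i]        : hypothesis index chosen by element i (0 = background)
    y[h]        : 1 if hypothesis h is switched on (y[0] is ignored)
    votes[i][h] : log-vote of element i for label h (column 0 = background)
    lam         : MDL penalty paid for every active hypothesis
    """
    total = 0.0
    for i, h in enumerate(x):
        if h != 0 and y[h] == 0:
            return NEG_INF      # element assigned to a switched-off hypothesis
        total += votes[i][h]    # per-element (Hough vote) term
    total -= lam * sum(y[1:])   # MDL prior on the active hypotheses
    return total


votes = [[-2.0, -0.5, -3.0],
         [-2.0, -3.0, -0.4]]
score = log_posterior([1, 2], [0, 1, 1], votes, lam=1.0)
```

Here both hypotheses are active and each element votes for its preferred one, so the score is the two votes minus two penalties. Switching hypothesis 2 off while keeping `x = [1, 2]` yields `NEG_INF`.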

- Tried different methods for MAP inference:
  - belief propagation
  - simulated annealing
- They work well but do not scale to a large number of hypotheses:
  - the graph becomes huge and dense
  - sparsification heuristics are required


- If the labelling y is given, the values of $x_i$ are independent
- After maximizing out x, we get a function of y that is large-clique and submodular
- The greedy algorithm is as good as anything else (in terms of the approximation factor)
- Greedy inference ~ iterative Hough voting
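
The maximization over x can be written out explicitly. The following is a sketch under the assumption that the log-posterior is the sum of the per-element votes, the consistency term, and a penalty of $\lambda$ per active hypothesis. The symbol $u_i(h)$ for the Hough vote of element $i$ for label $h$ (with $h = 0$ denoting background) is notation introduced here for illustration.

```latex
\max_{x} \log p(x, y \mid I)
  = \sum_{i} \max\Bigl( u_i(0),\; \max_{h \,:\, y_h = 1} u_i(h) \Bigr)
  \;-\; \lambda \sum_{h} y_h
```

Each voting element independently picks the best label among background and the hypotheses that are switched on. What remains is a function of y alone, in which every element couples all hypotheses through the inner maximum.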


Greedily add detections, starting from the empty set.

For each iteration:

- Do the voting: HoughImage(h) is the sum, over all voting elements i, of the “standard” Hough vote of element i for h minus the maximum over the Hough votes of element i for the hypotheses g that have already been switched on (including ‘background’), counted only when this difference is positive
- Set h0 = the location of the overall maximum of HoughImage
- If HoughImage(h0) > λ, add h0 to the detection set; else terminate
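
The iteration can be sketched in a few lines. This is an illustrative reading of the steps above, not the authors' released code. It assumes a small, explicit list of hypotheses and log-domain votes stored as `votes[i][h]` with column 0 for background. The name `greedy_detect` and the clamping of each contribution at zero are assumptions of this sketch.

```python
def greedy_detect(votes, lam):
    """Greedy inference by iterated Hough voting.

    votes[i][h] : log-vote of element i for label h (column 0 = background)
    lam         : a hypothesis is added only if its accumulated gain exceeds lam
    Returns the detected hypothesis indices in the order they were added.
    """
    n_labels = len(votes[0])
    # Best vote each element currently gets from the labels switched on so far.
    best = [row[0] for row in votes]
    detections = []
    while True:
        candidates = [h for h in range(1, n_labels) if h not in detections]
        if not candidates:
            break
        # Voting: an element contributes only its improvement over `best`.
        hough = {h: 0.0 for h in candidates}
        for i, row in enumerate(votes):
            for h in candidates:
                hough[h] += max(row[h] - best[i], 0.0)
        h0 = max(candidates, key=lambda h: hough[h])
        if hough[h0] <= lam:
            break
        detections.append(h0)
        # Elements explained by h0 now vote less for overlapping hypotheses.
        best = [max(b, row[h0]) for b, row in zip(best, votes)]
    return detections


# Hypotheses 1 and 2 are near-duplicates supported by the same two elements;
# hypothesis 3 is a distinct object supported by the other two elements.
votes = [[-3.0, -1.0, -1.2, -5.0],
         [-3.0, -1.0, -1.2, -5.0],
         [-3.0, -5.0, -5.0, -1.5],
         [-3.0, -5.0, -5.0, -1.5]]
found = greedy_detect(votes, lam=1.0)
```

After hypothesis 1 is accepted, the elements that supported it no longer add anything to its near-duplicate, hypothesis 2, so that hypothesis is never selected. No separate suppression step is needed.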

Using the Hough forest trained in [Gall & Lempitsky CVPR 2009]

Datasets from [Andriluka et al. CVPR 2008] (with strongly occluded pedestrians added)

[Figure: detection results of Hough transform + non-maximum suppression vs. our framework. White = correct detection, green = missing object, red = false positive.]

[Figure: precision and recall curves on TUD-crossing and TUD-campus. Blue = Hough transform + non-maximum suppression, light blue = our framework.]

York Urban DB, Elder & Estrada ECCV 2008

- Our framework is able to discern very close yet distinct lines and is, in general, much less plagued by spurious detections

[Figure: line detection results, our framework vs. Hough + NMS]

- Framework for detecting multiple objects; greedy inference ~ iterated Hough transform
- No need to find local maxima and suppress non-maxima: just take the single global maximum
- The probabilistic model allows for extensions (ECCV paper coming: lines + vanishing points + horizon + zenith)
- Training stays the same as for the recent Hough-based frameworks
- Code available at the project page: http://graphics.cs.msu.ru/en/science/research/machinelearning/hough

Thank you for your attention!