On Detection of Multiple Object Instances using Hough Transforms

Olga Barinova (Moscow State University), Victor Lempitsky (University of Oxford), Pushmeet Kohli (Microsoft Research Cambridge)

Hough transforms
• Object detection → identification of peaks in Hough images
• Beyond lines:
  • Ballard 1983: other primitives
  • Lowe, ICCV 1999: object detection
  • Leibe & Schiele, BMVC 2003: object class detection
  • Last CVPR: Maji & Malik, Gall & Lempitsky, Gu et al. …

Category-level object detection (example from Gall & Lempitsky, CVPR 2009)

Category-level object detection ?

Multiple line detection
• Identifying the peaks in Hough images is highly nontrivial when multiple objects lie close together
• Postprocessing (e.g. non-maximum suppression) is usually used
• Our framework is similar to Hough transforms, but it does not require finding local maxima and suppresses non-maxima automatically

Our framework [Figure: voting elements in the elements space, hypotheses in the Hough space]

Our framework [Figure: elements space and Hough space, with hypotheses 1, 2, 3]
• x: labelling of voting elements; xi = index of the hypothesis that element i votes for, xi = 0 if the element votes for background
• y: labelling of hypotheses, binary variables; yh = 1 if the object is present, 0 otherwise

Our framework [Figure: example labelling of voting elements and hypotheses]
• Voting elements: x1 = 1, x2 = 1, x3 = 1, x4 = 2, x5 = 2, x6 = 2, x7 = 0, x8 = 2
• Hypotheses: y1 = 1, y2 = 1, y3 = 0
• Key idea: joint MAP-inference in x and y
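The example labelling on this slide can be written down directly. The sketch below is illustrative only: the function name `is_consistent` and the list encoding are assumptions, not part of the original talk. It checks the one structural rule the labelling must obey, namely that an element may only vote for a hypothesis that is switched on.

```python
def is_consistent(x, y):
    """Return True if every element assigned to hypothesis h has y[h] == 1.

    x: list of ints; x[i] is a 1-based hypothesis index, or 0 for background.
    y: list of 0/1 flags; y[h - 1] is the flag of hypothesis h.
    (Hypothetical encoding chosen for this example.)
    """
    return all(xi == 0 or y[xi - 1] == 1 for xi in x)


# Labelling from the slide: elements 1-3 vote for hypothesis 1,
# elements 4, 5, 6 and 8 vote for hypothesis 2, element 7 is background.
x = [1, 1, 1, 2, 2, 2, 0, 2]
y = [1, 1, 0]

print(is_consistent(x, y))            # True
print(is_consistent([3] + x[1:], y))  # False: hypothesis 3 is switched off
```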

Probabilistic derivation
• Likelihood term
  • Assume that, given the existing objects y and the hypothesis assignments x, the distributions of the descriptors of the voting elements are independent
  • Less crude than the independence assumption implicitly made by the traditional Hough transform
• Prior term
  • Occam's razor (or MDL) prior penalizes the number of active hypotheses

Probabilistic derivation [Figure: graph connecting voting elements to hypotheses]
• Term on each voting element: how likely it is that the voting element belongs to an object. Corresponds to the votes in the standard Hough transform: training stays the same!
• Term on each pair of voting element i and hypothesis h: −∞ if xi = h and yh = 0; 0 otherwise
• “MDL” prior on each hypothesis h: λ if yh = 1; 0 otherwise
• The problem is known as facility location
• [Delong et al. CVPR 2010] (today’s poster) looks at facility location with a wider set of priors
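The terms listed on this slide can be assembled into a single objective. What follows is a reconstruction under stated assumptions, not a copy of the slide's formula: the symbols u_i, v_h and w_ih are names introduced here, and the sign convention (λ entering as a penalty in a log-posterior that is maximized) is assumed from the statement that the prior penalizes active hypotheses.

```latex
\begin{aligned}
\log p(x, y \mid \text{image}) &= \sum_{i} u_i(x_i) \;+\; \sum_{h} v_h(y_h)
    \;+\; \sum_{i}\sum_{h} w_{ih}(x_i, y_h) \;+\; \text{const}, \\[4pt]
u_i(x_i) &= \text{Hough vote of element } i \text{ for hypothesis } x_i
    \quad (x_i = 0 \text{ is background}), \\[4pt]
v_h(y_h) &= -\lambda\, y_h, \\[4pt]
w_{ih}(x_i, y_h) &=
    \begin{cases}
        -\infty & \text{if } x_i = h \text{ and } y_h = 0, \\
        0 & \text{otherwise.}
    \end{cases}
\end{aligned}
```

Under this reading, the −∞ term forbids an element from voting for a switched-off hypothesis, and each active hypothesis costs λ.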

Probabilistic derivation [Figure: graph connecting voting elements to hypotheses]
• Tried different methods for MAP-inference
  • belief propagation
  • simulated annealing
• They work well but do not allow using a large number of hypotheses
  • graph becomes huge and dense
  • sparsification heuristics required

Probabilistic derivation [Figure: graph connecting voting elements to hypotheses]
• If the labelling y is given, the values of xi are independent
• After maximizing out x we get an objective in y alone
  • Large-clique, submodular
• Greedy algorithm is as good as anything else (in terms of the approximation factor)
• Greedy inference ~ iterative Hough voting
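The step of maximizing out x can be sketched in a few lines. This is a minimal illustration, assuming the votes are stored as a table `votes[i][h]` with column 0 reserved for background and that each active hypothesis costs `lam`; the names `best_assignment`, `objective`, `votes` and `lam` are hypothetical and the numbers are made up.

```python
def best_assignment(votes, active):
    """For fixed active hypotheses, pick the best label for each element.

    votes[i][h]: vote of element i for hypothesis h (h = 0 is background).
    active: set of switched-on hypothesis indices (all >= 1).
    Because the elements are independent given y, each x[i] is chosen
    separately among background and the active hypotheses.
    """
    allowed = [0] + sorted(active)
    return [max(allowed, key=lambda h: row[h]) for row in votes]


def objective(votes, active, lam):
    """Objective in y alone: best votes summed, minus lam per active hypothesis."""
    x = best_assignment(votes, active)
    return sum(row[xi] for row, xi in zip(votes, x)) - lam * len(active)


# Toy vote table: 3 elements, background plus 2 hypotheses (made-up numbers).
votes = [
    [0.0, 2.0, 0.5],
    [0.0, 1.5, 0.25],
    [0.0, 0.25, 1.0],
]

print(best_assignment(votes, {1, 2}))  # [1, 1, 2]
print(objective(votes, {1}, 1.0))      # 2.75
print(objective(votes, {1, 2}, 1.0))   # 2.5
```

With these numbers, switching on hypothesis 2 as well does not pay for its penalty, so the objective prefers the single detection.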

Greedy maximization for our energy
• Greedily add detections, starting from the empty set
• At each iteration, do the voting; the Hough image combines:
  • the “standard” Hough vote for element i
  • the maximum over Hough votes for the hypotheses g that have already been switched on, including ‘background’
  • a sum over all voting elements
• Set h0 = the overall maximum of HoughImage
• If HoughImage(h0) > λ, add h0 to the detection set, else terminate
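One way to read the annotations on this slide is that each element contributes to a hypothesis only the amount by which its vote exceeds the best vote it already has among the switched-on hypotheses (background included). The sketch below implements that reading; it is an assumption-laden illustration rather than the authors' code, and `greedy_detect`, `votes` and `lam` are hypothetical names with made-up numbers.

```python
def greedy_detect(votes, lam):
    """Greedy inference, sketched as iterated Hough voting.

    votes[i][h]: Hough vote of element i for hypothesis h (h = 0 is background).
    lam: cost of switching on one hypothesis.
    Returns the detected hypothesis indices in the order they were added.
    """
    n_hyp = len(votes[0])
    detected = []
    # best[i]: highest vote of element i among hypotheses already on
    # (initially only background).
    best = [row[0] for row in votes]
    while True:
        # Hough image: summed improvement over the current best, per hypothesis.
        hough = {
            h: sum(max(row[h] - b, 0.0) for row, b in zip(votes, best))
            for h in range(1, n_hyp)
            if h not in detected
        }
        if not hough:
            break
        h0 = max(hough, key=hough.get)  # single global maximum
        if hough[h0] <= lam:
            break
        detected.append(h0)
        best = [max(b, row[h0]) for row, b in zip(votes, best)]
    return detected


# Toy vote table: 3 elements, background plus 2 hypotheses (made-up numbers).
votes = [
    [0.0, 2.0, 0.5],
    [0.0, 1.5, 0.25],
    [0.0, 0.25, 1.0],
]

print(greedy_detect(votes, 0.5))  # [1, 2]
print(greedy_detect(votes, 1.0))  # [1]
```

After hypothesis 1 is added, only the third element still gains from hypothesis 2 (by 0.75), so hypothesis 2 is accepted for `lam = 0.5` and rejected for `lam = 1.0`. No separate non-maximum suppression step appears: votes already explained by an accepted detection stop contributing.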

Inference
• Using the Hough forest trained in [Gall & Lempitsky CVPR 2009]
• Datasets from [Andriluka et al. CVPR 2008] (with strongly occluded pedestrians added)

Results for pedestrian detection [Figure: Hough transform + non-maximum suppression vs. our framework]
White = correct detection, Green = missing object, Red = false positive

Results for pedestrian detection [Figure: precision and recall curves on TUD-crossing and TUD-campus]
Blue = Hough transform + non-maximum suppression, Light-blue = our framework

Results for line detection [Figure: our framework vs. Hough + NMS on York Urban DB, Elder & Estrada ECCV 2008]
• Our framework is able to discern very close yet distinct lines, and is in general much less plagued by spurious detections

Conclusion
• Framework for detecting multiple objects; greedy inference ~ iterated Hough transform
• No need to find local maxima and suppress non-maxima: just take the single global maximum
• Probabilistic model allows for extensions (ECCV paper coming: lines + vanishing points + horizon + zenith)
• Training stays the same as for the recent Hough-based framework
• Code available at the project page: http://graphics.cs.msu.ru/en/science/research/machinelearning/hough
Thank you for your attention!
