180 likes | 404 Views
On Detection of Multiple Object Instances using Hough Transforms. Olga Barinova Moscow State University. Victor Lempitsky University of Oxford. Pushmeet Kohli Microsoft Research Cambridge. Hough transforms. Object detection → peaks identification in Hough images Beyond lines!!!
E N D
On Detection of Multiple Object Instances using Hough Transforms Olga Barinova Moscow State University Victor Lempitsky University of Oxford Pushmeet Kohli Microsoft Research Cambridge
Hough transforms • Object detection → peaks identification in Hough images • Beyond lines!!! • Ballard 1983 – Other primitives • Lowe, ICCV 1999 – Object detection • Leibe, Schiele BMVC 2003 – Object class detection • Last CVPR: Maji& Malik, Gall& Lempitsky, Gu et al. …
Category-level object detection Example from Gall &Lempitsky CVPR 2009
Multiple lines detection • Identifying the peaks in Hough images is highly nontrivial in case of multiple close objects • Postprocessing (e.g. non-maximum suppression) is usually used • Our framework is similar to Hough Transforms but doesn’t require finding local maxima and suppresses non-maxima automatically
Elements space Hough space Our framework Hypotheses Voting elements
Elements space Hough space Our framework 1 2 3 x – labelling of voting elements, xi = index of hypothesis, if element votes for hypothesis, xi = 0, if element votes for background y – labelling of hypotheses, binary variables: 1 = object is present, 0 = otherwise
Elements space Hough space Our framework x2=1 x3=1 x1=1 1 y1=1 y2=1 x4=2 2 y3=0 3 x8=2 x – labelling of voting elements, xi = index of hypothesis, if element votes for hypothesis, xi = 0, if element votes for background y – labelling of hypotheses, binary variables: 1 = object is present, 0 = otherwise x6=2 x7=0 x5=2 Key idea : joint MAP-inference in x and y
Probabilistic derivation Likelihood Term • Assume that given the existing objects y and the hypotheses assignmentsx, the distributions of the descriptors of voting elements are independent • Less crude than the independence assumption implicitly made by the traditional Hough • Prior Term • Occam razor (or MDL) prior penalizes the number of the active hypotheses
Probabilistic derivation voting elements 0 1 2 3 0 0 1 -∞ if xi = h, and yh = 0 0, otherwise 1 how likely is that voting element belongs to an object Corresponds to the votes in standard Hough transform: Training stays the same! 2 3 0 0 1 “MDL” prior: λ, if yh = 1 0, otherwise 1 2 3 0 1 0 2 1 3 Problem known as facility location hypotheses 0 [Delong et al. CVPR 2010] (today’s poster) looks at facility location with wider set of priors 1 2 3
Probabilistic derivation 0 voting elements 1 2 3 • Tried different methods for MAP-inference • belief propagation • simulated annealing • They work well but don’t allow using large number of hypotheses • graph becomes huge and dense • sparsification heuristics required 0 0 1 1 2 3 0 0 1 1 2 3 0 1 0 2 1 3 hypotheses 0 1 2 3
Probabilistic derivation 0 voting elements 1 2 3 • If labeling of y is given the values ofxiare independent • After maximizing out xwe get: • Large-clique, submodular • Greedy algorithm is as good as anything else (in terms of the approximation factor) • Greedy inference ~ iterative Hough voting 0 0 1 1 2 3 0 0 1 1 2 3 0 1 0 2 1 3 hypotheses 0 1 2 3
Greedy maximization for our energy Greedily add detections starting from the empty set For each iteration do the voting: Seth0 = the overall maximum of HoughImage IfHoughImage(h0) > λ, add h0 to detection set, else terminate “standard” Hough vote for element i Maximum over Hough votes for the hypotheses g that have already been switched on, including ‘background’ Sum over all voting elements
Inference Using the Hough forest trained in [Gall&Lempitsky CVPR09] Datasets from [Andriluka et al. CVPR 2008](with strongly occluded pedestrians added)
Results for pedestrians detection Hough transform + non-maximum suppression Our framework White = correct detection Green = missing object Red = false positive
Results for pedestrians detection TUD-crossing TUD-campus Precision Precision Recall Recall Blue = Hough transform + non-maximum suppression Light-blue = our framework
Results for lines detection York Urban DB, Elder&Estrada ECCV 2008 • our framework is able to discern very close yet distinct lines, and is in general much less plagued by spurious detections Our framework Hough + NMS
Conclusion • Framework for detecting multiple objects, greedy inference ~ iterated Hough transform • No need to find local maxima and suppress non-maxima – just take the only global maximum • Probabilistic model allows for extensions(ECCV paper coming: lines + vanishing points + horizon + zenith) • Training stays the same as for the recent Hough-based framework • Code available at the project page: http://graphics.cs.msu.ru/en/ science/research/machinelearning/hough Thank you for your attention!