Explore the J-linkage approach for fitting multiple instances of a model to noisy data with outliers. Learn about Random Sampling Consensus and how it enhances model selection. Discover the concept of Preference Sets and J-linkage properties in clustering data. Examine experiments comparing J-linkage to other methods like RANSAC and Mean-Shift. Real data application includes fitting 3D primitives to point clouds.
Fitting multiple structures to geometric data: the J-linkage approach • Roberto Toldo and Andrea Fusiello • University of Verona, University of Udine • Big Data - Udine, 5/6/2013
The problem • Fitting multiple instances of a model to data corrupted by noise and outliers
Two types of outliers • Gross outliers • Pseudo-outliers
Previous work • Sequential RANSAC: theoretically wrong • MultiRANSAC [Zuliani et al. ICIP05]: problems with intersecting models • Residual Histogram Analysis [Zhang & Kosecka ECCV06]: peak finding is unreliable • Mode finding in parameter space: • Randomized HT: discretization is critical • Mean Shift clustering: not robust enough
Random sampling consensus • Draw minimal sample sets (MSS) from the data points • Fit a model to each MSS • Build the consensus set of each model: the set of points whose distance to the model is below a given threshold (the inlier band) • Select the model with the highest consensus
Random sampling consensus • The number of MSS to be drawn must be large enough to guarantee that at least one outlier-free MSS is selected with high probability. • The assumption is that an outlier-free MSS will achieve the highest consensus, because inliers are structured whereas outliers are random. • RANSAC looks at the problem from the model's viewpoint.
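The RANSAC loop described on these two slides can be sketched for 2D lines as follows. This is a minimal illustration, not the authors' code: the line model, the function names and the fixed seed are assumptions made for the example. The helper `num_samples` uses the standard estimate N = log(1 - p) / log(1 - w^s) for the number of MSS needed to draw at least one outlier-free sample with probability p, given inlier ratio w and MSS size s.

```python
import math
import random

def num_samples(p, w, s):
    """Number of MSS so that, with probability p, at least one is outlier-free
    (w = inlier ratio, s = size of a minimal sample set)."""
    return math.ceil(math.log(1.0 - p) / math.log(1.0 - w ** s))

def fit_line(p, q):
    """Line through p and q as (a, b, c), with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    if norm == 0.0:
        return None  # degenerate MSS: the two points coincide
    return a / norm, b / norm, -(a * x1 + b * y1) / norm

def consensus_set(model, points, eps):
    """Indices of the points whose distance to the line is below eps."""
    a, b, c = model
    return {i for i, (x, y) in enumerate(points) if abs(a * x + b * y + c) < eps}

def ransac_line(points, eps, n_samples, seed=0):
    """Return the sampled line with the largest consensus set."""
    rng = random.Random(seed)
    best_model, best_cs = None, set()
    for _ in range(n_samples):
        model = fit_line(*rng.sample(points, 2))  # MSS of a line: 2 points
        if model is None:
            continue
        cs = consensus_set(model, points, eps)
        if len(cs) > len(best_cs):
            best_model, best_cs = model, cs
    return best_model, best_cs
```

Note that this returns a single model; the rest of the talk is about what to do when several model instances are present.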
The preference set • Let us look at the problem from the perspective of the points. • The Preference Set (PS) of a point is the set of models that fit it, i.e. the models whose consensus set (CS) contains the point. • [Figure: the CS of model j and the PS of point i]
Conceptual representation • The PS (or its characteristic function) is a conceptual representation of a point. • In Pattern Recognition, the conceptual representation of an object x given C classes is: [ P(x | class 1) · · · P(x | class C) ]. • Conjecture: points belonging to the same model have “similar” conceptual representations. • In other words, they cluster in the conceptual space.
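The relation between consensus sets and preference sets can be made concrete with a short sketch. It assumes that the consensus sets of the sampled models are already available as sets of point indices; the function name is hypothetical.

```python
def preference_sets(consensus_sets, n_points):
    """Turn per-model consensus sets into per-point preference sets.

    consensus_sets[j] is the set of point indices that model j fits;
    the result ps[i] is the set of model indices that fit point i.
    """
    ps = [set() for _ in range(n_points)]
    for j, cs in enumerate(consensus_sets):
        for i in cs:
            ps[i].add(j)
    return ps
```

For instance, with consensus sets `[{0, 1, 2}, {2, 3}, set()]` over five points, point 2 gets the preference set `{0, 1}` and point 4 gets an empty one.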
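Jaccard distance • Models are extracted by agglomerative clustering in the conceptual space using the Jaccard distance: $d_J(A, B) = \frac{|A \cup B| - |A \cap B|}{|A \cup B|}$ • Example (from the figure): $|A \cup B| = 8$ and $|A \cap B| = 2$, so $d_J = 6/8 = 0.75$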
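A minimal sketch of the Jaccard distance on Python sets follows. Treating two empty sets as being at distance 1 is an assumption made here so that points preferred by no model are never merged.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets: (|A u B| - |A n B|) / |A u B|."""
    union = a | b
    if not union:
        return 1.0  # assumption: two empty preference sets never merge
    return (len(union) - len(a & b)) / len(union)

# Sets chosen to match the slide's numbers: |A u B| = 8, |A n B| = 2
A = {1, 2, 3, 4, 5}
B = {4, 5, 6, 7, 8}
print(jaccard_distance(A, B))  # 0.75
```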
J-linkage • Define the PS of a cluster as the intersection of the PSs of all its points. • Start with one cluster for each point. • Pick the two clusters with the smallest J-distance and merge them. • Repeat the previous step while the smallest J-distance is < 1. • Postcondition: all pairs of clusters have distance 1 (their PSs do not overlap).
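The steps above can be sketched as a naive agglomerative loop. This version recomputes all pairwise distances at every merge, so it is only meant to illustrate the procedure on small inputs; the names and the tie-breaking rule (first pair found wins) are assumptions of the sketch.

```python
def jaccard_distance(a, b):
    union = a | b
    if not union:
        return 1.0  # assumption: empty preference sets never merge
    return (len(union) - len(a & b)) / len(union)

def j_linkage(pref_sets):
    """Cluster points by their preference sets.

    pref_sets[i] is the preference set of point i.
    Returns a list of clusters, each a sorted list of point indices.
    """
    # Each cluster is (member indices, preference set of the cluster).
    clusters = [([i], set(ps)) for i, ps in enumerate(pref_sets)]
    while True:
        best_d, best_u, best_v = 1.0, None, None
        for u in range(len(clusters)):
            for v in range(u + 1, len(clusters)):
                d = jaccard_distance(clusters[u][1], clusters[v][1])
                if d < best_d:
                    best_d, best_u, best_v = d, u, v
        if best_u is None:
            break  # every pair is at distance 1: postcondition reached
        merged = (clusters[best_u][0] + clusters[best_v][0],
                  clusters[best_u][1] & clusters[best_v][1])  # PS = intersection
        clusters = [c for k, c in enumerate(clusters)
                    if k not in (best_u, best_v)] + [merged]
    return [sorted(members) for members, _ in clusters]
```

With preference sets `[{0}, {0}, {0, 1}, {1}, {1}, set()]`, points 0 to 2 end up in one cluster, points 3 and 4 in another, and the last point (preferred by no model) stays alone. Point 2 is fitted by both models, so which cluster it joins depends on the tie-breaking rule.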
J-linkage: properties • For each cluster, there is a model that fits all its points, otherwise they would have distance = 1. • A model cannot fit all the points of two clusters, otherwise they would have distance < 1. • [Figure: points of cluster 1 and cluster 2, with a model that fits all the points of cluster 1, a model that fits all the points of cluster 2, and a model that fits all the points of clusters 1 & 2]
How many clusters? • Outliers emerge as small clusters. • If the number M of models is known, the largest M clusters are retained. • If the overall number of inliers can be estimated, the largest clusters up to the number of inliers are retained. • Model selection tools can help to solve this issue.
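The two retention rules can be sketched as follows. The second function encodes one possible reading of "up to the number of inliers" (keep adding the largest clusters until the estimated inlier count is covered); that reading and the function names are assumptions.

```python
def keep_largest(clusters, m):
    """Keep the m largest clusters (the number of models m is known)."""
    return sorted(clusters, key=len, reverse=True)[:m]

def keep_up_to_inliers(clusters, n_inliers):
    """Keep the largest clusters until they cover the estimated inlier count."""
    kept, total = [], 0
    for cluster in sorted(clusters, key=len, reverse=True):
        if total >= n_inliers:
            break
        kept.append(cluster)
        total += len(cluster)
    return kept
```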
What about the inlier threshold? • We presented [ICIAP 09] a technique based on clustering validation that is able to automatically select the “just right” threshold.
Continuous relaxation • The voting function in J-linkage is a step function (the indicator function of the inlier band). • Idea: choose a soft voting function with values in [0,1], where r is the residual and the time constant τ plays the role of the inlier threshold.
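The slide's formula for the soft voting function did not survive extraction. The sketch below assumes an exponential decay exp(-r/τ), which is consistent with the term "time constant", truncated to zero beyond a cutoff so that far-away points give exactly no vote. Both the functional form and the cutoff of 5τ are assumptions, not taken from the slide.

```python
import math

def hard_vote(r, eps):
    """Step voting function of plain J-linkage: 1 inside the inlier band."""
    return 1.0 if r < eps else 0.0

def soft_vote(r, tau, cutoff=5.0):
    """Assumed soft voting function: exp(-r / tau), zero beyond cutoff * tau."""
    return math.exp(-r / tau) if r < cutoff * tau else 0.0
```

A residual of zero gives a vote of 1, a residual equal to τ gives about 0.37, and anything at or beyond the cutoff gives 0.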
Continuous relaxation • Instead of the characteristic function of the preference set, the preference vector of a point now has entries in [0,1], as produced by the soft voting function. • The Jaccard distance is generalized by the Tanimoto distance: $d_T(p, q) = 1 - \frac{\langle p, q \rangle}{\|p\|^2 + \|q\|^2 - \langle p, q \rangle}$ • where p, q are the preference vectors
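A short sketch of the Tanimoto distance in its standard form, 1 - ⟨p,q⟩ / (‖p‖² + ‖q‖² - ⟨p,q⟩), shows that it reduces to the Jaccard distance on 0/1 vectors. Returning 1 for two all-zero vectors is an assumption of the sketch.

```python
def tanimoto_distance(p, q):
    """Tanimoto distance between two preference vectors of equal length."""
    pq = sum(a * b for a, b in zip(p, q))
    pp = sum(a * a for a in p)
    qq = sum(b * b for b in q)
    denom = pp + qq - pq
    if denom == 0:
        return 1.0  # assumption: two all-zero vectors never merge
    return 1.0 - pq / denom

# Binary vectors with |A| = |B| = 5 and |A n B| = 2, as in the Jaccard example
p = [1, 1, 1, 1, 1, 0, 0, 0]
q = [0, 0, 0, 1, 1, 1, 1, 1]
print(tanimoto_distance(p, q))  # 0.75
```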
Continuous relaxation • The preference vector of a cluster is obtained as the componentwise minimum among the preference vectors of its points (this generalizes the logical AND of binary preference vectors). • The soft J-linkage proceeds as its discrete version. • Postconditions: • At least one model in a cluster has votes > 0 from all the points in the cluster (it fits all the points). • A model cannot have votes > 0 from all the points in two separate clusters.
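The cluster update of the soft version can be sketched in one function; the name is hypothetical. On 0/1 vectors the componentwise minimum coincides with the logical AND, which is the intersection of preference sets used by the discrete algorithm.

```python
def cluster_preference(vectors):
    """Componentwise minimum of the preference vectors of a cluster's points."""
    return [min(column) for column in zip(*vectors)]
```

A model keeps a positive entry in the cluster vector only if every point in the cluster gives it a positive vote, which is what the first postcondition relies on.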
Experiments • Each model consists of 50 inliers, corrupted by variable Gaussian noise and a variable percentage of outliers. • Compared to: sequential RANSAC, multiRANSAC, residual histogram analysis (RHA) and Mean-Shift (MS). • Same samples. • Same inlier threshold. • Parameters needed by MS (bandwidth) and by RHA have been optimized manually. • Number of models is given.
Real data • The motivation for this work is fitting 3D primitives (planes, cylinders) to clouds of 3D points provided by a structure-and-motion (SaM) reconstruction pipeline.