Fitting multiple structures to geometric data: the J-linkage approach

Roberto Toldo and Andrea Fusiello • University of Verona • University of Udine • Big Data - Udine 5/6/2013

The problem • Fitting multiple instances of a model to data corrupted by noise and outliers

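Two types of outliers • Gross outliers: points that belong to no model • Pseudo-outliers: inliers of one model, which act as outliers with respect to every other model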

Previous work • Sequential RANSAC: theoretically wrong • MultiRANSAC [Zuliani et al. ICIP05]: problems with intersecting models • Residual Histogram Analysis [Zhang & Kosecka ECCV06]: peak finding is unreliable • Mode finding in parameter space: Randomized Hough Transform (discretization is critical), Mean Shift clustering (not robust enough)

Random sample consensus (RANSAC) • Draw minimal sample sets (MSS) from the data points • Fit a model to each MSS • Build the consensus set of each model: the set of points whose distance to the model is below a given threshold (the inlier band) • Select the model with the largest consensus set
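
The four steps above can be sketched for the simplest case, a 2D line, where an MSS has two points. This is a minimal illustration rather than the authors' code; the function names (`fit_line`, `residual`, `ransac_line`) and the fixed random seed are assumptions made for the example.

```python
import math
import random

def fit_line(p, q):
    """Line through two points as (a, b, c), with a*x + b*y + c = 0 and a^2 + b^2 = 1."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    norm = math.hypot(a, b)
    if norm == 0:
        return None  # degenerate sample: the two points coincide
    return a / norm, b / norm, -(a * x1 + b * y1) / norm

def residual(line, point):
    """Orthogonal distance from a point to a normalized line."""
    a, b, c = line
    return abs(a * point[0] + b * point[1] + c)

def ransac_line(points, threshold, n_samples, rng=None):
    """Return (best_line, consensus_set) over n_samples random minimal sample sets."""
    rng = rng or random.Random(0)  # fixed seed only to make the sketch repeatable
    best_line, best_cs = None, []
    for _ in range(n_samples):
        mss = rng.sample(points, 2)  # minimal sample set for a line
        line = fit_line(*mss)
        if line is None:
            continue
        # Consensus set: indices of the points inside the inlier band.
        cs = [i for i, p in enumerate(points) if residual(line, p) < threshold]
        if len(cs) > len(best_cs):
            best_line, best_cs = line, cs
    return best_line, best_cs
```

Calling `ransac_line(points, threshold=0.1, n_samples=200)` returns a single line and the indices of its inliers, which is why plain RANSAC has to be extended before it can recover several structures.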

Random sample consensus (RANSAC) • The number of MSS to be drawn must be large enough to guarantee that at least one outlier-free MSS is selected with high probability • The assumption is that an outlier-free MSS will achieve the highest consensus, because inliers are structured whereas outliers are random • RANSAC looks at the problem from the model's viewpoint
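
The requirement on the number of MSS is usually quantified with the standard RANSAC bound N >= log(1 - p) / log(1 - w^k), where w is the inlier fraction, k the MSS size and p the desired confidence. The slide does not state this formula, so the sketch below is an assumption about how the count might be chosen; `required_samples` is a hypothetical helper name.

```python
import math

def required_samples(inlier_ratio, mss_size, confidence=0.99):
    """Smallest N such that at least one of N minimal sample sets is
    outlier-free with probability >= confidence (independent draws assumed)."""
    p_clean = inlier_ratio ** mss_size  # chance that a single MSS is outlier-free
    if p_clean <= 0.0:
        raise ValueError("no outlier-free sample is possible")
    if p_clean >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_clean))
```

With several models in the data, the inlier fraction of any one model is small because the points of the other models count as pseudo-outliers, so the required number of samples grows quickly.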

The preference set • Let us look at the problem from the perspective of the points • The Preference Set (PS) of a point is the set of models that fit it, i.e. the models whose consensus set (CS) contains the point • [Figure labels: CS of model j; PS of point i]
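
A small sketch may help to see that the consensus set and the preference set are two views of the same point-to-model residuals. The layout `residuals[i][j]` (distance of point i from model j) and the function names are assumptions made for this illustration.

```python
def preference_sets(residuals, threshold):
    """One set of model indices per point: the models whose inlier band contains it."""
    return [
        {j for j, r in enumerate(row) if r < threshold}
        for row in residuals
    ]

def consensus_sets(residuals, threshold):
    """One set of point indices per model: the same data read model by model."""
    n_models = len(residuals[0]) if residuals else 0
    return [
        {i for i, row in enumerate(residuals) if row[j] < threshold}
        for j in range(n_models)
    ]
```

Point i is in the consensus set of model j exactly when model j is in the preference set of point i.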

Conceptual representation • The PS (or its characteristic function) is a conceptual representation of a point • In Pattern Recognition, the conceptual representation of an object x given C classes is: $[\,P(x \mid \text{class } 1) \;\cdots\; P(x \mid \text{class } C)\,]$ • Conjecture: points belonging to the same model have "similar" conceptual representations • In other words, they cluster in the conceptual space

Jaccard distance • Models are extracted by agglomerative clustering in the conceptual space using the Jaccard distance: $d_J(A, B) = \frac{|A \cup B| - |A \cap B|}{|A \cup B|}$ • Example: $|A \cup B| = 8$ and $|A \cap B| = 2$ give $d_J = 6/8 = 0.75$
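
The distance is easy to compute on Python sets. The sketch below is only an illustration; giving two empty sets distance 1 is an assumed convention, chosen so that points preferring no model are never treated as similar.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets: (|A u B| - |A n B|) / |A u B|."""
    union = len(a | b)
    if union == 0:
        return 1.0  # assumed convention: two empty preference sets share no model
    return (union - len(a & b)) / union
```

Two sets with a union of 8 elements and an intersection of 2, such as `{1, 2, 3, 4, 5}` and `{4, 5, 6, 7, 8}`, give 0.75; identical non-empty sets give 0 and disjoint sets give 1.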

J-linkage • Define the PS of a cluster as the intersection of the PS of all its points; the J-distance between two clusters is the Jaccard distance between their PS • Start with one cluster for each point • Pick the two clusters with the smallest J-distance and merge them • Repeat the previous step while the smallest J-distance is < 1 • Postcondition: all the clusters have distance 1 from each other (their PS do not overlap)
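
The procedure above can be written down directly. The following is a naive sketch that recomputes all pairwise distances at every merge, not an optimized or official implementation; `j_linkage` and its return format are assumptions made for the example, and two empty preference sets are given distance 1 so that they never merge.

```python
def jaccard_distance(a, b):
    union = len(a | b)
    if union == 0:
        return 1.0  # assumed convention for two empty preference sets
    return (union - len(a & b)) / union

def j_linkage(pref_sets):
    """Agglomerative clustering of points from their preference sets.
    Returns a list of (point_indices, cluster_preference_set) pairs."""
    # Start with one cluster per point.
    clusters = [([i], set(ps)) for i, ps in enumerate(pref_sets)]
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest J-distance below 1.
        best = (1.0, None, None)
        for u in range(len(clusters)):
            for v in range(u + 1, len(clusters)):
                d = jaccard_distance(clusters[u][1], clusters[v][1])
                if d < best[0]:
                    best = (d, u, v)
        _, u, v = best
        if u is None:
            break  # every pair is at distance 1: no preference sets overlap
        # Merge: union of the points, intersection of the preference sets.
        members = clusters[u][0] + clusters[v][0]
        pref = clusters[u][1] & clusters[v][1]
        clusters = [c for k, c in enumerate(clusters) if k not in (u, v)]
        clusters.append((members, pref))
    return clusters
```

For example, `j_linkage([{0, 1}, {0, 1}, {0}, {2}, {2, 3}, set()])` groups points 0, 1, 2 around model 0 and points 3, 4 around model 2, and leaves the last point, which prefers no model, in a cluster of its own.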

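J-linkage: properties • For each cluster, there is a model that fits all its points; otherwise the PS of the merged clusters would not overlap, their distance would be 1, and they would not have been merged • A model cannot fit all the points of two clusters; otherwise their PS would overlap, their distance would be < 1, and they would have been merged • [Figure labels: points; cluster 1; cluster 2; model that fits all the points of cluster 1; model that fits all the points of cluster 2; model that fits all the points of clusters 1 & 2]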

How many clusters? • Outliers emerge as small clusters • If the number M of models is known, the largest M clusters are retained • If the overall number of inliers can be estimated, the largest clusters up to the number of inliers are retained • Model selection tools can help to solve this issue
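
Both retention rules amount to sorting the clusters by size. The sketch below assumes each cluster is a list of point indices; the function names are hypothetical, and "up to the number of inliers" is read here as "keep adding clusters until the estimated inlier count is covered", which is one possible interpretation.

```python
def largest_clusters(clusters, n_models):
    """Keep the n_models largest clusters."""
    return sorted(clusters, key=len, reverse=True)[:n_models]

def clusters_up_to_inliers(clusters, n_inliers):
    """Keep the largest clusters until they account for n_inliers points."""
    kept, covered = [], 0
    for cluster in sorted(clusters, key=len, reverse=True):
        if covered >= n_inliers:
            break
        kept.append(cluster)
        covered += len(cluster)
    return kept
```

The clusters that are discarded are the small ones, which is where the slide expects the outliers to end up.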

What about the inlier threshold? • We presented [ICIAP 09] a technique based on clustering validation that is able to automatically select the “just right” threshold.

Continuous relaxation • The voting function in J-linkage is a step function (the indicator function of the inlier band) • Idea: choose a soft voting function with values in [0,1] • r is the residual • the time constant τ plays the role of the inlier threshold
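
The formula of the soft voting function did not survive in this transcript. Since the slide speaks of a "time constant" τ, the sketch below assumes an exponential decay exp(-r / τ); the truncation to zero beyond `cutoff * tau` and the parameter name `cutoff` are further assumptions, added so that sufficiently distant models receive exactly zero votes.

```python
import math

def hard_vote(r, threshold):
    """Step voting function of the original J-linkage: 1 inside the inlier band."""
    return 1.0 if r < threshold else 0.0

def soft_vote(r, tau, cutoff=5.0):
    """Assumed soft voting function: exp(-r / tau), truncated to 0 for large residuals."""
    if r >= cutoff * tau:
        return 0.0  # hypothetical truncation, not stated in the slides
    return math.exp(-r / tau)
```

Under this assumption a point lying exactly on a model votes 1 for it, and the vote decays smoothly as the residual grows instead of dropping abruptly at the inlier threshold.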

Continuous relaxation • Instead of the characteristic function of the preference set, the preference vector of a point now has entries in [0,1], as produced by the soft voting function • The Jaccard distance is generalized by the Tanimoto distance: $d_T(p, q) = 1 - \frac{\langle p, q \rangle}{\|p\|^2 + \|q\|^2 - \langle p, q \rangle}$ • where p, q are the preference vectors
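
A short sketch of the Tanimoto distance in its standard form, 1 - <p, q> / (|p|^2 + |q|^2 - <p, q>), shows why it generalizes the Jaccard distance: on 0/1 vectors the inner product counts the intersection and the denominator counts the union. Returning 1 for two all-zero vectors is an assumed convention.

```python
def tanimoto_distance(p, q):
    """Tanimoto distance between two equal-length preference vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    denom = sum(a * a for a in p) + sum(b * b for b in q) - dot
    if denom == 0:
        return 1.0  # assumed convention: two all-zero vectors share no model
    return 1.0 - dot / denom
```

On the binary vectors `[1, 1, 1, 1, 1, 0, 0, 0]` and `[0, 0, 0, 1, 1, 1, 1, 1]` the result is 0.75, the Jaccard distance of the corresponding sets.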

Continuous relaxation • The preference vector of a cluster is obtained as the component-wise minimum among the preference vectors of its points (this generalizes the logical AND of binary preference vectors) • The soft J-linkage proceeds as its discrete version • Postconditions: • At least one model in a cluster has votes > 0 from all the points in the cluster (it fits all the points) • A model cannot have votes > 0 from all the points in two separate clusters
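
The component-wise minimum is a one-liner, and on 0/1 vectors it coincides with the logical AND used in the discrete version. The names `cluster_preference` and `fits_all_points` are assumptions made for this sketch.

```python
def cluster_preference(vectors):
    """Component-wise minimum of the preference vectors of a cluster's points."""
    return [min(column) for column in zip(*vectors)]

def fits_all_points(vectors):
    """True if at least one model receives a vote > 0 from every point."""
    return any(v > 0 for v in cluster_preference(vectors))
```

A model keeps a positive entry in the cluster vector only if every point in the cluster gives it a positive vote, which is the first postcondition stated above.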

Experiments • Each model consists of 50 inliers, corrupted by a variable amount of Gaussian noise and a variable percentage of outliers • Compared to: sequential RANSAC, MultiRANSAC, Residual Histogram Analysis (RHA) and Mean Shift (MS) • All methods use the same samples • All methods use the same inlier threshold • The parameters needed by MS (bandwidth) and by RHA have been optimized manually • The number of models is given

Real data • The motivation for this work is fitting 3D primitives (planes, cylinders) to clouds of 3D points provided by a structure-and-motion (SaM) reconstruction pipeline
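
For planes, the two model-specific ingredients needed by the sampling step are a fit from a minimal sample set of three points and a point-to-plane residual. The sketch below is a generic illustration under that assumption, not the authors' pipeline; the function names are hypothetical.

```python
import math

def plane_from_points(p1, p2, p3):
    """Plane through three 3D points as (n, d), with unit normal n and n . x + d = 0."""
    u = [b - a for a, b in zip(p1, p2)]
    v = [b - a for a, b in zip(p1, p3)]
    # The normal is the cross product of two edge vectors of the triangle.
    n = [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]
    norm = math.sqrt(sum(c * c for c in n))
    if norm == 0:
        return None  # degenerate sample: the three points are collinear
    n = [c / norm for c in n]
    d = -sum(c * x for c, x in zip(n, p1))
    return n, d

def plane_residual(plane, point):
    """Orthogonal distance from a 3D point to the plane."""
    n, d = plane
    return abs(sum(c * x for c, x in zip(n, point)) + d)
```

Residuals computed this way against many sampled planes give the preference set of each 3D point.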

Plane fitting

Questions?
