
Point-set registration

In computer vision, pattern recognition, and robotics, point-set registration, also known as point-cloud registration or scan matching, is the process of finding a spatial transformation (e.g., scaling, rotation and translation) that aligns two point clouds. The purpose of finding such a transformation includes merging multiple data sets into a globally consistent model (or coordinate frame), and mapping a new measurement to a known data set to identify features or to estimate its pose. Raw 3D point cloud data are typically obtained from Lidars and RGB-D cameras. 3D point clouds can also be generated from computer vision algorithms such as triangulation, bundle adjustment, and more recently, monocular image depth estimation using deep learning. For 2D point set registration used in image processing and feature-based image registration, a point set may be 2D pixel coordinates obtained by feature extraction from an image, for example corner detection. Point cloud registration has extensive applications in autonomous driving,[1] motion estimation and 3D reconstruction,[2] object detection and pose estimation,[3][4] robotic manipulation,[5] simultaneous localization and mapping (SLAM),[6][7] panorama stitching,[8] virtual and augmented reality,[9] and medical imaging.[10]

Point set registration is the process of aligning two point sets. Here, the blue fish is being registered to the red fish.

As a special case, registration of two point sets that only differ by a 3D rotation (i.e., there is no scaling and translation) is called the Wahba problem and is also related to the orthogonal Procrustes problem.

Formulation

 
Data from two 3D scans of the same environment need to be aligned using point set registration.
 
Data from above, registered successfully using a variant of iterative closest point.

The problem may be summarized as follows:[11] Let $\{\mathcal{M}, \mathcal{S}\}$ be two finite size point sets in a finite-dimensional real vector space $\mathbb{R}^d$, which contain $M$ and $N$ points respectively (e.g., $d = 3$ recovers the typical case of when $\mathcal{M}$ and $\mathcal{S}$ are 3D point sets). The problem is to find a transformation to be applied to the moving "model" point set $\mathcal{M}$ such that the difference (typically defined in the sense of point-wise Euclidean distance) between $\mathcal{M}$ and the static "scene" set $\mathcal{S}$ is minimized. In other words, a mapping from $\mathbb{R}^d$ to $\mathbb{R}^d$ is desired which yields the best alignment between the transformed "model" set and the "scene" set. The mapping may consist of a rigid or non-rigid transformation. The transformation model may be written as $T$, using which the transformed, registered model point set is:

$T(\mathcal{M})$     (1)

The output of a point set registration algorithm is therefore the optimal transformation $T^\star$ such that $\mathcal{M}$ is best aligned to $\mathcal{S}$, according to some defined notion of distance function $\operatorname{dist}(\cdot, \cdot)$:

$T^\star = \underset{T \in \mathcal{T}}{\arg\min}\; \operatorname{dist}(T(\mathcal{M}), \mathcal{S})$     (2)

where $\mathcal{T}$ is used to denote the set of all possible transformations that the optimization tries to search for. The most popular choice of the distance function is to take the square of the Euclidean distance for every pair of points:

$\operatorname{dist}(T(\mathcal{M}), \mathcal{S}) = \sum_{m \in T(\mathcal{M})} \lVert m - s_m \rVert_2^2, \quad s_m = \underset{s \in \mathcal{S}}{\arg\min}\; \lVert s - m \rVert_2^2$     (3)

where $\lVert \cdot \rVert_2$ denotes the vector 2-norm, and $s_m$ is the corresponding point in set $\mathcal{S}$ that attains the shortest distance to a given point $m$ in set $\mathcal{M}$ after transformation. Minimizing such a function in rigid registration is equivalent to solving a least squares problem.
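As a concrete illustration (not part of the original article), the following minimal Python/NumPy sketch evaluates the cost in Equation (3): each transformed model point is matched to its nearest scene point with a k-d tree and the squared distances are summed. The names model, scene and transform are placeholders.

import numpy as np
from scipy.spatial import cKDTree

def registration_cost(model, scene, transform):
    # model: (M, 3) array, scene: (N, 3) array,
    # transform: callable mapping an (M, 3) array to an (M, 3) array, i.e. T(.)
    moved = transform(model)          # T(M)
    tree = cKDTree(scene)             # nearest-neighbour queries into S
    dists, _ = tree.query(moved)      # ||m - s_m||_2 for every transformed point
    return np.sum(dists ** 2)         # Equation (3)

# Example: cost of a pure-translation hypothesis
# cost = registration_cost(model, scene, lambda P: P + np.array([0.1, 0.0, 0.0]))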

Types of algorithms

When the correspondences (i.e., $s_m \leftrightarrow m$) are given before the optimization, for example, using feature matching techniques, then the optimization only needs to estimate the transformation. This type of registration is called correspondence-based registration. On the other hand, if the correspondences are unknown, then the optimization is required to jointly find out the correspondences and transformation together. This type of registration is called simultaneous pose and correspondence registration.

Rigid registration

Given two point sets, rigid registration yields a rigid transformation which maps one point set to the other. A rigid transformation is defined as a transformation that does not change the distance between any two points. Typically such a transformation consists of translation and rotation.[12] In rare cases, the point set may also be mirrored. In robotics and computer vision, rigid registration has the most applications.

Non-rigid registration

 
Registered point cloud from a lidar mounted on a moving car.

Given two point sets, non-rigid registration yields a non-rigid transformation which maps one point set to the other. Non-rigid transformations include affine transformations such as scaling and shear mapping. However, in the context of point set registration, non-rigid registration typically involves nonlinear transformation. If the eigenmodes of variation of the point set are known, the nonlinear transformation may be parametrized by the eigenvalues.[13] A nonlinear transformation may also be parametrized as a thin plate spline.[14][13]

Other types

Some approaches to point set registration use algorithms that solve the more general graph matching problem.[11] However, the computational complexity of such methods tends to be high, and they are limited to rigid registrations. The remainder of this article considers only algorithms for rigid registration, where the transformation is assumed to contain 3D rotations and translations (possibly also including a uniform scaling).

The PCL (Point Cloud Library) is an open-source framework for n-dimensional point cloud and 3D geometry processing. It includes several point registration algorithms.[15]

Correspondence-based registration

Correspondence-based methods assume the putative correspondences $m \leftrightarrow s_m$ are given for every point $m \in \mathcal{M}$. Therefore, we arrive at a setting where both point sets $\mathcal{M}$ and $\mathcal{S}$ have $N$ points and the correspondences $m_i \leftrightarrow s_i$, $i = 1, \dots, N$ are given.

Outlier-free registration

In the simplest case, one can assume that all the correspondences are correct, meaning that the points $m_i, s_i \in \mathbb{R}^3$ are generated as follows:

$s_i = lRm_i + t + \epsilon_i, \quad i = 1, \dots, N$     (cb.1)

where $l > 0$ is a uniform scaling factor (in many cases $l = 1$ is assumed), $R \in \text{SO}(3)$ is a proper 3D rotation matrix ($\text{SO}(d)$ is the special orthogonal group of degree $d$), $t \in \mathbb{R}^3$ is a 3D translation vector and $\epsilon_i \in \mathbb{R}^3$ models the unknown additive noise (e.g., Gaussian noise). Specifically, if the noise $\epsilon_i$ is assumed to follow a zero-mean isotropic Gaussian distribution with standard deviation $\sigma_i$, i.e., $\epsilon_i \sim \mathcal{N}(0, \sigma_i^2 I_3)$, then the following optimization can be shown to yield the maximum likelihood estimate for the unknown scale, rotation and translation:

$(l^\star, R^\star, t^\star) = \underset{l > 0,\, R \in \text{SO}(3),\, t \in \mathbb{R}^3}{\arg\min}\; \sum_{i=1}^{N} \frac{1}{\sigma_i^2} \left\lVert s_i - lRm_i - t \right\rVert_2^2$     (cb.2)

Note that when the scaling factor is 1 and the translation vector is zero, then the optimization recovers the formulation of the Wahba problem. Despite the non-convexity of the optimization (cb.2) due to non-convexity of the set $\text{SO}(3)$, seminal work by Berthold K.P. Horn showed that (cb.2) actually admits a closed-form solution, by decoupling the estimation of scale, rotation and translation.[16] Similar results were discovered by Arun et al.[17] In addition, in order to find a unique transformation $(l, R, t)$, at least $N = 3$ non-collinear points in each point set are required.
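For illustration, the following Python/NumPy sketch (not from the original article) computes the closed-form rotation and translation in the spirit of the SVD-based solution of Arun et al. for the known-correspondence, outlier-free case with $l = 1$; Horn's quaternion-based solution yields the same optimum, and the scale can be recovered separately.

import numpy as np

def arun_horn_rigid(model, scene):
    # model, scene: (N, 3) arrays of corresponding points (row i <-> row i).
    # Returns (R, t) minimizing sum_i ||scene_i - R @ model_i - t||^2.
    mu_m, mu_s = model.mean(axis=0), scene.mean(axis=0)
    Mc, Sc = model - mu_m, scene - mu_s              # centred point sets
    H = Mc.T @ Sc                                    # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                               # proper rotation, det(R) = +1
    t = mu_s - R @ mu_m
    return R, t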

More recently, Briales and Gonzalez-Jimenez have developed a semidefinite relaxation using Lagrangian duality, for the case where the model set $\mathcal{M}$ contains different 3D primitives such as points, lines and planes (which is the case when the model $\mathcal{M}$ is a 3D mesh).[18] Interestingly, the semidefinite relaxation is empirically tight, i.e., a certifiably globally optimal solution can be extracted from the solution of the semidefinite relaxation.

Robust registration

The least squares formulation (cb.2) is known to perform arbitrarily badly in the presence of outliers. An outlier correspondence is a pair of measurements $s_i \leftrightarrow m_i$ that departs from the generative model (cb.1). In this case, one can consider a different generative model as follows:[19]

$s_i = \begin{cases} lRm_i + t + \epsilon_i & \text{if the } i\text{th pair is an inlier} \\ o_i & \text{if the } i\text{th pair is an outlier} \end{cases}$     (cb.3)

where if the $i$th pair $s_i \leftrightarrow m_i$ is an inlier, then it obeys the outlier-free model (cb.1), i.e., $s_i$ is obtained from $m_i$ by a spatial transformation plus some small noise; however, if the $i$th pair $s_i \leftrightarrow m_i$ is an outlier, then $s_i$ can be any arbitrary vector $o_i$. Since one does not know which correspondences are outliers beforehand, robust registration under the generative model (cb.3) is of paramount importance for computer vision and robotics deployed in the real world, because current feature matching techniques tend to output highly corrupted correspondences where over $95\%$ of the correspondences can be outliers.[20]

Next, we describe several common paradigms for robust registration.

Maximum consensus

Maximum consensus seeks to find the largest set of correspondences that are consistent with the generative model (cb.1) for some choice of spatial transformation $(l, R, t)$. Formally speaking, maximum consensus solves the following optimization:

$\underset{l > 0,\, R \in \text{SO}(3),\, t \in \mathbb{R}^3,\, \mathcal{I}}{\max}\; \lvert \mathcal{I} \rvert \quad \text{subject to} \quad \frac{1}{\sigma_i^2}\lVert s_i - lRm_i - t \rVert_2^2 \leq \xi,\ \forall i \in \mathcal{I}$     (cb.4)

where $\lvert \mathcal{I} \rvert$ denotes the cardinality of the set $\mathcal{I}$. The constraint in (cb.4) enforces that every pair of measurements in the inlier set $\mathcal{I}$ must have residuals smaller than a pre-defined threshold $\xi$. Unfortunately, recent analyses have shown that globally solving problem (cb.4) is NP-hard, and global algorithms typically have to resort to branch-and-bound (BnB) techniques that take exponential-time complexity in the worst case.[21][22][23][24][25]

Although solving consensus maximization exactly is hard, there exist efficient heuristics that perform quite well in practice. One of the most popular heuristics is the Random Sample Consensus (RANSAC) scheme.[26] RANSAC is an iterative hypothesize-and-verify method. At each iteration, the method first randomly samples 3 out of the total number of $N$ correspondences and computes a hypothesis $(l, R, t)$ using Horn's method,[16] then the method evaluates the constraints in (cb.4) to count how many correspondences actually agree with such a hypothesis (i.e., it computes the residual $\lVert s_i - lRm_i - t \rVert_2^2 / \sigma_i^2$ and compares it with the threshold $\xi$ for each pair of measurements). The algorithm terminates either after it has found a consensus set that has enough correspondences, or after it has reached the total number of allowed iterations. RANSAC is highly efficient because the main computation of each iteration is carrying out the closed-form solution in Horn's method. However, RANSAC is non-deterministic and only works well in the low-outlier-ratio regime (e.g., below $50\%$), because its runtime grows exponentially with respect to the outlier ratio.[20]
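A minimal Python sketch of the RANSAC loop described above (illustrative, not the original pseudocode of [26]); it assumes the arun_horn_rigid helper from the earlier sketch is in scope and fixes the scale to $l = 1$.

import numpy as np

def ransac_rigid(model, scene, num_iters=1000, sigma=0.01, xi=1.0):
    # model, scene: (N, 3) arrays of putative correspondences (row i <-> row i).
    N = model.shape[0]
    rng = np.random.default_rng(0)
    best = (np.eye(3), np.zeros(3), np.array([], dtype=int))
    for _ in range(num_iters):
        sample = rng.choice(N, size=3, replace=False)          # minimal 3-point sample
        R, t = arun_horn_rigid(model[sample], scene[sample])   # hypothesis
        res = np.sum((scene - model @ R.T - t) ** 2, axis=1) / sigma ** 2
        inliers = np.flatnonzero(res <= xi)                    # consensus set, cf. (cb.4)
        if inliers.size > best[2].size:
            best = (R, t, inliers)
    R, t, inliers = best
    if inliers.size >= 3:                                      # refit on the final consensus set
        R, t = arun_horn_rigid(model[inliers], scene[inliers])
    return R, t, inliers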

To fill the gap between the fast but inexact RANSAC scheme and the exact but exhaustive BnB optimization, recent research has developed deterministic approximate methods to solve consensus maximization.[21][22][27][23]

Outlier removal

Outlier removal methods seek to pre-process the set of highly corrupted correspondences before estimating the spatial transformation. The motivation of outlier removal is to significantly reduce the number of outlier correspondences, while maintaining inlier correspondences, so that optimization over the transformation becomes easier and more efficient (e.g., RANSAC works poorly when the outlier ratio is above $95\%$ but performs quite well when the outlier ratio is below $50\%$).

Parra et al. have proposed a method called Guaranteed Outlier Removal (GORE) that uses geometric constraints to prune outlier correspondences while guaranteeing to preserve inlier correspondences.[20] GORE has been shown to be able to drastically reduce the outlier ratio, which can significantly boost the performance of consensus maximization using RANSAC or BnB. Yang and Carlone have proposed to build pairwise translation-and-rotation-invariant measurements (TRIMs) from the original set of measurements and embed TRIMs as the edges of a graph whose nodes are the 3D points. Since inliers are pairwise consistent in terms of the scale, they must form a clique within the graph. Therefore, using efficient algorithms for computing the maximum clique of a graph can find the inliers and effectively prune the outliers.[4] The maximum clique based outlier removal method is also shown to be quite useful in real-world point set registration problems.[19] Similar outlier removal ideas were also proposed by Parra et al.[28]
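The following Python sketch illustrates the consistency-graph idea (assuming a known scale $l = 1$ and a user-chosen noise bound); it is not the authors' implementation, and exact maximum-clique search is exponential in the worst case.

import numpy as np
import networkx as nx

def max_clique_inliers(model, scene, noise_bound=0.05):
    # An edge (i, j) is added when the pairwise distances of the two
    # correspondences agree up to the noise bound; the maximum clique is
    # then read off as the putative inlier set.
    N = model.shape[0]
    G = nx.Graph()
    G.add_nodes_from(range(N))
    for i in range(N):
        for j in range(i + 1, N):
            dm = np.linalg.norm(model[i] - model[j])
            ds = np.linalg.norm(scene[i] - scene[j])
            if abs(ds - dm) <= 2 * noise_bound:      # pairwise invariant check
                G.add_edge(i, j)
    return max(nx.find_cliques(G), key=len)          # indices of putative inliers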

M-estimation

M-estimation replaces the least squares objective function in (cb.2) with a robust cost function that is less sensitive to outliers. Formally, M-estimation seeks to solve the following problem:

$(l^\star, R^\star, t^\star) = \underset{l > 0,\, R \in \text{SO}(3),\, t \in \mathbb{R}^3}{\arg\min}\; \sum_{i=1}^{N} \rho\left(\frac{1}{\sigma_i}\left\lVert s_i - lRm_i - t \right\rVert_2\right)$     (cb.5)

where $\rho(\cdot)$ represents the choice of the robust cost function. Note that choosing $\rho(x) = x^2$ recovers the least squares estimation in (cb.2). Popular robust cost functions include the $\ell_1$-norm loss, the Huber loss,[29] the Geman-McClure loss[30] and the truncated least squares loss.[19][8][4] M-estimation has been one of the most popular paradigms for robust estimation in robotics and computer vision.[31][32] Because robust objective functions are typically non-convex (e.g., the truncated least squares loss vs. the least squares loss), algorithms for solving the non-convex M-estimation are typically based on local optimization, where first an initial guess is provided, followed by iterative refinements of the transformation to keep decreasing the objective function. Local optimization tends to work well when the initial guess is close to the global minimum, but it is also prone to getting stuck in local minima if provided with a poor initialization.
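As an illustration of the local-optimization flavour of M-estimation, the following Python sketch performs iteratively reweighted least squares (IRLS) with a Huber loss, re-solving a weighted version of (cb.2) at each iteration (scale fixed to $l = 1$; the weighted closed-form solve mirrors the Arun/Horn sketch above). The parameter values are illustrative.

import numpy as np

def weighted_rigid(model, scene, w):
    # Weighted closed-form rigid fit: minimizes sum_i w_i ||s_i - R m_i - t||^2.
    w = w / w.sum()
    mu_m, mu_s = w @ model, w @ scene                 # weighted centroids
    Mc, Sc = model - mu_m, scene - mu_s
    H = Mc.T @ (w[:, None] * Sc)                      # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, mu_s - R @ mu_m

def irls_huber(model, scene, delta=0.1, num_iters=20):
    R, t = np.eye(3), np.zeros(3)
    for _ in range(num_iters):
        r = np.linalg.norm(scene - model @ R.T - t, axis=1)          # residuals
        w = np.where(r <= delta, 1.0, delta / np.maximum(r, 1e-12))  # Huber weights
        R, t = weighted_rigid(model, scene, w)
    return R, t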

Graduated non-convexity

Graduated non-convexity (GNC) is a general-purpose framework for solving non-convex optimization problems without initialization. It has achieved success in early vision and machine learning applications.[33][34] The key idea behind GNC is to solve the hard non-convex problem by starting from an easy convex problem. Specifically, for a given robust cost function $\rho(\cdot)$, one can construct a surrogate function $\rho_\mu(\cdot)$ with a hyper-parameter $\mu$, tuning which can gradually increase the non-convexity of the surrogate function $\rho_\mu(\cdot)$ until it converges to the target function $\rho(\cdot)$.[34][35] Therefore, at each level of the hyper-parameter $\mu$, the following optimization is solved (the same residuals as in (cb.5), with $\rho$ replaced by the surrogate $\rho_\mu$):

$(l^\star, R^\star, t^\star) = \underset{l > 0,\, R \in \text{SO}(3),\, t \in \mathbb{R}^3}{\arg\min}\; \sum_{i=1}^{N} \rho_\mu\left(\frac{1}{\sigma_i}\left\lVert s_i - lRm_i - t \right\rVert_2\right)$     (cb.6)

Black and Rangarajan proved that the objective function of each optimization (cb.6) can be dualized into a sum of weighted least squares and a so-called outlier process function on the weights that determine the confidence of the optimization in each pair of measurements.[33] Using Black-Rangarajan duality and GNC tailored for the Geman-McClure function, Zhou et al. developed the fast global registration algorithm that is robust against a large fraction of outliers in the correspondences.[30] More recently, Yang et al. showed that the joint use of GNC (tailored to the Geman-McClure function and the truncated least squares function) and Black-Rangarajan duality can lead to a general-purpose solver for robust registration problems, including point cloud and mesh registration.[35]
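A Python sketch of the GNC idea with a Geman-McClure cost, in the spirit of fast global registration: a weighted least squares step alternates with the Black-Rangarajan weight update while the control parameter is gradually shrunk to increase non-convexity. It reuses the weighted_rigid helper from the M-estimation sketch; the schedule and constants are illustrative, not the published ones.

import numpy as np

def gnc_geman_mcclure(model, scene, mu0=1.0, mu_min=1e-4, divisor=1.4, inner_iters=4):
    R, t = np.eye(3), np.zeros(3)
    mu = mu0                                          # large mu: nearly quadratic surrogate
    while mu > mu_min:
        for _ in range(inner_iters):
            r2 = np.sum((scene - model @ R.T - t) ** 2, axis=1)
            w = (mu / (mu + r2)) ** 2                 # outlier-process (line-process) weights
            R, t = weighted_rigid(model, scene, w)    # weighted least squares step
        mu /= divisor                                 # shrink mu: surrogate becomes more non-convex
    return R, t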

Certifiably robust registration

Almost none of the robust registration algorithms mentioned above (except the BnB algorithm, which runs in exponential time in the worst case) comes with performance guarantees, which means that these algorithms can return completely incorrect estimates without notice. Therefore, these algorithms are undesirable for safety-critical applications like autonomous driving.

Very recently, Yang et al. have developed the first certifiably robust registration algorithm, named Truncated least squares Estimation And SEmidefinite Relaxation (TEASER).[19] For point cloud registration, TEASER not only outputs an estimate of the transformation, but also quantifies the optimality of the given estimate. TEASER adopts the following truncated least squares (TLS) estimator:

$(l^\star, R^\star, t^\star) = \underset{l > 0,\, R \in \text{SO}(3),\, t \in \mathbb{R}^3}{\arg\min}\; \sum_{i=1}^{N} \min\left(\frac{1}{\sigma_i^2}\left\lVert s_i - lRm_i - t \right\rVert_2^2,\ \bar{c}^2\right)$     (cb.7)

which is obtained by choosing the TLS robust cost function $\rho(x) = \min(x^2, \bar{c}^2)$, where $\bar{c}$ is a pre-defined constant that determines the maximum allowed residual to be considered an inlier. The TLS objective function has the property that for inlier correspondences (residuals below $\bar{c}$), the usual least squares penalty is applied, while for outlier correspondences (residuals above $\bar{c}$), no further penalty is applied and the outliers are discarded. If the TLS optimization (cb.7) is solved to global optimality, then it is equivalent to running Horn's method on only the inlier correspondences.

However, solving (cb.7) is quite challenging due to its combinatorial nature. TEASER solves (cb.7) as follows: (i) it builds invariant measurements such that the estimation of scale, rotation and translation can be decoupled and solved separately, a strategy that is inspired by the original Horn's method; (ii) the same TLS estimation is applied to each of the three sub-problems, where the scale TLS problem can be solved exactly using an algorithm called adaptive voting, the rotation TLS problem can be relaxed to a semidefinite program (SDP) whose relaxation is exact in practice,[8] even with a large amount of outliers, and the translation TLS problem can be solved using component-wise adaptive voting. A fast implementation leveraging GNC has been released as open-source software. In practice, TEASER can tolerate a very high fraction of outlier correspondences and runs in milliseconds.
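The scale sub-problem gives a feel for adaptive voting. The Python sketch below (an illustration of the idea, not the TEASER implementation) takes scale measurements with per-measurement error bounds, scans the elementary intervals formed by the interval endpoints, and returns an estimate computed from the largest consensus set.

import numpy as np

def adaptive_voting_scale(scale_meas, alphas):
    # scale_meas: (K,) raw scale estimates, alphas: (K,) error bounds; measurement k
    # "votes" for the interval [scale_meas[k] - alphas[k], scale_meas[k] + alphas[k]].
    lo, hi = scale_meas - alphas, scale_meas + alphas
    endpoints = np.sort(np.concatenate([lo, hi]))
    best_est, best_count = scale_meas.mean(), 0
    for a, b in zip(endpoints[:-1], endpoints[1:]):
        mid = 0.5 * (a + b)                           # representative of one elementary interval
        inliers = (lo <= mid) & (mid <= hi)           # measurements consistent at this point
        if inliers.sum() > best_count:
            best_count = int(inliers.sum())
            best_est = scale_meas[inliers].mean()     # least squares estimate on the consensus
    return best_est, best_count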

In addition to developing TEASER, Yang et al. also prove that, under some mild conditions on the point cloud data, TEASER's estimated transformation has bounded errors from the ground-truth transformation.[19]

Simultaneous pose and correspondence registration

Iterative closest point

The iterative closest point (ICP) algorithm was introduced by Besl and McKay.[36] The algorithm performs rigid registration in an iterative fashion by alternating between (i) given the transformation, finding the closest point in $\mathcal{S}$ for every point in $\mathcal{M}$; and (ii) given the correspondences, finding the best rigid transformation by solving the least squares problem (cb.2). As such, it works best if the initial pose of $\mathcal{M}$ is sufficiently close to $\mathcal{S}$. In pseudocode, the basic algorithm is implemented as follows:

algorithm ICP(M, S)
    θ := θ0
    while not registered:
        X := ∅
        for mi ∊ T(M, θ):
            ŝj := closest point in S to mi
            X := X + ⟨mi, ŝj⟩
        θ := least_squares(X)
    return θ

Here, the function least_squares performs least squares optimization to minimize the distance in each of the pairs ⟨mi, ŝj⟩, using the closed-form solutions by Horn[16] and Arun.[17]

Because the cost function of registration depends on finding the closest point in $\mathcal{S}$ to every point in $\mathcal{M}$, it can change as the algorithm is running. As such, it is difficult to prove that ICP will in fact converge exactly to the local optimum.[37] In fact, empirically, ICP and EM-ICP do not converge to the local minimum of the cost function.[37] Nonetheless, because ICP is intuitive to understand and straightforward to implement, it remains the most commonly used point set registration algorithm.[37] Many variants of ICP have been proposed, affecting all phases of the algorithm from the selection and matching of points to the minimization strategy.[13][38] For example, the expectation maximization algorithm is applied to the ICP algorithm to form the EM-ICP method, and the Levenberg-Marquardt algorithm is applied to the ICP algorithm to form the LM-ICP method.[12]
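A compact, runnable Python version of the basic ICP loop above (illustrative; it reuses the arun_horn_rigid closed-form solve from the earlier sketch and assumes a reasonable initial pose):

import numpy as np
from scipy.spatial import cKDTree

def icp(model, scene, num_iters=50, tol=1e-6):
    R, t = np.eye(3), np.zeros(3)
    tree = cKDTree(scene)
    prev_err = np.inf
    for _ in range(num_iters):
        moved = model @ R.T + t
        dists, idx = tree.query(moved)                 # closest scene point for every model point
        R, t = arun_horn_rigid(model, scene[idx])      # closed-form least squares on current pairs
        err = np.mean(dists ** 2)
        if abs(prev_err - err) < tol:                  # stop when the alignment error stagnates
            break
        prev_err = err
    return R, t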

Robust point matching

Robust point matching (RPM) was introduced by Gold et al.[39] The method performs registration using deterministic annealing and soft assignment of correspondences between point sets. Whereas in ICP the correspondence generated by the nearest-neighbour heuristic is binary, RPM uses a soft correspondence where the correspondence between any two points can be anywhere from 0 to 1, although it ultimately converges to either 0 or 1. The correspondences found in RPM are always one-to-one, which is not always the case in ICP.[14] Let $m_i$ be the $i$th point in $\mathcal{M}$ and $s_j$ be the $j$th point in $\mathcal{S}$. The match matrix $\boldsymbol{\mu}$ is defined as such:

$\mu_{ji} = \begin{cases} 1 & \text{if point } m_i \text{ corresponds to point } s_j \\ 0 & \text{otherwise} \end{cases}$     (rpm.1)

The problem is then defined as: Given two point sets $\mathcal{M}$ and $\mathcal{S}$, find the affine transformation $T$ and the match matrix $\boldsymbol{\mu}$ that best relates them.[39] Knowing the optimal transformation makes it easy to determine the match matrix, and vice versa. However, the RPM algorithm determines both simultaneously. The transformation may be decomposed into a translation vector and a transformation matrix:

$T(m_i) = \mathbf{A} m_i + \mathbf{t}$

The matrix $\mathbf{A}$ in 2D is composed of four separate parameters $\lbrace a, \theta, b, c \rbrace$, which are scale, rotation, and the vertical and horizontal shear components respectively. The cost function is then:

$\operatorname{cost} = \sum_{j=1}^{N} \sum_{i=1}^{M} \mu_{ji} \lVert s_j - \mathbf{t} - \mathbf{A} m_i \rVert^2 + g(\mathbf{A}) - \alpha \sum_{j=1}^{N} \sum_{i=1}^{M} \mu_{ji}$     (rpm.2)

subject to $\forall i\ \sum_{j} \mu_{ji} \leq 1$, $\forall j\ \sum_{i} \mu_{ji} \leq 1$, $\mu_{ji} \in [0, 1]$. The $\alpha$ term biases the objective towards stronger correlation by decreasing the cost if the match matrix has more ones in it. The function $g(\mathbf{A})$ serves to regularize the affine transformation by penalizing large values of the scale and shear components:

$g(\mathbf{A}(a, b, c)) = \gamma \left( a^2 + b^2 + c^2 \right)$

for some regularization parameter $\gamma$.

The RPM method optimizes the cost function using the Softassign algorithm. The 1D case will be derived here. Given a set of variables $\lbrace Q_j \rbrace$ where $Q_j \in \mathbb{R}$, a variable $\mu_j$ is associated with each $Q_j$ such that $\sum_j \mu_j = 1$. The goal is to find the $\mu_j$ that maximize $\sum_j \mu_j Q_j$. This can be formulated as a continuous problem by introducing a control parameter $\beta > 0$. In the deterministic annealing method, the control parameter $\beta$ is slowly increased as the algorithm runs. Let $\mu_j$ be:

$\mu_j = \frac{\exp(\beta Q_j)}{\sum_{i} \exp(\beta Q_i)}$     (rpm.3)

this is known as the softmax function. As $\beta$ increases, it approaches a binary value as desired in Equation (rpm.1). The problem may now be generalized to the 2D case, where instead of maximizing $\sum_j \mu_j Q_j$, the following is maximized:

$\sum_{j=1}^{N} \sum_{i=1}^{M} \mu_{ji} Q_{ji}$     (rpm.4)

where

$Q_{ji} = -\lVert s_j - \mathbf{t} - \mathbf{A} m_i \rVert^2 + \alpha$

This is straightforward, except that now the constraints on $\boldsymbol{\mu}$ are doubly stochastic matrix constraints: $\sum_j \mu_{ji} = 1$ and $\sum_i \mu_{ji} = 1$. As such, the denominator from Equation (rpm.3) cannot be expressed for the 2D case simply. To satisfy the constraints, it is possible to use a result due to Sinkhorn,[39] which states that a doubly stochastic matrix is obtained from any square matrix with all positive entries by the iterative process of alternating row and column normalizations. Thus the algorithm is written as such:[39]

algorithm RPM2D(M, S)
    t := 0
    a, θ, b, c := 0
    β := β0
    while β < βf:
        while μ has not converged:
            // update correspondence parameters by softassign
            Q_ji := -‖s_j - t - A m_i‖² + α
            μ_ji := exp(β Q_ji)
            // apply Sinkhorn's method
            while μ has not converged:
                // update μ by normalizing across all rows:
                μ_ji := μ_ji / Σ_{i=1..M+1} μ_ji
                // update μ by normalizing across all columns:
                μ_ji := μ_ji / Σ_{j=1..N+1} μ_ji
            // update pose parameters by coordinate descent
            update θ using analytical solution
            update t using analytical solution
            update a, b, c using Newton's method
        increase β by the annealing factor
    return a, b, c, θ and t

where the deterministic annealing control parameter $\beta$ is initially set to $\beta_0$ and increases by a fixed factor until it reaches the maximum value $\beta_f$. The summations in the normalization steps sum over $M+1$ and $N+1$ elements instead of just $M$ and $N$ because the constraints on $\boldsymbol{\mu}$ are inequalities. As such, the $(M+1)$th and $(N+1)$th elements are slack variables.
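A small Python sketch of the Sinkhorn normalization step used by softassign, with the last row and column acting as slack variables as described above (illustrative; convergence checks are omitted):

import numpy as np

def sinkhorn_with_slack(mu, num_iters=100):
    # mu: (M+1) x (N+1) matrix with positive entries; the last row and column
    # are slack variables, so only the non-slack rows/columns are normalized.
    mu = mu.copy()
    for _ in range(num_iters):
        mu[:-1, :] /= mu[:-1, :].sum(axis=1, keepdims=True)   # each real row sums over N+1 entries
        mu[:, :-1] /= mu[:, :-1].sum(axis=0, keepdims=True)   # each real column sums over M+1 entries
    return mu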

The algorithm can also be extended for point sets in 3D or higher dimensions. The constraints on the correspondence matrix $\boldsymbol{\mu}$ are the same in the 3D case as in the 2D case. Hence the structure of the algorithm remains unchanged, with the main difference being how the rotation and translation matrices are solved.[39]

Thin plate spline robust point matching

 
Animation of 2D non-rigid registration of the green point set $\mathcal{M}$ to the magenta point set $\mathcal{S}$ corrupted with noisy outliers. The size of the blue circles is inversely related to the control parameter $\beta$. The yellow lines indicate correspondence.

The thin plate spline robust point matching (TPS-RPM) algorithm by Chui and Rangarajan augments the RPM method to perform non-rigid registration by parametrizing the transformation as a thin plate spline.[14] However, because the thin plate spline parametrization only exists in three dimensions, the method cannot be extended to problems involving four or more dimensions.

Kernel correlation

The kernel correlation (KC) approach of point set registration was introduced by Tsin and Kanade.[37] Compared with ICP, the KC algorithm is more robust against noisy data. Unlike ICP, where, for every model point, only the closest scene point is considered, here every scene point affects every model point.[37] As such this is a multiply-linked registration algorithm. For some kernel function $K$, the kernel correlation $KC$ of two points $x_i, x_j$ is defined thus:[37]

$KC(x_i, x_j) = \int K(x, x_i) \cdot K(x, x_j) \, dx$     (kc.1)

The kernel function $K$ chosen for point set registration is typically a symmetric and non-negative kernel, similar to the ones used in Parzen window density estimation. The Gaussian kernel is typically used for its simplicity, although other ones like the Epanechnikov kernel and the tricube kernel may be substituted.[37] The kernel correlation of an entire point set $\mathcal{X}$ is defined as the sum of the kernel correlations of every point in the set to every other point in the set:[37]

$KC(\mathcal{X}) = \sum_{x_i \neq x_j} KC(x_i, x_j)$     (kc.2)

The logarithm of the KC of a point set is proportional, within a constant factor, to the information entropy. Observe that the KC is a measure of a "compactness" of the point set—trivially, if all points in the point set were at the same location, the KC would evaluate to a large value. The cost function of the point set registration algorithm for some transformation parameter $\theta$ is defined thus:

$COST(\mathcal{S}, \mathcal{M}, \theta) = -KC(\mathcal{S} \cup T(\mathcal{M}, \theta))$     (kc.3)

Some algebraic manipulation yields:

$COST(\mathcal{S}, \mathcal{M}, \theta) = -KC(\mathcal{S}) - KC(T(\mathcal{M}, \theta)) - 2\sum_{s \in \mathcal{S}} \sum_{m \in \mathcal{M}} KC(s, T(m, \theta))$     (kc.4)

The expression is simplified by observing that $KC(\mathcal{S})$ is independent of $\theta$. Furthermore, assuming rigid registration, $KC(T(\mathcal{M}, \theta))$ is invariant when $\theta$ is changed because the Euclidean distance between every pair of points stays the same under rigid transformation. So the above equation may be rewritten as:

$COST(\mathcal{S}, \mathcal{M}, \theta) = -2\sum_{s \in \mathcal{S}} \sum_{m \in \mathcal{M}} KC(s, T(m, \theta)) + \text{constant}$     (kc.5)

The kernel density estimates are defined as:

$P_{\mathcal{M}}(x, \theta) = \frac{1}{M} \sum_{m \in \mathcal{M}} K(x, T(m, \theta)), \qquad P_{\mathcal{S}}(x) = \frac{1}{N} \sum_{s \in \mathcal{S}} K(x, s)$

The cost function can then be shown to be the correlation of the two kernel density estimates:

$COST(\mathcal{S}, \mathcal{M}, \theta) \propto -\int_{x} P_{\mathcal{M}}(x, \theta) \cdot P_{\mathcal{S}}(x)\, dx$     (kc.6)

Having established the cost function, the algorithm simply uses gradient descent to find the optimal transformation. It is computationally expensive to compute the cost function from scratch on every iteration, so a discrete version of the cost function Equation (kc.6) is used. The kernel density estimates $P_{\mathcal{M}}, P_{\mathcal{S}}$ can be evaluated at grid points and stored in a lookup table. Unlike the ICP and related methods, it is not necessary to find the nearest neighbour, which allows the KC algorithm to be comparatively simple in implementation.

Compared to ICP and EM-ICP for noisy 2D and 3D point sets, the KC algorithm is less sensitive to noise and results in correct registration more often.[37]
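For illustration, a brute-force Python version of the Gaussian-kernel correlation cost (evaluating the double sum directly rather than using the discretized lookup table; up to constant factors coming from the kernel normalization):

import numpy as np

def kc_cost(model, scene, transform, sigma=0.1):
    # Negative sum of Gaussian kernel correlations between every transformed
    # model point and every scene point (the multiply-linked KC cost).
    moved = transform(model)                               # T(M, theta)
    diff = scene[:, None, :] - moved[None, :, :]           # (N, M, D) pairwise differences
    d2 = np.sum(diff ** 2, axis=2)
    return -np.sum(np.exp(-d2 / (2.0 * sigma ** 2)))       # more overlap -> lower cost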

Gaussian mixture model

The kernel density estimates are sums of Gaussians and may therefore be represented as Gaussian mixture models (GMM).[40] Jian and Vemuri use the GMM version of the KC registration algorithm to perform non-rigid registration parametrized by thin plate splines.

Coherent point drift

 
Rigid (with the addition of scaling) registration of a blue point set $\mathcal{M}$ to the red point set $\mathcal{S}$ using the Coherent Point Drift algorithm. Both point sets have been corrupted with removed points and random spurious outlier points.

Affine registration of a blue point set $\mathcal{M}$ to the red point set $\mathcal{S}$ using the Coherent Point Drift algorithm.

Non-rigid registration of a blue point set $\mathcal{M}$ to the red point set $\mathcal{S}$ using the Coherent Point Drift algorithm. Both point sets have been corrupted with removed points and random spurious outlier points.

Coherent point drift (CPD) was introduced by Myronenko and Song.[13][41] The algorithm takes a probabilistic approach to aligning point sets, similar to the GMM KC method. Unlike earlier approaches to non-rigid registration which assume a thin plate spline transformation model, CPD is agnostic with regard to the transformation model used. The point set $\mathcal{M}$ represents the Gaussian mixture model (GMM) centroids. When the two point sets are optimally aligned, the correspondence is the maximum of the GMM posterior probability for a given data point. To preserve the topological structure of the point sets, the GMM centroids are forced to move coherently as a group. The expectation maximization algorithm is used to optimize the cost function.[13]

Let there be $M$ points in $\mathcal{M}$ and $N$ points in $\mathcal{S}$. The GMM probability density function for a point $s$ is:

$p(s) = \sum_{i=1}^{M+1} P(i)\, p(s|i)$     (cpd.1)

where, in $D$ dimensions, $p(s|i)$ is the Gaussian distribution centered on point $m_i \in \mathcal{M}$.

$p(s|i) = \frac{1}{(2\pi\sigma^2)^{D/2}} \exp\left(-\frac{\lVert s - m_i \rVert^2}{2\sigma^2}\right)$

The membership probabilities $P(i) = \frac{1}{M}$ are equal for all GMM components. The weight of the uniform distribution is denoted as $w$. The mixture model is then:

$p(s) = w\frac{1}{N} + (1 - w)\sum_{i=1}^{M} \frac{1}{M}\, p(s|i)$     (cpd.2)

The GMM centroids are re-parametrized by a set of parameters $\theta$ estimated by maximizing the likelihood. This is equivalent to minimizing the negative log-likelihood function:

$E(\theta, \sigma^2) = -\sum_{j=1}^{N} \log \sum_{i=1}^{M+1} P(i)\, p(s_j|i)$     (cpd.3)

where it is assumed that the data is independent and identically distributed. The correspondence probability between two points $m_i$ and $s_j$ is defined as the posterior probability of the GMM centroid given the data point:

$P(i|s_j) = \frac{P(i)\, p(s_j|i)}{p(s_j)}$

The expectation maximization (EM) algorithm is used to find $\theta$ and $\sigma^2$. The EM algorithm consists of two steps. First, in the E-step or estimation step, it guesses the values of parameters ("old" parameter values) and then uses Bayes' theorem to compute the posterior probability distributions $P^{\text{old}}(i|s_j)$ of mixture components. Second, in the M-step or maximization step, the "new" parameter values are then found by minimizing the expectation of the complete negative log-likelihood function, i.e. the cost function:

$Q = -\sum_{j=1}^{N} \sum_{i=1}^{M+1} P^{\text{old}}(i|s_j) \log\left(P^{\text{new}}(i)\, p^{\text{new}}(s_j|i)\right)$     (cpd.4)

Ignoring constants independent of $\theta$ and $\sigma^2$, Equation (cpd.4) can be expressed thus:

$Q(\theta, \sigma^2) = \frac{1}{2\sigma^2}\sum_{j=1}^{N} \sum_{i=1}^{M} P^{\text{old}}(i|s_j) \lVert s_j - T(m_i, \theta) \rVert^2 + \frac{N_{\mathbf{P}} D}{2} \log \sigma^2$     (cpd.5)

where

$N_{\mathbf{P}} = \sum_{j=1}^{N} \sum_{i=1}^{M} P^{\text{old}}(i|s_j) \leq N$

with $N_{\mathbf{P}} = N$ only if $w = 0$. The posterior probabilities of GMM components computed using the previous parameter values $\theta^{\text{old}}, \sigma^{2,\text{old}}$ are:

$P^{\text{old}}(i|s_j) = \frac{\exp\left(-\frac{1}{2\sigma^{2,\text{old}}} \lVert s_j - T(m_i, \theta^{\text{old}}) \rVert^2\right)}{\sum_{k=1}^{M} \exp\left(-\frac{1}{2\sigma^{2,\text{old}}} \lVert s_j - T(m_k, \theta^{\text{old}}) \rVert^2\right) + (2\pi\sigma^{2,\text{old}})^{D/2} \frac{w}{1-w}\frac{M}{N}}$     (cpd.6)

Minimizing the cost function in Equation (cpd.5) necessarily decreases the negative log-likelihood function $E$ in Equation (cpd.3) unless it is already at a local minimum.[13] Thus, the algorithm can be expressed using the following pseudocode, where the point sets $\mathcal{M}$ and $\mathcal{S}$ are represented as $M \times D$ and $N \times D$ matrices $\mathbf{M}$ and $\mathbf{S}$ respectively:[13]

algorithm CPD(S, M)
    θ := θ0
    initialize 0 ≤ w ≤ 1 and σ2 > 0
    while not registered:
        // E-step, compute P
        for i ∊ [1, M] and j ∊ [1, N]:
            P_ij := P^old(i | s_j)   // Equation (cpd.6)
        // M-step, solve for optimal transformation
        {θ, σ2} := solve(S, M, P)
    return θ

where the vector $\mathbf{1}$ is a column vector of ones. The solve function differs by the type of registration performed. For example, in rigid registration, the output is a scale $a$, a rotation matrix $\mathbf{R}$, and a translation vector $\mathbf{t}$. The parameter $\theta$ can be written as a tuple of these:

$\theta = \lbrace a, \mathbf{R}, \mathbf{t} \rbrace$

which is initialized to one, the identity matrix, and a column vector of zeroes:

$\theta_0 = \lbrace 1, \mathbf{I}, \mathbf{0} \rbrace$

The aligned point set is:

$T(\mathbf{M}) = a\mathbf{M}\mathbf{R}^{T} + \mathbf{1}\mathbf{t}^{T}$

The solve_rigid function for rigid registration can then be written as follows, with derivation of the algebra explained in Myronenko's 2010 paper.[13]

solve_rigid(S, M, P)
    NP := 1ᵀP1
    μs := (1/NP) SᵀPᵀ1
    μm := (1/NP) MᵀP1
    Ŝ := S − 1μsᵀ
    M̂ := M − 1μmᵀ
    A := ŜᵀPᵀM̂
    U, V := svd(A) // the singular value decomposition of A = UΣVᵀ
    C := diag(1, …, 1, det(UVᵀ)) // diag(ξ) is the diagonal matrix formed from vector ξ
    R := UCVᵀ
    a := tr(AᵀR) / tr(M̂ᵀ diag(P1) M̂) // tr is the trace of a matrix
    t := μs − aRμm
    σ2 := (1/(NP D)) (tr(Ŝᵀ diag(Pᵀ1) Ŝ) − a tr(AᵀR))
    return {a, R, t}, σ2
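For reference, a minimal NumPy sketch (illustrative, not the authors' code) of the E-step that produces the matrix P consumed by solve_rigid above, following Equation (cpd.6) for the rigid transformation $T(m_i) = aRm_i + t$:

import numpy as np

def cpd_estep_rigid(S, M, a, R, t, sigma2, w):
    # S: (N, D) scene/data points, M: (Mpts, D) GMM centroids.
    # Returns P with P[i, j] = P_old(i | s_j) under the current parameters.
    N, D = S.shape
    Mpts = M.shape[0]
    moved = a * (M @ R.T) + t                              # transformed centroids
    d2 = np.sum((S[None, :, :] - moved[:, None, :]) ** 2, axis=2)   # (Mpts, N) squared distances
    num = np.exp(-d2 / (2.0 * sigma2))
    c = (2.0 * np.pi * sigma2) ** (D / 2.0) * (w / (1.0 - w)) * Mpts / N   # uniform outlier term
    return num / (num.sum(axis=0, keepdims=True) + c)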

For affine registration, where the goal is to find an affine transformation instead of a rigid one, the output is an affine transformation matrix $\mathbf{B}$ and a translation $\mathbf{t}$ such that the aligned point set is:

$T(\mathbf{M}) = \mathbf{M}\mathbf{B}^{T} + \mathbf{1}\mathbf{t}^{T}$

The solve_affine function for affine registration can then be written as follows, with derivation of the algebra explained in Myronenko's 2010 paper.[13]

solve_affine(S, M, P)
    NP := 1ᵀP1
    μs := (1/NP) SᵀPᵀ1
    μm := (1/NP) MᵀP1
    Ŝ := S − 1μsᵀ
    M̂ := M − 1μmᵀ
    B := (ŜᵀPᵀM̂) (M̂ᵀ diag(P1) M̂)⁻¹
    t := μs − Bμm
    σ2 := (1/(NP D)) (tr(Ŝᵀ diag(Pᵀ1) Ŝ) − tr(ŜᵀPᵀM̂ Bᵀ))
    return {B, t}, σ2

It is also possible to use CPD with non-rigid registration using a parametrization derived using calculus of variations.[13]

Sums of Gaussian distributions can be computed in linear time using the fast Gauss transform (FGT).[13] Consequently, the time complexity of CPD is $O(M+N)$, which is asymptotically much faster than $O(MN)$ methods.[13]

Bayesian coherent point drift (BCPD)

A variant of coherent point drift, called Bayesian coherent point drift (BCPD), was derived through a Bayesian formulation of point set registration.[42] BCPD has several advantages over CPD, e.g., (1) nonrigid and rigid registrations can be performed in a single algorithm, (2) the algorithm can be accelerated regardless of the Gaussianity of the Gram matrix that defines motion coherence, and (3) the algorithm is more robust against outliers because of a more reasonable definition of an outlier distribution. Additionally, in the Bayesian formulation, motion coherence was introduced through a prior distribution of displacement vectors, providing a clear difference between tuning parameters that control motion coherence. BCPD was further accelerated by a method called BCPD++, which is a three-step procedure composed of (1) downsampling of point sets, (2) registration of downsampled point sets, and (3) interpolation of a deformation field.[43] The method can register point sets composed of more than 10 million points while maintaining its registration accuracy.

Coherent point drift with local surface geometry (LSG-CPD)

A variant of coherent point drift, called CPD with Local Surface Geometry (LSG-CPD), has been proposed for rigid point cloud registration.[44] The method adaptively adds different levels of point-to-plane penalization on top of the point-to-point penalization based on the flatness of the local surface. This results in GMM components with anisotropic covariances, instead of the isotropic covariances in the original CPD.[13] Each anisotropic covariance matrix is built from the local surface normal, an adaptively set penalization coefficient, and an identity regularizer; the precise form is given in [44].

Here $\Sigma_m$ is the anisotropic covariance matrix of the m-th point in the target set; $\mathbf{n}_m$ is the normal vector corresponding to the same point; $\mathbf{I}$ is an identity matrix, serving as a regularizer, pulling the problem away from ill-posedness; and $\alpha_m$ is the penalization coefficient (a modified sigmoid function), which is set adaptively to add different levels of point-to-plane penalization depending on how flat the local surface is. This is realized by evaluating the surface variation[45] within the neighborhood of the m-th target point. The penalization coefficient is bounded above by a pre-defined constant.

The point cloud registration is formulated as a maximum likelihood estimation (MLE) problem and solved with the Expectation-Maximization (EM) algorithm. In the E-step, the correspondence computation is recast into simple matrix manipulations and efficiently computed on a GPU. In the M-step, an unconstrained optimization on a matrix Lie group is designed to efficiently update the rigid transformation of the registration. Taking advantage of the local geometric covariances, the method shows a superior performance in accuracy and robustness to noise and outliers, compared with the baseline CPD.[46] An enhanced runtime performance is expected thanks to the GPU-accelerated correspondence calculation. An implementation of LSG-CPD has been released as open-source software.

Sorting the Correspondence Space (SCS)

This algorithm was introduced in 2013 by H. Assalih to accommodate sonar image registration.[47] These types of images tend to have high amounts of noise, so it is expected to have many outliers in the point sets to match. SCS delivers high robustness against outliers and can surpass ICP and CPD performance in the presence of outliers. SCS doesn't use iterative optimization in high dimensional space and is neither probabilistic nor spectral. SCS can match rigid and non-rigid transformations, and performs best when the target transformation is between three and six degrees of freedom.

See also

References

  1. ^ Zhang, Ji; Singh, Sanjiv (May 2015). "Visual-lidar odometry and mapping: Low-drift, robust, and fast". 2015 IEEE International Conference on Robotics and Automation (ICRA). pp. 2174–2181. doi:10.1109/ICRA.2015.7139486. ISBN 978-1-4799-6923-4. S2CID 6054487.
  2. ^ Choi, Sungjoon; Zhou, Qian-Yi; Koltun, Vladlen (2015). "Robust reconstruction of indoor scenes" (PDF). Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR): 5556–5565.
  3. ^ Lai, Kevin; Bo, Liefeng; Ren, Xiaofeng; Fox, Dieter (May 2011). "A large-scale hierarchical multi-view RGB-D object dataset". 2011 IEEE International Conference on Robotics and Automation. pp. 1817–1824. CiteSeerX 10.1.1.190.1598. doi:10.1109/ICRA.2011.5980382. ISBN 978-1-61284-386-5. S2CID 14986048.
  4. ^ a b c Yang, Heng; Carlone, Luca (2019). "A polynomial-time solution for robust registration with extreme outlier rates". Robotics: Science and Systems. arXiv:1903.08588. doi:10.15607/RSS.2019.XV.003. ISBN 978-0-9923747-5-4. S2CID 84186750.
  5. ^ Calli, Berk; Singh, Arjun; Bruce, James; Walsman, Aaron; Konolige, Kurt; Srinivasa, Siddhartha; Abbeel, Pieter; Dollar, Aaron M (2017-03-01). "Yale-CMU-Berkeley dataset for robotic manipulation research". The International Journal of Robotics Research. 36 (3): 261–268. doi:10.1177/0278364917700714. ISSN 0278-3649. S2CID 6522002.
  6. ^ Cadena, Cesar; Carlone, Luca; Carrillo, Henry; Latif, Yasir; Scaramuzza, Davide; Neira, José; Reid, Ian; Leonard, John J. (December 2016). "Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age". IEEE Transactions on Robotics. 32 (6): 1309–1332. arXiv:1606.05830. Bibcode:2016arXiv160605830C. doi:10.1109/TRO.2016.2624754. ISSN 1941-0468. S2CID 2596787.
  7. ^ Mur-Artal, Raúl; Montiel, J. M. M.; Tardós, Juan D. (October 2015). "ORB-SLAM: A Versatile and Accurate Monocular SLAM System". IEEE Transactions on Robotics. 31 (5): 1147–1163. arXiv:1502.00956. Bibcode:2015arXiv150200956M. doi:10.1109/TRO.2015.2463671. ISSN 1941-0468. S2CID 206775100.
  8. ^ a b c Yang, Heng; Carlone, Luca (2019). "A Quaternion-based Certifiably Optimal Solution to the Wahba Problem with Outliers" (PDF). Proceedings of the IEEE International Conference on Computer Vision (ICCV): 1665–1674. arXiv:1905.12536. Bibcode:2019arXiv190512536Y.
  9. ^ Newcombe, Richard A.; Izadi, Shahram; Hilliges, Otmar; Molyneaux, David; Kim, David; Davison, Andrew J.; Kohi, Pushmeet; Shotton, Jamie; Hodges, Steve; Fitzgibbon, Andrew (October 2011). "KinectFusion: Real-time dense surface mapping and tracking". 2011 10th IEEE International Symposium on Mixed and Augmented Reality. pp. 127–136. CiteSeerX 10.1.1.453.53. doi:10.1109/ISMAR.2011.6092378. ISBN 978-1-4577-2183-0. S2CID 11830123.
  10. ^ Audette, Michel A.; Ferrie, Frank P.; Peters, Terry M. (2000-09-01). "An algorithmic overview of surface registration techniques for medical imaging". Medical Image Analysis. 4 (3): 201–217. doi:10.1016/S1361-8415(00)00014-1. ISSN 1361-8415. PMID 11145309.
  11. ^ a b Jian, Bing; Vemuri, Baba C. (2011). "Robust Point Set Registration Using Gaussian Mixture Models". IEEE Transactions on Pattern Analysis and Machine Intelligence. 33 (8): 1633–1645. doi:10.1109/tpami.2010.223. PMID 21173443. S2CID 10923565.
  12. ^ a b Fitzgibbon, Andrew W. (2003). "Robust registration of 2D and 3D point sets". Image and Vision Computing. 21 (13): 1145–1153. CiteSeerX 10.1.1.335.116. doi:10.1016/j.imavis.2003.09.004.
  13. ^ a b c d e f g h i j k l m Myronenko, Andriy; Song, Xubo (2010). "Point set registration: Coherent Point drift". IEEE Transactions on Pattern Analysis and Machine Intelligence. 32 (2): 2262–2275. arXiv:0905.2635. doi:10.1109/tpami.2010.46. PMID 20975122. S2CID 10809031.
  14. ^ a b c Chui, Haili; Rangarajan, Anand (2003). "A new point matching algorithm for non-rigid registration". Computer Vision and Image Understanding. 89 (2): 114–141. CiteSeerX 10.1.1.7.4365. doi:10.1016/S1077-3142(03)00009-2.
  15. ^ Holz, Dirk; Ichim, Alexandru E.; Tombari, Federico; Rusu, Radu B.; Behnke, Sven (2015). "Registration with the Point Cloud Library: A Modular Framework for Aligning in 3-D". IEEE Robotics Automation Magazine. 22 (4): 110–124. doi:10.1109/MRA.2015.2432331. S2CID 2621807.
  16. ^ a b c Horn, Berthold K. P. (1987-04-01). "Closed-form solution of absolute orientation using unit quaternions". JOSA A. 4 (4): 629–642. Bibcode:1987JOSAA...4..629H. doi:10.1364/JOSAA.4.000629. ISSN 1520-8532. S2CID 11038004.
  17. ^ a b Arun, K. S.; Huang, T. S.; Blostein, S. D. (September 1987). "Least-Squares Fitting of Two 3-D Point Sets". IEEE Transactions on Pattern Analysis and Machine Intelligence. PAMI-9 (5): 698–700. doi:10.1109/TPAMI.1987.4767965. ISSN 1939-3539. PMID 21869429. S2CID 8724100.
  18. ^ Briales, Jesus; Gonzalez-Jimenez, Javier (July 2017). "Convex Global 3D Registration with Lagrangian Duality". 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). pp. 5612–5621. doi:10.1109/CVPR.2017.595. hdl:10630/14599. ISBN 978-1-5386-0457-1. S2CID 11549421.
  19. ^ a b c d e Yang, Heng; Shi, Jingnan; Carlone, Luca (2020-01-21). "TEASER: Fast and Certifiable Point Cloud Registration". arXiv:2001.07715 [cs.RO].
  20. ^ a b c Parra Bustos, Álvaro; Chin, Tat-Jun (December 2018). "Guaranteed Outlier Removal for Point Cloud Registration with Correspondences". IEEE Transactions on Pattern Analysis and Machine Intelligence. 40 (12): 2868–2882. arXiv:1711.10209. doi:10.1109/TPAMI.2017.2773482. ISSN 1939-3539. PMID 29990122. S2CID 3331003.
  21. ^ a b Chin, Tat-Jun; Suter, David (2017-02-27). "The Maximum Consensus Problem: Recent Algorithmic Advances". Synthesis Lectures on Computer Vision. 7 (2): 1–194. doi:10.2200/s00757ed1v01y201702cov011. ISSN 2153-1056.
  22. ^ a b Wen, Fei; Ying, Rendong; Gong, Zheng; Liu, Peilin (February 2020). "Efficient Algorithms for Maximum Consensus Robust Fitting". IEEE Transactions on Robotics. 36 (1): 92–106. doi:10.1109/TRO.2019.2943061. ISSN 1941-0468. S2CID 209976632.
  23. ^ a b Cai, Zhipeng; Chin, Tat-Jun; Koltun, Vladlen (2019). "Consensus Maximization Tree Search Revisited". Proceedings of IEEE International Conference on Computer Vision (ICCV): 1637–1645. arXiv:1908.02021. Bibcode:2019arXiv190802021C.
  24. ^ Bazin, Jean-Charles; Seo, Yongduek; Pollefeys, Marc (2013). "Globally Optimal Consensus Set Maximization through Rotation Search". In Lee, Kyoung Mu; Matsushita, Yasuyuki; Rehg, James M.; Hu, Zhanyi (eds.). Computer Vision – ACCV 2012. Lecture Notes in Computer Science. Vol. 7725. Berlin, Heidelberg: Springer. pp. 539–551. doi:10.1007/978-3-642-37444-9_42. ISBN 978-3-642-37444-9.
  25. ^ Hartley, Richard I.; Kahl, Fredrik (2009-04-01). "Global Optimization through Rotation Space Search". International Journal of Computer Vision. 82 (1): 64–79. doi:10.1007/s11263-008-0186-9. hdl:1885/50831. ISSN 1573-1405. S2CID 509788.
  26. ^ Fischler, Martin; Bolles, Robert (1981). "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography". Communications of the ACM. 24 (6): 381–395. doi:10.1145/358669.358692. S2CID 972888.
  27. ^ Le, Huu Minh; Chin, Tat-Jun; Eriksson, Anders; Do, Thanh-Toan; Suter, David (2019). "Deterministic Approximate Methods for Maximum Consensus Robust Fitting". IEEE Transactions on Pattern Analysis and Machine Intelligence. 43 (3): 842–857. arXiv:1710.10003. doi:10.1109/TPAMI.2019.2939307. ISSN 1939-3539. PMID 31494545. S2CID 29346470.
  28. ^ Bustos, Alvaro Parra; Chin, Tat-Jun; Neumann, Frank; Friedrich, Tobias; Katzmann, Maximilian (2019-02-04). "A Practical Maximum Clique Algorithm for Matching with Pairwise Constraints". arXiv:1902.01534 [cs.CV].
  29. ^ Huber, Peter J.; Ronchetti, Elvezio M. (2009-01-29). Robust Statistics. Wiley Series in Probability and Statistics. Hoboken, NJ, USA: John Wiley & Sons, Inc. doi:10.1002/9780470434697. ISBN 978-0-470-43469-7.
  30. ^ a b Zhou, Qian-Yi; Park, Jaesik; Koltun, Vladlen (2016). "Fast Global Registration". In Leibe, Bastian; Matas, Jiri; Sebe, Nicu; Welling, Max (eds.). Computer Vision – ECCV 2016. Lecture Notes in Computer Science. Vol. 9906. Cham: Springer International Publishing. pp. 766–782. doi:10.1007/978-3-319-46475-6_47. ISBN 978-3-319-46475-6. S2CID 27362942.
  31. ^ MacTavish, Kirk; Barfoot, Timothy D. (2015). "At all Costs: A Comparison of Robust Cost Functions for Camera Correspondence Outliers". 2015 12th Conference on Computer and Robot Vision. pp. 62–69. doi:10.1109/CRV.2015.52. ISBN 978-1-4799-1986-4. S2CID 9305263.
  32. ^ Bosse, Michael; Agamennoni, Gabriel; Gilitschenski, Igor (2016). "Robust Estimation and Applications in Robotics". Foundations and Trends in Robotics. 4 (4). now: 225–269. doi:10.1561/2300000047.
  33. ^ a b Black, Michael J.; Rangarajan, Anand (1996-07-01). "On the unification of line processes, outlier rejection, and robust statistics with applications in early vision". International Journal of Computer Vision. 19 (1): 57–91. doi:10.1007/BF00131148. ISSN 1573-1405. S2CID 7510079.
  34. ^ a b Blake, Andrew; Zisserman, Andrew (1987). Visual reconstruction. The MIT Press. ISBN 9780262524063.
  35. ^ a b Yang, Heng; Antonante, Pasquale; Tzoumas, Vasileios; Carlone, Luca (2020). "Graduated Non-Convexity for Robust Spatial Perception: From Non-Minimal Solvers to Global Outlier Rejection". IEEE Robotics and Automation Letters. 5 (2): 1127–1134. arXiv:1909.08605. doi:10.1109/LRA.2020.2965893. ISSN 2377-3774. S2CID 202660784.
  36. ^ Besl, Paul; McKay, Neil (1992). "A Method for Registration of 3-D Shapes". IEEE Transactions on Pattern Analysis and Machine Intelligence. 14 (2): 239–256. Bibcode:1992SPIE.1611..586B. doi:10.1109/34.121791.
  37. ^ a b c d e f g h i Tsin, Yanghai; Kanade, Takeo (2004). "A Correlation-Based Approach to Robust Point Set Registration". Computer Vision - ECCV 2004. Lecture Notes in Computer Science. Vol. 3023. Springer Berlin Heidelberg. pp. 558–569. CiteSeerX 10.1.1.156.6729. doi:10.1007/978-3-540-24672-5_44. ISBN 978-3-540-21982-8.
  38. ^

point, registration, computer, vision, pattern, recognition, robotics, point, registration, also, known, point, cloud, registration, scan, matching, process, finding, spatial, transformation, scaling, rotation, translation, that, aligns, point, clouds, purpose. In computer vision pattern recognition and robotics point set registration also known as point cloud registration or scan matching is the process of finding a spatial transformation e g scaling rotation and translation that aligns two point clouds The purpose of finding such a transformation includes merging multiple data sets into a globally consistent model or coordinate frame and mapping a new measurement to a known data set to identify features or to estimate its pose Raw 3D point cloud data are typically obtained from Lidars and RGB D cameras 3D point clouds can also be generated from computer vision algorithms such as triangulation bundle adjustment and more recently monocular image depth estimation using deep learning For 2D point set registration used in image processing and feature based image registration a point set may be 2D pixel coordinates obtained by feature extraction from an image for example corner detection Point cloud registration has extensive applications in autonomous driving 1 motion estimation and 3D reconstruction 2 object detection and pose estimation 3 4 robotic manipulation 5 simultaneous localization and mapping SLAM 6 7 panorama stitching 8 virtual and augmented reality 9 and medical imaging 10 Point set registration is the process of aligning two point sets Here the blue fish is being registered to the red fish As a special case registration of two point sets that only differ by a 3D rotation i e there is no scaling and translation is called the Wahba Problem and also related to the orthogonal procrustes problem Contents 1 Formulation 2 Types of algorithms 2 1 Rigid registration 2 2 Non rigid registration 2 3 Other types 3 Correspondence based registration 3 1 Outlier free registration 3 2 Robust registration 3 2 1 Maximum consensus 3 2 2 Outlier removal 3 2 3 M estimation 3 2 4 Graduated non convexity 3 2 5 Certifiably robust registration 4 Simultaneous pose and correspondence registration 4 1 Iterative closest point 4 2 Robust point matching 4 2 1 Thin plate spline robust point matching 4 3 Kernel correlation 4 3 1 Gaussian mixture model 4 4 Coherent point drift 4 4 1 Bayesian coherent point drift BCPD 4 4 2 Coherent point drift with local surface geometry LSG CPD 4 5 Sorting the Correspondence Space SCS 5 See also 6 References 7 External linksFormulation edit nbsp Data from two 3D scans of the same environment need to be aligned using point set registration nbsp Data from above registered successfully using a variant of iterative closest point The problem may be summarized as follows 11 Let M S displaystyle lbrace mathcal M mathcal S rbrace nbsp be two finite size point sets in a finite dimensional real vector space Rd displaystyle mathbb R d nbsp which contain M displaystyle M nbsp and N displaystyle N nbsp points respectively e g d 3 displaystyle d 3 nbsp recovers the typical case of when M displaystyle mathcal M nbsp and S displaystyle mathcal S nbsp are 3D point sets The problem is to find a transformation to be applied to the moving model point set M displaystyle mathcal M nbsp such that the difference typically defined in the sense of point wise Euclidean distance between M displaystyle mathcal M nbsp and the static scene set S displaystyle mathcal S nbsp is minimized In other words a 
mapping from Rd displaystyle mathbb R d nbsp to Rd displaystyle mathbb R d nbsp is desired which yields the best alignment between the transformed model set and the scene set The mapping may consist of a rigid or non rigid transformation The transformation model may be written as T displaystyle T nbsp using which the transformed registered model point set is T M displaystyle T mathcal M nbsp 1 The output of a point set registration algorithm is therefore the optimal transformation T displaystyle T star nbsp such that M displaystyle mathcal M nbsp is best aligned to S displaystyle mathcal S nbsp according to some defined notion of distance function dist displaystyle operatorname dist cdot cdot nbsp T arg minT Tdist T M S displaystyle T star arg min T in mathcal T text dist T mathcal M mathcal S nbsp 2 where T displaystyle mathcal T nbsp is used to denote the set of all possible transformations that the optimization tries to search for The most popular choice of the distance function is to take the square of the Euclidean distance for every pair of points dist T M S m T M m sm 22 sm arg mins S s m 22 displaystyle operatorname dist T mathcal M mathcal S sum m in T mathcal M Vert m s m Vert 2 2 quad s m arg min s in mathcal S Vert s m Vert 2 2 nbsp 3 where 2 displaystyle cdot 2 nbsp denotes the vector 2 norm sm displaystyle s m nbsp is the corresponding point in set S displaystyle mathcal S nbsp that attains the shortest distance to a given point m displaystyle m nbsp in set M displaystyle mathcal M nbsp after transformation Minimizing such a function in rigid registration is equivalent to solving a least squares problem Types of algorithms editWhen the correspondences i e sm m displaystyle s m leftrightarrow m nbsp are given before the optimization for example using feature matching techniques then the optimization only needs to estimate the transformation This type of registration is called correspondence based registration On the other hand if the correspondences are unknown then the optimization is required to jointly find out the correspondences and transformation together This type of registration is called simultaneous pose and correspondence registration Rigid registration edit Given two point sets rigid registration yields a rigid transformation which maps one point set to the other A rigid transformation is defined as a transformation that does not change the distance between any two points Typically such a transformation consists of translation and rotation 12 In rare cases the point set may also be mirrored In robotics and computer vision rigid registration has the most applications Non rigid registration edit nbsp Registered point cloud from a lidar mounted on a moving car Given two point sets non rigid registration yields a non rigid transformation which maps one point set to the other Non rigid transformations include affine transformations such as scaling and shear mapping However in the context of point set registration non rigid registration typically involves nonlinear transformation If the eigenmodes of variation of the point set are known the nonlinear transformation may be parametrized by the eigenvalues 13 A nonlinear transformation may also be parametrized as a thin plate spline 14 13 Other types edit Some approaches to point set registration use algorithms that solve the more general graph matching problem 11 However the computational complexity of such methods tend to be high and they are limited to rigid registrations In this article we will only consider algorithms 
for rigid registration where the transformation is assumed to contain 3D rotations and translations possibly also including a uniform scaling The PCL Point Cloud Library is an open source framework for n dimensional point cloud and 3D geometry processing It includes several point registration algorithms 15 Correspondence based registration editCorrespondence based methods assume the putative correspondences m sm displaystyle m leftrightarrow s m nbsp are given for every point m M displaystyle m in mathcal M nbsp Therefore we arrive at a setting where both point sets M displaystyle mathcal M nbsp and S displaystyle mathcal S nbsp have N displaystyle N nbsp points and the correspondences mi si i 1 N displaystyle m i leftrightarrow s i i 1 dots N nbsp are given Outlier free registration editIn the simplest case one can assume that all the correspondences are correct meaning that the points mi si R3 displaystyle m i s i in mathbb R 3 nbsp are generated as follows si lRmi t ϵi i 1 N displaystyle s i lRm i t epsilon i i 1 dots N nbsp cb 1 where l gt 0 displaystyle l gt 0 nbsp is a uniform scaling factor in many cases l 1 displaystyle l 1 nbsp is assumed R SO 3 displaystyle R in text SO 3 nbsp is a proper 3D rotation matrix SO d displaystyle text SO d nbsp is the special orthogonal group of degree d displaystyle d nbsp t R3 displaystyle t in mathbb R 3 nbsp is a 3D translation vector and ϵi R3 displaystyle epsilon i in mathbb R 3 nbsp models the unknown additive noise e g Gaussian noise Specifically if the noise ϵi displaystyle epsilon i nbsp is assumed to follow a zero mean isotropic Gaussian distribution with standard deviation si displaystyle sigma i nbsp i e ϵi N 0 si2I3 displaystyle epsilon i sim mathcal N 0 sigma i 2 I 3 nbsp then the following optimization can be shown to yield the maximum likelihood estimate for the unknown scale rotation and translation l R t arg minl gt 0 R SO 3 t R3 i 1N1si2 si lRmi t 22 displaystyle l star R star t star arg min l gt 0 R in text SO 3 t in mathbb R 3 sum i 1 N frac 1 sigma i 2 left Vert s i lRm i t right Vert 2 2 nbsp cb 2 Note that when the scaling factor is 1 and the translation vector is zero then the optimization recovers the formulation of the Wahba problem Despite the non convexity of the optimization cb 2 due to non convexity of the set SO 3 displaystyle text SO 3 nbsp seminal work by Berthold K P Horn showed that cb 2 actually admits a closed form solution by decoupling the estimation of scale rotation and translation 16 Similar results were discovered by Arun et al 17 In addition in order to find a unique transformation l R t displaystyle l R t nbsp at least N 3 displaystyle N 3 nbsp non collinear points in each point set are required More recently Briales and Gonzalez Jimenez have developed a semidefinite relaxation using Lagrangian duality for the case where the model set M displaystyle mathcal M nbsp contains different 3D primitives such as points lines and planes which is the case when the model M displaystyle mathcal M nbsp is a 3D mesh 18 Interestingly the semidefinite relaxation is empirically tight i e a certifiably globally optimal solution can be extracted from the solution of the semidefinite relaxation Robust registration editThe least squares formulation cb 2 is known to perform arbitrarily bad in the presence of outliers An outlier correspondence is a pair of measurements si mi displaystyle s i leftrightarrow m i nbsp that departs from the generative model cb 1 In this case one can consider a different generative model as follows 
19 si lRmi t ϵiif i th pair is an inlieroiif i th pair is an outlier displaystyle s i begin cases lRm i t epsilon i amp text if i text th pair is an inlier o i amp text if i text th pair is an outlier end cases nbsp cb 3 where if the i displaystyle i nbsp th pair si mi displaystyle s i leftrightarrow m i nbsp is an inlier then it obeys the outlier free model cb 1 i e si displaystyle s i nbsp is obtained from mi displaystyle m i nbsp by a spatial transformation plus some small noise however if the i displaystyle i nbsp th pair si mi displaystyle s i leftrightarrow m i nbsp is an outlier then si displaystyle s i nbsp can be any arbitrary vector oi displaystyle o i nbsp Since one does not know which correspondences are outliers beforehand robust registration under the generative model cb 3 is of paramount importance for computer vision and robotics deployed in the real world because current feature matching techniques tend to output highly corrupted correspondences where over 95 displaystyle 95 nbsp of the correspondences can be outliers 20 Next we describe several common paradigms for robust registration Maximum consensus edit Maximum consensus seeks to find the largest set of correspondences that are consistent with the generative model cb 1 for some choice of spatial transformation l R t displaystyle l R t nbsp Formally speaking maximum consensus solves the following optimization maxl gt 0 R SO 3 t R3 I I subject to 1si2 si lRmi t 22 3 i I displaystyle max l gt 0 R in text SO 3 t in mathbb R 3 mathcal I vert mathcal I vert quad text subject to frac 1 sigma i 2 Vert s i lRm i t Vert 2 2 leq xi forall i in mathcal I nbsp cb 4 where I displaystyle vert mathcal I vert nbsp denotes the cardinality of the set I displaystyle mathcal I nbsp The constraint in cb 4 enforces that every pair of measurements in the inlier set I displaystyle mathcal I nbsp must have residuals smaller than a pre defined threshold 3 displaystyle xi nbsp Unfortunately recent analyses have shown that globally solving problem cb 4 is NP Hard and global algorithms typically have to resort to branch and bound BnB techniques that take exponential time complexity in the worst case 21 22 23 24 25 Although solving consensus maximization exactly is hard there exist efficient heuristics that perform quite well in practice One of the most popular heuristics is the Random Sample Consensus RANSAC scheme 26 RANSAC is an iterative hypothesize and verify method At each iteration the method first randomly samples 3 out of the total number of N displaystyle N nbsp correspondences and computes a hypothesis l R t displaystyle l R t nbsp using Horn s method 16 then the method evaluates the constraints in cb 4 to count how many correspondences actually agree with such a hypothesis i e it computes the residual si lRmi t 22 si2 displaystyle Vert s i lRm i t Vert 2 2 sigma i 2 nbsp and compares it with the threshold 3 displaystyle xi nbsp for each pair of measurements The algorithm terminates either after it has found a consensus set that has enough correspondences or after it has reached the total number of allowed iterations RANSAC is highly efficient because the main computation of each iteration is carrying out the closed form solution in Horn s method However RANSAC is non deterministic and only works well in the low outlier ratio regime e g below 50 displaystyle 50 nbsp because its runtime grows exponentially with respect to the outlier ratio 20 To fill the gap between the fast but inexact RANSAC scheme and the exact but exhaustive BnB 
To fill the gap between the fast but inexact RANSAC scheme and the exact but exhaustive BnB optimization, recent research has developed deterministic approximate methods to solve consensus maximization.[21][22][27][23]

Outlier removal

Outlier removal methods seek to pre-process the set of highly corrupted correspondences before estimating the spatial transformation. The motivation of outlier removal is to significantly reduce the number of outlier correspondences, while maintaining inlier correspondences, so that optimization over the transformation becomes easier and more efficient (e.g., RANSAC works poorly when the outlier ratio is above $95\%$ but performs quite well when the outlier ratio is below $50\%$).

Parra et al. have proposed a method called Guaranteed Outlier Removal (GORE) that uses geometric constraints to prune outlier correspondences while guaranteeing to preserve inlier correspondences.[20] GORE has been shown to be able to drastically reduce the outlier ratio, which can significantly boost the performance of consensus maximization using RANSAC or BnB. Yang and Carlone have proposed to build pairwise translation-and-rotation-invariant measurements (TRIMs) from the original set of measurements and to embed TRIMs as the edges of a graph whose nodes are the 3D points. Since inliers are pairwise consistent in terms of the scale, they must form a clique within the graph. Therefore, using efficient algorithms for computing the maximum clique of a graph can find the inliers and effectively prune the outliers.[4] The maximum-clique-based outlier removal method has also been shown to be quite useful in real-world point set registration problems.[19] Similar outlier removal ideas were also proposed by Parra et al.[28]

M-estimation

M-estimation replaces the least squares objective function in (cb.2) with a robust cost function that is less sensitive to outliers. Formally, M-estimation seeks to solve the following problem:

$$(\lambda^\star, R^\star, t^\star) = \arg\min_{\lambda > 0,\ R \in \text{SO}(3),\ t \in \mathbb{R}^3} \ \sum_{i=1}^N \rho\!\left( \frac{1}{\sigma_i} \left\Vert s_i - \lambda R m_i - t \right\Vert_2 \right) \quad \text{(cb.5)}$$

where $\rho(\cdot)$ represents the choice of the robust cost function. Note that choosing $\rho(x) = x^2$ recovers the least squares estimation in (cb.2). Popular robust cost functions include the $\ell_1$-norm loss, the Huber loss,[29] the Geman-McClure loss,[30] and the truncated least squares loss.[19][8][4] M-estimation has been one of the most popular paradigms for robust estimation in robotics and computer vision.[31][32] Because robust objective functions are typically non-convex (e.g., the truncated least squares loss versus the least squares loss), algorithms for solving the non-convex M-estimation are typically based on local optimization, where an initial guess is first provided, followed by iterative refinements of the transformation to keep decreasing the objective function. Local optimization tends to work well when the initial guess is close to the global minimum, but it is also prone to getting stuck in local minima if provided with a poor initialization.
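In practice, M-estimators of the form (cb.5) are often minimized by iteratively reweighted least squares (IRLS): each iteration computes per-correspondence weights from the current residuals and re-solves a weighted closed-form alignment. The sketch below uses a Huber-style weight and assumes $\lambda = 1$; the helper names, the threshold `delta`, and the fixed iteration count are illustrative assumptions rather than a published implementation.

```python
import numpy as np

def weighted_alignment(M, S, w):
    """Weighted closed-form (R, t): minimizes sum_i w_i * ||s_i - R m_i - t||^2."""
    w = w / w.sum()
    mu_m, mu_s = w @ M, w @ S                            # weighted centroids
    H = (M - mu_m).T @ ((S - mu_s) * w[:, None])         # weighted cross-covariance
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T
    return R, mu_s - R @ mu_m

def irls_registration(M, S, delta=0.05, n_iters=20):
    """IRLS sketch for the M-estimation problem (cb.5) with a Huber-style weight."""
    w = np.ones(len(M))
    for _ in range(n_iters):
        R, t = weighted_alignment(M, S, w)
        r = np.linalg.norm(S - (M @ R.T + t), axis=1)     # residuals
        w = np.where(r <= delta, 1.0, delta / np.maximum(r, 1e-12))  # Huber weights
    return R, t
```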
Graduated non-convexity

Graduated non-convexity (GNC) is a general-purpose framework for solving non-convex optimization problems without initialization. It has achieved success in early vision and machine learning applications.[33][34] The key idea behind GNC is to solve the hard non-convex problem by starting from an easy convex problem. Specifically, for a given robust cost function $\rho(\cdot)$, one can construct a surrogate function $\rho_\mu(\cdot)$ with a hyper-parameter $\mu$; tuning $\mu$ can gradually increase the non-convexity of the surrogate function $\rho_\mu(\cdot)$ until it converges to the target function $\rho(\cdot)$.[34][35] Therefore, at each level of the hyper-parameter $\mu$, the following optimization is solved:

$$(\lambda^\star_\mu, R^\star_\mu, t^\star_\mu) = \arg\min_{\lambda > 0,\ R \in \text{SO}(3),\ t \in \mathbb{R}^3} \ \sum_{i=1}^N \rho_\mu\!\left( \frac{1}{\sigma_i} \left\Vert s_i - \lambda R m_i - t \right\Vert_2 \right) \quad \text{(cb.6)}$$

Black and Rangarajan proved that the objective function of each optimization (cb.6) can be dualized into a sum of weighted least squares and a so-called outlier process function on the weights that determine the confidence of the optimization in each pair of measurements.[33] Using Black-Rangarajan duality and GNC tailored for the Geman-McClure function, Zhou et al. developed the fast global registration algorithm that is robust against about $80\%$ outliers in the correspondences.[30] More recently, Yang et al. showed that the joint use of GNC (tailored to the Geman-McClure function and the truncated least squares function) and Black-Rangarajan duality can lead to a general-purpose solver for robust registration problems, including point cloud and mesh registration.[35]
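As a concrete illustration of the graduated scheme, the sketch below pairs the Black-Rangarajan weight update for a scaled Geman-McClure cost with a schedule that progressively sharpens the surrogate, roughly in the spirit of fast global registration.[30] The `weighted_alignment` helper from the previous sketch, the initial value of $\mu$, and the update factor are illustrative assumptions; the published algorithms operate on different parametrizations and stopping rules.

```python
import numpy as np

def gnc_registration(M, S, mu0=16.0, mu_min=1e-3, inner_iters=5):
    """GNC sketch with a scaled Geman-McClure surrogate: start with a large mu
    (a nearly quadratic, easier problem), alternate weighted alignment and
    outlier-process weight updates, then shrink mu to increase non-convexity."""
    w = np.ones(len(M))
    mu = mu0
    R, t = np.eye(3), np.zeros(3)
    while mu > mu_min:
        for _ in range(inner_iters):
            R, t = weighted_alignment(M, S, w)                 # weighted closed form
            r2 = np.sum((S - (M @ R.T + t)) ** 2, axis=1)      # squared residuals
            w = (mu / (mu + r2)) ** 2                          # Black-Rangarajan weights
        mu /= 1.4                                              # sharpen the surrogate
    return R, t, w
```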
Certifiably robust registration

Almost none of the robust registration algorithms mentioned above (except the BnB algorithm, which runs in exponential time in the worst case) comes with performance guarantees, which means that these algorithms can return completely incorrect estimates without notice. Therefore, these algorithms are undesirable for safety-critical applications like autonomous driving.

Very recently, Yang et al. have developed the first certifiably robust registration algorithm, named Truncated least squares Estimation And SEmidefinite Relaxation (TEASER).[19] For point cloud registration, TEASER not only outputs an estimate of the transformation, but also quantifies the optimality of the given estimate. TEASER adopts the following truncated least squares (TLS) estimator:

$$(\lambda^\star, R^\star, t^\star) = \arg\min_{\lambda > 0,\ R \in \text{SO}(3),\ t \in \mathbb{R}^3} \ \sum_{i=1}^N \min\!\left( \frac{1}{\sigma_i^2} \left\Vert s_i - \lambda R m_i - t \right\Vert_2^2,\ \bar{c}^2 \right) \quad \text{(cb.7)}$$

which is obtained by choosing the TLS robust cost function $\rho(x) = \min(x^2, \bar{c}^2)$, where $\bar{c}^2$ is a pre-defined constant that determines the maximum allowed residuals to be considered inliers. The TLS objective function has the property that for inlier correspondences ($\Vert s_i - \lambda R m_i - t \Vert_2^2 / \sigma_i^2 < \bar{c}^2$) the usual least squares penalty is applied, while for outlier correspondences ($\Vert s_i - \lambda R m_i - t \Vert_2^2 / \sigma_i^2 > \bar{c}^2$) no penalty is applied and the outliers are discarded. If the TLS optimization (cb.7) is solved to global optimality, then it is equivalent to running Horn's method on only the inlier correspondences.

However, solving (cb.7) is quite challenging due to its combinatorial nature. TEASER solves (cb.7) as follows: (i) it builds invariant measurements such that the estimation of scale, rotation, and translation can be decoupled and solved separately, a strategy that is inspired by the original Horn's method; (ii) the same TLS estimation is applied for each of the three sub-problems, where the scale TLS problem can be solved exactly using an algorithm called adaptive voting, the rotation TLS problem can be relaxed to a semidefinite program (SDP) whose relaxation is exact in practice,[8] even with a large number of outliers, and the translation TLS problem can be solved using component-wise adaptive voting. A fast open-source implementation leveraging GNC is available. In practice, TEASER can tolerate more than $99\%$ outlier correspondences and runs in milliseconds. In addition to developing TEASER, Yang et al. also prove that, under some mild conditions on the point cloud data, TEASER's estimated transformation has bounded errors from the ground-truth transformation.[19]

Simultaneous pose and correspondence registration

Iterative closest point

Main article: Iterative closest point

The iterative closest point (ICP) algorithm was introduced by Besl and McKay.[36] The algorithm performs rigid registration in an iterative fashion by alternating between (i) given the transformation, finding the closest point in $\mathcal{S}$ for every point in $\mathcal{M}$, and (ii) given the correspondences, finding the best rigid transformation by solving the least squares problem (cb.2). As such, it works best if the initial pose of $\mathcal{M}$ is sufficiently close to $\mathcal{S}$. In pseudocode, the basic algorithm is implemented as follows:

    algorithm ICP(M, S)
        θ := θ0
        while not registered:
            X := ∅
            for mi ∈ T(M, θ):
                ŝj := closest point in S to mi
                X := X + ⟨mi, ŝj⟩
            θ := least_squares(X)
        return θ

Here, the function least_squares performs least squares optimization to minimize the distance in each of the $\langle m_i, \hat{s}_j \rangle$ pairs, using the closed-form solutions by Horn[16] and Arun.[17]

Because the cost function of registration depends on finding the closest point in $\mathcal{S}$ to every point in $\mathcal{M}$, it can change as the algorithm is running. As such, it is difficult to prove that ICP will in fact converge exactly to the local optimum.[37] In fact, empirically, ICP and EM-ICP do not converge to the local minimum of the cost function.[37] Nonetheless, because ICP is intuitive to understand and straightforward to implement, it remains the most commonly used point set registration algorithm.[37] Many variants of ICP have been proposed, affecting all phases of the algorithm from the selection and matching of points to the minimization strategy.[13][38] For example, the expectation-maximization algorithm is applied to the ICP algorithm to form the EM-ICP method, and the Levenberg-Marquardt algorithm is applied to the ICP algorithm to form the LM-ICP method.[12]
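The pseudocode above maps almost line-for-line onto array code. The following NumPy sketch uses brute-force nearest neighbours, a fixed iteration count, and the illustrative `align_correspondences` helper from earlier; real implementations (e.g., those in PCL) add k-d tree search, correspondence rejection, and convergence tests.

```python
import numpy as np

def icp(M, S, n_iters=50):
    """Basic ICP sketch: alternate nearest-neighbour correspondence search in S
    with the closed-form least-squares update of (R, t)."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(n_iters):
        TM = M @ R.T + t                                       # transformed model
        d2 = ((TM[:, None, :] - S[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
        nn = d2.argmin(axis=1)                                 # closest scene point per model point
        R, t = align_correspondences(M, S[nn])                 # re-fit the transformation
    return R, t
```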
Robust point matching

Robust point matching (RPM) was introduced by Gold et al.[39] The method performs registration using deterministic annealing and soft assignment of correspondences between point sets. Whereas in ICP the correspondence generated by the nearest-neighbour heuristic is binary, RPM uses a soft correspondence where the correspondence between any two points can be anywhere from 0 to 1, although it ultimately converges to either 0 or 1. The correspondences found in RPM are always one-to-one, which is not always the case in ICP.[14] Let $m_i$ be the $i$th point in $\mathcal{M}$ and $s_j$ be the $j$th point in $\mathcal{S}$. The match matrix $\mathbf{\mu}$ is defined as:

$$\mu_{ij} = \begin{cases} 1 & \text{if point } m_i \text{ corresponds to point } s_j \\ 0 & \text{otherwise} \end{cases} \quad \text{(rpm.1)}$$

The problem is then defined as: given two point sets $\mathcal{M}$ and $\mathcal{S}$, find the affine transformation $T$ and the match matrix $\mathbf{\mu}$ that best relate them.[39] Knowing the optimal transformation makes it easy to determine the match matrix, and vice versa; however, the RPM algorithm determines both simultaneously. The transformation may be decomposed into a translation vector and a transformation matrix:

$$T(m) = \mathbf{A}m + \mathbf{t}$$

The matrix $\mathbf{A}$ in 2D is composed of four separate parameters $\lbrace a, \theta, b, c \rbrace$, which are scale, rotation, and the vertical and horizontal shear components, respectively. The cost function is then:

$$\operatorname{cost} = \sum_{j=1}^N \sum_{i=1}^M \mu_{ij} \lVert s_j - \mathbf{t} - \mathbf{A}m_i \rVert^2 + g(\mathbf{A}) - \alpha \sum_{j=1}^N \sum_{i=1}^M \mu_{ij} \quad \text{(rpm.2)}$$

subject to $\forall j \ \sum_{i=1}^M \mu_{ij} \leq 1$, $\forall i \ \sum_{j=1}^N \mu_{ij} \leq 1$, and $\forall ij \ \mu_{ij} \in \lbrace 0, 1 \rbrace$. The $\alpha$ term biases the objective towards stronger correlation by decreasing the cost if the match matrix has more ones in it. The function $g(\mathbf{A})$ serves to regularize the affine transformation by penalizing large values of the scale and shear components: $g(\mathbf{A}(a, \theta, b, c)) = \gamma (a^2 + b^2 + c^2)$ for some regularization parameter $\gamma$.

The RPM method optimizes the cost function using the Softassign algorithm. The 1D case will be derived here. Given a set of variables $\lbrace Q_j \rbrace$ where $Q_j \in \mathbb{R}$, a variable $\mu_j$ is associated with each $Q_j$ such that $\sum_{j=1}^J \mu_j = 1$. The goal is to find $\mathbf{\mu}$ that maximizes $\sum_{j=1}^J \mu_j Q_j$. This can be formulated as a continuous problem by introducing a control parameter $\beta > 0$. In the deterministic annealing method, the control parameter $\beta$ is slowly increased as the algorithm runs. Let $\mathbf{\mu}$ be:

$$\mu_{\hat{j}} = \frac{\exp(\beta Q_{\hat{j}})}{\sum_{j=1}^J \exp(\beta Q_j)} \quad \text{(rpm.3)}$$

This is known as the softmax function. As $\beta$ increases, it approaches a binary value, as desired in Equation (rpm.1). The problem may now be generalized to the 2D case, where instead of maximizing $\sum_{j=1}^J \mu_j Q_j$, the following is maximized:

$$E(\mu) = \sum_{j=1}^N \sum_{i=0}^M \mu_{ij} Q_{ij} \quad \text{(rpm.4)}$$

where

$$Q_{ij} = -\left( \lVert s_j - \mathbf{t} - \mathbf{A}m_i \rVert^2 - \alpha \right) = -\frac{\partial \operatorname{cost}}{\partial \mu_{ij}}$$

This is straightforward, except that now the constraints on $\mu$ are doubly stochastic matrix constraints: $\forall j \ \sum_{i=1}^M \mu_{ij} = 1$ and $\forall i \ \sum_{j=1}^N \mu_{ij} = 1$. As such, the denominator from Equation (rpm.3) cannot be expressed for the 2D case simply. To satisfy the constraints, it is possible to use a result due to Sinkhorn,[39] which states that a doubly stochastic matrix is obtained from any square matrix with all positive entries by the iterative process of alternating row and column normalizations.
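The Sinkhorn step itself is only a few lines. The snippet below is a generic NumPy sketch of that inner normalization (it omits the extra slack row and column that the full RPM algorithm appends to handle unmatched points).

```python
import numpy as np

def sinkhorn(Q, beta, n_iters=100):
    """Turn exp(beta * Q), an all-positive matrix, into an approximately doubly
    stochastic matrix by alternating row and column normalizations."""
    mu = np.exp(beta * Q)
    for _ in range(n_iters):
        mu /= mu.sum(axis=1, keepdims=True)    # normalize rows
        mu /= mu.sum(axis=0, keepdims=True)    # normalize columns
    return mu
```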
The full algorithm can then be written as follows:[39]

    algorithm RPM2D(M, S)
        t := 0
        a, θ, b, c := 0
        β := β0
        μ̂ij := 1 + ε
        while β < βf:
            while μ has not converged:
                update correspondence parameters by softassign:
                    Qij := −∂cost/∂μij
                    μ⁰ij := exp(β Qij)
                apply Sinkhorn's method:
                    while μ̂ has not converged:
                        update μ̂ by normalizing across all rows:
                            μ̂¹ij := μ̂⁰ij / Σ_{i=1}^{M+1} μ̂⁰ij
                        update μ̂ by normalizing across all columns:
                            μ̂⁰ij := μ̂¹ij / Σ_{j=1}^{N+1} μ̂¹ij
                update pose parameters by coordinate descent:
                    update θ using analytical solution
                    update t using analytical solution
                    update a, b, c using Newton's method
            β := βr β
            γ := γ / βr
        return a, b, c, θ and t

where the deterministic annealing control parameter $\beta$ is initially set to $\beta_0$ and increases by a factor $\beta_r$ until it reaches the maximum value $\beta_f$. The summations in the normalization steps sum to $M+1$ and $N+1$ instead of just $M$ and $N$ because the constraints on $\mu$ are inequalities; the $(M+1)$th and $(N+1)$th elements are slack variables.

The algorithm can also be extended for point sets in 3D or higher dimensions. The constraints on the correspondence matrix $\mathbf{\mu}$ are the same in the 3D case as in the 2D case. Hence, the structure of the algorithm remains unchanged, with the main difference being how the rotation and translation matrices are solved.[39]

Thin plate spline robust point matching

Animation of 2D non-rigid registration of the green point set $\mathcal{M}$ to the magenta point set $\mathcal{S}$, corrupted with noisy outliers. The size of the blue circles is inversely related to the control parameter $\beta$. The yellow lines indicate correspondences.

The thin plate spline robust point matching (TPS-RPM) algorithm by Chui and Rangarajan augments the RPM method to perform non-rigid registration by parametrizing the transformation as a thin plate spline.[14] However, because the thin plate spline parametrization only exists in three dimensions, the method cannot be extended to problems involving four or more dimensions.
Kernel correlation

The kernel correlation (KC) approach to point set registration was introduced by Tsin and Kanade.[37] Compared with ICP, the KC algorithm is more robust against noisy data. Unlike ICP, where for every model point only the closest scene point is considered, here every scene point affects every model point;[37] as such, this is a multiply-linked registration algorithm. For some kernel function $K$, the kernel correlation $KC$ of two points $x_i, x_j$ is defined as:[37]

$$KC(x_i, x_j) = \int K(x, x_i) \cdot K(x, x_j) \, dx \quad \text{(kc.1)}$$

The kernel function $K$ chosen for point set registration is typically a symmetric and non-negative kernel, similar to the ones used in Parzen window density estimation. The Gaussian kernel is typically used for its simplicity, although other ones, like the Epanechnikov kernel and the tricube kernel, may be substituted.[37] The kernel correlation of an entire point set $\mathcal{X}$ is defined as the sum of the kernel correlations of every point in the set to every other point in the set:[37]

$$KC(\mathcal{X}) = \sum_{i \neq j} KC(x_i, x_j) = 2 \sum_{i < j} KC(x_i, x_j) \quad \text{(kc.2)}$$

The logarithm of the KC of a point set is proportional, within a constant factor, to the information entropy. Observe that the KC is a measure of the compactness of the point set; trivially, if all points in the point set were at the same location, the KC would evaluate to a large value. The cost function of the point set registration algorithm for some transformation parameter $\theta$ is defined as:

$$\operatorname{cost}(\mathcal{S}, \mathcal{M}, \theta) = -\sum_{m \in \mathcal{M}} \sum_{s \in \mathcal{S}} KC(s, T(m, \theta)) \quad \text{(kc.3)}$$

Some algebraic manipulation yields:

$$KC(\mathcal{S} \cup T(\mathcal{M}, \theta)) = KC(\mathcal{S}) + KC(T(\mathcal{M}, \theta)) - 2 \operatorname{cost}(\mathcal{S}, \mathcal{M}, \theta) \quad \text{(kc.4)}$$

The expression is simplified by observing that $KC(\mathcal{S})$ is independent of $\theta$. Furthermore, assuming rigid registration, $KC(T(\mathcal{M}, \theta))$ is invariant when $\theta$ is changed, because the Euclidean distance between every pair of points stays the same under rigid transformation. So the above equation may be rewritten as:

$$KC(\mathcal{S} \cup T(\mathcal{M}, \theta)) = C - 2 \operatorname{cost}(\mathcal{S}, \mathcal{M}, \theta) \quad \text{(kc.5)}$$

The kernel density estimates are defined as:

$$P_{\mathcal{M}}(x, \theta) = \frac{1}{M} \sum_{m \in \mathcal{M}} K(x, T(m, \theta)), \qquad P_{\mathcal{S}}(x) = \frac{1}{N} \sum_{s \in \mathcal{S}} K(x, s)$$

The cost function can then be shown to be the correlation of the two kernel density estimates:

$$\operatorname{cost}(\mathcal{S}, \mathcal{M}, \theta) = -N^2 \int_x P_{\mathcal{M}} \cdot P_{\mathcal{S}} \, dx \quad \text{(kc.6)}$$

Having established the cost function, the algorithm simply uses gradient descent to find the optimal transformation. It is computationally expensive to compute the cost function from scratch on every iteration, so a discrete version of the cost function (Equation kc.6) is used: the kernel density estimates $P_{\mathcal{M}}, P_{\mathcal{S}}$ can be evaluated at grid points and stored in a lookup table. Unlike ICP and related methods, it is not necessary to find the nearest neighbour, which allows the KC algorithm to be comparatively simple to implement. Compared to ICP and EM-ICP for noisy 2D and 3D point sets, the KC algorithm is less sensitive to noise and results in correct registration more often.[37]
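For a Gaussian kernel, the integral in (kc.1) has a closed form proportional to a Gaussian of the distance between the two points, so the cost (kc.3) can be evaluated directly from pairwise distances. The snippet below is a small NumPy sketch of that evaluation for a fixed rigid transform; the bandwidth value and the dropped normalization constant are illustrative choices, and a full registration would wrap this in a gradient-descent or lookup-table scheme as described above.

```python
import numpy as np

def kc_cost(S, M, R, t, sigma=0.1):
    """Negative kernel-correlation cost (kc.3) for a rigid transform, using the
    Gaussian-kernel closed form KC(a, b) proportional to exp(-||a-b||^2 / (4*sigma^2))."""
    TM = M @ R.T + t                                           # transformed model points
    d2 = ((S[:, None, :] - TM[None, :, :]) ** 2).sum(-1)       # pairwise squared distances
    return -np.exp(-d2 / (4.0 * sigma**2)).sum()
```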
Gaussian mixture model

The kernel density estimates are sums of Gaussians and may therefore be represented as Gaussian mixture models (GMM).[40] Jian and Vemuri use the GMM version of the KC registration algorithm to perform non-rigid registration parametrized by thin plate splines.

Coherent point drift

Rigid (with the addition of scaling) registration of a blue point set $\mathcal{M}$ to the red point set $\mathcal{S}$ using the Coherent Point Drift algorithm. Both point sets have been corrupted with removed points and random spurious outlier points.

Affine registration of a blue point set $\mathcal{M}$ to the red point set $\mathcal{S}$ using the Coherent Point Drift algorithm.

Non-rigid registration of a blue point set $\mathcal{M}$ to the red point set $\mathcal{S}$ using the Coherent Point Drift algorithm. Both point sets have been corrupted with removed points and random spurious outlier points.

Coherent point drift (CPD) was introduced by Myronenko and Song.[13][41] The algorithm takes a probabilistic approach to aligning point sets, similar to the GMM KC method. Unlike earlier approaches to non-rigid registration which assume a thin plate spline transformation model, CPD is agnostic with regard to the transformation model used. The point set $\mathcal{M}$ represents the Gaussian mixture model (GMM) centroids. When the two point sets are optimally aligned, the correspondence is the maximum of the GMM posterior probability for a given data point. To preserve the topological structure of the point sets, the GMM centroids are forced to move coherently as a group. The expectation-maximization algorithm is used to optimize the cost function.[13]

Let there be $M$ points in $\mathcal{M}$ and $N$ points in $\mathcal{S}$. The GMM probability density function for a point $s$ is:

$$p(s) = \sum_{i=1}^{M+1} P(i) \, p(s|i) \quad \text{(cpd.1)}$$

where, in $D$ dimensions, $p(s|i)$ is the Gaussian distribution centered on point $m_i \in \mathcal{M}$:

$$p(s|i) = \frac{1}{(2\pi\sigma^2)^{D/2}} \exp\!\left( -\frac{\lVert s - m_i \rVert^2}{2\sigma^2} \right)$$

The membership probabilities $P(i) = \frac{1}{M}$ are equal for all GMM components. The weight of the uniform distribution is denoted as $w \in [0, 1]$. The mixture model is then:

$$p(s) = w \frac{1}{N} + (1 - w) \sum_{i=1}^M \frac{1}{M} p(s|i) \quad \text{(cpd.2)}$$

The GMM centroids are re-parametrized by a set of parameters $\theta$ estimated by maximizing the likelihood. This is equivalent to minimizing the negative log-likelihood function:

$$E(\theta, \sigma^2) = -\sum_{j=1}^N \log \sum_{i=1}^{M+1} P(i) \, p(s_j|i) \quad \text{(cpd.3)}$$

where it is assumed that the data are independent and identically distributed. The correspondence probability between two points $m_i$ and $s_j$ is defined as the posterior probability of the GMM centroid given the data point:

$$P(i|s_j) = \frac{P(i) \, p(s_j|i)}{p(s_j)}$$

The expectation-maximization (EM) algorithm is used to find $\theta$ and $\sigma^2$. The EM algorithm consists of two steps. First, in the E-step (estimation step), it guesses the values of the parameters ("old" parameter values) and then uses Bayes' theorem to compute the posterior probability distributions $P^{\text{old}}(i|s_j)$ of the mixture components. Second, in the M-step (maximization step), the new parameter values are found by minimizing the expectation of the complete negative log-likelihood function, i.e., the cost function:

$$\operatorname{cost} = -\sum_{j=1}^N \sum_{i=1}^{M+1} P^{\text{old}}(i|s_j) \log\left( P^{\text{new}}(i) \, p^{\text{new}}(s_j|i) \right) \quad \text{(cpd.4)}$$
Ignoring constants independent of $\theta$ and $\sigma$, Equation (cpd.4) can be expressed as:

$$\operatorname{cost}(\theta, \sigma^2) = \frac{1}{2\sigma^2} \sum_{j=1}^N \sum_{i=1}^{M+1} P^{\text{old}}(i|s_j) \lVert s_j - T(m_i, \theta) \rVert^2 + \frac{N_{\mathbf{P}} D}{2} \log \sigma^2 \quad \text{(cpd.5)}$$

where

$$N_{\mathbf{P}} = \sum_{j=1}^{N} \sum_{i=1}^{M} P^{\text{old}}(i|s_j) \leq N$$

with $N = N_{\mathbf{P}}$ only if $w = 0$. The posterior probabilities of the GMM components, computed using the previous parameter values $P^{\text{old}}$, are:

$$P^{\text{old}}(i|s_j) = \frac{\exp\!\left( -\frac{1}{2\sigma_{\text{old}}^2} \lVert s_j - T(m_i, \theta_{\text{old}}) \rVert^2 \right)}{\sum_{k=1}^M \exp\!\left( -\frac{1}{2\sigma_{\text{old}}^2} \lVert s_j - T(m_k, \theta_{\text{old}}) \rVert^2 \right) + (2\pi\sigma^2)^{D/2} \frac{w}{1-w} \frac{M}{N}} \quad \text{(cpd.6)}$$

Minimizing the cost function in Equation (cpd.5) necessarily decreases the negative log-likelihood function $E$ in Equation (cpd.3), unless it is already at a local minimum.[13] Thus, the algorithm can be expressed using the following pseudocode, where the point sets $\mathcal{M}$ and $\mathcal{S}$ are represented as $M \times D$ and $N \times D$ matrices $\mathbf{M}$ and $\mathbf{S}$, respectively:[13]

    algorithm CPD(M, S)
        θ := θ0
        initialize 0 ≤ w ≤ 1
        σ² := (1 / (D N M)) Σ_{j=1}^{N} Σ_{i=1}^{M} ||s_j − m_i||²
        while not registered:
            E-step: compute P, for i ∈ [1, M] and j ∈ [1, N]:
                p_ij := exp(−||s_j − T(m_i, θ)||² / (2σ²)) /
                        ( Σ_{k=1}^{M} exp(−||s_j − T(m_k, θ)||² / (2σ²)) + (2πσ²)^{D/2} (w/(1−w)) (M/N) )
            M-step: solve for the optimal transformation:
                (θ, σ²) := solve(S, M, P)
        return θ
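The E-step of the pseudocode is a dense responsibility computation that vectorizes naturally. The following NumPy sketch computes the $M \times N$ posterior matrix of (cpd.6) for an already-transformed model point set; it is an illustrative fragment, not the reference implementation of Myronenko and Song.[13]

```python
import numpy as np

def cpd_e_step(TM, S, sigma2, w):
    """Compute the CPD responsibilities P[i, j] = P_old(i | s_j) of (cpd.6).
    TM: transformed model points, shape (M, D); S: scene points, shape (N, D)."""
    M, D = TM.shape
    N = S.shape[0]
    d2 = ((TM[:, None, :] - S[None, :, :]) ** 2).sum(-1)        # (M, N) squared distances
    num = np.exp(-d2 / (2.0 * sigma2))
    uniform = (2.0 * np.pi * sigma2) ** (D / 2.0) * w / (1.0 - w) * M / N
    return num / (num.sum(axis=0, keepdims=True) + uniform)     # column-wise normalization
```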
In the pseudocode, the vector $\mathbf{1}$ denotes a column vector of ones. The solve function differs by the type of registration performed. For example, in rigid registration, the output is a scale $a$, a rotation matrix $\mathbf{R}$, and a translation vector $\mathbf{t}$. The parameter $\theta$ can be written as a tuple of these, $\theta = \lbrace a, \mathbf{R}, \mathbf{t} \rbrace$, which is initialized to one, the identity matrix, and a column vector of zeroes: $\theta_0 = \lbrace 1, \mathbf{I}, \mathbf{0} \rbrace$. The aligned point set is $T(\mathbf{M}) = a\mathbf{M}\mathbf{R}^T + \mathbf{1}\mathbf{t}^T$. The solve_rigid function for rigid registration can then be written as follows, with the derivation of the algebra explained in Myronenko's 2010 paper:[13]

    solve_rigid(S, M, P)
        N_P := 1ᵀ P 1
        μ_s := (1/N_P) Sᵀ Pᵀ 1
        μ_m := (1/N_P) Mᵀ P 1
        Ŝ := S − 1 μ_sᵀ
        M̂ := M − 1 μ_mᵀ
        A := Ŝᵀ Pᵀ M̂
        U, V := svd(A)                      (the singular value decomposition of A = U Σ Vᵀ)
        C := diag(1, …, 1, det(U Vᵀ))       (diag(ξ) is the diagonal matrix formed from vector ξ)
        R := U C Vᵀ
        a := tr(Aᵀ R) / tr(M̂ᵀ diag(P 1) M̂)   (tr is the trace of a matrix)
        t := μ_s − a R μ_m
        σ² := (1/(N_P D)) ( tr(Ŝᵀ diag(Pᵀ 1) Ŝ) − a tr(Aᵀ R) )
        return {a, R, t}, σ²

For affine registration, where the goal is to find an affine transformation instead of a rigid one, the output is an affine transformation matrix $\mathbf{B}$ and a translation $\mathbf{t}$, such that the aligned point set is $T(\mathbf{M}) = \mathbf{M}\mathbf{B}^T + \mathbf{1}\mathbf{t}^T$. The solve_affine function for affine registration can then be written as follows, with the derivation of the algebra explained in Myronenko's 2010 paper:[13]

    solve_affine(S, M, P)
        N_P := 1ᵀ P 1
        μ_s := (1/N_P) Sᵀ Pᵀ 1
        μ_m := (1/N_P) Mᵀ P 1
        Ŝ := S − 1 μ_sᵀ
        M̂ := M − 1 μ_mᵀ
        B := (Ŝᵀ Pᵀ M̂)(M̂ᵀ diag(P 1) M̂)⁻¹
        t := μ_s − B μ_m
        σ² := (1/(N_P D)) ( tr(Ŝᵀ diag(Pᵀ 1) Ŝ) − tr(Ŝᵀ Pᵀ M̂ Bᵀ) )
        return {B, t}, σ²

It is also possible to use CPD for non-rigid registration, using a parametrization derived via the calculus of variations.[13] Sums of Gaussian distributions can be computed in linear time using the fast Gauss transform (FGT).[13] Consequently, the time complexity of CPD is $O(M+N)$, which is asymptotically much faster than $O(MN)$ methods.[13]
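The rigid M-step (the solve_rigid routine above) is likewise a direct transcription into matrix code. The sketch below follows those update equations in NumPy; pairing it with the E-step sketch from earlier and re-applying $T(\mathbf{M}) = a\mathbf{M}\mathbf{R}^T + \mathbf{1}\mathbf{t}^T$ each iteration gives a toy rigid CPD loop, not an optimized implementation.

```python
import numpy as np

def cpd_solve_rigid(S, M, P):
    """Sketch of the rigid M-step above: weighted Procrustes from responsibilities.
    S: (N, D) scene, M: (M, D) model, P: (M, N) responsibility matrix."""
    D = S.shape[1]
    Np = P.sum()
    ps, pm = P.sum(axis=0), P.sum(axis=1)              # P^T 1 and P 1
    mu_s, mu_m = S.T @ ps / Np, M.T @ pm / Np          # weighted centroids
    Sh, Mh = S - mu_s, M - mu_m                        # centered point sets
    A = Sh.T @ P.T @ Mh
    U, _, Vt = np.linalg.svd(A)
    C = np.diag(np.r_[np.ones(D - 1), np.linalg.det(U @ Vt)])
    R = U @ C @ Vt
    a = np.trace(A.T @ R) / (pm[:, None] * Mh**2).sum()            # scale
    t = mu_s - a * R @ mu_m
    sigma2 = ((ps[:, None] * Sh**2).sum() - a * np.trace(A.T @ R)) / (Np * D)
    return a, R, t, sigma2
```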
Bayesian coherent point drift (BCPD)

A variant of coherent point drift, called Bayesian coherent point drift (BCPD), was derived through a Bayesian formulation of point set registration.[42] BCPD has several advantages over CPD: (1) nonrigid and rigid registrations can be performed in a single algorithm, (2) the algorithm can be accelerated regardless of the Gaussianity of the Gram matrix used to define motion coherence, and (3) the algorithm is more robust against outliers because of a more reasonable definition of the outlier distribution. Additionally, in the Bayesian formulation, motion coherence is introduced through a prior distribution on displacement vectors, providing a clear separation between the tuning parameters that control motion coherence. BCPD was further accelerated by a method called BCPD++, a three-step procedure composed of (1) downsampling of the point sets, (2) registration of the downsampled point sets, and (3) interpolation of a deformation field.[43] The method can register point sets composed of more than 10M points while maintaining its registration accuracy.

Coherent point drift with local surface geometry (LSG-CPD)

A variant of coherent point drift, called CPD with Local Surface Geometry (LSG-CPD), performs rigid point cloud registration.[44] The method adaptively adds different levels of point-to-plane penalization on top of the point-to-point penalization, based on the flatness of the local surface. This results in GMM components with anisotropic covariances, instead of the isotropic covariances of the original CPD.[13] The anisotropic covariance matrix is modeled as:

$$\Sigma_m^{-1} = \frac{1}{\sigma^2}\left( \alpha_m \mathbf{n}_m \mathbf{n}_m^T + \mathbf{I} \right) \quad \text{(lsg-cpd.1)}$$

where

$$\alpha_m = \frac{1 - \exp\!\left( \lambda\left( 3 - \frac{1}{\kappa_m} \right) \right)}{1 + \exp\!\left( \lambda\left( 3 - \frac{1}{\kappa_m} \right) \right)} \, \alpha_{\max} \quad \text{(lsg-cpd.2)}$$

Here, $\Sigma_m$ is the anisotropic covariance matrix of the $m$th point in the target set, $\mathbf{n}_m$ is the normal vector corresponding to the same point, and $\mathbf{I}$ is an identity matrix serving as a regularizer, pulling the problem away from ill-posedness. The penalization coefficient $\alpha_m$ is a modified sigmoid function, set adaptively to add different levels of point-to-plane penalization depending on how flat the local surface is; this is realized by evaluating the surface variation $\kappa_m$[45] within the neighborhood of the $m$th target point, and $\alpha_{\max}$ is the upper bound of the penalization. The point cloud registration is formulated as a maximum likelihood estimation (MLE) problem and solved with the expectation-maximization (EM) algorithm. In the E-step, the correspondence computation is recast into simple matrix manipulations and efficiently computed on a GPU. In the M-step, an unconstrained optimization on a matrix Lie group is designed to efficiently update the rigid transformation of the registration. Taking advantage of the local geometric covariances, the method shows superior accuracy and robustness to noise and outliers compared with the baseline CPD,[46] and an enhanced runtime is expected thanks to the GPU-accelerated correspondence calculation. An open-source implementation of LSG-CPD is available.
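A sketch of how per-point precision matrices in the spirit of (lsg-cpd.1)-(lsg-cpd.2) can be assembled is given below: each point's normal and surface variation are estimated from a local neighbourhood by eigen-analysis, and the point-to-plane penalization is blended in accordingly. The neighbourhood size and the values of $\lambda$, $\alpha_{\max}$, and $\sigma^2$ are illustrative assumptions, not the settings of the published method.

```python
import numpy as np

def lsg_precision_matrices(points, k=20, lam=10.0, alpha_max=50.0, sigma2=1.0):
    """Build anisotropic precision matrices Sigma_m^{-1} as in (lsg-cpd.1):
    estimate each point's normal and surface variation from its k nearest
    neighbours, then add point-to-plane penalization scaled by (lsg-cpd.2)."""
    precisions = np.zeros((len(points), 3, 3))
    for m in range(len(points)):
        d2 = ((points - points[m]) ** 2).sum(1)
        nbrs = points[np.argsort(d2)[:k]]                 # local neighbourhood
        evals, evecs = np.linalg.eigh(np.cov(nbrs.T))     # ascending eigenvalues
        n = evecs[:, 0]                                   # normal: smallest-eigenvalue direction
        kappa = evals[0] / evals.sum()                    # surface variation, in [0, 1/3]
        u = lam * (3.0 - 1.0 / max(kappa, 1e-9))          # non-positive argument
        alpha = alpha_max * (1.0 - np.exp(u)) / (1.0 + np.exp(u))
        precisions[m] = (alpha * np.outer(n, n) + np.eye(3)) / sigma2
    return precisions
```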
Sorting the Correspondence Space (SCS)

This algorithm was introduced in 2013 by H. Assalih to accommodate sonar image registration.[47] These types of images tend to have high amounts of noise, so many outliers are expected in the point sets to be matched. SCS delivers high robustness against outliers and can surpass ICP and CPD performance in the presence of outliers. SCS does not use iterative optimization in a high-dimensional space and is neither probabilistic nor spectral. SCS can match rigid and non-rigid transformations, and performs best when the target transformation is between three and six degrees of freedom.

See also

- Point feature matching
- Point set triangulation
- Normal distributions transform

References

1. Zhang, Ji; Singh, Sanjiv (May 2015). "Visual-lidar odometry and mapping: low-drift, robust, and fast". 2015 IEEE International Conference on Robotics and Automation (ICRA), pp. 2174-2181. doi:10.1109/ICRA.2015.7139486.
2. Choi, Sungjoon; Zhou, Qian-Yi; Koltun, Vladlen (2015). "Robust reconstruction of indoor scenes". Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5556-5565.
3. Lai, Kevin; Bo, Liefeng; Ren, Xiaofeng; Fox, Dieter (May 2011). "A large-scale hierarchical multi-view RGB-D object dataset". 2011 IEEE International Conference on Robotics and Automation, pp. 1817-1824. doi:10.1109/ICRA.2011.5980382.
4. Yang, Heng; Carlone, Luca (2019). "A polynomial-time solution for robust registration with extreme outlier rates". Robotics: Science and Systems. arXiv:1903.08588. doi:10.15607/RSS.2019.XV.003.
5. Calli, Berk; Singh, Arjun; Bruce, James; Walsman, Aaron; Konolige, Kurt; Srinivasa, Siddhartha; Abbeel, Pieter; Dollar, Aaron M. (2017). "Yale-CMU-Berkeley dataset for robotic manipulation research". The International Journal of Robotics Research 36(3): 261-268. doi:10.1177/0278364917700714.
6. Cadena, Cesar; Carlone, Luca; Carrillo, Henry; Latif, Yasir; Scaramuzza, Davide; Neira, Jose; Reid, Ian; Leonard, John J. (December 2016). "Past, Present, and Future of Simultaneous Localization and Mapping: Toward the Robust-Perception Age". IEEE Transactions on Robotics 32(6): 1309-1332. arXiv:1606.05830. doi:10.1109/TRO.2016.2624754.
7. Mur-Artal, Raul; Montiel, J. M. M.; Tardos, Juan D. (October 2015). "ORB-SLAM: A Versatile and Accurate Monocular SLAM System". IEEE Transactions on Robotics 31(5): 1147-1163. arXiv:1502.00956. doi:10.1109/TRO.2015.2463671.
8. Yang, Heng; Carlone, Luca (2019). "A Quaternion-based Certifiably Optimal Solution to the Wahba Problem with Outliers". Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 1665-1674. arXiv:1905.12536.
9. Newcombe, Richard A.; Izadi, Shahram; Hilliges, Otmar; Molyneaux, David; Kim, David; Davison, Andrew J.; Kohi, Pushmeet; Shotton, Jamie; Hodges, Steve; Fitzgibbon, Andrew (October 2011). "KinectFusion: Real-time dense surface mapping and tracking". 2011 10th IEEE International Symposium on Mixed and Augmented Reality, pp. 127-136. doi:10.1109/ISMAR.2011.6092378.
10. Audette, Michel A.; Ferrie, Frank P.; Peters, Terry M. (2000). "An algorithmic overview of surface registration techniques for medical imaging". Medical Image Analysis 4(3): 201-217. doi:10.1016/S1361-8415(00)00014-1.
11. Jian, Bing; Vemuri, Baba C. (2011). "Robust Point Set Registration Using Gaussian Mixture Models". IEEE Transactions on Pattern Analysis and Machine Intelligence 33(8): 1633-1645. doi:10.1109/tpami.2010.223.
12. Fitzgibbon, Andrew W. (2003). "Robust registration of 2D and 3D point sets". Image and Vision Computing 21(13): 1145-1153. doi:10.1016/j.imavis.2003.09.004.
13. Myronenko, Andriy; Song, Xubo (2010). "Point set registration: Coherent point drift". IEEE Transactions on Pattern Analysis and Machine Intelligence 32(2): 2262-2275. arXiv:0905.2635. doi:10.1109/tpami.2010.46.
14. Chui, Haili; Rangarajan, Anand (2003). "A new point matching algorithm for non-rigid registration". Computer Vision and Image Understanding 89(2): 114-141. doi:10.1016/S1077-3142(03)00009-2.
15. Holz, Dirk; Ichim, Alexandru E.; Tombari, Federico; Rusu, Radu B.; Behnke, Sven (2015). "Registration with the Point Cloud Library: A Modular Framework for Aligning in 3-D". IEEE Robotics & Automation Magazine 22(4): 110-124. doi:10.1109/MRA.2015.2432331.
16. Horn, Berthold K. P. (1987). "Closed-form solution of absolute orientation using unit quaternions". JOSA A 4(4): 629-642. doi:10.1364/JOSAA.4.000629.
17. Arun, K. S.; Huang, T. S.; Blostein, S. D. (September 1987). "Least-Squares Fitting of Two 3-D Point Sets". IEEE Transactions on Pattern Analysis and Machine Intelligence PAMI-9(5): 698-700. doi:10.1109/TPAMI.1987.4767965.
18. Briales, Jesus; Gonzalez-Jimenez, Javier (July 2017). "Convex Global 3D Registration with Lagrangian Duality". 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5612-5621. doi:10.1109/CVPR.2017.595.
19. Yang, Heng; Shi, Jingnan; Carlone, Luca (2020). "TEASER: Fast and Certifiable Point Cloud Registration". arXiv:2001.07715 [cs.RO].
20. Parra Bustos, Alvaro; Chin, Tat-Jun (December 2018). "Guaranteed Outlier Removal for Point Cloud Registration with Correspondences". IEEE Transactions on Pattern Analysis and Machine Intelligence 40(12): 2868-2882. arXiv:1711.10209. doi:10.1109/TPAMI.2017.2773482.
21. Chin, Tat-Jun; Suter, David (2017). "The Maximum Consensus Problem: Recent Algorithmic Advances". Synthesis Lectures on Computer Vision 7(2): 1-194. doi:10.2200/s00757ed1v01y201702cov011.
22. Wen, Fei; Ying, Rendong; Gong, Zheng; Liu, Peilin (February 2020). "Efficient Algorithms for Maximum Consensus Robust Fitting". IEEE Transactions on Robotics 36(1): 92-106. doi:10.1109/TRO.2019.2943061.
23. Cai, Zhipeng; Chin, Tat-Jun; Koltun, Vladlen (2019). "Consensus Maximization Tree Search Revisited". Proceedings of the IEEE International Conference on Computer Vision (ICCV), pp. 1637-1645. arXiv:1908.02021.
24. Bazin, Jean-Charles; Seo, Yongduek; Pollefeys, Marc (2013). "Globally Optimal Consensus Set Maximization through Rotation Search". In: Computer Vision - ACCV 2012, Lecture Notes in Computer Science, vol. 7725, Springer, pp. 539-551. doi:10.1007/978-3-642-37444-9_42.
25. Hartley, Richard I.; Kahl, Fredrik (2009). "Global Optimization through Rotation Space Search". International Journal of Computer Vision 82(1): 64-79. doi:10.1007/s11263-008-0186-9.
26. Fischler, Martin; Bolles, Robert (1981). "Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography". Communications of the ACM 24(6): 381-395. doi:10.1145/358669.358692.
27. Le, Huu Minh; Chin, Tat-Jun; Eriksson, Anders; Do, Thanh-Toan; Suter, David (2019). "Deterministic Approximate Methods for Maximum Consensus Robust Fitting". IEEE Transactions on Pattern Analysis and Machine Intelligence 43(3): 842-857. arXiv:1710.10003. doi:10.1109/TPAMI.2019.2939307.
28. Bustos, Alvaro Parra; Chin, Tat-Jun; Neumann, Frank; Friedrich, Tobias; Katzmann, Maximilian (2019). "A Practical Maximum Clique Algorithm for Matching with Pairwise Constraints". arXiv:1902.01534 [cs.CV].
29. Huber, Peter J.; Ronchetti, Elvezio M. (2009). Robust Statistics. Wiley Series in Probability and Statistics. Hoboken, NJ: John Wiley & Sons. doi:10.1002/9780470434697.
30. Zhou, Qian-Yi; Park, Jaesik; Koltun, Vladlen (2016). "Fast Global Registration". In: Computer Vision - ECCV 2016, Lecture Notes in Computer Science, vol. 9906, Springer, pp. 766-782. doi:10.1007/978-3-319-46475-6_47.
31. MacTavish, Kirk; Barfoot, Timothy D. (2015). "At All Costs: A Comparison of Robust Cost Functions for Camera Correspondence Outliers". 2015 12th Conference on Computer and Robot Vision, pp. 62-69. doi:10.1109/CRV.2015.52.
32. Bosse, Michael; Agamennoni, Gabriel; Gilitschenski, Igor (2016). "Robust Estimation and Applications in Robotics". Foundations and Trends in Robotics 4(4): 225-269. doi:10.1561/2300000047.
33. Black, Michael J.; Rangarajan, Anand (1996). "On the unification of line processes, outlier rejection, and robust statistics with applications in early vision". International Journal of Computer Vision 19(1): 57-91. doi:10.1007/BF00131148.
34. Blake, Andrew; Zisserman, Andrew (1987). Visual Reconstruction. The MIT Press. ISBN 9780262524063.
35. Yang, Heng; Antonante, Pasquale; Tzoumas, Vasileios; Carlone, Luca (2020). "Graduated Non-Convexity for Robust Spatial Perception: From Non-Minimal Solvers to Global Outlier Rejection". IEEE Robotics and Automation Letters 5(2): 1127-1134. arXiv:1909.08605. doi:10.1109/LRA.2020.2965893.
36. Besl, Paul; McKay, Neil (1992). "A Method for Registration of 3-D Shapes". IEEE Transactions on Pattern Analysis and Machine Intelligence 14(2): 239-256. doi:10.1109/34.121791.
37. Tsin, Yanghai; Kanade, Takeo (2004). "A Correlation-Based Approach to Robust Point Set Registration". In: Computer Vision - ECCV 2004, Lecture Notes in Computer Science, vol. 3023, Springer, pp. 558-569. doi:10.1007/978-3-540-24672-5_44.