<h1>Look Ma, No RANSAC</h1>
<p><em>2019-09-20</em></p>
<p>Author: Varun Agrawal<br />
Website: <a href="https://varunagrawal.github.io">varunagrawal.github.io</a></p>
<p>(<em><strong>Psst</strong>, be sure to clone/download GTSAM 4.0.2, which fixes a bug in the Huber model discussed below so that the weights behave correctly</em>)</p>
<h2 id="introduction">Introduction</h2>
<figure class="center">
<img src="/assets/images/robust_estimators/se2_matches.png" alt="Matches between books" />
<figcaption><b>Figure 1</b> Two books on a plane separated by an SE(2) transform, and some manually selected feature matches between them. There are some clearly incorrect matches, which are <b>outliers</b>.</figcaption>
</figure>
<p>Robust error models are powerful tools that extend parameter estimation algorithms with the ability to model outliers. Parameter estimation is a fundamental tool in many fields, especially perception and robotics, so performing it robustly across a wide range of scenarios is crucial to strong performance. This means we must manage outliers in our measurements, and robust error models give us the means to do so. They are also amenable to plug-and-play use in pre-existing optimization frameworks, requiring minimal changes to existing pipelines.</p>
<p>Using robust error models can even obviate the need for more complex, non-deterministic algorithms such as Random Sample Consensus (a.k.a. RANSAC), a fundamental tool for parameter estimation in many a roboticist’s toolbox for years. While RANSAC has proven to work well in practice, it might need a high number of runs to converge to a good solution. Robust error models are conceptually easy and intuitive and can be used by themselves or in conjunction with RANSAC.</p>
<p>In this blog post, we demonstrate the capabilities of robust error models, which down-weight outliers to provide better estimates of the parameters of interest. To illustrate their benefits, take a look at <strong>Figure 1</strong>. We show two books next to each other, related by an $SE(2)$ transform, with some manually labeled point matches between the two books. We have added some outliers to model the problem more realistically.
As we will see later in the post, the estimates from RANSAC and from a robust estimation procedure are quite similar, even though the ratio of inliers to outliers is only $2:1$.</p>
<h2 id="parameter-estimation---a-short-tutorial">Parameter Estimation - A Short Tutorial</h2>
<p>We begin by reviewing techniques for parameter estimation as outlined in the <a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/11/RR-2676.pdf">tutorial by Zhengyou Zhang</a>, applied to our $SE(2)$ example. Given some matches ${(s,s’)}$ between the two images, we want to estimate the $SE(2)$ parameters $\theta$ that transform a feature $s$ in the first image to a feature $s’$ in the second image:</p>
<p>\[ s’ = f(\theta ; s) \]</p>
<p>Of course, this generalizes to other transformations, including 3D-2D problems. This is a ubiquitous problem seen in multiple domains of perception and robotics and is referred to as <strong>parameter estimation</strong>.</p>
<p>A standard framework to estimate the parameters is via the Least-Squares formulation. We usually have many more observations than we have parameters, i.e., the problem is overdetermined. To handle this, we minimize the sum of squared <strong>residuals</strong> $f(\theta ; s_i) - s’_i$, for $i\in1\ldots N$:</p>
<p>\[ E_{LS}(\theta) = \sum_i \vert\vert f(\theta ; s_i) - s’_i \vert\vert^2 \]</p>
<p>which we refer to as <strong>the cost function</strong> or <strong>the objective function</strong>.
In the case of $SE(2)$ the parameters $\theta$ should be some parameterization of a transformation matrix, having three degrees of freedom (DOF). A simple way to accomplish this is to have $\theta=(x,y,\alpha)$.</p>
<p>Our measurement functions are generally non-linear, and hence we need to linearize the measurement function around an estimate of $\theta$. GTSAM will iteratively do so via optimization procedures such as Gauss-Newton, Levenberg-Marquardt, or Dogleg. Linearization is done via the <strong>Taylor expansion</strong> around a linearization point $\theta_0$:</p>
<p>\[ f(\theta_0 + \Delta\theta; s) \approx f(\theta_0; s) + J(\theta_0; s)\Delta\theta \]</p>
<p>This gives us the following linearized least squares objective function:</p>
<p>\[ E_{LS, \theta_0} = \sum_i \vert\vert f(\theta_0; s_i) + J(\theta_0; s_i)\Delta\theta - s_i’ \vert\vert^2 \]</p>
<p>Since the above is now linear in $\Delta\theta$, GTSAM can solve it using either sparse Cholesky or QR factorization.</p>
<h2 id="robust-error-models">Robust Error Models</h2>
<p>We have derived the basic parameter optimization approach in the previous section and seen how the choice of the optimization function affects the optimality of our solution. However, another aspect we need to take into account is the effect of outliers on our optimization and final parameter values.</p>
<p>By default, the optimization objectives outlined above try to model all measurements equally. This means that in the presence of outliers, the optimization process might give us parameter estimates that try to fit these outliers, sacrificing accuracy on the inliers. More formally, given the <em>residual</em> $r_i$ of the $i^{th}$ match, i.e. the difference between the $i^{th}$ observation and the fitted value, the standard least squares approach attempts to optimize the sum of all the squared residuals. This can lead to the estimated parameters being distorted by the equal weighting of all data points. Surely there must be a way for the objective to treat inliers and outliers differently based on their residual errors?</p>
<p>One way to tackle the presence of outliers is a family of models known as <strong>Robust Error Models</strong> or <strong>M-estimators</strong>. The M-estimators try to reduce the effect of outliers by replacing the squared residuals with a function of the residuals $\rho$ that weighs each residual term by some value:</p>
<p>\[ \hat{p} = \arg\min_{p} \sum_{i=1}^{n} \rho(r_i) \]</p>
<p>To allow for optimization, we define $\rho$ to be a symmetric, positive-definite function with a unique minimum at zero, which increases more slowly than the squared residual.</p>
<p>The benefit of this formulation is that we can now solve the above minimization objective as an <strong>Iteratively Reweighted Least Squares</strong> problem. The M-estimator of the parameter vector $p$ based on $\rho$ is the value of the parameters which solves</p>
<p>\[ \sum_i \psi(r_i)\frac{\delta r_i}{\delta p_j} = 0 \]</p>
<p>for $j = 1, \ldots, m$ (recall that the maximum/minimum of a function occurs where its derivative is equal to zero). Above, $\psi(x) = \frac{\delta \rho(x)}{\delta x}$ is called the <strong>influence function</strong>, which we can use to define a <strong>weight function</strong> $w(x) = \frac{\psi(x)}{x}$, letting us rewrite the condition above as</p>
<p>\[ \sum_i w(r_i) r_i \frac{\delta r_i}{\delta p_j} = 0 \]</p>
<p>which is exactly the system of equations solved by iteratively reweighted least squares.</p>
<p>In layman’s terms, the influence function $\psi(x)$ measures the influence of a data point on the value of the parameter estimate. This way, the estimated parameters remain sensitive to inliers but are no longer susceptible to being significantly altered by a single outlying match, making them <strong>robust</strong>.</p>
<h3 id="m-estimator-constraints">M-estimator Constraints</h3>
<p>While M-estimators provide us with significant benefits with respect to outlier modeling, they do come with some constraints which are required to enable their use in a wide variety of optimization problems.</p>
<ul>
<li>The influence function should be bounded.</li>
<li>The robust estimator should be unique, i.e. it should have a unique minimum. This implies that <em>the individual $\rho$-function is convex in variable <strong>p</strong></em>.</li>
<li>The objective should have a gradient, even when the 2nd derivative is singular.</li>
</ul>
<h3 id="common-m-estimators">Common M-estimators</h3>
<p>Below we list some common estimators from the literature that are available out of the box in GTSAM. We also provide accompanying graphs of the corresponding <strong>$\rho$ function</strong>, <strong>influence function</strong>, and <strong>weight function</strong>, in that order, allowing one to visualize the differences and effects of each estimator’s influence function.</p>
<ol>
<li>
<p>Fair
<img src="/assets/images/robust_estimators/fair.png" alt="fair m-estimator" /></p>
</li>
<li>
<p>Huber
<img src="/assets/images/robust_estimators/huber.png" alt="huber m-estimator" /></p>
</li>
<li>
<p>Cauchy
<img src="/assets/images/robust_estimators/cauchy.png" alt="cauchy m-estimator" /></p>
</li>
<li>
<p>Geman-McClure
<img src="/assets/images/robust_estimators/gemanmcclure.png" alt="geman-mcclure m-estimator" /></p>
</li>
<li>
<p>Welsch
<img src="/assets/images/robust_estimators/welsch.png" alt="welsch m-estimator" /></p>
</li>
<li>
<p>Tukey
<img src="/assets/images/robust_estimators/tukey.png" alt="tukey m-estimator" /></p>
</li>
</ol>
<h2 id="example-with-huber-noise-model">Example with Huber Noise Model</h2>
<p>Now it’s time for the real deal. So far we’ve spoken about how great robust estimators are, and how they can be easily modeled in a least squares objective, but having a concrete example and application can really help illuminate these concepts and demonstrate the power of a robust error model. In this specific case we use the <strong>Huber M-estimator</strong>, though any other provided M-estimator can be used depending on the application or preference.</p>
<p>For our example application, the estimation of an $SE(2)$ transformation between two objects (a scenario commonly seen in PoseSLAM applications), we go back to our image of the two books from the introduction, which we have manually labeled with matches and outliers. A RANSAC estimate using the matches gives us the $SE(2)$ parameters <code class="language-plaintext highlighter-rouge">(347.15593, 420.31040, 0.39645)</code>.</p>
<figure class="center">
<img src="/assets/images/robust_estimators/se2_matches.png" alt="Matches between books" />
<figcaption><b>Figure 2</b>: Matches between the 2 books.</figcaption>
</figure>
<p>To begin, we apply a straightforward optimization process based on Factor Graphs. Using GTSAM, this can be achieved in a few lines of code. We show the core part of the example below, omitting the housekeeping and data loading for brevity.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// This is the value we wish to estimate</span>
<span class="n">Pose2_</span> <span class="nf">pose_expr</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
<span class="c1">// Set up initial values, and Factor Graph</span>
<span class="n">Values</span> <span class="n">initial</span><span class="p">;</span>
<span class="n">ExpressionFactorGraph</span> <span class="n">graph</span><span class="p">;</span>
<span class="c1">// provide an initial estimate which is pretty much random</span>
<span class="n">initial</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">Pose2</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mf">0.01</span><span class="p">));</span>
<span class="c1">// We assume the same noise model for all points (since it is the same camera)</span>
<span class="k">auto</span> <span class="n">measurementNoise</span> <span class="o">=</span> <span class="n">noiseModel</span><span class="o">::</span><span class="n">Isotropic</span><span class="o">::</span><span class="n">Sigma</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">);</span>
<span class="c1">// Now we add in the factors for the measurement matches.</span>
<span class="c1">// Matches is a vector of 4 tuples (index1, keypoint1, index2, keypoint2)</span>
<span class="kt">int</span> <span class="n">index_i</span><span class="p">,</span> <span class="n">index_j</span><span class="p">;</span>
<span class="n">Point2</span> <span class="n">p</span><span class="p">,</span> <span class="n">measurement</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span><span class="n">vector</span><span class="o"><</span><span class="n">tuple</span><span class="o"><</span><span class="kt">int</span><span class="p">,</span> <span class="n">Point2</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="n">Point2</span><span class="o">>>::</span><span class="n">iterator</span> <span class="n">it</span> <span class="o">=</span> <span class="n">matches</span><span class="p">.</span><span class="n">begin</span><span class="p">();</span>
<span class="n">it</span> <span class="o">!=</span> <span class="n">matches</span><span class="p">.</span><span class="n">end</span><span class="p">();</span> <span class="o">++</span><span class="n">it</span><span class="p">)</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">tie</span><span class="p">(</span><span class="n">index_i</span><span class="p">,</span> <span class="n">measurement</span><span class="p">,</span> <span class="n">index_j</span><span class="p">,</span> <span class="n">p</span><span class="p">)</span> <span class="o">=</span> <span class="o">*</span><span class="n">it</span><span class="p">;</span>
<span class="n">Point2_</span> <span class="n">predicted</span> <span class="o">=</span> <span class="n">transformTo</span><span class="p">(</span><span class="n">pose_expr</span><span class="p">,</span> <span class="n">p</span><span class="p">);</span>
<span class="c1">// Add the Point2 expression variable, an initial estimate, and the measurement noise.</span>
<span class="n">graph</span><span class="p">.</span><span class="n">addExpressionFactor</span><span class="p">(</span><span class="n">predicted</span><span class="p">,</span> <span class="n">measurement</span><span class="p">,</span> <span class="n">measurementNoise</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// Optimize and print basic result</span>
<span class="n">Values</span> <span class="n">result</span> <span class="o">=</span> <span class="n">LevenbergMarquardtOptimizer</span><span class="p">(</span><span class="n">graph</span><span class="p">,</span> <span class="n">initial</span><span class="p">).</span><span class="n">optimize</span><span class="p">();</span>
<span class="n">result</span><span class="p">.</span><span class="n">print</span><span class="p">(</span><span class="s">"Final Result:</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
</code></pre></div></div>
<p>It is important to note that our initial estimate for the transform values is pretty arbitrary.
Running the above code gives us the transform values <code class="language-plaintext highlighter-rouge">(305.751, 520.127, 0.284743)</code>, which doesn’t look so good when compared to the RANSAC estimate.</p>
<p>Now how about we try using M-estimators via the built-in robust error models? This is a two-line change, as illustrated below:</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// This is the value we wish to estimate</span>
<span class="n">Pose2_</span> <span class="nf">pose_expr</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
<span class="c1">// Set up initial values, and Factor Graph</span>
<span class="n">Values</span> <span class="n">initial</span><span class="p">;</span>
<span class="n">ExpressionFactorGraph</span> <span class="n">graph</span><span class="p">;</span>
<span class="c1">// provide an initial estimate which is pretty much random</span>
<span class="n">initial</span><span class="p">.</span><span class="n">insert</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="n">Pose2</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mf">0.01</span><span class="p">));</span>
<span class="c1">// We assume the same noise model for all points (since it is the same camera)</span>
<span class="k">auto</span> <span class="n">measurementNoise</span> <span class="o">=</span> <span class="n">noiseModel</span><span class="o">::</span><span class="n">Isotropic</span><span class="o">::</span><span class="n">Sigma</span><span class="p">(</span><span class="mi">2</span><span class="p">,</span> <span class="mf">1.0</span><span class="p">);</span>
<span class="cm">/********* First change *********/</span>
<span class="c1">// We define our robust error model here, providing the default parameter value for the estimator.</span>
<span class="k">auto</span> <span class="n">huber</span> <span class="o">=</span> <span class="n">noiseModel</span><span class="o">::</span><span class="n">Robust</span><span class="o">::</span><span class="n">Create</span><span class="p">(</span><span class="n">noiseModel</span><span class="o">::</span><span class="n">mEstimator</span><span class="o">::</span><span class="n">Huber</span><span class="o">::</span><span class="n">Create</span><span class="p">(</span><span class="mf">1.345</span><span class="p">),</span> <span class="n">measurementNoise</span><span class="p">);</span>
<span class="c1">// Now we add in the factors for the measurement matches.</span>
<span class="c1">// Matches is a vector of 4 tuples (index1, keypoint1, index2, keypoint2)</span>
<span class="kt">int</span> <span class="n">index_i</span><span class="p">,</span> <span class="n">index_j</span><span class="p">;</span>
<span class="n">Point2</span> <span class="n">p</span><span class="p">,</span> <span class="n">measurement</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span><span class="n">vector</span><span class="o"><</span><span class="n">tuple</span><span class="o"><</span><span class="kt">int</span><span class="p">,</span> <span class="n">Point2</span><span class="p">,</span> <span class="kt">int</span><span class="p">,</span> <span class="n">Point2</span><span class="o">>>::</span><span class="n">iterator</span> <span class="n">it</span> <span class="o">=</span> <span class="n">matches</span><span class="p">.</span><span class="n">begin</span><span class="p">();</span>
<span class="n">it</span> <span class="o">!=</span> <span class="n">matches</span><span class="p">.</span><span class="n">end</span><span class="p">();</span> <span class="o">++</span><span class="n">it</span><span class="p">)</span> <span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">tie</span><span class="p">(</span><span class="n">index_i</span><span class="p">,</span> <span class="n">measurement</span><span class="p">,</span> <span class="n">index_j</span><span class="p">,</span> <span class="n">p</span><span class="p">)</span> <span class="o">=</span> <span class="o">*</span><span class="n">it</span><span class="p">;</span>
<span class="n">Point2_</span> <span class="n">predicted</span> <span class="o">=</span> <span class="n">transformTo</span><span class="p">(</span><span class="n">pose_expr</span><span class="p">,</span> <span class="n">p</span><span class="p">);</span>
<span class="c1">// Add the Point2 expression variable, an initial estimate, and the measurement noise.</span>
<span class="c1">// The graph takes in factors with the robust error model.</span>
<span class="cm">/********* Second change *********/</span>
<span class="n">graph</span><span class="p">.</span><span class="n">addExpressionFactor</span><span class="p">(</span><span class="n">predicted</span><span class="p">,</span> <span class="n">measurement</span><span class="p">,</span> <span class="n">huber</span><span class="p">);</span>
<span class="p">}</span>
<span class="c1">// Optimize and print basic result</span>
<span class="n">Values</span> <span class="n">result</span> <span class="o">=</span> <span class="n">LevenbergMarquardtOptimizer</span><span class="p">(</span><span class="n">graph</span><span class="p">,</span> <span class="n">initial</span><span class="p">).</span><span class="n">optimize</span><span class="p">();</span>
<span class="n">result</span><span class="p">.</span><span class="n">print</span><span class="p">(</span><span class="s">"Final Result:</span><span class="se">\n</span><span class="s">"</span><span class="p">);</span>
</code></pre></div></div>
<p>This version of the parameter estimation gives us the $SE(2)$ transform <code class="language-plaintext highlighter-rouge">(363.76, 458.966, 0.392419)</code>. Quite close for only 15 total matches, especially considering 5 of the 15 were outliers! The estimates are expected to improve with more matches to further constrain the problem.</p>
<p>As an alternate example, we look at an induced $SE(2)$ transform and try to estimate it from scratch (no manual match labeling). Our source image is one of crayon boxes retrieved from the internet, since this image lends itself to good feature detection.</p>
<p><img src="/assets/images/robust_estimators/original_image.png" alt="original image" /></p>
<p>To model an $SE(2)$ transformation, we apply a perspective transform to the image. The transformation applied is <code class="language-plaintext highlighter-rouge">(x, y, theta) = (4.711, 3.702, 0.13963)</code>. This gives us a ground truth value to compare our methods against. The transformed image can be seen below.</p>
<p><img src="/assets/images/robust_estimators/transformed_image.png" alt="Warped image" /></p>
<p>We run the standard pipeline of SIFT feature extraction and FLANN+KDTree based matching to obtain a set of matches. At this point we are ready to evaluate the different methods of estimating the transformation.</p>
<p>The vanilla version of the parameter estimation gives us <code class="language-plaintext highlighter-rouge">(6.26294, -14.6573, 0.153888)</code>, which is pretty bad. However, the version with robust error models gives us <code class="language-plaintext highlighter-rouge">(4.75419, 3.60199, 0.139674)</code>, a far better estimate when compared to the ground truth, despite the initial estimate for the transform being arbitrary. This makes it apparent that robust estimators, and consequently robust error models, are the way to go for use cases where outliers are a concern. Of course, providing better initial estimates will only improve the final estimate.</p>
<p>You may ask: how does this compare to our dear old friend RANSAC? Using the OpenCV implementation of RANSAC, a similar pipeline gives us the following $SE(2)$ values: <code class="language-plaintext highlighter-rouge">(4.77360, 3.69461, 0.13960)</code>.</p>
<p>We show, in order, the original image, its warped form, and the recovered image from the $SE(2)$ transformation estimation. The first image is from the ground truth transform, the second one is using RANSAC, the next one is using a vanilla parameter estimation approach, and the last one uses robust error models. As you can see, while the vanilla least-squares optimization result is poor compared to the ground truth, the RANSAC and the robust error model recover the transformation correctly, with the robust error model’s result being comparable to the RANSAC one.</p>
<figure class="center">
<img src="/assets/images/robust_estimators/ground_truth_images.png" alt="ground truth" />
<figcaption>Image warping and recovery using ground truth SE(2) transform. The first image is the original image, the 2nd image is the transformed image, and the last one is the image we get on applying the reverse $SE(2)$ transform.</figcaption>
</figure>
<figure class="center">
<img src="/assets/images/robust_estimators/ransac_images.png" alt="ransac" />
<figcaption>Image warping and recovery using RANSAC.</figcaption>
</figure>
<figure class="center">
<img src="/assets/images/robust_estimators/vanilla_model_images.png" alt="vanilla" />
<figcaption>Image warping and recovery using plain old parameter estimation. You can see that the 3rd (recovered) image does not line up correctly.</figcaption>
</figure>
<figure class="center">
<img src="/assets/images/robust_estimators/robust_model_images.png" alt="robust error model" />
<figcaption>Image warping and recovery using robust error models with parameter estimation. These results are comparable to the ones from RANSAC, demonstrating the promise of robust error models.</figcaption>
</figure>
<!-- That's pretty close too, so why go through the headache of using robust error models? For one, unlike RANSAC, robust error models are deterministic and possess defined behavior. Moreover, one does not need to run multiple runs of optimization to obtain a consistent result, compared to RANSAC which may require hundreds of runs to converge to a good result. -->
<!-- ## Robust Error Models + RANSAC -->
<h2 id="conclusion">Conclusion</h2>
<p>In this post, we have seen the basics of parameter estimation, a ubiquitous mathematical framework for many perception and robotics problems, and we have seen how this framework is susceptible to perturbations from outliers which can throw off the final estimate. More importantly, we have seen how a simple tool called the Robust M-estimator can easily help us deal with these outliers and their effects. An example case for $SE(2)$ transform estimation demonstrates not only their ease of use with GTSAM, but also the efficacy of the solution generated, especially when compared to widely used alternatives such as RANSAC.</p>
<p>Furthermore, robust estimators are deterministic, avoiding the complexity inherent in RANSAC. While RANSAC is a great tool, robust error models provide a solid alternative worth considering. With their benefits of speed, accuracy, and ease of use, robust error models make a strong case for adoption in many related problems, and we hope you will give them a shot the next time you use GTSAM.</p>
<h1>Legged Robot Factors Part I</h1>
<p><em>2019-09-18</em></p>
<p>Author: Ross Hartley<br />
email: <a href="mailto:m.ross.hartley@gmail.com">m.ross.hartley@gmail.com</a></p>
<p>This is the first blog post in a series about using factor graphs for legged robot state estimation. It is meant to provide a high-level overview of what I call <em>kinematic and contact factors</em> and how they can be used in GTSAM. More details can be found in our conference papers:</p>
<ul>
<li><a href="https://arxiv.org/abs/1803.07531">Hybrid Contact Preintegration for Visual-Inertial-Contact State Estimation Using Factor Graphs</a></li>
<li><a href="https://arxiv.org/abs/1712.05873">Legged Robot State-Estimation Through Combined Forward Kinematic and Preintegrated Contact Factors</a></li>
</ul>
<p>It is assumed that the reader is already familiar with the terminology and theory behind factor graph based smoothing.</p>
<h2 id="so-what-makes-legged-robots-different"><strong>So, what makes legged robots different?</strong></h2>
<p>Factor graph methods have been widely successful for mobile robot state estimation and SLAM. A common application is the fusion of inertial data (from an IMU) with visual data (from camera and/or LiDAR sensors) to estimate the robot’s pose over time.</p>
<p>Although this approach is often demonstrated on wheeled and flying robots, identical techniques can be applied to their walking brethren. What makes legged robots different, however, is the presence of additional encoder and contact sensors. As I’ll show, these extra sensor measurements can be leveraged to improve state estimation results.</p>
<blockquote>
<p><strong>Sensors typically found on legged robots:</strong></p>
<ul>
<li>Inertial Measurement Units (IMUs)</li>
<li>Vision Sensors (cameras, LiDARs)</li>
<li>Joint Encoders</li>
<li>Contact Sensors</li>
</ul>
</blockquote>
<p>All of these sensors exist on the University of Michigan’s version of the Cassie robot (developed by <a href="http://www.agilityrobotics.com/">Agility Robotics</a>), which I’ll use as a concrete example.
<img src="/assets/images/Cassie.jpg" alt="Cassie" /></p>
<h2 id="factor-graph-formulation"><strong>Factor Graph Formulation</strong></h2>
<p>Let’s see how we can create a factor graph using these 4 sensor types to estimate the trajectory of a legged robot. Each node in the graph represents the robot’s state at a particular timestep. This state includes the 3D orientation, position, and velocity of the robot’s base frame along with the IMU biases. We also include the pose of the contact frame (where the foot hits the ground) to this list of states. For simplicity, the base frame is assumed to be collocated with the inertial/vision sensor frames.</p>
<blockquote>
<p><strong>Estimated States:</strong></p>
<ul>
<li>Base pose, <code class="language-plaintext highlighter-rouge">X</code></li>
<li>Base velocity, <code class="language-plaintext highlighter-rouge">V</code></li>
<li>Contact pose, <code class="language-plaintext highlighter-rouge">C</code></li>
<li>IMU biases, <code class="language-plaintext highlighter-rouge">b</code></li>
</ul>
</blockquote>
<p>Each independent sensor measurement will place a <em>measurement factor</em> on the graph’s nodes. Solving the factor graph consists of searching for the maximum a posteriori state estimate that minimizes the error between the predicted and actual measurements.</p>
<p>The robot’s inertial measurements can be incorporated into the graph using the <em>preintegrated IMU factor</em> built into GTSAM 4.0. This factor relates the base pose, velocity, and IMU biases across consecutive timesteps.</p>
<p>Vision data can be incorporated into the graph using a number of different factors depending on the sensor type and application. Here we will simply assume that vision provides a relative pose factor between two nodes in the graph. This can be from either visual odometry or loop closures.</p>
<p>At each timestep, the joint encoder data can be used to compute the relative pose transformation between the robot’s base and contact frames (through forward kinematics). This measurement can be captured in a unary <strong><em>forward kinematic factor</em></strong>.</p>
<p>Of course, adding the <em>forward kinematic factor</em> will not affect the optimal state estimate unless additional constraints are placed on the contact frame poses. This is achieved using a binary <strong><em>contact factor</em></strong>, which uses contact measurements to infer the movement of the contact frame over time. In the simplest case, contact implies zero movement of this frame. In other words, this <em>contact factor</em> tries to keep the contact pose fixed across timesteps where contact was measured. When contact is absent, this factor can simply be omitted from the graph.</p>
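<p>Although the post does not spell out the contact factor's code, the zero-movement case could be sketched in the same GTSAM style as the forward kinematic factor shown later; this is a hypothetical sketch, and names such as <code class="language-plaintext highlighter-rouge">contact_cov</code> are assumptions rather than the actual implementation:</p>

```
// Hypothetical sketch: contact is measured between timesteps `node` and
// `node + 1`, so the contact pose is constrained to stay fixed (identity
// relative motion); contact_cov encodes how much foot slip we tolerate.
BetweenFactor<Pose3> contact_factor(C(node), C(node + 1), Pose3(),
    noiseModel::Gaussian::Covariance(contact_cov));
```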
<p>If we know how the contact frame moves over time and we can measure the relative pose between the robot’s contact and base frames, then we have an implicit measurement of how the robot’s base moves over time.</p>
<blockquote>
<p>The combined <em>forward kinematic and contact factors</em> can be viewed as <strong><em>kinematic odometry</em></strong> measurements of the robot’s base frame.</p>
</blockquote>
<p>Altogether, a typical legged robot factor graph may be represented by the picture below. It will contain inertial, vision, forward kinematic, and contact factors.
<img src="/assets/images/Factor_Graph.PNG" alt="Legged robot factor graph" /></p>
<h2 id="forward-kinematic-factor"><strong>Forward Kinematic Factor</strong></h2>
<p>The <em>forward kinematics factor</em> relates the base pose to the current contact pose using noisy encoder measurements. This is a simple relative pose factor, which means we can use GTSAM’s built-in <code class="language-plaintext highlighter-rouge">BetweenFactor<Pose3></code> factor to implement it. We just need to determine what the factor’s covariance will be.</p>
<p>Assuming the encoder noise is Gaussian, we can map the encoder covariance to the contact pose covariance using the body <strong>manipulator Jacobian</strong> of the forward kinematics function. In general, the manipulator Jacobian maps joint angle rates to end effector twist, so it makes sense that it can be used to approximate the mapping of encoder uncertainty through the non-linearities of the robot’s kinematics.</p>
<p>For example, if <code class="language-plaintext highlighter-rouge">H_BC</code> is the pose of the contact frame relative to the base frame and <code class="language-plaintext highlighter-rouge">J_BC</code> is the corresponding body manipulator Jacobian, the <em>forward kinematic factor</em> can be implemented using the following GTSAM code:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Matrix6 FK_Cov = J_BC * encoder_covariance_matrix * J_BC.transpose();
BetweenFactor<Pose3> forward_kinematics_factor(X(node), C(node), H_BC, noiseModel::Gaussian::Covariance(FK_Cov));
</code></pre></div></div>
<p><img src="/assets/images/Cassie_FK.PNG" alt="Cassie Forward Kinematics" /></p>
<h2 id="rigid-contact-factor"><strong>Rigid Contact Factor</strong></h2>
<p><em>Contact factors</em> come in a number of different flavors depending on the assumptions we want to make about the contact sensor measurements. However, they will all provide a measurement of contact frame odometry (i.e., how the foot will move over time).</p>
<p>For example, in the simplest case, perhaps measuring contact implies that the entire pose of the foot remains fixed. We can call this <strong>rigid contact</strong>, and it may be a good assumption for many humanoid robots that have large, flat feet. In contrast, we could alternatively assume a <strong>point contact</strong>, where the position of the contact frame remains fixed, but the foot is free to rotate.</p>
<p>For now, let’s look at how we can implement a <strong><em>rigid contact factor</em></strong>. Again, when contact is measured, we assume there is no relative change in the contact frame pose. In other words, the contact frame velocity is zero. This can be implemented in GTSAM using a <code class="language-plaintext highlighter-rouge">BetweenFactor<Pose3></code> factor, where the measurement is simply the identity element.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BetweenFactor<Pose3> contact_factor(C(node-1), C(node), Pose3::identity(), noiseModel::Gaussian::Covariance(Sigma_ij);
</code></pre></div></div>
<p>Some potential foot slip can be accommodated through the factor’s covariance, <code class="language-plaintext highlighter-rouge">Sigma_ij</code>. If we are highly confident that the foot remained fixed on the ground, this covariance should be small. A large covariance implies less confidence in this assumption.</p>
<p>One idea is simply assuming Gaussian noise on the contact frame velocity. In this case, the factor’s covariance will grow with the length of time between the graph nodes. Another idea is to use contact force information to model the factor’s covariance.</p>
<h3 id="what-happens-when-contact-is-madebroken"><strong>What happens when contact is made/broken?</strong></h3>
<p>Using the formulation above, every time contact is made or broken, a new node has to be added to the factor graph. This stems from our contact measurement assumption. When the robot loses contact, we have no way to determine the foot movement using contact sensors alone. Our only choice is to add a new node in the graph and omit the contact factor until contact is regained.</p>
<p>This may not be an issue for some slow walking robots where contact states change infrequently, <strong>but what about a running hexapod?</strong> In that case, the numerous contact changes will lead to an explosion in the number of nodes (and optimization variables) needed in the graph. This ultimately affects the performance of the state estimator by increasing the time it takes to solve the underlying optimization problem.</p>
<p>Thankfully, there is a way around this problem.</p>
<h2 id="hybrid-rigid-contact-factor"><strong>Hybrid Rigid Contact Factor</strong></h2>
<p>During many types of walking gaits, when one contact is broken, another is made. For example, with bipedal walking, left stance is followed by right stance, then left, then right, and so on. The order on a hexapod might be more complicated, but one fact remains: <strong>there is often (at least) one foot on the ground at all times</strong>. We can use this knowledge to improve performance in our factor graph by limiting the insertion rate of new graph nodes.</p>
<p>If we choose two arbitrary times (potentially far apart), any single contact is likely to have been broken at some point between them. However, there may be a <strong>chain of contact frames</strong> that we can swap through to maintain the notion of contact with the environment. Each consecutive pair of contact frames in this chain are related to each other through forward kinematics.</p>
<p>For example, let’s say our biped robot switched from left stance (L1), to right stance (R1), back to left stance (L2) again. During the L1 phase, we can assume the contact frame pose remained fixed (zero velocity). When the robot switches from L1 to R1, we can map this contact frame from the left to the right foot using the encoder measurements and forward kinematics (since both feet are on the ground at this time). During the R1 phase, we again assume that the contact pose remains fixed. When the second swap happens, R1 to L2, we map the contact frame back to the left foot using the encoder measurements and a (different) forward kinematics function. So in effect, we have tracked the movement of the contact frame across two contact switches. This relative pose measurement provides odometry for the contact frame and can be used to create a <strong><em>hybrid rigid contact factor</em></strong>. Using this method, nodes can now be added to the graph at arbitrary times, and do not have to be added when contact is made/broken (unless all feet come off the ground).</p>
<p><img src="/assets/images/Cassie_FK_Switch.PNG" alt="Cassie Forward Kinematics" /></p>
<p>Like the original <em>rigid contact factor</em>, the hybrid version can be created using GTSAM’s <code class="language-plaintext highlighter-rouge">BetweenFactor<Pose3></code> factor, where <code class="language-plaintext highlighter-rouge">delta_Hij</code> is the cumulative change in contact pose across all contact switches. The factor’s covariance is typically larger as it needs to account for the uncertainty in each swap’s forward kinematics.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BetweenFactor<Pose3> hybrid_contact_factor(C(node-1), C(node), delta_Hij, noiseModel::Gaussian::Covariance(Sigma_ij);
</code></pre></div></div>
<h2 id="coming-soon"><strong>Coming soon…</strong></h2>
<p>I briefly discussed two types of factors that can be used to improve legged robot state estimation: the <strong><em>forward kinematic factor</em></strong> and the <strong><em>(hybrid) rigid contact factor</em></strong>. Combining these two factors allows for kinematic odometry to be added alongside other measurements (inertial, vision, etc.) when building up a factor graph.</p>
<p>In particular, when developing the <em>rigid contact factor</em>, we made the strong assumption that a contact measurement implies zero angular and linear velocity of the contact frame. The factor tries to keep the entire pose of the foot fixed across two timesteps. This assumption may not be valid for all types of walking robots. In fact, it doesn’t even hold for the Cassie robot! The roll angle about Cassie’s foot is unactuated and free to move during walking. In the next post, I will discuss the <strong>(hybrid) point contact factor</strong>, which makes no assumptions about the angular velocity of the contact frame.</p>
<p>Author: Ross Hartley<br />
Email: <a href="mailto:m.ross.hartley@gmail.com">m.ross.hartley@gmail.com</a></p>