Factor Graphs and GTSAM:A Hands-on Introduction

This is an updated version of the 2012 tech-report Factor Graphs and GTSAM: A Hands-on Introduction by Frank Dellaert. A more thorough introduction to the use of factor graphs in robotics is the 2017 article Factor graphs for robot perception by Frank Dellaert and Michael Kaess.

## Overview

Factor graphs are graphical models (Koller and Friedman, 2009) that are well suited to modeling complex estimation problems, such as Simultaneous Localization and Mapping (SLAM) or Structure from Motion (SFM). You might be familiar with another often used graphical model, Bayes networks, which are directed acyclic graphs. A factor graph, however, is a bipartite graph consisting of factors connected to variables. The variables represent the unknown random variables in the estimation problem, whereas the factors represent probabilistic constraints on those variables, derived from measurements or prior knowledge. In the following sections I will illustrate this with examples from both robotics and vision.
The GTSAM toolbox (GTSAM stands for “Georgia Tech Smoothing and Mapping”) toolbox is a BSD-licensed C++ library based on factor graphs, developed at the Georgia Institute of Technology by myself, many of my students, and collaborators. It provides state of the art solutions to the SLAM and SFM problems, but can also be used to model and solve both simpler and more complex estimation problems. It also provides a MATLAB interface which allows for rapid prototype development, visualization, and user interaction.
GTSAM exploits sparsity to be computationally efficient. Typically measurements only provide information on the relationship between a handful of variables, and hence the resulting factor graph will be sparsely connected. This is exploited by the algorithms implemented in GTSAM to reduce computational complexity. Even when graphs are too dense to be handled efficiently by direct methods, GTSAM provides iterative methods that are quite efficient regardless.

## 1 Factor Graphs

Let us start with a one-page primer on factor graphs, which in no way replaces the excellent and detailed reviews by Kschischang et al. (2001) and Loeliger (2004). Figure 1: An HMM, unrolled over three time-steps, represented by a Bayes net.
Figure 1 shows the Bayes network for a hidden Markov model (HMM) over three time steps. In a Bayes net, each node is associated with a conditional density: the top Markov chain encodes the prior $P\left({X}_{1}\right)$ and transition probabilities $P\left({X}_{2}|{X}_{1}\right)$ and $P\left({X}_{3}|{X}_{2}\right)$, whereas measurements ${Z}_{t}$ depend only on the state ${X}_{t}$, modeled by conditional densities $P\left({Z}_{t}|{X}_{t}\right)$. Given known measurements ${z}_{1}$, ${z}_{2}$ and ${z}_{3}$ we are interested in the hidden state sequence $\left({X}_{1},{X}_{2},{X}_{3}\right)$ that maximizes the posterior probability $P\left({X}_{1},{X}_{2},{X}_{3}|{Z}_{1}={z}_{1},{Z}_{2}={z}_{2},{Z}_{3}={z}_{3}\right)$. Since the measurements ${Z}_{1}$, ${Z}_{2}$, and ${Z}_{3}$ are known, the posterior is proportional to the product of six factors, three of which derive from the the Markov chain, and three likelihood factors defined as $L\left({X}_{t};z\right)\propto P\left({Z}_{t}=z|{X}_{t}\right)$:$P\left({X}_{1},{X}_{2},{X}_{3}|{Z}_{1},{Z}_{2},{Z}_{3}\right)\propto P\left({X}_{1}\right)P\left({X}_{2}|{X}_{1}\right)P\left({X}_{3}|{X}_{2}\right)L\left({X}_{1};{z}_{1}\right)L\left({X}_{2};{z}_{2}\right)L\left({X}_{3};{z}_{3}\right)$ Figure 2: An HMM with observed measurements, unrolled over time, represented as a factor graph.
This motivates a different graphical model, a factor graph, in which we only represent the unknown variables ${X}_{1}$, ${X}_{2}$, and ${X}_{3}$, connected to factors that encode probabilistic information on them, as in Figure 2. To do maximum a-posteriori (MAP) inference, we then maximize the product $f\left({X}_{1},{X}_{2},{X}_{3}\right)=\prod {f}_{i}\left({X}_{i}\right)$i.e., the value of the factor graph. It should be clear from the figure that the connectivity of a factor graph encodes, for each factor ${f}_{i}$, which subset of variables ${X}_{i}$ it depends on. In the examples below, we use factor graphs to model more complex MAP inference problems in robotics.

## 2 Modeling Robot Motion

### 2.1 Modeling with Factor Graphs

Before diving into a SLAM example, let us consider the simpler problem of modeling robot motion. This can be done with a continuous Markov chain, and provides a gentle introduction to GTSAM. Figure 3: Factor graph for robot localization.
The factor graph for a simple example is shown in Figure 3. There are three variables ${x}_{1}$, ${x}_{2}$, and ${x}_{3}$ which represent the poses of the robot over time, rendered in the figure by the open-circle variable nodes. In this example, we have one unary factor ${f}_{0}\left({x}_{1}\right)$ on the first pose ${x}_{1}$ that encodes our prior knowledge about ${x}_{1}$, and two binary factors that relate successive poses, respectively ${f}_{1}\left({x}_{1},{x}_{2};{o}_{1}\right)$ and ${f}_{2}\left({x}_{2},{x}_{3};{o}_{2}\right)$, where ${o}_{1}$ and ${o}_{2}$ represent odometry measurements.

### 2.2 Creating a Factor Graph

The following C++ code, included in GTSAM as an example, creates the factor graph in Figure 3:
// Create an empty nonlinear factor graph
NonlinearFactorGraph graph;

// Add a Gaussian prior on pose x_1
Pose2 priorMean(0.0, 0.0, 0.0);
noiseModel::Diagonal::shared_ptr priorNoise =
noiseModel::Diagonal::Sigmas(Vector3(0.3, 0.3, 0.1));

Pose2 odometry(2.0, 0.0, 0.0);
noiseModel::Diagonal::shared_ptr odometryNoise =
noiseModel::Diagonal::Sigmas(Vector3(0.2, 0.2, 0.1));

Above, line 2 creates an empty factor graph. We then add the factor ${f}_{0}\left({x}_{1}\right)$ on lines 5-8 as an instance of PriorFactor<T>, a templated class provided in the slam subfolder, with T=Pose2. Its constructor takes a variable Key (in this case 1), a mean of type Pose2, created on Line 5, and a noise model for the prior density. We provide a diagonal Gaussian of type noiseModel::Diagonal by specifying three standard deviations in line 7, respectively 30 cm. on the robot's position, and 0.1 radians on the robot's orientation. Note that the Sigmas constructor returns a shared pointer, anticipating that typically the same noise models are used for many different factors.
Similarly, odometry measurements are specified as Pose2 on line 11, with a slightly different noise model defined on line 12-13. We then add the two factors ${f}_{1}\left({x}_{1},{x}_{2};{o}_{1}\right)$ and ${f}_{2}\left({x}_{2},{x}_{3};{o}_{2}\right)$ on lines 14-15, as instances of yet another templated class, BetweenFactor<T>, again with T=Pose2.
When running the example (make OdometryExample.run on the command prompt), it will print out the factor graph as follows:
Factor Graph:
size: 3
Factor 0: PriorFactor on 1
prior mean: (0, 0, 0)
noise model: diagonal sigmas [0.3; 0.3; 0.1];
Factor 1: BetweenFactor(1,2)
measured: (2, 0, 0)
noise model: diagonal sigmas [0.2; 0.2; 0.1];
Factor 2: BetweenFactor(2,3)
measured: (2, 0, 0)
noise model: diagonal sigmas [0.2; 0.2; 0.1];


### 2.3 Factor Graphs versus Values

At this point it is instructive to emphasize two important design ideas underlying GTSAM:
1. The factor graph and its embodiment in code specify the joint probability distribution $P\left(X|Z\right)$ over the entire trajectory $X=\left\{{x}_{1},{x}_{2},{x}_{3}\right\}$ of the robot, rather than just the last pose. This smoothing view of the world gives GTSAM its name: “smoothing and mapping”. Later in this document we will talk about how we can also use GTSAM to do filtering (which you often do not want to do) or incremental inference (which we do all the time).
2. A factor graph in GTSAM is just the specification of the probability density $P\left(X|Z\right)$, and the corresponding FactorGraph class and its derived classes do not ever contain a “solution”. Rather, there is a separate type Values that is used to specify specific values for (in this case) ${x}_{1}$, ${x}_{2}$, and ${x}_{3}$, which can then be used to evaluate the probability (or, more commonly, the error) associated with particular values.
The latter point is often a point of confusion with beginning users of GTSAM. It helps to remember that when designing GTSAM we took a functional approach of classes corresponding to mathematical objects, which are usually immutable. You should think of a factor graph as a function to be applied to values -as the notation $f\left(X\right)\propto P\left(X|Z\right)$ implies- rather than as an object to be modified.

### 2.4 Non-linear Optimization in GTSAM

The listing below creates a Values instance, and uses it as the initial estimate to find the maximum a-posteriori (MAP) assignment for the trajectory $X$:
// create (deliberately inaccurate) initial estimate
Values initial;
initial.insert(1, Pose2(0.5, 0.0, 0.2));
initial.insert(2, Pose2(2.3, 0.1, -0.2));
initial.insert(3, Pose2(4.1, 0.1, 0.1));

// optimize using Levenberg-Marquardt optimization
Values result = LevenbergMarquardtOptimizer(graph, initial).optimize();

Lines 2-5 in Listing 2.4 create the initial estimate, and on line 8 we create a non-linear Levenberg-Marquardt style optimizer, and call optimize using default parameter settings. The reason why GTSAM needs to perform non-linear optimization is because the odometry factors ${f}_{1}\left({x}_{1},{x}_{2};{o}_{1}\right)$ and ${f}_{2}\left({x}_{2},{x}_{3};{o}_{2}\right)$ are non-linear, as they involve the orientation of the robot. This also explains why the factor graph we created in Listing 2.2 is of type NonlinearFactorGraph. The optimization class linearizes this graph, possibly multiple times, to minimize the non-linear squared error specified by the factors.
The relevant output from running the example is as follows:
Initial Estimate:
Values with 3 values:
Value 1: (0.5, 0, 0.2)
Value 2: (2.3, 0.1, -0.2)
Value 3: (4.1, 0.1, 0.1)

Final Result:
Values with 3 values:
Value 1: (-1.8e-16, 8.7e-18, -9.1e-19)
Value 2: (2, 7.4e-18, -2.5e-18)
Value 3: (4, -1.8e-18, -3.1e-18)

It can be seen that, subject to very small tolerance, the ground truth solution ${x}_{1}=\left(0,0,0\right)$, ${x}_{2}=\left(2,0,0\right)$, and ${x}_{3}=\left(4,0,0\right)$ is recovered.

### 2.5 Full Posterior Inference

GTSAM can also be used to calculate the covariance matrix for each pose after incorporating the information from all measurements $Z$. Recognizing that the factor graph encodes the posterior density $P\left(X|Z\right)$, the mean $\mu$ together with the covariance $\Sigma$ for each pose $x$ approximate the marginal posterior density $P\left(x|Z\right)$. Note that this is just an approximation, as even in this simple case the odometry factors are actually non-linear in their arguments, and GTSAM only computes a Gaussian approximation to the true underlying posterior.
The following C++ code will recover the posterior marginals:
// Query the marginals
cout.precision(2);
Marginals marginals(graph, result);
cout << "x1 covariance:\n" << marginals.marginalCovariance(1) << endl;
cout << "x2 covariance:\n" << marginals.marginalCovariance(2) << endl;
cout << "x3 covariance:\n" << marginals.marginalCovariance(3) << endl;

The relevant output from running the example is as follows:
x1 covariance:
0.09     1.1e-47     5.7e-33
1.1e-47        0.09     1.9e-17
5.7e-33     1.9e-17        0.01
x2 covariance:
0.13     4.7e-18     2.4e-18
4.7e-18        0.17        0.02
2.4e-18        0.02        0.02
x3 covariance:
0.17     2.7e-17     8.4e-18
2.7e-17        0.37        0.06
8.4e-18        0.06        0.03

What we see is that the marginal covariance $P\left({x}_{1}|Z\right)$ on ${x}_{1}$ is simply the prior knowledge on ${x}_{1}$, but as the robot moves the uncertainty in all dimensions grows without bound, and the $y$ and $\theta$ components of the pose become (positively) correlated.
An important fact to note when interpreting these numbers is that covariance matrices are given in relative coordinates, not absolute coordinates. This is because internally GTSAM optimizes for a change with respect to a linearization point, as do all nonlinear optimization libraries.

## 3 Robot Localization

### 3.1 Unary Measurement Factors

In this section we add measurements to the factor graph that will help us actually localize the robot over time. The example also serves as a tutorial on creating new factor types. Figure 4: Robot localization factor graph with unary measurement factors at each time step.
In particular, we use unary measurement factors to handle external measurements. The example from Section 2 is not very useful on a real robot, because it only contains factors corresponding to odometry measurements. These are imperfect and will lead to quickly accumulating uncertainty on the last robot pose, at least in the absence of any external measurements (see Section 2.5). Figure 4 shows a new factor graph where the prior ${f}_{0}\left({x}_{1}\right)$ is omitted and instead we added three unary factors ${f}_{1}\left({x}_{1};{z}_{1}\right)$, ${f}_{2}\left({x}_{2};{z}_{2}\right)$, and ${f}_{3}\left({x}_{3};{z}_{3}\right)$, one for each localization measurement ${z}_{t}$, respectively. Such unary factors are applicable for measurements ${z}_{t}$ that depend only on the current robot pose, e.g., GPS readings, correlation of a laser range-finder in a pre-existing map, or indeed the presence of absence of ceiling lights (see Dellaert et al. (1999) for that amusing example).

### 3.2 Defining Custom Factors

In GTSAM, you can create custom unary factors by deriving a new class from the built-in class NoiseModelFactor1<T>, which implements a unary factor corresponding to a measurement likelihood with a Gaussian noise model, $L\left(q;m\right)= exp\left\{-\frac{1}{2}{||h\left(q\right)-m||}_{\Sigma }^{2}\right\}=f\left(q\right)$ where $m$ is the measurement, $q$ is the unknown variable, $h\left(q\right)$ is a (possibly nonlinear) measurement function, and $\Sigma$ is the noise covariance. Note that $m$ is considered known above, and the likelihood $L\left(q;m\right)$ will only ever be evaluated as a function of $q$, which explains why it is a unary factor $f\left(q\right)$. It is always the unknown variable $q$ that is either likely or unlikely, given the measurement.
Note: many people get this backwards, often misled by the conditional density notation $P\left(m|q\right)$. In fact, the likelihood $L\left(q;m\right)$ is defined as any function of $q$ proportional to $P\left(m|q\right)$.
Listing 3.2 shows an example on how to define the custom factor class UnaryFactor which implements a “GPS-like” measurement likelihood:
class UnaryFactor: public NoiseModelFactor1<Pose2> {
double mx_, my_; ///< X and Y measurements

public:
UnaryFactor(Key j, double x, double y, const SharedNoiseModel& model):
NoiseModelFactor1<Pose2>(model, j), mx_(x), my_(y) {}

Vector evaluateError(const Pose2& q,
boost::optional<Matrix&> H = boost::none) const
{
const Rot2& R = q.rotation();
if (H) (*H) = (gtsam::Matrix(2, 3) <<
R.c(), -R.s(), 0.0,
R.s(), R.c(), 0.0).finished();
return (Vector(2) << q.x() - mx_, q.y() - my_).finished();
}
};

In defining the derived class on line 1, we provide the template argument Pose2 to indicate the type of the variable $q$, whereas the measurement is stored as the instance variables mx_ and my_, defined on line 2. The constructor on lines 5-6 simply passes on the variable key $j$ and the noise model to the superclass, and stores the measurement values provided. The most important function to has be implemented by every factor class is evaluateError, which should return $E\left(q\right)=h\left(q\right)-m$ which is done on line 12. Importantly, because we want to use this factor for nonlinear optimization (see e.g., Dellaert and Kaess 2006 for details), whenever the optional argument $H$ is provided, a Matrix reference, the function should assign the Jacobian of $h\left(q\right)$ to it, evaluated at the provided value for $q$. This is done for this example on line 11. In this case, the Jacobian of the 2-dimensional function $h$, which just returns the position of the robot, $h\left(q\right)=\left[\begin{array}{c}{q}_{x}\\ {q}_{y}\end{array}\right]$ with respect the 3-dimensional pose $q=\left({q}_{x},{q}_{y},{q}_{\theta }\right)$, yields the following $2×3$ matrix:
$H=\left[\begin{array}{ccc}cos\left({q}_{\theta }\right)& -sin\left({q}_{\theta }\right)& 0\\ sin\left({q}_{\theta }\right)& cos\left({q}_{\theta }\right)& 0\end{array}\right]$
##### Important Note
Many of our users, when attempting to create a custom factor, are initially surprised at the Jacobian matrix not agreeing with their intuition. For example, above you might simply expect a $2×3$ diagonal matrix. This would be true for variables belonging to a vector space. However, in GTSAM we define the Jacobian more generally to be the matrix $H$ such that$h\left(q{e}^{\stackrel{ˆ}{\xi }}\right)\approx h\left(q\right)+H\xi$where $\xi =\left(\delta x,\delta y,\delta \theta \right)$ is an incremental update and $exp\stackrel{ˆ}{\xi }$ is the exponential map for the variable we want to update, In this case $q\in SE\left(2\right)$, where $SE\left(2\right)$ is the group of 2D rigid transforms, implemented by Pose2. The exponential map for $SE\left(2\right)$ can be approximated to first order as$exp\stackrel{ˆ}{\xi }\approx \left[\begin{array}{ccc}1& -\delta \theta & \delta x\\ \delta \theta & 1& \delta x\\ 0& 0& 1\end{array}\right]$when using the $3×3$ matrix representation for 2D poses, and hence $h\left(q{e}^{\stackrel{ˆ}{\xi }}\right)\approx h\left(\left[\begin{array}{ccc}cos\left({q}_{\theta }\right)& -sin\left({q}_{\theta }\right)& {q}_{x}\\ sin\left({q}_{\theta }\right)& cos\left({q}_{\theta }\right)& {q}_{y}\\ 0& 0& 1\end{array}\right]\left[\begin{array}{ccc}1& -\delta \theta & \delta x\\ \delta \theta & 1& \delta x\\ 0& 0& 1\end{array}\right]\right)=\left[\begin{array}{c}{q}_{x}+cos\left({q}_{\theta }\right)\delta x-sin\left({q}_{\theta }\right)\delta y\\ {q}_{y}+sin\left({q}_{\theta }\right)\delta x+cos\left({q}_{\theta }\right)\delta y\end{array}\right]$which then explains the Jacobian $H$.

### 3.3 Using Custom Factors

The following C++ code fragment illustrates how to create and add custom factors to a factor graph:
// add unary measurement factors, like GPS, on all three poses
noiseModel::Diagonal::shared_ptr unaryNoise =
noiseModel::Diagonal::Sigmas(Vector2(0.1, 0.1)); // 10cm std on x,y

In Listing 3.3, we create the noise model on line 2-3, which now specifies two standard deviations on the measurements ${m}_{x}$ and ${m}_{y}$. On lines 4-6 we create shared_ptr versions of three newly created UnaryFactor instances, and add them to graph. GTSAM uses shared pointers to refer to factors in factor graphs, and boost::make_shared is a convenience function to simultaneously construct a class and create a shared_ptr to it. We obtain the factor graph from Figure 4.