The Manifold Kalman Filter Hierarchy, Part 2: Legged State Estimation

GTSAM Posts

Authors: Frank Dellaert, Varun Agrawal

Part I introduced the new manifold and Lie-group Kalman filter hierarchy in GTSAM. Here we highlight one of the most impactful applications of the invariant filter in robotics: legged state estimation with an IMU and contact kinematics, in the spirit of Ross Hartley, who also wrote an earlier GTSAM post on legged robot factors, and the later paper “Contact-aided invariant extended Kalman filtering for robot state estimation”.

After introducing the filter itself, we will show how factor graphs can be leveraged for increasingly more sophisticated legged estimators: first a local graph update, then a fixed-lag smoother with one shared bias, and finally a fixed-lag smoother with a bias trajectory.

Legged robot staircase estimation using IMU and contact measurements
A staircase sequence from the legged-estimation tutorial. The data source is the Co-RaL dataset, and the animation is generated from the corresponding GTSAM notebook/example.


1. A simplified contact-aided invariant invariant EKF

A cool idea, tracing back to Michael Bloesch et al., is to treat legged pose estimation as a “foot-SLAM” problem: we should not just “use contacts”, but augment the state with contact “landmarks”. Our starting point in this post is Ross Hartley et al.’s later paper, “Contact-aided invariant extended Kalman filtering for robot state estimation”, which does so in a way that preserves invariant dynamics by extending $SE_2(3)$. In GTSAM, the LeggedInvariantEKF follows that same spirit with an ExtendedPose3 state containing rotation, position, velocity, and footholds.

Our implementation is deliberately simpler. Rather than taking joint angles as measurements, it treats feet as body-frame contact measurements in 3D. This is in contrast to full kinematic chain modeling of the legs which, as we show in our Humanoids 2022 publication, leads to better forward kinematics estimates but requires more work to model. In the provided example we replay the small staircase example, pushing “contact packets” into the filter on touchdown, or after 100 ms of dead reckoning, whichever comes first. That logic lives in buildContactReplayPlan.

This gives a very practical invariant filter: light-weight, recursive, and geometry-aware, without forcing an update at a high frequency. Typically we are being pushed by the IMU, with occasional aiding by the foothold “landmark” sightings. The notebook LeggedEstimator.ipynb shows this as LeggedInvariantEKF, and the corresponding C++ replay variant is invariant_ekf in LeggedEstimatorReplayExample.cpp.

2. Replacing the update step with a factor graph solve

Because GTSAM is a factor-graph library, the next natural step is to replace the pure EKF contact correction with a small nonlinear solve over the current base state and all feet in contact.

That is exactly what LeggedInvariantIEKF does. The prediction side still uses the same invariant state representation as variant 1, but the measurement phase becomes an iterated local graph update. In other words, the estimator still behaves like a filter overall, yet each contact event is solved more like a tiny factor graph than a single linearized EKF correction.

Factor-graph fragment for the invariant IEKF contact update
The local graph fragment used in the invariant graph-update variant: a predicted base state on the left, active footholds on the right, and one contact factor per foot currently in stance.


This is a very GTSAM move. Rather than abandoning filtering altogether, variant 2 uses just enough factor-graph machinery to make the contact update richer and more nonlinear. The notebook LeggedEstimator.ipynb calls this LeggedInvariantIEKF, and the replay source exposes it as invariant_graph in LeggedEstimatorReplayExample.cpp.

3. From filtering to smoothing

GTSAM is “Georgia Tech Smoothing and Mapping”. Once we see the contact update as a small inverse kinematics optimization problem, it is natural to then keep a short history and smooth instead of throwing information away immediately.

That is what variant 3 does. LeggedFixedLagSmoother builds a fixed-lag window over contact events, creates foothold variables per contact episode, and links consecutive base states with preintegrated “IMU factors”. In the current implementation those motion links are ImuFactor2 factors from ImuFactor.h. We now no longer exploit invariance, but we now use several steps of information at once.

Fixed-lag smoother with one shared bias variable
Variant 3 keeps a short window over contact events, contact-episode footholds, and one shared IMU bias variable across the window.


This is where the factor-graph viewpoint really starts to pay off. The filter variants summarize the past into one Gaussian belief at the current time. The smoother instead keeps a short recent history alive, which lets it combine delayed contact information with the accumulated IMU evidence between events. In the notebook LeggedEstimator.ipynb, this appears as LeggedFixedLagSmoother, and in the replay driver it is fixed_lag_single_bias in LeggedEstimatorReplayExample.cpp.

4. Estimating bias

The last step is to stop assuming a single bias can explain the entire lag window. Variant 4 keeps the same contact-episode smoother structure, but upgrades the inertial side from one shared bias estimate to a bias trajectory over the window. In GTSAM that means moving from LeggedFixedLagSmoother to LeggedCombinedFixedLagSmoother, and from ImuFactor2-style links to CombinedImuFactor.

Fixed-lag smoother with evolving bias states
Variant 4 keeps the same sliding-window idea, but replaces the shared bias with a bias trajectory and `CombinedImuFactor` links between consecutive events.


Conceptually, the change is simple but important. Variant 3 says: “within this lag, one bias estimate is good enough.” Variant 4 says: “let the bias evolve, and estimate that evolution explicitly.” In the replay source this final estimator is the fixed_lag_combined_bias variant in LeggedEstimatorReplayExample.cpp. That makes it the most expressive of the four reference implementations, and also the closest to the direction many serious legged-estimation systems eventually need to go.

Closing

This sequence of estimators is a nice example of why the filter hierarchy in Part I was worth building in the first place. The invariant structure gives a strong filtering baseline for contact-aided navigation, the local graph fragment makes the contact update more faithful without giving up online behavior, and the fixed-lag smoothers show how naturally the same ideas extend into fuller factor-graph inference.

By providing these reference implementations for leg robot state estimation, we hope to lower the barrier of entry to using these or implementing your own variants on top of them.

If you want to explore more, start with the notebook LeggedEstimator.ipynb, then look at the C++ replay driver LeggedEstimatorReplayExample.cpp and the declarations of LeggedInvariantEKF, LeggedInvariantIEKF, LeggedFixedLagSmoother, and LeggedCombinedFixedLagSmoother.

Disclosure: AI was used to help draft this post.