Gaussian Splatting SLAM

CVPR 2024 (Highlight & Best Demo Award)

Monocular reconstruction of TUM fr1/desk


Use the mouse or trackpad to navigate.
trackpad
- click and drag to orbit
- click with two fingers and drag to move
- hold ctrl/cmd key and scroll/pinch to move forward/back

mouse
- click and drag to orbit
- right click and drag to move
- middle click and drag or hold ctrl/cmd key and scroll to move forward/back

Hidenobu Matsuki*¹, Riku Murai*², Paul H. J. Kelly², Andrew J. Davison¹
1. Dyson Robotics Laboratory, Imperial College London
2. Software Performance Optimisation Group, Imperial College London
* Authors contributed equally to this work
[arXiv] [video] [code]

Abstract

We present the first application of 3D Gaussian Splatting (3DGS) in monocular SLAM, the most fundamental but the hardest setup for visual SLAM. Our method, which runs live at 3 fps, utilises Gaussians as the only 3D representation, unifying the required representation for accurate, efficient tracking, mapping, and high-quality rendering. Designed for challenging monocular settings, our approach is seamlessly extendable to RGB-D SLAM when an external depth sensor is available. Several innovations are required to continuously reconstruct 3D scenes with high fidelity from a live camera. First, to move beyond the original 3DGS algorithm, which requires accurate poses from an offline Structure from Motion (SfM) system, we formulate camera tracking for 3DGS using direct optimisation against the 3D Gaussians, and show that this enables fast and robust tracking with a wide basin of convergence. Second, by utilising the explicit nature of the Gaussians, we introduce geometric verification and regularisation to handle the ambiguities occurring in incremental 3D dense reconstruction. Finally, we introduce a full SLAM system which not only achieves state-of-the-art results in novel view synthesis and trajectory estimation but also reconstructs tiny and even transparent objects.
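
To make the idea of tracking by direct optimisation against the Gaussians more concrete, here is a minimal toy sketch in Python. It is not the authors' implementation. It assumes isotropic Gaussians that are splatted additively (no alpha compositing), a pinhole camera with identity rotation, a translation-only pose, and a finite-difference Jacobian inside a damped Gauss-Newton loop. The names `make_scene`, `render`, and `track`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import numpy as np


def make_scene(seed=0, n=30):
    """Random toy map: n isotropic Gaussians in front of the camera."""
    rng = np.random.default_rng(seed)
    means = rng.uniform([-1.0, -1.0, 2.0], [1.0, 1.0, 4.0], size=(n, 3))
    colors = rng.uniform(0.5, 1.0, size=n)
    return means, colors


def render(t, means, colors, sigma=0.15, f=40.0, size=32):
    """Additively splat the Gaussians into a size x size grey image.

    The camera sits at position t with identity rotation (toy assumption).
    """
    rel = means - t                       # Gaussian centres in the camera frame
    z = rel[:, 2]
    u = f * rel[:, 0] / z + size / 2      # projected centre, x pixel
    v = f * rel[:, 1] / z + size / 2      # projected centre, y pixel
    s = f * sigma / z                     # projected standard deviation in pixels
    ys, xs = np.mgrid[0:size, 0:size]
    d2 = (xs[None] - u[:, None, None]) ** 2 + (ys[None] - v[:, None, None]) ** 2
    blobs = colors[:, None, None] * np.exp(-0.5 * d2 / s[:, None, None] ** 2)
    return blobs.sum(axis=0)


def track(target, t_init, means, colors, iters=25, eps=1e-6):
    """Estimate the camera translation by minimising the photometric error."""
    t = np.asarray(t_init, dtype=float).copy()
    lam = 1e-3                            # damping factor (Levenberg-Marquardt style)
    r = (render(t, means, colors) - target).ravel()
    for _ in range(iters):
        base = render(t, means, colors).ravel()
        # Forward-difference Jacobian of the rendered image w.r.t. translation.
        J = np.stack(
            [(render(t + eps * e, means, colors).ravel() - base) / eps
             for e in np.eye(3)],
            axis=1,
        )
        H = J.T @ J
        step = -np.linalg.solve(H + lam * np.diag(np.diag(H)), J.T @ r)
        r_new = (render(t + step, means, colors) - target).ravel()
        if r_new @ r_new < r @ r:         # accept only steps that reduce the error
            t, r, lam = t + step, r_new, lam * 0.5
        else:
            lam *= 10.0
    return t


if __name__ == "__main__":
    means, colors = make_scene()
    t_true = np.array([0.04, -0.03, 0.05])
    observed = render(t_true, means, colors)   # stands in for the live camera image
    t_est = track(observed, np.zeros(3), means, colors)
    print("true translation:     ", t_true)
    print("estimated translation:", t_est)
```

The sketch only illustrates the principle that the pose is found by comparing a rendering of the map with the observed image, with no feature matching involved. A real system would optimise a full 6-DoF pose and render with a differentiable rasteriser rather than finite differences.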

Monocular SLAM Results (x20)

Monocular TUM fr2/xyz
Monocular TUM fr3/office

Overview Video

Additional Results

We present additional qualitative results of our method. Self-captured sequences (TableTop, Chairs) use only the RGB images from an Intel RealSense D455 and are reconstructed in real time.

3D Gaussian Visualisation

We visualise the rasterised Gaussians alongside Gaussians shaded to highlight the geometry. Self-captured sequences (Teapot, Salad) use only the RGB images from an Intel RealSense D455 and are reconstructed in real time.

BibTeX

@inproceedings{Matsuki:Murai:etal:CVPR2024,
  title={{G}aussian {S}platting {SLAM}},
  author={Hidenobu Matsuki and Riku Murai and Paul H. J. Kelly and Andrew J. Davison},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2024}
}
        

Acknowledgment

Research presented in this paper has been supported by Dyson Technology Ltd.
We are very grateful to Eric Dexheimer, Kirill Mazur, Xin Kong, Marwan Taher, Ignacio Alzugaray, Gwangbin Bae, Aalok Patwardhan, and members of the Dyson Robotics Lab for their advice and insightful discussions.
This website uses splat, a WebGL Gaussian Splatting visualiser created by Kevin Kwok, and the page design is inspired by World Models by David Ha.