Cross-Descriptor Visual Localization and Mapping

Mihai Dusmanu¹, Ondrej Miksik², Johannes L. Schönberger², Marc Pollefeys¹˒²

¹ETH Zürich, ²Microsoft

ICCV 2021 (Oral)

Overview of the proposed method. On the left, the pair network is trained for each description algorithm pair independently. On the right, the encoder-decoder network is trained for all description algorithms at once. All descriptors are mapped to a joint embedding space. In green, we highlight the networks that need to be used in order to translate from SIFT to HardNet.
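The translation path highlighted in the figure (SIFT encoder, joint embedding, HardNet decoder) can be sketched as follows. This is only an illustration of the data flow, not the released implementation: the networks are replaced by random, untrained linear layers, and the embedding size, the L2 normalization, and all names (`encoders`, `decoders`, `translate`) are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

EMBED_DIM = 128  # assumed size of the joint embedding space
DESC_DIMS = {"sift": 128, "hardnet": 128}  # both descriptors are 128-dimensional


def make_linear(in_dim, out_dim):
    """Return a random, untrained linear layer (stand-in for a trained network)."""
    weight = rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)
    return lambda x: x @ weight


def l2_normalize(x):
    """Scale each row to unit length (an assumption of this sketch)."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)


# One encoder and one decoder per description algorithm.
encoders = {name: make_linear(dim, EMBED_DIM) for name, dim in DESC_DIMS.items()}
decoders = {name: make_linear(EMBED_DIM, dim) for name, dim in DESC_DIMS.items()}


def translate(descriptors, source, target):
    """Map descriptors from `source` to `target` through the joint embedding."""
    embedding = l2_normalize(encoders[source](descriptors))
    return l2_normalize(decoders[target](embedding))


# Five placeholder SIFT-sized descriptors (random values, not real features).
sift_descriptors = l2_normalize(rng.random((5, DESC_DIMS["sift"])))
hardnet_like = translate(sift_descriptors, "sift", "hardnet")
print(hardnet_like.shape)  # (5, 128)
```

Only the SIFT encoder and the HardNet decoder are touched by `translate(..., "sift", "hardnet")`, which corresponds to the two networks highlighted in green.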

Abstract

Visual localization and mapping is the key technology underlying the majority of mixed reality and robotics systems. Most state-of-the-art approaches rely on local features to establish correspondences between images. In this paper, we present three novel scenarios for localization and mapping which require the continuous update of feature representations and the ability to match across different feature types. While localization and mapping is a fundamental computer vision problem, the traditional setup supposes the same local features are used throughout the evolution of a map. Thus, whenever the underlying features are changed, the whole process is repeated from scratch. However, this is typically impossible in practice, because raw images are often not stored and re-building the maps could lead to loss of the attached digital content. To overcome the limitations of current approaches, we present the first principled solution to cross-descriptor localization and mapping. Our data-driven approach is agnostic to the feature descriptor type, has low computational requirements, and scales linearly with the number of description algorithms. Extensive experiments demonstrate the effectiveness of our approach on state-of-the-art benchmarks for a variety of handcrafted and learned features.
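To make the linear-scaling claim concrete, the short sketch below counts the networks each design would need for n description algorithms. It assumes one pair network per ordered (source, target) pair and one encoder plus one decoder per algorithm; the function names are hypothetical and the ordered-pair count is an assumption, not a figure taken from the paper.

```python
def pair_network_count(num_algorithms):
    """Networks needed if every ordered (source, target) pair gets its own network."""
    return num_algorithms * (num_algorithms - 1)


def encoder_decoder_count(num_algorithms):
    """Networks needed with one encoder and one decoder per algorithm."""
    return 2 * num_algorithms


for n in (2, 4, 8):
    print(n, pair_network_count(n), encoder_decoder_count(n))
```

Under these assumptions, the pair-network count grows quadratically while the encoder-decoder count grows linearly with the number of algorithms.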

Code

The code is available on GitHub at mihaidusmanu/cross-descriptor-vis-loc-map.

Paper

Mihai Dusmanu¹, Ondrej Miksik², Johannes L. Schönberger², Marc Pollefeys¹˒²
¹ETH Zürich, ²Microsoft
Cross-Descriptor Visual Localization and Mapping
In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision
[Latest version on arXiv] [Poster] [Video on YouTube]
BibTeX
@InProceedings{Dusmanu2021Cross,
    author = "Dusmanu, Mihai and Miksik, Ondrej and Sch{\"o}nberger, Johannes L. and Pollefeys, Marc",
    title = "{C}ross-{D}escriptor {V}isual {L}ocalization and {M}apping",
    booktitle = "Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision",
    year = "2021"
}