A list of visual (camera) re-localization research. Visual relocalization refers to the problem of estimating a camera's position and orientation from images captured by visual sensors, using an existing prior map. We organize these methods according to the type of map, focusing mainly on image databases and point cloud maps.
This document will be continuously updated; errors and omissions are inevitable, so please feel free to point them out through issues or pull requests.
- Awesome-camera-relocalization-in-prior-map
An image database map consists of a series of images with associated pose information. The images in the database may be captured by cameras with different intrinsic parameters. Their poses are usually obtained by structure-from-motion (SfM) or by external measurement devices such as GPS.
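As a rough illustration, one database entry could be stored as below; the field names, the dataclass layout, and the use of NumPy are our own assumptions, not a standard format.

```python
# Illustrative layout of one image-database entry; field names are assumptions.
from dataclasses import dataclass
from typing import Optional
import numpy as np

@dataclass
class DatabaseImage:
    image_path: str                      # path to the stored image
    K: np.ndarray                        # 3x3 intrinsics (may differ per image)
    R_wc: np.ndarray                     # 3x3 rotation, camera-to-world
    t_wc: np.ndarray                     # 3-vector translation (e.g. from SfM or GPS)
    global_descriptor: Optional[np.ndarray] = None  # e.g. a retrieval descriptor such as NetVLAD
```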
The core idea of image retrieval-based methods is to first search the image database for the reference image most similar to the query image, and then compute the pose of the query image from the relative pose between the query and reference images together with the pose label of the reference image.
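A minimal sketch of that pipeline is given below. `compute_relative_pose` is a hypothetical placeholder for any 2D-2D relative pose estimator (e.g. essential-matrix estimation from feature matches); note that the translation it returns is in general only known up to scale.

```python
# Sketch of retrieval-based localization: nearest-neighbor search over global
# descriptors, then composition of the reference pose with the relative pose.
import numpy as np

def retrieve_nearest(query_desc, db_descs):
    """Index of the database image with the highest cosine similarity."""
    sims = db_descs @ query_desc / (
        np.linalg.norm(db_descs, axis=1) * np.linalg.norm(query_desc) + 1e-12)
    return int(np.argmax(sims))

def localize(query_img, query_desc, db_imgs, db_descs, db_poses, compute_relative_pose):
    i = retrieve_nearest(query_desc, db_descs)
    R_ref, t_ref = db_poses[i]                                   # camera-to-world pose of the reference
    R_rel, t_rel = compute_relative_pose(db_imgs[i], query_img)  # query frame -> reference frame
    # Compose T_world_query = T_world_ref * T_ref_query
    R_query = R_ref @ R_rel
    t_query = R_ref @ t_rel + t_ref
    return R_query, t_query
```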
Modeling the shape of the scene: A holistic representation of the spatial envelope
IM2GPS: estimating geographic information from a single image
Aggregating local descriptors into a compact image representation |code
Real-time loop detection with bags of binary words
Automatic alignment of paintings and photographs depicting a 3d scene
Fast relocalisation and loop closing in keyframe-based slam
24/7 place recognition by view synthesis
NetVLAD: cnn architecture for weakly supervised place recognition |code
CNN image retrieval learns from bow: unsupervised fine-tuning with hard examples |code
Filtering 3d keypoints using gist for accurate image-based localization
Orb-slam2: An open-source slam system for monocular, stereo, and rgb-d cameras |code
RelocNet: continuous metric learning relocalisation using neural nets
NeXtVLAD: an efficient neural network to aggregate frame-level features for large-scale video classification |code
TransGeo: transformer is all you need for cross-view image geo-localization |code
Deep visual geo-localization benchmark |code
An efficient and scalable collection of fly-inspired voting units for visual place recognition in changing environments
Image based localization in urban environments
Image geo-localization based on multiple nearest neighbor feature matching using generalized graphs
Are large-scale 3d models really necessary for accurate visual localization
Camera relocalization by computing pairwise relative poses using convolutional neural network
S2dnet: Learning accurate correspondences for sparse-to-dense feature matching |code
SuperGlue: learning feature matching with graph neural networks |code
To learn or not to learn: visual localization from essential matrices
LoFTR: detector-free local feature matching with transformers |code
Clustergnn: Cluster-based coarse-to-fine graph neural network for efficient feature matching
3DG-stfm: 3d geometric guided student-teacher feature matching |code
Night-to-day image translation for retrieval-based localization |code
Adversarial feature disentanglement for place recognition across changing appearance
A deep learning based image enhancement approach for autonomous driving at night
Pose regression methods extract high-dimensional features from the query image with an end-to-end deep neural network and then directly regress the camera pose from those features. We classify them according to their inputs.
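As a reference point, a PoseNet-style regressor boils down to a CNN backbone with two heads, one for translation and one for orientation. The sketch below is our own simplification: the ResNet-18 backbone, layer sizes, and loss weight `beta` are illustrative assumptions, not the original architecture.

```python
# Minimal PoseNet-style absolute pose regressor (sketch, not the original model).
import torch
import torch.nn as nn
import torchvision.models as models

class AbsolutePoseRegressor(nn.Module):
    def __init__(self):
        super().__init__()
        backbone = models.resnet18(weights=None)
        self.features = nn.Sequential(*list(backbone.children())[:-1])  # globally pooled features
        self.fc_xyz = nn.Linear(512, 3)    # translation head
        self.fc_quat = nn.Linear(512, 4)   # orientation head (quaternion)

    def forward(self, img):
        f = self.features(img).flatten(1)
        xyz = self.fc_xyz(f)
        quat = torch.nn.functional.normalize(self.fc_quat(f), dim=1)  # keep unit norm
        return xyz, quat

# Typical training loss in this line of work: weighted sum of translation and
# rotation errors; the weight beta is usually tuned per scene.
def pose_loss(xyz_pred, quat_pred, xyz_gt, quat_gt, beta=500.0):
    return torch.norm(xyz_pred - xyz_gt, dim=1).mean() + \
           beta * torch.norm(quat_pred - quat_gt, dim=1).mean()
```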
PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization |code
Modelling uncertainty in deep learning for camera relocalization |code
Geometric Loss Functions for Camera Pose Regression With Deep Learning
Image-based localization using hourglass networks
Image-based localization using lstms for structured feature correlation
Delving deeper into convolutional neural networks for camera relocalization
Deep regression for monocular camera-based 6-dof global localization in outdoor environments
Prior guided dropout for robust visual localization in dynamic environments |code
Atloc: Attention guided camera localization |code
Extending absolute pose regression to multiple scenes |code
Learning multi-scene absolute pose regression with transformers |code
DiffPoseNet: direct differentiable camera pose estimation
Dfnet: Enhance absolute pose regression with direct feature matching |code
Sc-wls: Towards interpretable feed-forward camera re-localization |code
Relative camera pose estimation using convolutional neural networks |code
VidLoc: a deep spatio-temporal model for 6-dof video-clip relocalization
Geometry-aware learning of maps for camera localization |code
Deep auxiliary learning for visual localization and odometry |code
VLocNet++: deep multitask learning for semantic visual localization and odometry
Local supports global: deep camera relocalization with sequence enhancement
GTCaR: Graph Transformer for Camera Re-localization
Indoor relocalization in challenging environments with dual-stream convolutional neural networks
Point cloud maps consist of a set of unordered points with three-dimensional coordinates. Since the query image and the point cloud map are of different modalities, they cannot be compared directly; the key issue is how to establish data association. Existing methods are divided into three categories: (1) feature-based methods (visual point cloud), which match via image features; (2) 3D
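For the feature-based route, once 2D keypoints in the query image have been matched to 3D map points (for example via descriptors attached to an SfM point cloud), the pose is typically recovered with a PnP solver inside RANSAC, as in the references below. Here is a minimal OpenCV sketch: the matching step is assumed to have already produced `pts2d`/`pts3d`, and the RANSAC parameters are illustrative.

```python
# Sketch: camera pose from given 2D-3D matches via PnP + RANSAC.
import cv2
import numpy as np

def pose_from_2d3d_matches(pts2d, pts3d, K):
    """pts2d: (N,2) pixel coords, pts3d: (N,3) map coords, K: 3x3 intrinsics."""
    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        pts3d.astype(np.float64), pts2d.astype(np.float64), K, None,
        reprojectionError=4.0, iterationsCount=1000, flags=cv2.SOLVEPNP_EPNP)
    if not ok:
        return None
    R, _ = cv2.Rodrigues(rvec)   # world-to-camera rotation
    return R, tvec, inliers
```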
Monocular vision for mobile robot localization and autonomous navigation
From structure-from-motion point clouds to fast location recognition
Location recognition using prioritized feature matching
Fast image-based localization using direct 2d-to-3d matching
Worldwide pose estimation using 3d point clouds
Efficient global 2d-3d matching for camera localization in a large-scale 3d map
Efficient & effective prioritized matching for large-scale image-based localization
Keypoint recognition using randomized trees
Discriminative feature-to-point matching in image-based localization
Hyperpoints and fine vocabularies for large-scale location recognition
Efficient monocular pose estimation for complex 3d models
Fast localization in large-scale environments using supervised indexing of binary features
Filtering 3d keypoints using gist for accurate image-based localization
Large-scale location recognition and the geometric burstiness problem |code
From coarse to fine: robust hierarchical localization at large scale |code
2d3d-matchnet: Learning to match keypoints across 2d image and 3d point cloud
Lcd: Learned cross-domain descriptors for 2d-3d matching |code
DA4AD: end-to-end deep attention-based visual localization for autonomous driving
Monocular camera localization in prior lidar maps with 2d-3d line correspondences |code
Back to the feature: learning robust camera localization from pixels to pose |code
Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography
Bundle adjustment — a modern synthesis
MLESAC: a new robust estimator with application to estimating image geometry
Complete solution classification for the perspective-three-point problem
Matching with prosac — progressive sample consensus
A general solution to the p4p problem for camera with unknown focal length
EPnP: an accurate o(n) solution to the pnp problem
USAC: a universal framework for random sample consensus
Direct linear transformation from comparator coordinates into object space coordinates in close-range photogrammetry
Automatic registration of lidar and optical images of urban scenes
LAPS - localisation using appearance of prior structure: 6-dof monocular camera localisation using prior pointclouds
Visual localization within lidar maps for automated urban driving
Robust direct visual localisation using normalised information distance
Sampling-based methods for visual navigation in 3d maps by synthesizing depth images
Convolutional neural network-based coarse initial position estimation of a monocular camera in large-scale 3D light detection and ranging maps
CMRNet: camera to lidar-map registration |code
Global visual localization in lidar-maps through shared 2d-3d embedding space
CPO: change robust panorama to point cloud localization
Scene coordinate regression forests for camera relocalization in rgb-d images
Multi-output learning for camera relocalization
Exploiting uncertainty in regression forests for accurate camera relocalization
Deep image retrieval: Learning global representations for image search
On-the-fly adaptation of regression forests for online camera relocalisation
DSAC — differentiable ransac for camera localization |code
Random forests versus neural networks — what’s best for camera localization
Learning less is more - 6d camera localization via 3d surface regression |code
Scene coordinate regression with angle-based reprojection loss for camera relocalization
Exploiting points and lines in regression forests for rgb-d camera relocalization
Expert sample consensus applied to camera re-localization |code
SANet: scene agnostic network for camera localization
Camera relocalization by exploiting multi-view constraints for scene coordinates regression
Hierarchical scene coordinate classification and regression for visual localization
KFNet: learning temporal camera relocalization using kalman filtering |code
Learning camera localization via dense scene matching |code
Visual camera re-localization from RGB and RGB-D images using DSAC |code
Learning to detect scene landmarks for camera localization
A deep feature aggregation network for accurate indoor camera localization
Monocular camera localization in 3d lidar maps
Stereo camera localization in 3d lidar maps
Scale-aware camera localization in 3d lidar maps with a monocular visual odometry
3D map-guided single indoor image localization refinement
DeepI2P: image-to-point cloud registration via deep classification |code
Autonomous vehicle localization with prior visual point cloud map constraints in gnss-challenged environments
Mobile robot localization considering uncertainty of depth regression from camera images
FARLAP: fast robust localisation using appearance priors
Monocular localization in feature-annotated 3d polygon maps
Monocular direct sparse localization in a prior 3d surfel map
3D surfel map-aided visual relocalization with learned descriptors
Metric monocular localization using signed distance fields
Freetures: localization in signed distance function map
Light-weight localization for vehicles using road markings
LaneLoc: lane marking based localization using highly accurate maps
Monocular visual localization using road structural features
Pole-based localization for autonomous vehicles in urban scenarios
Improving vehicle localization using semantic and pole-like landmarks
Monocular localization in urban environments using road markings
Monocular localization in hd maps by combining semantic segmentation and distance transform
In-lane localization and ego-lane identification method based on highway lane endpoints
Monocular localization with vector hd map (mlvhm): a low-cost method for commercial ivs
Coarse-to-fine semantic localization with hd map for autonomous driving in structural scenes
BSP-monoloc: basic semantic primitives based monocular localization on roads
LTSR: long-term semantic relocalization based on hd map for autonomous vehicles
Lost shopping! monocular localization in large indoor spaces
Do you see the bakery? leveraging geo-referenced texts for global localization in public maps
DeLS-3d: deep localization and segmentation with a 3d semantic map |code
Semantic pose verification for outdoor visual localization with self-supervised contrastive learning
Long-term image-based vehicle localization improved with learnt semantic descriptors
Semantic match consistency for long-term visual localization
Long-term visual localization using semantically segmented images
Fine-grained segmentation networks: self-supervised segmentation for improved long-term visual localization |code
SemLoc: accurate and robust visual localization with semantic and structural constraints from prior maps
Semantically guided location recognition for outdoors scenes
Semantics-aware visual localization under challenging perceptual conditions
VLASE: vehicle localization by aggregating semantic edges |code
Semantically-aware attentive neural embeddings for 2d long-term visual localization
PixSelect: less but reliable pixels for accurate and efficient localization
NeRF-Loc: Visual Localization with Conditional Neural Radiance Field
Accurate image localization based on google maps street view