Master-cai/Awesome-camera-relocalization-in-prior-map

Awesome-camera-relocalization-in-prior-map

A curated list of research on visual (camera) relocalization. Visual relocalization is the problem of estimating a camera's position and orientation within an existing prior map from images captured by visual sensors. The methods are organized by map type, with a focus on image databases and point cloud maps.

This list is updated continuously and will inevitably contain errors and omissions. Please point them out through issues or pull requests.


Surveys

Visual relocalization in Image Database Maps

An image database map consists of a set of images, each annotated with additional information such as its pose. The images may be captured by cameras with different intrinsic parameters. Their poses are usually obtained with an SfM algorithm or from external measurement devices such as GPS.

Image Retrieval

Image retrieval-based methods first search the image database for the reference image most similar to the query image. The pose of the query image is then computed from the relative pose between the query and reference images, together with the pose label of the reference image.
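The two steps can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the global descriptors are toy 3-vectors, the helper names (`make_pose`, `retrieve_nearest`, `localize`) are hypothetical, poses are 4x4 camera-to-world transforms, and the relative pose `T_ref_query` is assumed to be given (in practice it would come from feature matching or a learned estimator).

```python
import numpy as np

def make_pose(yaw, translation):
    """Build a 4x4 camera-to-world transform from a yaw angle (radians) and a translation."""
    c, s = np.cos(yaw), np.sin(yaw)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

def retrieve_nearest(query_desc, db_descs):
    """Return the index of the database descriptor with the highest cosine similarity."""
    q = query_desc / np.linalg.norm(query_desc)
    d = db_descs / np.linalg.norm(db_descs, axis=1, keepdims=True)
    return int(np.argmax(d @ q))

def localize(query_desc, db_descs, db_poses, T_ref_query):
    """Retrieve the best reference image, then chain its pose with the relative pose."""
    idx = retrieve_nearest(query_desc, db_descs)
    return idx, db_poses[idx] @ T_ref_query

# Toy database: three reference images with descriptors and known poses (all hypothetical).
db_descs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
db_poses = [
    make_pose(0.0, [0.0, 0.0, 0.0]),
    make_pose(np.pi / 2, [1.0, 2.0, 0.0]),
    make_pose(np.pi, [5.0, 5.0, 0.0]),
]

query_desc = np.array([0.1, 0.9, 0.2])
T_ref_query = make_pose(0.0, [0.5, 0.0, 0.0])  # assumed relative pose (query in reference frame)

idx, T_world_query = localize(query_desc, db_descs, db_poses, T_ref_query)
print(idx, T_world_query[:3, 3])
```

The accuracy of the final pose depends on both steps: a wrong retrieval gives a wrong reference pose, and an inaccurate relative pose shifts the result even when retrieval is correct.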

Image retrieval methods

Modeling the shape of the scene: A holistic representation of the spatial envelope

IM2GPS: estimating geographic information from a single image

Aggregating local descriptors into a compact image representation |code

Real-time loop detection with bags of binary words

Automatic alignment of paintings and photographs depicting a 3d scene

Fast relocalisation and loop closing in keyframe-based slam

24/7 place recognition by view synthesis

NetVLAD: cnn architecture for weakly supervised place recognition |code

CNN image retrieval learns from bow: unsupervised fine-tuning with hard examples |code

Filtering 3d keypoints using gist for accurate image-based localization

Orb-slam2: An open-source slam system for monocular, stereo, and rgb-d cameras |code

RelocNet: continuous metric learning relocalisation using neural nets

NeXtVLAD: an efficient neural network to aggregate frame-level features for large-scale video classification |code

TransGeo: transformer is all you need for cross-view image geo-localization |code

Deep visual geo-localization benchmark |code

An efficient and scalable collection of fly-inspired voting units for visual place recognition in changing environments

Feature matching and pose estimation

Image based localization in urban environments

Image geo-localization based on multiple nearest neighbor feature matching using generalized graphs

Are large-scale 3d models really necessary for accurate visual localization

Camera relocalization by computing pairwise relative poses using convolutional neural network

S2dnet: Learning accurate correspondences for sparse-to-dense feature matching |code

SuperGlue: learning feature matching with graph neural networks |code

To learn or not to learn: visual localization from essential matrices

LoFTR: detector-free local feature matching with transformers |code

Clustergnn: Cluster-based coarse-to-fine graph neural network for efficient feature matching

3DG-stfm: 3d geometric guided student-teacher feature matching |code

Image appearance normalization

Night-to-day image translation for retrieval-based localization |code

Adversarial feature disentanglement for place recognition across changing appearance

A deep learning based image enhancement approach for autonomous driving at night

Pose Regression

Pose regression methods extract high-dimensional features from the query image with an end-to-end deep neural network and then regress the camera pose directly from those features. We group them by input type.
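As an illustration of what such a network is trained to minimize, here is a minimal sketch of a weighted position-plus-orientation loss, a form commonly used for absolute pose regression. The function name, the quaternion parameterization, and the weight `beta` are assumptions made for this example; individual papers in the list use different losses.

```python
import numpy as np

def pose_regression_loss(t_pred, q_pred, t_true, q_true, beta=1.0):
    """Position error plus beta times quaternion error.

    t_*: 3-vectors (camera position); q_*: 4-vectors (orientation quaternion).
    beta is a hypothetical weight balancing the two terms and must be tuned per scene.
    """
    q_pred = q_pred / np.linalg.norm(q_pred)   # the network output is not unit length in general
    position_error = np.linalg.norm(t_pred - t_true)
    orientation_error = np.linalg.norm(q_pred - q_true)
    return position_error + beta * orientation_error

t_true = np.array([1.0, 2.0, 3.0])
q_true = np.array([1.0, 0.0, 0.0, 0.0])

# A prediction that is 5 units off in position but has the correct orientation.
loss = pose_regression_loss(t_true + np.array([3.0, 4.0, 0.0]),
                            np.array([2.0, 0.0, 0.0, 0.0]),
                            t_true, q_true, beta=2.0)
print(loss)
```

Note that this simple form does not handle the sign ambiguity of quaternions (`q` and `-q` describe the same rotation), which a practical implementation would need to address.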

Monocular camera

PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization |code

Modelling uncertainty in deep learning for camera relocalization |code

Geometric Loss Functions for Camera Pose Regression With Deep Learning

Image-based localization using hourglass networks

Image-based localization using lstms for structured feature correlation

Delving deeper into convolutional neural networks for camera relocalization

Deep regression for monocular camera-based 6-dof global localization in outdoor environments

Prior guided dropout for robust visual localization in dynamic environments |code

Atloc: Attention guided camera localization |code

Extending absolute pose regression to multiple scenes |code

Learning multi-scene absolute pose regression with transformers |code

DiffPoseNet: direct differentiable camera pose estimation

Dfnet: Enhance absolute pose regression with direct feature matching |code

Sc-wls: Towards interpretable feed-forward camera re-localization |code

Image sequences

Relative camera pose estimation using convolutional neural networks |code

VidLoc: a deep spatio-temporal model for 6-dof video-clip relocalization

Geometry-aware learning of maps for camera localization |code

Deep auxiliary learning for visual localization and odometry |code

VLocNet++: deep multitask learning for semantic visual localization and odometry

Local supports global: deep camera relocalization with sequence enhancement

GTCaR: Graph Transformer for Camera Re-localization

RGB-D image

Indoor relocalization in challenging environments with dual-stream convolutional neural networks

Visual relocalization in Point Cloud Maps

A point cloud map is an unordered set of points with 3D coordinates. Because the query image and the point cloud map are of different modalities, they cannot be compared directly, so the key issue is how to establish data association between them. Existing methods fall into three categories: (1) feature-based methods (visual point cloud), which match image features; (2) 3D $\rightarrow$ 2D methods, which project the 3D point cloud into an image and then match; and (3) 2D $\rightarrow$ 3D methods, which convert the 2D image into 3D scene data.
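To make the 3D $\rightarrow$ 2D idea concrete, the sketch below projects map points into an image with a pinhole camera model, which is the basic operation behind rendering a point cloud from a candidate pose before matching. The function name `project_points`, the intrinsics, and the toy points are assumptions for illustration; lens distortion and occlusion handling are left out.

```python
import numpy as np

def project_points(points_world, R_cw, t_cw, K):
    """Project Nx3 world points into the image.

    R_cw, t_cw map world coordinates into the camera frame; K is the 3x3 intrinsic matrix.
    Returns (pixels, depths) for the points that lie in front of the camera.
    """
    points_cam = points_world @ R_cw.T + t_cw
    in_front = points_cam[:, 2] > 0            # drop points behind the camera
    points_cam = points_cam[in_front]
    uvw = points_cam @ K.T
    pixels = uvw[:, :2] / uvw[:, 2:3]          # perspective division
    return pixels, points_cam[:, 2]

# Hypothetical intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

points_world = np.array([[0.0, 0.0, 2.0],
                         [1.0, 0.0, 2.0],
                         [0.0, 0.0, -1.0]])    # the last point is behind the camera

pixels, depths = project_points(points_world, np.eye(3), np.zeros(3), K)
print(pixels)
print(depths)
```

The returned depths can be rasterized into a synthetic depth image for a candidate pose, which can then be compared against the query image or its estimated depth.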

Feature-based method (Visual Point Cloud)

F2P (Feature to Point) and P2F (Point to Feature)

Monocular vision for mobile robot localization and autonomous navigation

From structure-from-motion point clouds to fast location recognition

Location recognition using prioritized feature matching

Fast image-based localization using direct 2d-to-3d matching

Worldwide pose estimation using 3d point clouds

Efficient global 2d-3d matching for camera localization in a large-scale 3d map

Efficient & effective prioritized matching for large-scale image-based localization

Improved matching method

Keypoint recognition using randomized trees

From structure-from-motion point clouds to fast location recognition

Worldwide pose estimation using 3d point clouds

Discriminative feature-to-point matching in image-based localization

Hyperpoints and fine vocabularies for large-scale location recognition

Efficient monocular pose estimation for complex 3d models

Fast localization in large-scale environments using supervised indexing of binary features

Filtering 3d keypoints using gist for accurate image-based localization

Large-scale location recognition and the geometric burstiness problem |code

From coarse to fine: robust hierarchical localization at large scale |code

2d3d-matchnet: Learning to match keypoints across 2d image and 3d point cloud

Lcd: Learned cross-domain descriptors for 2d-3d matching |code

DA4AD: end-to-end deep attention-based visual localization for autonomous driving

Monocular camera localization in prior lidar maps with 2d-3d line correspondences |code

Back to the feature: learning robust camera localization from pixels to pose |code

2D-3D pose estimation

Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography

Bundle adjustment — a modern synthesis

MLESAC: a new robust estimator with application to estimating image geometry

Complete solution classification for the perspective-three-point problem

Matching with prosac — progressive sample consensus

A general solution to the p4p problem for camera with unknown focal length

EPnP: an accurate o(n) solution to the pnp problem

USAC: a universal framework for random sample consensus

Direct linear transformation from comparator coordinates into object space coordinates in close-range photogrammetry

3D $\rightarrow$ 2D: projection methods

Automatic registration of lidar and optical images of urban scenes

LAPS - localisation using appearance of prior structure: 6-dof monocular camera localisation using prior pointclouds

Visual localization within lidar maps for automated urban driving

Robust direct visual localisation using normalised information distance

Sampling-based methods for visual navigation in 3d maps by synthesizing depth images

Convolutional neural network-based coarse initial position estimation of a monocular camera in large-scale 3D light detection and ranging maps

CMRNet: camera to lidar-map registration |code

Global visual localization in lidar-maps through shared 2d-3d embedding space

CPO: change robust panorama to point cloud localization

2D $\rightarrow$ 3D: Scene Dimensional Upgrading

Scene Coordinate Regression

Scene coordinate regression forests for camera relocalization in rgb-d images

Multi-output learning for camera relocalization

Exploiting uncertainty in regression forests for accurate camera relocalization

Deep image retrieval: Learning global representations for image search

On-the-fly adaptation of regression forests for online camera relocalisation

DSAC — differentiable ransac for camera localization |code

Random forests versus neural networks — what’s best for camera localization

Learning less is more - 6d camera localization via 3d surface regression |code

Scene coordinate regression with angle-based reprojection loss for camera relocalization

Exploiting points and lines in regression forests for rgb-d camera relocalization

Expert sample consensus applied to camera re-localization |code

SANet: scene agnostic network for camera localization

Camera relocalization by exploiting multi-view constraints for scene coordinates regression

Hierarchical scene coordinate classification and regression for visual localization

KFNet: learning temporal camera relocalization using kalman filtering |code

Learning camera localization via dense scene matching |code

Visual camera re-localization from RGB and RGB-D images using DSAC |code

Learning to detect scene landmarks for camera localization

A deep feature aggregation network for accurate indoor camera localization

Point cloud reconstruction

Monocular camera localization in 3d lidar maps

Stereo camera localization in 3d lidar maps

Scale-aware camera localization in 3d lidar maps with a monocular visual odometry

3D map-guided single indoor image localization refinement

DeepI2P: image-to-point cloud registration via deep classification |code

Autonomous vehicle localization with prior visual point cloud map constraints in gnss-challenged environments

Mobile robot localization considering uncertainty of depth regression from camera images

Visual relocalization in Dense Boundary Representation Maps

Mesh map

FARLAP: fast robust localisation using appearance priors

Monocular localization in feature-annotated 3d polygon maps

Surfel Map

Monocular direct sparse localization in a prior 3d surfel map

3D surfel map-aided visual relocalization with learned descriptors

SDF Map

Metric monocular localization using signed distance fields

Freetures: localization in signed distance function map

Visual relocalization in High Definition (HD) Maps

Light-weight localization for vehicles using road markings

LaneLoc: lane marking based localization using highly accurate maps

Monocular visual localization using road structural features

Pole-based localization for autonomous vehicles in urban scenarios

Improving vehicle localization using semantic and pole-like landmarks

Monocular localization in urban environments using road markings

Monocular localization in hd maps by combining semantic segmentation and distance transform

In-lane localization and ego-lane identification method based on highway lane endpoints

Monocular localization with vector hd map (mlvhm): a low-cost method for commercial ivs

Coarse-to-fine semantic localization with hd map for autonomous driving in structural scenes

BSP-monoloc: basic semantic primitives based monocular localization on roads

LTSR: long-term semantic relocalization based on hd map for autonomous vehicles

Visual relocalization in Semantic Maps

Semantic Global Features

Lost shopping! monocular localization in large indoor spaces

Do you see the bakery? leveraging geo-referenced texts for global localization in public maps

Semantic visual localization

DeLS-3d: deep localization and segmentation with a 3d semantic map |code

Semantic pose verification for outdoor visual localization with self-supervised contrastive learning

Long-term image-based vehicle localization improved with learnt semantic descriptors

Semantic Feature matching

Semantic match consistency for long-term visual localization

Long-term visual localization using semantically segmented images

Fine-grained segmentation networks: self-supervised segmentation for improved long-term visual localization |code

SemLoc: accurate and robust visual localization with semantic and structural constraints from prior maps

Robust Feature Selection

Semantically guided location recognition for outdoors scenes

Semantics-aware visual localization under challenging perceptual conditions

VLASE: vehicle localization by aggregating semantic edges |code

Semantically-aware attentive neural embeddings for 2d long-term visual localization

PixSelect: less but reliable pixels for accurate and efficient localization

Visual relocalization in NeRF map

NeRF-Loc: Visual Localization with Conditional Neural Radiance Field

Dataset

Outdoor

Accurate image localization based on google maps street view

San francisco |page

Dubrovnik 6K |page

KITTI |page

Cambridge |page

Cityscape |page

NCLT |page

Oxford-robotcar |page

Aachen Day-Night |page

CMU Seasons |page

Apollo Scape |page

KAIST Day/Night |page

SemanticKITTI |page

nuScenes |page

SF-XL |page

Indoor

TUM-RGBD |page

7 scenes |page

SceneNN |page

EuRoC MAV |page

3DMatch |page

ADVIO |page

Misc

SceneNet RGB-D |page

LaMAR |page
