This document contains information about running and evaluating the KeyPose baseline.
Please download the image and annotation data, camera parameters, object files and split files from the Dropbox link, then extract the tar files. Create soft links or copy the files so that the directory layout looks like the following:
The command script for launching the training of KeyPose model is
This script trains a network that predicts the 2D projected keypoints of an object.
Inside the command file, please set the value of data to /path/to/stereobj_1m. The user may also choose to use ImageNet pre-trained weights to initialize the ResNet-34 backbone of the KeyPose model.
One choice is to download the pre-trained model from here.
Please set the value of the pretrained_models flag to the path of the downloaded network weights npz file.
We provide trained KeyPose models here. These models are trained on the train+val sets and were used to report the test set numbers in our paper. Users can run these models to get familiar with the evaluation scripts.
The first step of model evaluation is to predict the object 2D projected keypoints in both left and right stereo images.
The command script is
, where "lr" means "left and right".
The keypoint results will be saved in log_lr_${split}_preds/${cls_type}, where ${split} is the dataset split and ${cls_type} is the object class name. Available values for the split flag are train, val, trainval and test.
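The naming convention above can be sketched as a small helper. This is only an illustration: the function name keypoint_pred_dir and the class name "some_object" are hypothetical and are not part of the released scripts.

```python
# Hypothetical helper illustrating the output directory naming convention.
# The four split names come from the text above; everything else is an assumption.
VALID_SPLITS = ("train", "val", "trainval", "test")


def keypoint_pred_dir(split: str, cls_type: str) -> str:
    """Return the directory where keypoint predictions are expected to be saved."""
    if split not in VALID_SPLITS:
        raise ValueError(f"unknown split {split!r}, expected one of {VALID_SPLITS}")
    return f"log_lr_{split}_preds/{cls_type}"


if __name__ == "__main__":
    # "some_object" is a placeholder class name.
    print(keypoint_pred_dir("test", "some_object"))
```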
We provide three different ways for computing 6D pose from 2D projected keypoint predictions in stereo images:
- Monocular PnP. The command script for launching monocular PnP is
. The results are stored in log_pnp_${split}/${cls_type}.json.
- Binocular classic triangulation. The command script for launching binocular classic triangulation is
. The results are stored in log_classic_triangulation_${split}/${cls_type}.json.
- Binocular object triangulation as proposed in our paper. The command script for launching binocular object triangulation is
. The results are stored in log_object_triangulation_${split}/${cls_type}. The user needs to run an additional script
to combine the json files into one log_object_triangulation_${split}/${cls_type}.json
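To give an intuition for the classic triangulation option, the sketch below recovers the 3D position of one keypoint from its left and right detections. It is not the repository's implementation. It assumes a rectified stereo pair (matching keypoints share the same image row), a single focal length in pixels, a shared principal point, and a known baseline; the function name and all parameter names are hypothetical.

```python
from typing import Tuple


def triangulate_rectified(
    u_left: float,
    u_right: float,
    v: float,
    focal: float,
    cx: float,
    cy: float,
    baseline: float,
) -> Tuple[float, float, float]:
    """Triangulate one keypoint in the left camera frame.

    Assumes a rectified stereo pair: the right camera is shifted by
    `baseline` along the +x axis of the left camera.
    """
    disparity = u_left - u_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    z = focal * baseline / disparity   # depth from disparity
    x = (u_left - cx) * z / focal      # back-project the left pixel
    y = (v - cy) * z / focal
    return x, y, z


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    print(triangulate_rectified(690.0, 590.0, 335.0, 1000.0, 640.0, 360.0, 0.2))
```

Triangulating every keypoint this way gives a set of 3D points; estimating the 6D pose then requires fitting a rigid transform between these points and the keypoints defined on the object model, which is not shown here.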
To submit the results to EvalAI and obtain results on the test set, the user needs to merge the predictions from all object classes into a single json file.
We provide
for this purpose.
Please refer to
for more details.
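As a rough illustration of the merging step, the sketch below combines per-class json files into one file. The layout of the merged file (a top-level object keyed by class name) is an assumption made for this example, and the function names are hypothetical; the provided script defines the format that EvalAI actually expects.

```python
import json
from typing import Any, Dict, Iterable, Tuple


def merge_class_predictions(items: Iterable[Tuple[str, Any]]) -> Dict[str, Any]:
    """Merge (class name, predictions) pairs into one dict keyed by class name.

    The keyed-by-class layout is an assumption, not the documented EvalAI format.
    """
    merged: Dict[str, Any] = {}
    for cls_type, preds in items:
        if cls_type in merged:
            raise ValueError(f"duplicate predictions for class {cls_type!r}")
        merged[cls_type] = preds
    return merged


def load_and_merge(result_dir: str, cls_types: Iterable[str], out_path: str) -> Dict[str, Any]:
    """Read result_dir/<cls_type>.json for each class and write one merged json file."""
    items = []
    for cls_type in cls_types:
        with open(f"{result_dir}/{cls_type}.json") as f:
            items.append((cls_type, json.load(f)))
    merged = merge_class_predictions(items)
    with open(out_path, "w") as f:
        json.dump(merged, f)
    return merged
```

For example, load_and_merge("log_pnp_test", ["some_object"], "merged.json") would read log_pnp_test/some_object.json, where "some_object" is a placeholder class name.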
The command script for launching 6D pose evaluation is located in ../
Please change the value of the input_json flag in the script to the path of the (merged) json file dumped in the previous step.