It's common to initialize from backbone models pre-trained on ImageNet classification tasks. The following backbone models are available:
- R-50.pkl (torchvision): converted copy of torchvision's ResNet-50 model. More details can be found in the conversion script.
- R-103.pkl: a ResNet-101 with its first 7x7 convolution replaced by three 3x3 convolutions. This modification has been used in most semantic segmentation papers (a.k.a. ResNet101c in our paper). We pre-train this backbone on ImageNet using the default recipe from the PyTorch examples.
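
To make the R-103 stem change concrete, here is a minimal pure-Python sketch comparing one 7x7 convolution with a stack of three 3x3 convolutions. The strides and paddings are assumptions based on the commonly used "deep stem" layout (stride 2 on the first 3x3 convolution only); they are not taken from this repository, and `conv_out_size` and `stack_summary` are hypothetical helper names.

```python
def conv_out_size(size, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def stack_summary(layers, input_size=224):
    """Return (output size, receptive field) for a stack of conv layers.

    Each layer is a (kernel, stride, padding) tuple.
    """
    size, receptive_field, jump = input_size, 1, 1
    for kernel, stride, padding in layers:
        size = conv_out_size(size, kernel, stride, padding)
        receptive_field += (kernel - 1) * jump
        jump *= stride
    return size, receptive_field


# Standard ResNet stem: a single 7x7 convolution with stride 2.
single_7x7 = [(7, 2, 3)]
# Assumed deep stem: three 3x3 convolutions, only the first with stride 2.
three_3x3 = [(3, 2, 1), (3, 1, 1), (3, 1, 1)]

print(stack_summary(single_7x7))  # (112, 7)
print(stack_summary(three_3x3))   # (112, 11)

# Swapping one convolution for three adds two layers to ResNet-101.
print(101 - len(single_7x7) + len(three_3x3))  # 103
```

Under these assumptions both stems downsample a 224-pixel input to 112, and the resulting layer count of 103 is consistent with the R-103 file name.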
Note: the following pretrained models are available in Detectron2 but are not used in our paper.
- R-50.pkl: converted copy of MSRA's original ResNet-50 model.
- R-101.pkl: converted copy of MSRA's original ResNet-101 model.
- X-101-32x8d.pkl: ResNeXt-101-32x8d model trained with Caffe2 at FB.
Our paper also uses ImageNet pretrained models that are not part of Detectron2. Please refer to tools to obtain those pretrained models.