This repository contains the code for our paper SeMask: Semantically Masked Transformers for Semantic Segmentation. It is built on top of MaskFormer.
- † denotes backbones pretrained on ImageNet-22k with 384×384 resolution images.
- Pre-trained models can be downloaded following the instructions given under tools.
| Method | Backbone | Crop Size | mIoU | mIoU (ms+flip) | #params | Config | Checkpoint |
| :--- | :--- | :---: | :---: | :---: | :---: | :---: | :---: |
| SeMask-L MaskFormer | SeMask Swin-L† | 640×640 | 54.75 | 56.15 | 219M | config | checkpoint |
See installation instructions.
See Preparing Datasets for MaskFormer.
See Getting Started with MaskFormer.
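As a quick orientation, evaluation in MaskFormer-style repositories typically goes through `train_net.py` with a config file and a downloaded checkpoint. The sketch below follows that convention; the config path and checkpoint filename are hypothetical placeholders, so substitute the actual files from this repo's `configs/` directory and the checkpoint link in the table above.

```shell
# Hypothetical evaluation sketch (MaskFormer-style CLI).
# Replace the config path and checkpoint path with the real files
# from this repository before running.
python train_net.py \
  --config-file configs/path/to/semask_maskformer_config.yaml \
  --eval-only \
  MODEL.WEIGHTS /path/to/downloaded_checkpoint.pth
```

Multi-scale testing with flipping (the `mIoU (ms+flip)` column) is usually enabled through test-time augmentation options in the config rather than a separate script.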
```BibTeX
@article{jain2021semask,
  title={SeMask: Semantically Masking Transformer Backbones for Effective Semantic Segmentation},
  author={Jitesh Jain and Anukriti Singh and Nikita Orlov and Zilong Huang and Jiachen Li and Steven Walton and Humphrey Shi},
  journal={arXiv},
  year={2021}
}
```