30 July 2024
[CVMI 2024] The article "SERNet-Former: Segmentation by Efficient-ResNet with Attention-Boosting Gates and Attention-Fusion Networks" has been accepted to the 3rd IEEE International Conference on Computer Vision and Machine Intelligence (IEEE CVMI).
16 May 2024
[CVPR 2024 Workshops] The article "SERNet-Former: Semantic Segmentation by Efficient Residual Network with Attention-Boosting Gates and Attention-Fusion Networks" has been accepted to the CVPR 2024 Workshops (Equivariant Vision: From Theory to Practice).
January 2024
SERNet-Former set a state-of-the-art result on the Cityscapes validation dataset for pixel-level segmentation: 87.35% mIoU.
January 2024
SERNet-Former set a state-of-the-art result on the CamVid dataset: 84.62% mIoU.
January 2024
SERNet-Former ranked seventh on the Cityscapes test dataset for pixel-level segmentation according to PapersWithCode.com: 84.83% mIoU.
(a) Attention-boosting Gate (AbG) and Attention-boosting Module (AbM) are fused into the encoder.
(b) Attention-fusion Network (AfN) is introduced into the decoder.
The breakdown of class accuracies on the CamVid dataset
Model | Baseline Architecture | Building | Tree | Sky | Car | Sign | Road | Pedestrian | Fence | Pole | Sidewalk | Bicycle | mIoU |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SERNet-Former | Efficient-ResNet | 93.0 | 88.8 | 95.1 | 91.9 | 73.9 | 97.7 | 76.4 | 83.4 | 57.3 | 90.3 | 83.1 | 84.62 |
The experiment outcomes on the CamVid dataset
Model | Baseline Architecture | road | sidewalk | building | wall | fence | pole | traffic light | traffic sign | vegetation | terrain | sky | person | rider | car | truck | bus | train | motorcycle | bicycle | mIoU |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
SERNet-Former | Efficient-ResNet | 98.2 | 90.2 | 94.0 | 67.6 | 68.2 | 73.6 | 78.2 | 82.1 | 94.6 | 75.9 | 96.9 | 90.0 | 77.7 | 96.9 | 86.1 | 93.9 | 91.7 | 70.0 | 82.9 | 84.83 |
The experiment outcomes on the Cityscapes dataset
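
For readers unfamiliar with the metric, the mIoU figures reported above are means of per-class intersection-over-union (IoU) scores. The snippet below is a minimal sketch of the standard definition, computed from a confusion matrix accumulated over all evaluated pixels. It is not the evaluation code used for SERNet-Former; the function names `per_class_iou` and `mean_iou` and the toy three-class matrix are hypothetical and serve only as illustration.

```python
def per_class_iou(confusion):
    """Return the IoU of each class.

    confusion[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j (assumed layout, square matrix).
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]                                  # correctly labelled pixels
        fn = sum(confusion[c]) - tp                           # class c pixels that were missed
        fp = sum(confusion[r][c] for r in range(n)) - tp      # pixels wrongly labelled as c
        denom = tp + fp + fn
        # A class absent from both ground truth and prediction has no defined IoU.
        ious.append(tp / denom if denom else float("nan"))
    return ious


def mean_iou(confusion):
    """Average the per-class IoUs, skipping classes with undefined IoU."""
    valid = [v for v in per_class_iou(confusion) if v == v]   # NaN != NaN
    return sum(valid) / len(valid) if valid else float("nan")


# Toy 3-class confusion matrix (made-up numbers, not taken from any dataset).
toy = [
    [50, 2, 3],
    [4, 40, 1],
    [6, 0, 30],
]

if __name__ == "__main__":
    for idx, iou in enumerate(per_class_iou(toy)):
        print(f"class {idx}: IoU = {100 * iou:.2f} %")
    print(f"mIoU = {100 * mean_iou(toy):.2f} %")
```

Accumulating one confusion matrix over the whole dataset before taking the ratio, rather than averaging per-image scores, is the assumption made here; benchmark servers may additionally exclude ignored labels.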