This repository contains the official PyTorch implementation for MoSA.
See `env_setup.sh` for environment setup.
- Fine-Grained Visual Classification (FGVC): The datasets can be downloaded following the official links. We split the training data when a public validation set is not available; the split datasets can be found here: Dropbox, Google Drive.
- Visual Task Adaptation Benchmark (VTAB): See `VTAB_SETUP.md` for detailed instructions and tips.
- General Image Classification Datasets (GICD): The datasets are downloaded automatically when you run an experiment that uses them via MoSA (see the sketch after this list).
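As a rough illustration of the automatic-download behaviour mentioned for GICD above, here is a minimal, hypothetical sketch using torchvision's CIFAR-100 loader (the dataset class, resolution, and paths are assumptions for illustration; the actual GICD loaders in this repo may differ):

```python
# Hypothetical sketch: torchvision-style datasets fetch their data on first use.
# The concrete GICD loaders in this repo may use different classes and paths.
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),  # ViT-style input resolution
    transforms.ToTensor(),
])

# download=True pulls the archive into `root` only if it is not already there.
train_set = datasets.CIFAR100(root="./data", train=True,
                              download=True, transform=transform)
print(len(train_set), "training images")
```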
Download the pre-trained Transformer-based backbones below and place them under `MODEL.MODEL_ROOT`.
| Pre-trained Backbone | Link | md5sum |
|---|---|---|
| ViT-B/16 | link | d9715d |
| ViT-L/16 | link | 8f39ce |
| Swin-B | link | bf9cc1 |
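To check that a downloaded checkpoint is intact, you can compare its md5 digest with the md5sum column above. A small helper sketch (the checkpoint file name is a placeholder, not the exact file behind the links):

```python
# Compare a downloaded backbone checkpoint against the md5sum column above.
import hashlib
from pathlib import Path

def md5sum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the md5 hex digest of a file, reading it in chunks."""
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

ckpt = Path("MODEL_ROOT/ViT-B_16.npz")  # placeholder path and file name
digest = md5sum(ckpt)
print(digest)  # the table lists a short prefix, e.g. d9715d for ViT-B/16
```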
To fine-tune a pre-trained ViT model via MoSA on FGVC-cub, you can run:
```bash
bash scripts/mosa/vit/FGVC/cub.sh
```
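The script above drives the full training pipeline. As a generic, hedged sketch of what adapter-style parameter-efficient fine-tuning looks like (this is not the MoSA implementation; the classifier head stands in here for the trainable adapter parameters), the pre-trained backbone is frozen and only a small set of parameters is optimized:

```python
# Generic illustration of parameter-efficient fine-tuning: freeze the backbone,
# train only a small set of parameters. NOT the actual MoSA training code.
import timm
import torch

# CUB-200-2011 has 200 classes.
model = timm.create_model("vit_base_patch16_224", pretrained=True, num_classes=200)

# Freeze every backbone parameter ...
for p in model.parameters():
    p.requires_grad = False
# ... then re-enable only the part we want to tune (here: the classifier head;
# adapter methods would instead unfreeze their inserted adapter modules).
for p in model.head.parameters():
    p.requires_grad = True

trainable = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable, lr=1e-3, weight_decay=1e-4)
print(sum(p.numel() for p in trainable), "trainable parameters")
```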
The majority of MoSA is licensed under the CC-BY-NC 4.0 license (see LICENSE for details).