This repository contains code and models for:
Vision Transformers for Dense Prediction
- Download the model weights and place them in the weights folder:
Monodepth:
Segmentation:
- Set up dependencies:
conda install pytorch torchvision opencv
pip install timm
The code was tested with Python 3.7, PyTorch 1.8.0, OpenCV 4.5.1, and timm 0.4.5.
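To compare a local environment against the tested versions listed above, a small check along the following lines may help. This is only a sketch and not part of the repository: the helper names `installed_version` and `report` are hypothetical, and it assumes each package exposes a `__version__` attribute under the import names `torch`, `cv2`, and `timm`. Other versions may work as well; the script only reports what it finds.

```python
import importlib
import sys

# Versions the README reports testing against (import name -> version).
TESTED_VERSIONS = {"torch": "1.8.0", "cv2": "4.5.1", "timm": "0.4.5"}


def installed_version(module_name):
    """Return the module's version string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Assumption: the package exposes __version__; fall back to "unknown".
    return str(getattr(module, "__version__", "unknown"))


def report():
    """Build one human-readable line per dependency, Python itself first."""
    lines = ["python: found %d.%d (tested with 3.7)" % sys.version_info[:2]]
    for name, tested in TESTED_VERSIONS.items():
        found = installed_version(name)
        status = "not installed" if found is None else "found " + found
        lines.append("%s: %s (tested with %s)" % (name, status, tested))
    return lines


if __name__ == "__main__":
    print("\n".join(report()))
```

A package reported as "not installed" points back to the two install commands in this step.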
- Place one or more input images in the folder input.
- Run a monocular depth estimation model:
python run_monodepth.py
Or run a semantic segmentation model:
python run_segmentation.py
- The results are written to the folders output_monodepth and output_segmentation, respectively.
Use the flag -t to switch between different models. Possible options are dpt_hybrid (default) and dpt_large.
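For running both tasks with a chosen model from one place, the commands above can be wrapped in a short Python script. The following is a hedged sketch rather than anything shipped with the repository: `build_command`, `run`, `SCRIPTS`, and `MODEL_TYPES` are hypothetical names, only the script names and the `-t` values mentioned above are used, and the script is assumed to be started from the repository root.

```python
import subprocess
import sys

# Script names and model types exactly as listed in the README.
SCRIPTS = {
    "monodepth": "run_monodepth.py",
    "segmentation": "run_segmentation.py",
}
MODEL_TYPES = ("dpt_hybrid", "dpt_large")


def build_command(task, model_type="dpt_hybrid"):
    """Return the argument list for one run, e.g. [python, script, -t, model]."""
    if task not in SCRIPTS:
        raise ValueError("unknown task: %r" % (task,))
    if model_type not in MODEL_TYPES:
        raise ValueError("unknown model type: %r" % (model_type,))
    # sys.executable reuses the interpreter that is running this wrapper.
    return [sys.executable, SCRIPTS[task], "-t", model_type]


def run(task, model_type="dpt_hybrid"):
    """Run the chosen script and raise CalledProcessError if it exits non-zero."""
    subprocess.run(build_command(task, model_type), check=True)
```

Calling `run("monodepth", "dpt_large")` would then be equivalent to `python run_monodepth.py -t dpt_large`, and `run("segmentation")` would use the default dpt_hybrid model.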