# Where-s-Whale-do_Competition (First Place Solution)

For this competition, your goal is to identify which images in a database contain the same individual beluga whale seen in a query image.

You will be provided with a set of queries, each one specifying a single query image of a beluga whale and a corresponding database to search for matches to that same individual. The database will include images of both matching and non-matching belugas. This is a learning-to-rank information retrieval task.
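
To make the task concrete, here is a minimal, hypothetical sketch of how such a ranking can be produced once per-image embeddings are available (the embedding models are what the rest of this repo trains). It is an illustration of the retrieval setup, not the competition code:

```python
# Illustration only: rank database images by cosine similarity to a query embedding.
import numpy as np

def rank_database(query_emb: np.ndarray, db_embs: np.ndarray) -> np.ndarray:
    """Return database indices ordered from most to least similar to the query."""
    q = query_emb / np.linalg.norm(query_emb)                     # L2-normalise the query
    d = db_embs / np.linalg.norm(db_embs, axis=1, keepdims=True)  # L2-normalise the database
    return np.argsort(-(d @ q))                                   # dot product == cosine similarity

# Toy example with random 512-d embeddings for one query and five database images.
rng = np.random.default_rng(0)
print(rank_database(rng.normal(size=512), rng.normal(size=(5, 512))))
```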

## PACKAGES

To make sure that the results are completely reproducible, it is better to use the same package versions and Python 3.6.9:

- numpy == 1.19.5
- opencv == 4.5.4
- timm == 0.5.4
- albumentations == 1.1.0
- sklearn == 0.24.2
- torch == 1.7.1+cu110
- torchvision == 0.8.2+cu110
- pandas == 1.1.5
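
For convenience, the list above can be captured in a requirements file roughly like the sketch below. The PyPI package names (`opencv-python` for opencv, `scikit-learn` for sklearn), the exact opencv micro version, and the extra wheel index for the `+cu110` builds are assumptions, not taken from the repo:

```
# requirements.txt (sketch; PyPI package names assumed)
numpy==1.19.5
opencv-python==4.5.4.60
timm==0.5.4
albumentations==1.1.0
scikit-learn==0.24.2
torch==1.7.1+cu110
torchvision==0.8.2+cu110
pandas==1.1.5
# Install with the PyTorch wheel index so the +cu110 builds resolve:
#   pip install -r requirements.txt -f https://download.pytorch.org/whl/torch_stable.html
```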

## Download weights

You can download the trained models from here

## Training

### Using Training Scripts

- Clone the repo
- Add the data to the same directory inside the repo in a folder named `data`, which includes a subfolder called `images` and the `metadata.csv`, as shown in the tree below:
```
├── train_literal (source code)
├── train_top (source code)
└── data (dataset)
    ├── images
    │   ├── img1.jpg
    │   └── ...
    └── metadata.csv
```
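
Before launching training, an optional check like the following (a sketch, assuming it is run from the repo root) confirms the data landed where the scripts expect it:

```python
# Optional sanity check of the expected data layout (run from the repo root).
from pathlib import Path

data = Path("data")
assert (data / "metadata.csv").is_file(), "data/metadata.csv is missing"
assert any((data / "images").glob("*.jpg")), "no .jpg files found in data/images/"
print("data layout looks OK")
```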

### Train top models

From inside the `train_top` folder, run:

```python
from train import run
run(0, "effb5")
```

This will train fold 0. To train the other folds for effb5, just change the fold number to any of 0, 1, 2, 3, 4. To train the top model with effv2m, run:

```python
from train import run
run(0, "effv2")
```

To train the other folds, just change the fold number; only three folds were trained (0, 2, 4).

### Train literal models

From inside the `train_literal` folder, run:

```python
from train import run
run(0, "effb5")
```

This will train fold 0. To train the other folds for effb5, just change the fold number; only four folds were trained (0, 1, 2, 4).

PS: after training finishes, rename the weight files to avoid overwriting them when starting training for a new fold.
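
A small driver script along these lines can automate that: it trains the listed folds one by one and renames the checkpoint after each fold. The checkpoint filename used here (`best_weights.pth`) is only an assumption; substitute whatever name the training script actually writes:

```python
# Hypothetical helper, run from inside train_top (or train_literal).
# Trains each listed fold and renames its checkpoint so the next fold
# does not overwrite it.
from pathlib import Path
from train import run

MODEL = "effb5"
FOLDS = [0, 1, 2, 3, 4]             # effv2 top: [0, 2, 4]; effb5 literal: [0, 1, 2, 4]
WEIGHTS = Path("best_weights.pth")  # assumed output filename; adjust to the real one

for fold in FOLDS:
    run(fold, MODEL)
    if WEIGHTS.exists():
        WEIGHTS.rename(f"{MODEL}_fold{fold}.pth")
```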

### OR Using Notebooks

There are some other notebooks that were used during experiments, but none of them was used for the best or final submissions.

## Inference

The final (eligible) submission can be found here, and another version of the same submission with a little refactoring is here, but the latter has not been tested.

Other submissions can be found in the submissions folder.

## Specification

- CPU: Intel Core i9, 10th generation
- GPU: RTX 3090
- OS: Linux
- Memory: 128 GB

## Training Duration

- Effb5 top: about 180 minutes for a single fold (5 folds were trained), about 15 hours in total
- Effv2 top: about 220 minutes for a single fold (3 folds were trained), about 11 hours in total
- Effb5 literal: about 180 minutes for a single fold (4 folds were trained), about 12 hours in total

## Inference Duration

It took almost 2 hours and 45 minutes in the DrivenData competition environment.

## Solution Architecture

(Solution architecture diagram)