aassxun/DSH_Analysis

An Empirical Study on Training Paradigms for Deep Supervised Hashing

This project focuses on systematically evaluating the two main training paradigms in deep supervised hashing: pairwise hashing and pointwise hashing. Deep supervised hashing has become essential for large-scale image retrieval tasks, offering efficient storage and retrieval capabilities by transforming high-dimensional image data into compact binary hash codes. The study provides an extensive quantitative exploration, comparing the performance of these paradigms across multiple datasets.

Dataset and Experimental Setup

The experiments are conducted on both single-label and multi-label datasets, utilizing a variety of hash code dimensions (e.g., 16-bit, 32-bit, and 64-bit). The evaluation protocol covers 1,833 experiments, involving 17 different methods across 9 single-label datasets (3 generic and 6 fine-grained) and 3 multi-label datasets, ensuring a comprehensive assessment of retrieval performance under various conditions, including seen and unseen class scenarios.

Datasets:

  • Generic single-label datasets: ImageNet-1K, CIFAR-10, and CIFAR-100.
  • Fine-grained single-label datasets: CUB200-2011, Food101, Aircraft, NABirds, Stanford Dogs, and VegFru.
  • Multi-label datasets: MS COCO, NUS-WIDE, and Flickr-25K.

Note: Following previous DSH settings, models are pretrained on the ImageNet-1K dataset, which may lead to data leakage on ImageNet-1K and Stanford Dogs.

Evaluation Protocols:

  1. seen@seen: Both query images and database images are from seen classes.
  2. seen@all: Based on the "seen@seen" protocol, unseen classes' images are added to the database.
  3. unseen@unseen: Both query images and database images are from unseen classes.
  4. unseen@all: Based on the "unseen@unseen" protocol, seen classes' images are added to the database, so it contains both seen and unseen images.
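
The four protocols differ only in which classes supply the queries and which classes populate the database. Below is a minimal sketch of that selection for the single-label case, assuming integer class ids. The function name `build_protocol_split` and its arguments are hypothetical and are not taken from the repository.

```python
import numpy as np

def build_protocol_split(query_labels, db_labels, seen_classes, protocol):
    """Return (query_idx, db_idx) index arrays for one of the four protocols.

    query_labels / db_labels: 1-D integer class ids (single-label case).
    seen_classes: iterable of class ids treated as seen.
    protocol: "seen@seen", "seen@all", "unseen@unseen" or "unseen@all".
    """
    valid = {"seen@seen", "seen@all", "unseen@unseen", "unseen@all"}
    if protocol not in valid:
        raise ValueError(f"unknown protocol: {protocol}")
    query_side, db_side = protocol.split("@")

    q_seen = np.isin(query_labels, list(seen_classes))
    d_seen = np.isin(db_labels, list(seen_classes))

    # Queries come from either the seen or the unseen classes.
    q_mask = q_seen if query_side == "seen" else ~q_seen

    # The database is restricted to one side, or kept complete for "@all".
    if db_side == "all":
        d_mask = np.ones_like(d_seen, dtype=bool)
    elif db_side == "seen":
        d_mask = d_seen
    else:
        d_mask = ~d_seen
    return np.flatnonzero(q_mask), np.flatnonzero(d_mask)
```

With this layout, switching from "seen@seen" to "seen@all" leaves the query set untouched and only enlarges the database, which matches how the protocols are described above.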

The evaluation metrics for these four tasks are mAP@k, calculated as:

$${\rm mAP}@k = \frac{1}{Q} \sum_{q=1}^{Q} \frac{1}{\min (m_q, k)} \sum_{t=1}^{\min (n_q,k)} P_q(t)rel_q(t)$$

where $Q$ is the number of query images; $m_q$ is the number of database images relevant to the query image $q$ ($m_q > 0$); $n_q$ is the number of predictions returned by a method for query $q$; $P_q(t)$ is the precision at rank $t$ for the $q$-th query; and $rel_q(t)$ denotes the relevance of prediction $t$ for the $q$-th query: it equals $1$ if the $t$-th prediction is correct, and $0$ otherwise.
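
As a rough illustration of the formula, the sketch below computes mAP@k with NumPy from per-query relevance lists. The function `map_at_k` and its input layout are assumptions made for this example and are not the repository's evaluation code. Returning 0 for a query with $m_q = 0$ is also an assumption, since the definition above requires $m_q > 0$.

```python
import numpy as np

def map_at_k(relevance, num_relevant, k):
    """mAP@k following the formula above.

    relevance: list of 0/1 sequences; relevance[q][t-1] is rel_q(t) for the
        ranked predictions of query q (its length is n_q).
    num_relevant: m_q for each query.
    """
    ap_values = []
    for rel, m_q in zip(relevance, num_relevant):
        rel = np.asarray(rel, dtype=float)[:k]    # keep min(n_q, k) predictions
        if m_q == 0 or rel.size == 0:
            ap_values.append(0.0)                 # assumption: no contribution
            continue
        ranks = np.arange(1, rel.size + 1)
        precision = np.cumsum(rel) / ranks        # P_q(t) for t = 1..min(n_q, k)
        ap_values.append(float((precision * rel).sum() / min(m_q, k)))
    return float(np.mean(ap_values))

# Two toy queries: the first has 2 relevant items, the second has 3.
score = map_at_k([[1, 0, 1, 0], [0, 1]], [2, 3], k=4)
```

For the toy input, the first query contributes (1 + 2/3) / 2 and the second contributes (1/2) / 3, so `score` is 0.5.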

Experimental Settings:

For an equitable comparison, we standardized the experimental configurations across datasets by consolidating the settings of existing pairwise and pointwise hashing methods. The training phase consists of iterations, each of which runs a fixed number of epochs. For instance, 50 iterations with 3 epochs per iteration result in a total of 150 epochs.
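
The iteration/epoch arithmetic can be checked with a short sketch. The generator `epoch_schedule` is a hypothetical helper that only enumerates the nested loop structure; what a method does inside each iteration is not modeled. The settings dictionary copies the iteration and epoch values quoted in this README.

```python
def epoch_schedule(num_iterations, epochs_per_iteration):
    """Yield (iteration, epoch) pairs in training order, both 1-based."""
    for iteration in range(1, num_iterations + 1):
        for epoch in range(1, epochs_per_iteration + 1):
            yield iteration, epoch

# (iterations, epochs per iteration) as listed in the settings of this README.
settings = {
    "CIFAR-10 / CIFAR-100": (50, 3),
    "ImageNet-1K": (50, 1),
    "multi-label datasets": (50, 3),
    "fine-grained datasets": (40, 30),
}

# Total number of epochs per setting, obtained by walking the schedule.
totals = {name: sum(1 for _ in epoch_schedule(*cfg)) for name, cfg in settings.items()}
```

Here `totals` gives 150 epochs for the CIFAR and multi-label settings, 50 for ImageNet-1K, and 1200 for the fine-grained datasets.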

Settings for Generic Single-label Datasets:

  • CIFAR-10 and CIFAR-100: 2000 samples randomly selected from the training set, with 50 iterations and 3 epochs per iteration. The evaluation metric is mAP@1000.
  • ImageNet-1K: 130,000 images in 100 classes for training, and 5,000 for testing. Iteration: 50, Epoch: 1, Evaluation metric: mAP@1000.

We established 3 partition configurations for CIFAR-100 and ImageNet-1K:

  • The first 95%/85%/75% of categories as seen classes.
  • The last 5%/15%/25% of categories as unseen classes.

For CIFAR-10, we set only 1 configuration: the first 80% of categories as seen classes, and the last 20% as unseen classes.
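
The position-based partition can be sketched as follows. The helper `split_seen_unseen` is hypothetical, class ids are assumed to be ordered 0..C-1, and rounding the unseen count to the nearest integer is an assumption, since the README does not say how fractional counts are handled.

```python
def split_seen_unseen(num_classes, unseen_percent):
    """Split class ids 0..num_classes-1 into (seen, unseen) by position.

    The leading classes are seen and the trailing ones are unseen.
    Rounding the unseen count is an assumption of this sketch.
    """
    num_unseen = round(num_classes * unseen_percent / 100)
    num_seen = num_classes - num_unseen
    class_ids = list(range(num_classes))
    return class_ids[:num_seen], class_ids[num_seen:]

# The three configurations for a 100-class dataset such as CIFAR-100.
cifar100_splits = {p: split_seen_unseen(100, p) for p in (5, 15, 25)}

# The single CIFAR-10 configuration: 80% seen, 20% unseen.
cifar10_seen, cifar10_unseen = split_seen_unseen(10, 20)
```

For CIFAR-10 this yields classes 0 to 7 as seen and classes 8 and 9 as unseen.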

Settings for Multi-label Datasets:

  • Flickr-25K: Sampled 1000 images as queries and 20,000 as database points. Iteration: 50, Epoch: 3, Evaluation metric: mAP@5000.
  • NUS-WIDE: 21 most frequent categories used for evaluation. Iteration: 50, Epoch: 3, Evaluation metric: mAP@5000.
  • MS COCO: Pruned images with no category information. 82,081 images from the training set as database points, and 5,000 images from the validation set as queries. Iteration: 50, Epoch: 3, Evaluation metric: mAP@5000.

For multi-label datasets, two configurations were implemented:

  1. Seen classes (~95% of dataset): 32 categories (Flickr-25K), 10 categories (NUS-WIDE), 45 categories (MS COCO).
  2. Seen classes (~85% of dataset): 23 categories (Flickr-25K), 5 categories (NUS-WIDE), 27 categories (MS COCO).

Settings for Fine-grained Single-label Datasets:

  • Datasets: CUB200-2011, Food101, VegFru, Stanford Dogs, Aircraft, NABirds.
  • Iteration: 40, Epoch: 30.
  • Randomly selected 2000 samples from the training set, but 4000 for VegFru and NABirds.

We established 3 partition configurations for these datasets:

  • The first 95%/85%/75% of categories as seen classes.
  • The last 5%/15%/25% of categories as unseen classes.

For multi-label datasets, two images are defined as ground-truth neighbors (a similar pair) if they share at least one common label.
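
The shared-label rule can be written as a single matrix product over multi-hot label vectors. This is a minimal sketch, assuming labels are stored as 0/1 arrays of shape (number of images, number of labels); the function name `ground_truth_neighbors` is hypothetical.

```python
import numpy as np

def ground_truth_neighbors(query_labels, db_labels):
    """Boolean matrix S where S[i, j] is True when query i and database
    image j share at least one label. Inputs are multi-hot 0/1 arrays."""
    query_labels = np.asarray(query_labels)
    db_labels = np.asarray(db_labels)
    shared = query_labels @ db_labels.T   # count of common labels per pair
    return shared > 0

queries = [[1, 0, 1],
           [0, 1, 0]]
database = [[1, 0, 0],
            [0, 0, 0],
            [0, 1, 1]]
similar = ground_truth_neighbors(queries, database)
```

In the toy example, the first query is a neighbor of the first and third database images, and the second query is a neighbor of the third only. The second database image carries no labels, so it has no neighbors.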

Sample Results

Seen @ applications

Quantitative results with "seen@seen/seen@all" formats when query images are from seen classes on the CIFAR-100 and ImageNet-1K datasets (5% categories as unseen classes).

| Method | CIFAR-100 (mAP@1000) 16 bits | CIFAR-100 (mAP@1000) 32 bits | CIFAR-100 (mAP@1000) 64 bits | ImageNet-1K (mAP@1000) 16 bits | ImageNet-1K (mAP@1000) 32 bits | ImageNet-1K (mAP@1000) 64 bits |
| --- | --- | --- | --- | --- | --- | --- |
| CSQ | 62.9/62.6 | 66.8/62.9 | 62.9/62.1 | 84.3/84.2 | 87.4/87.3 | 88.2/88.1 |
| PSLDH | 59.3/58.9 | 67.7/67.0 | 70.5/69.5 | 79.4/79.3 | 81.6/81.5 | 81.0/80.8 |
| OrthoHash | 61.7/61.2 | 68.8/68.0 | 72.2/71.2 | 85.9/85.7 | 88.5/88.4 | 89.9/89.8 |
| HyP$^2$ Loss | —— | —— | —— | —— | —— | —— |
| FISH | 50.5/49.4 | 50.3/48.0 | 58.3/54.2 | 73.7/73.6 | 77.3/77.1 | 78.4/78.2 |
| MDSH | 57.3/56.9 | 67.5/66.7 | 69.1/68.1 | 73.2/73.1 | 76.8/76.7 | 79.7/79.6 |
| DPAH | 50.7/50.4 | 57.0/56.5 | 60.5/59.7 | 78.5/78.4 | 81.8/81.6 | 82.4/82.2 |
| DCDH | 53.4/52.9 | 61.9/61.1 | 60.5/59.6 | 78.1/77.7 | 82.3/82.2 | 84.6/84.4 |
| CHN | 70.1/69.8 | 75.2/74.6 | 77.4/76.4 | 83.7/83.6 | 86.5/86.4 | 87.8/87.7 |
| DSH | 13.7/13.6 | 24.8/24.4 | 30.9/30.4 | 65.7/65.4 | 79.0/78.8 | 82.9/82.7 |
| HashNet | 16.1/16.0 | 26.9/26.6 | 38.4/37.7 | 41.9/41.8 | 73.8/73.7 | 83.5/83.5 |
| ADSH | 32.8/31.8 | 58.0/56.7 | 72.2/70.4 | 81.7/81.5 | 87.3/87.1 | 88.4/88.2 |
| ExchNet | 34.7/33.7 | 58.4/56.8 | 70.0/67.5 | 79.5/79.2 | 85.9/85.7 | 88.4/88.2 |
| A$^2$-Net | 34.1/33.0 | 59.4/57.5 | 70.5/69.2 | 81.5/81.3 | 86.9/86.8 | 88.5/88.3 |
| SEMICON | 27.0/25.0 | 53.9/51.7 | 70.8/68.9 | 75.9/75.6 | 83.2/83.0 | 84.9/84.7 |
| AGMH | 50.2/48.6 | 70.1/67.9 | 76.7/74.6 | 85.3/85.1 | 85.6/85.4 | 83.3/83.0 |
| DAHNet* | 50.7/49.4 | 71.8/69.8 | 78.9/77.1 | 83.3/83.1 | 86.4/86.2 | 86.8/86.6 |
| DAHNet$^{--}$ | 38.3/37.3 | 65.5/63.8 | 77.2/75.0 | 81.8/81.6 | 87.2/87.0 | 87.3/87.2 |

Quantitative results with "seen@seen/seen@all" formats when query images are from seen classes on the Flickr-25K, NUS-WIDE and MS COCO datasets (around 5% images as unseen classes).

| Method | Flickr-25K (mAP@5000) 16 bits | Flickr-25K (mAP@5000) 32 bits | Flickr-25K (mAP@5000) 64 bits | NUS-WIDE (mAP@5000) 16 bits | NUS-WIDE (mAP@5000) 32 bits | NUS-WIDE (mAP@5000) 64 bits | MS COCO (mAP@5000) 16 bits | MS COCO (mAP@5000) 32 bits | MS COCO (mAP@5000) 64 bits |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CSQ | 88.1/88.9 | 90.3/90.4 | 91.2/90.8 | 89.3/89.8 | 90.2/90.5 | 90.5/90.9 | 63.7/78.4 | 70.6/82.7 | 71.7/83.5 |
| PSLDH | 86.4/87.6 | 91.3/91.1 | 90.9/90.7 | 89.5/90.0 | 90.6/90.8 | 89.7/90.0 | 65.1/78.3 | 69.0/81.2 | 70.1/82.5 |
| OrthoHash | 86.9/87.9 | 89.3/89.8 | 91.0/91.3 | 88.7/89.3 | 90.4/90.9 | 91.1/91.5 | 61.1/75.3 | 69.3/82.0 | 72.5/84.4 |
| HyP$^2$ Loss | 87.2/88.7 | 89.8/90.8 | 90.5/90.9 | 88.4/89.0 | 89.4/89.6 | 89.9/90.0 | 61.6/77.5 | 68.3/81.3 | 71.5/83.4 |
| FISH | —— | —— | —— | —— | —— | —— | —— | —— | —— |
| MDSH | —— | —— | —— | —— | —— | —— | —— | —— | —— |
| DPAH | 49.6/57.2 | 49.6/57.2 | 49.6/57.2 | 34.0/34.5 | 34.0/34.5 | 38.2/38.8 | 15.9/40.6 | 15.9/40.6 | 16.2/41.1 |
| DCDH | 80.4/83.0 | 81.2/83.1 | 83.8/85.0 | 87.3/87.7 | 87.7/88.1 | 87.9/88.4 | 53.9/71.3 | 60.7/77.7 | 60.3/78.9 |
| CHN | —— | —— | —— | —— | —— | —— | —— | —— | —— |
| DSH | 76.6/80.4 | 81.3/84.3 | 77.8/81.3 | 82.1/82.9 | 79.3/80.2 | 80.5/81.2 | 48.2/69.3 | 55.1/73.8 | 52.3/71.2 |
| HashNet | 87.9/89.3 | 91.0/91.6 | 92.1/92.9 | 83.3/83.9 | 88.6/89.1 | 90.9/91.4 | 51.9/72.2 | 61.7/78.6 | 67.2/82.5 |
| ADSH | 91.7/85.6 | 92.3/85.7 | 91.1/80.5 | 94.9/84.1 | 95.5/78.3 | 95.8/72.2 | 73.6/75.9 | 80.8/77.2 | 81.8/74.8 |
| ExchNet | 93.3/84.8 | 92.3/80.2 | 90.0/73.8 | 93.7/78.0 | 94.4/78.1 | 94.6/71.4 | 85.6/77.5 | 89.7/74.4 | 90.1/73.9 |
| A$^2$-Net | 87.0/70.7 | 82.9/60.1 | 81.7/56.6 | 93.9/76.1 | 94.2/67.9 | 82.5/74.1 | 73.9/76.8 | 79.6/73.5 | 81.2/69.5 |
| SEMICON | 84.4/72.4 | 85.7/74.7 | 86.8/75.2 | 91.8/82.3 | 93.4/76.8 | 91.8/66.9 | 70.3/72.9 | 75.7/70.0 | 74.2/66.9 |
| AGMH | 81.1/56.0 | 82.5/60.2 | 84.6/65.2 | 82.7/40.1 | 86.1/48.4 | 85.0/81.2 | 69.0/67.4 | 61.9/36.6 | 64.1/47.8 |
| DAHNet* | —— | —— | —— | —— | —— | —— | —— | —— | —— |
| DAHNet$^{--}$ | 87.4/74.1 | 85.3/67.5 | 87.0/72.2 | 93.1/75.1 | 91.5/66.3 | 89.9/62.1 | 72.4/76.5 | 78.4/73.7 | 77.6/68.8 |

Quantitative results with "seen@seen/seen@all" formats when queries are from seen classes on the CUB200-2011, Food101 and VegFru datasets (5% categories as unseen classes).

| Method | CUB200-2011 (mAP@All) 12 bits | CUB200-2011 (mAP@All) 24 bits | CUB200-2011 (mAP@All) 32 bits | CUB200-2011 (mAP@All) 48 bits | Food101 (mAP@All) 12 bits | Food101 (mAP@All) 24 bits | Food101 (mAP@All) 32 bits | Food101 (mAP@All) 48 bits | VegFru (mAP@All) 12 bits | VegFru (mAP@All) 24 bits | VegFru (mAP@All) 32 bits | VegFru (mAP@All) 48 bits |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CSQ | 68.9/68.8 | 79.5/79.4 | 81.2/81.0 | 81.7/81.6 | 55.0/54.8 | 59.7/59.5 | 61.4/61.1 | 61.8/61.5 | 61.1/61.0 | 82.6/82.6 | 83.8/83.7 | 84.0/83.9 |
| PSLDH | 73.6/73.6 | 77.4/77.3 | 77.4/77.4 | 78.2/78.1 | 48.0/47.8 | 55.2/55.0 | 56.9/56.6 | 61.4/60.9 | 72.6/72.5 | 79.4/79.3 | 80.2/80.1 | 81.5/81.4 |
| OrthoHash | 73.2/73.1 | 81.0/80.9 | 81.8/81.6 | 82.5/82.4 | 55.5/55.2 | 65.1/64.6 | 66.8/66.4 | 69.3/68.7 | 78.2/78.1 | 84.0/83.9 | 84.9/84.8 | 85.8/85.7 |
| HyP$^2$ Loss | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— |
| FISH | 75.6/75.5 | 76.7/76.7 | 77.1/77.0 | 77.7/77.7 | 79.3/79.2 | 80.7/80.5 | 80.7/80.4 | 81.0/80.8 | 82.1/81.9 | 84.7/84.5 | 84.8/84.6 | 84.7/84.6 |
| MDSH | 70.8/70.8 | 74.4/74.3 | 75.2/75.1 | 75.5/75.4 | 45.0/44.9 | 52.5/52.1 | 56.1/55.7 | 59.3/58.8 | 66.7/66.6 | 71.8/71.7 | 73.2/73.0 | 75.3/75.1 |
| DPAH | 72.8/72.7 | 78.8/78.7 | 78.9/78.7 | 80.2/80.0 | 49.9/49.7 | 60.9/60.4 | 62.9/62.3 | 65.8/65.1 | 72.9/72.8 | 81.0/80.7 | 82.5/82.3 | 84.2/83.8 |
| DCDH | 65.3/65.1 | 76.1/75.7 | 78.5/78.2 | 80.6/80.4 | 44.4/44.0 | 55.4/54.8 | 58.7/58.1 | 59.6/58.8 | 63.5/63.3 | 76.3/75.8 | 78.8/78.3 | 81.3/80.8 |
| CHN | 75.1/75.0 | 80.0/79.8 | 81.2/81.0 | 81.5/81.4 | 80.1/79.9 | 81.2/81.0 | 81.3/81.1 | 81.7/81.5 | 82.6/82.4 | 84.2/84.0 | 84.9/84.7 | 85.4/85.2 |
| DSH | 32.8/32.5 | 53.0/52.7 | 70.4/69.9 | 78.3/77.9 | 25.5/25.2 | 38.3/37.9 | 52.2/51.6 | 60.5/59.8 | 18.9/18.8 | 29.4/29.1 | 37.5/37.1 | 65.5/64.9 |
| HashNet | 13.0/12.9 | 33.5/33.4 | 38.2/38.0 | 45.7/45.5 | 17.6/17.5 | 30.6/30.3 | 36.8/36.4 | 43.9/43.3 | 13.4/13.3 | 33.1/32.9 | 39.8/39.5 | 45.3/44.8 |
| ADSH | 32.0/31.5 | 57.3/56.8 | 65.8/65.4 | 76.5/76.1 | 45.0/43.3 | 69.2/67.0 | 77.0/75.2 | 81.4/78.9 | 22.2/21.8 | 46.0/45.1 | 51.5/50.4 | 64.8/63.4 |
| ExchNet | 30.5/29.7 | 58.8/57.6 | 68.4/67.5 | 76.8/76.1 | 45.9/43.5 | 70.4/67.9 | 76.6/74.1 | 81.6/78.7 | 25.4/24.5 | 48.0/46.4 | 55.7/54.1 | 66.6/65.0 |
| A$^2$-Net | 31.9/31.5 | 62.5/61.9 | 69.5/68.9 | 80.6/80.2 | 44.8/43.0 | 71.2/69.2 | 78.0/76.0 | 81.8/78.9 | 24.5/24.1 | 46.6/45.6 | 57.1/55.8 | 67.6/66.5 |
| SEMICON | 37.7/37.2 | 66.8/66.3 | 74.4/74.2 | 80.8/80.5 | 46.5/45.1 | 73.5/72.1 | 79.3/77.5 | 80.9/78.2 | 25.4/24.8 | 52.8/50.9 | 60.1/57.3 | 79.4/78.0 |
| AGMH | 59.9/59.6 | 80.6/80.2 | 82.4/82.0 | 82.7/81.8 | 71.3/70.5 | 78.6/76.7 | 78.5/75.0 | 78.6/74.6 | 49.1/48.4 | 77.9/76.6 | 81.2/80.0 | 84.1/82.3 |
| DAHNet* | 60.6/59.9 | 80.7/80.3 | 82.3/81.9 | 84.1/83.6 | 70.4/69.6 | 81.2/80.1 | 82.5/80.5 | 83.2/80.1 | 63.3/62.8 | 82.5/81.7 | 85.5/85.0 | 88.0/87.4 |
| DAHNet$^{--}$ | 41.2/40.7 | 69.7/69.4 | 76.2/75.9 | 82.3/81.9 | 51.1/49.8 | 77.7/76.2 | 81.0/79.0 | 82.3/79.3 | 34.1/33.6 | 63.3/62.2 | 73.0/72.0 | 81.6/80.3 |

Quantitative results with "seen@seen/seen@all" formats when queries are from seen classes on the Stanford Dogs, NABirds, and Aircraft datasets (5% categories as unseen classes).

| Method | Stanford Dogs (mAP@All) 12 bits | Stanford Dogs (mAP@All) 24 bits | Stanford Dogs (mAP@All) 32 bits | Stanford Dogs (mAP@All) 48 bits | NABirds (mAP@All) 12 bits | NABirds (mAP@All) 24 bits | NABirds (mAP@All) 32 bits | NABirds (mAP@All) 48 bits | Aircraft (mAP@All) 12 bits | Aircraft (mAP@All) 24 bits | Aircraft (mAP@All) 32 bits | Aircraft (mAP@All) 48 bits |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CSQ | 73.3/73.2 | 80.2/80.1 | 81.5/81.4 | 81.8/81.7 | 52.2/52.1 | 74.2/74.1 | 76.1/76.1 | 77.4/77.3 | 73.8/73.7 | 79.3/79.2 | 81.1/81.0 | 80.7/80.7 |
| PSLDH | 74.0/74.0 | 75.7/75.6 | 75.8/75.7 | 76.6/76.4 | 63.3/63.3 | 72.2/72.2 | 73.6/73.6 | 74.5/74.4 | 75.6/75.6 | 80.8/80.7 | 80.5/80.4 | 81.9/81.8 |
| OrthoHash | 80.2/80.1 | 83.4/83.2 | 83.8/83.7 | 84.9/84.7 | 62.2/62.2 | 75.1/75.0 | 77.3/77.1 | 78.7/78.6 | 76.2/76.1 | 79.3/79.1 | 81.0/80.8 | 81.8/81.6 |
| HyP$^2$ Loss | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— |
| FISH | 76.0/75.9 | 76.4/76.4 | 76.9/76.8 | 77.1/77.0 | 61.7/61.5 | 74.7/74.0 | 75.2/74.5 | 75.9/75.0 | 84.4/84.4 | 84.5/84.5 | 83.8/83.8 | 84.1/84.0 |
| MDSH | 68.0/67.9 | 71.4/71.3 | 72.2/72.0 | 73.0/72.8 | 46.8/46.7 | 64.3/64.1 | 66.9/66.7 | 68.2/67.9 | 77.1/77.0 | 80.6/80.5 | 79.6/79.5 | 81.2/81.1 |
| DPAH | 72.8/72.7 | 78.2/78.0 | 79.2/78.9 | 80.0/79.5 | 41.1/41.0 | 74.0/73.8 | 74.2/74.1 | 77.0/76.7 | 79.5/79.4 | 83.5/83.4 | 86.0/85.9 | 86.9/86.7 |
| DCDH | 68.4/68.0 | 76.2/75.8 | 79.1/78.7 | 80.4/80.0 | 51.6/51.5 | 67.5/67.2 | 69.8/69.5 | 72.3/72.0 | 70.7/70.3 | 79.8/79.6 | 82.8/82.5 | 84.9/84.6 |
| CHN | 80.3/80.1 | 80.6/80.4 | 80.7/80.6 | 81.1/80.9 | 62.6/62.4 | 76.5/76.1 | 77.8/77.4 | 78.3/78.0 | 80.7/80.6 | 81.4/81.3 | 81.1/81.0 | 81.6/81.5 |
| DSH | 49.8/49.5 | 71.4/70.8 | 80.9/80.3 | 82.7/82.1 | 8.6/8.5 | 12.3/12.2 | 15.0/14.8 | 20.9/20.6 | 57.6/57.3 | 81.1/80.5 | 84.9/84.6 | 84.6/84.2 |
| HashNet | 38.9/38.8 | 59.9/59.7 | 65.5/65.3 | 68.8/68.5 | 3.0/3.0 | 10.9/10.9 | 16.9/16.7 | 20.9/20.6 | 12.1/12.1 | 25.0/24.9 | 31.1/31.0 | 36.3/36.0 |
| ADSH | 73.0/72.6 | 84.1/83.5 | 85.7/84.8 | 87.4/86.1 | 8.3/8.2 | 20.2/19.8 | 24.2/23.7 | 30.3/29.7 | 42.6/42.1 | 65.3/64.2 | 74.5/73.6 | 79.1/77.5 |
| ExchNet | 69.7/68.9 | 84.2/83.4 | 85.3/83.9 | 86.9/84.7 | 9.3/9.1 | 20.8/20.1 | 24.4/23.7 | 32.3/31.4 | 39.8/38.2 | 72.5/71.1 | 80.5/79.2 | 83.7/81.3 |
| A$^2$-Net | 70.8/69.9 | 84.0/83.0 | 86.2/85.4 | 86.7/85.5 | 8.3/8.2 | 21.6/21.2 | 26.5/26.0 | 37.6/37.0 | 46.3/45.3 | 69.3/68.6 | 77.0/75.6 | 83.3/81.7 |
| SEMICON | 64.7/63.8 | 81.2/80.4 | 84.6/84.0 | 84.5/83.2 | 9.3/9.1 | 23.0/22.4 | 29.4/28.9 | 40.0/39.2 | 52.7/52.2 | 77.9/77.0 | 78.1/76.7 | 84.4/82.5 |
| AGMH | 80.3/80.1 | 82.5/81.1 | 82.0/80.0 | 80.4/77.6 | 12.2/12.0 | 25.9/25.0 | 36.5/35.2 | 49.2/47.9 | 75.6/75.3 | 82.9/81.8 | 82.4/80.2 | 84.5/81.7 |
| DAHNet* | 76.8/76.4 | 84.5/83.8 | 85.1/84.1 | 85.1/83.3 | 20.1/19.9 | 38.6/37.8 | 48.0/46.8 | 58.0/56.9 | 80.8/80.6 | 84.1/83.4 | 85.5/84.6 | 87.2/85.3 |
| DAHNet$^{--}$ | 74.9/74.4 | 86.7/86.2 | 86.9/86.4 | 85.7/84.1 | 10.0/9.8 | 24.0/23.5 | 30.8/30.2 | 41.9/41.0 | 56.8/56.4 | 77.4/76.6 | 83.8/83.0 | 86.6/85.0 |

Notes:

  • Results marked in bold represent the optimal values under the same experimental conditions.
  • DAHNet* denotes that the pairwise method also uses class labels for supervision.
  • DAHNet$^{--}$ denotes a purely pairwise version of DAHNet.

Unseen @ applications

Quantitative results with "unseen@unseen/unseen@all" formats when query images are from unseen classes on the CIFAR-100 and ImageNet-1K datasets (5% categories as unseen classes).

| Method | CIFAR-100 (mAP@1000) 16 bits | CIFAR-100 (mAP@1000) 32 bits | CIFAR-100 (mAP@1000) 64 bits | ImageNet-1K (mAP@1000) 16 bits | ImageNet-1K (mAP@1000) 32 bits | ImageNet-1K (mAP@1000) 64 bits |
| --- | --- | --- | --- | --- | --- | --- |
| CSQ | 65.8/12.6 | 77.8/17.4 | 87.3/23.3 | 52.0/32.5 | 61.1/48.1 | 66.6/58.0 |
| PSLDH | 61.6/9.7 | 71.7/13.5 | 82.7/17.9 | 43.5/21.3 | 57.8/36.1 | 75.7/50.4 |
| OrthoHash | 61.7/10.5 | 76.3/16.7 | 84.1/21.7 | 47.5/28.7 | 62.3/43.5 | 73.1/59.2 |
| HyP$^2$ Loss | —— | —— | —— | —— | —— | —— |
| FISH | 60.8/43.2 | 73.6/52.9 | 81.1/58.4 | 41.8/16.2 | 43.7/19.5 | 50.9/23.1 |
| MDSH | 63.4/10.4 | 71.7/14.3 | 82.2/17.8 | 41.0/15.6 | 46.4/28.6 | 54.3/33.4 |
| DPAH | 78.2/10.5 | 85.2/12.2 | 84.0/15.3 | 41.7/20.0 | 46.5/24.1 | 50.7/29.1 |
| DCDH | 59.3/10.1 | 77.9/16.0 | 85.5/20.8 | 45.5/12.9 | 59.3/31.6 | 77.7/54.4 |
| CHN | 63.6/27.3 | 72.9/26.4 | 81.5/29.9 | 45.2/24.9 | 63.5/43.4 | 64.0/49.4 |
| DSH | 67.2/7.5 | 71.5/13.1 | 78.8/13.8 | 55.5/26.0 | 61.5/37.8 | 66.6/38.9 |
| HashNet | 87.8/4.5 | 89.7/10.1 | 92.9/18.6 | 57.4/34.7 | 68.3/45.1 | 79.3/53.5 |
| ADSH | 79.7/39.4 | 84.0/58.4 | 87.5/45.5 | 52.8/18.8 | 66.4/43.6 | 62.4/35.1 |
| ExchNet | 75.8/54.0 | 83.8/62.4 | 87.3/48.3 | 65.5/33.9 | 67.1/38.6 | 70.8/42.4 |
| A$^2$-Net | 83.7/46.4 | 84.9/60.4 | 87.5/49.5 | 64.3/35.4 | 66.5/38.7 | 62.7/34.8 |
| SEMICON | 79.2/47.3 | 84.8/57.0 | 84.6/43.3 | 54.6/23.1 | 57.0/25.7 | 65.5/36.1 |
| AGMH | 85.5/56.1 | 84.5/59.1 | 81.4/42.2 | 47.0/27.5 | 58.8/31.6 | 62.5/36.2 |
| DAHNet* | 85.1/59.3 | 84.3/58.8 | 85.7/44.0 | 47.6/25.8 | 54.8/30.0 | 66.0/40.3 |
| DAHNet$^{--}$ | 82.9/51.8 | 87.8/63.9 | 84.9/45.7 | 47.0/21.3 | 52.4/26.9 | 67.4/43.9 |

Quantitative results with "unseen@unseen/unseen@all" formats when query images are from unseen classes on the Flickr-25K, NUS-WIDE and MS COCO datasets (around 5% images as unseen classes).

| Method | Flickr-25K (mAP@5000) 16 bits | Flickr-25K (mAP@5000) 32 bits | Flickr-25K (mAP@5000) 64 bits | NUS-WIDE (mAP@5000) 16 bits | NUS-WIDE (mAP@5000) 32 bits | NUS-WIDE (mAP@5000) 64 bits | MS COCO (mAP@5000) 16 bits | MS COCO (mAP@5000) 32 bits | MS COCO (mAP@5000) 64 bits |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CSQ | 41.9/27.3 | 42.8/30.1 | 42.4/28.8 | 44.4/23.2 | 46.4/24.6 | 47.5/25.3 | 58.1/59.0 | 60.7/61.8 | 60.4/62.7 |
| PSLDH | 41.0/27.0 | 42.8/31.3 | 43.6/31.2 | 46.9/26.6 | 47.2/29.0 | 49.9/31.7 | 58.4/57.1 | 59.8/61.7 | 63.2/65.3 |
| OrthoHash | 40.4/28.1 | 42.8/30.5 | 42.4/29.2 | 45.4/25.6 | 47.2/29.2 | 49.3/32.4 | 52.8/53.2 | 60.5/63.4 | 62.7/64.3 |
| HyP$^2$ Loss | 41.4/28.5 | 43.8/31.6 | 42.8/30.0 | 45.6/25.2 | 46.1/27.6 | 47.2/30.0 | 52.3/52.0 | 56.6/57.8 | 62.1/63.5 |
| FISH | —— | —— | —— | —— | —— | —— | —— | —— | —— |
| MDSH | —— | —— | —— | —— | —— | —— | —— | —— | —— |
| DPAH | 39.7/26.2 | 39.7/26.2 | 39.7/26.2 | 22.7/2.5 | 22.7/2.5 | 25.8/3.7 | 27.1/23.8 | 27.1/23.8 | 28.1/24.2 |
| DCDH | 43.9/32.4 | 42.7/28.6 | 42.5/28.5 | 43.3/23.5 | 47.8/29.9 | 48.6/29.1 | 52.8/50.0 | 60.8/61.2 | 63.4/66.9 |
| CHN | —— | —— | —— | —— | —— | —— | —— | —— | —— |
| DSH | 39.6/25.8 | 40.9/25.9 | 40.7/25.1 | 40.0/22.3 | 39.9/20.6 | 43.0/23.3 | 45.1/43.6 | 56.9/55.0 | 45.8/45.5 |
| HashNet | 39.8/22.8 | 39.6/22.3 | 44.1/30.2 | 44.9/24.5 | 46.7/29.2 | 49.0/30.9 | 46.5/42.1 | 57.1/58.6 | 64.3/64.5 |
| ADSH | 40.4/27.1 | 41.7/27.4 | 41.0/28.4 | 42.9/39.3 | 43.4/38.4 | 42.7/38.9 | 56.0/54.5 | 58.6/59.3 | 59.0/58.4 |
| ExchNet | 40.1/26.4 | 39.9/26.8 | 40.7/25.9 | 40.3/34.5 | 43.1/38.9 | 40.5/36.0 | 52.4/53.2 | 59.3/58.8 | 59.3/58.7 |
| A$^2$-Net | 40.0/27.7 | 39.6/33.9 | 39.4/35.3 | 43.5/37.4 | 44.0/40.9 | 35.6/25.9 | 54.4/54.9 | 59.2/59.1 | 57.5/57.8 |
| SEMICON | 39.8/29.3 | 41.1/29.1 | 41.2/28.8 | 43.4/38.8 | 40.1/35.2 | 35.8/32.0 | 49.3/48.2 | 54.6/55.1 | 54.5/54.7 |
| AGMH | 39.7/35.5 | 39.7/34.5 | 39.8/31.1 | 25.5/24.4 | 28.3/25.9 | 34.4/26.3 | 45.0/44.3 | 41.1/40.9 | 42.2/41.8 |
| DAHNet* | —— | —— | —— | —— | —— | —— | —— | —— | —— |
| DAHNet$^{--}$ | 42.2/29.8 | 40.1/31.2 | 41.3/30.2 | 41.5/36.5 | 39.4/35.7 | 30.8/28.6 | 51.8/52.0 | 56.5/56.3 | 56.8/56.7 |

Quantitative results with "unseen@unseen/unseen@all" formats when queries are from unseen classes on the CUB200-2011, Food101, and VegFru datasets (5% categories as unseen classes).

| Method | CUB200-2011 (mAP@All) 12 bits | CUB200-2011 (mAP@All) 24 bits | CUB200-2011 (mAP@All) 32 bits | CUB200-2011 (mAP@All) 48 bits | Food101 (mAP@All) 12 bits | Food101 (mAP@All) 24 bits | Food101 (mAP@All) 32 bits | Food101 (mAP@All) 48 bits | VegFru (mAP@All) 12 bits | VegFru (mAP@All) 24 bits | VegFru (mAP@All) 32 bits | VegFru (mAP@All) 48 bits |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CSQ | 23.0/2.0 | 24.9/2.8 | 31.8/5.0 | 35.5/6.3 | 32.6/1.8 | 33.7/2.2 | 35.0/2.5 | 36.3/3.0 | 19.5/2.0 | 26.7/3.0 | 27.4/3.5 | 28.1/3.6 |
| PSLDH | 25.8/2.2 | 20.0/1.9 | 28.0/3.3 | 24.9/3.0 | 29.8/1.5 | 33.5/2.2 | 34.2/2.6 | 36.8/2.9 | 21.9/2.0 | 23.5/2.4 | 21.4/2.2 | 23.2/2.5 |
| OrthoHash | 25.6/3.3 | 31.0/4.9 | 31.4/5.7 | 37.1/6.8 | 32.7/1.8 | 35.3/2.6 | 37.1/2.8 | 38.8/3.2 | 20.6/2.0 | 24.3/2.8 | 25.7/3.2 | 25.1/3.2 |
| HyP$^2$ Loss | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— |
| FISH | 21.8/2.0 | 20.2/1.8 | 22.0/2.1 | 25.1/3.7 | 29.7/1.6 | 31.6/2.3 | 29.9/2.0 | 32.9/2.7 | 18.8/1.9 | 20.3/2.5 | 20.9/2.5 | 19.7/2.1 |
| MDSH | 23.3/2.2 | 21.0/1.8 | 21.2/2.0 | 24.3/2.7 | 29.4/1.5 | 33.4/2.4 | 34.0/2.5 | 37.9/3.1 | 18.3/1.5 | 19.5/2.0 | 23.9/2.7 | 23.5/2.6 |
| DPAH | 23.4/1.9 | 27.2/3.0 | 29.2/4.3 | 35.1/7.4 | 34.4/1.9 | 41.5/3.2 | 45.1/3.5 | 46.9/4.2 | 24.1/2.2 | 29.8/3.0 | 33.7/4.0 | 34.2/4.9 |
| DCDH | 23.1/2.3 | 32.8/7.2 | 34.1/7.5 | 37.4/7.5 | 32.1/1.8 | 32.6/2.4 | 35.3/2.5 | 44.7/4.0 | 19.9/1.8 | 24.4/3.5 | 26.5/4.1 | 31.7/5.4 |
| CHN | 21.4/1.8 | 29.8/3.6 | 31.2/4.1 | 28.7/4.3 | 31.1/1.8 | 33.8/2.4 | 34.2/2.4 | 37.6/3.5 | 23.3/2.0 | 25.8/2.8 | 26.2/2.9 | 27.1/3.1 |
| DSH | 40.6/5.0 | 41.5/7.9 | 49.4/12.9 | 49.1/14.3 | 39.3/2.0 | 38.8/2.4 | 44.4/3.6 | 43.7/3.7 | 35.6/2.8 | 43.8/5.5 | 42.2/5.9 | 47.5/8.9 |
| HashNet | 29.5/2.0 | 45.0/7.2 | 43.3/6.4 | 51.6/10.5 | 46.0/2.2 | 53.9/3.5 | 54.7/4.1 | 55.8/4.9 | 33.5/1.2 | 46.0/3.2 | 50.7/5.3 | 55.8/7.9 |
| ADSH | 36.7/8.8 | 45.0/12.7 | 46.9/13.3 | 47.1/14.9 | 39.0/5.1 | 40.7/6.3 | 41.5/6.0 | 40.4/7.1 | 36.2/4.0 | 42.6/8.0 | 51.7/13.0 | 50.2/16.7 |
| ExchNet | 39.4/9.5 | 41.7/12.9 | 48.7/16.7 | 47.1/20.1 | 37.5/6.4 | 39.6/5.4 | 41.1/7.1 | 40.7/7.9 | 32.6/5.1 | 40.6/8.8 | 46.9/13.4 | 47.8/16.0 |
| A$^2$-Net | 41.2/8.3 | 46.9/16.4 | 46.4/15.5 | 49.6/17.2 | 38.8/5.4 | 42.2/6.8 | 41.5/6.5 | 42.0/6.9 | 34.7/3.5 | 46.1/10.8 | 48.9/13.5 | 50.2/15.0 |
| SEMICON | 40.7/8.4 | 42.8/11.0 | 48.8/15.9 | 51.7/18.9 | 35.9/3.5 | 40.3/5.2 | 41.8/5.9 | 37.1/5.2 | 35.6/4.7 | 41.9/10.6 | 43.8/14.2 | 48.8/14.2 |
| AGMH | 30.7/5.0 | 42.1/11.1 | 43.0/10.5 | 38.8/9.3 | 31.6/2.6 | 32.9/4.1 | 34.4/7.3 | 37.0/7.1 | 36.8/3.9 | 42.0/10.0 | 39.9/11.8 | 38.9/10.8 |
| DAHNet* | 33.1/6.5 | 39.9/9.0 | 43.3/12.2 | 47.5/16.6 | 29.6/2.1 | 34.4/4.1 | 42.5/7.6 | 39.4/7.1 | 19.2/1.7 | 29.7/6.1 | 31.3/6.0 | 35.4/7.8 |
| DAHNet$^{--}$ | 34.3/7.0 | 42.6/9.1 | 48.2/12.5 | 49.5/16.8 | 34.4/3.2 | 41.2/5.9 | 37.8/6.0 | 40.7/7.4 | 39.8/5.1 | 41.5/8.4 | 49.0/11.7 | 45.8/11.4 |

Quantitative results with "unseen@unseen/unseen@all" formats when queries are from unseen classes on the Stanford Dogs, NABirds, and Aircraft datasets.

| Method | Stanford Dogs 12 bits | Stanford Dogs 24 bits | Stanford Dogs 32 bits | Stanford Dogs 48 bits | NABirds 12 bits | NABirds 24 bits | NABirds 32 bits | NABirds 48 bits | Aircraft 12 bits | Aircraft 24 bits | Aircraft 32 bits | Aircraft 48 bits |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CSQ | 34.9/3.4 | 35.1/4.4 | 38.6/5.5 | 37.9/6.2 | 14.1/1.4 | 19.1/2.4 | 18.7/2.7 | 21.0/3.4 | 38.3/3.8 | 47.3/8.3 | 48.9/10.5 | 55.4/9.0 |
| PSLDH | 33.0/3.4 | 33.2/3.6 | 31.5/3.7 | 34.7/5.3 | 14.0/1.5 | 15.1/1.7 | 17.1/2.1 | 19.8/2.9 | 39.8/3.8 | 47.7/7.6 | 50.7/8.0 | 54.4/11.1 |
| OrthoHash | 33.7/4.2 | 41.5/6.7 | 36.9/5.6 | 39.9/7.1 | 12.7/1.2 | 19.1/2.6 | 20.4/2.9 | 20.4/3.5 | 49.3/7.1 | 46.0/8.7 | 56.3/11.5 | 54.9/10.4 |
| FISH | 29.0/3.2 | 33.0/3.5 | 33.0/4.0 | 32.0/4.5 | 14.4/1.7 | 16.2/3.0 | 15.1/2.7 | 16.0/3.4 | 44.3/6.8 | 45.7/7.4 | 46.1/8.2 | 48.5/8.8 |
| MDSH | 28.6/2.4 | 30.5/3.2 | 33.4/4.2 | 32.3/3.8 | 10.3/0.8 | 14.6/2.0 | 14.8/2.1 | 14.3/2.6 | 42.8/4.7 | 49.7/6.2 | 50.0/7.4 | 55.0/9.7 |
| DPAH | 38.3/3.9 | 40.2/4.9 | 40.2/5.8 | 44.2/8.3 | 14.0/1.0 | 23.4/3.3 | 25.3/3.6 | 28.6/5.4 | 55.8/7.4 | 59.9/10.0 | 61.0/11.0 | 62.3/14.7 |
| DCDH | 29.4/3.0 | 35.3/4.8 | 37.3/6.7 | 44.4/8.6 | 10.9/0.9 | 16.4/2.4 | 19.1/3.5 | 24.9/5.5 | 47.8/6.4 | 59.5/14.6 | 62.6/19.1 | 55.9/16.1 |
| CHN | 33.6/3.3 | 37.3/5.2 | 37.2/5.4 | 38.4/6.4 | 17.2/2.1 | 21.8/3.1 | 22.4/3.3 | 23.2/4.1 | 49.4/7.3 | 50.2/8.7 | 51.6/9.4 | 53.0/10.3 |
| DSH | 53.6/9.3 | 50.2/11.7 | 50.8/13.4 | 51.8/13.8 | 20.8/1.3 | 25.6/3.3 | 25.6/3.4 | 30.9/6.2 | 61.5/10.5 | 63.2/12.7 | 63.3/16.1 | 61.4/15.4 |
| HashNet | 53.9/5.1 | 55.9/10.6 | 57.7/16.8 | 58.3/18.0 | 14.4/0.8 | 24.4/2.1 | 29.1/3.4 | 33.4/4.5 | 51.0/2.4 | 55.6/7.7 | 61.9/10.5 | 63.8/11.5 |
| ADSH | 46.7/8.2 | 51.7/17.9 | 56.0/20.9 | 49.8/17.0 | 17.2/2.0 | 25.2/5.0 | 27.8/5.9 | 30.6/8.5 | 52.9/11.6 | 62.8/16.6 | 61.5/14.3 | 61.4/17.4 |
| ExchNet | 44.7/8.8 | 51.1/20.2 | 52.6/19.9 | 50.2/17.6 | 15.3/2.4 | 22.7/5.4 | 26.5/7.9 | 29.5/9.1 | 51.7/16.0 | 64.9/18.6 | 61.1/18.2 | 62.1/22.7 |
| A$^2$-Net | 54.2/15.6 | 51.8/16.8 | 51.1/13.9 | 49.5/14.3 | 14.5/1.8 | 24.6/5.0 | 26.1/5.6 | 32.6/10.2 | 53.3/16.8 | 60.7/17.5 | 62.2/17.1 | 61.6/18.3 |
| SEMICON | 42.1/10.0 | 50.2/14.4 | 50.1/13.5 | 47.7/15.2 | 19.0/3.0 | 25.2/6.5 | 25.9/6.0 | 32.4/11.1 | 58.1/12.2 | 65.1/13.4 | 58.0/14.7 | 60.0/12.0 |
| AGMH | 44.0/6.6 | 39.2/10.1 | 40.7/10.9 | 34.9/11.1 | 20.0/2.8 | 23.4/8.9 | 23.8/7.1 | 29.7/10.4 | 58.7/12.0 | 57.3/16.8 | 57.0/14.8 | 57.9/17.6 |
| DAHNet* | 35.0/5.3 | 46.6/10.8 | 42.6/9.4 | 40.9/12.0 | 14.8/1.6 | 19.0/5.4 | 24.4/9.0 | 28.3/11.0 | 57.6/8.3 | 62.5/16.5 | 59.3/15.1 | 64.6/18.8 |
| DAHNet$^{--}$ | 44.8/11.3 | 45.6/11.6 | 49.4/12.0 | 44.1/13.4 | 18.2/3.0 | 26.3/5.8 | 28.3/7.9 | 30.8/11.0 | 57.2/10.9 | 60.6/16.9 | 61.9/18.8 | 60.3/19.7 |

Notes:

  • DAHNet* denotes that the pairwise method also uses class labels for supervision.
  • DAHNet$^{--}$ denotes a purely pairwise version of DAHNet.

For more results and the conclusions, please refer to our paper.
