This project focuses on systematically evaluating the two main training paradigms in deep supervised hashing: pairwise hashing and pointwise hashing. Deep supervised hashing has become essential for large-scale image retrieval tasks, offering efficient storage and retrieval capabilities by transforming high-dimensional image data into compact binary hash codes. The study provides an extensive quantitative exploration, comparing the performance of these paradigms across multiple datasets.
The experiments are conducted on both single-label and multi-label datasets, utilizing a variety of hash code dimensions (e.g., 16-bit, 32-bit, and 64-bit). The evaluation protocol covers 1,833 experiments, involving 17 different methods across 9 single-label datasets (3 generic and 6 fine-grained) and 3 multi-label datasets, ensuring a comprehensive assessment of retrieval performance under various conditions, including seen and unseen class scenarios.
- Generic single-label datasets: ImageNet-1K, CIFAR-10, and CIFAR-100.
- Fine-grained single-label datasets: CUB200-2011, Food101, Aircraft, NABirds, Stanford Dogs, and VegFru.
- Multi-label datasets: MS COCO, NUS-WIDE, and Flickr-25K.
Note: Following previous DSH settings, models are pretrained on the ImageNet-1K dataset, which may lead to data leakage on ImageNet-1K and Stanford Dogs.
- seen@seen: Both query images and database images are from seen classes.
- seen@all: Based on the "seen@seen" protocol, unseen classes' images are added to the database.
- unseen@unseen: Both query images and database images are from unseen classes.
- unseen@all: Based on the "unseen@unseen" protocol, seen classes' images are added to the database, so it contains both seen and unseen images.
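The four protocols differ only in which images act as queries and which images form the database. The following is a minimal sketch of that selection, assuming single-label integer class ids; `build_protocol` and its arguments are hypothetical names used for illustration and are not taken from the benchmark code.

```python
def build_protocol(query_labels, db_labels, seen_classes, protocol):
    """Return (query_idx, db_idx) index lists for one evaluation protocol.

    protocol has the form "<query side>@<database side>", e.g. "seen@all".
    Labels are single-label class ids (an assumption of this sketch).
    """
    valid = {"seen@seen", "seen@all", "unseen@unseen", "unseen@all"}
    if protocol not in valid:
        raise ValueError(f"unknown protocol: {protocol}")

    query_side, db_side = protocol.split("@")
    want_seen = query_side == "seen"

    # Queries come only from seen classes or only from unseen classes.
    query_idx = [i for i, y in enumerate(query_labels)
                 if (y in seen_classes) == want_seen]

    if db_side == "all":
        # The database covers every class, seen and unseen.
        db_idx = list(range(len(db_labels)))
    else:
        # The database is restricted to the same side as the queries.
        db_idx = [i for i, y in enumerate(db_labels)
                  if (y in seen_classes) == want_seen]
    return query_idx, db_idx
```

For example, with `seen_classes = {0, 1, 2}`, the call `build_protocol([0, 1, 2, 3], [0, 0, 1, 2, 3, 3], seen_classes, "unseen@all")` keeps only the query with label 3 and retains the whole database.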
The evaluation metric for all four tasks is mAP@k: the mean, over all queries, of the average precision computed on the top-k retrieved database items.
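As an illustration, one common way to compute mAP@k under Hamming ranking is sketched below. It assumes $\mathrm{AP@}k = \frac{1}{N_k}\sum_{i=1}^{k} P(i)\,\mathrm{rel}(i)$, where $P(i)$ is the precision within the top $i$ results, $\mathrm{rel}(i)$ is 1 if the $i$-th result is a ground-truth neighbor and 0 otherwise, and $N_k$ is the number of neighbors inside the top $k$. This normalization is an assumption; some codebases divide by the smaller of $k$ and the total number of neighbors instead. All function names are hypothetical.

```python
def hamming(a, b):
    """Hamming distance between two equal-length binary codes (sequences of 0/1)."""
    return sum(x != y for x, y in zip(a, b))


def average_precision_at_k(query_code, db_codes, relevant, k):
    """AP@k for a single query.

    relevant[i] is True when database item i is a ground-truth neighbor.
    Normalizes by the number of neighbors found in the top k (assumed convention).
    """
    # Rank database items by Hamming distance; sorted() is stable,
    # so ties keep their original database order.
    order = sorted(range(len(db_codes)),
                   key=lambda i: hamming(query_code, db_codes[i]))
    hits, precision_sum = 0, 0.0
    for rank, i in enumerate(order[:k], start=1):
        if relevant[i]:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision_at_k(query_codes, db_codes, relevance, k):
    """mAP@k: mean of AP@k over all queries; relevance[q][i] is a boolean."""
    aps = [average_precision_at_k(q, db_codes, rel, k)
           for q, rel in zip(query_codes, relevance)]
    return sum(aps) / len(aps)
```

The pure-Python loops keep the sketch dependency-free; a practical evaluation over a large database would vectorize the distance computation and ranking.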
To allow a fair comparison, we standardized the experimental configurations across datasets by consolidating the settings used by existing pairwise and pointwise hashing methods. Training proceeds in iterations, each consisting of a fixed number of epochs. For instance, 50 iterations with 3 epochs per iteration give a total of 150 epochs.
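The same arithmetic can be applied to every schedule listed in this section. The sketch below reads each "Epoch" value as epochs per iteration, following the definition above; the `Schedule` class and the grouping keys are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    """Training schedule: a number of iterations, each running several epochs."""
    iterations: int
    epochs_per_iteration: int

    @property
    def total_epochs(self) -> int:
        return self.iterations * self.epochs_per_iteration


# Values copied from the settings in this section; the keys are illustrative.
SCHEDULES = {
    "CIFAR-10 / CIFAR-100": Schedule(50, 3),
    "ImageNet-1K": Schedule(50, 1),
    "Flickr-25K / NUS-WIDE / MS COCO": Schedule(50, 3),
    "fine-grained datasets": Schedule(40, 30),
}

for name, schedule in SCHEDULES.items():
    print(f"{name}: {schedule.total_epochs} epochs in total")
```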
- CIFAR-10 and CIFAR-100: Randomly selected 2000 samples from the training set, with 50 iterations and 3 epochs. The evaluation metric is mAP@1000.
- ImageNet-1K: 130,000 images in 100 classes for training, and 5,000 for testing. Iteration: 50, Epoch: 1, Evaluation metric: mAP@1000.
We established 3 partition configurations for CIFAR-100 and ImageNet-1K:
- The first 95%/85%/75% of categories as seen classes.
- The last 5%/15%/25% of categories as unseen classes.

For CIFAR-10, we use a single configuration: the first 80% of categories as seen classes and the last 20% as unseen classes.
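The seen/unseen partition described here can be sketched as a simple ordered split. The example assumes that classes are taken in their dataset order and that the seen count is rounded to the nearest integer; both the rounding rule and the name `split_classes` are assumptions made for illustration.

```python
def split_classes(class_ids, seen_percent):
    """Split an ordered list of class ids into (seen, unseen).

    The first `seen_percent` percent of the classes are treated as seen
    and the remainder as unseen. Rounding to the nearest integer is an
    assumption of this sketch.
    """
    n_seen = round(len(class_ids) * seen_percent / 100)
    return class_ids[:n_seen], class_ids[n_seen:]


# CIFAR-100 style configurations: 95/5, 85/15 and 75/25.
cifar100_classes = list(range(100))
for percent in (95, 85, 75):
    seen, unseen = split_classes(cifar100_classes, percent)
    print(percent, len(seen), len(unseen))

# CIFAR-10 uses a single 80/20 configuration.
seen10, unseen10 = split_classes(list(range(10)), 80)
```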
- Flickr-25K: Sampled 1000 images as queries and 20,000 as database points. Iteration: 50, Epoch: 3, Evaluation metric: mAP@5000.
- NUS-WIDE: 21 most frequent categories used for evaluation. Iteration: 50, Epoch: 3, Evaluation metric: mAP@5000.
- MS COCO: Pruned images with no category information. 82,081 images from the training set as database points, and 5,000 images from the validation set as queries. Iteration: 50, Epoch: 3, Evaluation metric: mAP@5000.
For multi-label datasets, two configurations were implemented:
- Seen classes (~95% of dataset): 32 categories (Flickr-25K), 10 categories (NUS-WIDE), 45 categories (MS COCO).
- Seen classes (~85% of dataset): 23 categories (Flickr-25K), 5 categories (NUS-WIDE), 27 categories (MS COCO).
- Datasets: CUB200-2011, Food101, VegFru, Stanford Dogs, Aircraft, NABirds.
- Iteration: 40, Epoch: 30.
- Randomly selected 2,000 samples from the training set (4,000 for VegFru and NABirds).
We established 3 partition configurations for these datasets:
- The first 95%/85%/75% of categories as seen classes.
- The last 5%/15%/25% of categories as unseen classes.
For multi-label datasets, two images are treated as ground-truth neighbors (a similar pair) if they share at least one label.
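The neighbor rule for multi-label data can be sketched as follows, assuming each image's labels are given as a multi-hot vector of 0/1 entries with one position per category. The function names are hypothetical.

```python
def is_neighbor(labels_a, labels_b):
    """True if two multi-hot label vectors share at least one active label."""
    return any(a and b for a, b in zip(labels_a, labels_b))


def relevance_matrix(query_labels, db_labels):
    """relevance[q][i] is True when query q and database item i are neighbors."""
    return [[is_neighbor(q, d) for d in db_labels] for q in query_labels]


queries = [[1, 0, 1], [0, 1, 0]]
database = [[0, 0, 1], [0, 1, 1], [1, 0, 0]]
relevance = relevance_matrix(queries, database)
```

Here the first query shares a label with every database item, while the second query matches only the second item.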
Quantitative results in "seen@seen/seen@all" format for query images from seen classes on the CIFAR-100 and ImageNet-1K datasets (5% of categories held out as unseen classes).
Method | CIFAR-100 (mAP@1000) 16 bits | CIFAR-100 (mAP@1000) 32 bits | CIFAR-100 (mAP@1000) 64 bits | ImageNet-1K (mAP@1000) 16 bits | ImageNet-1K (mAP@1000) 32 bits | ImageNet-1K (mAP@1000) 64 bits |
---|---|---|---|---|---|---|
CSQ | 62.9/62.6 | 66.8/62.9 | 62.9/62.1 | 84.3/84.2 | 87.4/87.3 | 88.2/88.1 |
PSLDH | 59.3/58.9 | 67.7/67.0 | 70.5/69.5 | 79.4/79.3 | 81.6/81.5 | 81.0/80.8 |
OrthoHash | 61.7/61.2 | 68.8/68.0 | 72.2/71.2 | 85.9/85.7 | 88.5/88.4 | 89.9/89.8 |
HyP$^2$ Loss | —— | —— | —— | —— | —— | —— |
FISH | 50.5/49.4 | 50.3/48.0 | 58.3/54.2 | 73.7/73.6 | 77.3/77.1 | 78.4/78.2 |
MDSH | 57.3/56.9 | 67.5/66.7 | 69.1/68.1 | 73.2/73.1 | 76.8/76.7 | 79.7/79.6 |
DPAH | 50.7/50.4 | 57.0/56.5 | 60.5/59.7 | 78.5/78.4 | 81.8/81.6 | 82.4/82.2 |
DCDH | 53.4/52.9 | 61.9/61.1 | 60.5/59.6 | 78.1/77.7 | 82.3/82.2 | 84.6/84.4 |
CHN | 70.1/69.8 | 75.2/74.6 | 77.4/76.4 | 83.7/83.6 | 86.5/86.4 | 87.8/87.7 |
DSH | 13.7/13.6 | 24.8/24.4 | 30.9/30.4 | 65.7/65.4 | 79.0/78.8 | 82.9/82.7 |
HashNet | 16.1/16.0 | 26.9/26.6 | 38.4/37.7 | 41.9/41.8 | 73.8/73.7 | 83.5/83.5 |
ADSH | 32.8/31.8 | 58.0/56.7 | 72.2/70.4 | 81.7/81.5 | 87.3/87.1 | 88.4/88.2 |
ExchNet | 34.7/33.7 | 58.4/56.8 | 70.0/67.5 | 79.5/79.2 | 85.9/85.7 | 88.4/88.2 |
A$^2$-Net | 34.1/33.0 | 59.4/57.5 | 70.5/69.2 | 81.5/81.3 | 86.9/86.8 | 88.5/88.3 |
SEMICON | 27.0/25.0 | 53.9/51.7 | 70.8/68.9 | 75.9/75.6 | 83.2/83.0 | 84.9/84.7 |
AGMH | 50.2/48.6 | 70.1/67.9 | 76.7/74.6 | 85.3/85.1 | 85.6/85.4 | 83.3/83.0 |
DAHNet* | 50.7/49.4 | 71.8/69.8 | 78.9/77.1 | 83.3/83.1 | 86.4/86.2 | 86.8/86.6 |
DAHNet$^{--}$ | 38.3/37.3 | 65.5/63.8 | 77.2/75.0 | 81.8/81.6 | 87.2/87.0 | 87.3/87.2 |
Quantitative results in "seen@seen/seen@all" format for query images from seen classes on the Flickr-25K, NUS-WIDE, and MS COCO datasets (around 5% of images belong to unseen classes).
Method | Flickr-25K (mAP@5000) 16 bits | Flickr-25K (mAP@5000) 32 bits | Flickr-25K (mAP@5000) 64 bits | NUS-WIDE (mAP@5000) 16 bits | NUS-WIDE (mAP@5000) 32 bits | NUS-WIDE (mAP@5000) 64 bits | MS COCO (mAP@5000) 16 bits | MS COCO (mAP@5000) 32 bits | MS COCO (mAP@5000) 64 bits |
---|---|---|---|---|---|---|---|---|---|
CSQ | 88.1/88.9 | 90.3/90.4 | 91.2/90.8 | 89.3/89.8 | 90.2/90.5 | 90.5/90.9 | 63.7/78.4 | 70.6/82.7 | 71.7/83.5 |
PSLDH | 86.4/87.6 | 91.3/91.1 | 90.9/90.7 | 89.5/90.0 | 90.6/90.8 | 89.7/90.0 | 65.1/78.3 | 69.0/81.2 | 70.1/82.5 |
OrthoHash | 86.9/87.9 | 89.3/89.8 | 91.0/91.3 | 88.7/89.3 | 90.4/90.9 | 91.1/91.5 | 61.1/75.3 | 69.3/82.0 | 72.5/84.4 |
HyP$^2$ Loss | 87.2/88.7 | 89.8/90.8 | 90.5/90.9 | 88.4/89.0 | 89.4/89.6 | 89.9/90.0 | 61.6/77.5 | 68.3/81.3 | 71.5/83.4 |
FISH | —— | —— | —— | —— | —— | —— | —— | —— | —— |
MDSH | —— | —— | —— | —— | —— | —— | —— | —— | —— |
DPAH | 49.6/57.2 | 49.6/57.2 | 49.6/57.2 | 34.0/34.5 | 34.0/34.5 | 38.2/38.8 | 15.9/40.6 | 15.9/40.6 | 16.2/41.1 |
DCDH | 80.4/83.0 | 81.2/83.1 | 83.8/85.0 | 87.3/87.7 | 87.7/88.1 | 87.9/88.4 | 53.9/71.3 | 60.7/77.7 | 60.3/78.9 |
CHN | —— | —— | —— | —— | —— | —— | —— | —— | —— |
DSH | 76.6/80.4 | 81.3/84.3 | 77.8/81.3 | 82.1/82.9 | 79.3/80.2 | 80.5/81.2 | 48.2/69.3 | 55.1/73.8 | 52.3/71.2 |
HashNet | 87.9/89.3 | 91.0/91.6 | 92.1/92.9 | 83.3/83.9 | 88.6/89.1 | 90.9/91.4 | 51.9/72.2 | 61.7/78.6 | 67.2/82.5 |
ADSH | 91.7/85.6 | 92.3/85.7 | 91.1/80.5 | 94.9/84.1 | 95.5/78.3 | 95.8/72.2 | 73.6/75.9 | 80.8/77.2 | 81.8/74.8 |
ExchNet | 93.3/84.8 | 92.3/80.2 | 90.0/73.8 | 93.7/78.0 | 94.4/78.1 | 94.6/71.4 | 85.6/77.5 | 89.7/74.4 | 90.1/73.9 |
A$^2$-Net | 87.0/70.7 | 82.9/60.1 | 81.7/56.6 | 93.9/76.1 | 94.2/67.9 | 82.5/74.1 | 73.9/76.8 | 79.6/73.5 | 81.2/69.5 |
SEMICON | 84.4/72.4 | 85.7/74.7 | 86.8/75.2 | 91.8/82.3 | 93.4/76.8 | 91.8/66.9 | 70.3/72.9 | 75.7/70.0 | 74.2/66.9 |
AGMH | 81.1/56.0 | 82.5/60.2 | 84.6/65.2 | 82.7/40.1 | 86.1/48.4 | 85.0/81.2 | 69.0/67.4 | 61.9/36.6 | 64.1/47.8 |
DAHNet* | —— | —— | —— | —— | —— | —— | —— | —— | —— |
DAHNet$^{--}$ | 87.4/74.1 | 85.3/67.5 | 87.0/72.2 | 93.1/75.1 | 91.5/66.3 | 89.9/62.1 | 72.4/76.5 | 78.4/73.7 | 77.6/68.8 |
Quantitative results in "seen@seen/seen@all" format for query images from seen classes on the CUB200-2011, Food101, and VegFru datasets (5% of categories held out as unseen classes).
Method | CUB200-2011 (mAP@All) 12 bits | CUB200-2011 (mAP@All) 24 bits | CUB200-2011 (mAP@All) 32 bits | CUB200-2011 (mAP@All) 48 bits | Food101 (mAP@All) 12 bits | Food101 (mAP@All) 24 bits | Food101 (mAP@All) 32 bits | Food101 (mAP@All) 48 bits | VegFru (mAP@All) 12 bits | VegFru (mAP@All) 24 bits | VegFru (mAP@All) 32 bits | VegFru (mAP@All) 48 bits |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CSQ | 68.9/68.8 | 79.5/79.4 | 81.2/81.0 | 81.7/81.6 | 55.0/54.8 | 59.7/59.5 | 61.4/61.1 | 61.8/61.5 | 61.1/61.0 | 82.6/82.6 | 83.8/83.7 | 84.0/83.9 |
PSLDH | 73.6/73.6 | 77.4/77.3 | 77.4/77.4 | 78.2/78.1 | 48.0/47.8 | 55.2/55.0 | 56.9/56.6 | 61.4/60.9 | 72.6/72.5 | 79.4/79.3 | 80.2/80.1 | 81.5/81.4 |
OrthoHash | 73.2/73.1 | 81.0/80.9 | 81.8/81.6 | 82.5/82.4 | 55.5/55.2 | 65.1/64.6 | 66.8/66.4 | 69.3/68.7 | 78.2/78.1 | 84.0/83.9 | 84.9/84.8 | 85.8/85.7 |
HyP$^2$ Loss | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— |
FISH | 75.6/75.5 | 76.7/76.7 | 77.1/77.0 | 77.7/77.7 | 79.3/79.2 | 80.7/80.5 | 80.7/80.4 | 81.0/80.8 | 82.1/81.9 | 84.7/84.5 | 84.8/84.6 | 84.7/84.6 |
MDSH | 70.8/70.8 | 74.4/74.3 | 75.2/75.1 | 75.5/75.4 | 45.0/44.9 | 52.5/52.1 | 56.1/55.7 | 59.3/58.8 | 66.7/66.6 | 71.8/71.7 | 73.2/73.0 | 75.3/75.1 |
DPAH | 72.8/72.7 | 78.8/78.7 | 78.9/78.7 | 80.2/80.0 | 49.9/49.7 | 60.9/60.4 | 62.9/62.3 | 65.8/65.1 | 72.9/72.8 | 81.0/80.7 | 82.5/82.3 | 84.2/83.8 |
DCDH | 65.3/65.1 | 76.1/75.7 | 78.5/78.2 | 80.6/80.4 | 44.4/44.0 | 55.4/54.8 | 58.7/58.1 | 59.6/58.8 | 63.5/63.3 | 76.3/75.8 | 78.8/78.3 | 81.3/80.8 |
CHN | 75.1/75.0 | 80.0/79.8 | 81.2/81.0 | 81.5/81.4 | 80.1/79.9 | 81.2/81.0 | 81.3/81.1 | 81.7/81.5 | 82.6/82.4 | 84.2/84.0 | 84.9/84.7 | 85.4/85.2 |
DSH | 32.8/32.5 | 53.0/52.7 | 70.4/69.9 | 78.3/77.9 | 25.5/25.2 | 38.3/37.9 | 52.2/51.6 | 60.5/59.8 | 18.9/18.8 | 29.4/29.1 | 37.5/37.1 | 65.5/64.9 |
HashNet | 13.0/12.9 | 33.5/33.4 | 38.2/38.0 | 45.7/45.5 | 17.6/17.5 | 30.6/30.3 | 36.8/36.4 | 43.9/43.3 | 13.4/13.3 | 33.1/32.9 | 39.8/39.5 | 45.3/44.8 |
ADSH | 32.0/31.5 | 57.3/56.8 | 65.8/65.4 | 76.5/76.1 | 45.0/43.3 | 69.2/67.0 | 77.0/75.2 | 81.4/78.9 | 22.2/21.8 | 46.0/45.1 | 51.5/50.4 | 64.8/63.4 |
ExchNet | 30.5/29.7 | 58.8/57.6 | 68.4/67.5 | 76.8/76.1 | 45.9/43.5 | 70.4/67.9 | 76.6/74.1 | 81.6/78.7 | 25.4/24.5 | 48.0/46.4 | 55.7/54.1 | 66.6/65.0 |
A$^2$-Net | 31.9/31.5 | 62.5/61.9 | 69.5/68.9 | 80.6/80.2 | 44.8/43.0 | 71.2/69.2 | 78.0/76.0 | 81.8/78.9 | 24.5/24.1 | 46.6/45.6 | 57.1/55.8 | 67.6/66.5 |
SEMICON | 37.7/37.2 | 66.8/66.3 | 74.4/74.2 | 80.8/80.5 | 46.5/45.1 | 73.5/72.1 | 79.3/77.5 | 80.9/78.2 | 25.4/24.8 | 52.8/50.9 | 60.1/57.3 | 79.4/78.0 |
AGMH | 59.9/59.6 | 80.6/80.2 | 82.4/82.0 | 82.7/81.8 | 71.3/70.5 | 78.6/76.7 | 78.5/75.0 | 78.6/74.6 | 49.1/48.4 | 77.9/76.6 | 81.2/80.0 | 84.1/82.3 |
DAHNet* | 60.6/59.9 | 80.7/80.3 | 82.3/81.9 | 84.1/83.6 | 70.4/69.6 | 81.2/80.1 | 82.5/80.5 | 83.2/80.1 | 63.3/62.8 | 82.5/81.7 | 85.5/85.0 | 88.0/87.4 |
DAHNet$^{--}$ | 41.2/40.7 | 69.7/69.4 | 76.2/75.9 | 82.3/81.9 | 51.1/49.8 | 77.7/76.2 | 81.0/79.0 | 82.3/79.3 | 34.1/33.6 | 63.3/62.2 | 73.0/72.0 | 81.6/80.3 |
Quantitative results in "seen@seen/seen@all" format for query images from seen classes on the Stanford Dogs, NABirds, and Aircraft datasets (5% of categories held out as unseen classes).
Method | Stanford Dogs (mAP@All) 12 bits | Stanford Dogs (mAP@All) 24 bits | Stanford Dogs (mAP@All) 32 bits | Stanford Dogs (mAP@All) 48 bits | NABirds (mAP@All) 12 bits | NABirds (mAP@All) 24 bits | NABirds (mAP@All) 32 bits | NABirds (mAP@All) 48 bits | Aircraft (mAP@All) 12 bits | Aircraft (mAP@All) 24 bits | Aircraft (mAP@All) 32 bits | Aircraft (mAP@All) 48 bits |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CSQ | 73.3/73.2 | 80.2/80.1 | 81.5/81.4 | 81.8/81.7 | 52.2/52.1 | 74.2/74.1 | 76.1/76.1 | 77.4/77.3 | 73.8/73.7 | 79.3/79.2 | 81.1/81.0 | 80.7/80.7 |
PSLDH | 74.0/74.0 | 75.7/75.6 | 75.8/75.7 | 76.6/76.4 | 63.3/63.3 | 72.2/72.2 | 73.6/73.6 | 74.5/74.4 | 75.6/75.6 | 80.8/80.7 | 80.5/80.4 | 81.9/81.8 |
OrthoHash | 80.2/80.1 | 83.4/83.2 | 83.8/83.7 | 84.9/84.7 | 62.2/62.2 | 75.1/75.0 | 77.3/77.1 | 78.7/78.6 | 76.2/76.1 | 79.3/79.1 | 81.0/80.8 | 81.8/81.6 |
HyP$^2$ Loss | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— |
FISH | 76.0/75.9 | 76.4/76.4 | 76.9/76.8 | 77.1/77.0 | 61.7/61.5 | 74.7/74.0 | 75.2/74.5 | 75.9/75.0 | 84.4/84.4 | 84.5/84.5 | 83.8/83.8 | 84.1/84.0 |
MDSH | 68.0/67.9 | 71.4/71.3 | 72.2/72.0 | 73.0/72.8 | 46.8/46.7 | 64.3/64.1 | 66.9/66.7 | 68.2/67.9 | 77.1/77.0 | 80.6/80.5 | 79.6/79.5 | 81.2/81.1 |
DPAH | 72.8/72.7 | 78.2/78.0 | 79.2/78.9 | 80.0/79.5 | 41.1/41.0 | 74.0/73.8 | 74.2/74.1 | 77.0/76.7 | 79.5/79.4 | 83.5/83.4 | 86.0/85.9 | 86.9/86.7 |
DCDH | 68.4/68.0 | 76.2/75.8 | 79.1/78.7 | 80.4/80.0 | 51.6/51.5 | 67.5/67.2 | 69.8/69.5 | 72.3/72.0 | 70.7/70.3 | 79.8/79.6 | 82.8/82.5 | 84.9/84.6 |
CHN | 80.3/80.1 | 80.6/80.4 | 80.7/80.6 | 81.1/80.9 | 62.6/62.4 | 76.5/76.1 | 77.8/77.4 | 78.3/78.0 | 80.7/80.6 | 81.4/81.3 | 81.1/81.0 | 81.6/81.5 |
DSH | 49.8/49.5 | 71.4/70.8 | 80.9/80.3 | 82.7/82.1 | 8.6/8.5 | 12.3/12.2 | 15.0/14.8 | 20.9/20.6 | 57.6/57.3 | 81.1/80.5 | 84.9/84.6 | 84.6/84.2 |
HashNet | 38.9/38.8 | 59.9/59.7 | 65.5/65.3 | 68.8/68.5 | 3.0/3.0 | 10.9/10.9 | 16.9/16.7 | 20.9/20.6 | 12.1/12.1 | 25.0/24.9 | 31.1/31.0 | 36.3/36.0 |
ADSH | 73.0/72.6 | 84.1/83.5 | 85.7/84.8 | 87.4/86.1 | 8.3/8.2 | 20.2/19.8 | 24.2/23.7 | 30.3/29.7 | 42.6/42.1 | 65.3/64.2 | 74.5/73.6 | 79.1/77.5 |
ExchNet | 69.7/68.9 | 84.2/83.4 | 85.3/83.9 | 86.9/84.7 | 9.3/9.1 | 20.8/20.1 | 24.4/23.7 | 32.3/31.4 | 39.8/38.2 | 72.5/71.1 | 80.5/79.2 | 83.7/81.3 |
A$^2$-Net | 70.8/69.9 | 84.0/83.0 | 86.2/85.4 | 86.7/85.5 | 8.3/8.2 | 21.6/21.2 | 26.5/26.0 | 37.6/37.0 | 46.3/45.3 | 69.3/68.6 | 77.0/75.6 | 83.3/81.7 |
SEMICON | 64.7/63.8 | 81.2/80.4 | 84.6/84.0 | 84.5/83.2 | 9.3/9.1 | 23.0/22.4 | 29.4/28.9 | 40.0/39.2 | 52.7/52.2 | 77.9/77.0 | 78.1/76.7 | 84.4/82.5 |
AGMH | 80.3/80.1 | 82.5/81.1 | 82.0/80.0 | 80.4/77.6 | 12.2/12.0 | 25.9/25.0 | 36.5/35.2 | 49.2/47.9 | 75.6/75.3 | 82.9/81.8 | 82.4/80.2 | 84.5/81.7 |
DAHNet* | 76.8/76.4 | 84.5/83.8 | 85.1/84.1 | 85.1/83.3 | 20.1/19.9 | 38.6/37.8 | 48.0/46.8 | 58.0/56.9 | 80.8/80.6 | 84.1/83.4 | 85.5/84.6 | 87.2/85.3 |
DAHNet$^{--}$ | 74.9/74.4 | 86.7/86.2 | 86.9/86.4 | 85.7/84.1 | 10.0/9.8 | 24.0/23.5 | 30.8/30.2 | 41.9/41.0 | 56.8/56.4 | 77.4/76.6 | 83.8/83.0 | 86.6/85.0 |
Notes:
- Results marked in bold represent the optimal values under the same experimental conditions.
- DAHNet* denotes that the pairwise method also uses class labels for supervision.
- DAHNet$^{--}$ denotes a purely pairwise version of DAHNet.
Quantitative results in "unseen@unseen/unseen@all" format for query images from unseen classes on the CIFAR-100 and ImageNet-1K datasets (5% of categories held out as unseen classes).
Method | CIFAR-100 (mAP@1000) 16 bits | CIFAR-100 (mAP@1000) 32 bits | CIFAR-100 (mAP@1000) 64 bits | ImageNet-1K (mAP@1000) 16 bits | ImageNet-1K (mAP@1000) 32 bits | ImageNet-1K (mAP@1000) 64 bits |
---|---|---|---|---|---|---|
CSQ | 65.8/12.6 | 77.8/17.4 | 87.3/23.3 | 52.0/32.5 | 61.1/48.1 | 66.6/58.0 |
PSLDH | 61.6/9.7 | 71.7/13.5 | 82.7/17.9 | 43.5/21.3 | 57.8/36.1 | 75.7/50.4 |
OrthoHash | 61.7/10.5 | 76.3/16.7 | 84.1/21.7 | 47.5/28.7 | 62.3/43.5 | 73.1/59.2 |
HyP$^2$ Loss | —— | —— | —— | —— | —— | —— |
FISH | 60.8/43.2 | 73.6/52.9 | 81.1/58.4 | 41.8/16.2 | 43.7/19.5 | 50.9/23.1 |
MDSH | 63.4/10.4 | 71.7/14.3 | 82.2/17.8 | 41.0/15.6 | 46.4/28.6 | 54.3/33.4 |
DPAH | 78.2/10.5 | 85.2/12.2 | 84.0/15.3 | 41.7/20.0 | 46.5/24.1 | 50.7/29.1 |
DCDH | 59.3/10.1 | 77.9/16.0 | 85.5/20.8 | 45.5/12.9 | 59.3/31.6 | 77.7/54.4 |
CHN | 63.6/27.3 | 72.9/26.4 | 81.5/29.9 | 45.2/24.9 | 63.5/43.4 | 64.0/49.4 |
DSH | 67.2/7.5 | 71.5/13.1 | 78.8/13.8 | 55.5/26.0 | 61.5/37.8 | 66.6/38.9 |
HashNet | 87.8/4.5 | 89.7/10.1 | 92.9/18.6 | 57.4/34.7 | 68.3/45.1 | 79.3/53.5 |
ADSH | 79.7/39.4 | 84.0/58.4 | 87.5/45.5 | 52.8/18.8 | 66.4/43.6 | 62.4/35.1 |
ExchNet | 75.8/54.0 | 83.8/62.4 | 87.3/48.3 | 65.5/33.9 | 67.1/38.6 | 70.8/42.4 |
A$^2$-Net | 83.7/46.4 | 84.9/60.4 | 87.5/49.5 | 64.3/35.4 | 66.5/38.7 | 62.7/34.8 |
SEMICON | 79.2/47.3 | 84.8/57.0 | 84.6/43.3 | 54.6/23.1 | 57.0/25.7 | 65.5/36.1 |
AGMH | 85.5/56.1 | 84.5/59.1 | 81.4/42.2 | 47.0/27.5 | 58.8/31.6 | 62.5/36.2 |
DAHNet* | 85.1/59.3 | 84.3/58.8 | 85.7/44.0 | 47.6/25.8 | 54.8/30.0 | 66.0/40.3 |
DAHNet$^{--}$ | 82.9/51.8 | 87.8/63.9 | 84.9/45.7 | 47.0/21.3 | 52.4/26.9 | 67.4/43.9 |
Quantitative results in "unseen@unseen/unseen@all" format for query images from unseen classes on the Flickr-25K, NUS-WIDE, and MS COCO datasets (around 5% of images belong to unseen classes).
Method | Flickr-25K (mAP@5000) 16 bits | Flickr-25K (mAP@5000) 32 bits | Flickr-25K (mAP@5000) 64 bits | NUS-WIDE (mAP@5000) 16 bits | NUS-WIDE (mAP@5000) 32 bits | NUS-WIDE (mAP@5000) 64 bits | MS COCO (mAP@5000) 16 bits | MS COCO (mAP@5000) 32 bits | MS COCO (mAP@5000) 64 bits |
---|---|---|---|---|---|---|---|---|---|
CSQ | 41.9/27.3 | 42.8/30.1 | 42.4/28.8 | 44.4/23.2 | 46.4/24.6 | 47.5/25.3 | 58.1/59.0 | 60.7/61.8 | 60.4/62.7 |
PSLDH | 41.0/27.0 | 42.8/31.3 | 43.6/31.2 | 46.9/26.6 | 47.2/29.0 | 49.9/31.7 | 58.4/57.1 | 59.8/61.7 | 63.2/65.3 |
OrthoHash | 40.4/28.1 | 42.8/30.5 | 42.4/29.2 | 45.4/25.6 | 47.2/29.2 | 49.3/32.4 | 52.8/53.2 | 60.5/63.4 | 62.7/64.3 |
HyP$^2$ Loss | 41.4/28.5 | 43.8/31.6 | 42.8/30.0 | 45.6/25.2 | 46.1/27.6 | 47.2/30.0 | 52.3/52.0 | 56.6/57.8 | 62.1/63.5 |
FISH | —— | —— | —— | —— | —— | —— | —— | —— | —— |
MDSH | —— | —— | —— | —— | —— | —— | —— | —— | —— |
DPAH | 39.7/26.2 | 39.7/26.2 | 39.7/26.2 | 22.7/2.5 | 22.7/2.5 | 25.8/3.7 | 27.1/23.8 | 27.1/23.8 | 28.1/24.2 |
DCDH | 43.9/32.4 | 42.7/28.6 | 42.5/28.5 | 43.3/23.5 | 47.8/29.9 | 48.6/29.1 | 52.8/50.0 | 60.8/61.2 | 63.4/66.9 |
CHN | —— | —— | —— | —— | —— | —— | —— | —— | —— |
DSH | 39.6/25.8 | 40.9/25.9 | 40.7/25.1 | 40.0/22.3 | 39.9/20.6 | 43.0/23.3 | 45.1/43.6 | 56.9/55.0 | 45.8/45.5 |
HashNet | 39.8/22.8 | 39.6/22.3 | 44.1/30.2 | 44.9/24.5 | 46.7/29.2 | 49.0/30.9 | 46.5/42.1 | 57.1/58.6 | 64.3/64.5 |
ADSH | 40.4/27.1 | 41.7/27.4 | 41.0/28.4 | 42.9/39.3 | 43.4/38.4 | 42.7/38.9 | 56.0/54.5 | 58.6/59.3 | 59.0/58.4 |
ExchNet | 40.1/26.4 | 39.9/26.8 | 40.7/25.9 | 40.3/34.5 | 43.1/38.9 | 40.5/36.0 | 52.4/53.2 | 59.3/58.8 | 59.3/58.7 |
A$^2$-Net | 40.0/27.7 | 39.6/33.9 | 39.4/35.3 | 43.5/37.4 | 44.0/40.9 | 35.6/25.9 | 54.4/54.9 | 59.2/59.1 | 57.5/57.8 |
SEMICON | 39.8/29.3 | 41.1/29.1 | 41.2/28.8 | 43.4/38.8 | 40.1/35.2 | 35.8/32.0 | 49.3/48.2 | 54.6/55.1 | 54.5/54.7 |
AGMH | 39.7/35.5 | 39.7/34.5 | 39.8/31.1 | 25.5/24.4 | 28.3/25.9 | 34.4/26.3 | 45.0/44.3 | 41.1/40.9 | 42.2/41.8 |
DAHNet* | —— | —— | —— | —— | —— | —— | —— | —— | —— |
DAHNet$^{--}$ | 42.2/29.8 | 40.1/31.2 | 41.3/30.2 | 41.5/36.5 | 39.4/35.7 | 30.8/28.6 | 51.8/52.0 | 56.5/56.3 | 56.8/56.7 |
Quantitative results in "unseen@unseen/unseen@all" format for query images from unseen classes on the CUB200-2011, Food101, and VegFru datasets (5% of categories held out as unseen classes).
Method | CUB200-2011 (mAP@All) 12 bits | CUB200-2011 (mAP@All) 24 bits | CUB200-2011 (mAP@All) 32 bits | CUB200-2011 (mAP@All) 48 bits | Food101 (mAP@All) 12 bits | Food101 (mAP@All) 24 bits | Food101 (mAP@All) 32 bits | Food101 (mAP@All) 48 bits | VegFru (mAP@All) 12 bits | VegFru (mAP@All) 24 bits | VegFru (mAP@All) 32 bits | VegFru (mAP@All) 48 bits |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CSQ | 23.0/2.0 | 24.9/2.8 | 31.8/5.0 | 35.5/6.3 | 32.6/1.8 | 33.7/2.2 | 35.0/2.5 | 36.3/3.0 | 19.5/2.0 | 26.7/3.0 | 27.4/3.5 | 28.1/3.6 |
PSLDH | 25.8/2.2 | 20.0/1.9 | 28.0/3.3 | 24.9/3.0 | 29.8/1.5 | 33.5/2.2 | 34.2/2.6 | 36.8/2.9 | 21.9/2.0 | 23.5/2.4 | 21.4/2.2 | 23.2/2.5 |
OrthoHash | 25.6/3.3 | 31.0/4.9 | 31.4/5.7 | 37.1/6.8 | 32.7/1.8 | 35.3/2.6 | 37.1/2.8 | 38.8/3.2 | 20.6/2.0 | 24.3/2.8 | 25.7/3.2 | 25.1/3.2 |
HyP$^2$ Loss | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— | —— |
FISH | 21.8/2.0 | 20.2/1.8 | 22.0/2.1 | 25.1/3.7 | 29.7/1.6 | 31.6/2.3 | 29.9/2.0 | 32.9/2.7 | 18.8/1.9 | 20.3/2.5 | 20.9/2.5 | 19.7/2.1 |
MDSH | 23.3/2.2 | 21.0/1.8 | 21.2/2.0 | 24.3/2.7 | 29.4/1.5 | 33.4/2.4 | 34.0/2.5 | 37.9/3.1 | 18.3/1.5 | 19.5/2.0 | 23.9/2.7 | 23.5/2.6 |
DPAH | 23.4/1.9 | 27.2/3.0 | 29.2/4.3 | 35.1/7.4 | 34.4/1.9 | 41.5/3.2 | 45.1/3.5 | 46.9/4.2 | 24.1/2.2 | 29.8/3.0 | 33.7/4.0 | 34.2/4.9 |
DCDH | 23.1/2.3 | 32.8/7.2 | 34.1/7.5 | 37.4/7.5 | 32.1/1.8 | 32.6/2.4 | 35.3/2.5 | 44.7/4.0 | 19.9/1.8 | 24.4/3.5 | 26.5/4.1 | 31.7/5.4 |
CHN | 21.4/1.8 | 29.8/3.6 | 31.2/4.1 | 28.7/4.3 | 31.1/1.8 | 33.8/2.4 | 34.2/2.4 | 37.6/3.5 | 23.3/2.0 | 25.8/2.8 | 26.2/2.9 | 27.1/3.1 |
DSH | 40.6/5.0 | 41.5/7.9 | 49.4/12.9 | 49.1/14.3 | 39.3/2.0 | 38.8/2.4 | 44.4/3.6 | 43.7/3.7 | 35.6/2.8 | 43.8/5.5 | 42.2/5.9 | 47.5/8.9 |
HashNet | 29.5/2.0 | 45.0/7.2 | 43.3/6.4 | 51.6/10.5 | 46.0/2.2 | 53.9/3.5 | 54.7/4.1 | 55.8/4.9 | 33.5/1.2 | 46.0/3.2 | 50.7/5.3 | 55.8/7.9 |
ADSH | 36.7/8.8 | 45.0/12.7 | 46.9/13.3 | 47.1/14.9 | 39.0/5.1 | 40.7/6.3 | 41.5/6.0 | 40.4/7.1 | 36.2/4.0 | 42.6/8.0 | 51.7/13.0 | 50.2/16.7 |
ExchNet | 39.4/9.5 | 41.7/12.9 | 48.7/16.7 | 47.1/20.1 | 37.5/6.4 | 39.6/5.4 | 41.1/7.1 | 40.7/7.9 | 32.6/5.1 | 40.6/8.8 | 46.9/13.4 | 47.8/16.0 |
A$^2$-Net | 41.2/8.3 | 46.9/16.4 | 46.4/15.5 | 49.6/17.2 | 38.8/5.4 | 42.2/6.8 | 41.5/6.5 | 42.0/6.9 | 34.7/3.5 | 46.1/10.8 | 48.9/13.5 | 50.2/15.0 |
SEMICON | 40.7/8.4 | 42.8/11.0 | 48.8/15.9 | 51.7/18.9 | 35.9/3.5 | 40.3/5.2 | 41.8/5.9 | 37.1/5.2 | 35.6/4.7 | 41.9/10.6 | 43.8/14.2 | 48.8/14.2 |
AGMH | 30.7/5.0 | 42.1/11.1 | 43.0/10.5 | 38.8/9.3 | 31.6/2.6 | 32.9/4.1 | 34.4/7.3 | 37.0/7.1 | 36.8/3.9 | 42.0/10.0 | 39.9/11.8 | 38.9/10.8 |
DAHNet* | 33.1/6.5 | 39.9/9.0 | 43.3/12.2 | 47.5/16.6 | 29.6/2.1 | 34.4/4.1 | 42.5/7.6 | 39.4/7.1 | 19.2/1.7 | 29.7/6.1 | 31.3/6.0 | 35.4/7.8 |
DAHNet$^{--}$ | 34.3/7.0 | 42.6/9.1 | 48.2/12.5 | 49.5/16.8 | 34.4/3.2 | 41.2/5.9 | 37.8/6.0 | 40.7/7.4 | 39.8/5.1 | 41.5/8.4 | 49.0/11.7 | 45.8/11.4 |
Quantitative results in "unseen@unseen/unseen@all" format for query images from unseen classes on the Stanford Dogs, NABirds, and Aircraft datasets (5% of categories held out as unseen classes).
Method | Stanford Dogs (mAP@All) 12 bits | Stanford Dogs (mAP@All) 24 bits | Stanford Dogs (mAP@All) 32 bits | Stanford Dogs (mAP@All) 48 bits | NABirds (mAP@All) 12 bits | NABirds (mAP@All) 24 bits | NABirds (mAP@All) 32 bits | NABirds (mAP@All) 48 bits | Aircraft (mAP@All) 12 bits | Aircraft (mAP@All) 24 bits | Aircraft (mAP@All) 32 bits | Aircraft (mAP@All) 48 bits |
---|---|---|---|---|---|---|---|---|---|---|---|---|
CSQ | 34.9/3.4 | 35.1/4.4 | 38.6/5.5 | 37.9/6.2 | 14.1/1.4 | 19.1/2.4 | 18.7/2.7 | 21.0/3.4 | 38.3/3.8 | 47.3/8.3 | 48.9/10.5 | 55.4/9.0 |
PSLDH | 33.0/3.4 | 33.2/3.6 | 31.5/3.7 | 34.7/5.3 | 14.0/1.5 | 15.1/1.7 | 17.1/2.1 | 19.8/2.9 | 39.8/3.8 | 47.7/7.6 | 50.7/8.0 | 54.4/11.1 |
OrthoHash | 33.7/4.2 | 41.5/6.7 | 36.9/5.6 | 39.9/7.1 | 12.7/1.2 | 19.1/2.6 | 20.4/2.9 | 20.4/3.5 | 49.3/7.1 | 46.0/8.7 | 56.3/11.5 | 54.9/10.4 |
FISH | 29.0/3.2 | 33.0/3.5 | 33.0/4.0 | 32.0/4.5 | 14.4/1.7 | 16.2/3.0 | 15.1/2.7 | 16.0/3.4 | 44.3/6.8 | 45.7/7.4 | 46.1/8.2 | 48.5/8.8 |
MDSH | 28.6/2.4 | 30.5/3.2 | 33.4/4.2 | 32.3/3.8 | 10.3/0.8 | 14.6/2.0 | 14.8/2.1 | 14.3/2.6 | 42.8/4.7 | 49.7/6.2 | 50.0/7.4 | 55.0/9.7 |
DPAH | 38.3/3.9 | 40.2/4.9 | 40.2/5.8 | 44.2/8.3 | 14.0/1.0 | 23.4/3.3 | 25.3/3.6 | 28.6/5.4 | 55.8/7.4 | 59.9/10.0 | 61.0/11.0 | 62.3/14.7 |
DCDH | 29.4/3.0 | 35.3/4.8 | 37.3/6.7 | 44.4/8.6 | 10.9/0.9 | 16.4/2.4 | 19.1/3.5 | 24.9/5.5 | 47.8/6.4 | 59.5/14.6 | 62.6/19.1 | 55.9/16.1 |
CHN | 33.6/3.3 | 37.3/5.2 | 37.2/5.4 | 38.4/6.4 | 17.2/2.1 | 21.8/3.1 | 22.4/3.3 | 23.2/4.1 | 49.4/7.3 | 50.2/8.7 | 51.6/9.4 | 53.0/10.3 |
DSH | 53.6/9.3 | 50.2/11.7 | 50.8/13.4 | 51.8/13.8 | 20.8/1.3 | 25.6/3.3 | 25.6/3.4 | 30.9/6.2 | 61.5/10.5 | 63.2/12.7 | 63.3/16.1 | 61.4/15.4 |
HashNet | 53.9/5.1 | 55.9/10.6 | 57.7/16.8 | 58.3/18.0 | 14.4/0.8 | 24.4/2.1 | 29.1/3.4 | 33.4/4.5 | 51.0/2.4 | 55.6/7.7 | 61.9/10.5 | 63.8/11.5 |
ADSH | 46.7/8.2 | 51.7/17.9 | 56.0/20.9 | 49.8/17.0 | 17.2/2.0 | 25.2/5.0 | 27.8/5.9 | 30.6/8.5 | 52.9/11.6 | 62.8/16.6 | 61.5/14.3 | 61.4/17.4 |
ExchNet | 44.7/8.8 | 51.1/20.2 | 52.6/19.9 | 50.2/17.6 | 15.3/2.4 | 22.7/5.4 | 26.5/7.9 | 29.5/9.1 | 51.7/16.0 | 64.9/18.6 | 61.1/18.2 | 62.1/22.7 |
A$^2$-Net | 54.2/15.6 | 51.8/16.8 | 51.1/13.9 | 49.5/14.3 | 14.5/1.8 | 24.6/5.0 | 26.1/5.6 | 32.6/10.2 | 53.3/16.8 | 60.7/17.5 | 62.2/17.1 | 61.6/18.3 |
SEMICON | 42.1/10.0 | 50.2/14.4 | 50.1/13.5 | 47.7/15.2 | 19.0/3.0 | 25.2/6.5 | 25.9/6.0 | 32.4/11.1 | 58.1/12.2 | 65.1/13.4 | 58.0/14.7 | 60.0/12.0 |
AGMH | 44.0/6.6 | 39.2/10.1 | 40.7/10.9 | 34.9/11.1 | 20.0/2.8 | 23.4/8.9 | 23.8/7.1 | 29.7/10.4 | 58.7/12.0 | 57.3/16.8 | 57.0/14.8 | 57.9/17.6 |
DAHNet* | 35.0/5.3 | 46.6/10.8 | 42.6/9.4 | 40.9/12.0 | 14.8/1.6 | 19.0/5.4 | 24.4/9.0 | 28.3/11.0 | 57.6/8.3 | 62.5/16.5 | 59.3/15.1 | 64.6/18.8 |
DAHNet$^{--}$ | 44.8/11.3 | 45.6/11.6 | 49.4/12.0 | 44.1/13.4 | 18.2/3.0 | 26.3/5.8 | 28.3/7.9 | 30.8/11.0 | 57.2/10.9 | 60.6/16.9 | 61.9/18.8 | 60.3/19.7 |
Notes:
- DAHNet* denotes that the pairwise method also uses class labels for supervision.
- DAHNet$^{--}$ denotes a purely pairwise version of DAHNet.
For more results and our conclusions, please refer to our paper.