Omni-sourced Webly-supervised Learning for Video Recognition

Haodong Duan, Yue Zhao, Yuanjun Xiong, Wentao Liu, Dahua Lin

In ECCV, 2020. Paper

Model Zoo

Kinetics-400 Model Release

We have released 4 models trained with the OmniSource framework, covering both 2D and 3D architectures. The following table compares the performance of models trained with and without OmniSource.

Model	Modality	Pretrained	Backbone	Input	Resolution	Top-1 (Baseline / OmniSource (Delta))	Top-5 (Baseline / OmniSource (Delta))	Download
TSN	RGB	ImageNet	ResNet50	3seg	340x256	70.6 / 73.6 (+ 3.0)	89.4 / 91.0 (+ 1.6)	Baseline / OmniSource
TSN	RGB	IG-1B	ResNet50	3seg	short-side 320	73.1 / 75.7 (+ 2.6)	90.4 / 91.9 (+ 1.5)	Baseline / OmniSource
SlowOnly	RGB	Scratch	ResNet50	4x16	short-side 320	72.9 / 76.8 (+ 3.9)	90.9 / 92.5 (+ 1.6)	Baseline / OmniSource
SlowOnly	RGB	Scratch	ResNet101	8x8	short-side 320	76.5 / 80.4 (+ 3.9)	92.7 / 94.4 (+ 1.7)	Baseline / OmniSource
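
The Delta column is the OmniSource accuracy minus the baseline accuracy, in percentage points. As a quick sanity check, the sketch below recomputes it from the Top-1 numbers in the table above. The dictionary keys and the `delta` helper are illustrative names chosen for this example, not part of the released code.

```python
# Top-1 accuracy as (baseline, OmniSource), copied from the table above.
# The keys are ad-hoc labels chosen for this sketch.
top1 = {
    "TSN-R50-ImageNet": (70.6, 73.6),
    "TSN-R50-IG-1B": (73.1, 75.7),
    "SlowOnly-R50-4x16": (72.9, 76.8),
    "SlowOnly-R101-8x8": (76.5, 80.4),
}


def delta(baseline, omni):
    """Absolute gain in percentage points, rounded to one decimal."""
    return round(omni - baseline, 1)


deltas = {name: delta(*pair) for name, pair in top1.items()}
for name, gain in deltas.items():
    print(f"{name}: +{gain}")
```

The same helper can be applied to the Top-5 column by swapping in those values.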

Benchmark on Mini-Kinetics

We release a subset of the web dataset used in the OmniSource paper: the web data for the 200 classes of Mini-Kinetics. The statistics of these datasets are detailed in preparing_omnisource. To obtain the data, you need to fill in a data request form. After we receive your request, the download link will be sent to you. For more details on the released OmniSource web dataset, please refer to preparing_omnisource.

We benchmark the OmniSource framework on the released subset. Results are listed in the following tables (we report Top-1 and Top-5 accuracy on the Mini-Kinetics validation set). The benchmark can be used as a baseline for video recognition with web data.

TSN-8seg-ResNet50

Setting	Top-1	Top-5	ckpt	json	log
Baseline	77.4	93.6	ckpt	json	log
+GG-img	78.0	93.6	ckpt	json	log
+[GG-IG]-img	78.6	93.6	ckpt	json	log
+IG-vid	80.6	95.0	ckpt	json	log
+KRaw	78.6	93.2	ckpt	json	log
OmniSource	81.3	94.8	ckpt	json	log

SlowOnly-8x8-ResNet50

Setting	Top-1	Top-5	ckpt	json	log
Baseline	78.6	93.9	ckpt	json	log
+GG-img	80.8	95.0	ckpt	json	log
+[GG-IG]-img	81.3	95.2	ckpt	json	log
+IG-vid	82.4	95.6	ckpt	json	log
+KRaw	80.3	94.5	ckpt	json	log
OmniSource	82.9	95.8	ckpt	json	log
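
For readers unfamiliar with the Top-1 and Top-5 columns: Top-k accuracy is the percentage of validation samples whose ground-truth class is among the k highest-scoring predictions. The following is a minimal pure-Python sketch of that metric on made-up scores. The function name `top_k_accuracy` and the toy data are assumptions for illustration; how the released models aggregate per-clip scores before this step is not covered here.

```python
def top_k_accuracy(scores, labels, k):
    """Percentage of samples whose true label is among the k highest scores.

    scores: list of per-class score lists, one list per sample
    labels: list of ground-truth class indices
    """
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return 100.0 * hits / len(labels)


# Toy example: 3 samples, 4 classes (values are made up for illustration).
scores = [
    [0.10, 0.60, 0.20, 0.10],  # highest score: class 1
    [0.50, 0.10, 0.30, 0.10],  # highest: class 0, second: class 2
    [0.30, 0.15, 0.05, 0.50],  # highest: class 3, second: class 0
]
labels = [1, 2, 0]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
```

In this toy case only the first sample is correct at k=1, while all three are correct at k=2, which is why Top-5 is never lower than Top-1 in the tables.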

For comparison, we also list the benchmark from the original paper, which was run on Kinetics-400 (each cell reports Top-1 / Top-5 accuracy):

Model	Baseline	+GG-img	+[GG-IG]-img	+IG-vid	+KRaw	OmniSource
TSN-3seg-ResNet50	70.6 / 89.4	71.5 / 89.5	72.0 / 90.0	72.0 / 90.3	71.7 / 89.6	73.6 / 91.0
SlowOnly-4x16-ResNet50	73.8 / 90.9	74.5 / 91.4	75.2 / 91.6	75.2 / 91.7	74.5 / 91.1	76.6 / 92.5
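
Because each cell in the table above packs two numbers into one "Top-1 / Top-5" string, it can help to parse the rows before comparing settings. The sketch below does that for the two rows copied from the table and computes each setting's Top-1 gain over the Baseline column. `parse_row`, `table`, and `gains` are hypothetical names used only in this example.

```python
# Header and rows copied from the Kinetics-400 comparison table above.
header = "Model\tBaseline\t+GG-img\t+[GG-IG]-img\t+IG-vid\t+KRaw\tOmniSource"
rows = [
    "TSN-3seg-ResNet50\t70.6 / 89.4\t71.5 / 89.5\t72.0 / 90.0"
    "\t72.0 / 90.3\t71.7 / 89.6\t73.6 / 91.0",
    "SlowOnly-4x16-ResNet50\t73.8 / 90.9\t74.5 / 91.4\t75.2 / 91.6"
    "\t75.2 / 91.7\t74.5 / 91.1\t76.6 / 92.5",
]


def parse_row(line, settings):
    """Split one table row into (model, {setting: (top1, top5)})."""
    model, *cells = line.split("\t")
    pairs = [tuple(float(x) for x in cell.split("/")) for cell in cells]
    return model, dict(zip(settings, pairs))


settings = header.split("\t")[1:]
table = dict(parse_row(line, settings) for line in rows)

# Top-1 gain of each setting over Baseline, in percentage points.
gains = {
    model: {s: round(acc[s][0] - acc["Baseline"][0], 1) for s in settings[1:]}
    for model, acc in table.items()
}

for model, per_setting in gains.items():
    print(model, per_setting)
```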

Citing OmniSource

If you find OmniSource useful for your research, please consider citing the paper using the following BibTeX entry.

@article{duan2020omni,
  title={Omni-sourced Webly-supervised Learning for Video Recognition},
  author={Duan, Haodong and Zhao, Yue and Xiong, Yuanjun and Liu, Wentao and Lin, Dahua},
  journal={arXiv preprint arXiv:2003.13042},
  year={2020}
}
