«ZTransforms» is an image data augmentation code base
built on the pytorch/vision architecture, with albumentations added as a backend.
- input image format: numpy ndarray
- data type: uint8
- channel arrangement order: RGB
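As a minimal sketch of preparing an input that meets these requirements, the helper below converts an image stored in BGR channel order (the order `cv2.imread` returns) into an RGB `uint8` ndarray. The name `to_rgb_uint8` is hypothetical and not part of ZTransforms; the sketch assumes only NumPy.

```python
import numpy as np


def to_rgb_uint8(image_bgr: np.ndarray) -> np.ndarray:
    """Return an H x W x 3 RGB uint8 copy of a BGR image (hypothetical helper)."""
    if image_bgr.ndim != 3 or image_bgr.shape[2] != 3:
        raise ValueError(f"expected an H x W x 3 array, got shape {image_bgr.shape}")
    if image_bgr.dtype != np.uint8:
        raise TypeError(f"expected uint8, got {image_bgr.dtype}")
    # Reverse the channel axis (BGR -> RGB) and make the result contiguous.
    return np.ascontiguousarray(image_bgr[..., ::-1])


# Stand-in for an image read with cv2.imread: a 2 x 2 pure-blue BGR image.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255
rgb = to_rgb_uint8(bgr)
print(rgb.dtype, rgb.shape, rgb[0, 0].tolist())
```

Validating the shape and dtype up front keeps mismatched inputs from silently producing wrong colors further down the pipeline.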
Versions of critical dependencies:
- pytorch/vision: c1f85d34761d86db21b6b9323102390834267c9b
- albumentations-team/albumentations: v0.5.2
PyTorch provides an official data augmentation implementation: transforms. The module performs data augmentation based on PIL, and its advantages and disadvantages are as follows:
- Advantages:
  - Simple and clear data architecture
  - Easy-to-follow data processing flow
  - Thorough documentation
- Disadvantages:
  - Because it is based on the PIL backend, the set of image augmentation functions it provides is limited
  - Execution is slower than in other implementations
torchvision is also aware of this and has made improvements since v0.8.0.
Prior to v0.8.0, transforms in torchvision have traditionally been PIL-centric and presented multiple limitations due to that. Now, since v0.8.0, transforms implementations are Tensor and PIL compatible and we can achieve the following new features:
- transform multi-band torch tensor images (with more than 3-4 channels)
- torchscript transforms together with your model for deployment
- support for GPU acceleration
- batched transformation such as for videos
- read and decode data directly as torch tensor with torchscript support (for PNG and JPEG image formats)
- On the one hand, the Pillow-SIMD backend can be used to improve the execution speed of PIL;
- On the other hand, a PyTorch tensor backend has been added to enable GPU acceleration
Two other data augmentation libraries are available, both of which provide detection/segmentation augmentation in addition to classification augmentation:
- imgaug: implements a larger set of data augmentation operations;
- albumentations: selects the fastest augmentation function among different backends (pytorch/imgaug/opencv) (refer to its benchmarking results)
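The idea of picking the fastest implementation of an operation can be sketched as follows. This is not how albumentations runs its benchmarks; it is a minimal illustration, assuming only NumPy and the standard library, in which two hypothetical horizontal-flip candidates are timed with `timeit` and the quicker one is selected.

```python
import timeit
from typing import Callable, Dict

import numpy as np


def flip_slice(img: np.ndarray) -> np.ndarray:
    # Horizontal flip via negative-stride slicing, copied to a contiguous array.
    return np.ascontiguousarray(img[:, ::-1])


def flip_fliplr(img: np.ndarray) -> np.ndarray:
    # Horizontal flip via np.fliplr, copied to a contiguous array.
    return np.ascontiguousarray(np.fliplr(img))


def fastest(candidates: Dict[str, Callable], img: np.ndarray, number: int = 50) -> str:
    """Return the name of the candidate with the lowest best-of-3 runtime."""
    timings = {
        name: min(timeit.repeat(lambda fn=fn: fn(img), number=number, repeat=3))
        for name, fn in candidates.items()
    }
    return min(timings, key=timings.get)


# Random uint8 RGB image used as the benchmark input.
image = np.random.default_rng(0).integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
candidates = {"slice": flip_slice, "fliplr": flip_fliplr}
winner = fastest(candidates, image)
print("fastest candidate:", winner)
```

Which candidate wins depends on the machine and the image size, so the result is only meaningful for the environment it was measured in.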
Both libraries implement a pipeline-style mode of operation similar to transforms. However, I still prefer the official implementation and its usage. This code base is therefore built on transforms: an albumentations backend implementation is added to the original functions, and new data augmentation operations are added as well (if albumentations does not implement an operation, imgaug/opencv/... is used to implement it).
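One way to picture a torchvision-style transform backed by albumentations is sketched below. It illustrates the general pattern only and is not the actual ZTransforms code: the `BackendTransform` and `Compose` classes here are hypothetical. The one convention borrowed from albumentations is that a transform is called with the keyword argument `image=` and returns a dict with an `"image"` key. To keep the sketch runnable without albumentations installed, a small NumPy stand-in plays the role of the backend.

```python
from typing import Callable, Dict, Sequence

import numpy as np


class BackendTransform:
    """Adapt an albumentations-style callable to the torchvision calling style.

    Hypothetical adapter: `backend(image=img)` must return {"image": ndarray}.
    """

    def __init__(self, backend: Callable[..., Dict[str, np.ndarray]]):
        self.backend = backend

    def __call__(self, img: np.ndarray) -> np.ndarray:
        return self.backend(image=img)["image"]


class Compose:
    """Apply transforms in order, in the style of torchvision.transforms.Compose."""

    def __init__(self, transforms: Sequence[Callable[[np.ndarray], np.ndarray]]):
        self.transforms = list(transforms)

    def __call__(self, img: np.ndarray) -> np.ndarray:
        for t in self.transforms:
            img = t(img)
        return img


def fake_hflip(image: np.ndarray) -> Dict[str, np.ndarray]:
    # Stand-in for an albumentations transform such as A.HorizontalFlip(p=1.0).
    return {"image": np.ascontiguousarray(image[:, ::-1])}


pipeline = Compose([BackendTransform(fake_hflip)])
img = np.arange(2 * 3 * 3, dtype=np.uint8).reshape(2, 3, 3)
out = pipeline(img)
print(out.shape, out.dtype)
```

With albumentations installed, `BackendTransform(A.HorizontalFlip(p=1.0))` (after `import albumentations as A`) should slot into the same pipeline, since albumentations transforms follow the `image=` calling convention assumed here.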
$ pip install ztransforms
# import torchvision.transforms as transforms
import ztransforms.cls as transforms
...
...
- zhujian - Initial work - zjykzj
@Article{info11020125,
AUTHOR = {Buslaev, Alexander and Iglovikov, Vladimir I. and Khvedchenya, Eugene and Parinov, Alex and Druzhinin, Mikhail and Kalinin, Alexandr A.},
TITLE = {Albumentations: Fast and Flexible Image Augmentations},
JOURNAL = {Information},
VOLUME = {11},
YEAR = {2020},
NUMBER = {2},
ARTICLE-NUMBER = {125},
URL = {https://www.mdpi.com/2078-2489/11/2/125},
ISSN = {2078-2489},
DOI = {10.3390/info11020125}
}
@misc{imgaug,
author = {Jung, Alexander B.
and Wada, Kentaro
and Crall, Jon
and Tanaka, Satoshi
and Graving, Jake
and Reinders, Christoph
and Yadav, Sarthak
and Banerjee, Joy
and Vecsei, Gábor
and Kraft, Adam
and Rui, Zheng
and Borovec, Jirka
and Vallentin, Christian
and Zhydenko, Semen
and Pfeiffer, Kilian
and Cook, Ben
and Fernández, Ismael
and De Rainville, François-Michel
and Weng, Chi-Hung
and Ayala-Acevedo, Abner
and Meudec, Raphael
and Laporte, Matias
and others},
title = {{imgaug}},
howpublished = {\url{https://github.com/aleju/imgaug}},
year = {2020},
note = {Online; accessed 01-Feb-2020}
}
Anyone's participation is welcome! Open an issue or submit PRs.
Small note:
- Git commit messages should comply with the Conventional Commits specification
- If versioned, please conform to the Semantic Versioning 2.0.0 specification
- If editing the README, please conform to the standard-readme specification.
Apache License 2.0 © 2021 zjykzj