Merge the preprocess step into the easydeploy onnx/tensorrt model #742

Open
wants to merge 5 commits into
base: dev

chenxinfeng4

Thanks for your contribution; we appreciate it a lot. The following instructions will make your pull request healthier and help it get feedback more easily. If you do not understand some items, don't worry: just make the pull request and seek help from the maintainers.

Motivation

Embed the preprocessing operation, (image_rgb - mean) / std, into the onnx/tensorrt model, so that 1. the preprocessing does not have to be repeated in every inference task, and 2. image_rgb can be fed directly into the onnx/tensorrt model.

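Merge the RGB normalization step into the ONNX model as well. 1. Downstream image and video processing tasks no longer need to repeat the preprocessing, and the exported model no longer depends on the mean and std parameters in the config file. This reduces the computational load of data preprocessing. 2. The uint8 image RGB data (NCHW) can be passed straight to the ONNX model, which makes deployment more convenient. Accuracy is not affected.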
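As a small worked example of the normalization being embedded, the NumPy sketch below applies (image_rgb - mean) / std to a uint8 NCHW batch. The mean and std values and the tiny 2x2 image are illustrative assumptions, not values taken from this PR or its config.

```python
import numpy as np

# Hypothetical per-channel statistics, chosen only for illustration.
mean = np.array([123.675, 116.28, 103.53], dtype=np.float32).reshape(1, 3, 1, 1)
std = np.array([58.395, 57.12, 57.375], dtype=np.float32).reshape(1, 3, 1, 1)

# A fake uint8 batch in NCHW layout: one 2x2 RGB image, every pixel set to 128.
image_rgb = np.full((1, 3, 2, 2), 128, dtype=np.uint8)

# Cast to float first, then normalize. The (1, 3, 1, 1) shape broadcasts
# each channel statistic over the batch and spatial dimensions.
normalized = (image_rgb.astype(np.float32) - mean) / std

# Without the parentheses only `mean` is divided by `std`,
# which yields a different quantity.
unparenthesized = image_rgb.astype(np.float32) - mean / std
```

The parentheses matter: `image_rgb - mean / std` divides only the mean by std, so the two arrays above differ.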

Modification

Wrap the model in a PreProcess(torch.nn.Module) class and export that wrapper as the deployed model.

Main change.

# projects/easydeploy/tools/export_onnx.py
import torch
from mmengine.config import Config
from torch.nn import Parameter


def preprocess(config: Config, model: torch.nn.Module):
    """Wrap `model` so that input normalization runs inside the exported graph."""
    data_preprocess = config.get('model', {}).get('data_preprocessor', {})
    mean = data_preprocess.get('mean', [0., 0., 0.])
    std = data_preprocess.get('std', [1., 1., 1.])
    # Shape (1, 3, 1, 1) broadcasts over an NCHW batch.
    mean_value = torch.tensor(mean, dtype=torch.float32).reshape(1, 3, 1, 1)
    std_value = torch.tensor(std, dtype=torch.float32).reshape(1, 3, 1, 1)

    class PreProcess(torch.nn.Module):
        def __init__(self):
            super().__init__()
            # Frozen parameters, so they are exported as constants of the model.
            self.mean_value = Parameter(mean_value, requires_grad=False)
            self.std_value = Parameter(std_value, requires_grad=False)
            self.core_model = model

        def forward(self, x: torch.Tensor):
            assert x.ndim == 4  # expects NCHW input
            x = x.float()  # accept uint8 pixels
            y = (x - self.mean_value) / self.std_value
            y = self.core_model(y)
            return y

    return PreProcess().eval()

...

    # embed the preprocess into the model
    cfg = Config.fromfile(args.config)
    deploy_model = preprocess(cfg, deploy_model)
    deploy_model.eval()
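The following is a minimal sketch, not the PR's code, of how one might check that a wrapper of this kind gives the same output as normalizing outside the model. `WithPreProcess`, the toy `Conv2d` standing in for the detector, and the mean/std values are all hypothetical; the sketch also stores the statistics with `register_buffer` rather than the frozen `Parameter` used above.

```python
import torch
from torch import nn


class WithPreProcess(nn.Module):
    """Hypothetical wrapper: normalize inside forward, then call the core model."""

    def __init__(self, core_model, mean, std):
        super().__init__()
        # Buffers are saved with the module and follow .to(device), but are not trained.
        self.register_buffer(
            'mean_value', torch.tensor(mean, dtype=torch.float32).reshape(1, 3, 1, 1))
        self.register_buffer(
            'std_value', torch.tensor(std, dtype=torch.float32).reshape(1, 3, 1, 1))
        self.core_model = core_model

    def forward(self, x):
        return self.core_model((x.float() - self.mean_value) / self.std_value)


torch.manual_seed(0)
core = nn.Conv2d(3, 4, kernel_size=3, padding=1).eval()  # toy stand-in for the detector
mean, std = [0., 0., 0.], [255., 255., 255.]  # illustrative values only
wrapped = WithPreProcess(core, mean, std).eval()

image_u8 = torch.randint(0, 256, (1, 3, 8, 8), dtype=torch.uint8)

with torch.no_grad():
    # Old flow: normalize outside the model, then run it.
    mean_t = torch.tensor(mean).reshape(1, 3, 1, 1)
    std_t = torch.tensor(std).reshape(1, 3, 1, 1)
    external = core((image_u8.float() - mean_t) / std_t)
    # New flow: pass raw uint8 pixels straight in.
    embedded = wrapped(image_u8)

max_abs_diff = (external - embedded).abs().max().item()
```

Because both flows perform the same float32 operations, `max_abs_diff` should be zero or within floating-point noise, which is the sense in which embedding the step does not affect accuracy.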

Feed the image_rgb data directly into the model

# /openmmlab/mmyolo/projects/easydeploy/tools/image-demo.py
data = data_rgb_CHW[None].type(torch.uint8).to(args.device)
result = model(data)
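For readers unsure how `data_rgb_CHW` might be produced, here is a hedged NumPy sketch of the layout conversion. It assumes the image arrives as a height x width x 3 uint8 array in BGR order (as a decoder such as OpenCV would return) and omits the resizing to the exported input size; whether the channel swap is needed depends on the config.

```python
import numpy as np

# Stand-in for a decoded image: height x width x 3, uint8, channels in BGR order.
# Real code would read a file instead of generating values.
height, width = 4, 6
image_bgr_hwc = np.arange(height * width * 3, dtype=np.uint8).reshape(height, width, 3)

image_rgb_hwc = image_bgr_hwc[..., ::-1]          # BGR -> RGB
data_rgb_chw = image_rgb_hwc.transpose(2, 0, 1)   # HWC -> CHW
batch = np.ascontiguousarray(data_rgb_chw[None])  # add batch axis -> NCHW
```

The resulting `batch` keeps the uint8 dtype, so no float conversion or normalization happens before the model call.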

The test commands are below. The command-line interface is unchanged.

python projects/easydeploy/tools/export_onnx.py \
    configs/yolov5/yolov5_s-v61_syncbn_fast_8xb16-300e_coco.py \
    data/checkpoint/yolov5_s-v61_syncbn_fast_8xb16-300e_coco_20220918_084700-86e02187.pth \
    --work-dir work_dirs/yolov5_s-v61_syncbn_fast_8xb16-300e_coco \
    --img-size 640 640 \
    --batch 1 \
    --device cpu \
    --simplify \
    --opset 11 \
    --pre-topk 1000 \
    --keep-topk 100 \
    --iou-threshold 0.65 \
    --score-threshold 0.25

python projects/easydeploy/tools/image-demo.py \
    demo/dog.jpg \
    configs/yolov5/yolov5_s-v61_syncbn_fast_8xb16-300e_coco.py \
    work_dirs/yolov5_s-v61_syncbn_fast_8xb16-300e_coco/yolov5_s-v61_syncbn_fast_8xb16-300e_coco_20220918_084700-86e02187.onnx \
    --device cpu

Inspect the ONNX model. The graph input now accepts RGB uint8 data.

$ polygraphy inspect model work_dirs/yolov5_s-v61_syncbn_fast_8xb16-300e_coco/yolov5_s-v61_syncbn_fast_8xb16-300e_coco_20220918_084700-86e02187.onnx

[W] 'colored' module is not installed, will not use colors when logging. To enable colors, please install the 'colored' module: python3 -m pip install colored
[I] Loading model: /openmmlab/mmyolo/work_dirs/yolov5_s-v61_syncbn_fast_8xb16-300e_coco/yolov5_s-v61_syncbn_fast_8xb16-300e_coco_20220918_084700-86e02187.onnx
[I] ==== ONNX Model ====
    Name: torch_jit | Opset: 11
    
    ---- 1 Graph Input(s) ----
    {images [dtype=uint8, shape=(1, 3, 640, 640)]}
    
    ---- 4 Graph Output(s) ----
    {num_dets [dtype=int64, shape=('ReduceSumnum_dets_dim_0', 1)],
     boxes [dtype=float32, shape=('ReduceSumnum_dets_dim_0', 'Splitboxes_dim_1', 'Splitboxes_dim_2')],
     scores [dtype=float32, shape=('ReduceSumnum_dets_dim_0', 'Splitboxes_dim_1')],
     labels [dtype=int32, shape=('Castlabels_dim_0', 'Castlabels_dim_1')]}
    
    ---- 162 Initializer(s) ----
    
    ---- 345 Node(s) ----

[Screenshot: inspecting the ONNX model]

[Screenshot: the result compared with the original version]

Checklist

  1. Pre-commit or other linting tools are used to fix potential lint issues.
  2. The modification is covered by complete unit tests. If not, please add more unit tests to ensure the correctness.
  3. If the modification has a potential influence on downstream projects, this PR should be tested with downstream projects, like MMDetection or MMClassification.
  4. The documentation has been modified accordingly, like docstring or example tutorials.

@CLAassistant

CLAassistant commented Apr 20, 2023

CLA assistant check
All committers have signed the CLA.

@hhaAndroid hhaAndroid requested a review from triple-Mu April 27, 2023 06:48
@triple-Mu
Collaborator

TensorRT does not support uint8 input.
Ncnn has its own preprocessing function.
Most other chip platforms provide preprocessing operators.
All in all, I don't recommend adding this feature; currently it only works in onnxruntime.

@chenxinfeng4
Author

> TensorRT does not support uint8 input.

Thanks for pointing that out. I'll change the input from uint8 to float32.

> Ncnn has its own preprocessing function.

A deployed model should freeze not only the structure but also all the WEIGHTS/PARAMETERS into one file. In the previous mmyolo deployment, however, we have to read the preprocessing WEIGHTS again from the config file at inference time. That is neither a good approach nor friendly for deployment. The deployed model should be decoupled from the config file as much as possible.

@chenxinfeng4
Author

chenxinfeng4 commented Apr 28, 2023

I saw that the input normalization was previously embedded in build_test_pipeline. I think that was mmdet2.0. In the new mmdet3.0 and mmyolo, however, the input normalization is decoupled from build_test_pipeline and lives in a separate pre_process CLASS. I think some of you might agree to go one step further.

@chenxinfeng4
Author

Any more comments?

3 participants