
Conversion to TRT failure of TensorRT 8.6.1.6 when converting CO-DETR model on GPU RTX 4090 #4280

Open
edwardnguyen1705 opened this issue Dec 12, 2024 · 2 comments
Assignees
Labels
Engine Build · triaged

Comments


edwardnguyen1705 commented Dec 12, 2024

Description

I tried to convert the CO-DETR model to TensorRT, but the engine build fails with the error below:

[12/12/2024-02:17:38] [E] Error[10]: Could not find any implementation for node {ForeignNode[/0/Cast_3.../0/backbone/Reshape_3 + /0/backbone/Transpose_3]}.

Environment

TensorRT Version: 8.6.1.6

NVIDIA GPU: NVIDIA GeForce RTX 4090

NVIDIA Driver Version: 555.42.06

CUDA Version: 12.0

CUDNN Version:

Operating System: Ubuntu 22.04.3 LTS

Python Version (if applicable):

Tensorflow Version (if applicable):

PyTorch Version (if applicable):

Baremetal or Container (if so, version):

Relevant Files

Model link: https://drive.google.com/file/d/1voa7liji1OJxDQ8tphnbnm6EM-MexI_v/view?usp=drive_link

Steps To Reproduce

  • Env preparation: Build docker image following TensorRT-Docker-Image
  • Docker run and go to docker container
  • PyTorch to ONNX: follow DeepStream-Yolo
  • ONNX to TRT: trtexec --onnx=co_dino_5scale_swin_large_16e_o365tococo_h1280w1280.onnx --saveEngine=co_dino_5scale_swin_large_16e_o365tococo_h1280w1280.engine --explicitBatch --minShapes=input:1x3x1280x1280 --optShapes=input:2x3x1280x1280 --maxShapes=input:4x3x1280x1280 --fp16 --memPoolSize=workspace:10000 --tacticSources=-cublasLt,+cublas --sparsity=enable --verbose

Commands or scripts:

Have you tried the latest release?: Not yet.

Can this model run on other frameworks? For example run ONNX model with ONNXRuntime (polygraphy run <model.onnx> --onnxrt): If I follow the steps described in DeepStream-Yolo, the generated engine file works, but inference is slow. That is why I would like to use trtexec.
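As a sanity check before the TensorRT build, the ONNX graph itself can be exercised under ONNX Runtime with Polygraphy, as the issue template suggests. A sketch, assuming polygraphy is installed in the container and the ONNX file name matches the one in the trtexec command above:

```shell
# Run the ONNX model once under ONNX Runtime to confirm the graph is valid:
polygraphy run co_dino_5scale_swin_large_16e_o365tococo_h1280w1280.onnx --onnxrt

# Optionally compare ONNX Runtime against TensorRT outputs in one step:
polygraphy run co_dino_5scale_swin_large_16e_o365tococo_h1280w1280.onnx --trt --onnxrt
```

If the first command succeeds but the TensorRT build still fails, the problem is isolated to the engine-building step rather than the exported graph.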

@lix19937

Can you run a test with --memPoolSize=workspace:10000 removed? And then also reduce the size of the dynamic shapes?
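A sketch of that retest: the same conversion command with the memory-pool flag dropped and the dynamic-shape range narrowed to a single batch size (fixing the batch at 1 is an illustrative assumption, not a value from the thread):

```shell
# Retest: no --memPoolSize, and a narrower dynamic-shape range (batch fixed at 1).
trtexec --onnx=co_dino_5scale_swin_large_16e_o365tococo_h1280w1280.onnx \
        --saveEngine=co_dino_5scale_swin_large_16e_o365tococo_h1280w1280.engine \
        --minShapes=input:1x3x1280x1280 \
        --optShapes=input:1x3x1280x1280 \
        --maxShapes=input:1x3x1280x1280 \
        --fp16 --verbose
```

Narrowing min/opt/max to one shape reduces the tactic search space, which helps distinguish a workspace-size problem from a genuinely unsupported node.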

@asfiyab-nvidia asfiyab-nvidia added the Engine Build and triaged labels Dec 18, 2024
@asfiyab-nvidia
Collaborator

@edwardnguyen1705 can you update the bug with the log produced after applying @lix19937's recommendations?
Can you also share the ONNX model so we can test it ourselves? Thanks

@asfiyab-nvidia asfiyab-nvidia self-assigned this Dec 18, 2024