
How to fuse a QuantizeLinear node with my custom op when converting ONNX to a TensorRT engine #4270

Open
AnnaTrainingG opened this issue Dec 5, 2024 · 1 comment
Labels: ONNX (Issues relating to ONNX usage and import), triaged (Issue has been triaged by maintainers)

Comments

AnnaTrainingG commented Dec 5, 2024

How can I fuse the QuantizeLinear node with my custom op when converting ONNX to a TensorRT engine?
I see that QuantizeLinear is launched as its own kernel:
[image: trace showing the QuantizeLinear kernel call]
It takes a long time. How can I fuse it into my custom kernel that runs right before the QuantizeLinear node? And what exactly does QuantizeLinear compute?
The fusion I want:
my_custom_op - Q -- DQ --- Conv  =>  my_custom_op_with_Q --- (DQ_Conv)
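
(For reference, the ONNX operator spec defines QuantizeLinear as y = saturate(round(x / y_scale) + y_zero_point), with round-half-to-even and saturation to the output type's range, [-128, 127] for int8. A minimal CUDA sketch of the int8 case:

```cuda
#include <cstdint>

// ONNX QuantizeLinear, int8 case:
//   y = saturate(round(x / y_scale) + y_zero_point)
__device__ __forceinline__ int8_t quantize_linear(float x, float y_scale,
                                                  int32_t y_zero_point) {
    // __float2int_rn rounds to nearest-even, matching ONNX rounding.
    int32_t q = __float2int_rn(x / y_scale) + y_zero_point;
    return static_cast<int8_t>(max(-128, min(127, q)));
}
```
)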

@lix19937

First, build with --best to check whether your model's fusion behavior matches your expected goal.
Then, fuse the scale into your plugin.
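
To make "fuse the scale into your plugin" concrete, here is a minimal CUDA sketch under stated assumptions: my_custom_op_elem and all other names are hypothetical stand-ins for whatever the plugin actually computes. The quantize step is applied as an epilogue, so the result is written once as int8 instead of a float output plus a separate QuantizeLinear kernel pass:

```cuda
#include <cstdint>

// Hypothetical placeholder for the plugin's real per-element math.
__device__ __forceinline__ float my_custom_op_elem(float x) {
    return x;
}

// my_custom_op_with_Q: the custom op with the QuantizeLinear epilogue fused in.
__global__ void my_custom_op_with_q(const float* in, int8_t* out, int n,
                                    float y_scale, int32_t y_zero_point) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    float v = my_custom_op_elem(in[i]);
    // Same saturate(round(v / scale) + zero_point) as the standalone Q node.
    int32_t q = __float2int_rn(v / y_scale) + y_zero_point;
    out[i] = static_cast<int8_t>(max(-128, min(127, q)));
}
```

After fusing, the standalone Q node would be removed from the ONNX graph (e.g. with onnx-graphsurgeon) and the plugin would declare an int8 output, leaving the DQ + Conv pair to TensorRT's built-in fusion.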

asfiyab-nvidia added the triaged and ONNX labels Dec 16, 2024