Plugin inference and loading from onnx #4266
cc @venkywonka
Thank you for taking the time to create this @idantene,

You are right that the `OnnxParser` extracts the `plugin_namespace` and `plugin_version` attributes from the ONNX node. If there exists no such attribute, the defaults are assumed (the `""` namespace and version `"1"`). If multiple versions of a plugin (and its corresponding creator) are present, the ONNX node is expected to have the right `plugin_version` attribute to disambiguate.
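For concreteness, here is a minimal sketch of how such a node can be authored so the fallback importer can resolve the right creator. The plugin name `MyPlugin`, version `"2"`, and the tensor shapes are illustrative assumptions; the attribute names `plugin_version` and `plugin_namespace` are the ones discussed above:

```python
# Sketch: author an ONNX node whose op_type matches a (hypothetical)
# registered plugin, carrying the attributes the parser reads to pick
# the matching creator.
import onnx
from onnx import helper, TensorProto

node = helper.make_node(
    "MyPlugin",                  # must match the registered plugin name
    inputs=["x"], outputs=["y"],
    plugin_version="2",          # disambiguates between registered versions
    plugin_namespace="",         # "" is the default namespace
)
graph = helper.make_graph(
    [node], "plugin_graph",
    [helper.make_tensor_value_info("x", TensorProto.FLOAT, [1, 16])],
    [helper.make_tensor_value_info("y", TensorProto.FLOAT, [1, 16])],
)
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 17)])
onnx.save(model, "plugin_model.onnx")
```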
It was an intended design choice to prefer the native implementation if the plugin names match. One would have to use a different name that doesn't match an ONNX op, as the plugin registry is unfortunately only checked as a fallback.
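A minimal sketch of that rename workaround, assuming the plugin has been registered (or recompiled) under a non-colliding name; `TRTLLM_Gemm` is a hypothetical name here and must match whatever name the creator is actually registered under:

```python
# Sketch: rewrite a colliding op_type so it no longer matches a native
# ONNX op, letting the parser fall through to the plugin registry.
import onnx

model = onnx.load("model.onnx")          # illustrative input model
for node in model.graph.node:
    if node.op_type == "Gemm":
        node.op_type = "TRTLLM_Gemm"     # assumed registered plugin name
onnx.save(model, "model_renamed.onnx")
```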
Thanks for clarifying! As mentioned, I'm trying to use TRT-LLM's Gemm plugin, but because of this constraint I cannot, as Gemm matches an ONNX op.

EDIT: Alternatively (and/or additionally), I think the documentation would also benefit from this clarification, as would the TRT-LLM people, so that they would not name their plugins in a way that conflicts with this design decision...
Thank you @idantene, yes we are working on it! Unfortunately, for now there aren't any flags in the OnnxParser that could arbitrate this.
Thanks!
Thank you for these suggestions! If I understand correctly, (2) would then require recompiling the TRTLLM plugin, so I'm a bit more naturally drawn to (1).
Digging through, this does seem doable inside the parser's source:

```cpp
if (opImporters.count(nodeType))
{
    // inject your override here:
    // start
    if (isNodeInPluginRegistry(ctx, node))
    {
        LOG_INFO("Found registered plugin: " << nodeType << ". Importing Native Op as a plugin.");
        importFunc = &opImporters.at("FallbackPluginImporter");
    }
    else
        // end
        importFunc = &opImporters.at(nodeType);
}
```

We're internally tracking this, and although it seems simple, we shall improve it based on evaluating broader use-cases/implications. Thank you for exposing this gap, and helping us improve! 😄 Do let me know if I can go ahead and close this.
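For readers following along from Python, a rough analogue of the `isNodeInPluginRegistry(...)` check sketched above (that helper name comes from the snippet, not a real parser symbol) would query the global plugin registry for a creator matching the node's `op_type`, version, and namespace:

```python
# Sketch: does the global registry hold a creator for this op_type?
# Mirrors the defaults the parser assumes: version "1", namespace "".
import tensorrt as trt

def node_has_plugin(op_type: str, version: str = "1", namespace: str = "") -> bool:
    registry = trt.get_plugin_registry()
    return registry.get_plugin_creator(op_type, version, namespace) is not None
```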
That's great, above and beyond what I expected, thanks @venkywonka! Much appreciated. I assume once these changes are in place, I could simply follow the build process as described in the README -- so feel free to close this now. Thanks again!
Hey!

I'm trying to follow through the logic of automatic parsing of plugin usage from ONNX files, as per Loading an ONNX model with the custom operator.

To experiment with this, I've played around with the plugins provided by TensorRT-LLM (v0.16) and TRT 10.6.

My understanding is that when an unknown operation type is detected, the `OnnxParser` looks it up in the plugin registry (if the matching `plugin_namespace` attribute exists? Is the version not required?). If everything works well, it will use that plugin.

Now, I wanted to compare the default implementation of e.g. the Gemm operation with the one provided by TensorRT-LLM. I was just curious at this point to see whether they are different implementations at all, or whether the one in TRT-LLM is offered as a sort of backwards compatibility.
However, since the `op_type` Gemm exists natively, it seems the `OnnxParser` never even looks at the plugin registry.

Are there any specific flags missing? Is this a design choice or an oversight? Can one even reuse native ONNX operation names/types as plugin names?
Thanks!
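For anyone reproducing this experiment, one way to see which creators the registry actually holds (and hence whether a Gemm plugin is even visible to the parser) is to list them. This sketch assumes the TRT-LLM plugin library has already been loaded into the process:

```python
# Sketch: dump the registered plugin creators (name, version, namespace).
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, "")   # registers TRT's stock plugins
for creator in trt.get_plugin_registry().plugin_creator_list:
    print(creator.name, creator.plugin_version, creator.plugin_namespace)
```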