
Engine cache and model security #4290

Open

MickaMickaMicka opened this issue Dec 19, 2024 · 2 comments
Labels: ONNX (Issues relating to ONNX usage and import), triaged (Issue has been triaged by maintainers)

@MickaMickaMicka

I am using ONNX Runtime to generate a TensorRT engine from an ONNX model file.
I set trt_options.trt_engine_cache_enable = 1; and trt_options.trt_engine_cache_path = "./path"; so the engine file is built once and loaded on later runs, which works well (a large speedup).

However, I'm not sure what exactly that engine file contains.
Does it include the model's weights?
Can it be used to load the full model, without having access to the .onnx model file?

My question arises because we encrypt our model files and load them from RAM at runtime, so that other people with access to the system cannot read our models. If the cache effectively grants full access to the model, we will need a different solution.
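To make the concern concrete, here is a minimal, purely illustrative sketch of the encrypt-at-rest / decrypt-into-RAM pattern described above. The XOR "cipher" and all names are hypothetical stand-ins (a real deployment would use an AEAD cipher); the point is only that a plaintext engine cache written to disk would sit outside this protection.

```python
# Illustrative only: a toy XOR "cipher" stands in for real encryption.
# None of these names are part of ONNX Runtime's API.

def xor_bytes(data: bytes, key: bytes) -> bytes:
    # Toy stream "cipher" for illustration -- use a real AEAD cipher in practice.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

KEY = b"demo-key"

# At packaging time: only the encrypted model is shipped to disk.
plain_model = b"fake-onnx-bytes-including-weights"
encrypted_on_disk = xor_bytes(plain_model, KEY)
assert encrypted_on_disk != plain_model

# At runtime: decrypt into RAM; the plaintext never touches disk.
model_in_ram = xor_bytes(encrypted_on_disk, KEY)
assert model_in_ram == plain_model

# ONNX Runtime can consume such in-memory bytes directly (e.g.
# InferenceSession(model_bytes) in Python, CreateSessionFromArray in C).
# But if trt_engine_cache_enable writes a plaintext engine file under
# trt_engine_cache_path, that on-disk file bypasses the scheme above.
```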

@asfiyab-nvidia
Collaborator

@yuanyao-nv can you help take a look at the ONNX Runtime query?

@asfiyab-nvidia added the ONNX and triaged labels on Dec 23, 2024
@yuanyao-nv
Collaborator

> Does it include the model's weights?

By default the engine includes the weights.

> Can it be used to load the full model, without having access to the .onnx model file?

Once the engine is built, the ONNX model is no longer needed for inference. Note that the engine only works for the configuration you specify at engine build time (optimization profile, precisions, GPU architecture, etc.).

See more info in our developer guide: https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#prog-model
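For reference, the same cache settings from the question can be expressed through onnxruntime's Python API as provider options. This is a hedged sketch: the option names mirror the C API fields quoted above, but whether your build accepts them depends on the ONNX Runtime / TensorRT versions installed, so the session creation is left commented out.

```python
# Engine-cache settings for the TensorRT execution provider, mirroring
# trt_engine_cache_enable / trt_engine_cache_path from the question.
trt_provider_options = {
    "trt_engine_cache_enable": True,
    "trt_engine_cache_path": "./path",
}
providers = [
    ("TensorrtExecutionProvider", trt_provider_options),
    ("CUDAExecutionProvider", {}),
]

# import onnxruntime as ort
# sess = ort.InferenceSession("model.onnx", providers=providers)
#
# Once a cached engine exists under ./path, later sessions reuse it.
# Since that engine embeds the weights by default, treat the cache
# directory as being as sensitive as the .onnx file itself.
```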
