Regarding quantization of the matmul and softmax operators in PyTorch #2247
Unanswered · xiexiaozheng asked this question in Q&A
Replies: 1 comment · 1 reply
Hi @xiexiaozheng,
More information about which operations support execution in INT8 precision, and about how a quantized model is transformed from its original precision to low precision, is available here: https://docs.openvino.ai/2023.1/openvino_docs_OV_UG_lpt.html
1 reply
@alexsu52 Hi, I attempted to quantize a toy model containing matmul and softmax operators with QAT. However, after exporting the model using torch.onnx.export, I noticed that no fake-quantization nodes were inserted after the matmul operator, and none after the softmax either. Why is that?
My code is like this:
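(The original snippet was not preserved in this thread. Below is a minimal sketch of what such a toy model and QAT export could look like, assuming NNCF's `create_compressed_model` API; the model architecture, input shape, config values, and file name are illustrative assumptions, not the poster's actual code.)

```python
import torch
import torch.nn as nn
from nncf import NNCFConfig
from nncf.torch import create_compressed_model


class ToyModel(nn.Module):
    """Hypothetical toy model with a matmul -> softmax -> matmul pattern."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(16, 16)

    def forward(self, x):
        attn = torch.matmul(x, x.transpose(-1, -2))  # matmul
        attn = torch.softmax(attn, dim=-1)           # softmax
        return self.fc(torch.matmul(attn, x))        # matmul


# Illustrative QAT configuration; shapes/values are assumptions.
nncf_config = NNCFConfig.from_dict({
    "input_info": {"sample_size": [1, 4, 16]},
    "compression": {"algorithm": "quantization"},
})

# Wrap the model so fake-quantization operations are inserted for QAT.
compression_ctrl, quantized_model = create_compressed_model(ToyModel(), nncf_config)

# ... a QAT fine-tuning loop would run here ...

# Export to ONNX the same way the question describes.
torch.onnx.export(quantized_model, torch.randn(1, 4, 16), "toy_model.onnx")
```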
The exported model in ONNX format looks like this: [screenshot of the ONNX graph, not preserved]