[ONNX OP] DynamicQuantizeMatMul #28158

Draft: virajwad wants to merge 8 commits into base branch master


@virajwad (Contributor) commented Dec 20, 2024

Pseudo-implementation of the DynamicQuantizeMatMul (DQMM) ONNX contrib op for the OpenVINO ONNX Frontend. It mimics the op's functionality rather than reproducing it exactly.

This PR does not implement DQMM exactly as ONNX Runtime does. For example, MatMulInteger is not used; instead, ov::MatMul is applied to the dequantized matrix B.
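To make the distinction concrete, here is a minimal pure-Python sketch of the two computation paths. It is an illustration under stated assumptions, not the code in this PR and not ONNX Runtime's kernel. The helper names (`dynamic_quantize`, `dqmm_integer_path`, `dqmm_float_path`) and the sample values are hypothetical, `b_scale` and `b_zp` are treated as per-tensor scalars for brevity, and the float path assumes A is left unquantized, which the description above does not state either way.

```python
# Illustrative sketch only: plain Python lists, no OpenVINO or ONNX Runtime calls.

def dynamic_quantize(a):
    """Quantize a float matrix to uint8 using the DynamicQuantizeLinear formulas."""
    flat = [v for row in a for v in row]
    lo = min(0.0, min(flat))  # the range is extended so that it contains 0
    hi = max(0.0, max(flat))
    # Fallback to 1.0 for an all-zero input is an assumption of this sketch.
    scale = (hi - lo) / 255.0 or 1.0
    # Python's round() rounds half to even, as the ONNX operator does.
    zero_point = max(0, min(255, round(-lo / scale)))
    q = [[max(0, min(255, round(v / scale) + zero_point)) for v in row] for row in a]
    return q, scale, zero_point

def matmul(x, y):
    """Naive matrix product of two row-major nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def dqmm_integer_path(a, b_q, b_scale, b_zp):
    """Quantize A, multiply in the integer domain, then rescale to float."""
    a_q, a_scale, a_zp = dynamic_quantize(a)
    a_int = [[v - a_zp for v in row] for row in a_q]
    b_int = [[v - b_zp for v in row] for row in b_q]
    acc = matmul(a_int, b_int)  # exact integer accumulation
    return [[v * a_scale * b_scale for v in row] for row in acc]

def dqmm_float_path(a, b_q, b_scale, b_zp):
    """Dequantize B and run a float MatMul (assumes A stays in float)."""
    b_f = [[(v - b_zp) * b_scale for v in row] for row in b_q]
    return matmul(a, b_f)

# Hypothetical inputs, chosen only to exercise both paths.
a = [[0.5, -1.25, 2.0], [1.5, 0.25, -0.75]]
b_q = [[130, 120], [125, 140], [128, 110]]
b_scale, b_zp = 0.02, 128

y_int = dqmm_integer_path(a, b_q, b_scale, b_zp)
y_float = dqmm_float_path(a, b_q, b_scale, b_zp)
max_diff = max(abs(p - q) for r1, r2 in zip(y_int, y_float) for p, q in zip(r1, r2))
print("max abs difference between the two paths:", max_diff)
```

Under these assumptions the two results differ only by the rounding error introduced when A is quantized, so they agree to within a small tolerance rather than bit for bit.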

Please do not merge yet

@virajwad virajwad requested a review from a team as a code owner December 20, 2024 04:24
@virajwad virajwad marked this pull request as draft December 20, 2024 04:25
@github-actions github-actions bot added the category: ONNX FE OpenVINO ONNX FrontEnd label Dec 20, 2024
@sys-openvino-ci sys-openvino-ci added the ExternalPR External contributor label Dec 20, 2024
@virajwad (Contributor, Author) commented

@gkrivor
Current state of this PR as of writing this comment:

[image attached in the original comment]

The PR passes the test case on both CPU and GPU, but only with a somewhat high tolerance of 0.0055f. I believe this is due to a subtle difference between the ONNX Runtime implementation and my OpenVINO implementation. I will need to look into it later.

Labels
category: ONNX FE OpenVINO ONNX FrontEnd ExternalPR External contributor
3 participants