Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CPU] Introduce FullyConnected, FCQuantized, FCCompressed, Placeholder #26239

Conversation

EgorDuplensky
Copy link
Contributor

@EgorDuplensky EgorDuplensky commented Aug 26, 2024

Details:

  1. Introduce the following operations to the internal opset

    • FullyConnected (MatMul with transposed constant second input)
    • FullyConnectedCompressed (FullyConnected with weights compression)
    • FullyConnectedQuantizedLegacy (FullyConnected with quantized activations and weights and dequantize scale and zero point pulled through the Op by LPT)
    • FullyConnectedQuantized (FullyConnected with quantization scales and zero points on activation, weights and outputs). Planned to be used in scope of dynamic quantization. Can be used for a static quantization as well in the future.
    • Unused inputs are presented as Constant input with Shape{0}
  2. The following transformations were added / updated:

    • ConvertFullyConnectedToFullyConnectedCompressed (replaces proprietary FuseFCAndWeightsDecompression)
    • ConvertFCToFCQuantizedLegacy replaces proprietary FuseConvMatmulFCDeconvAndDQScales
    • FullyConnectedBiasFusion (added into CPU folder for now, needs to be checked and review by GPU team before adaptation to internal opset). Replaces proprietary FuseConvolutionMatMulDeconvAndBias
    • ConvertMatMulToFC updated to use ov::op::internal:FullyConnected, planned to be moved to internal opset after review from GPU team

Todo

  • Clean up debug code
  • Clean up extra cmake targets
  • Perf regression check

Tickets:

  • 149923

@EgorDuplensky EgorDuplensky requested review from a team as code owners August 26, 2024 17:06
@EgorDuplensky EgorDuplensky requested review from itikhono and removed request for a team August 26, 2024 17:06
@github-actions github-actions bot added category: CPU OpenVINO CPU plugin category: transformations OpenVINO Runtime library - Transformations labels Aug 26, 2024
@EgorDuplensky EgorDuplensky force-pushed the quantized_fullyconnected_op branch 2 times, most recently from d318f8c to d444d87 Compare August 27, 2024 09:38
@EgorDuplensky EgorDuplensky force-pushed the quantized_fullyconnected_op branch 4 times, most recently from 707a7cf to 983cfbe Compare August 28, 2024 13:07
@EgorDuplensky EgorDuplensky requested a review from a team as a code owner August 28, 2024 13:07
@github-actions github-actions bot added the category: build OpenVINO cmake script / infra label Aug 28, 2024
@EgorDuplensky EgorDuplensky force-pushed the quantized_fullyconnected_op branch 4 times, most recently from ca9e51d to 3fa55f6 Compare September 2, 2024 12:21
@EgorDuplensky EgorDuplensky force-pushed the quantized_fullyconnected_op branch from 3fa55f6 to f405ccc Compare September 5, 2024 16:31
@github-actions github-actions bot added the category: Core OpenVINO Core (aka ngraph) label Sep 5, 2024
@EgorDuplensky EgorDuplensky force-pushed the quantized_fullyconnected_op branch 2 times, most recently from 7a4891b to 24e0a1b Compare September 5, 2024 17:20
@github-actions github-actions bot removed the category: Core OpenVINO Core (aka ngraph) label Sep 5, 2024
@EgorDuplensky EgorDuplensky force-pushed the quantized_fullyconnected_op branch 2 times, most recently from cb22763 to b5a7adf Compare September 8, 2024 16:08
@EgorDuplensky EgorDuplensky requested a review from a team as a code owner September 8, 2024 16:08
@github-actions github-actions bot added the category: IR FE OpenVINO IR v10 / v11 FrontEnd label Sep 8, 2024
@EgorDuplensky EgorDuplensky force-pushed the quantized_fullyconnected_op branch from b5a7adf to 9029bc1 Compare September 8, 2024 16:37
@EgorDuplensky EgorDuplensky force-pushed the quantized_fullyconnected_op branch from eaba4ff to b1e368b Compare November 26, 2024 13:21
@EgorDuplensky
Copy link
Contributor Author

EgorDuplensky commented Nov 26, 2024

There is an option to create an empty uninitialized Constant of undefined type -- just like that: make_shared<v0::Constant>(element::undefined, Shape{0}); I think we could reconsider the introduction of a Placeholder or simply inherit it from Constant if you need a separate entity.

This was my initial idea, since I also didn't like the PlaceHolder idea. And it didn't work, don't remember why. Let me try one more time :)

Placeholder replaced with an undefined Constant with an empty shape

@EgorDuplensky
Copy link
Contributor Author

There is an option to create an empty uninitialized Constant of undefined type -- just like that: make_shared<v0::Constant>(element::undefined, Shape{0}); I think we could reconsider the introduction of a Placeholder or simply inherit it from Constant if you need a separate entity.

This was my initial idea, since I also didn't like the PlaceHolder idea. And it didn't work, don't remember why. Let me try one more time :)

Placeholder replaced with an undefined Constant with an empty shape

There is one 'issue' left
With this change we are making 'undefined' element type a valid element type.
It has an impact on serialization / deserialization. Such element type was not expected there before.
Also netron does not support it at the moment which is probably is not a big deal and can be fixed.
Also, now we basically do not have some 'invalid' element type inside enum.
Not sure if this is a real issue or not.
@jane-intel @praasz @itikhono What do you think?

@EgorDuplensky EgorDuplensky force-pushed the quantized_fullyconnected_op branch from b1e368b to e80440f Compare November 26, 2024 15:38
@praasz
Copy link
Contributor

praasz commented Nov 28, 2024

There is an option to create an empty uninitialized Constant of undefined type -- just like that: make_shared<v0::Constant>(element::undefined, Shape{0}); I think we could reconsider the introduction of a Placeholder or simply inherit it from Constant if you need a separate entity.

This was my initial idea, since I also didn't like the PlaceHolder idea. And it didn't work, don't remember why. Let me try one more time :)

Placeholder replaced with an undefined Constant with an empty shape

There is one 'issue' left With this change we are making 'undefined' element type a valid element type. It has an impact on serialization / deserialization. Such element type was not expected there before. Also netron does not support it at the moment which is probably is not a big deal and can be fixed. Also, now we basically do not have some 'invalid' element type inside enum. Not sure if this is a real issue or not. @jane-intel @praasz @itikhono What do you think?

In serialization the undefined and dynamic types are converted to same string.
Maybe for this empty constant the dynamic can be used and only for empty const can be serialized as dynamic string (maybe can be applied as generic rule later).

The dynamic type is compatible with any other. The deserialization for type should work as it get ov type by name.

@EgorDuplensky EgorDuplensky force-pushed the quantized_fullyconnected_op branch from e80440f to cef7bc6 Compare November 29, 2024 17:52
@github-actions github-actions bot removed the category: build OpenVINO cmake script / infra label Nov 29, 2024
@EgorDuplensky
Copy link
Contributor Author

There is an option to create an empty uninitialized Constant of undefined type -- just like that: make_shared<v0::Constant>(element::undefined, Shape{0}); I think we could reconsider the introduction of a Placeholder or simply inherit it from Constant if you need a separate entity.

This was my initial idea, since I also didn't like the PlaceHolder idea. And it didn't work, don't remember why. Let me try one more time :)

Placeholder replaced with an undefined Constant with an empty shape

There is one 'issue' left With this change we are making 'undefined' element type a valid element type. It has an impact on serialization / deserialization. Such element type was not expected there before. Also netron does not support it at the moment which is probably is not a big deal and can be fixed. Also, now we basically do not have some 'invalid' element type inside enum. Not sure if this is a real issue or not. @jane-intel @praasz @itikhono What do you think?

In serialization the undefined and dynamic types are converted to same string. Maybe for this empty constant the dynamic can be used and only for empty const can be serialized as dynamic string (maybe can be applied as generic rule later).

The dynamic type is compatible with any other. The deserialization for type should work as it get ov type by name.

ov::dynamic type is quite unique. It does not really play well with the backends we use.
Let's try to proceed with an undefined type for now. We can replace it with dynamic one later if necessary

@EgorDuplensky EgorDuplensky force-pushed the quantized_fullyconnected_op branch from cef7bc6 to 61ed454 Compare December 5, 2024 14:12
@EgorDuplensky
Copy link
Contributor Author

Performance regression checks passed. Ready for merge

@dmitry-gorokhov dmitry-gorokhov added this pull request to the merge queue Dec 10, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 10, 2024
@dmitry-gorokhov dmitry-gorokhov added this pull request to the merge queue Dec 10, 2024
Merged via the queue into openvinotoolkit:master with commit 6a4ba46 Dec 10, 2024
173 checks passed
@dmitry-gorokhov dmitry-gorokhov deleted the quantized_fullyconnected_op branch December 10, 2024 10:04
alvoron added a commit to eshoguli/openvino that referenced this pull request Dec 12, 2024
11happy pushed a commit to 11happy/openvino that referenced this pull request Dec 23, 2024
openvinotoolkit#26239)

### Details:
1.  Introduce the following operations to the internal opset

    * `FullyConnected` (`MatMul` with transposed constant second input)
* `FullyConnectedCompressed` (`FullyConnected` with weights compression)
* `FullyConnectedQuantizedLegacy` (`FullyConnected` with quantized
activations and weights and dequantize scale and zero point pulled
through the Op by LPT)
* `FullyConnectedQuantized` (`FullyConnected` with quantization scales
and zero points on activation, weights and outputs). Planned to be used
in scope of dynamic quantization. Can be used for a static quantization
as well in the future.
    * Unused inputs are presented as `Constant` input with `Shape{0}`

2. The following transformations were added / updated:
* `ConvertFullyConnectedToFullyConnectedCompressed` (replaces
proprietary ~`FuseFCAndWeightsDecompression`~)
* `ConvertFCToFCQuantizedLegacy` replaces proprietary
~`FuseConvMatmulFCDeconvAndDQScales`~
* `FullyConnectedBiasFusion` (added into CPU folder for now, needs to be
checked and review by GPU team before adaptation to internal opset).
Replaces proprietary ~`FuseConvolutionMatMulDeconvAndBias`~
* `ConvertMatMulToFC` updated to use `ov::op::internal:FullyConnected`,
planned to be moved to internal opset after review from GPU team

### Todo
 - [x] Clean up debug code
 - [x] Clean up extra cmake targets
 - [x] Perf regression check

### Tickets:
 - 149923
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: CPU OpenVINO CPU plugin category: IE Tests OpenVINO Test: plugins and common category: IR FE OpenVINO IR v10 / v11 FrontEnd category: transformations OpenVINO Runtime library - Transformations
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants