feature: model compression #2636

rkuester · 2024-07-24T18:01:59Z

An issue to track the implementation of model compression.

Merge Queue

PRs from rkuester/feat-compression are open for approval and merging. Because of our one-commit-per-PR policy, there is typically only one PR open at a time.
Commits along the branch rkuester/feat-compression-next are queued, so to speak, for submission as PRs. Looking at this branch might give context for the PR currently open from feat-compression (see above). Be aware that this branch is rebased often.
The branch rkuester/compress-testing typically contains the final result once all PRs for the model compression feature are merged. This is the Oort cloud from which commits along feat-compression-next, which turn into PRs from feat-compression, are formed. Explore this branch or compare it to main to see the full model compression feature implementation and understand the queued PRs in context.

Hoist the universally-useful tflite::Span out of codegen/runtime/micro_codegen_context.h and into an independent header usable from elsewhere in the project. BUG=#2636

github-actions · 2024-10-03T10:06:43Z

"This issue is being marked as stale due to inactivity. Remove label or comment to prevent closure in 5 days."

rkuester · 2024-10-03T16:12:32Z

This task remains open; PRs and issues regularly link to it for tracking purposes.

github-actions · 2024-10-29T10:07:17Z

"This issue is being marked as stale due to inactivity. Remove label or comment to prevent closure in 5 days."

chore: remove obsolete ci/temp_patches Remove ci/temp_patches, which was obsoleted in 23f608f once it was no longer used by the sync script. It should have been deleted then. Remove it not only to clean up dead code, but because it contains a reference to `micro_copts`, which is about to be refactored away, and we don't want to leave stray references to it in the tree. BUG=#2636

Remove micro_copts() by replacing every cc_* target that used them with a tflm_cc_* equivalent, and setting those common copts in one place, inside the tflm_cc_* macro. This is the first of several commits introducing tflm_cc_* macros in place of cc_binary, cc_library, and cc_test. Motivated by the upcoming need to support conditional compilation, the objective is to centralize build configuration rather than requiring (and remembering that) each cc_* target in the project add the same common attributes such as compiler options and select()ed defines. Alternatives such as setting global options on the command line or in .bazelrc, even if simplified with a --config option, fail to preserve flags and hooks for configuration in the case TFLM is used as an external repository by an application project. Nor is it easy in that case for individual targets to override an otherwise global setting. BUG=tensorflow#2636

Replace cc_* targets remaining in TFLM code with tflm_cc_* targets. These are targets which did not formerly use the common copts. Avoid changing imported TFLite code, if for no other reason than to avoid merge conflicts during the automatic sync with upstream TFLite. BUG=tensorflow#2636

Add a flatbuffer schema for describing compressed models. Flatbuffers with this schema are to be used as the value in a .tflite model flatbuffer metadata field, and contain the extra information necessary to describe a compressed model. Include tests to ensure basic functionality and demonstrate integration with C++, Python, and Bazel. BUG=tensorflow#2636

…or (#3011) Add `AllocateCompressedTensorsList` method to manage compressed tensor allocations. Update `RecordedAllocationType` to include `kCompressionData` for tracking compression-related allocations. Add `TestCompressedModel` test case to validate compressed tensor allocation functionality. BUG=part of #2636

Allocate resource variables in a persistent buffer when the input tensor is compressed. Extend tests to validate operation. BUG=part of tensorflow#2636

Clarify the usage of `MicroContext::AllocateDecompressionScratchBuffer` and `tflite::micro::GetTensorData` for handling decompressed tensor data. Add a section on alternate decompression memory regions, explaining how to specify and use specialized memory for decompression. Update instructions for compressing models using a YAML specification. Simplify the model compression and alignment command examples. Spin off new Generic Benchmark Application documentation. BUG=part of tensorflow#2636

Add instructions for using the tool with compressed models, including profiling timing for decompression and alternate memory regions. Update the tested targets list to include additional Xtensa architectures. Provide example build and run commands for compressed models with alternate decompression memory. Correct typos and improve clarity in build instructions and example outputs. Update compiler flags and example output to reflect recent changes. BUG=part of tensorflow#2636

Add a check in `micro_allocator.cc` to verify the compression metadata schema version. If the schema version in the metadata is greater than the expected version, log a schema version mismatch error and return a `nullptr`. This prevents potential issues arising from using a newer, unsupported schema version. BUG=part of tensorflow#2636
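
The rule described above can be sketched in a few lines. This is only an illustration of the comparison, written in Python for brevity; the real check lives in C++ in `micro_allocator.cc`, and the names `EXPECTED_SCHEMA_VERSION` and `accept_compression_metadata` as well as the version value are hypothetical.

```python
# Hypothetical value, chosen only for illustration.
EXPECTED_SCHEMA_VERSION = 1


def accept_compression_metadata(schema_version,
                                expected=EXPECTED_SCHEMA_VERSION,
                                log=print):
    """Return True if the metadata may be used, False on a version mismatch.

    Only a schema version greater than the expected one is rejected; an
    equal or older version is accepted.
    """
    if schema_version > expected:
        log(f"schema version mismatch: got {schema_version}, "
            f"expected at most {expected}")
        return False
    return True
```

In the C++ code the rejecting branch corresponds to logging the error and returning `nullptr`.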

tinskip · 2024-12-18T20:38:09Z

Hi. Was this change intended for whether TFLM compression is on or off, or should it have been made conditional as well?

https://github.com/tensorflow/tflite-micro/blame/main/tensorflow/lite/micro/kernels/concatenation.cc#L189

ddavis-2015 · 2024-12-18T22:53:54Z

> Hi. Was this change intended for whether TFLM compression is on or off, or should it have been made conditional as well?
>
> https://github.com/tensorflow/tflite-micro/blame/main/tensorflow/lite/micro/kernels/concatenation.cc#L189

@tinskip This change is intentional and applies whether or not compression is conditionally compiled into LRTM. The change updates the LRTM (TFLM) code to match the LiteRT (TfLite) reference implementation.

…sors Compress using a single value table when a tensor is per-tensor quantized, as indicated by the presence of only one quantization scale and zero point. Update unit tests accordingly and augment `test_models` to accommodate additional quantization fields. Abandon the logic that a tensor should be compressed along the NHWC channel dimension if the quantization parameters do not specify an axis. Instead, fail with an error if the compression axis cannot be inferred from the quantization parameters. The interpreter already expects a single value table when a tensor is per-tensor quantized. BUG=part of tensorflow#2636
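
The decision described in this commit message can be sketched as follows. This is a minimal illustration, not the code in `compress.py`: the function names, the return values, and the idea of building a table from the sorted unique values are assumptions made for the example.

```python
def plan_value_tables(scales, zero_points, quantized_dimension=None):
    """Decide how many value tables a tensor gets (hypothetical helper).

    Returns ("per_tensor", None) when there is exactly one scale and one
    zero point, else ("per_channel", axis). Raises ValueError when the
    axis cannot be inferred, rather than guessing a channel dimension.
    """
    if len(scales) == 1 and len(zero_points) == 1:
        return ("per_tensor", None)
    if quantized_dimension is None:
        raise ValueError("cannot infer compression axis from quantization")
    return ("per_channel", quantized_dimension)


def build_value_table(values):
    """Map values to indices into a table of sorted unique values."""
    table = sorted(set(values))
    index_of = {v: i for i, v in enumerate(table)}
    return table, [index_of[v] for v in values]
```

For a per-tensor quantized tensor, `build_value_table` would be called once over all elements; for a per-channel tensor, once per slice along the returned axis.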

rkuester mentioned this issue Jul 24, 2024

build(py): add hexdump requirement #2637

Merged

mergify bot pushed a commit that referenced this issue Jul 24, 2024

build(py): add hexdump requirement (#2637)

569ed29

Add the Python distribution package `hexdump`, to be used in tests and utilities which display raw memory. BUG=#2636

rkuester mentioned this issue Jul 24, 2024

refactor: move tflite::Span to independent header #2638

Merged

rkuester mentioned this issue Jul 25, 2024

feat: add fixed-capacity, statically-allocated type tflite::StaticVector #2642

Merged

mergify bot pushed a commit that referenced this issue Aug 7, 2024

feat: add fixed-capacity, statically-allocated type tflite::StaticVector (#2642)

d3475aa

Add a type, tflite::StaticVector, which behaves like std::vector, but which avoids heap memory allocation. BUG=#2636

rkuester mentioned this issue Aug 7, 2024

feat: add a tflite::hexdump() for printing raw memory #2659

Closed

ddavis-2015 assigned rkuester Sep 7, 2024

rkuester mentioned this issue Sep 26, 2024

doc: propose design for indicating which tensors to compress #2700

Closed

github-actions bot added the Stale label Oct 3, 2024

github-actions bot removed the Stale label Oct 4, 2024

github-actions bot added the Stale label Oct 29, 2024

rkuester removed the Stale label Oct 29, 2024

rkuester mentioned this issue Oct 30, 2024

chore: remove obsolete ci/temp_patches #2744

Merged

rkuester mentioned this issue Nov 13, 2024

build(bazel): introduce tflm_cc_* macros, refactoring away micro_copts #2765

Merged

rkuester mentioned this issue Nov 15, 2024

build(bazel): replace cc_* with tflm_cc_* in remaining TFLM code #2768

Merged

rkuester added a commit to rkuester/tflite-micro that referenced this issue Nov 15, 2024

feat: add a tflite::hexdump() for printing raw memory

8ff28fa

Add tflite::hexdump() for printing raw memory to output streams. Copy the output format of Python's hexdump module. BUG=tensorflow#2636

rkuester mentioned this issue Nov 15, 2024

feat: add a tflite::hexdump() for printing raw memory #2769

Merged

mergify bot pushed a commit that referenced this issue Nov 15, 2024

feat: add a tflite::hexdump() for printing raw memory (#2769)

b238649

Add tflite::hexdump() for printing raw memory to output streams. Copy the output format of Python's hexdump module. BUG=#2636
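
As a rough picture of what such a dump looks like, here is a minimal sketch in Python. It is not `tflite::hexdump()` and is not guaranteed to match the exact output of Python's hexdump module; the function name `hexdump_lines` and the column layout are assumptions made for illustration.

```python
def hexdump_lines(data: bytes, width: int = 16):
    """Yield one line per `width` bytes: offset, hex column, text column."""
    pad = width * 3 - 1  # width of a full hex column, including spaces
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02X}" for b in chunk)
        # Show printable ASCII as-is and everything else as '.'.
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        # Left-justify the hex column so the text column stays aligned
        # on a short final line.
        yield f"{offset:08X}: {hex_part:<{pad}}  {text_part}"
```

Calling `print("\n".join(hexdump_lines(buffer)))` on a `bytes` object prints the dump one row at a time.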

rkuester mentioned this issue Dec 15, 2024

feat(compression): allocate resource variables in persistent buffer #3013

Merged

rkuester mentioned this issue Dec 15, 2024

feat(compression): implement tensor decompression in op concatenation #3014

Merged

rkuester pushed a commit to rkuester/tflite-micro that referenced this issue Dec 15, 2024

feat(compression): implement tensor decompression in op conv

7a38395

Implement tensor decompression in op conv. Extend tests to validate operation on compressed tensors. BUG=part of tensorflow#2636

rkuester mentioned this issue Dec 15, 2024

feat(compression): implement tensor decompression in op conv #3015

Merged

mergify bot pushed a commit that referenced this issue Dec 16, 2024

feat(compression): implement tensor decompression in op concatenation (#3014)

50e7e5d

Implement tensor decompression in op concatenation. Extend tests to validate operation on compressed tensors. BUG=part of #2636

mergify bot pushed a commit that referenced this issue Dec 16, 2024

feat(compression): implement tensor decompression in op conv (#3015)

f9fecab

Implement tensor decompression in op conv. Extend tests to validate operation on compressed tensors. BUG=part of #2636

rkuester mentioned this issue Dec 16, 2024

feat(compression): implement tensor decompression in op depthwise conv #3017

Merged

rkuester mentioned this issue Dec 16, 2024

feat(compression): implement tensor decompression in op transpose conv #3018

Merged

mergify bot pushed a commit that referenced this issue Dec 16, 2024

feat(compression): implement tensor decompression in op transpose conv (#3018)

099774d

Implement tensor decompression in op transpose conv. Extend tests to validate operation on compressed tensors. BUG=part of #2636

mergify bot pushed a commit that referenced this issue Dec 16, 2024

feat(compression): implement tensor decompression in op depthwise conv (#3017)

b1d8a08

Implement tensor decompression in op depthwise conv. Extend tests to validate operation on compressed tensors. BUG=part of #2636

rkuester mentioned this issue Dec 17, 2024

docs: update compression documentation #3020

Merged

rkuester mentioned this issue Dec 17, 2024

docs: update generic benchmark tool documentation #3021

Merged

rkuester mentioned this issue Dec 17, 2024

feat(compression): check metadata schema version #3022

Merged

rkuester mentioned this issue Dec 19, 2024

fix(compress.py): use single value table for per-tensor quantized tensors #3025

Merged
