This document, originally written as part of the introductory blog post, covers the layout semantic divergence between TensorFlow Lite (TFLite) models, which use NHWC, and ONNX models, which use NCHW.
TFLite's data layout is stated in neither its documentation nor its model representation; it exists only as an implicit agreement between the TFLite converter (the TensorFlow model needs to be NHWC) and the kernels. In contrast, ONNX explicitly declares that it uses NCHW in both its operator representation and its documentation (which is generated from the operator representation).
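As a minimal illustration of the divergence (with arbitrarily chosen shapes), the same batch of images is indexed differently under the two layouts:

```python
import numpy as np

# A batch of 2 images, 4x4 spatial, 3 channels, in TFLite's NHWC layout.
nhwc = np.random.rand(2, 4, 4, 3)

# The equivalent ONNX tensor uses NCHW: batch, channels, height, width.
nchw = nhwc.transpose(0, 3, 1, 2)

print(nhwc.shape)  # (2, 4, 4, 3)
print(nchw.shape)  # (2, 3, 4, 4)
```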
tflite2onnx introduces a propagation-based approach to handle the layout issue in general, together with some other mechanisms for dedicated corner cases.
The propagation-based approach resolves this by propagating the layout semantic divergence across the graph, so that the transpose pattern is not needed.
By default (in most cases), given a graph, some tensors carry implicit layout semantics, e.g. tensors connected directly to Conv, while others do not, e.g. those of Abs and Add. The latter operators are transparent to layout, where transparent means that all tensors connected to the operator must either have the same layout semantic or hold no such semantic at all. So when an operator that is transparent to layout is connected to an operator that has implicit-layout tensors, all tensors of the transparent operator take on the same layout semantic as the tensor connecting the two operators; this is what we call propagation.
For example, when converting a TFLite graph that contains Conv (kernel and bias omitted), the layout transformation during propagation permutes the shapes of tensors that are activations, i.e. value info in ONNX, and additionally transposes the data of weights, i.e. initializers in ONNX, as sketched below.
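A minimal sketch of the two transformations, assuming the usual NHWC-to-NCHW permutation and the common OHWI (TFLite) and OIHW (ONNX) Conv kernel conventions; the shapes are illustrative, not taken from a real model:

```python
import numpy as np

PERM = (0, 3, 1, 2)  # NHWC -> NCHW

# Activations (value info in ONNX) carry no data; only the declared
# shape is permuted.
act_shape = [1, 6, 8, 3]                  # NHWC
act_shape = [act_shape[i] for i in PERM]  # -> [1, 3, 6, 8]

# Weights (initializers in ONNX) carry data; the buffer itself is
# transposed so the values line up with the permuted shape.
kernel = np.random.rand(16, 3, 3, 8)      # TFLite Conv kernel, OHWI
kernel = kernel.transpose(0, 3, 1, 2)     # ONNX Conv kernel, OIHW
print(kernel.shape)                       # (16, 8, 3, 3)
```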
In practice, operators are categorized into four classes (see the sketch after this list):
- Implicit: operators that have layout semantic divergence, e.g. Conv. They are the source of layout semantic divergence.
- Transparent: operators that are insensitive to layout, e.g. Abs. If any of their tensors has layout semantic divergence, it is propagated to all tensors connected to such operators.
- Attribute: operators that propagate layout semantic divergence just like Transparent ones, but have layout-sensitive attributes that need special handling, e.g. the axis attribute of Concat. An additional pass after propagation is required to adjust these attributes.
- Terminate: operators that neither have nor propagate layout semantic divergence, e.g. Reshape. Propagation across the graph terminates at such operators.
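One way such a classification could be encoded is sketched below; the operator lists are examples only, and tflite2onnx's actual tables cover many more operators:

```python
from enum import Enum

class LayoutCategory(Enum):
    IMPLICIT = 1     # source of layout semantic divergence
    TRANSPARENT = 2  # propagates divergence unchanged
    ATTRIBUTE = 3    # propagates, but attributes need adjusting
    TERMINATE = 4    # stops propagation

# Illustrative mapping from operator type to category.
OP_CATEGORY = {
    'Conv': LayoutCategory.IMPLICIT,
    'Abs': LayoutCategory.TRANSPARENT,
    'Add': LayoutCategory.TRANSPARENT,
    'Concat': LayoutCategory.ATTRIBUTE,
    'Reshape': LayoutCategory.TERMINATE,
}
```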
Figure 1: Part of the ONNX model generated by the propagation-based approach of tflite2onnx
When propagating layout semantic divergence across the graph, for a particular operator: if it is Transparent or Attribute, propagate the divergence among its tensors; if it is Implicit or Terminate, terminate the propagation in that direction. Figure 1 shows part of the ONNX model generated by the propagation-based approach from the NASNet TFLite model.
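The walk itself can be sketched as a simple worklist algorithm over a toy graph, reusing the illustrative OP_CATEGORY table above; the graph structure and helper types here are invented for the example and are not tflite2onnx's internal API:

```python
from collections import namedtuple

Op = namedtuple('Op', ['type', 'tensors'])

# Toy graph: Conv -> Abs -> Reshape, with tensors t0..t3.
ops = [
    Op('Conv', ['t0', 't1']),
    Op('Abs', ['t1', 't2']),
    Op('Reshape', ['t2', 't3']),
]

def propagate(ops, seeds):
    """Spread layout semantic divergence from seed tensors (those with
    implicit layout semantic, e.g. around Conv) across the graph."""
    marked = set(seeds)
    worklist = list(seeds)
    while worklist:
        tensor = worklist.pop()
        for op in ops:
            if tensor not in op.tensors:
                continue
            if OP_CATEGORY[op.type] in (LayoutCategory.TRANSPARENT,
                                        LayoutCategory.ATTRIBUTE):
                # All tensors of a Transparent/Attribute operator share
                # the layout semantic of the connecting tensor.
                for t in op.tensors:
                    if t not in marked:
                        marked.add(t)
                        worklist.append(t)
            # Implicit and Terminate operators stop the propagation.
    return marked

# t1 is attached to Conv, so it carries the divergence; propagation
# marks t2 through Abs, then stops at Reshape (Terminate).
print(propagate(ops, {'t1'}))  # {'t1', 't2'}
```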
With the propagation-based approach, the converted ONNX model requires zero extra effort to handle layout semantic divergence, i.e. no additional operators or tensors are introduced.
However, sometimes the layouts can be incompatible. Consider Reshape, which is Terminate, as in the example below.
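What follows is an assumed reconstruction of such an incompatibility, with invented shapes: a Reshape whose hard-coded target shape was written against NHWC feeds an operator that now expects NCHW, and since propagation terminates at Reshape, nothing rewrites that target shape:

```python
import numpy as np

x = np.zeros(144)

# TFLite: Reshape emits an NHWC activation for a downstream Conv.
tflite_out = x.reshape(1, 6, 8, 3)  # target shape written for NHWC

# ONNX: the downstream Conv needs NCHW, i.e. (1, 3, 6, 8), but the
# Reshape target shape is constant data, not a tensor shape that the
# propagation pass can permute.
needed = (1, 3, 6, 8)
print(tflite_out.shape == needed)   # False -> incompatible layouts
```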
Explicit layout is introduced to handle such a scenario. Users can feed a mapping from tensor names to layouts, for example for the tensors of an Add operator whose layout cannot be inferred.
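A usage sketch, assuming the explicit_layouts argument of tflite2onnx.convert described in the project documentation; the tensor names and layout pairs below are placeholders:

```python
import tflite2onnx

# Map tensor names to [source layout, target layout] pairs so the
# converter knows how to treat tensors whose layout it cannot infer.
explicit_layouts = {
    'add_input': ['NHWC', 'NCHW'],   # placeholder tensor name
    'add_output': ['NHWC', 'NCHW'],  # placeholder tensor name
}

tflite2onnx.convert('model.tflite', 'model.onnx',
                    explicit_layouts=explicit_layouts)
```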
Another problem is broadcasting for binary operators such as Add (see this issue for more). Consider the example below, in which a small tensor is implicitly broadcast against an activation.
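An assumed reconstruction with invented shapes: NumPy-style broadcasting aligns trailing dimensions, so a channel-wise tensor broadcasts against an NHWC activation but no longer aligns once the activation has been permuted to NCHW:

```python
import numpy as np

a_nhwc = np.zeros((1, 6, 8))  # activation, channels last
b = np.zeros(8)               # channel-wise operand
print((a_nhwc + b).shape)     # (1, 6, 8): trailing dims match

a_nchw = np.zeros((1, 8, 6))  # same activation after layout transform
try:
    a_nchw + b                # trailing dims are 6 vs 8
except ValueError as e:
    print(e)                  # operands could not be broadcast together
```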
To manage broadcasting in the ONNX model, tflite2onnx introduces the Reshape pattern: tensors like b above, which rely on implicit broadcasting, are given an explicit reshape so that they still broadcast correctly against the layout-transformed operand.
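Continuing the invented example, the effect of the pattern can be sketched by giving the small tensor an explicit shape that aligns under NCHW:

```python
import numpy as np

a_nchw = np.zeros((1, 8, 6))
b = np.zeros(8)

# The Reshape pattern: give b an explicit shape whose channel axis
# lines up with the NCHW activation.
b_reshaped = b.reshape(8, 1)
print((a_nchw + b_reshaped).shape)  # (1, 8, 6): broadcast works again
```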