
I am fairly new to model training and machine learning in general, so sorry in advance if my question seems weird. Last week I managed to train a model with PyTorch and got a .pth file. To use that model on Android I wanted to convert it to .tflite. I learned that this conversion has to be done in two steps:

  1. Convert the .pth model to an .onnx model
  2. Convert the resulting model to .tf/.tflite

The conversion from .pth to .onnx worked: I can inspect the rather huge model in netron.app and no errors occurred.
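For reference, the .pth → .onnx step was a plain torch.onnx.export call, roughly like the self-contained sketch below (DummyNet, the input resolution and the opset version are placeholders standing in for my actual RAFT model and settings, not my exact code):

import torch
import torch.nn as nn

# DummyNet stands in for the real trained network; the real code loads the
# RAFT model and its .pth checkpoint instead of this toy module.
class DummyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)

    def forward(self, x):
        return torch.relu(self.conv(x))

model = DummyNet()
# model.load_state_dict(torch.load("raft_model.pth", map_location="cpu"))
model.eval()

dummy_input = torch.randn(1, 3, 384, 768)  # NCHW, illustrative resolution
torch.onnx.export(
    model,
    dummy_input,
    "raft_model.onnx",
    opset_version=17,
    input_names=["image"],
    output_names=["features"],
)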

My problem occurs when I try to convert the .onnx model to .tflite with the following script:

import onnx
import onnx2tf
import tensorflow as tf
import json

# Paths
onnx_model_path = "raft_model.onnx"
tflite_model_path = "raft_model.tflite"
tf_model_path = "raft_tf_model"
param_replacement_path = "param_replacement.json"

# Step 1: JSON for onnx2tf conversion fixes
param_replacement = {
    "operations": [
        {
            "op_name": "/fnet/layer1/layer1.1/Add",
            "param_target": "inputs",
            "param_name": "x",
            "value": "tf.transpose(x, perm=[0, 2, 3, 1])"
        },
        {
            "op_name": "/fnet/layer1/layer1.1/Add",
            "param_target": "inputs",
            "param_name": "y",
            "value": "tf.transpose(y, perm=[0, 2, 3, 1])"
        }
    ]
}

# Save JSON file
with open(param_replacement_path, "w") as f:
    json.dump(param_replacement, f, indent=4)

# Step 2: Convert ONNX → TensorFlow
onnx2tf.convert(
    input_onnx_file_path=onnx_model_path,
    output_folder_path=tf_model_path,
    keep_ncw_or_nchw_or_ncdhw_input_names=[
        "/fnet/layer1/layer1.0/relu_2/Relu_output_0",
        "/fnet/layer1/layer1.1/relu_1/Relu_output_0"
    ],
    keep_nwc_or_nhwc_or_ndhwc_input_names=["/fnet/layer1/layer1.1/Add_output_0"],
    param_replacement_file=param_replacement_path
)

# Step 3: Convert TensorFlow → TensorFlow Lite
try:
    converter = tf.lite.TFLiteConverter.from_saved_model(tf_model_path)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()

    # Save the .tflite model
    with open(tflite_model_path, "wb") as f:
        f.write(tflite_model)
    print("Successful conversion! .tflite model saved as:", tflite_model_path)
except Exception as e:
    print("Error during conversion:", str(e))

Resulting in:

INFO: 46 / 1727
INFO: onnx_op_type: Add onnx_op_name: /cnet/layer2/layer2.0/Add
INFO: input_name.1: /cnet/layer2/layer2.0/downsample/downsample.0/Conv_output_0 shape: [1, 96, 192, 384] dtype: float32
INFO: input_name.2: /cnet/layer2/layer2.0/relu_1/Relu_output_0 shape: [1, 96, 192, 384] dtype: float32
INFO: output_name.1: /cnet/layer2/layer2.0/Add_output_0 shape: [1, 96, 192, 384] dtype: float32
DEBUG: before_op_output_shape_trans = True
INFO: tf_op_type: add
INFO: input.1.x: name: tf.math.add_13/Add:0 shape: (1, 192, 384, 96) dtype: <dtype: 'float32'> 
INFO: input.2.y: name: tf.nn.relu_13/Relu:0 shape: (1, 192, 384, 96) dtype: <dtype: 'float32'> 
INFO: output.1.output: name: tf.math.add_16/Add:0 shape: (1, 192, 384, 96) dtype: <dtype: 'float32'> 
INFO: 47 / 1727
INFO: onnx_op_type: Add onnx_op_name: /fnet/layer1/layer1.1/Add
INFO: input_name.1: /fnet/layer1/layer1.0/relu_2/Relu_output_0 shape: [2, 64, 384, 768] dtype: float32
INFO: input_name.2: /fnet/layer1/layer1.1/relu_1/Relu_output_0 shape: [2, 64, 384, 768] dtype: float32
INFO: output_name.1: /fnet/layer1/layer1.1/Add_output_0 shape: [2, 64, 384, 768] dtype: float32
DEBUG: before_op_output_shape_trans = True
ERROR: The trace log is below.
Traceback (most recent call last):
 File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/onnx2tf/utils/common_functions.py", line 313, in print_wrapper_func
 result = func(*args, **kwargs)
 File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/onnx2tf/utils/common_functions.py", line 386, in inverted_operation_enable_disable_wrapper_func
 result = func(*args, **kwargs)
 File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/onnx2tf/utils/common_functions.py", line 56, in get_replacement_parameter_wrapper_func
 func(*args, **kwargs)
 File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/onnx2tf/ops/Add.py", line 283, in make_node
 merge_two_consecutive_identical_ops_into_one(
 File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/onnx2tf/utils/common_functions.py", line 5475, in merge_two_consecutive_identical_ops_into_one
 tf.math.add(
 File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/tensorflow/python/ops/weak_tensor_ops.py", line 142, in wrapper
 return op(*args, **kwargs)
 File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
 raise e.with_traceback(filtered_tb) from None
 File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/tf_keras/src/layers/core/tf_op_layer.py", line 119, in handle
 return TFOpLambda(op)(*args, **kwargs)
 File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/tf_keras/src/utils/traceback_utils.py", line 70, in error_handler
 raise e.with_traceback(filtered_tb) from None
ValueError: Exception encountered when calling layer "tf.math.add_17" (type TFOpLambda).
Dimensions must be equal, but are 768 and 384 for '{{node tf.math.add_17/Add}} = AddV2[T=DT_FLOAT](Placeholder, Placeholder_1)' with input shapes: [2,768,384,64], [2,384,768,64].
Call arguments received by layer "tf.math.add_17" (type TFOpLambda):
 • x=tf.Tensor(shape=(2, 768, 384, 64), dtype=float32)
 • y=tf.Tensor(shape=(2, 384, 768, 64), dtype=float32)
 • name='/fnet/layer1/layer1.1/Add'
ERROR: input_onnx_file_path: raft_model_fixed.onnx
ERROR: onnx_op_name: /fnet/layer1/layer1.1/Add
ERROR: Read this and deal with it. https://github.com/PINTO0309/onnx2tf#parameter-replacement
ERROR: Alternatively, if the input OP has a dynamic dimension, use the -b or -ois option to rewrite it to a static shape and try again.
ERROR: If the input OP of ONNX before conversion is NHWC or an irregular channel arrangement other than NCHW, use the -kt or -kat option.
ERROR: Also, for models that include NonMaxSuppression in the post-processing, try the -onwdt option.

Some things might seem weird, but I already tried to adjust the mismatched dimensions, which can't be added in the Add node. As I understand it, the first of the two tensors is not correctly converted from NCHW to NHWC, which results in incompatible shapes at the addition. (That's why the script writes a .json file with an explicit permutation, to force NHWC for those specific tensors.)
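To make the mismatch concrete, here is a minimal standalone sketch (not part of my conversion script; shapes taken from the error above) of what the Add seems to be failing on, and how transposing the H and W axes of the bad input would reconcile the shapes:

import tensorflow as tf

# Both Add inputs were [2, 64, 384, 768] in NCHW, so the expected NHWC shape
# is [2, 384, 768, 64]. In the failing node one input arrives as
# [2, 768, 384, 64], i.e. with H and W swapped.
x = tf.zeros([2, 768, 384, 64])  # the wrongly transposed input
y = tf.zeros([2, 384, 768, 64])  # the correctly transposed input

# tf.math.add(x, y)  # fails: "Dimensions must be equal, but are 768 and 384"

# Swapping H and W back makes the shapes compatible again.
x_fixed = tf.transpose(x, perm=[0, 2, 1, 3])  # -> [2, 384, 768, 64]
out = tf.math.add(x_fixed, y)
print(out.shape)  # (2, 384, 768, 64)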

In the error output I also included the step right before the error, where the Add node has no problem adding its two tensors. Could it be because the first dimension increases from 1 to 2 between those two nodes? Is my .onnx model faulty, resulting in this mess, or is it something else?

Or are my installed versions of TensorFlow and onnx(-tf) simply incompatible?

  • TensorFlow: 2.19.0
  • onnx: 1.16.1
  • onnx-tf: 1.27.1
  • So I am still not sure why the dimension inversion happened, but I managed to counter it with a working .json file in the following format:

    param_replacement = {
        "format_version": 1,
        "operations": [
            {
                "op_name": "/fnet/layer1/layer1.1/Add",
                "param_target": "inputs",
                "param_name": "/fnet/layer1/layer1.0/relu_2/Relu_output_0",
                "values": [2, 384, 768, 64]
            }
        ]
    }

    I just forced the problematic tensor to have matching dimensions (syntax from the parameter-replacement link in the error output). Commented Apr 11, 2025 at 8:52
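For completeness, that replacement file then plugs back into the same conversion call as in the script above (a sketch only; dropping the keep_* overrides here is an assumption on my part, not something I verified):

import onnx2tf

# Rerun the ONNX -> TensorFlow conversion, pointing at the replacement file
# from the comment above (already saved as param_replacement.json).
# Assumption: the keep_* overrides from the original script are no longer needed.
onnx2tf.convert(
    input_onnx_file_path="raft_model.onnx",
    output_folder_path="raft_tf_model",
    param_replacement_file="param_replacement.json",
)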
