I am kinda new to model training and machine learning in general, so sorry in advance if my question seems weird. Last week I managed to train a model with PyTorch and got a .pth file. To use that model on Android, I wanted to convert it to .tflite. I learned that this conversion has to be done in two steps:
- Convert the .pth model to an .onnx model
- Convert the resulting model to .tf/.tflite
The conversion from .pth to .onnx worked: I can inspect the rather huge model in netron.app, and no errors occurred.
My problem occurs when I try to convert the .onnx model to .tflite with this script:
import onnx
import onnx2tf
import tensorflow as tf
import json
# Paths
onnx_model_path = "raft_model.onnx"
tflite_model_path = "raft_model.tflite"
tf_model_path = "raft_tf_model"
param_replacement_path = "param_replacement.json"
# Step 1: JSON for onnx2tf conversion fixes
param_replacement = {
    "operations": [
        {
            "op_name": "/fnet/layer1/layer1.1/Add",
            "param_target": "inputs",
            "param_name": "x",
            "value": "tf.transpose(x, perm=[0, 2, 3, 1])"
        },
        {
            "op_name": "/fnet/layer1/layer1.1/Add",
            "param_target": "inputs",
            "param_name": "y",
            "value": "tf.transpose(y, perm=[0, 2, 3, 1])"
        }
    ]
}
# Save JSON file
with open(param_replacement_path, "w") as f:
    json.dump(param_replacement, f, indent=4)
# Step 2: Convert ONNX → TensorFlow
onnx2tf.convert(
    input_onnx_file_path=onnx_model_path,
    output_folder_path=tf_model_path,
    keep_ncw_or_nchw_or_ncdhw_input_names=[
        "/fnet/layer1/layer1.0/relu_2/Relu_output_0",
        "/fnet/layer1/layer1.1/relu_1/Relu_output_0"
    ],
    keep_nwc_or_nhwc_or_ndhwc_input_names=["/fnet/layer1/layer1.1/Add_output_0"],
    param_replacement_file=param_replacement_path
)
# Step 3: Convert TensorFlow → TensorFlow Lite
try:
    converter = tf.lite.TFLiteConverter.from_saved_model(tf_model_path)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_model = converter.convert()
    # Save the .tflite model
    with open(tflite_model_path, "wb") as f:
        f.write(tflite_model)
    print("Successful conversion! .tflite model saved as:", tflite_model_path)
except Exception as e:
    print("Error during conversion:", str(e))
Resulting in:
INFO: 46 / 1727
INFO: onnx_op_type: Add onnx_op_name: /cnet/layer2/layer2.0/Add
INFO: input_name.1: /cnet/layer2/layer2.0/downsample/downsample.0/Conv_output_0 shape: [1, 96, 192, 384] dtype: float32
INFO: input_name.2: /cnet/layer2/layer2.0/relu_1/Relu_output_0 shape: [1, 96, 192, 384] dtype: float32
INFO: output_name.1: /cnet/layer2/layer2.0/Add_output_0 shape: [1, 96, 192, 384] dtype: float32
DEBUG: before_op_output_shape_trans = True
INFO: tf_op_type: add
INFO: input.1.x: name: tf.math.add_13/Add:0 shape: (1, 192, 384, 96) dtype: <dtype: 'float32'>
INFO: input.2.y: name: tf.nn.relu_13/Relu:0 shape: (1, 192, 384, 96) dtype: <dtype: 'float32'>
INFO: output.1.output: name: tf.math.add_16/Add:0 shape: (1, 192, 384, 96) dtype: <dtype: 'float32'>
INFO: 47 / 1727
INFO: onnx_op_type: Add onnx_op_name: /fnet/layer1/layer1.1/Add
INFO: input_name.1: /fnet/layer1/layer1.0/relu_2/Relu_output_0 shape: [2, 64, 384, 768] dtype: float32
INFO: input_name.2: /fnet/layer1/layer1.1/relu_1/Relu_output_0 shape: [2, 64, 384, 768] dtype: float32
INFO: output_name.1: /fnet/layer1/layer1.1/Add_output_0 shape: [2, 64, 384, 768] dtype: float32
DEBUG: before_op_output_shape_trans = True
ERROR: The trace log is below.
Traceback (most recent call last):
File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/onnx2tf/utils/common_functions.py", line 313, in print_wrapper_func
result = func(*args, **kwargs)
File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/onnx2tf/utils/common_functions.py", line 386, in inverted_operation_enable_disable_wrapper_func
result = func(*args, **kwargs)
File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/onnx2tf/utils/common_functions.py", line 56, in get_replacement_parameter_wrapper_func
func(*args, **kwargs)
File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/onnx2tf/ops/Add.py", line 283, in make_node
merge_two_consecutive_identical_ops_into_one(
File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/onnx2tf/utils/common_functions.py", line 5475, in merge_two_consecutive_identical_ops_into_one
tf.math.add(
File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/tensorflow/python/ops/weak_tensor_ops.py", line 142, in wrapper
return op(*args, **kwargs)
File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/tf_keras/src/layers/core/tf_op_layer.py", line 119, in handle
return TFOpLambda(op)(*args, **kwargs)
File "/home/user/Exporter/onnx2tflite/lib/python3.10/site-packages/tf_keras/src/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
ValueError: Exception encountered when calling layer "tf.math.add_17" (type TFOpLambda).
Dimensions must be equal, but are 768 and 384 for '{{node tf.math.add_17/Add}} = AddV2[T=DT_FLOAT](Placeholder, Placeholder_1)' with input shapes: [2,768,384,64], [2,384,768,64].
Call arguments received by layer "tf.math.add_17" (type TFOpLambda):
• x=tf.Tensor(shape=(2, 768, 384, 64), dtype=float32)
• y=tf.Tensor(shape=(2, 384, 768, 64), dtype=float32)
• name='/fnet/layer1/layer1.1/Add'
ERROR: input_onnx_file_path: raft_model_fixed.onnx
ERROR: onnx_op_name: /fnet/layer1/layer1.1/Add
ERROR: Read this and deal with it. https://github.com/PINTO0309/onnx2tf#parameter-replacement
ERROR: Alternatively, if the input OP has a dynamic dimension, use the -b or -ois option to rewrite it to a static shape and try again.
ERROR: If the input OP of ONNX before conversion is NHWC or an irregular channel arrangement other than NCHW, use the -kt or -kat option.
ERROR: Also, for models that include NonMaxSuppression in the post-processing, try the -onwdt option.
Some things might seem weird, but I already tried to fix the mismatched dimensions that the Add node refuses to (can't) add. As I understand it, the first of the two tensors is not correctly converted from NCHW to NHWC, which leaves the two inputs with different dimension orders. (That's why there is a .json file with the correct permutation, to force NHWC for those specific tensors.)
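To make the mismatch concrete, here is a minimal sketch in plain Python. It is not what onnx2tf does internally; it only shows which permutations of the NCHW shape from the log would produce the two shapes in the error message. The helper name `transposed_shape` is hypothetical, and it relies only on the rule that output dimension `i` of a transpose equals input dimension `perm[i]`.

```python
# Shape of both Add inputs as reported on the ONNX side of the log (NCHW).
NCHW_SHAPE = (2, 64, 384, 768)


def transposed_shape(shape, perm):
    """Return the shape a tensor of `shape` would have after transposing with `perm`.

    Hypothetical helper: output dimension i is input dimension perm[i],
    which is the convention used by tf.transpose and numpy.transpose.
    """
    if sorted(perm) != list(range(len(shape))):
        raise ValueError(f"{perm} is not a permutation of the axes of {shape}")
    return tuple(shape[axis] for axis in perm)


# The usual NCHW -> NHWC permutation gives the shape of the second input (y).
nhwc = transposed_shape(NCHW_SHAPE, (0, 2, 3, 1))
# A permutation that also swaps H and W gives the shape of the first input (x).
nwhc = transposed_shape(NCHW_SHAPE, (0, 3, 2, 1))

print("perm [0, 2, 3, 1]:", nhwc)
print("perm [0, 3, 2, 1]:", nwhc)
print("same shape:", nhwc == nwhc)
```

If x really arrives with H and W swapped, a further permutation of [0, 2, 1, 3] would bring it back to the NHWC shape of y. The log does not confirm that this is what happened inside the converter, so treat it as one possible reading of the error.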
In the error output I also included the step before the error, in which the Add node has no problem adding two tensors. Could it be because of the increase of the nodes from 1 to 2? Is my .onnx model faulty, resulting in this mess, or is it something else?
Or are my installed versions of TensorFlow and onnx(-tf) incompatible?
- TensorFlow: 2.19.0
- onnx: 1.16.1
- onnx-tf: 1.27.1
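Since the script imports `onnx2tf` while the list above names `onnx-tf`, it may help to print exactly which distributions are installed in the environment. This is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the package list is an assumption based on the imports in the script.

```python
from importlib import metadata

# Distribution names to look up. "onnx2tf" is included because the script
# imports it; "onnx-tf" is included because it is the name given in the list.
PACKAGES = ("tensorflow", "onnx", "onnx2tf", "onnx-tf")


def installed_versions(names=PACKAGES):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Running this inside the same virtual environment as the conversion script shows which of the two converter packages is actually present.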
param_replacement = {
    "format_version": 1,
    "operations": [
        {
            "op_name": "/fnet/layer1/layer1.1/Add",
            "param_target": "inputs",
            "param_name": "/fnet/layer1/layer1.0/relu_2/Relu_output_0",
            "values": [2, 384, 768, 64]
        }
    ]
}
I just forced the problematic tensor to have matching dimension (Syntax from link)