I saw a paper where someone was able to initialize the hidden state of an RNN using a feed-forward NN. I was trying to figure out how this could be done, but I keep getting error messages while building the model. I have time-series data with at least 100 values per run for 2000 independent runs. I want the input to be used both to build the hidden state and as the input to the RNN itself.
Currently this is how I was trying to create the model:
from tensorflow.keras import layers, Model, Input
units = 200
N_inputs = 4
N_outputs = 10
inputs = Input(shape = (None, N_inputs))
state_init = layers.Dense(units)(inputs)
GRU_layer = layers.GRU(units = units, return_sequences = True)(inputs, initial_state = state_init)
outputs = layers.Dense(units = N_outputs)(GRU_layer)
model = Model(inputs, outputs)
I am getting this error:
ValueError: An 'initial_state' was passed that is not compatible with 'cell.state_size'. Received 'state_spec'=ListWrapper([InputSpec(shape=(None, None, 200), ndim=3)]); however 'cell.state_size' is [200]
Is this even possible or do I have to create some custom code for this? Any help would be greatly appreciated.
The paper is here: https://ieeexplore.ieee.org/document/7966138
- The state must not have a time dimension. Applying the Dense layer to the same sequence as the GRU doesn't make much sense. – xdurch0, Feb 27, 2024 at 7:46
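For reference, a minimal sketch of one way to follow that advice: collapse the time dimension before the Dense layer so the initial state has shape (batch_size, units). Using the first timestep here is only an illustration; the paper may build the state from different features.

from tensorflow.keras import layers, Model, Input

units = 200
N_inputs = 4
N_outputs = 10

inputs = Input(shape = (None, N_inputs))
# Collapse the time dimension: use only the first timestep (illustrative choice),
# giving shape (batch_size, N_inputs)
first_step = layers.Lambda(lambda x: x[:, 0, :])(inputs)
# Feed-forward network producing the initial hidden state, shape (batch_size, units)
state_init = layers.Dense(units, activation = "tanh")(first_step)
# The GRU still consumes the full sequence but starts from the learned state
GRU_layer = layers.GRU(units = units, return_sequences = True)(inputs, initial_state = state_init)
outputs = layers.Dense(units = N_outputs)(GRU_layer)
model = Model(inputs, outputs)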
- The error you're encountering arises because the initial_state argument in the GRU layer expects a tensor with shape (batch_size, units), or a nested list of tensors if the GRU layer is a bidirectional layer. However, in your code you are passing a tensor with shape (batch_size, None, units) as the initial_state. – Priya T, Apr 22, 2024 at 5:53
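A sketch of an alternative layout, assuming the initial state is built from separate per-run (static) features rather than from the sequence itself; the name static_inputs and the size N_static are illustrative, not taken from the question:

from tensorflow.keras import layers, Model, Input

units = 200
N_inputs = 4
N_outputs = 10
N_static = 8  # hypothetical number of per-run features

seq_inputs = Input(shape = (None, N_inputs), name = "sequence")
static_inputs = Input(shape = (N_static,), name = "static")  # shape (batch_size, N_static)
# Feed-forward initializer network -> (batch_size, units), matching cell.state_size
state_init = layers.Dense(units, activation = "tanh")(static_inputs)
GRU_layer = layers.GRU(units = units, return_sequences = True)(seq_inputs, initial_state = state_init)
outputs = layers.Dense(units = N_outputs)(GRU_layer)
model = Model([seq_inputs, static_inputs], outputs)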