I need to take the derivative of a neural network implemented in TensorFlow/Keras 2.0 (`super_model`) with respect to its input data. The model is composed of multiple basic models (`x1` to `x6`) because of my previous issue, explained in this post. (Thus, I get an error if I pass only `angles` to the model.) See the following code:
import numpy

angles = [0] * 21
data = {
    'x1_model_input': numpy.array([angles[0:3]]),
    'x2_model_input': numpy.array([angles[3:6]]),
    'x3_model_input': numpy.array([[angles[6]]]),
    'x4_model_input': numpy.array([angles[7:13]]),
    'x5_model_input': numpy.array([angles[13:15]]),
    'x6_model_input': numpy.array([angles[15:21]])
}
# this super_model prediction is working well
pred = super_model.predict(data) # `pred` shape is `shape=(1,1)`
Now, I need to take the derivative of the network with respect to the input data using `tf.GradientTape`. I have tried the following, aiming to get the gradient of the network's output for the data specified above:
with tf.GradientTape() as tape:
    pred = super_model(data)

# does not work as `data` is a dictionary
# the error is:
#   ...
#   return pywrap_tfe.TFE_Py_TapeGradient(
#   AttributeError: 'numpy.ndarray' object has no attribute '_id'
grad = tape.gradient(pred, data)
But `data` is a dictionary, so I cannot call `tape.watch` on it and then `tape.gradient`. I also cannot call `tf.convert_to_tensor` on `data` itself, as it is a dictionary.
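For reference, here is a self-contained toy that mirrors my setup and the per-value conversion I have been experimenting with. The model and names here (`toy_model`, `a_input`, `b_input`) are stand-ins I made up in place of `super_model`; the idea is to convert each dictionary *value* to a tensor with `tf.convert_to_tensor`, watch each one, and then pass the whole dictionary to `tape.gradient`. I am not sure this is the right approach for `super_model`:

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for super_model: two named inputs, one scalar output.
a_in = tf.keras.Input(shape=(3,), name='a_input')
b_in = tf.keras.Input(shape=(2,), name='b_input')
merged = tf.keras.layers.Concatenate()([a_in, b_in])
out = tf.keras.layers.Dense(1)(merged)
toy_model = tf.keras.Model([a_in, b_in], out)

# Same structure as my `data`: a dict of numpy arrays keyed by input name.
data = {
    'a_input': np.zeros((1, 3)),
    'b_input': np.zeros((1, 2)),
}

# The dict itself cannot be converted, but each value can.
tensors = {k: tf.convert_to_tensor(v, dtype=tf.float32)
           for k, v in data.items()}

with tf.GradientTape() as tape:
    for t in tensors.values():
        tape.watch(t)          # watch each input tensor individually
    pred = toy_model(tensors)  # Keras accepts a dict keyed by input names

# `tape.gradient` accepts a nested structure of sources and returns
# gradients with the same structure (here: a dict keyed like `tensors`).
grads = tape.gradient(pred, tensors)
```

On this toy, `grads` comes back as a dictionary of gradients with the same keys and shapes as the inputs, but I do not know whether the same pattern carries over cleanly to the composed `super_model`.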
So, my question is: how can I continue this work without changing the structure of `super_model`?