Using GradientTape for A tf.keras Neural Network with dictionary input (composed from multiple models)

Question

I need to take a derivative from a neural network implemented in Tensorflow/Keras 2.0 (super_model). This model has been composed of multiple basic models (x1 to x6) due to my previous issue explained in this post. (Thus, I will get an error if only passing angles to the model.) See the following code:

angles=[0] * 21

data = {
    'x1_model_input': numpy.array([angles[0:3]]),
    'x2_model_input': numpy.array([angles[3:6]]),
    'x3_model_input': numpy.array([[angles[6]]]), 
    'x4_model_input': numpy.array([angles[7:13]]), 
    'x5_model_input': numpy.array([angles[13:15]]), 
    'x6_model_input': numpy.array([angles[15:21]])
}

# this super_model prediction is working well
pred = super_model.predict(data) # `pred` shape is `shape=(1,1)`

Now, I need to take a derivative of the network based on the input data using GradientTape. I have tried the following and aim to get the gradient value of the network for the above-specified data:

with tf.GradientTape() as tape:
    pred = super_model(data)
# does not work as `data` is a dictionary
# the error is:
#         ...
#         return pywrap_tfe.TFE_Py_TapeGradient(
#     AttributeError: 'numpy.ndarray' object has no attribute '_id'
grad = tape.gradient(pred, data)

But, data is a dictionary and I cannot call tape.watch and then gradient. I cannot also call tf.convert_to_tesnor over data as it is a dictionary. So, my question is how I can continue the work without changing the structure of the super_model?

Thanks for your detailed question. What is the shape of `pred` and what is the error you are getting? — AloneTogether, Nov 13 '21 at 14:10
@AloneTogether The post has been updated. The `pred` shape is `(1,1)` and the error is `AttributeError: 'numpy.ndarray' object has no attribute '_id'`. — OmG, Nov 13 '21 at 14:19

AloneTogether · Accepted Answer · 2021-11-13T15:07:56.277

I am not sure if it is a viable option for you, but your code would work using tf.Variable instead of numpy:

import tensorflow as tf

angles=[0] * 21

test_tensor = tf.Variable([angles[0:3]], dtype=tf.float32)
data = {
    'x1_model_input': test_tensor,
    'x2_model_input': tf.Variable([angles[3:6]], dtype=tf.float32),
    'x3_model_input': tf.Variable([[angles[6]]], dtype=tf.float32), 
    'x4_model_input': tf.Variable([angles[7:13]], dtype=tf.float32), 
    'x5_model_input': tf.Variable([angles[13:15]], dtype=tf.float32), 
    'x6_model_input': tf.Variable([angles[15:21]], dtype=tf.float32)
}

with tf.GradientTape() as tape:
    pred = tf.constant([[1.0]]) * test_tensor

grad = tape.gradient(pred, data) 
tf.print(grad)

{'x1_model_input': [[1 1 1]],
 'x2_model_input': None,
 'x3_model_input': None,
 'x4_model_input': None,
 'x5_model_input': None,
 'x6_model_input': None}

Using GradientTape for A tf.keras Neural Network with dictionary input (composed from multiple models)

1 Answers1