TensorFlow pass gradient unchanged

Question

Say I have some custom operation binarizer used in a neural network. The operation takes a Tensor and constructs a new Tensor. I would like to modify that operation such that it is only used in the forward pass. In the backward pass, when gradients are calculated, it should just pass through the gradients reaching it.

More concretely, say binarizer is:

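import tensorflow as tf

def binarizer(input):
    # Map input in [-1, 1] (e.g. a tanh output) to a probability in [0, 1].
    prob = tf.truediv(tf.add(1.0, input), 2.0)
    bernoulli = tf.contrib.distributions.Bernoulli(p=prob, dtype=tf.float32)
    # Sample in {0, 1}, then rescale to {-1, +1}.
    return 2 * bernoulli.sample() - 1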

and I set up my network:

# ...

h1_before_my_op = tf.nn.tanh(tf.matmul(x, W) + bias_h1)
h1 = binarizer(h1_before_my_op)

# ...

loss = tf.reduce_mean(tf.square(y - y_true))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

How do I tell TensorFlow to skip gradient calculation in the backward pass?

I tried defining a custom operation as described in this answer. However, py_func cannot return Tensors (that is not what it is made for), so I get:

UnimplementedError (see above for traceback): Unsupported object type Tensor

You want your subgraph to behave like `tf.identity` for the backward pass, so you can use the trick described here: http://stackoverflow.com/questions/36456436/how-can-i-define-only-the-gradient-for-a-tensorflow-subgraph/36480182#36480182 – Yaroslav Bulatov, Oct 28 '16 at 18:15
@YaroslavBulatov Nice! I finally had time to implement this today and it seems to work! – fabian789, Oct 31 '16 at 09:01

score 1 · Answer 1 · answered Oct 28 '16 at 12:58

You're looking for tf.stop_gradient(input, name=None):

Stops gradient computation.

When executed in a graph, this op outputs its input tensor as-is.

h1 = binarizer(h1_before_my_op)
h1 = tf.stop_gradient(h1)

answered Oct 28 '16 at 12:58

nessuno

Thanks. But it seems that this stops gradients rather than passing them on? I.e. that everything to the left of `h1` does not receive gradients in my tests: All weights associated with the stages to the left of `h1` do not get updated by the training step, the ones on the right do... – fabian789 Oct 28 '16 at 15:37
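
For what it's worth, one common way to get the "identity in the backward pass" behaviour asked for in the question is to wrap only the difference between the binarized value and the input in tf.stop_gradient. This is a minimal sketch and not necessarily the exact code from the linked answer. It assumes TensorFlow 2.x with eager execution, so the binarizer is rewritten without tf.contrib; binarizer_straight_through is a hypothetical helper name, and the toy loss and input values are made up for illustration.

```python
import tensorflow as tf

def binarizer(x):
    # Same sampling rule as in the question: +1 with probability (1 + x) / 2,
    # otherwise -1. Rewritten without tf.contrib (assumption: TF 2.x).
    prob = (1.0 + x) / 2.0
    uniform = tf.random.uniform(tf.shape(x), dtype=x.dtype)
    return tf.where(uniform < prob, tf.ones_like(x), -tf.ones_like(x))

def binarizer_straight_through(x):
    # Hypothetical helper. Forward value: x + (binarizer(x) - x), i.e. the
    # binarized tensor up to float rounding. Backward: the stop_gradient term
    # contributes no gradient, so the op differentiates like the identity.
    return x + tf.stop_gradient(binarizer(x) - x)

# Toy check: gradients should reach x as if the binarizer were not there.
x = tf.Variable([[0.3, -0.8, 0.0]])
with tf.GradientTape() as tape:
    h1_before_my_op = tf.nn.tanh(x)
    h1 = binarizer_straight_through(h1_before_my_op)
    loss = tf.reduce_sum(3.0 * h1)

grad = tape.gradient(loss, x)  # analytically 3 * (1 - tanh(x)^2)
```

In the graph-mode code from the question, the same idea would presumably read h1 = h1_before_my_op + tf.stop_gradient(binarizer(h1_before_my_op) - h1_before_my_op), so the weights feeding h1_before_my_op still receive gradients from the loss.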