I found in other questions that the standard way to do L2 regularization in convolutional networks with TensorFlow is as follows.
For each conv2d layer, set the parameter kernel_regularizer to an l2_regularizer, like this:
import tensorflow as tf

regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
layer2 = tf.layers.conv2d(
    inputs,
    filters,
    kernel_size,
    kernel_regularizer=regularizer)
Then, when building the loss, collect the regularization losses:
reg_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
reg_constant = 0.01 # Choose an appropriate one.
loss = my_normal_loss + reg_constant * sum(reg_losses)
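For completeness, a minimal sketch of how that combined loss would then be driven by an optimizer; the optimizer choice and learning rate here are placeholders of mine, not part of the original recipe:

optimizer = tf.train.AdamOptimizer(learning_rate=1e-3)
# Only the loss passed to minimize() is optimized, so the L2 penalty
# takes effect only because it was added into loss above.
train_op = optimizer.minimize(loss)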
Many people, including me, have made the mistake of skipping the 2nd step, which suggests the meaning of kernel_regularizer is not well understood. I have an assumption that I can't confirm, which is:
By setting kernel_regularizer for a single layer, you are telling the network to forward the kernel weights of this layer to the loss function at the end of the network, so that later you have the option (via another piece of code you write) to include them in the final regularization term of the loss function. Nothing more.
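If that is right, it should be easy to observe: after building the layer, the penalty tensor appears in the REGULARIZATION_LOSSES collection, but nothing uses it until you add it to a loss yourself. A minimal sketch of that check (the input shape and filter settings are arbitrary values I picked):

import tensorflow as tf

inputs = tf.placeholder(tf.float32, [None, 28, 28, 1])
regularizer = tf.contrib.layers.l2_regularizer(scale=0.1)
layer = tf.layers.conv2d(inputs, filters=32, kernel_size=3,
                         kernel_regularizer=regularizer)

# The regularizer only registered a penalty tensor in a graph collection;
# it is not wired into any loss automatically.
print(tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES))
# e.g. [<tf.Tensor 'conv2d/kernel/Regularizer/l2_regularizer:0' ...>]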
Is it correct or is there a better explanation?