I encountered something strange in a testing-on-training experiment: the `val_loss` is completely different from the training loss, even though both are evaluated on the exact same data `(X, Y)` with the same `batch_size`.

Below is the code I used to train on one batch:
```python
X, Y = valid_datagen.next()
batch_size = len(X[0])
joint_model.fit(X, Y,
                batch_size=batch_size,
                epochs=1,
                verbose=1,
                validation_data=(X, Y))
```
```
Train on 12 samples, validate on 12 samples
Epoch 1/1
12/12 [==============================] - 38s 3s/step
 - loss: 0.7510 - q_mask_a_loss: 0.4739 - r_mask_a_loss: 0.6610
 - q_mask_b_loss: 0.4718 - r_mask_b_loss: 0.3164
 - pred_a_loss: 1.8092 - pred_b_loss: 0.2238
 - q_mask_a_F1: 0.8179 - r_mask_a_F1: 0.5318
 - q_mask_b_F1: 0.8389 - r_mask_b_F1: 0.6134
 - pred_a_acc: 0.0833 - pred_b_acc: 1.0000
 - val_loss: 7.0257 - val_q_mask_a_loss: 6.9748 - val_r_mask_a_loss: 14.9849
 - val_q_mask_b_loss: 6.9748 - val_r_mask_b_loss: 14.9234
 - val_pred_a_loss: 0.6919 - val_pred_b_loss: 0.6944
 - val_q_mask_a_F1: 0.0000e+00 - val_r_mask_a_F1: 0.0000e+00
 - val_q_mask_b_F1: 0.0000e+00 - val_r_mask_b_F1: 0.0000e+00
 - val_pred_a_acc: 1.0000 - val_pred_b_acc: 0.0000e+00
```
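As far as I understand, `fit()` reports the training loss computed on the forward pass *before* the weight update, while `val_loss` is computed *after* the update, so some gap is expected; but that alone should not explain 0.75 vs 7.0 on one batch. Here is a minimal self-contained sketch of that check (placeholder model, not my actual `joint_model`):

```python
# Hypothetical minimal check (placeholder model, not my joint_model):
# fit()'s reported loss should match evaluate() *before* the update,
# while val_loss should match evaluate() *after* the update.
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

np.random.seed(0)
X = np.random.rand(12, 8)
Y = np.random.rand(12, 1)

model = Sequential([Dense(1, input_shape=(8,))])
model.compile(optimizer='sgd', loss='mse')

print('evaluate before fit:', model.evaluate(X, Y, batch_size=12, verbose=0))
model.fit(X, Y, batch_size=12, epochs=1, verbose=1, validation_data=(X, Y))
print('evaluate after fit: ', model.evaluate(X, Y, batch_size=12, verbose=0))
```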
Note:

- The training `loss` is 0.7510 while the `val_loss` is 7.0257.
- I've already set `batch_size` equal to the number of samples, i.e. I am training on only one batch.
- I am using keras 2.2.0 with tensorflow backend 1.5.0.
- `joint_model.evaluate(X, Y, batch_size=batch_size)` gives the same result as the validation (see the learning-phase sketch below).
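To check whether the mismatch comes from training vs inference mode (the Keras learning phase, which affects layers like `BatchNormalization` even when they are frozen), I would compare the model's outputs in both phases. A sketch, using a placeholder model since my real `joint_model` has multiple inputs and outputs:

```python
# Sketch: compare outputs with learning phase 1 (training) vs 0 (inference).
# Placeholder model; a large difference would point at phase-dependent layers.
import numpy as np
import keras.backend as K
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization

np.random.seed(0)
X = np.random.rand(12, 8)

model = Sequential([Dense(4, input_shape=(8,)),
                    BatchNormalization(),
                    Dense(1)])

f = K.function(model.inputs + [K.learning_phase()], model.outputs)
out_train = f([X, 1])[0]  # training phase: BN uses batch statistics
out_test = f([X, 0])[0]   # inference phase: BN uses moving averages
print('max abs diff:', np.abs(out_train - out_test).max())
```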
As for `joint_model`, it is nothing but a feed-forward CNN with the weights of the first several layers frozen. There is no Dropout layer anywhere.
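The freezing follows the usual Keras pattern, roughly like this (simplified sketch with placeholder layers, not the real architecture):

```python
# Simplified sketch of the freezing (placeholder architecture):
from keras.models import Sequential
from keras.layers import Conv2D, Flatten, Dense

model = Sequential([
    Conv2D(8, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    Conv2D(8, (3, 3), activation='relu'),
    Flatten(),
    Dense(1),
])
# Freeze the first several layers, then recompile so it takes effect.
for layer in model.layers[:2]:
    layer.trainable = False
model.compile(optimizer='adam', loss='mse')
```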
I have no idea what is going on here. Does anyone know what the potential reasons might be, or how I could debug this? Any suggestions are welcome.