I'm benchmarking Keras pre-trained models (VGG, ResNet, Inception, ...) for image classification on my own data (electrical equipment), and I was wondering whether there are best practices for making such a benchmark meaningful. I have 120 labeled images. I have already tried data augmentation, checkpoints, and early stopping.
I'm looking for best practices on:
- Getting reproducible results: every time I train my model, I get different results. I tried the tips in this post, without success: How to get reproducible results in keras
- Fully-connected layers: do we need complex FC layers on top of the pre-trained base? What are the best practices in transfer learning?
- Any other tips?
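
To make the reproducibility point concrete, here is a minimal sketch of the kind of seeding I mean. `seed_everything` is a hypothetical helper name, not something from Keras, and the TensorFlow calls assume a recent TensorFlow 2 release (`tf.keras.utils.set_random_seed` from 2.7, `tf.config.experimental.enable_op_determinism` from 2.8). NumPy and TensorFlow are only seeded if they are installed.

```python
import os
import random


def seed_everything(seed=42):
    """Seed every random number generator that is available (hypothetical helper)."""
    # Setting PYTHONHASHSEED here only affects subprocesses started later;
    # to fix string hashing in this process it must be set before Python starts.
    os.environ["PYTHONHASHSEED"] = str(seed)
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed, nothing to seed
    try:
        import tensorflow as tf
        # Seeds the Python, NumPy and TensorFlow generators in one call.
        tf.keras.utils.set_random_seed(seed)
        # Asks TensorFlow to use deterministic op implementations (can be slower).
        tf.config.experimental.enable_op_determinism()
    except (ImportError, AttributeError):
        pass  # TensorFlow missing or too old for these calls
    return seed


# Re-seeding with the same value should reproduce the same random draws.
seed_everything(42)
first = random.random()
seed_everything(42)
second = random.random()
print(first == second)
```

The call would go at the very top of the training script, before the data pipeline and the model are built. Small differences between runs can still remain in some GPU setups, so averaging each model's score over a few seeds may be more informative than comparing single runs.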
Thank you in advance!
