I trained multiple models with different configurations for a custom hyperparameter search. I use pytorch_lightning and its logging (TensorBoardLogger). When I run my training script after Task.init(), ClearML auto-creates a Task and connects the logger output to the server.
For each training stage (train, val and test) I log the following scalars at each epoch: loss, acc and iou.
When I have multiple configurations, e.g. networkA and networkB, the first training run logs its values to loss, acc and iou, but the second logs to networkB:loss, networkB:acc and networkB:iou. This makes the values incomparable between runs.
My training loop with Task initialization looks like this:
    from clearml import Task

    names = ['networkA', 'networkB']
    for name in names:
        # one separate ClearML task per configuration
        task = Task.init(project_name="NetworkProject", task_name=name)
        pl_train(name)
        task.close()
The method pl_train is a wrapper for the whole training with PyTorch Lightning. There is no ClearML code inside this method.
Do you have any hint on how to properly use a loop in a script so that each iteration gets a completely separate task?
Edit: ClearML version was 0.17.4. Issue is fixed in main branch.
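For readers still on 0.17.4, one possible workaround is to launch each configuration in its own Python process, so that every Task.init() starts from a clean interpreter state. This is an assumption, not something verified against that release. The sketch below uses only the standard library. The script name train_one.py and its --name flag are hypothetical: it stands for a single-run script that calls Task.init(), pl_train(name) and task.close() once.

```python
import subprocess
import sys

names = ["networkA", "networkB"]


def build_command(name, script="train_one.py"):
    # "train_one.py" and "--name" are hypothetical placeholders for a
    # script that trains exactly one configuration and then exits.
    return [sys.executable, script, "--name", name]


def run_all(names, runner=subprocess.run):
    # Run the configurations one after another, each in a fresh process.
    # check=True makes subprocess.run raise if a run exits with an error.
    for name in names:
        runner(build_command(name), check=True)
```

Calling run_all(names) then starts one process per configuration. The runner parameter exists only so the launch logic can be exercised without actually starting a training run.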