I am building a binary classifier where the class I want to predict is present only <2% of the time. I am using PyTorch.
The last layer could be LogSoftmax or Softmax:

`self.softmax = nn.Softmax(dim=1)` or `self.softmax = nn.LogSoftmax(dim=1)`
My questions:

1. Should I use Softmax, since it will provide outputs that sum to 1 and let me check performance at various probability thresholds? Is that understanding correct?
2. If I use Softmax, can I use `cross_entropy` loss? This seems to suggest that it is okay to use.
3. If I use LogSoftmax, can I use `cross_entropy` loss? This seems to suggest that I shouldn't.
4. If I use Softmax, is there any better option than `cross_entropy` loss?

`cross_entropy = nn.CrossEntropyLoss(weight=class_wts)`
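For reference, here is a minimal sketch of the setup I am asking about. The network, layer sizes, and class weight values are illustrative placeholders, not my real model; the weights are chosen roughly as the inverse of the ~2% positive rate.

```python
import torch
import torch.nn as nn

# Toy two-class network (sizes are placeholders, not my real model)
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(10, 2)        # two output units, one per class
        self.softmax = nn.Softmax(dim=1)  # or nn.LogSoftmax(dim=1)

    def forward(self, x):
        return self.softmax(self.fc(x))

# Placeholder weights to counter the ~2% positive rate
class_wts = torch.tensor([1.0, 49.0])
cross_entropy = nn.CrossEntropyLoss(weight=class_wts)

model = Net()
x = torch.randn(4, 10)            # batch of 4 samples, 10 features
y = torch.tensor([0, 0, 0, 1])    # mostly the majority class
probs = model(x)                  # with Softmax, each row sums to 1
loss = cross_entropy(probs, y)    # the combination I am asking about
```

With Softmax in place, `probs` rows sum to 1, which is what I would threshold at different cut-offs for the minority class.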