After studying autograd, I tried to write a loss function myself. Here is my loss:
import torch
import torch.nn as nn
import torch.nn.functional as F

def myCEE(outputs, targets):
    # A = log(sum(exp(logits))) over the class dimension
    exp = torch.exp(outputs)
    A = torch.log(torch.sum(exp, dim=1))
    # B = logit of the target class, picked out with a one-hot mask
    hadamard = F.one_hot(targets, num_classes=10).float() * outputs
    B = torch.sum(hadamard, dim=1)
    # summed cross-entropy over the batch
    return torch.sum(A - B)
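For reference, I believe the same sum can also be written with torch.logsumexp instead of exp followed by log. This is only a sketch, I have not checked whether it actually behaves differently, and the name myCEE_logsumexp is just something I made up for comparison:

def myCEE_logsumexp(outputs, targets):
    # same formula as myCEE, but computing log(sum(exp(...))) in one call
    A = torch.logsumexp(outputs, dim=1)
    B = torch.sum(F.one_hot(targets, num_classes=10).float() * outputs, dim=1)
    return torch.sum(A - B)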
and I compared it with torch.nn.CrossEntropyLoss.
Here are the results:
# take one batch from the training DataLoader
for i, j in train_dl:
    inputs = i
    targets = j
    break

outputs = model(inputs)

myCEE(outputs, targets) : tensor(147.5397, grad_fn=<SumBackward0>)

loss_func = nn.CrossEntropyLoss(reduction='sum')
loss_func(outputs, targets) : tensor(147.5397, grad_fn=<NllLossBackward>)
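To make the comparison reproducible without my model and DataLoader, here is a small self-contained check on random logits. The shapes and the names logits/labels are just placeholders for my real batch:

# compare myCEE (defined above) with nn.CrossEntropyLoss on random data
logits = torch.randn(32, 10, requires_grad=True)  # fake batch of logits
labels = torch.randint(0, 10, (32,))              # fake integer class labels

mine = myCEE(logits, labels)
ref = nn.CrossEntropyLoss(reduction='sum')(logits, labels)
print(torch.allclose(mine, ref, atol=1e-5))       # expect True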
The values were the same.
I thought that, since these are different functions, the grad_fn would differ, but that it wouldn't cause any problems.
But something happened!
After 4 epochs, the loss values turned to nan.
In contrast to myCEE, training with nn.CrossEntropyLoss went fine.
So, I wonder if there is a problem with my function.
After reading some posts about nan problems, I stacked more convolutions onto the model.
As a result, a 39-epoch training run did not produce an error.
Nevertheless, I'd like to know the difference between myCEE and nn.CrossEntropyLoss.