I recently came across tf.nn.sparse_softmax_cross_entropy_with_logits and I cannot figure out how it differs from tf.nn.softmax_cross_entropy_with_logits.
Is the only difference that the training labels y have to be one-hot encoded when using softmax_cross_entropy_with_logits, whereas sparse_softmax_cross_entropy_with_logits takes plain class indices?
Reading the API documentation, I was unable to find any other difference compared to softmax_cross_entropy_with_logits. So why do we need the extra function?
Shouldn't softmax_cross_entropy_with_logits produce the same results as sparse_softmax_cross_entropy_with_logits if it is supplied with one-hot encoded training labels?
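
For concreteness, here is a small sketch of what I would expect to be equivalent (TensorFlow 2.x eager execution; the logits and labels are just made-up toy values):

    import tensorflow as tf

    # Made-up toy batch: 3 examples, 4 classes (values are arbitrary).
    logits = tf.constant([[2.0, 1.0, 0.1, -1.0],
                          [0.5, 2.5, 0.3,  0.0],
                          [1.0, 0.2, 3.0,  0.7]])

    # Integer class indices, as expected by the sparse variant.
    class_ids = tf.constant([0, 1, 2])

    # The same labels, one-hot encoded for the dense variant.
    one_hot_labels = tf.one_hot(class_ids, depth=4)

    sparse_loss = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=class_ids, logits=logits)
    dense_loss = tf.nn.softmax_cross_entropy_with_logits(
        labels=one_hot_labels, logits=logits)

    # I would expect these two per-example loss vectors to match
    # (up to floating-point noise).
    print(sparse_loss.numpy())
    print(dense_loss.numpy())

If these really do always agree, is the sparse version just a convenience to skip the one-hot encoding step, or is there more to it?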