I am trying to figure out how to match activation=sigmoid and activation=softmax with the correct model.compile() loss parameters, specifically those associated with binary_crossentropy.
I have researched related topics and read the docs. I have also built a model and got it working with sigmoid but not with softmax, and I cannot get it working properly with the "from_logits" parameter.
Specifically, here it says:
Args:
from_logits: Whether output is expected to be a logits tensor. By default, we consider that output encodes a probability distribution.
This says to me that if you use a sigmoid activation you want "from_logits=True".  And for softmax activation you want "from_logits=False" by default.  Here I am assuming that sigmoid provides logits and softmax provides a probability distribution.
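To check that assumption, here is a small plain-Python sketch (no Keras involved; the function names and input values are made up for illustration) that prints what each activation returns. As far as I can tell, both activations map raw scores to values between 0 and 1, so "logits" would seem to refer to the values before the activation rather than to the sigmoid output:

```python
import math

def sigmoid(z):
    """Map one raw score (logit) to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Map a list of raw scores (logits) to values that sum to 1."""
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

logits = [-2.0, 0.0, 3.0]  # hypothetical pre-activation values
sig_out = [sigmoid(z) for z in logits]
soft_out = softmax(logits)

print(sig_out)   # each value lies strictly between 0 and 1
print(soft_out)  # values lie between 0 and 1 and sum to 1
```

If this reading is right, a sigmoid output is already a probability, which would match the default "from_logits=False".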
Next is some code:
from keras.models import Sequential
from keras.layers import LSTM, Dropout, Dense

model = Sequential()
model.add(LSTM(units=128,
               input_shape=(n_timesteps, n_features), 
               return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(units=64, return_sequences=True))
model.add(Dropout(0.3))
model.add(LSTM(units=32))
model.add(Dropout(0.3))
model.add(Dense(16, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(1, activation='sigmoid'))
Notice the last line is using the sigmoid activation.  Then:
model.compile(optimizer=optimizer,
              loss='binary_crossentropy',  
              metrics=['accuracy'])
This works fine, but it uses the default "from_logits=False", which expects a probability distribution.
If I do the following, it fails:
model.compile(optimizer=optimizer,
              loss='binary_crossentropy',  
              metrics=['accuracy'],
              from_logits=True) # For 'sigmoid' in above Dense
with this error message:
ValueError: Invalid argument "from_logits" passed to K.function with TensorFlow backend
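As far as I can tell, compile() itself does not accept "from_logits"; in Keras versions that provide loss classes, the flag appears to belong to the loss object instead, for example tf.keras.losses.BinaryCrossentropy(from_logits=True), with the final layer left without an activation. The plain-Python sketch below (helper names are hypothetical, not Keras API) illustrates what the flag would change: the same loss is computed either from a probability or directly from the raw score.

```python
import math

def sigmoid(z):
    """Map one raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def bce_from_probability(y_true, p):
    """Binary cross-entropy when the model output is already a probability."""
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))

def bce_from_logit(y_true, z):
    """Binary cross-entropy computed directly from a raw score (logit)."""
    # Numerically stable form: max(z, 0) - z*y + log(1 + exp(-|z|))
    return max(z, 0.0) - z * y_true + math.log1p(math.exp(-abs(z)))

z = 1.5  # hypothetical raw output of a final Dense(1) with no activation
y = 1    # true label
loss_a = bce_from_probability(y, sigmoid(z))  # analogous to from_logits=False
loss_b = bce_from_logit(y, z)                 # analogous to from_logits=True
print(loss_a, loss_b)  # the two values agree up to floating-point error
```

Under this reading, "from_logits=True" only makes sense when the last layer has no activation; applying it on top of a sigmoid output would squash the values twice.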
If I try using the softmax activation as:
model.add(Dense(1, activation='softmax'))
It runs, but I get 50% accuracy. With sigmoid I am getting over 99% accuracy. (I am using a very contrived data set to debug my models and would expect very high accuracy. It is also a very small data set and will overfit, but that is OK for now.)
So I expect that I should be able to use the "from_logits" parameter in the compile function, but it does not recognize that parameter.
Also, I would like to know why it works with the sigmoid activation and not with the softmax activation, and how to get it working with the softmax activation.
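A possible explanation for the 50% figure, sketched in plain Python (the softmax helper and the input values are made up for illustration): softmax normalizes across the units of the layer, so with a single unit the output is always 1.0, whatever the input.

```python
import math

def softmax(zs):
    """Map a list of raw scores to values that sum to 1."""
    m = max(zs)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Map one raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# One unit: the only value is divided by itself, so the output is constant.
single_unit_outputs = [softmax([z])[0] for z in (-5.0, 0.0, 7.5)]
print(single_unit_outputs)  # [1.0, 1.0, 1.0]

# Two units: softmax over [0, z] matches sigmoid(z) up to rounding.
two_unit = softmax([0.0, 2.0])[1]
print(two_unit, sigmoid(2.0))
```

If that is what is happening, the model predicts class 1 for every sample, which would give about 50% accuracy on balanced labels. Softmax would presumably need Dense(2, activation='softmax') together with a categorical loss such as sparse_categorical_crossentropy (integer labels) instead of binary_crossentropy.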
Thank you,
Jon.
 
     
    