...
print('Build model...')
model = Sequential()
model.add(Embedding(max_features, 128))
model.add(LSTM(size, return_sequences=True, dropout_W=0.2, dropout_U=0.2))
model.add(GlobalAveragePooling1D())
model.add(Dense(1))
model.add(Activation('sigmoid'))
....
After the LSTM layer, I need to take the mean or max of the output vectors over all time steps in a sample, and then pass that mean or max vector to the Dense layer in Keras.
I think TimeDistributedMerge could do this, but it has been deprecated. With return_sequences=True I can get the vectors for all time steps in a sample after the LSTM layer. However, GlobalAveragePooling1D() is not compatible with masking: it averages over all time steps, whereas I need only the non-masked ones.
I have seen posts recommending a Lambda layer, but those do not take masking into account either. Any help would be appreciated.
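
To make the requirement concrete, here is a minimal NumPy sketch of the arithmetic I am after. It is not Keras code, and the names masked_mean and masked_max are made up for illustration. It assumes the per-step LSTM outputs have shape (batch, timesteps, features) and that a 0/1 mask of shape (batch, timesteps) marks which steps are real and which are padding.

```python
import numpy as np

def masked_mean(outputs, mask):
    """Average the feature vectors over the non-masked time steps only.

    outputs: (batch, timesteps, features) array of per-step outputs
    mask:    (batch, timesteps) array, 1 for real steps, 0 for padding
    """
    mask = mask.astype(outputs.dtype)[:, :, None]   # (batch, timesteps, 1)
    summed = (outputs * mask).sum(axis=1)           # padded steps contribute 0
    counts = np.maximum(mask.sum(axis=1), 1.0)      # avoid dividing by zero
    return summed / counts

def masked_max(outputs, mask):
    """Take the element-wise max over the non-masked time steps only."""
    keep = mask.astype(bool)[:, :, None]
    # Padded steps become -inf so they can never win the max.
    # (A sample with no real steps at all would therefore return -inf.)
    return np.where(keep, outputs, -np.inf).max(axis=1)

# One sample, three time steps, two features; the last step is padding.
outputs = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])

mean_vec = masked_mean(outputs, mask)   # averages only the first two steps
max_vec = masked_max(outputs, mask)     # ignores the padded third step
```

In this example the padded step holds large values on purpose: a plain average over all three steps would be pulled toward 100, which is the behaviour I want to avoid.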