If you take a look at the model.summary() output, you will see what the issue is:
Layer (type) Output Shape Param #
=================================================================
dense_13 (Dense) (None, 128, 50) 150
_________________________________________________________________
dense_14 (Dense) (None, 128, 20) 1020
_________________________________________________________________
dense_15 (Dense) (None, 128, 5) 105
_________________________________________________________________
dense_16 (Dense) (None, 128, 2) 12
=================================================================
Total params: 1,287
Trainable params: 1,287
Non-trainable params: 0
_________________________________________________________________
As you can see, the output shape of the model is (None, 128, 2) and not (None, 1, 2) (or (None, 2)) as you expected. This is because the Dense layer is applied on the last axis of its input array; as a result, as you can see above, the time axis (i.e. the 128) is preserved all the way to the end.
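You can verify this behavior on a single layer. Below is a minimal sketch (assuming TensorFlow 2.x style Keras, where a layer can be called directly on a NumPy array): the same (2 -> 50) kernel is applied independently at every timestep, which is also why dense_13 above has only 2 * 50 + 50 = 150 parameters.
import numpy as np
from tensorflow.keras.layers import Dense

layer = Dense(50)
out = layer(np.zeros((1, 128, 2), dtype="float32"))

print(out.shape)             # (1, 128, 50) -- the time axis is untouched
print(layer.count_params())  # 150 == 2 * 50 + 50, same as dense_13 above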
How to resolve this? You mentioned you don't want to use an RNN layer, so you have a few options: use a Flatten layer somewhere in the model, use Conv1D + Pooling1D layers, or even use a GlobalPooling layer. For example (these are just for demonstration; you may do it differently):
Using a Flatten layer
from keras import models
from keras.layers import Dense, Flatten

model = models.Sequential()
model.add(Dense(50, batch_input_shape=(None, 128, 2), kernel_initializer="he_normal", activation="relu"))
model.add(Dense(20, kernel_initializer="he_normal", activation="relu"))
model.add(Dense(5, kernel_initializer="he_normal", activation="relu"))
model.add(Flatten())
model.add(Dense(2))
model.summary()
Model summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_17 (Dense) (None, 128, 50) 150
_________________________________________________________________
dense_18 (Dense) (None, 128, 20) 1020
_________________________________________________________________
dense_19 (Dense) (None, 128, 5) 105
_________________________________________________________________
flatten_1 (Flatten) (None, 640) 0
_________________________________________________________________
dense_20 (Dense) (None, 2) 1282
=================================================================
Total params: 2,557
Trainable params: 2,557
Non-trainable params: 0
_________________________________________________________________
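Now the model's output has the expected shape, which you can sanity-check with some random data (x below is just a hypothetical stand-in for your input array):
import numpy as np

x = np.random.rand(10, 128, 2).astype("float32")
print(model.predict(x).shape)  # (10, 2)
Also note that flattening produces 128 * 5 = 640 features, so the final Dense layer needs 640 * 2 + 2 = 1282 parameters (hence the larger total compared to the original model).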
Using a GlobalAveragePooling1D layer
from keras import models
from keras.layers import Dense, GlobalAveragePooling1D

model = models.Sequential()
model.add(Dense(50, batch_input_shape=(None, 128, 2), kernel_initializer="he_normal", activation="relu"))
model.add(Dense(20, kernel_initializer="he_normal", activation="relu"))
model.add(GlobalAveragePooling1D())
model.add(Dense(5, kernel_initializer="he_normal", activation="relu"))
model.add(Dense(2))
model.summary()
Model summary:
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_21 (Dense) (None, 128, 50) 150
_________________________________________________________________
dense_22 (Dense) (None, 128, 20) 1020
_________________________________________________________________
global_average_pooling1d_2 ( (None, 20) 0
_________________________________________________________________
dense_23 (Dense) (None, 5) 105
_________________________________________________________________
dense_24 (Dense) (None, 2) 12
=================================================================
Total params: 1,287
Trainable params: 1,287
Non-trainable params: 0
_________________________________________________________________
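And for the Conv1D + pooling option mentioned earlier, a sketch could look like the following (the filter counts and kernel sizes here are arbitrary illustrative choices, not something your problem prescribes):
from keras import models
from keras.layers import Conv1D, Dense, GlobalAveragePooling1D, MaxPooling1D

model = models.Sequential()
model.add(Conv1D(32, kernel_size=5, activation="relu", batch_input_shape=(None, 128, 2)))
model.add(MaxPooling1D(pool_size=2))
model.add(Conv1D(16, kernel_size=5, activation="relu"))
model.add(GlobalAveragePooling1D())  # collapses the time axis -> (None, 16)
model.add(Dense(2))
model.summary()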
Note that in all the cases above you need to reshape the labels (i.e. targets) array to (n_samples, 2) (or you may want to use a Reshape layer at the end of the model).
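For instance, if your labels currently have shape (n_samples, 1, 2) (y below stands for your own targets array), that's a one-liner with NumPy:
y = y.reshape(-1, 2)  # (n_samples, 1, 2) -> (n_samples, 2)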