I have sequence data that tells me what color was observed for multiple subjects at different points in time. For example:
| ID | Time | Color | 
|---|---|---|
| A | 1 | Blue | 
| A | 2 | Red | 
| A | 5 | Red | 
| B | 3 | Blue | 
| B | 6 | Green | 
| C | 1 | Red | 
| C | 3 | Orange | 
I want to predict the most likely color for each of the next 3 time steps, along with the probability of that color appearing. For example, for ID A, I'd like to know the next 3 items (time, color) in the sequence and the probability of each predicted color.
I understand that LSTMs are often used to predict this type of sequential data, and that I would feed in a 3D array like
input =[ 
[[1,1], [2,2], [5,2]], # blue at t=1, red at t=2, red at t=5 for ID A
[[0,0], [3,1], [6,3]], # padding for first entry, blue at t=3, green at t=6 for ID B
[[0,0], [1,2], [3,4]]  # padding for first entry, red at t=1, orange at t=3 for ID C
]
  
after mapping the colors to numbers (Blue -> 1, Red -> 2, Green -> 3, Orange -> 4, etc.). My understanding is that, by default, the LSTM just predicts the next item in each sequence, so for example
output = [
[[7, 2]], #next item is most likely red at t=7
[[9, 3]], # next item is most likely green at t=9
[[6, 2]]  # next item is most likely red at t=6
]
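For reference, this is roughly how I build the `input` array above from the table. It is only a sketch: the helper name `encode_sequences` is my own, and I'm assuming left-padding with `[0, 0]` and reserving 0 as the padding value, as in the example.

```python
# Rows from the example table: (ID, time, color).
rows = [
    ("A", 1, "Blue"), ("A", 2, "Red"), ("A", 5, "Red"),
    ("B", 3, "Blue"), ("B", 6, "Green"),
    ("C", 1, "Red"), ("C", 3, "Orange"),
]

# Color-to-integer mapping from above; 0 is reserved for padding (assumption).
color_to_int = {"Blue": 1, "Red": 2, "Green": 3, "Orange": 4}


def encode_sequences(rows, color_to_int, pad=(0, 0)):
    """Group rows by ID, sort each group by time, left-pad to the longest sequence."""
    by_id = {}
    for subject, time, color in rows:
        by_id.setdefault(subject, []).append([time, color_to_int[color]])

    max_len = max(len(seq) for seq in by_id.values())

    encoded = []
    for subject in sorted(by_id):
        seq = sorted(by_id[subject])  # [time, color] pairs, ordered by time
        padding = [list(pad) for _ in range(max_len - len(seq))]
        encoded.append(padding + seq)
    return encoded


encoded = encode_sequences(rows, color_to_int)
print(encoded)
```

The result is a nested list of shape (subjects, time steps, 2), which I would then convert to an array before passing it to the model.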
Is it possible to modify the output of my LSTM so that, instead of predicting only the next occurrence time and color, I get the next 3 times, colors, AND the probability of each color appearing? For example, an output like
output = [
[[7, 2, 0.93], [8, 2, 0.79], [10, 4, 0.67]],
[[9, 2, 0.88], [11, 3, 0.70], [14, 3, 0.43]],
...
]
I've tried looking through the Keras documentation for Sequential, but I couldn't find anything that clearly covers this.
Furthermore, I see that model.fit() is typically called with a TrainX and a TrainY, but I'm not sure what my TrainY would be here.
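My best guess is that each training example pairs the first part of a sequence with the items that follow it, so that TrainY holds the "future" items. Here is a minimal sketch of that guess in plain Python. The helper `make_windows` and the parameters `n_in` and `n_out` are names I made up for illustration, not anything from Keras, and I don't know whether this is the right way to frame the problem.

```python
def make_windows(seq, n_in, n_out):
    """Split one sequence into (input window, target window) pairs."""
    pairs = []
    for start in range(len(seq) - n_in - n_out + 1):
        x = seq[start:start + n_in]                   # items the model sees
        y = seq[start + n_in:start + n_in + n_out]    # items it should predict
        pairs.append((x, y))
    return pairs


# Sequence for ID A from the example, as [time, color] pairs.
seq_a = [[1, 1], [2, 2], [5, 2]]

# With only 3 observations, use 2 items as input and 1 item as target.
pairs = make_windows(seq_a, n_in=2, n_out=1)
train_x = [x for x, _ in pairs]
train_y = [y for _, y in pairs]

print(train_x)
print(train_y)
```

With longer sequences I would presumably set `n_out=3` to get the next 3 items as the target. I also assume the color part of the target would have to be one-hot encoded for the model to output a probability per color, but I'm not sure about that either.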
