Convert the list keep into a set, since that will be checked frequently. Iterate over train, since you want to keep order and repeats. That makes set not an option. Even if it was, it wouldn't help, since the iteration would have to happen anyway:
keeps = set(keep)
train_keep = [k for k in train if k in keeps]
A lazier, and probably slower version would be something like
train_keep = filter(lambda x: x in keeps, train)
Neither of these options will give you a large speedup you'd probably be better off using numpy or pandas or some other library that implements the loops in C and stores numbers as something simpler than full-blown python objects. Here is a sample numpy solution:
train = np.array([...])
keep = np.array([...])
train_keep = train[np.isin(train, keep)]
This is likely an O(M * N) algorithm rather than O(M) set lookup, but if checking N elements in keep is faster than a nominally O(1) lookup, you win.
You can get something closer to O(M log(N)) using sorted lookup:
train = np.array([...])
keep = np.array([...])
keep.sort()
ind = np.searchsorted(keep, train, side='left')
ind[ind == keep.size] -= 1
train_keep = train[keep[ind] == train]
A better alternative might be to append np.inf or a maximum out-of-bounds integer to the sorted keep array, so you don't have to distinguish missing from edge elements with extra at all. Something like np.max(train.max() + 1, keep.max()) would do:
train = np.array([...])
keep = np.array([... 99999])
keep.sort()
ind = np.searchsorted(keep, train, side='left')
train_keep = train[keep[ind] == train]
For random inputs with train.size = 10000 and keep.size = 10, the numpy method is ~10x faster on my laptop.