First, I'll describe the task.
I need to:
- Compare two __m128i values.
- Somehow do the bitwise AND of the result with a certain uint16_t value (probably using _mm_movemask_epi8 first and then just &).
- Do the blend of the initial values based on the result of that.
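To pin down the semantics, here is a scalar reference of what I'm after. It is only a sketch: the function name `compare_mask_blend_scalar` is made up, and I'm assuming a signed greater-than comparison on bytes and that the blend takes the byte from `b` where the selected bit is set and from `a` otherwise (the same operand order as _mm_blendv_epi8). The real comparison may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of the three steps (hypothetical helper, not a library API).
 * Assumption: the comparison is a signed "a > b" per byte. */
void compare_mask_blend_scalar(const int8_t a[16], const int8_t b[16],
                               uint16_t mask, int8_t out[16])
{
    /* Step 1: compare, collecting one bit per byte (what
     * _mm_movemask_epi8 would return for the comparison result). */
    uint16_t cmp = 0;
    for (int i = 0; i < 16; ++i) {
        if (a[i] > b[i]) {
            cmp |= (uint16_t)(1u << i);
        }
    }

    /* Step 2: bitwise AND with the given 16-bit value. */
    uint16_t sel = (uint16_t)(cmp & mask);

    /* Step 3: blend, taking b[i] where bit i is set and a[i] otherwise. */
    for (int i = 0; i < 16; ++i) {
        out[i] = ((sel >> i) & 1u) ? b[i] : a[i];
    }
}
```

Since this is plain C, it could also serve as a baseline to check an SSE or ARM version against.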
The problem, as you might have guessed, is that blend accepts an __m128i as its mask, whereas I will have a uint16_t. So either I need some sort of inverse instruction for _mm_movemask_epi8, or I need to do something else entirely.
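For what it's worth, one direction that seems workable is to expand the uint16_t back into a byte mask by hand: broadcast the low byte of the mask into bytes 0-7 and the high byte into bytes 8-15, then test one bit per byte. The sketch below uses only SSE2 intrinsics. The helper names `mask16_to_bytes` and `compare_mask_blend` are made up, and the signed greater-than comparison and the "take `b` where selected" operand order are assumptions, not part of the actual task.

```c
#include <assert.h>
#include <stdint.h>
#include <emmintrin.h> /* SSE2 */

/* Hypothetical helper: expand bit i of `mask` into byte i
 * (0xFF if the bit is set, 0x00 otherwise). */
static __m128i mask16_to_bytes(uint16_t mask)
{
    __m128i v = _mm_cvtsi32_si128(mask);               /* lo, hi, 0, 0, ...   */
    v = _mm_unpacklo_epi8(v, v);                       /* lo,lo,hi,hi,0,...   */
    v = _mm_shufflelo_epi16(v, _MM_SHUFFLE(1, 1, 0, 0)); /* lo x4, hi x4      */
    v = _mm_unpacklo_epi32(v, v);                      /* lo x8, hi x8        */

    /* Byte i of each 64-bit half holds the single bit (1 << i). */
    const __m128i bit = _mm_set1_epi64x((long long)0x8040201008040201ULL);
    v = _mm_and_si128(v, bit);
    return _mm_cmpeq_epi8(v, bit);                     /* 0xFF where bit set  */
}

/* Hypothetical helper: compare, AND with the 16-bit mask, blend.
 * Assumption: comparison is a signed "a > b" per byte. */
static __m128i compare_mask_blend(__m128i a, __m128i b, uint16_t mask)
{
    __m128i cmp = _mm_cmpgt_epi8(a, b);
    /* ANDing in the vector domain gives the same selection as
     * movemask followed by a scalar & and then expanding again. */
    __m128i sel = _mm_and_si128(cmp, mask16_to_bytes(mask));
    /* SSE2 blend: b where sel is all-ones, a elsewhere. */
    return _mm_or_si128(_mm_and_si128(sel, b), _mm_andnot_si128(sel, a));
}
```

On SSE4.1 and later, the last line could presumably be written as `_mm_blendv_epi8(a, b, sel)`, and with SSSE3 the unpack/shuffle broadcast could be a single `_mm_shuffle_epi8`; I kept it to SSE2 here so the sketch builds without extra compiler flags.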
Some points: I probably cannot change that uint16_t value to some other type, it's complicated; I am doing this on SSE4.2, so no AVX; there's a similar question here, How to perform the inverse of _mm256_movemask_epi8 (VPMOVMSKB)?, but it's about AVX, and I'm very inexperienced with this, so I cannot adapt the solution.
PS: I might need to do this for ARM as well, so I would appreciate any suggestions.