I have a vector of int16_t, beta = {1,1,0,0,0,0,0,0}.
I want to implement this equation with AVX2:
c[i] = a[i] + (-1)^beta[i] * b[i]
where a, b, c, and beta are all AVX2 vectors of int16_t.
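
For reference, here is a minimal scalar sketch of what the equation is meant to compute, assuming every beta[i] is either 0 or 1 and that results wrap modulo 2^16 the way non-saturating SIMD adds do. The function name add_or_sub_scalar and its signature are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar reference for c[i] = a[i] + (-1)^beta[i] * b[i].
   Assumes beta[i] is 0 or 1: beta[i] == 1 subtracts b[i], beta[i] == 0 adds it.
   The arithmetic is done in int and narrowed back to 16 bits, which wraps on
   common compilers (the narrowing is implementation-defined in C). */
void add_or_sub_scalar(const int16_t *a, const int16_t *b,
                       const int16_t *beta, int16_t *c, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        int r = beta[i] ? a[i] - b[i] : a[i] + b[i];
        c[i] = (int16_t)r;
    }
}
```

Any vectorized version should produce the same lanes as this loop for inputs that satisfy the 0-or-1 assumption on beta.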
I have figured out that if I can map 1 to -32768, the multiplication can be avoided. I mean, flipping the sign of vector b could then be done using the OR and NEGATE functions of the SIMD intrinsics.
I do know that 1 can be mapped to -32768 using a left shift; however, AVX2 doesn't have any bit shift operations [1]. Is there any way to efficiently map 1 to -32768 with SIMD?
Editor's footnote 1: _mm256_slli_epi16(x, 15) does in fact exist. But there are other ways to implement the whole formula so the question is interesting after all.
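
As an illustration of both points, here is a hedged sketch of two multiply-free ways the formula might be evaluated, assuming each beta lane is 0 or 1 and that wrapping 16-bit arithmetic is acceptable. The first variant uses the shift to build a sign vector for _mm256_sign_epi16; the second skips the 1 to -32768 mapping and negates via XOR and subtract. All function names, the 16-lane wrappers, and the TARGET_AVX2 macro (a GCC/Clang convenience so the file builds without -mavx2) are assumptions made for this example, not part of the question.

```c
#include <immintrin.h>
#include <stdint.h>

/* Assumption: on GCC/Clang, enable AVX2 per function instead of via -mavx2. */
#if defined(__GNUC__) || defined(__clang__)
#define TARGET_AVX2 __attribute__((target("avx2")))
#else
#define TARGET_AVX2
#endif

/* Variant 1: beta << 15 turns 1 into 0x8000 (-32768). OR-ing in 1 makes the
   beta == 0 lanes +1, because _mm256_sign_epi16 zeroes a lane whose control
   value is 0, keeps it for a positive control, and negates it for a negative
   one. */
TARGET_AVX2
static inline __m256i add_or_sub_sign(__m256i a, __m256i b, __m256i beta)
{
    __m256i sign = _mm256_or_si256(_mm256_slli_epi16(beta, 15),
                                   _mm256_set1_epi16(1));
    return _mm256_add_epi16(a, _mm256_sign_epi16(b, sign));
}

/* Variant 2: mask = 0 - beta is all-ones where beta == 1 and zero elsewhere.
   (b ^ mask) - mask is two's-complement negation in the all-ones lanes and
   leaves the other lanes unchanged. */
TARGET_AVX2
static inline __m256i add_or_sub_xor(__m256i a, __m256i b, __m256i beta)
{
    __m256i mask = _mm256_sub_epi16(_mm256_setzero_si256(), beta);
    __m256i signed_b = _mm256_sub_epi16(_mm256_xor_si256(b, mask), mask);
    return _mm256_add_epi16(a, signed_b);
}

/* Hypothetical wrappers: process exactly 16 int16_t lanes from plain arrays,
   using unaligned loads and stores. */
TARGET_AVX2
void add_or_sub_sign_16(const int16_t *a, const int16_t *b,
                        const int16_t *beta, int16_t *c)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    __m256i vbeta = _mm256_loadu_si256((const __m256i *)beta);
    _mm256_storeu_si256((__m256i *)c, add_or_sub_sign(va, vb, vbeta));
}

TARGET_AVX2
void add_or_sub_xor_16(const int16_t *a, const int16_t *b,
                       const int16_t *beta, int16_t *c)
{
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    __m256i vbeta = _mm256_loadu_si256((const __m256i *)beta);
    _mm256_storeu_si256((__m256i *)c, add_or_sub_xor(va, vb, vbeta));
}
```

Note that merely setting the sign bit of b (OR with 0x8000) would not negate it in two's complement, which is why the sketch feeds the shifted value to _mm256_sign_epi16 rather than OR-ing it into b directly. Which variant is faster likely depends on the surrounding code and the target CPU, so it is worth measuring.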
