Do you know of any way to add 32-bit signed words with saturation using MMX/SSE assembler instructions? I can find 8-bit and 16-bit versions but no 32-bit ones.
- See [Agner Fog's vectorclass library](http://www.agner.org/optimize/#vectorclass) for an implementation of add and subtract with C++ intrinsics. A copy of the GPLed source [is here](https://github.com/pcordes/vectorclass/blob/77522287e64da5e887d69659e144d2caa5d3a4f1/vectori128.h#L2189), using XOR to check for same / different signs, and shifts / PANDN / PADDD to fix up the result. – Peter Cordes Nov 24 '16 at 04:22
2 Answers
        You can emulate saturated signed adds by performing the following steps:
    #include <stdint.h>                           // INT32_MAX, INT32_MIN

    int saturated_add(int a, int b)
    {
        int sum = a + (unsigned)b;                // add as unsigned to avoid signed-overflow UB
        if (a >= 0 && b >= 0)
            return sum >= 0 ? sum : INT32_MAX;    // a negative sum means positive wraparound
        else if (a < 0 && b < 0)
            return sum >= 0 ? INT32_MIN : sum;    // a non-negative sum means negative wraparound
        else
            return sum;                           // sum of pos + neg always fits
    }
The unsigned case is even simpler; see this Stack Overflow posting.
In SSE2, the above maps to a sequence of parallel compares and AND/ANDN operations. No single operation is available in hardware, unfortunately.
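As a rough illustration of that mapping, here is a minimal sketch using SSE2 intrinsics. It is not taken from the answer or from any library: the function name `add_saturated_epi32` is made up for this example, and it detects overflow with XORs of the sum against each input (the idea mentioned in the comments) instead of the explicit sign compares in the C code above.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Lane-wise signed saturating add of four 32-bit integers.
   The name is hypothetical; only SSE2 instructions are used. */
static __m128i add_saturated_epi32(__m128i a, __m128i b)
{
    /* Wrapping add (paddd). */
    __m128i sum = _mm_add_epi32(a, b);

    /* Overflow happened iff a and b have the same sign and sum has the
       other one, i.e. the sign bit of (a ^ sum) & (b ^ sum) is set. */
    __m128i ovf = _mm_and_si128(_mm_xor_si128(a, sum),
                                _mm_xor_si128(b, sum));

    /* Broadcast the sign bit: all-ones in lanes that overflowed. */
    __m128i mask = _mm_srai_epi32(ovf, 31);

    /* Saturation value per lane: INT32_MAX if a >= 0, INT32_MIN if a < 0.
       (a >> 31, logical) is 0 or 1, and INT32_MAX + 1 wraps to INT32_MIN. */
    __m128i sat = _mm_add_epi32(_mm_srli_epi32(a, 31),
                                _mm_set1_epi32(INT32_MAX));

    /* Select sat where mask is set, sum elsewhere (pand / pandn / por). */
    return _mm_or_si128(_mm_and_si128(mask, sat),
                        _mm_andnot_si128(mask, sum));
}
```

Called on the lanes `{INT32_MAX, INT32_MIN, 0, -5}` and `{1, -1, 0, 7}`, this should give `{INT32_MAX, INT32_MIN, 0, 2}`. With SSE4.1 available, the final select could presumably be done with a blend instead of the three logic operations.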
 
    
    
        Peter Cordes
        
 
    
    
        FrankH.
        
- [Bitwise saturated addition in C (HW)](https://stackoverflow.com/q/5277623) could probably vectorize better, with a couple of `pxor` for `sum^a` and `sum^b`, and `pcmpgt(0, v)` or `psrad` – Peter Cordes Oct 08 '22 at 22:03
Saturated unsigned subtraction is easy, because for `a -= b` we can do
    asm (
        "pmaxud %1, %0\n\t" // a = max (a,b)
        "psubd %1, %0" // a -= b
        : "+x" (a)
        : "xm" (b)
    );
with SSE4.1 (`pmaxud` is an SSE4.1 instruction).
I was actually looking for unsigned addition, but possibly the only way is to transform it into a saturated unsigned subtraction, perform that, and transform back. The same goes for the signed variants.
EDIT: with unsigned addition, you get `min(a, ~b) + b` this way, which of course works: `~b` is the largest value that can be added to `b` without wrapping. With signed addition and subtraction, you have two saturation boundaries, which makes things complicated.
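To make the two unsigned identities concrete, here is a scalar C model of what happens in a single 32-bit lane. It is only a sketch for checking the arithmetic, not vector code, and the function names are invented for this example; the comments name the SSE4.1 instruction each step would correspond to.

```c
#include <stdint.h>

/* One lane of unsigned saturating subtraction: max(a, b) - b.
   Hypothetical helper name, scalar model only. */
static uint32_t sub_saturated_u32(uint32_t a, uint32_t b)
{
    uint32_t m = a > b ? a : b;       /* pmaxud: m = max(a, b) */
    return m - b;                     /* psubd: yields 0 whenever a <= b */
}

/* One lane of unsigned saturating addition: min(a, ~b) + b.
   Hypothetical helper name, scalar model only. */
static uint32_t add_saturated_u32(uint32_t a, uint32_t b)
{
    uint32_t room = ~b;               /* UINT32_MAX - b: largest a that still fits */
    uint32_t m = a < room ? a : room; /* pminud: clamp a to the available room */
    return m + b;                     /* paddd: can no longer wrap */
}
```

For example, `add_saturated_u32(UINT32_MAX, 1)` stays at `UINT32_MAX` and `sub_saturated_u32(1, 2)` gives `0`, while in-range inputs such as `add_saturated_u32(1, 2)` give the ordinary result `3`.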
 
    
    
        Michiel
        
 
    