I have created a double-double data type in C. I tried -Ofast with GCC and discovered that it's dramatically faster (e.g. 1.5 s with -O3 and 0.3 s with -Ofast), but the results are bogus. I chased this down to -fassociative-math. I'm surprised this does not work, because I explicitly define the associativity of my operations when it matters. For example, in the following code I put parentheses where it matters.
static inline doublefloat two_sum(const float a, const float b) {
        float s = a + b;
        float v = s - a;
        float e = (a - (s - v)) + (b - v);
        return (doublefloat){s, e};
}
So I don't expect GCC to change e.g. (a - (s - v)) to ((a + v) - s) even with -fassociative-math.  So why are the results so wrong using -fassociative-math (and so much faster)?  
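To make the failure mode concrete, here is a minimal sketch of what is at stake. The struct layout of doublefloat (fields hi and lo) and the names two_sum_strict and two_sum_collapsed are assumptions for illustration only. The strict version uses volatile temporaries so that every intermediate is rounded and evaluated in the written order regardless of compiler flags; the collapsed version is what the error term reduces to if the expression is simplified as though floats were real numbers, where v equals b and e is always zero.

```c
/* Assumed layout: hi holds the rounded sum, lo holds the rounding error. */
typedef struct { float hi, lo; } doublefloat;

/* two_sum with every intermediate forced through a volatile float,
 * so the written evaluation order is kept whatever flags are used. */
doublefloat two_sum_strict(float a, float b) {
    volatile float s  = a + b;   /* rounded sum */
    volatile float v  = s - a;   /* the part of b that made it into s */
    volatile float t1 = s - v;   /* the part of a that made it into s */
    volatile float t2 = a - t1;  /* what was lost from a */
    volatile float t3 = b - v;   /* what was lost from b */
    float e = t2 + t3;           /* total rounding error of a + b */
    return (doublefloat){s, e};
}

/* The same function after treating float arithmetic as exact real
 * arithmetic: v == b, so e == (a - (s - b)) + (b - b) == 0. */
doublefloat two_sum_collapsed(float a, float b) {
    return (doublefloat){a + b, 0.0f};
}
```

With a = 1.0f and b = 1e-8f the sum rounds to 1.0f, so the strict version returns the whole of b as the error term, while the collapsed version silently drops it.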
I tried /fp:fast with MSVC (after converting my code to C++) and the results are correct but it's no faster than /fp:precise.
The GCC manual states the following about -fassociative-math:
Allow re-association of operands in series of floating-point operations. This violates the ISO C and C++ language standard by possibly changing computation result. NOTE: re-ordering may change the sign of zero as well as ignore NaNs and inhibit or create underflow or overflow (and thus cannot be used on code that relies on rounding behavior like "(x + 2^52) - 2^52". May also reorder floating-point comparisons and thus may not be used when ordered comparisons are required. This option requires that both -fno-signed-zeros and -fno-trapping-math be in effect. Moreover, it doesn't make much sense with -frounding-math.
Edit:
I did some tests with integers (signed and unsigned) and float to see whether GCC simplifies associative operations. Here is the code I tested:
//test1.c
unsigned foosu(unsigned a, unsigned b, unsigned c) { return (a + c) - b; }
signed   fooss(signed   a, signed   b, signed   c) { return (a + c) - b; }
float    foosf(float    a, float    b, float    c) { return (a + c) - b; }
unsigned foomu(unsigned a, unsigned b, unsigned c) { return a*a*a*a*a*a; }
signed   fooms(signed   a, signed   b, signed   c) { return a*a*a*a*a*a; }
float    foomf(float    a, float    b, float    c) { return a*a*a*a*a*a; }
and
//test2.c
unsigned foosu(unsigned a, unsigned b, unsigned c) { return a - (b - c);     }
signed   fooss(signed   a, signed   b, signed   c) { return a - (b - c);     }
float    foosf(float    a, float    b, float    c) { return a - (b - c);     }
unsigned foomu(unsigned a, unsigned b, unsigned c) { return (a*a*a)*(a*a*a); }
signed   fooms(signed   a, signed   b, signed   c) { return (a*a*a)*(a*a*a); }
float    foomf(float    a, float    b, float    c) { return (a*a*a)*(a*a*a); }
I compiled both files with -O3 and with -Ofast, looked at the generated assembly, and this is what I observed:
- unsigned: the code was identical both for addition and multiplication (reduced to three multiplications)
- signed: the code was not identical for addition but was for multiplication (reduced to three multiplications)
- float: the code was not identical for addition or multiplication with -O3; however, with -Ofast the addition was identical and the multiplication was almost the same, using only three multiplications.
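For reference, one way to compute a sixth power with three multiplications is to cube and then square. This is only a sketch of the idea; the exact instruction sequence GCC emitted is not reproduced here, and the function names are hypothetical.

```c
/* Written out left to right: five multiplications. */
unsigned pow6_naive_u(unsigned a) { return a*a*a*a*a*a; }

/* Cube (two multiplications), then square (one more): three in total. */
unsigned pow6_three_u(unsigned a) {
    unsigned a3 = a * a * a;
    return a3 * a3;
}

/* Same regrouping for float. It is only guaranteed to match the
 * left-to-right product when no intermediate result is rounded. */
float pow6_three_f(float a) {
    float a3 = a * a * a;
    return a3 * a3;
}
```

For unsigned the two forms always agree, even when the product wraps around, because unsigned arithmetic is modular. For float the regrouping can change the rounding, which is why it needs the compiler's permission.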
From this I conclude that
- if an operation is associative, then GCC will simplify it however it chooses, so that a - (b - c) can become (a + c) - b.
- unsigned addition and multiplication are associative
- signed addition is not associative
- signed multiplication is associative
- a*a*a*a*a*a gets simplified to only three multiplications for integers, and for floating point when using -fassociative-math.
- -fassociative-math causes floating point addition and multiplication to be associative.
In other words GCC did exactly what I did not expect it to do with -fassociative-math. It converted (a - (s - v)) to ((a + v) - s).
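The two groupings really do give different float results. Here is a small sketch with hypothetical function names; the volatile temporary pins down the grouping so the demonstration does not depend on compiler flags.

```c
/* a - (b - c): the inner difference is evaluated first, as written. */
float sub_nested(float a, float b, float c) {
    volatile float t = b - c;
    return a - t;
}

/* (a + c) - b: the reassociated form. */
float add_then_sub(float a, float b, float c) {
    volatile float t = a + c;   /* may round before b is subtracted */
    return t - b;
}
```

With a = 1.0f and b = c = 16777216.0f (2^24), the first form gives 1.0f, because b - c is exactly zero. The second gives 0.0f, because 1 + 2^24 is not representable in a float and rounds back to 2^24.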
One may think this is obvious with -fassociative-math, but there are cases where a programmer may want floating point to be associative in one case and non-associative in another. For example, auto-vectorizing the reduction of a floating point array requires -fassociative-math, but if it is enabled the double-float can't be used in the same module. So the only option is to put associative floating point functions in one module and non-associative floating point functions in another module, and compile them into separate object files.
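A rough sketch of that split, assuming hypothetical file names doublefloat.c, reduce.c and main.c: the reduction lives in its own translation unit and is the only one compiled with -Ofast.

```c
/* reduce.c (hypothetical file name): only code that tolerates reassociation.
 *
 * Assumed build, one object file per floating point mode:
 *   gcc -O3    -c doublefloat.c   (two_sum and friends, order preserved)
 *   gcc -Ofast -c reduce.c        (this file, reassociation allowed)
 *   gcc -O3    -c main.c
 *   gcc doublefloat.o reduce.o main.o -o app
 */
#include <stddef.h>

/* Plain array reduction: the kind of loop that can only be vectorized
 * when the compiler is allowed to reorder the additions. */
float sum_array(const float *x, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum;
}
```

The cost of this layout is that two_sum can no longer be a static inline function shared with the fast code through a header, since it would then be compiled with whatever flags the including file uses.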
 
     
    