This fragment of code:
#include <stdio.h>
int main() {
    int a = 130;
    char *ptr;
    ptr = (char *)&a;
    printf("%d ", *ptr);
    return 0;
}
doesn't have undefined or unspecified behaviour, apart from the rather obscure possibility that the byte holding 130 happens to be a trap representation when accessed through the (possibly signed) plain char, but the result is implementation-defined and there are many possible outcomes:
- the output is -126 on a little-endian platform where char is signed and negative numbers are represented in two's complement
- the output is something else on an architecture that uses sign-and-magnitude or one's complement for negative numbers
- the output is 130 on a little-endian platform where char is unsigned
- the output is 0 on a big-endian platform, whether char is signed or unsigned
- the output could be 0 on a middle-endian platform, or -126, 130 or something else again if int is only one or two bytes wide
- there are platforms where sizeof(int) is 1; it might be that every addressable primitive value has a size of 1, and such a platform would have no endianness at all
Some of these are obscure, but both little- and big-endian computers are in wide use, and on ARM processors plain char is unsigned by default.
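To see which of these cases applies on a given machine, a small variation of the fragment (my own sketch, not part of the question) prints every byte of a instead of just the first one:

#include <stdio.h>

int main(void) {
    int a = 130;
    unsigned char *p = (unsigned char *)&a;   /* unsigned char: no sign extension, no trap representations */
    for (size_t i = 0; i < sizeof a; i++)
        printf("%u ", (unsigned)p[i]);        /* e.g. "130 0 0 0" on little-endian, "0 0 0 130" on big-endian, with a 4-byte int */
    printf("\n");
    return 0;
}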
To keep things simple, endianness detection ought to be done using unsigned types.
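A check along those lines might look like this (a sketch; the function name is my own, and it only distinguishes little-endian from everything else):

#include <stdio.h>

/* Returns 1 on a little-endian platform, 0 otherwise. */
static int is_little_endian(void) {
    unsigned int one = 1u;
    unsigned char *p = (unsigned char *)&one;
    return *p == 1;   /* the least significant byte sits at the lowest address */
}

int main(void) {
    printf("%s\n", is_little_endian() ? "little-endian" : "not little-endian");
    return 0;
}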
The union test works if you use the fixed-size types from <stdint.h>:

union {
    uint32_t the_int;
    uint8_t fragments[4];
} u;
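For example, a minimal sketch of how it might be used (the variable name and the test value 0x01020304 are my own; in C it is fine to read a union member other than the one last written, the bytes are simply reinterpreted):

#include <stdint.h>
#include <stdio.h>

int main(void) {
    union {
        uint32_t the_int;
        uint8_t fragments[4];
    } u = { 0x01020304u };   /* initializes the_int */

    if (u.fragments[0] == 0x04)
        puts("little-endian");
    else if (u.fragments[0] == 0x01)
        puts("big-endian");
    else
        puts("middle-endian or something more exotic");
    return 0;
}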
You wouldn't test the endianness of your computer at runtime, though! The code is compiled for one and only one endianness, so the check can just as well be done at compile time. The endianness handling that actually needs to happen at runtime concerns only data structures read from files or exchanged via interprocess communication.
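For that case no endianness check is needed at all: assemble the value from the individual bytes and the host byte order stops mattering. A sketch, assuming the external format is big-endian (load_be32 is my own name, not a standard function):

#include <stdint.h>
#include <stdio.h>

/* Assemble a 32-bit value stored in big-endian order in a byte buffer,
   e.g. read from a file or a socket. Shifts operate on values, not on
   object representations, so this works on a host of any endianness. */
static uint32_t load_be32(const uint8_t *b) {
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16)
         | ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

int main(void) {
    uint8_t buf[4] = { 0x00, 0x00, 0x00, 0x82 };  /* 130 stored big-endian */
    printf("%u\n", (unsigned)load_be32(buf));     /* prints 130 on any host */
    return 0;
}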