I would like to find a maximally efficient way to compute a char that contains the least significant bits of an int in C++11. The solution must work with any possible standards-compliant compiler. (I'm using the N3290 C++ draft spec, which is essentially C++11.)
The reason for this is that I'm writing something like a fuzz tester, and want to check libraries that require a std::string as input. So I need to generate random characters for the strings. The pseudo-random generator I'm using provides ints whose low bits are pretty uniformly random, but I'm not sure of the exact range. (Basically the exact range depends on a "size of test case" runtime parameter.)
If I didn't care about working on any compiler, this would be as simple as:
inline char int2char(int i) { return i; }
Before you dismiss this as a trivial question, consider that:
- You don't know whether char is a signed or unsigned type.
- If char is signed, then a conversion from an unrepresentable int to a char is "implementation-defined" (§4.7/3). This is far better than undefined, but for this solution I'd need to see some evidence that the standard prohibits things like converting all ints not between CHAR_MIN and CHAR_MAX to '\0'.
- reinterpret_cast is not permitted between a signed and unsigned char (§5.2.10). static_cast performs the same conversion as in the previous point.
- char c = i & 0xff; (though it silences some compiler warnings) is almost certainly not correct for all implementation-defined conversions. In particular, i & 0xff is never negative, so when char is signed it could quite plausibly fail to convert negative values of i to negative values of c.
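To make the first and last points concrete, here is a small sketch. The names char_is_signed and masked_fits_in_char are mine, purely for illustration. It only reports whether the masked value is representable in char and does not fix anything.

```cpp
#include <climits>
#include <limits>

// Whether plain char is signed is up to the implementation.
constexpr bool char_is_signed = std::numeric_limits<char>::is_signed;

// Illustrative helper (hypothetical name): true when i & 0xff is
// representable in char, so converting it does not hit the
// implementation-defined case of §4.7/3.
constexpr bool masked_fits_in_char(int i) {
  return (i & 0xff) <= CHAR_MAX;
}
```

With an 8-bit signed char, CHAR_MAX is 127, so masked_fits_in_char(i) is false whenever i & 0xff is 128 or more, which is exactly when the conversion becomes implementation-defined.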
Here are some solutions that do work, but in most of these cases I'm worried they won't be as efficient as a simple conversion. These also seem too complicated for something so simple:
- Using reinterpret_cast on a pointer or reference, since you can convert from unsigned char * or unsigned char & to char * or char & (but at the possible cost of runtime overhead).
- Using a union of char and unsigned char, where you first assign the int to the unsigned char, then extract the char (which again could be slower).
- Shifting left and right to sign-extend the int. E.g., if i is the int, running c = ((i << 8 * (sizeof(i) - sizeof(c))) >> 8 * (sizeof(i) - sizeof(c))) (but that's inelegant, and if the compiler doesn't optimize away the shifts, quite slow).
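For reference, the first workaround in this list could look roughly like the sketch below. It is not a vetted answer: int2char_alias is a made-up name, and the value read back only matches what the assertions in this question expect if a signed char uses two's complement, which C++11 does not guarantee.

```cpp
// Hypothetical name; a sketch of the reinterpret_cast-on-reference idea.
inline char int2char_alias(int i) {
  // int -> unsigned char is fully defined: the value is reduced
  // modulo 2^n, where n is the number of bits in unsigned char (§4.7/2).
  unsigned char u = static_cast<unsigned char>(i);
  // An lvalue of type char may access any object (§3.10/10), so this
  // reads the same bit pattern back as a plain char.
  return reinterpret_cast<char &>(u);
}
```

Whether this costs anything at runtime depends on the optimizer, which is the part I cannot settle from the standard alone.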
Here's a minimal working example. The goal is to argue that the assertions can never fail on any compiler, or to define an alternate int2char in which the assertions can never fail.
#include <algorithm>
#include <cassert>
#include <cstdio>
#include <cstdlib>
using namespace std;
constexpr char int2char(int i) { return i; }
int
main(int argc, char **argv)
{
  for (int n = 1; n < min(argc, 127); n++) {
    char c = -n;
    int i = (atoi(argv[n]) << 8) ^ -n;
    assert(c == int2char(i));
  }
return 0;
}
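Because int2char is constexpr, one way to probe a particular compiler (not every compiler, which is what this question is really after) is a static_assert. A sketch, with int2char copied from the MWE and the sample value chosen arbitrarily:

```cpp
// Copied from the MWE above.
constexpr char int2char(int i) { return i; }

// Arbitrary sample: high bits from 5 << 8, low byte taken from -3.
constexpr int sample = (5 << 8) ^ -3;

// Compiles only if this implementation's int -> char conversion keeps
// the low byte for this one input. It says nothing about other compilers.
static_assert(int2char(sample) == static_cast<char>(-3),
              "int2char does not preserve the low byte here");
```

A failure here would be a compile error rather than a runtime assertion, which makes it cheap to run across several toolchains.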
I've phrased this question in terms of C++ because the standards are easier to find on the web, but I am equally interested in a solution in C. Here's the MWE in C:
#include <assert.h>
#include <stdlib.h>
static char int2char(int i) { return i; }
int
main(int argc, char **argv)
{
  for (int n = 1; n < argc && n < 127; n++) {
    char c = -n;
    int i = (atoi(argv[n]) << 8) ^ -n;
    assert(c == int2char(i));
  }
return 0;
}