Fastest way to find binary base

Question

I am trying to find binary base of a number, like the floor function which rounds of number to greatest integer below it, i want to round off the number to the 1st binary base below it.

For example:

for 1000 it should be 512
for 10 it should be 8
for 208 it should be 128

This is what I've tried. I feel log functions will consume more resources so is there any faster approach for this?

#include<stdio.h>
int main()  {
    unsigned long long int num;
    unsigned long long int mask;
    scanf("%llu", &num);
    mask = 0x80000000;
    while(mask >>= 1)   {
        if (mask & num)
            break;
    }
    printf("%llu\n", mask);
    return 0;
}

Thanks:)

Close but not exactly a dup: http://stackoverflow.com/questions/671815/what-is-the-fastest-most-efficient-way-to-find-the-highest-set-bit-msb-in-an-i — Shafik Yaghmour, May 30 '13 at 02:48
You might want to look at Andrei Alexandrescu's [Three Optimization Tips for C++](http://isocpp.org/blog/2012/12/three-optimization-tips-alexandrescu), where he uses essentially this problem as an example. Slide: 24, video: ~30:00. — Jerry Coffin, May 30 '13 at 03:08
Have a look at this branch-free code for finding `ceil(log2(x))`: http://stackoverflow.com/a/15327567/1553090 -- perhaps you can adapt it. — paddy, May 30 '13 at 03:22
not exactly a dup, although that solution works great. @bugsbunny: If you want an straightforward, easy to understand way you can try "round up to the next highest power of 2" algorithm [here](http://graphics.stanford.edu/~seander/bithacks.html#RoundUpPowerOf2Float) and shift right 1 bit — phuclv, Aug 03 '13 at 09:59

score 5 · Answer 1 · answered May 30 '13 at 03:03

5

int topBit(int n) {
    while (true){
        m = n & (n-1);
        if (m == 0) return n;
        n = m;
    } 
}

n & (n-1) clears the bottom most set bit. Just do this until you hit zero, and then you know the previous value had only one bit set (the highest that was set in the input).

answered May 30 '13 at 03:03

Laurence Gonsalves

137,896
35
246
299

This nicely complements @markransom's answer. Bit-and vs bit-or! Either works. Cool. Probably faster on small integers; I wonder for what size integer binary search wins... – Floris May 30 '13 at 03:09
I like it too, it's shorter and clearer than mine. – Mark Ransom May 30 '13 at 03:36

Floris · Accepted Answer · 2013-05-30T19:44:48.093

2

Represent the number in binary, then look for the most significant bit (the highest nonzero bit). Naively you can do this by right shifting one bit at a time until it is zero - that was "one too many". That is basically the approach you tried. A bit faster would be a binary search. For 32 bit integer, shift right by 16; if still > 0, right shift by 8, etc. I'm sure you can figure it out from here.

Code example:

typedef unsigned long long int ulli;
ulli floor2(ulli num){
  int msb = 8*sizeof(num)/2;
  ulli temp = ((ulli)1)<<msb;
  while(msb>1){
    msb/=2; // using divide for clarity
    if(temp>num) temp>>=msb; else temp<<=msb;
  }
  if (temp>num) temp/=2;
  return temp;
}

I ran some benchmarks of this algorithm against the topBit as well as the builtIn method. A loop with 10M iterations, generating a "long" random number, takes 362 ms on my system (with no compiler optimization). If the loop includes one of the methods of calculation, times increase as follows:

=============  total    net
builtin:         407     45
binary search:   579    215
topBit:         2295   1933

The built in method is definitely the fastest by a significant margin - not really surprising! With 64 bit numbers, topBit will on average need 32 loops (half the bits are set, so get skipped) and binary only 5 loops, so you would expect about 6x speed difference; that is roughly what you see. When I define ulli as unsigned short (16 bit), the time difference is about 2x.

edited May 30 '13 at 19:44

answered May 30 '13 at 02:39

Floris

45,857
6
70
122

how is msb = sizeof(num) >> 1; ?? msb(3524) != sizeof(3524) >> 1 ; sizeof(3524) == sizeof(2) (== 4 say on a 32-bit platform) – Ahmed Masud May 30 '13 at 02:52
1

@ahmedmasud I chose variable names poorly. `temp` will end up being the number you are looking for. `msb` starts out as 16 (if int is 32 bits), so `temp=1<<16` as a first guess, etc. I edited answer to clarify; thanks for pointing it out! – Floris May 30 '13 at 03:01
This solution has trouble with `num` having any of its upper bits set. `int topBit(int n)`, Paulpro and other solutions are faster and complete. BTW it was a fun exercise. – chux - Reinstate Monica May 30 '13 at 05:01
1

Any decent compiler changes a division by 2 to a shift. It's better to code for clarity, especially when the obscure coding gets you nothing. – Gene May 30 '13 at 13:25
@Gene - I didn't think `>>=1` was "obscure" , but I take your point about "clarity trumps speed" and modified my example. Used the occasion to fix glaring error (`temp>>msb` should be `temp>>=msb`) at the same time. – Floris May 30 '13 at 14:12
Recommend changing `= 1< – chux - Reinstate Monica May 30 '13 at 18:54

Paul · Answer 3 · 2013-05-30T05:02:01.790

2

You can do this using GCC builtins, if you're compiling with GCC. The builtin __builtin_clzll counts the number of leading zeros in an unsigned long long. You can use it to calculate the position of the most significant bit, and then left shift 1 that many times to get your answer:

#include <limits.h>

Then use:

unsigned long long result = 
  num ? 1LLU << (sizeof(unsigned long long)*CHAR_BIT - __builtin_clzll(num) - 1) : 0;

printf("%llu\n", result);

edited May 30 '13 at 05:02

answered May 30 '13 at 04:49

Paul

139,544
27
275
264

I think you want `1LLU << ...` instead of `1`. – chux - Reinstate Monica May 30 '13 at 04:56
@chux Thanks! I updated it. – Paul May 30 '13 at 05:01

score 2 · Answer 4 · answered May 30 '13 at 13:42

This classic document has many ways to find the floor(log base 2) of an integer. After you've found the log, the number you want is of course 1 << log.

The most fascinating suggestion is this

// Find the integer log base 2 of an integer with an 64-bit IEEE float 
int v; // 32-bit integer to find the log base 2 of
int r; // result of log_2(v) goes here
union { unsigned int u[2]; double d; } t; // temp

t.u[__FLOAT_WORD_ORDER==LITTLE_ENDIAN] = 0x43300000;
t.u[__FLOAT_WORD_ORDER!=LITTLE_ENDIAN] = v;
t.d -= 4503599627370496.0;
r = (t.u[__FLOAT_WORD_ORDER==LITTLE_ENDIAN] >> 20) - 0x3FF;

The code above loads a 64-bit (IEEE-754 floating-point) double with a 32-bit integer (with no paddding bits) by storing the integer in the mantissa while the exponent is set to 252. From this newly minted double, 252 (expressed as a double) is subtracted, which sets the resulting exponent to the log base 2 of the input value, v. All that is left is shifting the exponent bits into position (20 bits right) and subtracting the bias, 0x3FF (which is 1023 decimal). This technique only takes 5 operations, but many CPUs are slow at manipulating doubles, and the endianess of the architecture must be accommodated.

So the final result you want will be 1 << r. Note that manipulation of doubles is much faster now than when this article was written. The best thing about this code is that it contains no branches, so will pipeline nicely. You should definitely give it a try. I don't have time to try a benchmark just now, but it would be interesting.

I can't vouch that this code meets the C Standard.

May not be standard C, but well cool! – Floris May 30 '13 at 17:52 — Floris, May 30 '13 at 17:52
For standard C, simply use `int r; frexp(v, &r); r -= 1;`. – Eric Postpischil May 30 '13 at 19:53 — Eric Postpischil, May 30 '13 at 19:53

score 1 · Answer 5 · answered May 30 '13 at 02:42

1

I assume that if a number is already a power of 2 or zero, it should be returned without change. Positive numbers only.

int floor2(int n)
{
    if ((n & (n-1)) == 0)
        return n;
    while (((n+1) & n) != 0)
    {
        n = n | (n+1);
    }
    return (n + 1) >> 1;
}

The fancy bit twiddling here takes advantage of the fact that subtracting 1 from a number with a single bit (i.e. a power of 2) sets all the bits below it, while adding 1 to a number will set the bottom-most zero bit.

answered May 30 '13 at 02:42

Mark Ransom

299,747
42
398
622

I had to think about this for a bit, but now I like it. Is it faster than binary search though? I think it only saves iterations for non zero bits; am i right? – Floris May 30 '13 at 02:53
@Floris, right - it only skips non-zero bits. Best case is when there are no zero bits, worst case is when 2 bits are set and the rest are zero. Binary search might be better. – Mark Ransom May 30 '13 at 02:57
@bugsbunny sample code was with unsigned integers. I asked this about another solution too, does this method, converted to unsigned, work with MAX unsigned? (It is the n+1 overflow that concerns me - looks like it would exit the while loop immedaitely) Maybe a simple test for MAX UINT would solve things. – chux - Reinstate Monica May 30 '13 at 04:39

score 1 · Answer 6 · edited May 23 '17 at 12:11

This is problem is very closely related to finding the most significant bit problem; because after that it's just a bit of bit shifting:

Finding MSB is well described here:

Find most significant bit (left-most) that is set in a bit array

and then you do something like this:

int foo = 1000;
int bar = ((msb(foo) << 1) - 1) >> 1; 
if ( bar > foo ) bar = bar >> 1;

and you have it.

If you're on intel architecture you can use __builtin_clz (calculate leading zeros) in gcc to get the msb;

Or

Here is a really fun way to do calculate leading zeros without CPU support

http://www.hackersdelight.org/hdcodetxt/nlz.c.txt

I have to upvote this for the awesome hackersdelight link! – Floris May 30 '13 at 17:59 — Floris, May 30 '13 at 17:59

chux - Reinstate Monica · Answer 7 · 2013-05-30T18:58:28.713

1

Short and sweet ...
(Questioner's sample code had trouble with values >= 0x80000000LLU, fixed here.)
This needs only 1 compare in the loop rather than 2.

unsigned long long int MSMask(unsigned long long int) {
  if (num == 0) {
    return 0;
    }
  else {
    unsigned long long int mask = 0x8000000000000000LLU;
    while (!(mask & num)) mask >>= 1;
    return mask;
  }
}

edited May 30 '13 at 18:58

answered May 30 '13 at 03:47

chux - Reinstate Monica

143,097
13
135
256

This looks a lot like the "here is what I tried" example in the question?... – Floris May 30 '13 at 17:49
@Floris. Agreed, it _is_ a lot like the example. It needed 2 corrections: do the `>>=` _after_ the `while()` and begin with a correct `unsigned long long`. So algorithmically this did not add to the discussion, but many of the answers posted are algorithmically interesting but have troubles with some values. Many questions in SO are closed because they only generally address algorithms. I addressed specific program weaknesses and offered suggestions. BTW, you timing analyses is great! Should not testing also include correctness like against `ULLONG_MAX`? – chux - Reinstate Monica May 30 '13 at 18:45

perreal · Answer 8 · 2013-05-30T04:07:34.837

0

This is about 2 times faster than your version:

uint64_t g(uint64_t num) {
    uint64_t r = 1;
    num = num >> 1;
    while(num>r)   {
        r <<= 1;
    }
    return r;
}

edited May 30 '13 at 04:07

answered May 30 '13 at 02:50

perreal

94,503
21
155
181

If num > 0x8000000000000000LLU, the while loop does not terminate. `r` reaching the value 0x8000000000000000LLU would shift left to a 0 and then the loop keeps going. – chux - Reinstate Monica May 30 '13 at 03:53

vvy · Answer 9 · 2013-05-30T04:09:48.077

0

int test(int num) {
    int res = (num==0?0:1);
    while(num>1) {
        num=num>>1;
        res=res<<1;
    }
    return res;
}

faster than below when num is not large.

edited May 30 '13 at 04:09

answered May 30 '13 at 03:05

vvy

1,963
13
17

@bugsbunny did not specify what to do with f(0). Just noting that `test(0)` and `test(1)` both return 1. – chux - Reinstate Monica May 30 '13 at 04:00
@chux repaird,but I'll not consider whether num < 0 – vvy May 30 '13 at 04:11

score 0 · Answer 10 · answered May 30 '13 at 04:19

0

The worst case performance of the suggested bit twiddling tricks could / should be improved by gradually oring several bits to be ones:

 int s=1;
 while ((n+1) & n) {
    n|=n >> s;
    s<<=1;
 }
 return (n+1) >> 1;

This fragment exits when all the least significant bits are ones, requiring only some log2(log2(n)) iterations.

answered May 30 '13 at 04:19

Aki Suihkonen

19,144
1
36
57

@bugsbunny sample code was with unsigned integers, does this method, converted to unsigned ints, work with MAX unsigned? (It is the n+1 overflow that concerns me.) – chux - Reinstate Monica May 30 '13 at 04:28
No it doesn't. At least the while loop works for the MAX int, so one could probably find a better formula for the return value. (n>>1)+1 would work for that, but can't recall the specification for func(0) – Aki Suihkonen May 30 '13 at 04:34

pjs · Answer 11 · 2013-05-31T17:08:46.320

0

I know recursion is generally not as efficient as iteration, but I can't help myself -- I love a good recursion:

unsigned topBit(unsigned n) {
   unsigned m = n & (n-1);
   return m ? topBit(m) : n;
}

ADDENDUM: Actual timings showed this as slightly faster than @LaurenceGonsalves's iterative version when compiled with optimization.

edited May 31 '13 at 17:08

answered May 30 '13 at 19:01

pjs

18,696
4
27
56

Fastest way to find binary base

11 Answers11