So I find myself taking advantage of heapq for some calculations. However, for the problem I am working on, it runs slowly because the heap gets very large.
I thought I had an option to speed it up: rather than creating one giant heap, make it a heap of heaps. But, surprisingly to me, the "more efficient" code is significantly slower. There's a bit more overhead in the more efficient code, but I really thought it would win by a lot. Having stripped the problem down, I've got two functions that do the same net calculation: f1 is the "naive" (and faster) version, and f2 is the "improved" (but slower) version. Both do some random number generation, but with the same seed, so they really do the same work.
import random
import heapq

def f1():
    random.seed(1)
    Q = [0]
    while Q:
        value = heapq.heappop(Q)
        #print value
        if value < 0.5:
            for counter in range(16):
                heapq.heappush(Q, value + 0.1 + (random.random() / 1000))
    print value
def f2():
    random.seed(1)
    Q = [[0]]
    while Q:
        subQ = heapq.heappop(Q)
        value = heapq.heappop(subQ)
        #print value
        if subQ:
            # the sub-heap still has entries, so put it back
            heapq.heappush(Q, subQ)
        newQ = []
        if value < 0.5:
            for counter in range(16):
                newQ.append(value + 0.1 + (random.random() / 1000))
            heapq.heapify(newQ)
            heapq.heappush(Q, newQ)
    print value
Why does the heap of heaps (f2) run significantly slower? It should call heappush the same number of times and heappop twice as often, but each individual heap is much smaller, so I expected it to run faster.
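One way to investigate is to count the element comparisons each version actually performs, since sift-up/sift-down cost is driven by comparisons. Below is a rough Python 3 sketch (the `Counted` wrapper, the `count_f1`/`count_f2` names, and the `branch` parameter are mine, not from the original code; `branch` just replaces the hard-coded 16 so smaller runs are possible):

```python
import heapq
import random

class Counted:
    """Float wrapper whose __lt__ tallies every comparison the heaps make."""
    comparisons = 0          # class-wide counter, reset at the start of each run
    __slots__ = ("v",)

    def __init__(self, v):
        self.v = v

    def __lt__(self, other):
        Counted.comparisons += 1
        return self.v < other.v

def count_f1(branch=16):
    """f1 with comparison counting; returns the total comparison count."""
    Counted.comparisons = 0
    random.seed(1)
    Q = [Counted(0)]
    while Q:
        value = heapq.heappop(Q).v
        if value < 0.5:
            for _ in range(branch):
                heapq.heappush(Q, Counted(value + 0.1 + random.random() / 1000))
    return Counted.comparisons

def count_f2(branch=16):
    """f2 with comparison counting: a heap of sub-heaps of Counted values."""
    Counted.comparisons = 0
    random.seed(1)
    Q = [[Counted(0)]]
    while Q:
        subQ = heapq.heappop(Q)
        value = heapq.heappop(subQ).v
        if subQ:
            # lists compare by first element, i.e. by each sub-heap's minimum
            heapq.heappush(Q, subQ)
        if value < 0.5:
            newQ = [Counted(value + 0.1 + random.random() / 1000)
                    for _ in range(branch)]
            heapq.heapify(newQ)
            heapq.heappush(Q, newQ)
    return Counted.comparisons

if __name__ == "__main__":
    print(count_f1(8), count_f2(8))
```

Comparing the two counts (and timing the originals with timeit) should show where the extra work in f2 goes: every pop of the outer heap also does list comparisons to reorder the sub-heaps, on top of the per-sub-heap pushes and pops.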