Problem description:
I'm optimizing a quite complex algorithm which, unfortunately, relies heavily on the `set` and `frozenset` datatypes (because of their faster `in` operator). This means I get a different execution time on every test run, even for exactly the same input data. And since I badly need to optimize the algorithm, I'd like to have a constant (as far as possible) execution time on every run.
Demonstration
I made a simplified example, which hopefully demonstrates the problem:
import timeit

class A(object):
    def __init__(self, val):
        self.val = val

def run_test():
    result = []
    for i in xrange(100):
        a = {A(j) for j in xrange(100)}
        result.append(sorted(v.val for v in a))
    return result

N = 10
times = timeit.Timer(run_test).repeat(repeat=3, number=N)
print '%.6f s' % (min(times) / N,)
The core of the problem is the ordering of objects in sets - it depends (I think) on their position in memory, which is of course different on each run. Then, when the values are sorted, `sorted` takes a different amount of time each time. On my machine, this gives execution times that vary by about 10%.
It's not the best demonstration, because my real code depends much more heavily on set ordering and the time differences are much larger, but I hope you get the picture.
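To make the mechanism concrete, here is a minimal illustration (CPython-specific behaviour, and my own sketch rather than code from the question): instances that don't define `__hash__` hash by identity, i.e. their hash is derived from the object's memory address, so the set's bucket placement and iteration order track addresses rather than values.

```python
class A(object):
    def __init__(self, val):
        self.val = val

# Default __hash__ is identity-based: two instances holding equal
# values still hash differently, because each hash derives from the
# object's address (in CPython).
x, y = A(1), A(1)
print(hash(x) != hash(y))

# The iteration order of a set of such objects therefore follows their
# addresses, not their values; only the *set of values* is fixed.
order = [v.val for v in {A(j) for j in range(100)}]
print(sorted(order) == list(range(100)))
```

Across interpreter runs the addresses differ, so `order` differs, which is exactly what makes the downstream `sorted` call (and anything else that depends on iteration order) take a varying amount of time.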
What I tried:
- sorting the sets in the algorithm - it gives a constant execution time, but also makes the whole algorithm ten times slower
- using very large `number` and `repeat` parameters - unless I want to wait an hour after each change, this won't help me
What I'd like to try:
I think that if I could somehow "reset" the Python interpreter to have "clean memory", objects would end up at predictable memory positions and the time measurements would be constant. But I have no idea how to do something like this, except for creating a VM and restarting it every time I want to run a test.
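Short of a full VM, a cheaper approximation of that idea (a sketch only - the child script below is a stand-in for the real test, not the actual algorithm) is to collect each timing sample from a fresh child interpreter via `subprocess`. Each sample then comes from an independent memory layout, and the minimum over several layouts is a less noisy estimate:

```python
import subprocess
import sys

# Stand-in test script, executed in a fresh interpreter each time so
# that one run's memory layout cannot influence the next sample.
child = """
import timeit

class A(object):
    def __init__(self, val):
        self.val = val

def run_test():
    a = {A(j) for j in range(100)}
    return sorted(v.val for v in a)

print(timeit.timeit(run_test, number=10))
"""

# Three independent samples, one interpreter (and one memory layout) each.
samples = [float(subprocess.check_output([sys.executable, "-c", child]))
           for _ in range(3)]
best = min(samples)
```

This doesn't make any single run deterministic - it only makes the samples independent, so the spread you see reflects layout variation rather than interpreter warm-up.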
Not an issue:
- I profile a lot, and I know which functions are slowest by now - I just need to make them faster - those are the functions whose speed I'm trying to measure
- I can't use anything other than `set` and `frozenset` for testing (like some ordered set), because it would be much slower and the measured speed would bear no relation to the production code
- `set` and `frozenset` performance itself is not important here
Summary / Question:
- my algorithm uses `set`s internally - I want to measure its execution speed
- the execution speed depends on the order in which the elements of the internal `set`s are retrieved
- the test I'm using has fixed input values
- based on `timeit` measurements, I'm unable to measure the impact of any change I make
- in the test above, the `run_test` function is a good example of my real problem
So I need some way to temporarily make sure all set elements are created at the same memory positions, which would make the test execution speed and the number of function calls (when profiling) deterministic.
Additional example
This example perhaps demonstrates my problem better:
import timeit
import time

class A(object):
    def __init__(self, val):
        self.val = val

    def get_value_from_very_long_computation(self):
        time.sleep(0.1)
        return self.val

def run_test():
    input_data = {A(j) for j in xrange(20)}
    for i, x in enumerate(input_data):
        value = x.get_value_from_very_long_computation()
        if value > 16:
            break
    print '%d iterations' % (i + 1,)

times = timeit.Timer(run_test).repeat(repeat=1, number=1)
print '%.9f s' % (min(times),)
Which returns, for example:
$ ./timing-example2.py
4 iterations
0.400907993 s
$ ./timing-example2.py
3 iterations
0.300778866 s
$ ./timing-example2.py
8 iterations
0.801693201 s
(this will, of course, be different every time you run it, and may or may not be completely different on another machine)
You can see that the execution speed is VERY different each time while the input data remains exactly the same. This is exactly the behaviour I see when I measure my algorithm speed.
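If the test harness (not the production code) is allowed to adjust the class, one possible workaround - my sketch, not something from the question - is to give `A` a value-based `__hash__`/`__eq__` for testing only. Bucket placement then depends on the values instead of the addresses, so the iteration order, and hence the iteration count, stops varying between runs. Note this changes set semantics (instances with equal `val` collapse into one element), so it is only safe when the values are unique:

```python
class A(object):
    def __init__(self, val):
        self.val = val

    # Value-based hashing: bucket placement no longer depends on the
    # instance's memory address, so iteration order is reproducible.
    def __hash__(self):
        return hash(self.val)

    def __eq__(self, other):
        return isinstance(other, A) and self.val == other.val

def run_test():
    input_data = {A(j) for j in range(20)}
    for i, x in enumerate(input_data):
        if x.val > 16:
            break
    return i + 1

# Every run now walks the set in the same order, so the iteration
# count is identical on each call.
counts = {run_test() for _ in range(10)}
```

With identity-based hashing, `counts` would typically contain several different values; with value-based hashing it collapses to a single one, which is exactly the determinism the timing test needs.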