
I have been doing a lot of application testing, trying different libraries, threading ratios, operation distributions, and so on. I want to be able to quickly generate graphs from my run data.

The data is written to a tab-separated file that is conceptually divided into a left half and a right half. Each row is a separate execution configuration.

The left half holds the configuration parameters and the right half holds a series of execution times. (For example, we may run a configuration 20 times to ensure a stable average.)

I want to quickly be able to plot graphs of this data and/or calculate values such as standard deviation.

For example, I would specify a column in the left half to be used as the X-axis, and the rest of the left-half columns would be used to identify the execution config used to generate each line. The Y-axis will always be execution time (or normalized performance).

For example, the left-hand side could be:

| Threads | Elements | AlgorithmA | AlgorithmB |
|---------+----------+------------+------------|
|       1 |      100 | t1a        | t1b        |
|       2 |      100 | t1a        | t1b        |
|       3 |      100 | t1a        | t1b        |
|       1 |   100000 | t1a        | t1b        |
|       2 |   100000 | t1a        | t1b        |
|       3 |   100000 | t1a        | t1b        |
|---------+----------+------------+------------|
|       1 |      100 | t2a        | t1b        |
|       2 |      100 | t2a        | t1b        |
|       3 |      100 | t2a        | t1b        |
|       1 |   100000 | t2a        | t1b        |
|       2 |   100000 | t2a        | t1b        |
|       3 |   100000 | t2a        | t1b        |
|---------+----------+------------+------------|
|       1 |      100 | t2a        | t2b        |
|       2 |      100 | t2a        | t2b        |
|       3 |      100 | t2a        | t2b        |
|       1 |   100000 | t2a        | t2b        |
|       2 |   100000 | t2a        | t2b        |
|       3 |   100000 | t2a        | t2b        |
|---------+----------+------------+------------|
|       1 |      100 | t1a        | t2b        |
|       2 |      100 | t1a        | t2b        |
|       3 |      100 | t1a        | t2b        |
|       1 |   100000 | t1a        | t2b        |
|       2 |   100000 | t1a        | t2b        |
|       3 |   100000 | t1a        | t2b        |
|---------+----------+------------+------------|

If the X-axis were Threads, there would be 8 lines (4 algorithm combinations times 2 element sizes), each with 3 data points, which I would then filter down to a single element size. If the X-axis were Elements, there would be 12 lines (4 algorithm combinations times 3 thread counts), each with 2 data points, which I would then filter down to a single thread count.

My Google-fu has failed me at finding a tool I like; maybe I am searching for the wrong keywords. So my question is: what application can you recommend to solve my problem and make my days less repetitive?

Prasanna

1 Answer


You can do pretty much all of this in Python with IPython, matplotlib and pandas, with some statsmodels thrown in for good measure. These packages have further dependencies of their own, such as NumPy, so check each project's installation docs.
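As a rough illustration of what that could look like for the file layout in the question, here is a minimal sketch. The timing column names (`run1` to `run3`), all timing values, and the helper names `summarize` and `plot_lines` are made up for the example; in practice you would call `pd.read_csv("results.tsv", sep="\t")` on your own file (the filename is hypothetical too).

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample in the layout described in the question: configuration
# columns on the left, repeated timings on the right. The names run1..run3
# and every timing value are illustrative assumptions, not real results.
SAMPLE_TSV = (
    "Threads\tElements\tAlgorithmA\tAlgorithmB\trun1\trun2\trun3\n"
    "1\t100\tt1a\tt1b\t10.0\t11.0\t12.0\n"
    "2\t100\tt1a\tt1b\t6.0\t7.0\t8.0\n"
    "3\t100\tt1a\tt1b\t4.0\t5.0\t6.0\n"
    "1\t100\tt2a\tt1b\t9.0\t9.0\t9.0\n"
    "2\t100\tt2a\tt1b\t5.0\t5.0\t5.0\n"
    "3\t100\tt2a\tt1b\t3.0\t3.0\t3.0\n"
)

CONFIG_COLS = ["Threads", "Elements", "AlgorithmA", "AlgorithmB"]


def summarize(df, config_cols):
    """Return the config columns plus per-row mean and std of the timings."""
    # Every column that is not a configuration column is treated as a timing.
    timing_cols = [c for c in df.columns if c not in config_cols]
    out = df[config_cols].copy()
    out["mean"] = df[timing_cols].mean(axis=1)
    out["std"] = df[timing_cols].std(axis=1)  # sample std (ddof=1)
    return out


def plot_lines(summary, x_col, config_cols, fixed=None):
    """Plot mean time against x_col, one line per remaining configuration.

    `fixed` maps a config column to the single value to keep, for example
    {"Elements": 100} to look at one element size only.
    """
    fixed = fixed or {}
    data = summary
    for col, value in fixed.items():
        data = data[data[col] == value]

    # Whatever is neither the X-axis nor filtered identifies a line.
    line_cols = [c for c in config_cols if c != x_col and c not in fixed]
    groups = data.groupby(line_cols) if line_cols else [("all", data)]

    fig, ax = plt.subplots()
    for key, group in groups:
        group = group.sort_values(x_col)
        ax.errorbar(group[x_col], group["mean"], yerr=group["std"],
                    marker="o", label=str(key))
    ax.set_xlabel(x_col)
    ax.set_ylabel("Execution time")
    ax.legend(title=", ".join(line_cols))
    return ax


df = pd.read_csv(io.StringIO(SAMPLE_TSV), sep="\t")
summary = summarize(df, CONFIG_COLS)
```

Calling `plot_lines(summary, "Threads", CONFIG_COLS, fixed={"Elements": 100})` followed by `plt.show()` would then draw one line per algorithm combination, with the standard deviation as error bars. Passing `"Elements"` as `x_col` and fixing `Threads` instead gives the other view described in the question.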

I highly recommend reading Python For Data Analysis as it walks you through all the packages I mentioned and more, although basic knowledge of Python is assumed.

MattDMo