Converting all variables into gpuArrays doesn't speed up computation

Question

I'm writing a simulation in MATLAB that uses CUDA acceleration.

Suppose we have vectors x and y, a matrix A, and scalar variables dt, dx, a, b, c.

I found that putting x, y and A into gpuArray() before running the iteration and the built-in functions speeds up the iteration significantly.

However, when I also put the scalars dt, dx, a, b, c into gpuArray(), the program slowed down significantly, by well over 30% (the run time increased from 7 s to 11 s).

Why is it not a good idea to put all the variables into gpuArray()?

(A short note: those scalars are only ever multiplied with x, y and A, and are never used on their own during the iteration.)

It's impossible to answer without knowing more about your variables, what you do with them, etc. You should `profile` your code and look for bottlenecks. — Dev-iL, Oct 08 '18 at 07:50
My grain of salt: you are doing lightweight operations, which means the kernel launch overhead is not negligible. Passing more parameters to the kernel (dt, dx, etc.) adds even more launch overhead. Try packing them all into a single array and passing that as one parameter. I assume you are giving each of them its own gpuArray(); what I mean is to put them all into the same gpuArray() instance and access them by index. — huseyin tugrul buyukisik, Oct 08 '18 at 16:42

score 4 · Accepted Answer · answered Oct 08 '18 at 07:53

GPU hardware is optimised for working on relatively large amounts of data. You only really see the benefit of GPU computing when you can feed the many processing cores lots of data to keep them busy. Typically this means you need operations working on thousands or millions of elements.

The overheads of launching operations on the GPU dwarf the computation time when you're dealing with scalar quantities, so it is no surprise that they are slower than on the CPU. (This is not peculiar to MATLAB & gpuArray).
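
As a rough illustration of the point above, here is a minimal sketch that times the same update twice: once with the scalars left as ordinary host doubles and once with each scalar wrapped in its own gpuArray. The size `n`, the scalar values, and the expression itself are assumptions made up for this example, not the asker's actual code, and the measured times will depend on the GPU and the MATLAB release.

```matlab
% Hypothetical sizes and values, chosen only for illustration.
n  = 2000;
dt = 1e-3; dx = 1e-2; a = 0.5;   % scalars kept as ordinary host doubles

A = rand(n, 'gpuArray');         % large data lives on the GPU
x = rand(n, 1, 'gpuArray');

% Variant 1: host scalars combined with GPU arrays.
% a*dt/dx is evaluated on the CPU; only the array operations run on the GPU.
f_host = @() (a*dt/dx) * (A*x);

% Variant 2: every scalar is its own gpuArray, so a_g*dt_g/dx_g is
% itself carried out as GPU operations on single elements.
a_g = gpuArray(a); dt_g = gpuArray(dt); dx_g = gpuArray(dx);
f_gpu = @() (a_g*dt_g/dx_g) * (A*x);

% gputimeit synchronises with the GPU before reporting a time.
t_host = gputimeit(f_host);
t_gpu  = gputimeit(f_gpu);
fprintf('host scalars: %.3g s, gpuArray scalars: %.3g s\n', t_host, t_gpu);
```

Both variants compute the same result. The only difference is where the scalar arithmetic happens, so any gap between the two timings reflects the per-operation overhead of doing scalar work on the GPU.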
