I was reading this article that explains unified memory, and I went through the code provided, but the one thing I cannot find an answer to is whether, in order to use unified memory, I must always allocate objects on the heap through `new` — otherwise I get runtime errors.
Am I missing some CUDA configuration step? I'm working on a GTX 760 with CUDA 8.0 installed.
// Example type whose instances must live in CUDA managed (unified) memory so
// that both host and device code can access them.
//
// Inheriting PUBLICLY from `Managed` (declared elsewhere; presumably it
// overloads operator new/delete to use cudaMallocManaged/cudaFree — TODO
// confirm against its definition) means only HEAP allocations (`new Object`)
// receive managed storage.  A stack instance (`Object o;`) never calls
// operator new, so it lives in ordinary host memory — which is exactly why
// passing it to a kernel faults.
//
// Fixes vs. the original:
//  * `}` was swallowed by the `//` comments (`{//do something}`), leaving the
//    braces unbalanced — bodies now use /* */ placeholders.
//  * class members default to private; made them public so `main`'s
//    `Object o;`, `o.foo()` and `p->foo()` compile.
//  * inheritance defaults to private for `class`; made it public so the
//    inherited managed operator new/delete are usable by clients.
class Object : public Managed
{
public:
  Object() { /* do something */ }
  void foo() { /* do something else */ }
};
// Kernel that operates on a single Object passed by reference.  A C++
// reference is lowered to a pointer at the ABI level, so `obj` must reside at
// an address the device can dereference (e.g. unified/managed memory obtained
// via `new` on a Managed-derived type).  Passing a reference to a host
// stack-allocated object hands the device a host address and faults at run
// time — this is the difference between the two launches in main().
// NOTE(review): the visible body places no constraints on the grid/block
// layout; any launch configuration is accepted.
__global__ void aKernel(Object& obj)
{
  //do something in parallel with the object
}
// Demonstrates why only heap-allocated (managed) Objects are usable from a
// kernel: stack allocation bypasses Managed::operator new, so the object sits
// in plain host memory that the device cannot touch.
//
// Fixes vs. the original: `b`/`t` were undefined; `p.foo()` used `.` on a
// pointer (must be `p->foo()`); `p` was leaked; missing return value.
int main()
{
  // Launch configuration; any valid sizes work for this illustration.
  const int b = 1;   // blocks
  const int t = 32;  // threads per block

  // Stack instance: Managed::operator new is never invoked, so `o` lives in
  // ordinary host memory.  The kernel receives a host address and faults.
  Object o;
  aKernel<<<b, t>>>(o);
  cudaDeviceSynchronize();   // the in-kernel fault surfaces here
  o.foo();                   // ERROR

  // Heap instance: `new` routes through the inherited managed operator new,
  // so the same address is valid on both host and device.
  Object* p = new Object;
  aKernel<<<b, t>>>(*p);
  cudaDeviceSynchronize();
  p->foo();                  // GOOD

  delete p;                  // release the managed allocation
  return 0;
}
 
     
    