I have a host matrix, HostMatrix, declared as:
float **HostMatrix;
I need to copy the contents of a device matrix, pointed to by devicePointer, into the two-dimensional host matrix HostMatrix.
I tried this:
for (int i=0; i<numberOfRows; i++){
    cudaMemcpy(HostMatrix[i], devicePointer, numberOfColumns *sizeof(float),
                 cudaMemcpyDeviceToHost);
    devicePointer += numberOfColumns;// so as to reach next row
}
But I think this is wrong, since I am doing it inside a host function, and devicePointer cannot be manipulated directly in a host function the way I do in the last line of the loop.
So what is the correct way to achieve this?
Edit
Oh, actually this does work correctly! The problem comes when de-allocating the memory, as discussed in my earlier question: CUDA: Invalid Device Pointer error when reallocating memory. After the loop, devicePointer no longer holds the address that was originally allocated, so basically the following is incorrect:
for (int i=0; i<numberOfRows; i++){
    cudaMemcpy(HostMatrix[i], devicePointer, numberOfColumns *sizeof(float),
               cudaMemcpyDeviceToHost);
    devicePointer += numberOfColumns; // so as to reach next row
}
cudaFree(devicePointer); // invalid device pointer: no longer the allocated address
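One way around this is to leave devicePointer untouched and advance a separate cursor instead, so that cudaFree still receives the originally allocated address. Below is a minimal sketch, assuming the device matrix was allocated with cudaMalloc as one contiguous row-major block. The sizes, the staging buffer, and the name rowCursor are made up for the demo and are not part of my real code.

```cuda
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

int main() {
    const int numberOfRows = 3;      // hypothetical sizes for the demo
    const int numberOfColumns = 4;
    const size_t rowBytes = numberOfColumns * sizeof(float);
    const size_t totalBytes = numberOfRows * rowBytes;

    // Host matrix as an array of row pointers, matching float **HostMatrix.
    float **HostMatrix = (float **)malloc(numberOfRows * sizeof(float *));
    for (int i = 0; i < numberOfRows; i++) {
        HostMatrix[i] = (float *)malloc(rowBytes);
    }

    // Hypothetical test data: a flat buffer holding 0, 1, 2, ... uploaded to the device.
    float *staging = (float *)malloc(totalBytes);
    for (int k = 0; k < numberOfRows * numberOfColumns; k++) {
        staging[k] = (float)k;
    }

    float *devicePointer = NULL;
    cudaError_t err = cudaMalloc((void **)&devicePointer, totalBytes);
    if (err == cudaSuccess) {
        err = cudaMemcpy(devicePointer, staging, totalBytes, cudaMemcpyHostToDevice);
    }

    // Advance a separate cursor; devicePointer itself is never modified.
    const float *rowCursor = devicePointer;
    for (int i = 0; i < numberOfRows && err == cudaSuccess; i++) {
        err = cudaMemcpy(HostMatrix[i], rowCursor, rowBytes, cudaMemcpyDeviceToHost);
        rowCursor += numberOfColumns; // so as to reach next row
    }

    // devicePointer still holds the address returned by cudaMalloc.
    cudaError_t freeErr = cudaFree(devicePointer);
    if (err == cudaSuccess) {
        err = freeErr;
    }

    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
    } else {
        printf("last element: %.1f\n", HostMatrix[numberOfRows - 1][numberOfColumns - 1]);
    }

    for (int i = 0; i < numberOfRows; i++) {
        free(HostMatrix[i]);
    }
    free(HostMatrix);
    free(staging);
    return err == cudaSuccess ? 0 : 1;
}
```

Computing each row's address as devicePointer + i * numberOfColumns inside the loop would work equally well; the only requirement is that the pointer passed to cudaFree is the one that was allocated.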
 
     
    