In the piece of code here I came across a struct that wraps the definition and use of shared memory. I modified the allocation to be static and used it in a test program like the one below:
#include <stdio.h>
template<class T, uint bDim>
struct SharedMemory
{
     __device__ inline operator T *() {
        __shared__ T __smem[ bDim ];
        return (T*) (void *) __smem;
    }
     __device__ inline operator const T *() const {
        __shared__ T __smem[ bDim ];
        return (T*) (void *) __smem;
    }
};
template <uint bDim>
__global__ void myKernel() {
    SharedMemory<uint, bDim> myShared;
    myShared[ threadIdx.x ] = threadIdx.x;
    __syncthreads();
    // threadIdx.x and the stored values are unsigned, so %u is the matching format specifier.
    printf("%u\tsees\t%u\tat two on the circular right.\n", threadIdx.x, myShared[ ( threadIdx.x + 2 ) & 31 ]);
}
int main() {
    myKernel<32><<<1, 32>>>();
    cudaDeviceSynchronize();
    return 0;
}
It works as expected. However, I have a few questions about this usage:
- I don't understand the syntax used in the operator overloading section of the SharedMemory struct. Is it overloading the dereference operator *? If so, how do accesses via square brackets translate into a pointer dereference? Also, why does changing the line __device__ inline operator T *() { into __device__ inline T operator *() { produce compiler errors?
- I wanted to make the wrapper easier to use by overloading the assignment operator or defining a member function, so that each thread updates the shared memory location corresponding to its thread index. For example, writing myShared = 47; or myShared.set( 47 ); would translate into myShared[threadIdx.x] = 47; behind the scenes. I have been unsuccessful at this: it compiles fine, but the shared memory buffer reads back as all 0 (which I think is the default shared memory initialization in Debug mode). Can you please let me know what I'm doing wrong? Here is my attempt:
    template<class T, uint bDim>
    struct SharedMemory
    {
        __device__ inline operator T*() {
            __shared__ T __smem[ bDim ];
            return (T*) (void *) __smem;
        }
        __device__ inline operator const T *() const {
            __shared__ T __smem[ bDim ];
            return (T*) (void *) __smem;
        }
        __device__ inline T& operator=( const T& __in ) {
            __shared__ T __smem[ bDim ];
            __smem[ threadIdx.x ] = __in;
            return (T&) __smem[ threadIdx.x ];
        }
        __device__ inline void set( const T __in ) {
            __shared__ T __smem[ bDim ];
            __smem[ threadIdx.x ] = __in;
        }
    };
  For the member function, the compiler gives a warning: variable "__smem" was set but never used
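For the first question, the difference between the two spellings can be sketched on the host, without any CUDA. The sketch below assumes two hypothetical structs, Buffer and Deref, that are not part of the original code: Buffer declares a conversion function to T* (the form used in the wrapper), while Deref overloads the unary * operator. It is an illustration of the language rule only, not a statement about how the original wrapper was meant to be used.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical host-only stand-in for the wrapper (name chosen for illustration).
template <class T, std::size_t N>
struct Buffer {
    T data[N] = {};
    // Conversion function: no return type is written before `operator`,
    // because the target type after it (T*) is the return type.
    operator T*() { return data; }
};

// Hypothetical counterpart that overloads unary * (dereference) instead.
template <class T, std::size_t N>
struct Deref {
    T data[N] = {};
    T& operator*() { return data[0]; }
};

int conversion_demo() {
    Buffer<int, 4> b;
    b[2] = 7;        // b is converted to int*, then the built-in p[2], i.e. *(p + 2), applies
    int* p = b;      // the same conversion, written out explicitly
    return p[2] + *(p + 2);
}

int deref_demo() {
    Deref<int, 4> d;
    *d = 5;          // calls Deref::operator*()
    // d[2] = 7;     // would not compile: Deref has neither a pointer conversion nor operator[]
    return *d;
}
```

Under these assumptions, conversion_demo() yields 14 and deref_demo() yields 5. The square-bracket access works only in the first struct, because the object first becomes a pointer and the built-in subscript then does the rest.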
Although I am aware that member variables cannot be __shared__, I suspect that either I have made a wrong assumption or what I want to do does not fit the characteristics of the __shared__ qualifier. I appreciate the help.
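One way to probe that suspicion is a host-only analogue in which a function-local static array stands in for the __shared__ array. This is an assumption made purely for illustration: static is not __shared__, and the struct names SplitStorage and SingleStorage are hypothetical. The sketch compares declaring the array separately in each member function, as in the attempt above, with keeping it in one accessor that every other member goes through.

```cpp
#include <cassert>
#include <cstddef>

// Mirrors the attempt above: every member function declares its own array.
template <class T, std::size_t N>
struct SplitStorage {
    T* data() {
        static T buf[N] = {};   // array owned by data()
        return buf;
    }
    void set(std::size_t i, const T& v) {
        static T buf[N] = {};   // a different array, owned by set()
        buf[i] = v;             // written here, but never read by anyone
    }
};

// Alternative: a single accessor owns the array and all members route through it.
template <class T, std::size_t N>
struct SingleStorage {
    static T* data() {
        static T buf[N] = {};   // the only array
        return buf;
    }
    void set(std::size_t i, const T& v) { data()[i] = v; }
};

int split_demo() {
    SplitStorage<int, 4> s;
    s.set(1, 47);
    return s.data()[1];         // reads the array in data(), not the one set() wrote to
}

int single_demo() {
    SingleStorage<int, 4> s;
    s.set(1, 47);
    return s.data()[1];         // reads back what set() stored
}
```

On the host, split_demo() returns 0 and single_demo() returns 47, which matches both the all-zero reads and the "set but never used" warning. On the device, the corresponding change would presumably be to keep the __shared__ array in a single __device__ accessor and route operator T*(), operator= and set() through it; that mapping is an assumption and has not been checked against the kernel above.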
 
     
    