CUDA Launch: Out of Resources Error and Strongly Typed Methods

You may see this cryptic error now and then when developing with CUDAfy.NET. The usual cause is passing arguments whose types do not match the parameters of the device function. This example is going to end in tears:

        [Cudafy]
        public static void Scale(GThread thread, ComplexFloat[] c, float scale)
        {
            int id = thread.get_global_id(0);
            c[id].R = c[id].R * scale;
            c[id].I = c[id].I * scale;
        }
        ...
        int N = 1024;
        gpu.Launch(gridSize, BLOCK_LEN, "Scale", devBuffer, 1.0/N);
        ...

It can be corrected by changing 1.0 to 1.0F so that the expression stays in single precision. As it stands above, the result of the division is a double, which is 8 bytes, while the device function expects a float, which is 4 bytes. Hence the out of resources message. Kind of makes sense.
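
To make the size mismatch concrete, here is a minimal sketch in plain C# with no CUDAfy involved. The class name ArgumentWidth and the helper TypeOf are made up for illustration; the only point is how the literal decides the type of the whole expression.

```csharp
using System;

public static class ArgumentWidth
{
    // Hypothetical helper: returns the runtime type name of the boxed value.
    public static string TypeOf(object value) => value.GetType().Name;

    public static void Main()
    {
        int n = 1024;

        object asDouble = 1.0 / n;   // double literal: n is promoted to double
        object asFloat = 1.0F / n;   // float literal: n is promoted to float

        // Prints "Double, 8 bytes" and then "Single, 4 bytes".
        Console.WriteLine($"{TypeOf(asDouble)}, {sizeof(double)} bytes");
        Console.WriteLine($"{TypeOf(asFloat)}, {sizeof(float)} bytes");
    }
}
```

When the kernel name is passed as a string, nothing at compile time compares these types with the float parameter of Scale, so the wider value goes through unchallenged.
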
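A better policy can be to use strong typing in the Launch: pass the method itself rather than its name as a string, so that the compiler can check the argument types against the method's parameters.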

        [Cudafy]
        public static void Scale(GThread thread, ComplexFloat[] c, float scale)
        {
            int id = thread.get_global_id(0);
            c[id].R = c[id].R * scale;
            c[id].I = c[id].I * scale;
        }
        ...
        int N = 1024;
        gpu.Launch(gridSize, BLOCK_LEN, Scale, devBuffer, 1.0F/N);
        ...
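
Why passing the method instead of a string helps can be sketched without a GPU. The Launch below is a hypothetical stand-in written for this example, not CUDAfy's actual signature, and Scale here is an ordinary CPU method. The assumption is only that a strongly typed launch takes a generic delegate, which is enough for the C# compiler to reject a double where a float is expected.

```csharp
using System;

public static class TypedLaunchSketch
{
    // Hypothetical stand-in for a kernel; plain CPU code.
    public static void Scale(float[] data, float scale)
    {
        for (int i = 0; i < data.Length; i++)
            data[i] *= scale;
    }

    // Hypothetical stand-in for a strongly typed launch: T1 and T2 are
    // inferred from the arguments, and the method passed as the kernel
    // must accept exactly those types.
    public static void Launch<T1, T2>(Action<T1, T2> kernel, T1 arg1, T2 arg2)
    {
        kernel(arg1, arg2);
    }

    public static void Main()
    {
        int n = 4;
        float[] buffer = { 4f, 8f, 12f, 16f };

        Launch(Scale, buffer, 1.0F / n);    // compiles: float matches float
        // Launch(Scale, buffer, 1.0 / n);  // compile error: double where float is expected

        Console.WriteLine(string.Join(", ", buffer));
    }
}
```

Uncommenting the second call turns the runtime surprise into a build failure, which is the whole attraction of the strongly typed form.
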

The only downside is a slight performance hit due to reflection. That hit is nowhere near as large as the one incurred by the more elegant-looking dynamic launching that CUDAfy also supports.

        gpu.Launch(gridSize, BLOCK_LEN).Scale(devBuffer, 1.0F/N);

The first time this is called you can get hit with about 20 ms of dynamic runtime goodness. Subsequent calls appear efficient. However, you have also lost your strongly typed safety, which does not appear to worry millions of programmers around the world.
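
If you want to see the first-call cost for yourself, a rough sketch along these lines may help. DynamicLauncher is a hypothetical stand-in built on the standard DynamicObject class, not CUDAfy's implementation, so the timings only illustrate the general behaviour of C# dynamic dispatch and will differ from what a real launch shows.

```csharp
using System;
using System.Diagnostics;
using System.Dynamic;

// Hypothetical stand-in: accepts any method name and records it.
public sealed class DynamicLauncher : DynamicObject
{
    public string LastKernel { get; private set; }

    public override bool TryInvokeMember(InvokeMemberBinder binder, object[] args, out object result)
    {
        LastKernel = binder.Name;   // the "kernel" is resolved by name at run time
        result = null;
        return true;
    }
}

public static class DynamicCostSketch
{
    // Times a single dynamically dispatched call, in milliseconds.
    public static double TimeCallMs(dynamic launcher)
    {
        var sw = Stopwatch.StartNew();
        launcher.Scale(new float[4], 0.25F);
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        var launcher = new DynamicLauncher();
        double first = TimeCallMs(launcher);    // includes setting up the runtime binder
        double second = TimeCallMs(launcher);   // reuses the cached call site

        Console.WriteLine($"first call:  {first:F3} ms");
        Console.WriteLine($"second call: {second:F3} ms");
    }
}
```

Note that the compiler would accept launcher.Scale(new float[4], 1.0 / 4) just as happily, which is exactly the lost safety mentioned above.
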