Chapter 17: improving performance of Flocking

mjeragh · February 6, 2019, 6:18pm

To improve performance, I wanted to combine the three loops (cohesion, separation, and alignment) into a single loop, so I wrote this function:

    device float2 * cohesion_separation_alignment(uint index, device Boid* boids, uint particleCount){
        Boid thisBoid = boids[index];
        float2 position[3]= {float2(0),float2(0),float2(0)};
        
        for (uint i =0; i < particleCount; i++){
            if (i != index) {
                position[0] += boids[i].position[0];
                if (abs(distance(boids[i].position[1], thisBoid.position)) < limit){
                    position[1] -= (boids[i].position[1] - thisBoid.position);
                }
                position[2] += boids[i].position[2];
            }
            
        }
        return position;
    }

But the shader compiler complains about the return statement:
/starter/Flocking/Flocking/Shaders.metal:128: cannot initialize return object of type ‘device float2 *’ with an lvalue of type ‘float2 [3]’

I also tried

    return &position[0];

but that fails with:

/starter/Flocking/Flocking/Shaders.metal:128: cannot initialize return object of type ‘device float2 *’ with an rvalue of type ‘float2 *’

Is there a way to return the array? I want to see whether I can get a performance boost from this experiment.

Thank you for all your effort.

mhorga · February 6, 2019, 6:34pm

@mjeragh Unfortunately, MSL does not like dynamically allocated memory (pointers) on the GPU, so you might need to think of another way to do it.
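
One possible "other way", offered only as a sketch: the two compiler errors appear to come from the local array living in the `thread` address space while the function promises a `device float2 *`. A pointer to a local would also be invalid once the function returns. Returning the three sums by value inside a small struct avoids pointers altogether. Everything below that is not in the original post is an assumption: the `FlockSums` struct, the `flockSums` name, the `velocity` field on `Boid`, and passing `limit` as a parameter. The `#else` branch is only a host-side shim so the same code can also be compiled as plain C++ for a quick check. Whether merging the loops actually improves performance would still need to be measured.

```cpp
#ifdef __METAL_VERSION__
#include <metal_stdlib>
using namespace metal;
#else
// Host-side shim (assumption): lets the sketch build as ordinary C++.
#include <cmath>
#define device
typedef unsigned int uint;
struct float2 {
    float x, y;
    float2() : x(0), y(0) {}
    explicit float2(float v) : x(v), y(v) {}
    float2(float x_, float y_) : x(x_), y(y_) {}
    float2& operator+=(float2 o) { x += o.x; y += o.y; return *this; }
    float2& operator-=(float2 o) { x -= o.x; y -= o.y; return *this; }
};
inline float2 operator-(float2 a, float2 b) { return float2(a.x - b.x, a.y - b.y); }
inline float distance(float2 a, float2 b) { return std::hypot(a.x - b.x, a.y - b.y); }
#endif

// Assumed layout; the post only shows that Boid has a position.
struct Boid {
    float2 position;
    float2 velocity;
};

// Hypothetical container for the three accumulated sums, returned by value.
struct FlockSums {
    float2 cohesion;    // sum of the other boids' positions
    float2 separation;  // accumulated push away from boids closer than limit
    float2 alignment;   // sum of the other boids' velocities
};

// One pass over all boids, skipping the boid at `index`.
FlockSums flockSums(uint index, device const Boid* boids,
                    uint particleCount, float limit) {
    float2 here = boids[index].position;
    FlockSums sums = { float2(0), float2(0), float2(0) };
    for (uint i = 0; i < particleCount; i++) {
        if (i == index) { continue; }
        float2 other = boids[i].position;
        sums.cohesion += other;
        if (distance(other, here) < limit) {
            sums.separation -= (other - here);
        }
        sums.alignment += boids[i].velocity;
    }
    return sums;
}
```

The caller would then read `sums.cohesion`, `sums.separation`, and `sums.alignment` and apply whatever averaging and weighting each of the three original functions did after its loop.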