Chapter 16 Dead/Alive Particle Buffer

Hello,

I’ve already implemented particles in my game, but I’m now at a point where I’m using them very inefficiently. My particles do not re-emit themselves as the book covers but are triggered based on time and/or movement. Currently, I’m doing this by reading out the particle buffer, looping over each particle, separating them into alive and dead, and then rendering the alive particles, all on the CPU. The only thing I do on the GPU is reposition each alive particle. As my game progresses, this has become the #1 bottleneck.

To improve performance, I want to move this functionality to the GPU. The book mentions using two buffers on page 489 but, unfortunately, does not take the concept any further.

More complex particle systems would maintain a live buffer and a dead buffer. As particles die, they move from live to dead, and as the system requires new particles, it recovers them from dead

Is there an example available that goes deeper into this? I’m at the point of having a live buffer, a dead buffer, and a buffer that stores the live/dead count. I already know how to read from and write to a buffer in the compute shader, but I’m not sure how to efficiently take the CPU out of the loop. I would still need to read back the live/dead count to use in the following code:

encoder.drawPrimitives(type: .point, vertexStart: 0, vertexCount: 1, instanceCount: amountOfParticlesAlive)

Am I missing something or overthinking this? Any help or additional documentation on this topic would be greatly appreciated. Thank you.

For the moment, I can only direct you to FlexMonkey’s excellent blog.

The code is a bit old, but I’ve updated his ParticleLab to Xcode 12.3 here: GitHub - carolight/ParticleLab: Particle system that's both calculated and rendered on the GPU using the Metal framework

Cool resource!
Thanks for sharing and updating it :wink:

Thank you for the resource. It has given me an idea of how to move the particles between different buffers effectively. In my case, I’m not using a target texture but another data buffer, but the same idea applies.

For others finding this thread: in my compute shader, I check the life of the current particle (indexed by thread_position_in_grid). If the particle is dead, I check the “particlesToAdd” buffer (which the CPU fills with new particles) using the “particlesToAdd” counter (so I know how many particles the CPU added). I then replace the dead particle in the original particle buffer with the new particle and continue to process that particle’s movement. This was a bit tricky because I had to use atomic counters (via atomic_fetch_add_explicit) to avoid race conditions, but it seems to work now. Finally, the original particle buffer is used in my vertex/fragment shader.
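
To make the steps above easier to follow, here is a minimal sketch of the same logic written as plain Swift running sequentially on the CPU. It is not the actual shader from my project: the `Particle` layout, the `updateParticles` function, and the "life at or below zero means dead" rule are assumptions made for illustration. In the real kernel each loop iteration would be one GPU thread, and `nextToAdd` would be an atomic counter incremented with atomic_fetch_add_explicit so that two dead slots can never claim the same new particle.

```swift
// Hypothetical particle layout; the real struct is not shown in this thread.
struct Particle {
    var position: SIMD2<Float>
    var velocity: SIMD2<Float>
    var life: Float   // assumption: life <= 0 means the particle is dead
}

// Sequential CPU model of one compute pass over the particle buffer.
// Returns how many entries of `particlesToAdd` were consumed.
func updateParticles(_ particles: inout [Particle],
                     particlesToAdd: [Particle],
                     deltaTime: Float) -> Int {
    // Stands in for the atomic counter shared by all GPU threads.
    var nextToAdd = 0

    // `id` plays the role of thread_position_in_grid.
    for id in particles.indices {
        if particles[id].life <= 0 {
            // Claim the next pending particle. On the GPU this read-then-increment
            // must be a single atomic fetch-add to avoid a race.
            let claimed = nextToAdd
            nextToAdd += 1

            // Nothing left to spawn: leave this slot dead and skip it.
            guard claimed < particlesToAdd.count else { continue }

            // Recycle the dead slot by overwriting it with the new particle.
            particles[id] = particlesToAdd[claimed]
        }

        // Alive (or freshly spawned) particles are moved as usual.
        particles[id].position += particles[id].velocity * deltaTime
        particles[id].life -= deltaTime
    }

    return min(nextToAdd, particlesToAdd.count)
}
```

Calling `updateParticles(&pool, particlesToAdd: pending, deltaTime: 0.5)` with a pool that contains one dead slot and one pending particle overwrites that slot in place, so the particle buffer never changes size and the vertex shader can keep reading the same buffer. Dead slots that were not refilled still sit in the buffer, so the render pass has to skip or discard them in some way.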

I’ve now successfully moved most of the work off the CPU (except triggering new particles), and it has already given me a noticeable increase in FPS! There is still room for improvement, but for now I’ve reached my goal.

As someone fairly unfamiliar with shaders, in particular GPU buffer manipulation, I would love some additional details about how you’ve accomplished this task. I’ve looked through the FlexMonkey example and couldn’t figure out how it applied. Do you have some code samples I could look at, to see how you’ve handled it?