Photo Stacking in iOS with Vision and Metal

In this tutorial, you’ll use Metal and the Vision framework to remove moving objects from pictures in iOS. You’ll learn how to stack, align and process multiple images so that any moving object disappears.

This is a companion discussion topic for the original entry at

Did anyone else seem to lose color in their final images? I’m only getting a red and green and it’s washed out. I’m pretty sure I followed everything correctly.

I figured it out: my Metal file had the average as avg.rbg instead of avg.rgb. :slight_smile:

I feel that the avgStacking function gives the first few pictures more weight in the combined result. Is there a way to balance them? :smiley_cat:

Hi @duckmole,

With this line in the kernel:

float4 avg = ((currentStack * stackCount) + newImage) / (stackCount + 1.0);

You multiply the stack count by the current average to get the sum of the pixels already seen. You then add the pixel for the new image to it and divide everything by the stack count + 1 (because the new image increases the stack count by one).

This should ensure that the average you see is a balanced average of all images, provided the correct stack count is passed in with each filter operation.

Back in Swift land, you ensure that the stack count is correct by setting it in this for loop:

for (i, image) in alignedFrameBuffer.enumerated() {
    filter.inputCurrentStack = finalImage
    filter.inputNewImage = image
    filter.inputStackCount = Double(i + 1)
    finalImage = filter.outputImage()!
}
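
If it helps, here is a minimal sketch in plain Swift that checks the formula on a single channel value. The helper name runningAverage and the sample values are made up for illustration and are not part of the tutorial code. It applies the same update as the kernel line and compares the result with an ordinary mean:

```swift
// Hypothetical helper that mirrors the kernel's formula for one channel value.
func runningAverage(current: Double, count: Double, new: Double) -> Double {
    return (current * count + new) / (count + 1.0)
}

// Made-up channel values standing in for the same pixel across four frames.
let samples: [Double] = [0.2, 0.8, 0.5, 0.1]

// Start with the first frame, then fold in the rest one at a time.
// `i + 1` is the number of frames already contained in `average`.
var average = samples[0]
for (i, sample) in samples.dropFirst().enumerated() {
    average = runningAverage(current: average, count: Double(i + 1), new: sample)
}

// The ordinary mean of all frames, for comparison.
let plainMean = samples.reduce(0, +) / Double(samples.count)
print(average, plainMean)
```

Both numbers should agree up to floating-point rounding, which is why no frame ends up with more weight than another as long as the count passed in matches the number of frames already averaged.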

Does that make sense?

Yes, you are right. My intuition misled me. Really appreciate your reply.✧ʕ̢̣̣̣̣̩̩̩̩·͡˔·ོɁ̡̣̣̣̣̩̩̩̩✧

When the photos are aligned, the areas that are transparent become dark when the average calculation completes. How can the calculation be changed so that no dark lines appear when a transparent pixel is averaged with a non-transparent one? Thanks for the tutorial.

Hi @hellojosh

I guess it depends on the color of the transparent pixels. Since I did this tutorial using output from the camera, I didn’t take transparent pixels into account. Just thinking off the top of my head, you could add the fourth (alpha) channel to the average calculation. Or maybe weight the color value by its alpha percentage?
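
To make the second idea a bit more concrete, here is a rough sketch in plain Swift rather than Metal. Everything in it is an assumption for illustration: the Pixel type, the alphaWeightedAverage function, and the choice to track the sum of alphas seen so far instead of a frame count. It also assumes the color components are not premultiplied by alpha. I haven't tried this in the actual kernel.

```swift
// Hypothetical pixel type; color components are assumed NOT premultiplied by alpha.
struct Pixel {
    var r: Double
    var g: Double
    var b: Double
    var a: Double
}

// Running average in which each pixel contributes in proportion to its alpha.
// `weight` is the sum of the alphas already folded into `current`.
func alphaWeightedAverage(current: Pixel, weight: Double, new: Pixel) -> (pixel: Pixel, weight: Double) {
    let total = weight + new.a
    // Nothing visible has been seen yet, so there is nothing to average.
    guard total > 0 else { return (current, 0) }

    func mix(_ c: Double, _ n: Double) -> Double {
        return (c * weight + n * new.a) / total
    }

    let result = Pixel(r: mix(current.r, new.r),
                       g: mix(current.g, new.g),
                       b: mix(current.b, new.b),
                       a: max(current.a, new.a))
    return (result, total)
}

// An opaque red pixel stacked with a fully transparent (black) pixel.
let opaqueRed = Pixel(r: 1, g: 0, b: 0, a: 1)
let transparent = Pixel(r: 0, g: 0, b: 0, a: 0)
let stacked = alphaWeightedAverage(current: opaqueRed, weight: opaqueRed.a, new: transparent)
print(stacked.pixel.r, stacked.weight)
```

With a plain average, the red pixel would be pulled halfway toward black. Here the transparent pixel has zero weight, so the red value is left unchanged. A kernel version would presumably need the accumulated weight per pixel rather than a single stack count, which is a bigger change than the tutorial's filter supports as written.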

This tutorial is more than six months old, so questions about it are no longer supported at the moment. Thank you!