New Meta AI can cut unwanted objects from your photos & videos
The next generation of AI is slipping into all sorts of uses, and it’s no surprise that one of the world’s biggest tech companies has added its own version of the technology to its platforms.
With its new “Segment Anything Model” (SAM) AI, the company that owns Facebook and Instagram has demonstrated how users could easily cut any object they like out of an image or video and make it vanish seamlessly.
Curiously for Meta, the new technology is open source, despite being quite impressive even by the standards of competing systems.
According to a recent Gizmodo report, the SAM AI interprets a range of user input prompts to understand which objects should be removed, then selects and removes them smoothly.
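SAM’s real promptable interface is far richer than this, but the basic idea of point-prompted selection can be sketched in a few lines of NumPy. Everything here is a stand-in: a precomputed label map plays the role of the model’s output, and a click-style (row, col) point picks out whichever segment it lands in.

```python
import numpy as np

def select_mask(label_map: np.ndarray, point: tuple[int, int]) -> np.ndarray:
    """Return a boolean mask for the segment containing the clicked point.

    label_map stands in for a segmentation model's output: each pixel
    holds an integer ID for the object it belongs to.
    """
    row, col = point
    return label_map == label_map[row, col]

# A tiny 4x4 "image" with two objects (IDs 1 and 2) on background 0.
labels = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
])

mask = select_mask(labels, (0, 2))   # "click" lands on object 1
print(mask.sum())                    # 4 pixels selected
```

A real system would run the model on the raw pixels rather than read a precomputed label map, but the prompt-to-mask contract is the same: a point goes in, a per-pixel selection comes out.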
AI clipping and cutting tools are of course nothing new, and lately they’ve proliferated even more.
For example, Adobe Photoshop features a content-aware fill tool of its own and Apple has also developed a “lift and drop” AI tool for cutting subjects from photos and placing them elsewhere, or simply erasing them.
Meta’s system, however, is somewhat different, and seemingly more advanced: it does a remarkably intelligent job of parsing an image and precisely selecting specific objects for removal.
In a recent live demo of SAM at work, users were able to ask the AI to show them all of the objects it recognized as individual, discrete things.
The technology showed itself remarkably capable of picking out numerous specific things, even in very complex photos involving things like cityscapes and street scenes.
SAM could even identify parts of an object that are out of focus and recognize them as part of something that should be individually selected.
Having selected multiple objects in just seconds, the AI cutting tool can then remove them as desired.
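Once objects are selected, removing them cleanly is an inpainting problem. As a toy-level sketch (not Meta’s method), the simplest possible “erase” just fills the masked pixels from the surrounding background; real tools use far more sophisticated content-aware fills.

```python
import numpy as np

def remove_object(image: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Erase the masked pixels by filling them with the mean of the
    unmasked pixels -- a crude stand-in for real inpainting."""
    out = image.astype(float).copy()
    out[mask] = out[~mask].mean()
    return out

image = np.array([
    [10, 10, 90, 90],
    [10, 10, 90, 90],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
], dtype=float)

mask = image == 90            # the bright "object" to remove
cleaned = remove_object(image, mask)
print(cleaned[0, 2])          # filled with the background mean: 10.0
```

The seam-free results in Meta’s demo come from learned inpainting, not a flat fill like this, but the pipeline shape is the same: mask in, plausibly filled pixels out.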
In the demo, users could even upload their own sample images for on-the-fly testing. Despite being made to work with these visuals that it had never seen before, the SAM AI still showed off its chops.
Some photos, such as one of the Tarantula Nebula taken by the James Webb Space Telescope, posed problems due to their diffuse complexity. Others, such as photos of people in the middle of sports events, caused SAM little trouble.
As Meta explains:
“SAM’s advanced capabilities are the result of its training on millions of images and masks collected through the use of a model-in-the-loop ‘data engine.’ Researchers used SAM and its data to interactively annotate images and update the model. This cycle was repeated many times over to improve both the model and the dataset.”
The company further elaborated on the impressively large dataset it has been training the AI on:
“After annotating enough masks with SAM’s help, we were able to leverage SAM’s sophisticated ambiguity-aware design to annotate new images fully automatically. To do this, we present SAM with a grid of points on an image and ask SAM to segment everything at each point. Our final dataset includes more than 1.1 billion segmentation masks collected on ~11 million licensed and privacy-preserving images.”
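The grid-of-points procedure Meta describes can be mimicked with the same kind of toy label-map stand-in used above: lay a regular grid of prompt points over the image, query a mask at each point, and keep one mask per distinct segment. This is only an illustration of the sampling idea, not Meta’s implementation.

```python
import numpy as np

def grid_points(h: int, w: int, n: int):
    """Yield an n-by-n grid of (row, col) prompt points over an h-by-w image."""
    rows = np.linspace(0, h - 1, n).round().astype(int)
    cols = np.linspace(0, w - 1, n).round().astype(int)
    for r in rows:
        for c in cols:
            yield (int(r), int(c))

def segment_everything(label_map: np.ndarray, n: int = 4) -> dict:
    """Collect one mask per distinct segment hit by the grid of prompts.

    label_map again stands in for the model: each pixel holds the
    integer ID of the object it belongs to.
    """
    masks = {}
    for r, c in grid_points(*label_map.shape, n):
        obj_id = int(label_map[r, c])
        masks.setdefault(obj_id, label_map == obj_id)
    return masks

labels = np.array([
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 0, 0],
    [2, 2, 0, 0],
])

masks = segment_everything(labels, n=4)
print(sorted(masks))   # [0, 1, 2] -- every segment was hit by some point
```

A dense enough grid touches every object at least once, which is how a promptable model can be turned into an automatic “segment everything” annotator.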
Meta further claims that SAM can handle multiple masks even for ambiguous subjects, and that its prompt recognition will be developed further.
This prompt recognition should extend to inputs such as gestures from users wearing VR headsets and text instructions. It could eventually even recognize verbal prompts.
It’s worth noting here that all of the abilities of SAM demonstrated so far are just those being shown off by Meta’s early demonstration model of the AI.
SAM should advance even further with more development work and particularly if tested and further trained live through the company’s consumer services and products.
As we mentioned above, SAM is, surprisingly for Meta, open source, at least so far.