Netflix Launches VOID AI for Advanced Video Object Removal

Netflix has introduced its first public AI model, a video object removal framework called VOID. Not only does the tool erase objects, but it also predicts how the remaining scene should behave physically. As a result, it delivers more realistic edits than traditional methods.

Moreover, the system, whose name stands for Video Object and Interaction Deletion, is available under the Apache 2.0 license, which permits commercial use. Therefore, developers and creators can freely access and build upon the technology.

How VOID Transforms Scenes

Unlike standard video inpainting tools, VOID goes beyond simple object removal by addressing physical interactions. For instance, if one vehicle disappears from a collision scene, the remaining car continues naturally without impact effects. Similarly, removing a person from a pool scene eliminates any splash, leaving the water undisturbed.

To achieve this, VOID relies on a multi-model pipeline. It uses CogVideoX as its base and enhances it with synthetic datasets such as Kubric and HUMOTO. In addition, Gemini 3 Pro analyzes scenes, while SAM2 handles object segmentation.

Furthermore, a vision-language reasoning system generates a “quadmask” to guide scene reconstruction. An optional optical flow step then refines shapes, ensuring visual consistency.
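To make the idea of a four-valued mask concrete, here is a minimal sketch in plain Python. It is an illustration only, not VOID's implementation: the article does not specify how the quadmask is encoded, so the four region labels, their numeric values, and the `build_quadmask` helper are all assumptions. The sketch combines a binary mask of the object to remove with a binary mask of the area the object physically affects (a splash, for example).

```python
from enum import IntEnum


class Region(IntEnum):
    # Hypothetical labels and values; the real quadmask encoding is not
    # described in the article.
    KEEP = 0       # background that should stay untouched
    AFFECTED = 1   # pixels influenced by the object (e.g. a splash)
    OVERLAP = 2    # pixels that are both object and affected area
    REMOVE = 3     # pixels of the object to delete


def build_quadmask(object_mask, affected_mask):
    """Combine two binary masks (lists of rows of 0/1) into a four-valued mask."""
    if len(object_mask) != len(affected_mask):
        raise ValueError("masks must have the same number of rows")
    quadmask = []
    for obj_row, aff_row in zip(object_mask, affected_mask):
        if len(obj_row) != len(aff_row):
            raise ValueError("mask rows must have the same length")
        row = []
        for obj, aff in zip(obj_row, aff_row):
            if obj and aff:
                row.append(Region.OVERLAP)
            elif obj:
                row.append(Region.REMOVE)
            elif aff:
                row.append(Region.AFFECTED)
            else:
                row.append(Region.KEEP)
        quadmask.append(row)
    return quadmask


# Tiny 2x4 example: the object covers the middle columns, and the
# affected area covers the right-hand columns.
object_mask = [[0, 1, 1, 0],
               [0, 1, 1, 0]]
affected_mask = [[0, 0, 1, 1],
                 [0, 0, 1, 1]]
quadmask = build_quadmask(object_mask, affected_mask)
```

In a real pipeline the two input masks would come from a segmentation model and a scene-analysis step rather than being written by hand.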

Development and Accessibility

The project was developed by a team of researchers, including Saman Motamed, William Harvey, Benjamin Klein, Zhuoning Yuan, Ta-Ying Cheng, and Luc Van Gool, in collaboration with INSAIT at Sofia University.

According to their research, VOID was preferred 64.8 percent of the time over competing solutions, while Runway achieved 18.4 percent. Consequently, the results highlight its strong performance in realistic video editing tasks.

Additionally, the code, research paper, and demo are publicly available on platforms like GitHub, arXiv, and Hugging Face. However, running the model requires a high-performance GPU with at least 40GB of VRAM.
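As a quick way to check a machine against the 40GB figure, the sketch below compares a GPU's total memory in bytes with that threshold. The helper name `meets_vram_requirement` is hypothetical, and reading "40GB" as decimal gigabytes (10^9 bytes) is an assumption, since the article does not say which unit is meant.

```python
# Assumption: "40GB" means 40 * 10**9 bytes (decimal gigabytes).
BYTES_PER_GB = 10**9


def meets_vram_requirement(total_memory_bytes, required_gb=40):
    """Return True if the GPU's total memory meets the stated minimum."""
    return total_memory_bytes >= required_gb * BYTES_PER_GB


# With PyTorch installed and a CUDA device present, the byte count could be
# read from torch.cuda.get_device_properties(0).total_memory and passed in.
has_enough_80gb = meets_vram_requirement(80 * BYTES_PER_GB)
has_enough_24gb = meets_vram_requirement(24 * BYTES_PER_GB)
```

The function takes a plain byte count so it can be used with whatever tool reports the GPU's memory.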

Overall, VOID represents a significant step forward in AI-driven video editing, as it combines object removal with realistic physical reconstruction.