Believe it or not
From Shaastra :: vol 05 issue 02 :: Feb 2026
Detectors are upping the ante to spot realistic-looking deepfake videos.
It was a Sunday morning like any other, and Rohit Kundu, an artificial intelligence (AI) and computer vision researcher, was catching up on the lives of others — those once close, now usernames on his social media. That's when he stumbled upon a picture of his childhood friend with football legend Lionel Messi, who was visiting Kundu's hometown of Kolkata. For one whole day, the researcher wondered how his friend had come to meet the Argentinian player — until he realised he was looking at an AI-generated photo. The irony was not lost on him: modern generative AI (Gen AI) models were so sophisticated that they could even fool people like him, those studying AI for a living.
Just last year, Kundu, a PhD candidate at the University of California, Riverside (UCR), developed a model to improve deepfake detection. His team at UCR had started working on deepfakes during the 2024 U.S. elections, when AI was used widely to spread political misinformation. While deepfake detection platforms have been around for almost half a decade, they have mainly dealt with images and videos that morph authentic media with false identities, using techniques such as face-swapping and voice cloning. In the past two years, however, the exponential improvement in generative AI has led to synthetic content that no longer needs an authentic source. In 2025, Google's Nano Banana Pro and OpenAI's Sora 2 took image and video generation to new levels of realism. And deepfake detectors had to up their game.
Earlier, deepfake videos typically had front-facing people, possibly speaking to the camera. Detectors would thus focus on faces to assess whether they were generated by a machine. Kundu tested this theory out on state-of-the-art detectors. "When we had the models do a full-frame analysis of AI-generated videos, so that it [a model] could take hints from background elements, it would still look only at the foreground, at the faces in the video," says Kundu. To counter that, his team developed an 'attention-diversity loss' technique. "Basically, we force the attention of the detecting models to be spatially diverse and look at different regions within the video frame," he explains.
The model would simultaneously look at the face and hands, as well as the background. Called the Universal Network for Identifying Tampered and synthEtic videos (UNITE) model, it can, unlike traditional detectors, capture full-frame manipulations in scenarios without faces and with non-human subjects.
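The article doesn't give UNITE's exact formulation, but the idea of an attention-diversity penalty can be sketched in a few lines of PyTorch. In this illustrative version, the heads of an attention layer are discouraged from overlapping, so that collectively they cover faces, hands and background; the function name and the pairwise-overlap form are assumptions, not the paper's loss.

```python
import torch

def attention_diversity_loss(attn: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Toy attention-diversity penalty (illustrative, not UNITE's exact loss).

    attn: (batch, heads, patches) attention weights over spatial patches.
    Penalises overlap between heads so that attention spreads across
    the frame rather than collapsing onto one region, such as a face.
    """
    # Normalise each head's attention into a probability distribution.
    p = attn / (attn.sum(dim=-1, keepdim=True) + eps)
    # Pairwise dot products between heads: large when two heads
    # attend to the same patches.
    overlap = torch.einsum('bhp,bgp->bhg', p, p)
    b, h, _ = overlap.shape
    # Ignore each head's overlap with itself (the diagonal).
    off_diag = overlap * (1.0 - torch.eye(h, device=attn.device))
    return off_diag.sum() / (b * h * (h - 1))
```

In training, a term like this would be added, with a small weighting coefficient, to the usual real-versus-fake classification loss.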
Kundu has now moved on to developing techniques beyond detecting AI videos to identifying the generative model used, the type of prompt and the developers. His framework combines computer vision models with context provided by large language models. "For actual forensic use, we want to move towards more interpretable solutions. If a video has been flagged as fake, then we would like to know the reason why," he says.
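What such a two-stage forensic pipeline might look like, with a vision detector producing evidence and a language model turning it into a readable explanation, is sketched below. Every name here (Verdict, explain_verdict, the prompt wording) is a hypothetical stand-in, not Kundu's actual framework.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    """Output of a (hypothetical) computer-vision detector."""
    is_fake: bool
    confidence: float
    artifacts: list[str]  # regions or cues that drove the decision

def explain_verdict(verdict: Verdict, llm) -> str:
    """Turn detector evidence into a human-readable forensic note
    by prompting a language model (passed in as a callable)."""
    prompt = (
        f"A video was flagged as {'FAKE' if verdict.is_fake else 'REAL'} "
        f"with confidence {verdict.confidence:.2f}. "
        f"Detected artefacts: {', '.join(verdict.artifacts)}. "
        "Write a short explanation suitable for a forensic report."
    )
    return llm(prompt)

# Example with a stubbed detector output and a placeholder LLM.
verdict = Verdict(True, 0.97, ["inconsistent shadows", "warped hands"])
print(explain_verdict(verdict, llm=lambda p: f"[LLM would answer: {p!r}]"))
```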
WATERMARKING WAYS
Fakes span a spectrum. AI-made doorbell-camera footage of animals is especially popular on social media. And while these AI-generated animal videos, of rabbits bouncing on a trampoline, for instance, are mostly harmless, deepfakes that lead to financial scams or defame people are malicious or even criminal. Social media platform X's chatbot Grok recently made sexual deepfakes out of publicly available pictures of women, triggering a global outcry. In Pune, a deepfake detection start-up called pi-labs worked with the Maharashtra police to detect videos planted as false evidence and alibis to mislead investigations.
Gen AI companies recognise their responsibility to flag their content as machine-generated. However, even if information about its creation is available in the media's metadata, this can be stripped away as it is edited and re-shared. This is why videos made with closed-source models include invisible digital watermarks. Google's AI content has SynthID — a watermark embedded into the pixels of photos and videos. It slightly alters the pixel values in patterns imperceptible to the human eye but identifiable to detectors. Meta, on the other hand, has made its digital watermark open source; any generative AI model that wishes to label its content can use the Meta Seal.
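SynthID's embedding scheme is proprietary, so as a stand-in, here is a deliberately simplified toy version of pixel-level watermarking: add a keyed pseudo-random pattern at an amplitude too small to see, and later detect it by correlating the image against the same keyed pattern.

```python
import numpy as np

def keyed_pattern(shape, key: int) -> np.ndarray:
    """Pseudo-random +/-1 pattern derived from a secret key."""
    rng = np.random.default_rng(key)
    return rng.choice([-1.0, 1.0], size=shape)

def embed(image: np.ndarray, key: int, strength: float = 1.5) -> np.ndarray:
    """Add an imperceptibly faint keyed pattern to the pixel values."""
    marked = image.astype(float) + strength * keyed_pattern(image.shape, key)
    return np.clip(marked, 0, 255).astype(np.uint8)

def detect(image: np.ndarray, key: int, threshold: float = 0.75) -> bool:
    """Correlate the image against the keyed pattern; a clearly positive
    mean correlation indicates the watermark is present."""
    pattern = keyed_pattern(image.shape, key)
    score = float(np.mean((image.astype(float) - image.mean()) * pattern))
    return score > threshold

img = np.random.randint(0, 256, (256, 256), dtype=np.uint8)  # stand-in image
print(detect(embed(img, key=42), key=42))  # True: watermark found
print(detect(img, key=42))                 # False (almost surely): no mark
```

Production watermarks like SynthID are built to survive crops, compression and edits, which this toy pattern would not; the point is only the embed-then-correlate structure.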
These digital watermarks are still vulnerable to attacks precisely because they are detectable: a mark that can be found can be targeted for removal. It is a problem that researchers are trying to solve. Sam Gunn at UC Berkeley and Kareem Shehata at the National University of Singapore are separately working on undetectable watermarks. Diffusion models generate images and videos from random noise, and such undetectable watermarks are baked into that initial latent noise, before any pixels are generated.
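The cryptographic constructions behind these schemes go far beyond a snippet, but the core intuition can be sketched: derive the diffusion model's initial latent noise from a secret key, so that only the key-holder can run the statistical test that reveals the mark. The sketch below is a simplified illustration; in a real system the detector would first have to invert the generated image back to its latent (for example via DDIM inversion), a step skipped here.

```python
import numpy as np

def keyed_latent(shape, key: int) -> np.ndarray:
    """Initial Gaussian latent derived from a secret key. A diffusion
    model would denoise this latent into an image; the randomness
    itself carries the watermark, before any pixel exists."""
    return np.random.default_rng(key).standard_normal(shape)

def likely_keyed(latent: np.ndarray, key: int, threshold: float = 0.2) -> bool:
    """Key-holder's test: correlate a recovered latent against the
    keyed noise. Without the key, the latent looks like ordinary
    Gaussian noise, which is what makes the mark hard to detect."""
    reference = keyed_latent(latent.shape, key)
    corr = float(np.corrcoef(latent.ravel(), reference.ravel())[0, 1])
    return corr > threshold

z = keyed_latent((4, 64, 64), key=7)
print(likely_keyed(z, key=7))  # True: matches the keyed noise
print(likely_keyed(z, key=8))  # False: wrong key, no correlation
```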
Still, as of now, users can remove most watermarks by simply recreating the video using an open-source model that does not add them. This is the dichotomy of AI development: open sourcing helps democratise models, but it also enables bypassing safeguards.
START-UPS CATCH UP
In November 2025, Abhijeet Zilpelwar, Chief Technology Officer of pi-labs and creator of pi-authentify, the start-up's detection platform, received a video of a tiger attack from his family living near the Brahmapuri Forest Range in Chandrapur, Maharashtra. The district has witnessed multiple human-tiger conflicts, and so the video, showing a tiger dragging a man away from a forest guesthouse, set alarm bells ringing. This particular video, however, was fake. Zilpelwar used pi-authentify to confirm that it was AI-generated, and spokespersons from the forest department, too, explained why the tiger's movement looked unnatural.
"Modern Gen AI models are embedding Indian context so accurately, and that is what makes them scary," says Zilpelwar. Gen AI models are trained on public data, and a lot of Indian data is public. "There's no 'great wall' like in China. Everything we share on social media or the internet is available for training. So there's nothing like a lack of Indian context for these models," he says, explaining why Gen AI videos are doing well in the country.
To keep up, Indian deepfake detectors have to constantly research modern generation techniques. The checks involve continuous red teaming, in which the detector's vulnerabilities are tested through internal attacks. At pi-labs, a data team is responsible for training the detector to distinguish between AI-manipulated and AI-generated videos. The team scrapes AI content from the web, generates content in-house, and uses open-source deepfake generators to create training data.
TELLTALE SIGNS
It's all about visual realism.
True-to-life deepfake posts have prompted researchers to question the way artificial intelligence (AI) works. Does AI learn world models that discover laws of physics, or is it just a sophisticated pixel predictor that achieves visual realism without understanding the physical principles of reality? Evaluations on Google DeepMind's Physics-IQ benchmark found that current video-generation models' understanding of physical principles, across fluid dynamics, optics, solid mechanics, and thermodynamics, is poor, despite their visual realism.
This lack of grounding in physical laws can be spotted through different tells: shadows that suddenly vanish, ripples of water flowing in the wrong direction, or just a blatant disregard for gravitational and frictional forces. Content creators such as Jeremy Carrasco, a former technical producer, have garnered a huge following on social media by pointing out such inconsistencies in AI videos.
Piyush Verma, Chief Executive Officer of deepfake detection company Neural Defend, says that with foundational models improving at an unprecedented pace, the company has had to invest heavily in research and update its models in regular cycles. "When a new image-generation method comes out, like Google's Nano Banana, we analyse it, check whether our system already detects it, and if not, we add an additional detection layer," he says. While banking and digital forensics are the most common use cases for Indian deepfake detectors, the eerie accuracy of modern Gen AI videos has Neural Defend working with media houses as well. "News channels need to verify content before broadcasting it," he says. The start-up is currently working with a major media group, which has taken its product live in 190 countries, he says.
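The article doesn't describe Neural Defend's architecture beyond "adding a detection layer", but one plausible reading is a registry of generator-specific detectors whose scores are combined, so that a new layer can be registered without disturbing the rest. The sketch below, with hypothetical names throughout, shows that pattern.

```python
from typing import Callable, Dict

# Each "layer" is a detector tuned to one family of generators,
# returning a probability that a video came from that family.
DetectorFn = Callable[[str], float]
REGISTRY: Dict[str, DetectorFn] = {}

def register(name: str):
    """Decorator so a new detection layer can be added without
    touching the rest of the pipeline."""
    def wrap(fn: DetectorFn) -> DetectorFn:
        REGISTRY[name] = fn
        return fn
    return wrap

@register("face_swap_v1")
def face_swap_detector(video_path: str) -> float:
    return 0.05  # placeholder score; a real model would run here

@register("nano_banana_style")  # layer added after a new generator appears
def nano_banana_detector(video_path: str) -> float:
    return 0.91  # placeholder score

def is_ai_generated(video_path: str, threshold: float = 0.5) -> bool:
    # Flag the video if any specialised layer is confident.
    return max(fn(video_path) for fn in REGISTRY.values()) > threshold

print(is_ai_generated("clip.mp4"))  # True, driven by the newest layer
```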
But for how long will flagging AI videos be a reactive measure? This is the question on Kundu's mind. "I would like to build a system where harmful video creation is interrupted during its generation," he says. "Obviously, harmful prompts are already blocked, but cleverly written prompts can still find their way around it [detection]. That is where the research challenge lies in AI safety."