Special Feature

Marking benchmarks

  • from Shaastra :: vol 05 issue 06 :: Jun 2026
Enterprises may want to identify benchmarks most relevant to them.

Researchers rethink AI evaluation to test for real-world deployment.

Postdoctoral researcher Mohammed Safi Ur Rahman Khan is contemplating ways to evaluate artificial intelligence (AI) models within India's cultural context. India, for instance, has seen a surge in UPI- and OTP-based scams. Would popular AI models, built in Western societies, still be able to recognise fraud patterns common in India and flag suspicious requests involving Aadhaar details?

At AI4Bharat, a research lab at the Indian Institute of Technology Madras, Safi is part of a team developing IndicLLMSuite, an evaluation suite of benchmarks that tests for safety and capabilities within the Indian context. "We test for the models' capabilities on intents that Indian users care about, from information on applying for licences to government schemes and so on," he says. The suite of benchmarks will be out soon, he adds.

© 2026 IIT MADRAS - All rights reserved
