Special Feature

When small is big

  • from Shaastra :: vol 04 issue 03 :: Apr 2025
Research into foundational large models is necessary, but at a more immediate level, smaller models are important, reckon researchers.

Alongside building large language models, researchers are using smaller models to address India's linguistic and cultural diversity.

The internet was designed for English speakers. It was a belief in the need to change this status quo that drove Akshat Prakash to set up CAMB.AI in 2022. The start-up, founded in Dubai, offers voice translations driven by artificial intelligence (AI) – from live-streaming sports tournaments in multiple languages to dubbing movies. At a time when the world is focused on large language models (LLMs) to move towards general machine intelligence, CAMB.AI decided to think smaller: it created small language models (SLMs), designed to focus on the particular tasks of voice-to-text and text-to-voice translations.

SLMs are miniature versions of LLMs; to function, they need far fewer parameters – the weights and biases in a neural network that determine how the model reaches its output. Models like Google's Gemma have up to 27 billion parameters, and GPT-4 is estimated to have over a trillion. For SLMs, the count is often limited to millions of parameters. A model with a high number of parameters could potentially solve more complex problems, and for long, scaling up has been the way to go in AI development. However, this often comes at a high cost in resources like graphics processing units (GPUs) to train the model; it also demands a great deal of memory.
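
To make "parameters" concrete, here is a minimal PyTorch sketch that counts the weights and biases in a toy feed-forward network; the layer widths are illustrative only and do not correspond to any real model.

```python
import torch.nn as nn

# A toy feed-forward network; the layer widths are illustrative only.
tiny_model = nn.Sequential(
    nn.Linear(512, 2048),  # weight matrix: 512 x 2048, plus 2048 biases
    nn.ReLU(),
    nn.Linear(2048, 512),  # weight matrix: 2048 x 512, plus 512 biases
)

# Every weight and bias counts towards the parameter total.
n_params = sum(p.numel() for p in tiny_model.parameters())
print(f"{n_params:,} parameters")  # 2,099,712 -- millions, not billions
```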

Even newer models like DeepSeek's V3 and R1, which claim to work with a tenth of the parameters their rivals use, still cost millions of dollars to train. Prakash believes this cost to be a key hurdle in resource-strapped countries. "If I'm building a large language model and I potentially have to serve a billion people, I'm going to go bankrupt in training it and doing the inference cost for it. But if I can build a small model that I can just ship to your phone, I can scale it like any other app," he says.

While horizontal research into foundational large models is necessary, at a more immediate level, smaller models are important, says Prakash. "SLMs are economical, faster to train, and you can get a specialised team to create it with low resources. So, their deployment also becomes easier," he adds. CAMB.AI's models have 80 million parameters and fit on a mobile phone as a standalone app, taking up about 300 MB of space.
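
That footprint squares with back-of-envelope arithmetic, sketched below; the numeric precisions are assumptions, since the article does not say what format CAMB.AI ships its weights in.

```python
# Back-of-envelope size of an 80-million-parameter model on disk.
# The precision options are assumptions, not CAMB.AI's actual format.
params = 80_000_000

for fmt, bytes_per_param in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
    print(f"{fmt}: {params * bytes_per_param / 1e6:.0f} MB")
# fp32: 320 MB, fp16: 160 MB, int8: 80 MB -- so the ~300 MB app size
# quoted above is plausible for full-precision weights plus overhead.
```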

For many start-ups building their own SLMs, the idea is to focus on just one task and do it well. Pushpak Bhattacharyya, Professor of Computer Science and Engineering at the Indian Institute of Technology (IIT) Bombay, has observed the mushrooming of research in SLMs in both academia and industry. "I am trying to popularise the term 'trinity models' to address them," he says. 'Trinity models', as he puts it, have three main aspects to them: domain, task and language-specificity.

"If I build a small model that I can just ship to your phone, I can scale it like any other app," says Akshat Prakash, Founder and CTO of CAMB.AI.

Take, for example, a question-answering system for farmers in Karnataka: the domain would be agriculture; the task would be answering questions with sentiment analysis (to understand the tone and context of the queries); and the language would be Kannada. Similarly, there could be a financial adviser model for banks giving out small-scale loans to rural entrepreneurs in Manipur. "In a country like India, if there were just 20 labs, each working on specific small language models, that could actually cover most of the use cases that we as a society care about," believes Prakash.

Domain-specific models are effective at handling the terminology and language typical of those domains. To Bhattacharyya, the most attractive aspect of such varied domain-specific models is the possibility of their talking to each other. "If they can interact with each other, they can give an impression of a large foundation model," he says. It could be like a team of experts exchanging information and routing the question to the appropriate model. The answer could be refined with this sort of communication between specialised models.

Instead of a medical LLM, for example, multiple SLMs can work together, each focusing on a different requirement, from symptom detection to billing. While the Indian government is pushing researchers to build an indigenous foundational model, Bhattacharyya believes there will be two parallel paths towards that goal. "One will be a more slow-paced building of foundation models. The second will be these trinity models mushrooming and interacting with each other," he says.
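
A minimal sketch of that routing idea follows, assuming a keyword-based classifier; the model names and the classifier are hypothetical stand-ins, not any real system described in the article.

```python
# Sketch of a "team of experts": route a query to a specialised SLM.
# The classifier and the model names are hypothetical stand-ins.

EXPERTS = {
    "agriculture": "kannada-agri-qa-slm",
    "finance": "manipur-loan-adviser-slm",
    "medical": "symptom-detection-slm",
}

def classify_domain(query: str) -> str:
    """Pick a domain; in practice the router could itself be a small model."""
    keywords = {"crop": "agriculture", "loan": "finance", "fever": "medical"}
    for word, domain in keywords.items():
        if word in query.lower():
            return domain
    return "agriculture"  # arbitrary fallback for the sketch

def route(query: str) -> str:
    model = EXPERTS[classify_domain(query)]
    # A real deployment would call the chosen model's inference API here.
    return f"[{model}] would answer: {query!r}"

print(route("When should I sow my crop this season?"))
```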

AI GOES RURAL

A big reason for SLMs' popularity in India is that they take into account the country's inter-State diversity of culture, language and challenges. On the one hand are projects driven by academia – like VISWAM.ai from the International Institute of Information Technology Hyderabad, which is building for the Telugu language; and the Ganga-1B model from IIT Gandhinagar, which is doing the same for Hindi. Such models crowdsource data, through voice recordings of people and by trawling local news articles and books, to harness the diversity of dialects within each language. In November 2024, when the IIT Madras Pravartak Technologies Foundation launched the Centre for Human-Centric Artificial Intelligence, one of its key research focuses was on developing small, domain-specific models in Indian languages.

The interest in SLMs for the Indian context extends beyond the country's academia to start-ups and multinational companies investing here. "India has substantial cultural diversity. So, the entire AI development pipeline — from data collection to modelling and evaluation — must be adapted to address this diversity," says Sunayana Sitaram, Principal Researcher at Microsoft Research India, whose work focuses on democratising multilingual AI. "It may be complex for a single large model to manage all these scenarios without customisation," she adds.

Smartphone penetration in rural areas opens doors for mobile-based AI applications, and companies have been quick to leverage this.

Over the past year, Microsoft has been building its family of small models, called Phi, and open-sourcing them. "Our work in small models began with the exploration of the smallest model needed to be able to understand language," says Sitaram. For Microsoft, it was about pushing back against the scaling trend and asking what was the best performance one could get from the smallest GPU capacity. "We do this by thinking about the quality of data and investing in synthetic data generation."

Companies like Microsoft use large models to synthetically generate data, which can then be used to train smaller models. This kind of teacher-to-student training of SLMs by LLMs is common across big tech companies today. Google has Gemma 2 with 2 billion parameters; Mistral has a version with 7 billion parameters; and Meta's Llama is also downscaling with versions under 3 billion parameters.
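
The teacher-to-student pattern can be sketched with a toy distillation loop. The two linear layers below stand in for an LLM (teacher) and an SLM (student); the setup and the temperature value are assumptions for illustration, not Microsoft's or anyone else's pipeline.

```python
import torch
import torch.nn.functional as F

# Toy teacher-to-student distillation with tiny stand-in models.
torch.manual_seed(0)
teacher = torch.nn.Linear(16, 8)  # frozen "large" model
student = torch.nn.Linear(16, 8)  # "small" model being trained
opt = torch.optim.Adam(student.parameters(), lr=1e-2)
T = 2.0  # softening temperature (an assumption)

x = torch.randn(32, 16)  # a batch of (synthetic) inputs
with torch.no_grad():
    teacher_probs = F.softmax(teacher(x) / T, dim=-1)

for _ in range(100):
    opt.zero_grad()
    student_logp = F.log_softmax(student(x) / T, dim=-1)
    # KL divergence pulls the student's outputs towards the teacher's.
    loss = F.kl_div(student_logp, teacher_probs, reduction="batchmean")
    loss.backward()
    opt.step()

print(f"final distillation loss: {loss.item():.4f}")
```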

Beyond language processing, small models are now being equipped for reasoning, which has so far been done best by large models with greater processing power. In March 2025, Fractal Analytics, which was founded in Mumbai and is now based in New York, proposed building India's first series of reasoning models, including a small model of 2-7 billion parameters. Microsoft, too, sees potential in this: in a paper published in January 2025 (bit.ly/SLM-math), Microsoft researchers outlined a technique to improve math-solving and reasoning capabilities in small models.

With small models choosing depth over width and consequently needing far fewer resources for the task at hand, AI can make headway in rural areas with poor internet infrastructure. In June 2024, Meta released a paper (bit.ly/meta-sub-billion) arguing for language models of under a billion parameters, driven by concerns over rising cloud costs and latency. Instead of relying on width and parameter count alone to determine a model's quality, the Meta team emphasised the significance of deep and thin architectures. The resultant series of models, called MobileLLMs, used 350 million parameters and still performed almost as well as Llama 7B on certain tasks. For Meta, this meant being able to deploy its models on mobile devices.
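
The "deep and thin" trade-off can be illustrated with rough parameter arithmetic. The per-block cost formula and the two configurations below are invented for illustration; they are not Meta's actual MobileLLM numbers.

```python
# "Deep and thin" vs "shallow and wide" at a similar parameter budget.
# The cost formula and configurations are illustrative, not Meta's.

def block_params(d_model: int) -> int:
    """Rough transformer-block cost: attention (~4*d^2) + MLP (~8*d^2)."""
    return 12 * d_model * d_model

for label, depth, width in [("shallow & wide", 12, 1024),
                            ("deep & thin", 48, 512)]:
    total = depth * block_params(width)
    print(f"{label}: {depth} layers x {width} wide -> {total / 1e6:.0f}M params")
# Both stacks land near ~151M parameters; Meta's finding was that the
# deeper, thinner one tends to perform better at small scale.
```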

"The increasing penetration of smartphones in rural areas opens up opportunities for mobile-based AI applications," says Sitaram. Her team is interested in applications that can deliver skill development content in local languages to revolutionise agricultural practices, and democratise access to financial services. Microsoft is collaborating with agri-tech companies like ITC to leverage the Phi small models in apps that provide agricultural services and answer farmers' queries in voice and text, even in remote areas with limited internet connectivity. "The road to mobile-first AI and Edge AI in rural India is promising," she says.
