The open-source edge in AI
- from Shaastra :: vol 04 issue 02 :: Mar 2025

The open-source movement — with its promise that one need not make big investments to get big results — is making inroads in AI, too.
howindialives.com
howindialives.com is a database and search engine for public data
When DeepSeek launched its artificial intelligence (AI) model in January 2025, many felt it had the potential to disrupt the AI landscape. The suddenness of the revelation from a little-known Chinese start-up was branded a 'Sputnik moment', similar to how the Soviet Union's Sputnik satellite launch in 1957 startled the United States and initiated the space race.
However, Yann LeCun, Chief AI Scientist at Meta and a 2018 winner of the ACM A.M. Turing Award, had a different view. He posted on LinkedIn: "To people who see the performance of DeepSeek and think, 'China is surpassing the US in AI', you are reading this wrong. The correct reading is: 'Open-source models are surpassing proprietary ones'."
Open-source AI typically refers to AI models that give others the freedom to use, study, modify and share them. There are debates around whether releasing the weights is enough, or whether access to the training data and source code is also needed to make a model truly open.
THE OPEN-SOURCE PROMISE
The open-source movement has been around since the 1990s, with its roots going back to the 1950s. Open-source products have seen significant adoption among both organisations and individuals. The primary reason is cost savings, but the freedom to build quickly on existing work also counts. According to the 2024 State of Open Source Report, more than a third of organisations have invested in open-source databases and data technologies in terms of projects, budgets or resources. For AI, the figure is 11.1%.


The number of open-source foundation models is increasing. According to the Center for Research on Foundation Models, 73% of all foundation models (AI models that are trained on vast datasets and have a range of use cases) were open source in 2024, compared to 61% in 2023.


RESEARCH AND REVENUES
While performance varies significantly, the top open-source models are closing the gap. They are matching and, in some cases, outperforming their proprietary rivals, often at a lower cost. That is one reason why DeepSeek spooked the U.S. tech ecosystem, raised questions about the demand for graphics processing units (GPUs), and pushed down the share price of NVIDIA, the top AI GPU maker. Open-source models also tend to score better on transparency. Since they allow for greater scrutiny, they are seen as more reliable and secure, and as making it easier to identify and mitigate bias.



While big money is going to businesses with proprietary models — because they have a clearer path to revenues, and eventually profits — AI research has been gaining momentum. Computational power has improved, growing exponentially over the last decade. There is also a growing focus on practical applications, as AI gets integrated into different fields such as finance, healthcare and medicine.


ADVANTAGE ENGLISH
While DeepSeek was developed in China, one reason it had a global impact was that the model was trained in both English and Chinese. English dominates training datasets, even in open source. Some 57% of open-source AI training datasets are in English. As a result, AI models in general tend to perform better in English. There are historical reasons for this, too. A considerable share of AI research, development and datasets originates from Western nations, where English is often the primary language.


Open source holds promise. It allows for modification of models to better suit local languages. For instance, Indian start-ups have modified open-source models to train them in Hindi. Since it lowers the cost of entry, it also allows local developers to innovate and create local solutions. The promise of open source is ultimately that one need not make billion-dollar investments to get billion-dollar results.
