Wading into the gene pool
- from Shaastra :: vol 04 issue 01 :: Feb 2025

An ongoing study of India's genetic landscape will help researchers understand disease better.
What makes Indians uniquely Indian? A first-phase analysis of more than half of the 10,000 genomes sequenced under the ambitious GenomeIndia project, covering 69 population groups, has thrown up 7 million genetic variants not found in any global dataset.
The GenomeIndia project is being undertaken by a consortium of 20 scientific institutions led by the Centre's Department of Biotechnology (DBT). In January 2025, it made the sequenced data available to sections of the public. It also invited proposals for translational research that addresses critical personalised and community-level health issues by developing cost-effective screening and diagnostic tools, developing drugs, and identifying population-specific genetic risk factors.
GenomeIndia is creating a 'reference Indian genome' that can capture the novelty of the Indian population.
UNIQUELY COMMON
The initial analysis of around 5,750 samples highlights what is distinctive about India's genetic landscape. It found over 135 million genetic variations, most of which are "rare or ultra-rare", yet at least 11% of these variations are common across different Indian populations.
Given that there are around 4,600 population groups in the country, many of them endogamous, more novel variants are waiting to be discovered. For the project, a group of experts whittled the list to 99 groups, representing four major linguistic groups: Indo-European, Dravidian, Austro-Asiatic and Tibeto-Burman. "The groups would maximally represent the Indian population for genetic studies," says DBT Adviser Suchita Ninawe, who is a member of GenomeIndia's Technical Monitoring and Assessment Committee. The first-phase analysis looks at 69 population groups.
BIG DATA
The sequenced data, running to several terabytes, is kept in a repository at the Indian Biological Data Centre, Faridabad. Nearly 20,000 blood samples are stored at the Centre for Brain Research, Bengaluru. "Storing and uploading such big data itself are major accomplishments," says Ninawe. To protect the privacy of individuals, GenomeIndia developed a "double-blind" coding system, with first-level coding at the time of collecting the sample and another at the sequencing stage. Ninawe elaborates, "We were clear that we are creating a national resource, and while it will be in a national repository and managed by the government, it will be available to the public." Privacy and consent regulations were prepared accordingly.
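The details of GenomeIndia's coding system are not described here. Purely as an illustration of the idea of two-level coding, the minimal Python sketch below assumes that each stage swaps the identifier it receives for a random code and keeps its own lookup table, so that neither table alone links a genome to a donor. The names `CodingStage` and `double_blind_code`, the `COL`/`SEQ` prefixes and the donor ID are hypothetical.

```python
import secrets


class CodingStage:
    """One level of coding: swaps an incoming ID for a random code (hypothetical)."""

    def __init__(self, prefix: str) -> None:
        self.prefix = prefix
        self._key = {}  # incoming ID -> code issued at this stage

    def encode(self, incoming_id: str) -> str:
        # Reuse the code if this ID was seen before; otherwise mint a new random one.
        if incoming_id not in self._key:
            self._key[incoming_id] = f"{self.prefix}-{secrets.token_hex(8)}"
        return self._key[incoming_id]


def double_blind_code(donor_id: str, collection: CodingStage, sequencing: CodingStage) -> str:
    """Apply first-level coding at collection, then second-level coding at sequencing."""
    field_code = collection.encode(donor_id)  # assigned when the sample is collected
    return sequencing.encode(field_code)      # assigned when the sample is sequenced


# Each stage holds only its own table; the sequencing stage never sees the donor ID.
collection = CodingStage("COL")
sequencing = CodingStage("SEQ")
genome_label = double_blind_code("donor-0001", collection, sequencing)
```

In this sketch, tracing a genome label back to a donor would require both lookup tables, which is the property the two levels of coding are meant to provide.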
The data can currently be accessed only by researchers in Indian institutes. Protocols for sharing the data with international researchers and private companies will be developed in due course.
GENETIC LANDSCAPE
The need for Indian genetic datasets has long been felt. Researchers note that Indian populations exhibit distinct patterns of disease development, such as in the prevalence of type 2 diabetes or the onset of cancers. Yet prevailing genome-based diagnostic tools and treatments rely on Western datasets (see 'To catch a killer').
In 2019, the Institute of Genomics and Integrative Biology (IGIB), New Delhi, sequenced 1,000 human genomes over six months. "The exercise was to demonstrate the technical skills and ability to sequence big data. Analysis of those genomes showed that 30% of the variants were unique to the subcontinent," recalls Sridhar Sivasubbu, a former IGIB scientist involved in sequencing genomes for both projects. Larger datasets will give a better resolution of the genetic landscape, he explains.
Such a dataset will help scientists better understand disease in India. "The carrier frequency of many diseases is related to gene variants. As we move into the era of precision medicine, we will need to understand the genetic landscape of the population," says Rakesh Mishra, Director, Tata Institute for Genetics and Society, Bengaluru. "The basis for starting any nature of genomic treatment is to first have a database."
The 10,000-genome study focused on healthy representatives. In the next phase, GenomeIndia aims to study diseases in Indian populations. These include rare disorders, cancers, neurological problems and lifestyle-associated diseases such as obesity and diabetes, all of which have genetic links. However, the protocols for Phase II are yet to be developed.
The data is the baseline for developing biomedical applications. However, bio-manufacturing may have to wait a bit. First, the data would need to be studied.