What is meant by viral genome and mutations?
The genome of the SARS-CoV-2 virus is made of ribonucleic acid (RNA): a single strand of about 30,000 units (called bases) that carries all its genes. As the virus multiplies inside human cells, it must copy its genome to make more viral particles, a process called replication. The enzyme that copies the RNA is error-prone, so copying errors are sometimes introduced. These errors are mutations, or changes to the genomic sequence. Some of them alter viral genes in ways that change the viral proteins those genes encode.
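To make the link between a base change and a protein change concrete, here is a minimal Python sketch. It builds the standard genetic code table and checks whether a single-base substitution changes the encoded amino acids. The nine-base sequence is a toy example invented for illustration, not a real SARS-CoV-2 gene, and the function names `translate`, `mutate` and `classify` are hypothetical helpers.

```python
from itertools import product

# Standard genetic code, with codons enumerated in U, C, A, G order.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna):
    """Translate an RNA string into one-letter amino acids, codon by codon."""
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, usable, 3))


def mutate(rna, position, new_base):
    """Return a copy of rna with the base at a 0-based position replaced."""
    return rna[:position] + new_base + rna[position + 1:]


def classify(rna, position, new_base):
    """Label a substitution by whether it changes the translated protein."""
    before = translate(rna)
    after = translate(mutate(rna, position, new_base))
    return "synonymous" if before == after else "non-synonymous"


# Toy sequence (invented): AUG AAU GGU encodes Met-Asn-Gly.
toy = "AUGAAUGGU"
print(translate(toy))
print(classify(toy, 3, "U"))  # AAU -> UAU swaps Asn for Tyr
print(classify(toy, 8, "C"))  # GGU -> GGC still encodes Gly
```

The second substitution leaves the protein unchanged, which is one reason many mutations have no consequence for the virus.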
While most of these changes are either of no consequence or harmful to the virus itself and therefore go unnoticed, some have biological effects on infectivity, transmission or other features of the virus that are governed by viral proteins. Thus, although mutations arise in the normal course of viral replication, those that benefit the virus can be selected for. A virus can replicate only inside its human host; therefore, wider disease spread and rising infections give the viral genome more opportunities to accumulate mutations, and combinations of mutations give rise to what are known as ‘variants’. Over the past year and a half of the pandemic, SARS-CoV-2 has been accumulating mutations at a rate of ~2 changes per month.
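As a rough back-of-the-envelope check on what this rate implies, the short Python sketch below multiplies it out. It assumes a constant rate of 2 changes per month, 18 months elapsed and a genome of 30,000 bases, all rounded figures taken from the text above; the variable names are illustrative.

```python
# Rounded figures from the article; the constant rate is a simplifying assumption.
GENOME_LENGTH = 30_000   # bases in the SARS-CoV-2 genome (approximate)
CHANGES_PER_MONTH = 2    # approximate accumulation rate
MONTHS_ELAPSED = 18      # "a year and a half"

expected_changes = CHANGES_PER_MONTH * MONTHS_ELAPSED
fraction_changed = expected_changes / GENOME_LENGTH

print(f"Expected changes along one lineage: ~{expected_changes}")
print(f"Share of the genome affected: {fraction_changed:.2%}")
```

Under these assumptions a lineage would differ from the original genome at only a few dozen positions, a small fraction of one percent of the sequence.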
What is genomic surveillance, and why is it important?
Continuous monitoring of viral genome variations helps trace the spread of the disease, and this information can guide containment measures and strategies. Mutations can also influence the course of the pandemic by altering the properties of the virus and generating new variants. A ‘Variant of Interest’ (VOI) has one or more mutations that are believed to have biological consequences, such as increased transmission, or that have been observed in other lineages with detrimental effects. If further assessment predicts or documents that a particular VOI reduces the effectiveness of current measures, for example through evidence of increased transmission, virulence or immune escape, it becomes a ‘Variant of Concern’ (VOC) and is closely monitored. With vaccinations now being undertaken in many countries, mutations conferring immune escape are of particular concern. Immune escape is the phenomenon by which the virus acquires the ability to bypass host immunity that has been either acquired through previous SARS-CoV-2 infection or conferred by vaccination.
Genomic surveillance allows researchers to examine the genome sequences of the viral strains infecting the population. By sequencing viral genomes and associating new changes with significant shifts in viral circulation trends, we can track the effects of emerging mutations. Carrying out genomic surveillance in a populous country like India requires rapid sequencing of viral genomes over time and across multiple states to identify potential concerns and mitigate the spread of the virus. For this purpose, the Indian SARS-CoV-2 Genomics Consortium (INSACOG) was formed towards the end of 2020 to sequence a fraction of positive cases and follow the emergence of variants. The viral genome sequences are deposited in a global public database, the Global Initiative on Sharing All Influenza Data (GISAID), to enable data-sharing among researchers for analysis.
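The core idea of tracking mutations over time can be sketched in a few lines of Python. The example below compares each sampled genome with a reference, records the substitutions, and reports what fraction of samples carries each mutation in each month. It is a simplified illustration, not the pipeline used by INSACOG or GISAID: it assumes the sequences are already aligned and equal in length, and the reference, samples, dates and function names are all invented for the example.

```python
from collections import Counter, defaultdict


def call_substitutions(reference, sample):
    """List substitutions such as 'A4U' (reference base, 1-based position, new base).

    Assumes both sequences are already aligned and have the same length.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref_base}{pos}{new_base}"
        for pos, (ref_base, new_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != new_base and new_base != "N"  # 'N' is an unread base
    ]


def mutation_share_by_month(reference, samples):
    """Map each month to {mutation: fraction of that month's samples carrying it}."""
    totals = Counter()
    counts = defaultdict(Counter)
    for month, sequence in samples:
        totals[month] += 1
        counts[month].update(call_substitutions(reference, sequence))
    return {
        month: {mut: n / totals[month] for mut, n in counts[month].items()}
        for month in totals
    }


# Toy data, invented for illustration only.
REFERENCE = "AUGAAUGGUACU"
SAMPLES = [
    ("2021-01", "AUGAAUGGUACU"),
    ("2021-01", "AUGUAUGGUACU"),
    ("2021-02", "AUGUAUGGUACU"),
    ("2021-02", "AUGUAUGGCACU"),
]

shares = mutation_share_by_month(REFERENCE, SAMPLES)
for month in sorted(shares):
    print(month, shares[month])
```

A mutation whose share climbs from month to month, as the first one does in this toy data, is the kind of signal that would prompt closer study. Real surveillance additionally has to handle alignment, insertions and deletions, sequencing errors and uneven sampling.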
India’s genome sequencing capabilities
India has constituted a lab network of academic institutions under INSACOG to expand genomic surveillance. CSIR-CCMB (Centre for Cellular and Molecular Biology) is a partner institute in this initiative and has been spearheading COVID-19 testing and the sequencing of viral samples from infected cases, especially across Telangana and Andhra Pradesh. We have sequenced ~4,000 viral genomes and analyzed viral variants circulating across the country to follow emergent mutation trends. The sequence and mutation analyses are hosted on an interactive dashboard, GEAR-19 (Genome Evolution Analysis Resource for COVID-19), with an easy graphical user interface. The data is publicly available to researchers across the country and the globe.
GISAID currently hosts 18,73,981 (about 1.87 million) SARS-CoV-2 genome sequences from around the world. Approximately 20,000 sequences of viral genomes isolated from Indian patients are currently available on GISAID, with INSACOG having submitted ~13,000 of them in the five months since its inception. This has been possible because of the large network built over the last year of the pandemic with major hospitals, COVID testing centres and state governments for access to patient samples and epidemiological data. However, given the country’s population and COVID-19 case burden, the number of sequences available for the country remains disproportionately low, and insights from existing efforts need to be sharpened to keep pace with the rapidly evolving situation. We are now incorporating further measures to scale up genomic surveillance programmes across the country by ramping up sequencing efforts, bioinformatics and data sharing, especially from rural settings and areas that were more severely affected in the second wave of COVID-19.
One thing that has emerged during this pandemic is the use of sophisticated techniques to address crucial questions related to the epidemiology and dynamics of this disease. Diagnostics that cost about Rs 4,000 per test are now available at under Rs 200 per test. We can track the level of infection in the population from sewage samples. These methods are so cheap and effective that future surveillance is going to be far more powerful. Genomics is going to be the fundamental tool not only for handling infectious diseases but also for personalized and precision medicine. India should develop and indigenize its genome sequencing capabilities over the next 2-3 years, given that we may have to live with epidemics and pandemics on a more regular basis. Such capabilities and surveillance will be crucial in pre-empting catastrophes and reducing the damage.