Genetic Maps Reveal Biological Blind Spots Due to European Ancestry Overrepresentation

Edited by: Katia Cherviakova

A study published in Nature Communications in 2025 has identified biological blind spots in the foundational human genetic maps used across modern biomedicine. The limitation stems from the historical and ongoing overrepresentation of people of European descent in established genetic catalogs, which leaves the genomes of much of the world's population poorly represented.

Researchers from the Barcelona Supercomputing Center (BSC) and the Centre for Genomic Regulation (CRG) showed that established catalogs omit thousands of RNA transcripts specific to populations from Africa, Asia, and the Americas. Pau Clavell-Revelles, the study's first author, likened current gene maps to corrective lenses that filter out part of the picture, skewing how genetic variation is perceived. Roderic Guigó, a principal co-author at CRG, added that the overwhelming majority of genetic sequencing data has historically come from European cohorts, so reference catalogs frequently omit genes or transcripts exclusive to non-European ancestries.

This gap has tangible clinical consequences and sheds light on known discrepancies, such as the documented variation in how African children respond to specific asthma bronchodilators and the differing adverse reactions of Asian patients to common anticoagulant medications. To uncover this previously hidden biology, the team analyzed blood cells from 43 individuals across eight global populations using long-read RNA sequencing. The investigation identified approximately 41,000 potential RNA transcripts absent from established catalogs, with 2,267 found exclusively in a single population group of African, Asian, or American heritage.

Furthermore, 773 of these newly detected transcripts originated from regions previously categorized as non-coding, suggesting the existence of hundreds of uncharacterized genes; the study pinpoints up to 476 potential novel genes. Dr. Marta Melé, a principal co-author at the BSC, highlighted the clinical relevance, observing that a significant proportion of these ancestry-biased transcripts lie within genes already implicated in complex conditions such as autoimmune disorders, asthma susceptibility, and metabolic traits. The team also confirmed a specific variant of the SUB1 gene, which is critical for DNA repair, exclusively in individuals of Peruvian descent; the variant is missing from existing reference maps.

The authors call for a coordinated global effort to build a comprehensive human pantranscriptome, defined as a complete catalog of every RNA molecule across all human tissues, developmental stages, and population groups. The computational analysis, involving over ten terabytes of sequencing data, ran on the BSC's MareNostrum 5 supercomputer. Dr. Melé explained that while the pangenome offers insight into static DNA diversity, the pantranscriptome offers a dynamic view by revealing which genes are active in each cell.

Dr. Guigó concluded that these findings are likely only the tip of the iceberg, since the study was limited to blood cells from adults, and that the lack of complete data is significantly impeding progress in personalized medicine.
