Sequencing of 64 complete human genomes to better capture genetic diversity


Structure of genome. Credit: NIH

64 sequenced human genomes will serve as a new reference for genetic variation and predisposition to human diseases

Researchers at the University of Maryland School of Medicine (UMSOM) co-authored a study, published today in the journal Science, that describes the sequencing of 64 complete human genomes. This reference data includes individuals from around the world and better captures the genetic diversity of the human species. Among other things, the work will enable population-specific studies of genetic predispositions to human diseases, as well as the discovery of more complex forms of genetic variation.

Twenty years ago this month, the International Human Genome Sequencing Consortium announced the first draft of the reference sequence of the human genome. The Human Genome Project, as it was called, required 11 years of work and involved more than 1,000 scientists from 40 countries. However, this reference did not represent a single individual. It was instead a composite of several people, and it could not accurately capture the complexity of human genetic variation.

Building on this, scientists have conducted several sequencing projects over the past twenty years to identify and catalog genetic differences between an individual and the reference genome. These projects usually focused on small single-base changes and missed larger genetic changes. Current technologies are now beginning to detect and characterize larger differences, known as structural variants, such as the insertion of new genetic material. Structural variants are more likely than smaller genetic differences to interfere with gene function.

The new study in Science announces a new and significantly more comprehensive reference dataset, obtained using a combination of advanced sequencing and mapping technologies. The new reference dataset reflects 64 assembled human genomes, representing 25 different human populations from around the world. Importantly, each of the genomes was assembled without guidance from the first human genome. As a result, the new dataset can better capture genetic differences between different human populations.

“We have entered a new era in genomics where whole human genomes can be sequenced with exciting new technologies that provide a more comprehensive and accurate reading of the DNA bases,” says study co-author Scott Devine, PhD, associate professor of Medicine at UMSOM and faculty member of IGS. “It allows researchers to study areas of the genome that were previously inaccessible but relevant to human traits and diseases.”

The Genomics Resource Center (GRC) at the Institute for Genome Sciences (IGS) was one of three sequencing centers, along with Jackson Labs and the University of Washington, that generated the data using a new sequencing technology recently developed by Pacific Biosciences. The GRC was one of only five early access centers asked to test the new platform.

Dr. Devine led the sequencing for this study and also led the subgroup of authors who discovered the presence of “mobile elements” (i.e., pieces of DNA that can move around and insert themselves into other parts of the genome). Other members of the Institute for Genome Sciences (IGS) at the University of Maryland School of Medicine are among the 65 co-authors. Luke Tallon, PhD, scientific director of the Genomics Resource Center, worked with Dr. Devine to generate one of the first human genome sequences on the Pacific Biosciences platform that contributed to this study. Nelson Chuang, a graduate student in Dr. Devine’s laboratory, also contributed to the project.

“This important new research represents a giant step forward in our understanding of the basis of genetically driven health conditions,” said E. Albert Reece, MD, PhD, MBA, Executive Vice President for Medical Affairs, UM Baltimore, and the John Z. and Akiko K. Bowers Distinguished Professor and Dean, University of Maryland School of Medicine. “These advances will hopefully fuel future studies to understand the impact of human genome variation on human diseases.”

Reference: “Haplotype-resolved diverse human genomes and integrated analysis of structural variation” by Peter Ebert, Peter A. Audano, Qihui Zhu, Bernardo Rodriguez-Martin, David Porubsky, Marc Jan Bonder, Arvis Sulovari, Jana Ebler, Weichen Zhou, Rebecca Serra Mari, Feyza Yilmaz, Xuefang Zhao, PingHsun Hsieh, Joyce Lee, Sushant Kumar, Jiadong Lin, Tobias Rausch, Yu Chen, Jingwen Ren, Martin Santamarina, Wolfram Höps, Hufsah Ashraf, Nelson T. Chuang, Xiaofei Yang, Katherine M. Munson, Alexandra P. Lewis, Susan Fairley, Luke J. Tallon, Wayne E. Clarke, Anna O. Basile, Marta Byrska-Bishop, André Corvelo, Uday S. Evani, Tsung-Yu Lu, Mark J. P. Chaisson, Junjie Chen, Chong Li, Harrison Brand, Aaron M. Wenger, Maryam Ghareghani, William T. Harvey, Benjamin Raeder, Patrick Hasenfeld, Allison A. Regier, Haley J. Abel, Ira M. Hall, Paul Flicek, Oliver Stegle, Mark B. Gerstein, Jose M. C. Tubio, Zepeng Mu, Yang I. Li, Xinghua Shi, Alex R. Hastie, Kai Ye, Zechen Chong, Ashley D. Sanders, Michael C. Zody, Michael E. Talkowski, Ryan E. Mills, Scott E. Devine, Charles Lee, Jan O. Korbel, Tobias Marschall and Evan E. Eichler, 25 February 2021, Science.
DOI: 10.1126/science.abf7117
