Pangenome Project: completing and improving the human genetic map

The Human Genome Project, which spelled out the letter sequence of the double helix, was a revolution in its time and for many years afterward. But progress in the field has since made its result outdated, as it is incomplete and unrepresentative of human diversity. Now another project, the pangenome, promises to fill the gaps that remained and to show a broader picture of the differences between people.

It all started in 2001, when the Human Genome Project, an ambitious research effort aimed at deciphering the chemical makeup of the entire genetic code, released a first draft of the sequence. In 2003, the international consortium completed the work, covering 92 percent of the total target. The technologies needed to decipher the missing 8 percent, which concealed information about relevant biological processes, did not yet exist at the time. Furthermore, this map was based on samples from only twenty volunteers.

Even so, the result became a reference (known in the scientific community by the code GRCh38) and has been, ever since, the backbone of human genomics. For a long time, it was a powerful tool. Over the years, however, it proved to be limited: it was not broad or representative enough to reflect the range of human beings, and so it failed to cover elements with complicated technical names such as “structural variants” and “alternative alleles”.

More than two decades later, the Human Pangenome Reference Consortium, another international initiative that brings together scientists from ten countries, set out to correct these problems. The group recently published a series of articles in the scientific journal Nature describing a new model for sequencing the genetic code. More complex, true, but much more complete.

The investigation combines material collected from 47 genetically diverse people to create a fuller picture of the variety that characterizes human beings. About half of those people are of African descent. Most of the rest are from Latin America and South and East Asia. Only one had European ancestry. Using advanced computer algorithms, the researchers aligned the corresponding sequences within the various genomes.

The objective is to increase the number of individuals analyzed to 350 by the middle of 2024. Since each person carries two copies of every chromosome, the current reference includes 94 different genomic sequences, and the goal is to reach 700 by the end of the project.
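The arithmetic behind these figures can be sketched in a few lines of Python. This is only an illustration: the function name `sequence_count` and the constant are hypothetical, and the single assumption is the article's own, that each person contributes two genomic sequences.

```python
# Assumption taken from the article: every individual contributes
# two genomic sequences (one copy of each chromosome from each parent).
SEQUENCES_PER_PERSON = 2


def sequence_count(people: int) -> int:
    """Return how many genomic sequences a group of people contributes."""
    return people * SEQUENCES_PER_PERSON


current = sequence_count(47)    # the 47 people in the current reference
planned = sequence_count(350)   # the 350 people targeted by the project
print(current, planned)
```

Under that assumption, the 47 people sampled so far account for the 94 sequences cited, and 350 people would account for the 700 the consortium is aiming for.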

“A single genome does not represent all the rich variation that we know can be observed and studied throughout the world,” explains the American geneticist Karen Miga, of the University of California at Santa Cruz, one of the researchers involved in the project. “The number one goal of the human pangenome is to try to broaden the representation so that it is more inclusive and more equitable in the study of the human species, with a collection of references and not just one,” she points out.

The researchers found more than 1,100 gene duplications in the pangenome that were missing from GRCh38, the reference that grew out of the Human Genome Project. The pangenome also contains over 100 million more base pairs (the “letters” of DNA) than GRCh38. Using the pangenome to identify small variants in sequencing data reduced errors by 34 percent compared to using GRCh38.

The gaps left in the previous reference have now been filled: almost 120 million letters of DNA that were previously missing have been added to the three-billion-letter code.

What is it for?

In addition to better representing human diversity, the new pangenome will play a vital role in medicine. A genome is the set of instructions encoded in DNA (whose molecular structure was described by the American James Watson and the Briton Francis Crick, in 1953) that helps all living things to develop and function. The code sequences differ slightly between individuals. In the case of humans, the genetic similarity between two random people is as high as 99 percent. What makes them unique, and here is the wonder of their existence, is the remaining one percent, which can provide information about a person’s health, helping to diagnose disease, predict outcomes, and guide medical treatment.

To understand these differences, scientists are building a reference made up of multiple genomes. “Everyone has a unique genome, so using a single reference can lead to inequities in the analysis,” says Adam Phillippy, a researcher at the National Human Genome Research Institute and co-author of the lead study. And he gives an example: “Predicting a genetic disease might not work as well for someone whose genome is very different from the reference.”

Another application of the new approach will be pharmacogenomics, a branch of pharmacology that studies how patients respond to drugs and disease treatments, taking into account the genetic variation between individuals. The same medication that is effective for one person may be ineffective or toxic for another. This depends both on the individual’s genome and on the speed at which the body metabolizes the drug. If a fast metabolizer requires a higher dose, a slow metabolizer may need the opposite, since the active ingredient can accumulate in the body and become toxic.

“All of this depends on our genome,” summarizes Mayana Zatz, a geneticist at the Institute of Biosciences of the University of São Paulo, in Brazil. “It is one of the reasons why it is important to know the ethnic composition of each population and to find out which drugs are suitable. The current extraordinary results are only an intermediate stage of the project.”

“It will be the beginning of a new era, one that more meaningfully incorporates human diversity into the biological sciences,” says geneticist Ting Wang of Washington University in St. Louis. The pangenome also marks the beginning of a period in which people’s genomic sequence and variation will be clearly visible. “The new reference shows that each of us carries pieces of DNA that are unusual or unique,” says the American geneticist Erik Garrison, of the University of Tennessee, who is involved in the international consortium.

“These are parts associated with immune function or environmental interactions, critical for health,” he adds. A scientific reference that incorporates people of all genetic origins, without any bias, as is now the case with the pangenome, represents an extraordinary leap for civilization. More than that: it reflects the beauty of human diversity.

By Alessandro Giannini (Veja Magazine) and Andrea Gentil
