A groundbreaking library has been created that hopes to represent all of humanity and ensure that no genetic variations are left out.
The entire human genome was first sequenced in April 2003 as part of the landmark Human Genome Project, and the news was shared with well-deserved fanfare. It provided fundamental information about the human blueprint (DNA) and showed that we all share 99.6 percent of genetic information.
Now, twenty years later, scientists are still working to decipher the remaining 0.4 percent, which represents the variety and diversity of the human species. There is good news, however: researchers from many countries, including Canada, Denmark, Germany, Italy, Japan, Spain, the UAE, the UK, and the US, working together as the Human Pangenome Reference Consortium, have now created a draft of what they call the human pangenome reference. This is a collection of sequenced human genomes that aims to represent the wide variety of DNA sequences found across Homo sapiens. It is a first step in comparing genetic variation that will enable us to understand how our genes vary and mutate.
What is a genome?
A genome is all the genetic information of an organism. It consists of nucleotide sequences of DNA (or RNA in RNA viruses).
What is the pangenome?
The pangenome is a composite (or the beginnings of a library) of genome sequences from 47 people compiled into one data structure. It is essentially a map of genetic code and represents human beings of diverse ethnic backgrounds. These 47 genomes were compared to assess what is identical and what is different among people. The idea is for it to serve as a reference for researchers trying to predict, diagnose and treat diseases for a variety of people. It will also help to better understand genetic variation across our species.
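To make the idea of comparing genomes "to assess what is identical and what is different" more concrete, here is a minimal sketch in Python. It is only an illustration under strong assumptions: the sequences are short, invented, and already aligned, and the names `genomes` and `compare_genomes` are hypothetical. The real pangenome is built from long-read assemblies with far more sophisticated methods.

```python
from collections import Counter

# Hypothetical, already-aligned toy sequences (invented, not real data).
genomes = {
    "person_a": "ACGTACGTAC",
    "person_b": "ACGTACGTAC",
    "person_c": "ACGAACGTAC",
    "person_d": "ACGTACGTTC",
}


def compare_genomes(aligned):
    """Split positions into those shared by everyone and those that vary."""
    sequences = list(aligned.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("sequences must be aligned to the same length")

    shared = []    # positions where every person has the same letter
    variants = {}  # position -> how many people carry each letter
    for pos in range(length):
        counts = Counter(seq[pos] for seq in sequences)
        if len(counts) == 1:
            shared.append(pos)
        else:
            variants[pos] = dict(counts)
    return shared, variants


shared, variants = compare_genomes(genomes)
percent_shared = 100 * len(shared) / len(genomes["person_a"])
print(f"shared positions: {len(shared)} ({percent_shared:.0f}%)")
for pos, counts in variants.items():
    print(f"position {pos} varies: {counts}")
```

In this toy data, most positions are identical across all four sequences and only two positions differ, which mirrors, on a very small scale, the article's point that people share most of their genetic information while a small fraction carries the variation.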
According to Nature News: “Many of the samples being analysed are from people who took part in the 1000 Genomes Project, a sequencing effort initiated in 2008 to map genetic variation across 26 diverse populations. The participants’ frozen DNA samples are being defrosted and reanalysed by the pangenome consortium using a more detailed technique called ‘long-read sequencing’. This analyses longer sections of DNA at a time compared with older sequencing methods, and can distinguish between chromosome pairs from the same person. ‘It’s a much higher-resolution approach,’ says Keolu Fox, a genome scientist at the University of California, San Diego.”
While the Human Genome Project was itself groundbreaking, it sourced material from 20 people (mostly European), and the majority of its information was obtained from one mixed-race person. In the pangenome, by contrast, a composite of different genomes is compared to highlight variations.
“It’s been long needed—and they’ve done a very good job,” Ewan Birney, a geneticist at the European Molecular Biology Laboratory who did not contribute to the work, tells the New York Times’ Elie Dolgin.
“It’s an exceptional advance,” Mashaal Sohail, an evolutionary geneticist at the National Autonomous University of Mexico who was not involved in the research, tells Science’s Rodrigo Pérez Ortega. “It’s making the picture of human genetic variation more accurate and more complete.”
“It’s something that we have all been waiting for,” says Aimé Lumaka, a geneticist who holds a joint position at the University of Liège in Belgium and the University of Kinshasa in the Democratic Republic of the Congo. “The current reference genome is missing not only part of the genomic information but, most importantly, it’s missing diversity,” he says.
What can the pangenome be used for?
Such a reference genome could prove crucial for medicine, helping doctors to identify mutations in patients and to diagnose genetic conditions.
“We’re missing quite a bit of information that can contribute to our knowledge of health disparities and health inequities,” Krystal Tsosie, a genetic epidemiologist at Arizona State University, tells Science.
What is next for the pangenome?
While this is a great step forward, the pangenome still needs to represent much more of global human diversity. “It’s still underrepresenting Latin Americans and Indigenous Americans, and … there’s nobody included from Oceania,” O’Connor tells Science News. “There’s still a lot more variation that needs to be added to the pangenome to really, truly be representative of everyone.”
Because of this, the scientists aim to expand the reference genome to include sequences from 350 different people by 2024.
The researchers are also working to ensure that ethical concerns are addressed, especially since the Human Genome Project was criticized for not engaging properly with the marginalized communities it sampled.
“I am concerned that many of the participating pangenome locations have samples that were collected in the 1980s under very different political and social structures,” Latifa Jackson, a geneticist at Howard University, tells Nature News. “We need to revisit ideas of consent, especially for samples collected 30–40 years ago under very different power structures.”
The draft pangenome was produced by the Human Pangenome Reference Consortium, which was launched in 2019. It is an international project that aims to “map the entirety of human genetic variation, to create a comprehensive reference against which geneticists will be able to compare other sequences. Such a reference would aid studies investigating potential links between genes and disease.”
The research was published in Nature.