A collaboration between Phase Genomics and Pacific Biosciences of California is bringing about the next generation of genome assembly technology. A newly published software tool, FALCON-Phase, combines genomic proximity ligation methods developed by Phase Genomics™ with high-accuracy, long-read sequencing data from PacBio®, enabling researchers to create haplotype-resolved genome sequences on a chromosomal scale without parental genome data. The method and its application to several animal genomes were published today in Nature Communications.
Humans, as well as other animals, carry DNA sequence copies from both parents. These parental sequence “haplotypes” can carry millions of mutations unique to one of the parents and are often very relevant to diseases and other genetic traits. Until recently, accurately separating paternal and maternal mutations on the whole-genome scale required sequence information from the individual parents or extensive efforts that relied heavily on imputation from population studies. The new method employs the physical proximity information captured by proximity ligation (a technology also known as “Hi-C”) to separate maternal and paternal haplotype information from long-read genome assemblies. This development significantly increases the actionable information content coming out of genome sequencing studies.
“It’s an exciting time for genome assembly and PacBio HiFi sequencing continues to lead the way in this area with its powerful combination of read length and accuracy,” wrote Jonas Korlach, Chief Scientific Officer at Pacific Biosciences. “Phase Genomics Hi-C complements PacBio technology by extending our data into the ultra-long-range domain, enabling us to connect phase blocks and deliver chromosome-scale diploid assemblies without parental data. We are fortunate to have this excellent partnership with Phase Genomics, and we look forward to continuing to work together to create the highest quality reference genomes available.”
Assembling two fully phased genomes in a single, streamlined process not only reduces research costs but also enables scientists to upgrade their genome assembly pipelines and obtain previously unobtainable information.
Dr. Erich Jarvis, professor at Rockefeller University and chair of the international Vertebrate Genomes Project, wrote, “Chromosome-scale haplotype phasing is critical for generating accurate genome assemblies and for understanding genomic variation within a species.” Furthermore, FALCON-Phase produces maternal and paternal haplotypes without family-trio data, so it can be applied to wild-caught samples or organisms lacking pedigree information. Jarvis notes, “In wild populations that many work with, parental samples are usually unavailable and therefore we need a method that can phase paternal and maternal sequences in the offspring individuals. With FALCON-Phase, we are able to use the Hi-C data that we have already generated for genome scaffolding and add a new dimension to every genome assembly, even retrospectively for previous projects. Our collaboration with Phase Genomics and PacBio has been extremely fruitful and the combination of the two technologies through FALCON-Phase will be highly beneficial to genomic sequencing efforts focused on conservation.”
FALCON-Phase is applicable to any diploid genome, including plants, animals, and fungi. It is available free of charge as open-source software (https://github.com/phasegenomics/FALCON-Phase), and Phase Genomics offers services that apply this method to a variety of genome projects. See the latest news and publications on this and other genome assembly methods at https://phasegenomics.com/resources-and-support/publications/.
For more information, email us at info@phasegenomics.com.