
Catching Evolution in the Act


 

Genome sequencing has confirmed some long-held theories about the blueprints of life. But it has also unearthed quite a few surprises. Scientists once hypothesized that the human genome consisted of upward of 100,000 genes. The 13-year Human Genome Project, along with many next-generation sequencing studies, has prompted the downward revision of that figure to a relatively spartan 20,000 genes, more or less.

 

Evolution in action

 

If there is a lesson in this vast overestimation of our gene count, it is perhaps that evolution shapes genomes in unexpected ways.

 

The advent of more nimble methods for genome assembly and analysis holds the promise of unearthing the surprises that evolution has wrought. These relatively new advances include tools like Phase Genomics’ ultra-long-range sequencing, which reconstructs the sequence of chromosomes by using positional relationships between DNA sequences in the genome. These methods have grown sufficiently sophisticated to catch the quick transitions that transform populations and species.
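
To make the idea of "positional relationships" concrete, here is a minimal sketch of how proximity data can order pieces of a chromosome. It assumes that sequences lying closer together on a chromosome share more proximity-ligation links, and it uses a simple greedy chain as a stand-in; it is not the algorithm used by Phase Genomics' tools. The contig names and link counts are invented for illustration.

```python
from typing import Dict, FrozenSet, List

# Hypothetical input: number of read pairs linking each pair of contigs.
LinkCounts = Dict[FrozenSet[str], int]


def link_count(links: LinkCounts, a: str, b: str) -> int:
    """Return the link count between two contigs (0 if none observed)."""
    return links.get(frozenset((a, b)), 0)


def greedy_order(contigs: List[str], links: LinkCounts) -> List[str]:
    """Order contigs by growing a chain from the most strongly linked pair.

    At each step, the unplaced contig with the most links to either end of
    the chain is attached to that end.
    """
    seed = max(links, key=links.get)  # most strongly linked pair
    order = sorted(seed)
    remaining = set(contigs) - set(order)
    while remaining:
        best = None  # (score, contig, side)
        for contig in sorted(remaining):
            for side, end in (("left", order[0]), ("right", order[-1])):
                score = link_count(links, contig, end)
                if best is None or score > best[0]:
                    best = (score, contig, side)
        _, contig, side = best
        if side == "left":
            order.insert(0, contig)
        else:
            order.append(contig)
        remaining.remove(contig)
    return order


# Invented toy data: the true chromosome order is A, B, C, D.
toy_links: LinkCounts = {
    frozenset(("A", "B")): 90,
    frozenset(("B", "C")): 80,
    frozenset(("C", "D")): 85,
    frozenset(("A", "C")): 30,
    frozenset(("B", "D")): 25,
    frozenset(("A", "D")): 5,
}
toy_order = greedy_order(["C", "A", "D", "B"], toy_links)
print(toy_order)
```

A greedy chain like this is easy to follow but can be misled by noisy link counts; production scaffolders also have to decide contig orientation and which contigs belong to which chromosome.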

 

Recently a team led by Dr. Leonid Kruglyak at UCLA employed these tools to catch evolution at work. Their discovery relates to sex determination, a complex developmental process that, in animals, generally kicks off when an immature gonad develops into either testes or ovaries. In humans and many animals, sex determination is governed largely by genes, and in turn shapes their genomes and evolutionary trajectories like few other biological processes can.

 

That special pair

 

For species with full genetic control over sex determination, the process often leaves its imprint on the genome in the form of sex chromosomes. In most animals, genomes consist of pairs of chromosomes called autosomes. But in addition to those autosomes, many animals — including us — harbor another set of chromosomes called the sex chromosomes. Sex chromosomes govern — or at least try to govern — whether the gonads develop into ovaries or testes, which  in turn influences the development of genitals and secondary sex characteristics.

 

Scientists have long theorized that sex chromosomes evolve from autosomes. Studies of young, relatively new sex chromosome systems, like those in the medaka, indicate that the transition happens fast. Yet the steps that transform a pair of autosomes into sex chromosomes are at best murky, with many questions unresolved. Much could be answered by catching this transition from autosome to sex chromosome in the act.

 

Behind the curtain

In a paper published June 1 in Nature, Dr. Kruglyak and his colleagues announced that they have found just such a transition: an animal with a pair of autosomes that is beginning to act like sex chromosomes. The researchers utilized Phase Genomics’ Proximo™ genome scaffolding platform and PacBio long reads to sequence and assemble a highly complete genome for a microscopic, freshwater flatworm, Schmidtea mediterranea. In many parts of its natural habitat across the Mediterranean basin, S. mediterranea reproduces by budding, without the need for sex. But some populations in Corsica and Sardinia produce the next generation through sexual reproduction.

 

The team, including lead and co-corresponding author Dr. Longhua Guo at UCLA, discovered that in these sexual strains of S. mediterranea, one pair of autosomes shows evidence of almost no genetic exchange, also known as recombination, during reproduction. This is a telltale signature of sex chromosomes. In addition, they saw that the unusual pair of autosomes harbors a large contingent of genes that play a role in developing sex-specific characteristics. Taken together, these genomic data finger these autosomes as a “sex-primed” pair that are in the process of evolving into fully fledged sex chromosomes.

 

Photo finishes

 

Future studies of S. mediterranea’s nascent sex chromosomes will likely fuel fresh inquiry and debate about this rarely seen evolutionary transition. The answers will stretch far beyond flatworms. Studies of other recently evolved systems, such as in stickleback fish, show that sex chromosomes can play a decisive role in other poorly understood evolutionary transitions, such as the rise of a new species.

 

Beyond sex chromosomes, this study demonstrates the raw interrogative power of modern genome assembly and analysis methods. They can capture transitions — even the most brief and ephemeral. Applied appropriately, methods like these can help scientists make sense of a myriad of messy, complex processes that evolution shapes. These include some issues that hit as close to home as gonads, from curbing the spread of antibiotic resistance to protecting pollinators from annihilation. Evolution moves quickly. Now, so can we.

 

100 Publications

 

 

Over 100 scientific papers have been published using Phase Genomics technology!

 

Since our founding in 2015, we have sought to bring transformative change to research, industry, and the clinic by building and providing cutting-edge genomic solutions to scientists all over the globe. Now, in 2021, we are happy to look back at the accomplishments made by those using our kits, services, and software.

 

Our team of researchers, computational scientists, and bioinformaticians has refined our ProxiMeta and Proximo Platforms (as well as many other products) to construct platinum genomes, master the microbiome, and expand our knowledge of the human genome and epigenomics. From potatoes to people, cassava to cannabis, bison to basenjis, our molecular tools and software have been used to drive genomic discoveries across many scientific fields. We encourage you to take a look at the fascinating collection of articles we have compiled here that explore more research using our technology.

 

Over the years, we have also helped break records and make headlines as researchers use our platforms to make breakthroughs in science.

 

A Question Hidden in the Platypus Genome: Are We the Weird Ones?

-The New York Times

Phase Genomics Releases Platform for Discovering New Viruses in Microbiome Samples

-BusinessWire

Precision Medicine Looks beyond DNA Sequences

-Genetic Engineering and Biotechnology News

 

We are grateful to all the researchers who have been working with us to accomplish these feats. Together, we can drive innovation and continue to make advances in genomic science. We will continue to work on ways to add applications and support current research, making it easier to get high-quality data and comprehensive reports.

 

Follow us on social media (Twitter, LinkedIn, YouTube) or subscribe to our quarterly newsletter (Phasebook) to receive updates on our technology and highlights from the latest in genomics.

New genome assembly method makes fruitful advances in genomic technology

 

A collaboration between Phase Genomics and Pacific Biosciences of California is bringing about the next generation of genome assembly technology. A newly published software tool, FALCON-Phase, combines genomic proximity ligation methods developed by Phase Genomics™ with highly accurate long-read sequencing data from PacBio®, enabling researchers to create haplotype-resolved genome sequences on a chromosomal scale without parental genome data. This method and its application to several animal genomes were published today in Nature Communications.


Humans, as well as other animals, carry DNA sequence copies from both parents. These parental sequence “haplotypes” can carry millions of mutations unique to one of the parents and are often very relevant to diseases and other genetic traits. Until recently, accurately separating paternal and maternal mutations on the whole-genome scale required sequence information from the individual parents or extensive efforts that relied heavily on imputation from population studies. The new method employs the physical proximity information captured by proximity ligation (a technology also known as “Hi-C”) to separate maternal and paternal haplotype information from long-read genome assemblies. This development significantly increases the actionable information content coming out of genome sequencing studies.

 

 

“It’s an exciting time for genome assembly and PacBio HiFi sequencing continues to lead the way in this area with its powerful combination of read length and accuracy,” wrote Jonas Korlach, Chief Scientific Officer at Pacific Biosciences. “Phase Genomics Hi-C complements PacBio technology by extending our data into the ultra-long-range domain, enabling us to connect phase blocks and deliver chromosome-scale diploid assemblies without parental data. We are fortunate to have this excellent partnership with Phase Genomics, and we look forward to continuing to work together to create the highest quality reference genomes available.”

 

Assembling two fully-phased genomes in a single, streamlined process not only saves on the costs of research, but it also enables scientists to upgrade their genome assembly pipelines and obtain previously unobtainable information.

 

Dr. Erich Jarvis, professor at Rockefeller University and chair of the international Vertebrate Genomes Project, wrote, “Chromosome-scale haplotype phasing is critical for generating accurate genome assemblies and for understanding genomic variation within a species.” Furthermore, FALCON-Phase produces maternal and paternal haplotypes without family-trio data, so it can be applied to wild-caught samples or organisms lacking pedigree information. Jarvis notes, “In wild populations that many work with, parental samples are usually unavailable and therefore we need a method that can phase paternal and maternal sequences in the offspring individuals. With FALCON-Phase, we are able to use the Hi-C data that we have already generated for genome scaffolding and add a new dimension to every genome assembly, even retrospectively for previous projects. Our collaboration with Phase Genomics and PacBio has been extremely fruitful and the combination of the two technologies through FALCON-Phase will be highly beneficial to genomic sequencing efforts focused on conservation.”

 

FALCON-Phase is applicable to any diploid genome, including plants, animals, and fungi. It is available free of charge as open-source software (https://github.com/phasegenomics/FALCON-Phase), and Phase Genomics offers services that apply this method to a variety of genome projects. See the latest news and publications on this and other genome assembly methods at https://phasegenomics.com/resources-and-support/publications/.

 

For more information, email us at info@phasegenomics.com.

Breaking the Mold: New Tech Sheds Light on 5 Mysteries of the Fungal World

 

This month Phase Genomics is celebrating #FungusFebruary by highlighting some of the unique capabilities of our Hi-C technology to solve age-old mysteries in the world of fungal genetics and deliver new potential for researchers to understand fungi, all while helping solve global crop crises and develop new groundbreaking pharmaceuticals.

While we wield the power of genomics to explore the wonders of fungi today, a few centuries ago people dismissed them as just weird plants. Eventually microscopes and anatomical studies revealed fungi as a distinct flavor of life — some varieties quite tasty — but educational experts today continue to bemoan the lack of lessons on fungi in biology curricula, and research on fungi — even those that cause disease — lags.

As a result, scientists lack much basic information on the genetics, life cycles, and reproductive habits of many fungi — even though members of this kingdom could help address a bevy of challenges in food and energy production, illuminate the evolution of complex life and even shelter us on Mars.

Genome studies on fungi of all stripes can resolve evolutionary relationships and ecosystem dynamics, identify metabolites of commercial and medical interest and — for fungi that cause disease — reveal biochemical and genetic targets to help us fight pathogenicity.

Like those of their animal and plant cousins, fungal genomes have their challenging parts, including repeats, duplications and structural elements that complicate both sequencing and assembly. Recently, the chromosome conformation capture method “Hi-C” and advances in next-generation sequencing have helped untangle some of these sticky genomic knots, and show promise in taming genomes across this diverse and neglected kingdom of life.

 

1. High-resolution mapping of centromeres

Hi-C’s power lies in its ability to identify regions of the genome that reside in close proximity to one another in the nucleus, information that essentially captures the 3D organization of the genome. But Hi-C doesn’t just identify where particular chromosomes reside within the nucleus. It can also help identify functional elements in genomes that are difficult to identify in other ways.

That is what two groups of researchers (from the Pasteur Institute and the University of Washington) did when they used Hi-C to track down two kinds of functional elements in yeast genomes, centromeres and rDNA clusters. Both are typically repeat-rich and difficult to identify without laborious experiments involving functional assays or mapping the binding sites of rare centromere proteins. In fungal species, centromeres are held tightly together at the spindle pole body, and the teams used this shared proximity to identify centromere locations in the genomes of numerous yeasts (and subsequently other fungi), despite not knowing their centromeric DNA sequence. Ribosomal DNA clusters similarly congregate in yeast nuclei, which one team exploited to identify their positions in Debaryomyces hansenii.
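
The clustering logic can be illustrated with a small sketch. It assumes a genome cut into fixed-size bins and a table of Hi-C contact counts between bins, and it simply picks, for each chromosome, the bin with the most contacts to other chromosomes. The published analyses fit more careful models to the contact signal; the bin layout, the function name `centromere_bins`, and every count below are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Tuple

Bin = Tuple[str, int]  # (chromosome name, bin index); hypothetical layout


def centromere_bins(contacts: Dict[Tuple[Bin, Bin], int]) -> Dict[str, int]:
    """Guess one centromere bin per chromosome.

    Because centromeres cluster together in the nucleus, the bin holding a
    centromere should have unusually many contacts with OTHER chromosomes.
    """
    trans_total: Dict[Bin, int] = defaultdict(int)
    for (a, b), count in contacts.items():
        if a[0] != b[0]:  # keep inter-chromosomal contacts only
            trans_total[a] += count
            trans_total[b] += count

    best: Dict[str, int] = {}
    for (chrom, idx), total in trans_total.items():
        if chrom not in best or total > trans_total[(chrom, best[chrom])]:
            best[chrom] = idx
    return best


# Invented toy data: centromeres sit in chrI bin 2 and chrII bin 1.
toy_contacts = {
    (("chrI", 2), ("chrII", 1)): 50,   # centromere-to-centromere signal
    (("chrI", 1), ("chrII", 1)): 12,
    (("chrI", 2), ("chrII", 2)): 10,
    (("chrI", 0), ("chrII", 3)): 2,
    (("chrI", 4), ("chrII", 0)): 3,
    (("chrI", 0), ("chrI", 1)): 200,   # same chromosome, ignored
}
toy_centromeres = centromere_bins(toy_contacts)
print(toy_centromeres)
```

The same idea carries over to rDNA clusters, with the caveat that real contact maps need normalization before bin totals can be compared.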

 

2. High-quality genomes illuminate biochemical pathways

Fungi harbor a wide array of genes for synthesizing secondary metabolites, which range from harmful toxins to helpful pharmaceuticals. In fungi, genes for synthesizing secondary metabolites tend to occur in clusters, which are also thought to be sites of rapid evolution.

Phase Genomics worked with a University of Minnesota-led team and used Hi-C to generate high-quality genomes of six strains of Tolypocladium inflatum, an insect pathogen that has already given us the immunosuppressant drug cyclosporin. The new assemblies revealed major differences in secondary metabolite production between T. inflatum strains, including novel clusters, transpositions and clusters that may be involved in toxin synthesis. The bevy of discoveries from these assemblies showed how recombination can drive significant divergence even within a single species — and how important it is to build multiple high-quality genome assemblies that can capture that diversity.

 

3. Fungal dikaryons and the hidden nuclear dance

The genetic differences between strains also apply to pathogenic fungi, like the stem rust, which parasitizes wheat. Phase Genomics partnered with a team led by scientists at CSIRO in Australia to apply Hi-C to stem rust, including the particularly deadly strain Ug99. Like many fungi, stem rust carries its genome in two haploid nuclei. The team used Hi-C data to assemble complete haplotypes for both haploid genomes of each of the two strains it studied, and discovered that Ug99, a recent arrival that is decimating whole fields of wheat in Africa, has an unexpected origin: The strain arose through “somatic hybridization,” when hyphae from two strains exchange haploid nuclei. This may explain the strain’s sudden rise and deadly wake, and gives scientists new genomic information to understand Ug99’s virulence and identify weaknesses that could give wheat a leg up.

 

4. Hybrids, beer, and fungal metagenomics

The ability to separate two nuclei from within the same cell can be extended to more complex samples. Yeasts, which are integral players in brewing, will often hybridize to form new species containing genomes from two organisms at once (the famous lager-producing yeast Saccharomyces carlsbergensis is one example of such a hybrid). But in a mixed microbial community, such as beer, wine, or a microbiome sample, how can DNA sequencing detect which genomes co-exist within the same cell? One special power of Hi-C is that it traps sequences that are within touching distance of each other, and therefore must come from inside the same cell. The Dunham lab at the University of Washington used this property to analyze an open-fermentation beer from a local brewery. The exciting result was that they were able to discover a new hybrid yeast, later named Pichia apotheca, using Hi-C data to identify it as a hybrid bearing two genomes from related organisms. This new hybrid species has since been used by home-brewers to ply their craft and gives beer a distinctive flavor.
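
The "same cell" reasoning can be sketched as a grouping problem: contigs joined by enough Hi-C links are placed in one group, and a group that contains two distinct subgenomes points to a hybrid. The sketch below uses a union-find structure with a fixed link threshold. It is an illustration only, not the method used in the study; the contig names, link counts and the `min_links` cutoff are all invented.

```python
from typing import Dict, List, Set, Tuple


def cluster_by_links(
    contigs: List[str],
    links: Dict[Tuple[str, str], int],
    min_links: int = 5,  # hypothetical noise cutoff
) -> List[Set[str]]:
    """Group contigs that are connected by at least `min_links` Hi-C links."""
    parent = {c: c for c in contigs}

    def find(x: str) -> str:
        # Walk up to the group representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), count in links.items():
        if count >= min_links:
            parent[find(a)] = find(b)

    groups: Dict[str, Set[str]] = {}
    for c in contigs:
        groups.setdefault(find(c), set()).add(c)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Invented toy data: a hybrid yeast with subgenomes A and B, plus a bacterium.
toy_contigs = ["hybA_1", "hybA_2", "hybB_1", "bact_1", "bact_2"]
toy_cell_links = {
    ("hybA_1", "hybA_2"): 40,
    ("hybA_1", "hybB_1"): 22,  # two subgenomes linked, so they share a cell
    ("hybA_2", "hybB_1"): 18,
    ("bact_1", "bact_2"): 35,
    ("hybB_1", "bact_1"): 1,   # stray link, below the cutoff
}
toy_cells = cluster_by_links(toy_contigs, toy_cell_links)
print(toy_cells)
```

A single fixed threshold is the weakest part of this sketch, since link counts also depend on contig length and abundance.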

 

5. The Epigenetics of Symbiosis

Nature has plenty of examples of plants and fungi getting along. One of them is Epichloë festucae, a filamentous fungus that has evolved a symbiotic relationship with certain grass species. When Phase Genomics worked with a Massey University-led team, they discovered that E. festucae’s genome carries hallmarks of this symbiosis. The analysis of Hi-C data revealed that important genes are clustered into blocks separated by repeat-rich regions. Hi-C and RNA-seq data together showed that genes within the blocks have similar expression patterns — indicating that genes needed for symbiosis with their grass hosts tend to cluster together in the same blocks.

 

Looking Forward

Cutting-edge genomic technologies like Hi-C have the potential to keep making up for lost time and reveal even more intimate details of the hidden lives of fungi. This #FungusFebruary, it’s worth asking: What other mysteries about this long-overlooked kingdom are worth solving?

Q&A with Co-Authors About Bees, Mites, and Their Genomes

Co-authors Dr. Alexander Mikheyev of the Okinawa Institute of Science and Technology and Dr. Jay Evans from the U.S. Department of Agriculture’s Bee Research Laboratory had such great answers that we wanted to share some of them. This research was also featured in ZME Science.

Why is it important and useful to have a high-quality genome for Varroa species? Is there any combined value with the recently published bee genome?

Dr. Mikheyev: Understanding the mechanisms of parasitism requires detailed information about the organization of the genome. Many recent ideas for fighting Varroa rely on molecular tools, which in turn rely on genomic data. Furthermore, a good genome enables us to understand the coevolutionary interactions between mites and the bees. For now, our studies are focused on understanding how the mite has evolved to become a better parasite. However, my lab is also looking at the bee side of the coevolutionary interaction. Having high-quality genomes for both will allow us to identify genomic regions and genes involved in coevolution.

Why did you choose to use Hi-C? Why did you need chromosomes for your genome assembly?

Dr. Evans: From prior genome efforts, we had no information on the physical positions of mite gene features. Now with these in place, we can leverage synteny information from other arthropod genomes and narrow searches for some hard-to-find proteins like olfactory receptors, which often occur in clusters. Generally, the improved genome helps us know what might be unique to Varroa — and therefore a novel clue into their biology and control.

Dr. Mikheyev: One element of this study was to look at patterns of gene duplication, which could indicate diversification of particular gene families. Having a contiguous genome allows us to better localize these duplications and confirm that the different copies are homologous. In the future, when we’ll be looking at signatures of selection, a really powerful approach is to identify genomic regions with reduced genetic diversity. Having adequate chromosomal scaffolding will be essential there.

What genomic clues were found in the two Varroa species that may contribute to parasitism?

Dr. Evans: We found a clear set of genes for the proteins — olfactory receptors and others — that these mites must be using to react to their bee hosts. Hopefully, knowing these proteins will lead to smarter controls and insights into why each species maintains a specific host preference.

Dr. Mikheyev: For us, the most striking finding is this: The evolutionary trajectories of both mites, despite their similarities and close relatedness, were completely dissimilar. At this stage, it is still a bit hard to tell specifically what the selective pressures were and what the mites are adapting to. Curiously, in both species, genes involved in stress tolerance and detoxification were already under selection. This most likely happened before they ever faced miticides and suggests that they may have pre-adapted strategies for dealing with our chemical warfare strategies against them. We hope to tackle this in an upcoming study looking at population-level differences between mites adapted to original and novel hosts.

How do you hope these genomes will be used to help save honey bees?

Dr. Evans: Prior genome drafts had enough gaps that we missed candidate proteins for mite control. These mite genomes will lead to focused efforts to target pathways or traits not found in bees by techniques like small molecules, biological controls, and RNA interference.

Dr. Mikheyev: They can be used to develop new strategies for Varroa control. Also, in upcoming studies looking at how mite populations are adapted to original vs. switched hosts, we hope to identify genes and genomic regions that are specifically important in host switches.

Is there any genomic evidence that the western honeybee could be developing resistance to these pests?

Dr. Evans: Yes. Some bee breeders are targeting these traits, from behaviors to virus resistance. A recent, improved assembly of the honey bee genome — aided in part by Hi-C sequencing — is being used for trait identification and marker-assisted breeding right now.

Dr. Mikheyev: They most definitely are. Intriguingly, wild populations of honey bees seem to evolve tolerance to the mites relatively quickly. In one of my favorite studies, a USDA-monitored population in Louisiana first saw high mortality upon the arrival of Varroa, but a few years later colonies lived even longer than before. There are resistant populations known in the U.S. and in Europe, and resistance is a trait that can be selected. How this adaptation takes place in the bee is really interesting, and something we’ll continue to look into.

Isolating Varroa mites from bees involves a creative use of powdered sugar. How do you think this technique came about?

Dr. Mikheyev: We don’t know. The papers describing this method are pretty prosaic. It seems that in the late 1980s, wheat flour was used to control Varroa by knocking them off the bees — and eventually, someone tried sugar.

Dr. Evans: Since they’re attached to their bee hosts, researchers have used a variety of ‘irritants’ to get mites to fall off. Powdered sugar is safe for the bees and might even be an extra calorie boost. The bees pull sugar from each other and the mites fall off — mostly because of the sugar itself, but also because the grooming bees find them.

What is your favorite weird food that involves honey?

Dr. Mikheyev: It’s not really a food since it is honey, but I love the fact that the giant honey bees of Nepal make psychedelic honey from Rhododendron flowers. The story is worth tracking down, if for no other reason than the dramatic photos of the men who harvest this honey from sheer cliffs.

Dr. Evans: Honey lemonade. Sorry, I am required by my kids to not say weird things.

The Era of Platinum Genomes Has Arrived


 

Phase Genomics is dedicating the rest of this month (January, 2019) to the beginning of “The Era of Platinum Genomes” to celebrate recent advancements in genome assembly; researchers now have the ability to generate chromosome-scale, fully-phased diploid genome assemblies for any species by combining two technologies: long-read sequencing data from PacBio and Phase Genomics’ Hi-C.

 

At the end of this month, we will be giving away a “Platinum Genome Project,” which includes a full Hi-C service or kit project, to an attendee at the International Plant and Animal Genome Conference 2019 (PAGXXVII). This project includes using Proximo Genome Scaffolding to generate chromosome-scale scaffolds and FALCON-Phase to phase haplotypes across the entire genome. Attendees can enter the raffle by stopping by our booth (#208) throughout the conference, or enter using the form at the bottom of this page. Stay tuned for the winner announcement on January 31st, 2019 by following our Twitter account @PhaseGenomics. Offer subject to sweepstakes terms. No purchase necessary.

 

WHAT ARE PLATINUM GENOMES?

 

Much like the music industry ranks albums as gold or platinum, genomes can be classified using the same terminology, based on the completeness of the assembly and the quality of phasing (i.e., haplotype resolution). High-quality genomes that have complete chromosomes and haplotype resolution in critical sections of the genome qualify as “gold genomes,” whereas “platinum genomes” are assemblies with full chromosome scaffolds and haplotypes resolved across the entire genome.

 

Since the first human genome assembly was published, researchers from the 1000 Genomes Project and other groups have created several platinum human genomes to represent different human populations. In fact, one of our latest projects, in collaboration with PacBio, generated the most contiguous haplotype-resolved human genome to date. However, there are only a few platinum genomes for non-human organisms, as scaffolding and haplotyping entire genomes is very labor-intensive using standard tools. We are excited to offer tools such as Proximo and FALCON-Phase to help usher in the era of straightforward platinum genome assemblies for researchers studying plants and animals.

RESOURCES

Phase Genomics Workshop at PAGXXVII: Add it to your schedule.

Standard Projects Outline 

Phase Genomics Platinum Genome Sweepstakes guidelines

 

Phase Genomics and Pacific Biosciences Co-Developing new Genome Assembly Phasing Software


“FALCON-Phase” – an algorithm for producing diploid genomes.

 

Phase Genomics has entered into a co-development agreement with Pacific Biosciences to develop FALCON-Phase, a software module that combines Hi-C data with PacBio® highly accurate long-read sequencing data to produce fully-phased diploid genome assemblies. The software will be released later this spring.

FALCON-Phase augments PacBio Single Molecule, Real-Time (SMRT®) assemblies with Hi-C proximity-ligation data, generating accurate, fully-phased diploid assemblies. Specifically, it uses Hi-C’s chromatin proximity information to identify sequences belonging to the same parental chromosome in genome assemblies produced by PacBio’s FALCON-Unzip software, greatly reducing haplotype switching along the primary assembly.

Furthermore, by combining Phase Genomics’ Proximo Hi-C genome scaffolding technology with FALCON-Phase, users can fully reconstruct maternal and paternal haplotypes on a chromosomal scale. The end result is a diploid set of chromosome-scale scaffolds, or two fully-phased genomes for the same data and labor cost typical for a single genome project.

FALCON-Phase genome Phasing Graph

FALCON-Phase groups long-read contigs into two separate haplotypes based on Hi-C data. Red and blue edges show contigs connected to the same haplotype, while black edges show homologous contigs connected to both haplotypes. Colors were assigned based on known phasing of assembly, which was not otherwise used to inform FALCON-Phase analysis.
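
The grouping described above can be illustrated with a toy version of the problem. Suppose an assembler emits pairs of homologous contigs, one from each parent, but with no record of which member of each pair came from which parent. Contigs from the same parental chromosome should share more Hi-C links, so each pair can be oriented to maximize same-haplotype links. The greedy pass below is a simplified stand-in written for this post's explanation and should not be read as FALCON-Phase's actual algorithm; the contig names (m for maternal, f for paternal) and link counts are invented.

```python
from typing import Dict, FrozenSet, List, Tuple

LinkCounts = Dict[FrozenSet[str], int]  # hypothetical Hi-C link table


def links_to_group(links: LinkCounts, contig: str, group: List[str]) -> int:
    """Total Hi-C links between one contig and every contig in a group."""
    return sum(links.get(frozenset((contig, g)), 0) for g in group)


def phase_pairs(
    pairs: List[Tuple[str, str]], links: LinkCounts
) -> Tuple[List[str], List[str]]:
    """Assign each homologous pair to two haplotypes, one pair at a time.

    The first pair fixes the two haplotypes; every later pair is placed in
    whichever orientation gives more links to the haplotypes built so far.
    """
    hap_a, hap_b = [pairs[0][0]], [pairs[0][1]]
    for x, y in pairs[1:]:
        keep = links_to_group(links, x, hap_a) + links_to_group(links, y, hap_b)
        swap = links_to_group(links, y, hap_a) + links_to_group(links, x, hap_b)
        if keep >= swap:
            hap_a.append(x)
            hap_b.append(y)
        else:
            hap_a.append(y)
            hap_b.append(x)
    return hap_a, hap_b


# Invented toy data: the second pair is emitted in the "wrong" orientation.
toy_pairs = [("m1", "f1"), ("f2", "m2"), ("m3", "f3")]
toy_phase_links: LinkCounts = {
    frozenset(("m1", "m2")): 30, frozenset(("f1", "f2")): 28,
    frozenset(("m2", "m3")): 25, frozenset(("f2", "f3")): 27,
    frozenset(("m1", "m3")): 10, frozenset(("f1", "f3")): 9,
    frozenset(("m1", "f2")): 6, frozenset(("f1", "m2")): 5,
    frozenset(("m3", "f2")): 4, frozenset(("f3", "m2")): 3,
    frozenset(("m1", "f3")): 2, frozenset(("m3", "f1")): 1,
}
toy_hap_a, toy_hap_b = phase_pairs(toy_pairs, toy_phase_links)
print(toy_hap_a, toy_hap_b)
```

A single greedy pass can lock in an early mistake, which is one reason a real phasing tool would need a more robust search than this.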

These high-quality phased haplotypes can be leveraged to improve the efficiency of agricultural breeding programs, and could help identify disease-causing genomic variations in humans.

Prof. John Williams, Director of the Davies Research Centre at the University of Adelaide, Australia, wrote, “We are interested in expression of imprinted genes and for this work the availability of haplotype-resolved genome assemblies is an important advance. The release of software that enables the creation of haplotyped genome sequence assembly will revolutionize exploration of genome function. The FALCON-Phase software has this ability and can be applied retroactively to SMRT assemblies, as long as Hi-C data are available. Therefore, even pre-existing genomes can potentially be upgraded to haplotyped assemblies for little or no cost.”

Haplotype-resolved genome assembly is an exciting emerging field. Currently, there is only one other method, Trio Canu, which, unlike FALCON-Phase, requires the parents and offspring to be sequenced, adding cost. For many species, it is not possible to collect a trio in the wild and breeding is often not an option. Other Hi-C phasing techniques exist, but they phase genetic variants, not genome assemblies.

The addition of ultra-long genomic interactions captured by Hi-C to PacBio assemblies is very powerful and presents a straightforward solution to a problem experienced by almost all genomic researchers working with diploid organisms.

A formal announcement with more information is coming in the next month. For more information, email us at info@phasegenomics.com.

 

Pacific Biosciences, the Pacific Biosciences logo, PacBio and SMRT are trademarks of Pacific Biosciences of California, Inc.

A sweet new genome for the black raspberry using Proximo™ Hi-C


The black raspberry, known for its sweetness and health benefits, has been studied further to reveal its chromosome-scale genome.

What is a black raspberry, you may ask? Jams, preserves, pies, and liqueur are just a few of the delicious products made with black raspberry. But the black raspberry offers much more than exquisite flavor. For instance, did you know it contains compounds called anthocyanins that are used as dyes? It is also used in anti-aging beauty products and contains compounds that may help fight cancer. The useful properties of black raspberry are encoded within its genome.

A multinational team of scientists has built a full map of the black raspberry genome. Teams from New Zealand, Canada, and the U.S.A. contributed to the project, led by Drs. Rubina Jibran and David Chagné. The work was published in the Nature Research journal Horticulture Research. In the project, they leveraged Proximo™ Hi-C to order and orient short-read contigs into chromosome-scale scaffolds.

A chromosome-scale reference genome is an important step for basic biology and for breeding programs. Breeders can use this genome while crossing plants to select for traits like color or taste.  To learn more about how Hi-C technology was used to improve the black raspberry genome we contacted Dr. Chagné and Dr. Jibran for a Q&A session. We also wanted their take on the scientific value of Proximo Hi-C and to share their experiences working with us.

 

What is a black raspberry? How is it different from the blackberries we have in Seattle?

The black raspberry we used is no different from the ones found in Seattle. Actually, I remember seeing some black raspberries (also called black-caps) at Pike market a few years ago! Washington and Oregon are the largest producers of this delicious crop. Raspberries belong to the genus Rubus, which includes red (Rubus idaeus) and black (R. occidentalis) raspberries, blackberries, loganberries and boysenberries.

 

There are many curious uses of black raspberries, what’s yours?

Black and red raspberries are great on top of Pavlova, alongside slices of kiwifruit. Pavlova is New Zealand’s iconic dessert served around Christmas time, which is the berry fruit season down under here.

 

What are molecular breeding technologies? What are some of the traits in black raspberry you’d like to breed for?

Molecular breeding techniques use DNA to inform selection decisions. My colleague Cameron Peace from Washington State University wrote a very good review of the use of DNA-informed breeding in fruit trees. Plant & Food Research is a leader in the use of molecular tools for breeding fruit species; for example, we are using genetic markers to predict whether apple seedlings carry certain loci for black spot resistance or whether they are likely to be red fruited. The breeding goals for Plant & Food Research’s raspberry breeding programme are high fruit flavour, berry anti-oxidant content, pest and disease resistance and higher productivity.

 

The initial black raspberry genome assembly was built from short-read data. Why did you choose to scaffold the short-read contigs rather than create a new long-read assembly? Would you get chromosome-scale contigs from a long-read assembly?

Actually, we took both approaches, and we decided we would like to see how much of the short-read assembly we could put together using Proximo Hi-C. A long-read-based assembly will be released soon, and comparing the two assemblies will be extremely informative about which strategy to use for future assemblies of other crop species.

 

How did you validate the Proximity Guided Assembly (PGA) scaffolds? How did you correct errors in the scaffolds?

The PGA for black raspberry was first validated by aligning it to a linkage map and then by aligning it to the genome of strawberry (Fragaria vesca) as they have syntenic genomes.
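
The interview does not say how agreement with the linkage map was scored. One common way to summarize such a comparison, sketched here with invented marker positions, is a rank correlation between each marker's genetic position (in cM) and its physical position on the scaffold: values near 1 or -1 suggest the scaffold keeps the map order (a negative sign simply means the scaffold is in reverse orientation), while values near 0 point to misordered contigs. The function names and data are illustrative assumptions, not part of the published analysis.

```python
def rank(values):
    """Return the 0-based rank of each value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order):
        ranks[i] = r
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(xs)
    rx, ry = rank(xs), rank(ys)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Hypothetical markers on one linkage group: genetic position (cM)
# and the position (bp) where each marker aligns on the scaffold.
genetic_cm = [0.0, 5.2, 11.0, 18.4, 25.1]
physical_bp = [100_000, 900_000, 2_100_000, 3_400_000, 5_000_000]

rho_forward = spearman(genetic_cm, physical_bp)
rho_reversed = spearman(genetic_cm, physical_bp[::-1])
print(rho_forward, rho_reversed)
```

In practice, a comparison like this would be run per linkage group, with marker alignments taken from a real aligner rather than typed in by hand.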

 

What was the process like in working with Phase Genomics? Would you recommend them to your colleagues?

I really enjoy working with Phase Genomics. Black raspberry is not the first genome on which we have collaborated with Phase Genomics, as we had previously assembled genomes for kiwifruit and New Zealand manuka. The way we work with Phase Genomics is very iterative, and they are excellent at trying new methods and assembly parameters until we are satisfied with our assemblies. Every organism has its own challenges when it comes to genome assembly, and working with Phase Genomics in a very collaborative way is extremely useful. I have recommended Phase Genomics to colleagues.

New Video: From Contigs to Chromosomes

Phase Genomics CEO and Founder Ivan Liachko, Ph.D., offers an inside look at our ProxiMeta™ Hi-C and Proximo™ Hi-C technology. In this 40-minute presentation, he explains how Hi-C is revolutionizing genome and metagenome assembly. Watch “From Contigs to Chromosomes” now and reach out to http://phasegenomics.com/contact-us/ with any questions.

Thanks to IMMSA for hosting this webinar.

Orphan Crop Gains Reference Genome with Proximo Hi-C

Amaranth genome assembly brought to the chromosome scale using Phase Genomics’ Proximo Hi-C technology.

 

“Orphan crops” are growing in popularity because they have the potential to feed the world’s expanding population. You may have heard of orphan crops like quinoa or spelt, but have you heard of amaranth? The amaranth genus (Amaranthus) is a hardy group of plants that produce nutritious leaves and seeds, high in protein and vitamin content. Amaranth species grow strongly across a wide geographic range, including South America, Mesoamerica, and Asia. Amaranth was likely domesticated by the Aztec civilization and has been a staple food of Mesoamericans for thousands of years. Breeders wish to enhance amaranth’s beneficial properties, like drought resistance, nutrition, and seed production, to improve its usefulness as a food source. However, effective plant husbandry requires genetic and genomic resources, and building these resources has been inhibited by the high cost of genome sequencing and assembly.

 


Dr. Jeff Maughan (left) and Dr. Damien Lightfoot (right) are the primary authors of the amaranth genome paper.

Dr. Jeff Maughan, professor at Brigham Young University, is a champion of orphan crop genomics. Over the past year, Dr. Maughan and his team built a reference-quality amaranth genome on a tight budget. They built upon an earlier short-read assembly by adding Hi-C data, which measures the conformation of chromatin in vivo, as well as low-coverage long reads and optical mapping data. After optical mapping was used to correct errors in the short-read assembly, the Hi-C data was used to cluster the short genome fragments into nearly complete chromosomes with Phase Genomics’ Proximity-Guided Assembly platform, Proximo™ Hi-C. Then, the long reads were used to close remaining gaps on the chromosomes. This cost-effective strategy recovered over 98% of the 16 amaranth chromosomes.
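
To give a feel for the clustering step, here is a toy sketch of the underlying idea: contigs that share many Hi-C links are likely to sit on the same chromosome, so they can be grouped by merging any pair whose link count clears a threshold. This is a deliberately simplified illustration using union-find, not the algorithm Proximo actually implements, and the contig names, link counts, and threshold are all made up.

```python
from collections import defaultdict


def cluster_contigs(links, min_links):
    """Group contigs joined by at least min_links Hi-C links (union-find)."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for (a, b), count in links.items():
        root_a, root_b = find(a), find(b)  # also registers both contigs
        if count >= min_links and root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for contig in parent:
        groups[find(contig)].add(contig)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical Hi-C link counts between pairs of contigs.
toy_links = {
    ("c1", "c2"): 120,
    ("c2", "c3"): 95,
    ("c4", "c5"): 110,
    ("c1", "c4"): 4,   # weak signal between different chromosomes
    ("c3", "c5"): 6,
}

clusters = cluster_contigs(toy_links, min_links=50)
print(clusters)
```

Real scaffolders also normalize link counts (for example, by contig length) and go on to order and orient the contigs within each cluster, which this sketch leaves out.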

 

The completed reference genome provides an important resource for the community and will boost the efforts of plant breeders to unlock more agricultural benefits for amaranth. In their paper, Dr. Maughan’s team demonstrated the utility of the reference-quality genome in at least two ways. First, they looked at chromosomal evolution by comparing the amaranth genome to the beet genome, which helps researchers understand amaranth in the context of how plants evolved. Second, they mapped the genetic locus responsible for stem color, which clarifies the scientific understanding of a useful agricultural trait. Dr. Maughan points out that both of these experiments would have been impossible without the chromosome-scale genome assembly afforded by Proximo Hi-C.

 

A high-quality reference genome is the first of many important steps towards creating a modern breeding program for amaranth. We contacted Dr. Maughan to learn more about how he is improving amaranth genomics and the importance of orphan crops.

 

What is an orphan crop? 

According to the FAO (Food and Agriculture Organization of the United Nations), the world has approximately 7,000 cultivated edible plant species, but just five of them (rice, wheat, corn, millet, and sorghum) are estimated to provide 60% of the world’s energy intake, and just 30 species account for nearly all (95%) of human food energy needs. The remaining species are underutilized and often referred to as “orphan crops”.

 

How is genomics relevant to orphan crops?

Would you invest your entire 401(k) savings in just three stocks? In essence, that is what we are doing with world food security. This comes with tremendous risk. If we are going to diversify our food crops, it will be with these orphan crops. Modern plant breeding programs leverage genomics to significantly enhance genetic gain (yield); such methods will undoubtedly expedite the development of advanced varieties in orphan crop species.

 

What are the challenges facing researchers interested in orphan crop genomics?  How have you overcome them?

Funding has long been the main obstacle to developing genomic resources for orphaned crops.  The development of cheap, high-quality next-generation sequencing technology has dramatically ameliorated this problem – making genomics accessible for most plant species.

 

You used two scaffolding technologies for your assembly, Hi-C and BioNano. How did they compare?

Both technologies are extremely useful and complementary, but they address different genome assembly challenges. The Hi-C data allows for the production of chromosome-length scaffolds, while the BioNano data allows for fine-tuning and verification of the assembly.

 

Beyond building a high-quality genome assembly, what other genomic resources are required to encourage the adoption of orphan crops?

While genomic resources (such as genome assemblies and genetic markers) are fundamental for developing a modern plant breeding program, often what is missing with orphan crops is the collection of diverse germplasm (or gene bank) that is the foundation of a hybrid breeding program.  The U.S. and other nations have extensive collections (tens of thousands of accessions) that serve as the genetic foundation for staple crop breeding programs – unfortunately, such collections are minimal or non-existent for orphan crops.

 

Who stands to benefit the most from a complete amaranth genome?  How do you disseminate your work to them?

We collaborate extensively with researchers throughout South and Central America, where amaranth is already valued as a regionally important crop. Dissemination of our research occurs through traditional methods (e.g., peer-reviewed publications) as well as through sponsored scientist and student exchanges.

 

Amaranth is used in a variety of interesting foods. What’s your favorite dish?

Alegría, which is made with popped amaranth and honey, and is common throughout Mexico.

 

Spotlight on Hi-C in Science: New Technologies Boost Genome Quality

Science writer Elizabeth Pennisi outlines available genomics technologies that are helping researchers improve genome assemblies, with a focus on Hi-C’s ability to bring genome assembly to the chromosome scale.

Pennisi’s article focuses on how new technologies are improving genome quality. Long reads, optical maps, and Hi-C data are being applied together to improve modern genome assemblies, including goat (Dr. Tim Smith), hummingbird (Dr. Erich Jarvis), maize, and more. Importantly, Hi-C provides the finishing touch to these genomes by supplying ultra-long-range contiguity information that can scaffold entire chromosomes. We at Phase Genomics are glad researchers have chosen Proximo Hi-C to scaffold the goat, hummingbird, and hundreds of other assemblies into contiguous chromosome-scale reference genomes.

 

Read the article here

Spotlight on Hi-C in The Atlantic: The Game-Changing Technique That Cracked the Zika-Mosquito Genome

One of the most prolific science writers, Ed Yong, profiles how Hi-C sequencing technologies can make genome assembly easier and more cost-effective than ever before. 

Science writer Ed Yong tells the story of the researchers tackling the genome of the disease-carrying mosquito Aedes aegypti, and how Hi-C “knitted” the genome from 36,000 pieces into complete and contiguous chromosomes. Yong points out that the completed genome will not only help scientists understand the biology of the mosquito at a much deeper level, but also marks a technological pivot in genomics: Hi-C makes genome assembly cheaper, faster, and more accurate than ever before. The article also cites our collaborator Dr. Catherine Peichel’s newly published three-spine stickleback genome and Dr. Erich Jarvis’s hummingbird genome as examples of the power of Proximo Hi-C scaffolding.

 

Read the article here

Papadum’s Recipe for an Outstanding, Chromosome-Scale Genome with Hi-C

Meet Papadum the Goat! Papadum is a descendant of a rare population of goats that used to inhabit San Clemente Island and, notably, now holds the world record for the most contiguous non-model mammalian genome. The recipe for his amazing de novo genome assembly? Long reads, optical mapping, and Proximo Hi-C genome scaffolding. Read NIH’s article about Papadum’s genome here.

 

The goat genome has been of scientific interest for several reasons: goats are important suppliers of milk, cloth, meat, and more. But prior to the Papadum genome, scientists’ ability to fully understand how the goat genome controls its biology was limited. As a part of the “Feed the Future” initiative, in 2014 the U.S. Agency for International Development awarded innovative scientists Dr. Tim Smith, Dr. Derek Bickhart and Dr. Adam Phillippy a grant to attempt to eliminate these limitations by assembling Papadum’s genome. As pioneers in the genomics field, the scientists teamed up to leverage two rather young technologies, long reads and Hi-C, to create an ultra-high-quality new assembly of the goat genome.

 

Their efforts ultimately led to the creation of the highest-quality de novo genome assembly of a mammal to date, published in Nature Genetics. With this new reference-quality goat genome, scientists will have a better understanding of goat biology and health to guide better breeding decisions, improving traits like milk production, meat quality, and resilience to disease.

 

The Papadum genome assembly includes large DNA sequences called “chromosome-scale scaffolds”, which are nearly complete representations of entire chromosomes from Papadum. These chromosome-scale scaffolds are a critical achievement that allows a far better understanding of the mechanics of the goat genome than earlier, less advanced results, which included thousands of tiny fragments of chromosomes and lacked the overall structure of the goat genome. The difference is not unlike having an entire intact book versus a jumble of all the individual words from the book.

 

The ability to reconstruct nearly complete chromosomes was made possible largely by a new technique called Proximity-Guided Assembly, performed with Phase Genomics’ Proximo™ Hi-C scaffolding technology. This process was followed by a tool called PBJelly, which identifies and closes gaps (regions of uncertainty) in the chromosome-scale scaffolds. After Proximo and PBJelly, the resulting assembly included 31 chromosome-scale scaffolds containing only 663 gaps total across the 3 billion base pair diploid genome. Building on research first published in 2013, Phase Genomics has since demonstrated the Proximo Hi-C scaffolding method in the genomes of plants, animals, fungi and more.
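
For readers curious how a figure like "663 gaps" is tallied, here is a minimal sketch. It assumes gaps are written as runs of the letter N in the scaffold sequences, as is conventional in FASTA files, and counts each run as one gap. The scaffold names and sequences are toy data, and the function name is illustrative rather than part of PBJelly or Proximo.

```python
import re


def count_gaps(sequence):
    """Count gaps, treating each uninterrupted run of N (or n) as one gap."""
    return len(re.findall(r"[Nn]+", sequence))


# Toy scaffolds; real ones would be read from a FASTA file.
toy_scaffolds = {
    "scaffold_1": "ACGTNNNNACGTNNACGT",  # two gaps
    "scaffold_2": "ACGTACGT",            # no gaps
    "scaffold_3": "GGNNNNCC",            # one gap
}

gaps_per_scaffold = {name: count_gaps(seq) for name, seq in toy_scaffolds.items()}
total_gaps = sum(gaps_per_scaffold.values())
print(gaps_per_scaffold, total_gaps)
```

Counting runs rather than individual N characters matters here, because a single gap may be padded with hundreds of Ns to indicate its estimated size.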

 

Papadum’s genome marks the beginning of an era where reference-quality genomes are achievable and affordable for any organism, not just extensively studied model organisms like mice, fruit flies, and humans. The availability of these extraordinarily complete genomes enables scientists to answer many new biological questions that have the potential to help farmers, government agencies, agricultural companies, and developing countries solve a significant part of the food security problem.

 

Read more about the grant, the scientists, and Papadum’s genome on the NIH’s National Human Genome Research Institute website.