Author: admin

Breaking the Mold: New Tech Sheds Light on 5 Mysteries of the Fungal World

 

This month Phase Genomics is celebrating #FungusFebruary by highlighting some of the unique capabilities of our Hi-C technology to solve age-old mysteries in the world of fungal genetics and deliver new potential for researchers to understand fungi, all while helping solve global crop crises and develop new groundbreaking pharmaceuticals.

While we wield the power of genomics to explore the wonders of fungi today, a few centuries ago people dismissed them as just weird plants. Eventually microscopes and anatomical studies revealed fungi as a distinct flavor of life — some varieties quite tasty — but educational experts today continue to bemoan the lack of lessons on fungi in biology curricula, and research on fungi — even those that cause disease — lags.

As a result, scientists lack much basic information on the genetics, life cycles, and reproductive habits of many fungi — even though members of this kingdom could help address a bevy of challenges in food and energy production, illuminate the evolution of complex life and even shelter us on Mars.

Genome studies on fungi of all stripes can resolve evolutionary relationships and ecosystem dynamics, identify metabolites of commercial and medical interest and — for fungi that cause disease — reveal biochemical and genetic targets to help us fight pathogenicity.

Like the genomes of their animal and plant cousins, fungal genomes have their challenging parts, including repeats, duplications and structural elements that complicate both sequencing and assembly. Recently, the chromosome conformation capture method “Hi-C” and advances in next-generation sequencing have helped untangle some of these sticky genomic knots, and they show promise in taming genomes across this diverse and neglected kingdom of life.

 

        1. High-resolution mapping of centromeres

 Hi-C’s power lies in its ability to identify regions of the genome that reside in close proximity to one another in the nucleus — information that essentially captures the 3D organization of the genome. But Hi-C doesn’t just identify where particular chromosomes reside within the nucleus. It can also help identify functional elements in genomes that are difficult to identify in other ways.

That is what two groups of researchers (from the Pasteur Institute and the University of Washington) did when they used Hi-C to track down two kinds of functional elements in yeast genomes: centromeres and rDNA clusters. Both are typically repeat-rich and difficult to identify without laborious functional assays or mapping of the binding sites of rare centromere proteins. In fungal species, centromeres are held tightly together at the spindle pole body, and the teams used this shared proximity to identify centromere locations in the genomes of numerous yeasts (and subsequently other fungi) without knowing their centromeric DNA sequences. Ribosomal DNA clusters similarly congregate in yeast nuclei, which one team exploited to identify their positions in Debaryomyces hansenii.
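The centromere-clustering idea can be illustrated with a toy sketch. It assumes a Hi-C contact map has already been binned into a square matrix, and it uses a deliberately simple rule: on each chromosome, pick the bin with the most contacts to other chromosomes. The function name, the bin layout and the toy numbers are all hypothetical, and this is not the method the published groups used.

```python
from typing import Dict, List


def candidate_centromere_bins(contacts: List[List[float]],
                              chrom_of_bin: List[str]) -> Dict[str, int]:
    """For each chromosome, return the index of the bin with the most
    inter-chromosomal (trans) contacts. Illustrative heuristic only."""
    n = len(contacts)
    best_bin: Dict[str, int] = {}
    best_score: Dict[str, float] = {}
    for i in range(n):
        chrom = chrom_of_bin[i]
        # Sum contacts between bin i and every bin on a different chromosome.
        trans = sum(contacts[i][j] for j in range(n) if chrom_of_bin[j] != chrom)
        if chrom not in best_bin or trans > best_score[chrom]:
            best_bin[chrom] = i
            best_score[chrom] = trans
    return best_bin


# Toy symmetric contact matrix: two hypothetical chromosomes, three bins each.
# Bins 1 and 4 are given a strong mutual contact (8) to mimic clustered centromeres.
chrom_of_bin = ["chrA"] * 3 + ["chrB"] * 3
contacts = [
    [0, 9, 4, 1, 2, 1],
    [9, 0, 9, 2, 8, 2],
    [4, 9, 0, 1, 2, 1],
    [1, 2, 1, 0, 9, 4],
    [2, 8, 2, 9, 0, 9],
    [1, 2, 1, 4, 9, 0],
]
result = candidate_centromere_bins(contacts, chrom_of_bin)
print(result)
```

Real analyses would need normalization and a statistical model of contact decay; the sketch only shows why a shared physical location leaves a detectable signal in the contact map.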

 

        2. High-quality genomes illuminate biochemical pathways

Fungi harbor a wide array of genes for synthesizing secondary metabolites, which range from harmful toxins to helpful pharmaceuticals. In fungi, genes for synthesizing secondary metabolites tend to occur in clusters, which are also thought to be sites of rapid evolution.

Phase Genomics worked with a University of Minnesota-led team and used Hi-C to generate high-quality genomes of six strains of Tolypocladium inflatum, an insect pathogen that has already given us the immunosuppressant drug cyclosporin. The new assemblies revealed major differences in secondary metabolite production between T. inflatum strains, including novel clusters, transpositions and clusters that may be involved in toxin synthesis. The bevy of discoveries from these assemblies showed how recombination can drive significant divergence even within a single species — and how important it is to build multiple high-quality genome assemblies that can capture that diversity.

 

        3. Fungal dikaryons and the hidden nuclear dance

The genetic differences between strains also matter in pathogenic fungi such as stem rust, which parasitizes wheat. Phase Genomics partnered with a team led by scientists at CSIRO in Australia to apply Hi-C to stem rust, including the particularly deadly strain Ug99. As in many fungi, the stem rust genome is divided between two haploid nuclei. The team used Hi-C data to assemble complete haplotypes for both haploid genomes of each strain studied, and discovered that Ug99, a recent arrival that is decimating whole fields of wheat in Africa, has an unexpected origin: the strain arose through “somatic hybridization,” in which hyphae from two strains exchange haploid nuclei. This may explain the strain’s sudden rise and deadly wake, and gives scientists new genomic information to understand Ug99’s virulence and identify weaknesses that could give wheat a leg up.

 

        4. Hybrids, beer, and fungal metagenomics

The ability to separate two nuclei from within the same cell can be extended to more complex samples. Yeasts, which are integral players in brewing, will often hybridize to form new species containing genomes from two organisms at once (the famous lager-producing yeast Saccharomyces carlsbergensis is one example of such a hybrid). But in a mixed microbial community, such as beer, wine, or a microbiome sample, how can DNA sequencing detect which genomes co-exist within the same cell? One special power of Hi-C is that it traps sequences that are within touching distance of each other, and therefore must come from inside the same cell. The Dunham lab at the University of Washington used this property to analyze an open-fermentation beer from a local brewery. The exciting result was the discovery of a new hybrid yeast, later named Pichia apotheca, which Hi-C data identified as a hybrid bearing two genomes from related organisms. This new hybrid species has since been used by home-brewers to ply their craft and gives beer a unique flavor.
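As a rough illustration of how "same cell" evidence can be used, the sketch below groups assembled sequences (contigs) that share enough Hi-C links. The contig names, link counts and the `min_links` threshold are hypothetical, and the simple union-find grouping is a stand-in for illustration, not the Dunham lab's analysis or any Phase Genomics pipeline.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def cluster_contigs(links: Dict[Tuple[str, str], int],
                    min_links: int = 5) -> List[Set[str]]:
    """Group contigs connected by at least `min_links` Hi-C links,
    on the assumption that linked sequences shared a cell."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for (a, b), count in links.items():
        root_a, root_b = find(a), find(b)  # registers both contigs
        if count >= min_links and root_a != root_b:
            parent[root_a] = root_b

    groups: Dict[str, Set[str]] = defaultdict(set)
    for contig in list(parent):
        groups[find(contig)].add(contig)
    return sorted(groups.values(), key=lambda g: sorted(g))


# Hypothetical link counts from a mixed fermentation sample.
links = {
    ("hybrid_subgenome_1", "hybrid_subgenome_2"): 40,  # two genomes, one cell
    ("hybrid_subgenome_2", "other_yeast"): 1,          # background noise
    ("other_yeast", "other_yeast_plasmid"): 12,
}
clusters = cluster_contigs(links)
print(clusters)
```

In this toy input the two subgenomes of the hybrid end up in one group because they are heavily linked, while the single spurious link to the unrelated yeast falls below the threshold.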

 

        5. The Epigenetics of Symbiosis

Nature has plenty of examples of plants and fungi getting along. One of them is Epichloë festucae, a filamentous fungus that has evolved a symbiotic relationship with certain grass species. When Phase Genomics worked with a Massey University-led team, they discovered that E. festucae’s genome carries hallmarks of this symbiosis. The analysis of Hi-C data revealed that important genes are clustered into blocks separated by repeat-rich regions. Hi-C and RNA-seq data together showed that genes within the blocks have similar expression patterns — indicating that genes needed for symbiosis with their grass hosts tend to cluster together in the same blocks.

 

Looking Forward

Cutting-edge genomic technologies like Hi-C have the potential to keep making up for lost time and reveal even more intimate details of the hidden lives of fungi. This #FungusFebruary, it’s worth asking: What other mysteries about this long-overlooked kingdom are worth solving?

2019: A Year in Review with Phase Genomics

 

From scaling up the ProxiMeta Platform to publishing numerous Hi-C papers, Phase Genomics has had a year filled with new products, new discoveries, and new applications. Proximity ligation technology is continuing to fuel genomic and metagenomic research. Here is a brief recap of newsworthy items from 2019.  

 

Metagenomics Publications

MAY 30, 2019

The ISME Journal

Linking the Resistome and Plasmidome to the Microbiome »  

 

AUGUST 2, 2019

Genome Biology

Assignment of virus and antimicrobial resistance genes to microbial hosts in a complex microbial community by combined long-read assembly and proximity ligation »  

 

SEPTEMBER 7, 2019

Frontiers in Microbiology

Degradation of recalcitrant polyurethane and xenobiotic additives by a selected landfill microbial community and its biodegradative potential revealed by proximity ligation-based metagenomic analysis »  

 

Genome Assembly Papers

FEBRUARY 7, 2019

BMC Genomics

Chromosome rearrangements shape the diversification of secondary metabolism in the cyclosporin producing fungus Tolypocladium inflatum »  

 

FEBRUARY 13, 2019

Journal of the American Society of Nephrology

Integrated Functional Genomic Analysis Enables Annotation of Kidney Genome-Wide Association Study Loci »  

 

MARCH 18, 2019

BioRxiv

Exceptional subgenome stability and functional divergence in allotetraploid teff, the primary cereal crop in Ethiopia »  

 

APRIL 15, 2019

BMC Genomics

A hybrid de novo genome assembly of the honeybee, Apis mellifera, with chromosome-length scaffolds »  

 

JUNE 3, 2019

GigaScience

A chromosome-scale assembly of the major African malaria vector Anopheles funestus »  

 

SEPTEMBER 16, 2019

BioRxiv

A de novo chromosome-level genome assembly of Coregonus sp. “Balchen”: one representative of the Swiss Alpine whitefish radiation »  

 

SEPTEMBER 16, 2019

BioRxiv

Chromosome-scale de novo assembly and phasing of a Chinese indigenous pig genome »  

 

OCTOBER 1, 2019

Communications Biology

Divergent evolutionary trajectories following speciation in two ectoparasitic honey bee mites »  

 

OCTOBER 1, 2019

KeyGene

KeyGene delivers first-ever fully phased red raspberry genome »  

 

NOVEMBER 8, 2019

Nature Communications

Emergence of the Ug99 lineage of the wheat stem rust pathogen through somatic hybridisation »  

 

NOVEMBER 25, 2019

New Zealand Journal of Crop and Horticultural Science

A whole genome assembly of Leptospermum scoparium (Myrtaceae) for mānuka research »  

 

DECEMBER 12, 2019

BioRxiv Preprint

Assembly of a young vertebrate Y chromosome reveals convergent signatures of sex chromosome evolution »  

 

DECEMBER 20, 2019

BioRxiv Preprint

Identifying the causes and consequences of assembly gaps using a multiplatform genome assembly of a bird-of-paradise »  

 

Phase Genomics in the News

JANUARY 22, 2019

Market Insider 

Medicinal Genomics Releases Industry’s First Comprehensive Cannabis Reference Genome »  

 

JANUARY 23, 2019

Podcast: How to Live to 200

Dr. Ivan Liachko on Cat Poop and the Secrets of the Microbiome »  

 

FEBRUARY 25, 2019

Genome Web

Phase Genomics wins $200K grant to develop microbiome discovery technology »  

 

MARCH 6, 2019

Podcast: How On Earth

Dr. Ivan Liachko on Tagging the Bugs that Carry Antibiotic Resistance »  

 

APRIL 5, 2019

ZME Science

Saving the Honeybee: Can New Genomic Clues Help Solve the Colony Collapse Mystery? »  

 

APRIL 25, 2019

Genes to Genomes

Chromosome-scale genome assembly gives African mosquito and malaria vector fewer places to hide its secrets »  

 

MAY 9, 2019

The Genetic Literacy Project

Dissecting the cannabis genome in the quest for a better bud and effective medicines »  

 

JUNE 20, 2019

New Product 

Phase Genomics Accelerates Microbiome Discovery and Antibiotic Resistance tracking with New ProxiMeta8-Pack Including Bundled Analysis »  

 

AUGUST 6, 2019

New Application 

Phase Genomics’ Platform Links Viruses and Antibiotic Resistance Genes to the Microbiome in Study of a Highly Complex Microbial Community »  

 

SEPTEMBER 23, 2019

Xconomy

An Entrepreneur’s Quest to Make Seattle a Genome Sciences Hub »  

 

OCTOBER 22, 2019

Biodiesel Magazine

Phase Genomics awarded DOE grant for algae biofuel research »  

 

Blog Posts

JANUARY 13, 2019

The Era of Platinum Genomes Has Arrived »  

 

APRIL 5, 2019

Q&A with Co-Authors About Bees, Mites, and Their Genomes »  

 

APRIL 25, 2019

Q&A with Co-Author Dr. Nora Besansky about Malaria, Mosquitos, Insecticides and Adaptations! »  

 

MAY 8, 2019

The Highest-Quality Genomes: Q&A on Cannabis Genomics »  

 

JUNE 3, 2019

Project ProxiMeta: 2019 Metagenomics Award »  

 

JUNE 19, 2019

Dr. Ivan Liachko, LinkedIn Article: No longer beyond the horizon: Capturing the flow of antibiotic resistance genes through the microbiome with next-gen proximity ligation sequencing »  

 

AUGUST 7, 2019

Dr. Ivan Liachko, LinkedIn Article: A more complete microbiome: Using proximity-ligation next generation sequencing to uncover host-viral links in complex microbial ecosystems »  

 

SEPTEMBER 3, 2019

Choose This Year’s Metagenomics Award Winner »  

 

OCTOBER 28, 2019

Phase Genomics Transformative Genome Phasing Tool (FALCON-Phase) Now Compatible with Nanopore Sequencing »  

 

OCTOBER 29, 2019

Dr. Ivan Liachko, LinkedIn Article: Help Build Genome Startup Day 2020 »  

 

Phase Genomics Events and Awards

SEPTEMBER 23, 2019

Genome Startup Day

Phase Genomics hosted the inaugural Genome Startup Day this year, with an attendance of over two hundred scientists, students, and investors.

 

SEPTEMBER 11, 2019

Project ProxiMeta Winner

Dr. Ben Tully: The Complete Hydrothermal Microbial Metal Metabolism »

Phase Genomics Transformative Genome Phasing Tool (FALCON-Phase) Now Compatible with Nanopore Sequencing

Nanopore and Hi-C produce a new fully-phased, chromosome-scale genome for the red raspberry.

On October 22, scientists at KeyGene revealed the first fully-phased, chromosome-scale reference genome for the red raspberry, sequenced with Oxford Nanopore long-read technology and scaffolded and phased into full chromosomes using Phase Genomics’ Proximo™ Hi-C method.  

Assembling complex plant genomes used to be considered nearly impossible, as they can be extremely large, polyploid, and contain highly repetitive regions. Long-read sequencing generates genomic data spanning very long regions, but the resulting sequences still need to be scaffolded, or “put together,” into chromosomes. Proximo Hi-C not only helps guide the assembly to produce chromosome-level scaffolds but can also tell which sequences and mutations come from the maternal and paternal chromosome copies (this is called phasing). Our phasing method, FALCON-Phase, was originally released in 2018 and was used in conjunction with the Proximo pipeline to generate this “platinum level” raspberry genome.
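The phasing idea can be sketched in a few lines. The toy below assumes each locus has two alternative sequences (haplotigs) and that Hi-C links are more frequent between haplotigs on the same parental chromosome copy. The greedy rule, the names and the link counts are hypothetical and serve only as an illustration; this is not the FALCON-Phase algorithm.

```python
from typing import Dict, FrozenSet, List, Tuple


def phase_haplotigs(pairs: List[Tuple[str, str]],
                    links: Dict[FrozenSet[str], int]) -> Tuple[List[str], List[str]]:
    """Greedily assign one haplotig per locus to each of two phases so that
    Hi-C links within a phase are maximized at every step."""

    def n_links(x: str, y: str) -> int:
        return links.get(frozenset((x, y)), 0)

    # The first locus fixes the arbitrary labeling of the two phases.
    phase0, phase1 = [pairs[0][0]], [pairs[0][1]]
    for a, b in pairs[1:]:
        keep = sum(n_links(a, p) for p in phase0) + sum(n_links(b, p) for p in phase1)
        swap = sum(n_links(b, p) for p in phase0) + sum(n_links(a, p) for p in phase1)
        if swap > keep:
            a, b = b, a
        phase0.append(a)
        phase1.append(b)
    return phase0, phase1


# Hypothetical haplotig pairs at three loci, with made-up link counts.
pairs = [("L1_a", "L1_b"), ("L2_a", "L2_b"), ("L3_a", "L3_b")]
links = {
    frozenset(("L1_a", "L2_b")): 20,
    frozenset(("L1_b", "L2_a")): 18,
    frozenset(("L1_a", "L2_a")): 2,
    frozenset(("L2_b", "L3_b")): 15,
    frozenset(("L2_a", "L3_a")): 14,
    frozenset(("L1_a", "L3_b")): 6,
}
phase0, phase1 = phase_haplotigs(pairs, links)
print(phase0, phase1)
```

A greedy pass like this can lock in an early mistake; it is shown only because it makes the underlying signal (more links within a chromosome copy than between copies) easy to see.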

Read more about the assembly and future directions for the project here.

Choose This Year’s Metagenomics Award Winner

Congratulations to Dr. Ben Tully on winning this year’s Project ProxiMeta: 2019 Metagenomics Award! Read more about his project, “The Complete Hydrothermal Microbial Metal Metabolism” (finalist 4 below).

This summer, researchers from across the U.S. sent in short proposals for a chance to win a full-service ProxiMeta™ microbiome workup for a sample of their choice. ProxiMeta combines shotgun metagenomics with in vivo proximity ligation (Hi-C) and necessary bioinformatic tools to help researchers assemble high-quality microbial genomes directly from complex microbiome samples.

 

 

HOW TO VOTE

Each project was assessed by a panel of scientists for scientific merit, novelty, impact, and feasibility, and four finalists were selected. Cast your vote on Twitter for your favorite project.

 


 

THE FINALISTS

1. The Gut Microbiome as a Risk Factor for Arsenic-Induced Cancer

Twitter Name: Gut & As-Induced Cancer

It is estimated that ~200 million people worldwide are exposed to arsenic concentrations exceeding current safety standards. Our collaborators have recently demonstrated that mouse and human microbiomes can protect mice from arsenic toxicity. While human stool supplementation fully restores protection against arsenic in germ-free mice, researchers were only able to isolate one microbe, Faecalibacterium prausnitzii, that successfully conferred protection to both parent and infant mice. These results are significant because arsenic poses the highest lifetime risk for developing cancer in humans. We will investigate the role of arsenic-transforming bacteria within the gastrointestinal (GI) microbiome as another possible risk factor.

In nature, arsenic-reducing microorganisms are well known for their ability to generate more toxic arsenic products called arsenites, which are typically formed in anaerobic environments like the gut. Past research indicates that ingested arsenic may also be transformed into the toxic product arsenite by gut microbes, thus increasing the risk for the host. On the other hand, arsenite-oxidizing microbes may provide a benefit to the host by lowering arsenite concentrations. The ability of the microbiome to transform arsenic is determined by its genetic composition; therefore, ProxiMeta sequencing technology will allow us to immediately analyze our collaborators’ rodent stool samples for genetic clues regarding this mysterious protection. Our project goals are to expand on this knowledge by (1) characterizing the genetic basis for the protection against arsenic provided by the microbiome and (2) identifying, and then isolating, the bacteria harboring the arsenic-transforming genes involved in protection.

We predict that differences in gut metagenome composition will explain variation in arsenic susceptibility within a population or even at the family level. This project will provide important insight into how gut microbes contribute to cancer and may lead to novel therapies and probiotics that could target the microbiome of arsenic-exposed individuals.


2. Evaluating Antimicrobial Resistance in Backyard Poultry Environments

Twitter: AMR in Backyard Poultry

Approximately 13 million rural, urban, and suburban US residents reported owning backyard poultry (BYP) in 2014, and interest in BYP ownership is nearly four times that amount. BYP ownership has risen recently due to product quality, public health, ethical, and animal welfare concerns about commercial operations. However, BYP ownership and disease treatment are largely under-regulated, unlike commercial poultry production. Lack of regulation raises public health concerns about transmission of antimicrobial resistant (AMR) bacteria, such as AMR strains of Salmonella, Mycoplasma gallisepticum, and Escherichia coli commonly associated with BYP. BYP owners (2014 survey) were largely uninformed about poultry diseases and treatments but were interested in learning more about disease management.

The combination of a lack of regulation and public information warrants further research into the bacterial communities of BYP and their environments. Cloacal and environmental swabs were collected as part of a 2018 citizen science study where BYP owners reported current and historical poultry antibiotic usage. We propose to conduct shotgun metagenomic sequencing and proximity ligation using the ProxiMeta platform, allowing for increased detection of full-length AMR gene alleles compared to that revealed by short-read sequencing. The combination of PacBio reads with Hi-C intercontig ligation analysis allows for identification of potential gene transfer events of AMR genes within communities and potential dissemination throughout the environment.

This analysis is especially important considering the public health concerns of AMR persistence in backyard environments. Additionally, investigation of lytic and prophage presence would allow investigation of phage-mediated bacterial regulation that would not be possible with short-read sequencing alone. ProxiMeta analysis of these samples would provide the most comprehensive insight into AMR presence and persistence in BYP environments to date. These findings will be critical for new regulation and disease management for the increasing number of BYP flocks, which currently pose a potential health risk.


3. Unraveling the Metagenomics of Contamination

Twitter: Steel Site Contamination

We propose a metagenome characterization of contaminated Munger Landing sediment located in the St. Louis River, Duluth, MN, USA. Seasonal samples have already been collected and stored, one of which will be sequenced. Munger Landing is located downstream from the U.S. Steel Superfund site, and contaminants include PAHs, dioxins, PCBs, and heavy metals.

Soil condition is integral to high productivity and ecosystem balance at all trophic levels. Human activities erode soil condition through agriculture, mining, sewage outflows and/or chemical/waste disposal into waterways. These practices alter the chemical structure of the soil and break down the microbial community processes responsible for ensuring the balance of biogeochemical cycling patterns in the soil. We hypothesize that the activity of the pathways involved in cycling of nitrogen, phosphorus and carbon is altered in contaminated soil systems.

Metagenomic profiling of Munger Landing will provide data to examine microbes, metabolic pathways, and contaminant-processing genes present in the community that can be characterized further using qRT-PCR. This project will be presented within a community college microbiology course module. Curriculum utilizing real-world data and the sequencing technology from Phase Genomics will teach students experimental design, troubleshooting, hypothesis testing, data analysis and how to communicate the broader impacts of a study to society, the field of environmental microbiology or conservation.

In the future, this data will assist in designing a longitudinal metagenomic and metatranscriptomic study to assess the ability of remediation to ‘recover’ bacterial community function at the Munger Landing site, slated to start in 2020-2021, as compared to two uncontaminated control sites. Ten sites in the St. Louis River Estuary, slated for remediation, have been identified as having high chemical and heavy metal contamination. The Munger Landing project will establish a workflow that can be applied to other contaminated sites.


4. The Complete Hydrothermal Microbial Metal Metabolism

Twitter: Hydrothermal Microbiome

Hydrothermal vents replenish the oceans with much-needed micronutrients, spewing iron, magnesium, nickel, and other metals from the earth’s crust. These metal micronutrients are used as biological cofactors for organisms throughout the marine food chain. Boiling, sterile hydrothermal fluids quickly cool and are colonized by highly specialized microorganisms that begin to cycle the metal species mixing with the seawater. Though regularly sampled, rarely have hydrothermal plumes been tracked through the water column to establish how microbial colonization occurs through time and space. We lack understanding of the replicability of colonization and of the extent to which stochastic processes shape microbial community structure.

While on station at the East Pacific Rise hydrothermal vent field, size-fractionated samples (0.2, 3.0 and 5.0 μm) were collected in the hydrothermal plume emanating from Bio Vent. Sample fluids were collected from the source through the first 1 km of dispersal (the key distance for colonization), and this effort was repeated over the course of 10 days to determine the replicability of natural colonization events. The application of standard metagenomic sequencing and microbial genome reconstruction through binning would provide novel insight into the cycling of metals within the plume, but the use of cross-linked DNA techniques would deliver an unprecedented understanding of how strain diversity impacts colonization and how microbes interact with extrachromosomal elements in the environment.

While some microbes are poised to take advantage of reduced metal species for lithotrophic growth, microbes from the water column that become entrained in the plume will need metal-resistance adaptations to alleviate stress from the elevated metal concentrations present. Metal-resistance genes dispersed through the viral and plasmid pools are essential elements for understanding the functioning of the microbial community in this globally important source of metals to the oceans and effective interpretation of the community can only be achieved through cross-linked DNA metagenomic techniques.

*All finalists’ projects are owned by verified researchers at U.S. academic institutions.


 

RESOURCES

 

Project ProxiMeta: 2019 Metagenomics Award

Win a Free Proximity-Ligation Metagenomics Project

Win a chance to collaborate with Phase Genomics on a metagenomics research project. The grand prize winner will receive a full-service ProxiMeta Metagenome Deconvolution project, including proximity-ligation and shotgun library prep, sequencing, and analysis. Characterize a microbial community of your choice and assemble hundreds of bacterial and eukaryotic genomes, associate plasmids and phage with hosts, and discover novel microbial life.

Submit your proposal by August 8, 2019. The four project finalists will be announced on September 5, 2019 via Twitter based on scientific merit, novelty, and impact. After a week of public voting, the project with the most votes will be named the 2019 Metagenomics Award winner and will receive a full ProxiMeta service project.

With ProxiMeta, you can explore the microbiome with confidence. Only high-quality microbial genomes can provide true insights into the dark matter of the microbiome. Submit your proposal for the 2019 Metagenomics Award today!


KEY DATES

8 August                                      Deadline for Entries

4 September                               Finalists Announced

5-12 September                          Vote for Projects @PhaseGenomics Twitter

12 September                             2019 Metagenomics Award Announcement

 


Help Us Choose the Winner!

We need your help choosing which project to sequence! Below are our four finalists, read through the project proposals and choose your favorite; voting is open to the public and will take place on Twitter September 5, 2019 for one week.


1. The Gut Microbiome as a Risk Factor for Arsenic-Induced Cancer

It is estimated that ~200 million people worldwide are exposed to arsenic concentrations exceeding current safety standards. Our collaborators have recently demonstrated that mice and human microbiomes can protect mice from arsenic toxicity. While human stool supplementation fully restores protection to arsenic in germ-free mice, researchers were only able to isolate one microbe, Faecalibacterium prausnitzii, that successfully conferred protection to both parent and infant mice. These results are huge because arsenic poses the highest lifetime risk for developing cancer in humans.We will investigate the role of arsenic-transforming bacteria within the gastrointestinal (GI) microbiome as another possible risk factor.

In nature, arsenic-reducing microorganisms are well known for their ability to generate more toxic arsenic products called arsenites, which are typically formed in anaerobic environments like the gut. Past research indicates that ingested arsenic may also be transformed into the toxic product arsenite by gut microbes thus increasing the risk for the host. On the other hand, arsenite-oxidizing microbes may also provide a benefit to the host by lowering arsenite concentrations. The ability of the microbiome to transform arsenic is determined by its genetic composition, therefore ProxiMeta sequencing technology will allow us to immediately analyze our collaborators rodent stool samples for genetic clues regarding this mysterious protection. Our project goals are to expand on this knowledge by: (1) characterizing the genetic basis for protection to arsenic provided by the microbiome (2) identifying, and then isolating, the bacteria-harboring arsenic transforming genes involved in protection.

We predict that differences in the gut metagenome composition will explain the incidences in arsenic susceptibility within a population or even at the family level. This project will provide important insight regarding how gut microbes contribute to cancer and may lead to novel therapies and probiotics that could target the microbiome of arsenic-exposed individuals.


2. Evaluating antimicrobial resistance in backyard poultry environments

Approximately 13 million rural, urban, and suburban US residents reported owning backyard poultry (BYP) in 2014, and interest in BYP ownership is nearly four times that amount. BYP ownership has risen recently due to product quality, public health, ethical, and animal welfare concerns of commercial operations. However, BYP ownership and disease treatment is largely under-regulated, unlike commercial poultry production. Lack of regulation poses public health concerns of transmission of antimicrobial resistant (AMR) bacteria, such as AMR strains of Salmonella, Mycoplasma gallisepticum, and Escherichia coli commonly associated with BYP. BYP owners (2014 survey) were largely uninformed about poultry diseases and treatments but were interested in learning more on disease management.

The combination of a lack of regulation and public information warrants further research into the bacterial communities of BYP and their environments. Cloacal and environmental swabs were collected as part of a 2018 citizen science study where BYP owners reported current and historical poultry antibiotic usage. We propose to conduct shotgun metagenomic sequencing and proximity ligation using the ProxiMeta platform, allowing for increased detection of full-length AMR gene alleles compared to that revealed by short-read sequencing. The combination of PacBio reads with HiC intercontig ligation analysis allows for identification of potential gene transfer events of AMR genes within communities and potential dissemination throughout the environment.

This analysis is especially important considering the public health concerns of AMR persistence in backyard environments. Additionally, investigation of lytic and prophage presence would allow investigation of phage-mediated bacterial regulation that would not be possible with short-read sequencing alone. ProxiMeta analysis of these samples would provide the most comprehensive insight of AMR presences and persistence in BYP environments to date. These findings will be critical for new regulation and disease management for the increasing number of BYP flocks, which currently pose a potential health risk.


3. Unraveling the metagenomics of contamination

We propose a metagenome characterization of contaminated Munger Landing sediment located in the St. Louis River, Duluth, MN USA. Seasonal samples are already collected and stored; of which one will be sequenced. Munger landing, is located downstream from the U.S. Steel Superfund site and contaminants include PAHs, dioxins, PCBs, and heavy metals.

Soil condition is integral to high productivity and ecosystem balance at all trophic levels. Human activities erode soil condition through agriculture, mining, sewage outflows and/or chemical/waste disposal into waterways. These practices alter the chemical structure of the soil and break down the microbial community processes responsible for ensuring the balance of biogeochemical cycling patterns in the soil. We hypothesize the activity of these pathways involved in cycling of nitrogen, phosphorus and carbon are altered in contaminated soil systems.

Metagenomic profiling of Munger Landing will provide data to examine microbes, metabolic pathways, and contaminant-processing genes present in the community that can be characterized further using qRTPCR. This project will be presented within a community college microbiology course module. Curriculum utilizing real-world data and the sequencing technology from Phase Genomics will teach students experimental design, troubleshooting, hypothesis testing, data analysis and how to communicate the broader impacts of a study to society, the field of environmental microbiology or conservation.

In the future, this data will assist in designing a longitudinal metagenomic and metatranscriptomic study to assess the ability of remediation to ‘recover’ bacterial community function at the Munger Landing site; slated to start in 2020-2021 as compared to two uncontaminated control sites. Ten sites, slated for remediation, have been identified as having high chemical and heavy metal contamination for the St. Louis River Estuary. The Munger Landing project will establish a workflow that can be applied to other contaminated sites.


4. The Complete Hydrothermal Microbial Metal Metabolism

Hydrothermal vents replenish the oceans with much-needed micronutrients, spewing iron, magnesium, nickel, and other metals from the earth’s crust. These metal micronutrients are used as biological cofactors by organisms throughout the marine food chain. Boiling, sterile hydrothermal fluids quickly cool and are colonized by highly specialized microorganisms that begin to cycle the metal species mixing with the seawater. Though plumes are regularly sampled, they have rarely been tracked through the water column to establish how microbial colonization occurs through time and space. We lack an understanding of how replicable colonization is and to what extent stochastic processes shape microbial community structure.

While on station at the East Pacific Rise hydrothermal vent field, size-fractionated samples (0.2, 3.0 and 5.0 μm) were collected in the hydrothermal plume emanating from Bio Vent. Sample fluids were collected from the source through the first 1 km of dispersal (the key distance for colonization), and this effort was repeated over the course of 10 days to determine the replicability of natural colonization events. Standard metagenomic sequencing and microbial genome reconstruction through binning would provide novel insight into the cycling of metals within the plume, but cross-linked DNA techniques would deliver an unprecedented understanding of how strain diversity impacts colonization and how microbes interact with extrachromosomal elements in the environment.

While some microbes are poised to take advantage of reduced metal species for lithotrophic growth, microbes from the water column that become entrained in the plume will need metal-resistance adaptations to alleviate stress from the elevated metal concentrations present. Metal-resistance genes dispersed through the viral and plasmid pools are essential elements for understanding the functioning of the microbial community in this globally important source of metals to the oceans and effective interpretation of the community can only be achieved through cross-linked DNA metagenomic techniques.

 

 


 

RESOURCES

The Highest-Quality Genomes: Q&A on Cannabis Genomics

 

Co-author Kevin McKernan of Medicinal Genomics talks more about the past, present, and future of cannabis genomic research. Read more about his newly published cannabis genome assembly project using Proximo Hi-C scaffolding featured in The Genetic Literacy Project.

 

What is the difference between hemp and marijuana? How can we use genomics to answer this question?

 

McKernan: The legal definition of hemp is any Cannabis sativa that has less than 0.3 percent THC acid, or THCA. Historically, hemp has been grown for fiber and the exceptional nutritional content of its seed. THCA expression is genetically controlled at what has historically been referred to as the Bt:Bd allele. Next-generation sequencing technologies are giving us our first glimpse of this complicated locus.

 

Why are you interested in assembling the Cannabis genome? What are you hoping to accomplish?

 

McKernan: A refined genome assembly will enable molecular breeding programs to deploy marker-assisted selection for yield, flowering time, pest resistance and rare cannabinoid expression. It will likely shed light on the heritability of hermaphroditism and apomixis. A clearer picture of the genes involved in cannabinoid and terpenoid expression will enable more intelligent breeding and synthetic biology programs.

 

Which genes are responsible for cannabidiolic acid production and how do these genes vary between the cultivars?

 

McKernan: The Cannabis plant makes 113 different cannabinoids. There are three well-understood cannabinoid synthesis genes. These highly similar genes all compete for a common precursor molecule. Mutations in these genes affect gross cannabinoid expression. A more refined reference may enlighten us to the genetic variants that can more accurately estimate THCA levels to segregate hemp and drug-type seed stocks.

 

What other hidden gems did you find in the Cannabis genome after you finished the assembly?

 

McKernan: The most exciting picture is the 2.1Mb CBCAS (cannabichromenic acid synthase) gene cluster seen in the Jamaican Lion assembly. This has 9 tandem copies of CBCAS, all directionally oriented, that are 99.4-99.9 percent identical and separated by 30-80kb long terminal repeats. This region has been an assembly knot for over seven years and I think the only reason it is visible to us today is due to novel sequencing tools we didn’t have in 2011.

 

Why is the Cannabis genome so difficult to assemble? Are there unique genomic features (i.e. copy number variants, special repeat classes, segmental duplications) that are especially troublesome?

McKernan: Its 1.07Gb genome consists of 10 chromosomes, with 73 percent repeat, 66 percent AT and 0.5-1 percent polymorphic. The genes that contribute to chemotype are under the most selective pressure and have hijacked long terminal repeats to enable gene expansions. We had suspicions of this back in 2011 but could never assemble the region to prove it.

 

Why was it important to obtain chromosomes for your assembly? How did Hi-C help?

 

McKernan: The Pacific Biosciences assembly delivered us an assembly that was an amazing leap forward from the Illumina assemblies, but it is not chromosomal in scale. Hi-C has helped to organize these contigs into chromosomes and it can do this without having to make linkage maps.

 

What did you find to be most useful in working with Phase Genomics?

 

McKernan: Hi-C is very complementary to PacBio sequence data and is the only technology that delivers long-range information without having to make high molecular weight DNA. This is very important in Cannabis as it is difficult to get high molecular weight DNA out of the plant.

 

What would you like other researchers, breeders or regulators to take away from your high-quality genome assembly? How do you think this genome assembly will be utilized in the future?

 

McKernan: We also need dozens of genomes sequenced to the quality level of Jamaican Lion to get a full picture of these complex cannabinoid loci. We need Hi-C libraries to better understand the microbiome of the plant, so we can more intelligently manage pathogenic threats that affect yield. Many endofungal bacteria like Ralstonia are found in metagenomic sequencing studies in Cannabis flowers and can be a risk to consumers and negatively impact plant yield. Ralstonia is also notorious for contaminating many metagenomic studies due to contamination in library construction kits. We suspect Hi-C will play important roles in segregating live versus dead DNA and resolving these contamination problems.

 

What regulatory challenges do you run into when working on Cannabis genomics?

 

McKernan: The biggest issue at the moment is that the movement of tissue, other than sterilized stalk, is currently federally prohibited in the U.S. This makes RNA studies very challenging as RNA isolation has to be performed in the field. Movement of DNA or cross-linked chromatin is legal, so this is a compelling case for the use of Hi-C in the Cannabis field (insert Hi-C pun here). Phase Genomics’ kits were critical, as shipping certain tissues is restricted. U.S. federal funding also remains restricted. We turned to the Dash Distributed Autonomous Organization for funding to rapidly sequence and publish the genome. We applied for funds in May of 2018 and had the first assembly public on August 2. This is a very generous contribution by Dash because any U.S. university that attempts to handle the plant places their federal funding at risk.

 

What genomic evidence suggests that Cannabis has been selectively bred by humans?

 

McKernan: I think the elevated THCA levels witnessed since prohibition — combined with the long terminal repeat-driven expansion of the synthase genes — is the best evidence we have.

 

What is your favorite fact and what is your least favorite misconception about Cannabis?

 

McKernan: My favorite thought experiment regarding the rapid reproduction of Cannabis is that its genome is very likely spreading through space and time more quickly than the human genome, and it evokes much of David Sinclair’s work on Xenohormesis. My least favorite misconception is the false dichotomy of medical versus recreational cannabis consumption. I think this showcases our reactionary health-care mindset as opposed to the preventative mindset we need to strive for. If you disregard recreational use, you are likely going to require more medical use. These compounds have been in our diet for thousands of years. We now know mutations in human endocannabinoid system-related genes are associated with neurological phenotypes and a large class of idiosyncratic diseases are now being recognized as clinical endocannabinoid deficiency (CED). It was incredibly naïve and destructive to remove cannabinoids from the American diet in 1937.

 

What do you think the future holds for the cannabis industry?

 

McKernan: In states that legalize cannabis, there is a 15 percent reduction in alcohol consumption, a 25 percent reduction in opiate overdoses, a 17 percent decrease in Medicare opiate usage and a 25 percent reduction in general pharmaceutical use. There is a 10 percent reduction in suicide and a 72 percent reduction in PTSD nightmares. The benefits to epilepsy have survived FDA scrutiny. This is the most disruptive market force we have seen in healthcare since the internet and next-generation sequencing. We are now just witnessing the alcohol industry take multi-billion dollar positions in the cannabis industry. It is only a matter of time before the pharmaceutical industry begins to hedge their losses as well. I am betting against the endocannabinoid mimetic known as acetaminophen and in favor of the less-toxic phytocannabinoids like cannabidiol.

 

 

About Phase Genomics

Seattle-based Phase Genomics offers research services and kits based on its Hi-C and proximity-ligation technologies, which enable chromosome-scale genome assembly, metagenomic deconvolution, and the analysis of structural genomic variation and genome architecture, including tools for genome scaffolding and phasing. Learn more about Proximo and bring the power of Hi-C into your lab today by purchasing one of our Hi-C kits.

How it Works: Proximo Hi-C Genome Scaffolding

Q&A with Co-Author Dr. Nora Besansky about Malaria, Mosquitoes, Insecticides and Adaptations!

 

New Genome Published for Malaria Vector Mosquito, An. funestus


Plasmodium parasites (the microbes that cause malaria) are right at home in the tropics. After all, tropical regions harbor the two animals that the malaria parasites need to complete their complex lifecycle: female Anopheles mosquitoes and human beings. And in 2017 alone, Plasmodium racked up 219 million cases of malaria, with 435,000 deaths…

Read the full article in Genes to Genomes

By generating a high-quality genome assembly for one of these mosquitoes, researchers will in the future be able to reveal genomic clues as to why An. funestus is such a key vector for malaria. Here is our Q&A with Dr. Nora Besansky, a lead author behind the new genome assembly for the malaria vector mosquito An. funestus.

 

Why is it important to understand mosquito genetics in malaria research?

Dr. Besansky: Malaria parasites are not spread directly between humans, unlike cold or flu viruses. Malaria mosquitoes — the subset of all mosquitoes that spread the disease — are essential for malaria transmission. Mosquitoes must bite humans to spread the disease, but they don’t operate like dirty syringes. Malaria parasites enter the mosquito when it bites an infected human. The parasite then has to complete a long (at least 10-day) and exquisitely complex developmental process inside the mosquito before that mosquito can infect another human through its bite. The propensity of a mosquito to bite humans, and the ability of the malaria parasite to develop successfully inside of the mosquito depends on mosquito and parasite genetics. Mechanistic understanding of mosquito genetics provides novel opportunities for us to control disease transmission by mosquitoes, without harming the environment or other organisms.

 

How does access to health care affect treatment for malaria?

Dr. Besansky: Malaria is actually a curable disease — if humans are infected with malaria parasites that are susceptible to the current drugs, and if the infected humans have access to those drugs and to health care. But malaria is a disease of poverty. Health care is not often accessible or affordable, and malaria parasites are rapidly becoming resistant to anti-parasite drugs. Since these drugs have only a short-term effect in the human body, they are impractical for control of malaria in high-transmission parts of the world, even in the absence of parasite resistance, due to the economic burden and the lack of infrastructure for drug distribution.

 

How does the rise of insecticide use affect the spread of malaria?

Dr. Besansky: In the absence of an effective malaria vaccine, broad-spectrum insecticides against malaria mosquitoes are the mainstay of malaria control. But like malaria parasites, the mosquitoes are also becoming resistant to the insecticides that have been approved for use inside homes and on bed nets, particularly in the face of massively scaled-up insecticide-use campaigns to control malaria. Resistance not only means that mosquitoes can survive exposure to the insecticide; resistance can also be behavioral. For example, mosquitoes that normally enter houses to bite at night, when people are sleeping under bed nets, and rest on indoor walls may change their behavior to daytime biting and outdoor resting — making it much more difficult to specifically target those mosquitoes. Genetic research offers the opportunity to understand in detail aspects of mosquito behavior and physiology that are essential to the mosquito life cycle or to the parasite developmental process inside the mosquito, revealing new and specific ways to intervene and protect human health.

 

Why is it important to have a chromosome-scale genome assembly for Anopheles funestus?

Dr. Besansky: Human malaria is prevalent in many tropical regions across the globe. But Africa suffers disproportionately. About 90 percent of malaria cases and malaria deaths occur in tropical Africa south of the Sahara. This is mainly due to the dominance of two highly efficient mosquito vectors of human malaria that occur throughout that region: Anopheles gambiae and Anopheles funestus.  Owing to its acknowledged importance in malaria transmission, An. gambiae was the first insect, after the fruit fly Drosophila melanogaster, to have its genome fully sequenced and assembled. Sequencing and assembly of other anopheline malaria vectors has lagged, but in 2015, an additional 16 Anopheles reference genomes were made available, among these An. funestus. However, limitations of the sequencing technologies applied at that time meant that these reference genomes were not chromosome-scale assemblies. Just as driving from New York to Los Angeles is facilitated by a road map, genetic research also is more powerful, efficient and accurate if the 260 million puzzle pieces of nucleotides in the nuclear genome are properly ordered and oriented.

 

Does heterozygosity or haplotype diversity affect genome assembly methods for the Anopheles species?

Dr. Besansky: A major international consortium, modeled after the 1,000 human genomes consortium, published its findings based on 765 An. gambiae mosquitoes sampled from natural populations. A major conclusion was that, on average, there is a polymorphic site every other nucleotide in An. gambiae, emphasizing the almost unprecedented heterozygosity of this species. This same consortium has begun work on An. funestus, a species expected to be equally diverse. Such nucleotide diversity is well-known to pose difficulties for chromosome-scale assemblies based on traditional sequencing technologies.

 

What are chromosomal inversions? Do they affect the spread of malaria?

Dr. Besansky: Chromosomal inversions are reversals in gene order that occur when a linear chromosome breaks in two places, and the intervening segment rotates 180 degrees before rejoining the other two pieces. They affect the spread of malaria, indirectly if not directly, because they typically contain hundreds or thousands of genes involved in climatic or local adaptation, allowing their mosquito carriers to fully exploit heterogeneous and otherwise challenging environments. Due to their modification of recombination rates along the chromosome, undetected inversions can mislead genome-wide association studies and other genetic studies. Chromosome-scale assemblies make it possible to localize inversions in the genome.
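
To make the definition above concrete, here is a minimal Python sketch of an inversion acting on a gene order. It assumes a chromosome can be represented as a list of (gene name, strand) pairs; the function name `invert_segment` and the toy gene names are hypothetical and are not taken from any tool mentioned here.

```python
def invert_segment(genes, start, end):
    """Return a copy of `genes` with the slice [start, end) inverted.

    `genes` is a list of (name, strand) tuples, with strand +1 or -1.
    The two breakpoints are `start` and `end`; the segment between them
    is reversed in order, and each gene in it flips strand, which mimics
    the 180-degree rotation described in the text.
    """
    if not 0 <= start <= end <= len(genes):
        raise ValueError("breakpoints must lie within the chromosome")
    flipped = [(name, -strand) for name, strand in reversed(genes[start:end])]
    return genes[:start] + flipped + genes[end:]


# Toy chromosome (illustrative only): four genes, one on the reverse strand.
standard = [("A", 1), ("B", 1), ("C", -1), ("D", 1)]
inverted = invert_segment(standard, 1, 3)
print(inverted)  # [('A', 1), ('C', 1), ('B', -1), ('D', 1)]
```

With a chromosome-scale assembly, comparing gene order between two arrangements in this way is one simple route to spotting where the breakpoints of an inversion fall.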

Q&A with Co-Authors About Bees, Mites, and Their Genomes

Co-authors Dr. Alexander Mikheyev of the Okinawa Institute of Science and Technology and Dr. Jay Evans from the U.S. Department of Agriculture’s Bee Research Laboratory had such great answers that we wanted to share some of them. This research was also featured in ZME Science.

Why is it important and useful to have a high-quality genome for Varroa species? Is there any combined value with the recently published bee genome?

Dr. Mikheyev: Understanding the mechanisms of parasitism requires detailed information about the organization of the genome. Many recent ideas for fighting Varroa rely on molecular tools, which in turn rely on genomic data. Furthermore, a good genome enables us to understand the coevolutionary interactions between mites and the bees. For now, our studies are focused on understanding how the mite has evolved to become a better parasite. However, my lab is also looking at the bee side of the coevolutionary interaction. Having high-quality genomes for both will allow us to identify genomic regions and genes involved in coevolution.

Why did you choose to use Hi-C? Why did you need chromosomes for your genome assembly?

Dr. Evans: From prior genome efforts, we had no information on the physical positions of mite gene features. Now with these in place, we can leverage synteny information from other arthropod genomes and narrow searches for some hard-to-find proteins like olfactory receptors, which often occur in clusters. Generally, the improved genome helps us know what might be unique to Varroa — and therefore a novel clue into their biology and control.

Dr. Mikheyev: One element of this study was to look at patterns of gene duplication, which could indicate diversification of particular gene families. Having a contiguous genome allows us to better localize these duplications and confirm that the different copies are homologous. In the future, when we’ll be looking at signatures of selection, a really powerful approach is to identify genomic regions with reduced genetic diversity. Having adequate chromosomal scaffolding will be essential there.

What genomic clues were found in the two Varroa species that may contribute to parasitism?

Dr. Evans: We found a clear set of genes for the proteins — olfactory receptors and others — that these mites must be using to react to their bee hosts. Hopefully, knowing these proteins will lead to smarter controls and insights into why each species maintains a specific host preference.

Dr. Mikheyev: For us, the most striking finding is this: The evolutionary trajectories of both mites, despite their similarities and close relatedness, were completely dissimilar. At this stage, it is still a bit hard to tell specifically what the selective pressures were and what the mites are adapting to. Curiously, in both species, genes involved in stress tolerance and detoxification were already under selection. This most likely happened before they ever faced miticides and suggests that they may have pre-adapted strategies for dealing with our chemical warfare strategies against them. We hope to tackle this in an upcoming study looking at population-level differences between mites adapted to original and novel hosts.

How do you hope these genomes will be used to help save honey bees?

Dr. Evans: Prior genome drafts had enough gaps that we missed candidate proteins for mite control. These mite genomes will lead to focused efforts to target pathways or traits not found in bees by techniques like small molecules, biological controls, and RNA interference.

Dr. Mikheyev: They can be used to develop new strategies for Varroa control. Also, in upcoming studies looking at how mite populations are adapted to original vs. switched hosts, we hope to identify genes and genomic regions that are specifically important in host switches.

Is there any genomic evidence that the western honeybee could be developing resistance to these pests?

Dr. Evans: Yes. Some bee breeders are targeting these traits, from behaviors to virus resistance. A recent, improved assembly of the honey bee genome — aided in part by Hi-C sequencing — is being used for trait identification and marker-assisted breeding right now.

Dr. Mikheyev: They most definitely are. Intriguingly, wild populations of honey bees seem to evolve tolerance to the mites relatively quickly. In one of my favorite studies, a USDA-monitored population in Louisiana first saw high mortality upon the arrival of Varroa, but a few years later colonies lived even longer than before. There are resistant populations known in the U.S. and in Europe, and resistance is a trait that can be selected. How this adaptation takes place in the bee is really interesting, and something we’ll continue to look into.

Isolating Varroa mites from bees involves a creative use of powdered sugar. How do you think this technique came about?

Dr. Mikheyev: We don’t know. The papers describing this method are pretty prosaic. It seems that in the late 1980s, wheat flour was used to control Varroa by knocking them off the bees — and eventually, someone tried sugar.

Dr. Evans: Since they’re attached to their bee hosts, researchers have used a variety of ‘irritants’ to get mites to fall off. Powdered sugar is safe for the bees and might even be an extra calorie boost. The bees pull sugar from each other and the mites fall off — mostly because of the sugar itself, but also because the grooming bees find them.

What is your favorite weird food that involves honey?

Dr. Mikheyev: It’s not really a food since it is honey, but I love the fact that the giant honey bees of Nepal make psychedelic honey from Rhododendron flowers. The story is worth tracking down for no other reason than the dramatic photos of the men that harvest this honey from sheer cliffs.

Dr. Evans: Honey lemonade. Sorry, I am required by my kids to not say weird things.

The Era of Platinum Genomes Has Arrived

Platinum Genome

 

Phase Genomics is dedicating the rest of this month (January, 2019) to the beginning of “The Era of Platinum Genomes” to celebrate recent advancements in genome assembly; researchers now have the ability to generate chromosome-scale, fully-phased diploid genome assemblies for any species by combining two technologies: long-read sequencing data from PacBio and Phase Genomics’ Hi-C.

 

At the end of this month, we will be giving away a “Platinum Genome Project” which includes a full Hi-C service or kit project to an attendee at the International Plant and Animal Genome Conference 2019 (PAGXXVII). This project includes using Proximo Genome Scaffolding to generate chromosome-scale scaffolds and FALCON-Phase to phase haplotypes across the entire genome. Attendees can enter the raffle by stopping by our booth (#208) throughout the conference, or enter using the form at the bottom of this page. Stay tuned for the winner announcement on January 31st, 2019 by following our twitter account @PhaseGenomics. Offer subject to sweepstakes terms. No purchase necessary.

 

WHAT ARE PLATINUM GENOMES?

 

Much like the music industry ranks albums as gold or platinum, genomes can be classified using the same terminology based on the completeness of the assembly and the quality of phasing (i.e. haplotype resolution). High-quality genomes that have complete chromosomes and haplotype resolution in critical sections of the genome qualify as “gold genomes,” whereas “platinum genomes” are assemblies with full chromosome scaffolds and haplotypes resolved across the entire genome.
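
As a rough illustration of the two tiers described above, the following Python sketch turns the definitions into a simple decision rule. The function name, its arguments, the "unranked" fallback label, and the choice to treat "entire genome" as a phased fraction of 1.0 are all assumptions made for this example, not an official scoring scheme.

```python
def classify_assembly(chromosome_scale, critical_regions_phased, phased_fraction):
    """Label an assembly using the gold/platinum wording from the text.

    chromosome_scale: True if scaffolds represent complete chromosomes.
    critical_regions_phased: True if haplotypes are resolved in the
        sections of the genome that matter most for the study.
    phased_fraction: fraction of the genome with resolved haplotypes (0 to 1).
    """
    if not chromosome_scale:
        return "unranked"  # hypothetical label: neither tier applies
    if phased_fraction >= 1.0:
        return "platinum"  # haplotypes resolved across the entire genome
    if critical_regions_phased:
        return "gold"      # phased only in critical sections
    return "unranked"


print(classify_assembly(True, True, 1.0))   # platinum
print(classify_assembly(True, True, 0.4))   # gold
print(classify_assembly(False, True, 1.0))  # unranked
```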

 

Since the publication of the first human genome assembly, researchers from the 1000 Genomes Project and other groups have created several platinum human genomes to represent different human populations. In fact, one of our latest projects, in collaboration with PacBio, generated the most contiguous, haplotype-resolved human genome to date. However, there are only a few platinum genomes for non-human organisms, as scaffolding and haplotyping entire genomes is very labor-intensive using standard tools. We are excited to offer tools such as Proximo and FALCON-Phase to help usher in the era of straightforward platinum genome assemblies for researchers studying plants and animals.

RESOURCES

Phase Genomics Workshop at PAGXXVII: Add it to your schedule.

Standard Projects Outline 

Phase Genomics Platinum Genome Sweepstakes guidelines

 

A Year in Review with Phase Genomics

 

From releasing the world’s first Hi-C kits for plants and animals to publishing the most contiguous human genome assembly to date, Phase Genomics has had a year filled with new papers, new discoveries, and new applications. Proximity-Guided Assembly is continuing to fuel genomic research and here is a brief recap of newsworthy items in 2018.

 

PAPERS

 

Published Hi-C Genome Assemblies:

Published Metagenomic Projects:

PRODUCT RELEASES

BLOGS AND VIDEOS

 

Uncovering the microbiome: What will you do with metagenomics? March 1st, 2018

 

New Video: From Contigs to Chromosomes March 15th, 2018

 

A sweet new genome for the black raspberry using Proximo™ Hi-C March 28th, 2018

 

Phase Genomics and Pacific Biosciences Co-Developing new Genome Assembly Phasing Software April 19th, 2018

 

Lil BUB Aids in Discovery of New Bacteria August 1st, 2018

 

Hi-C solves the problem of linking plasmids to hosts in microbiome samples August 8th, 2018

 

Earth’s Wine Cellar: Digging into the Microbiome of Vineyards September 6th, 2018

 

Hi-C Technology Links Antimicrobial Resistance Genes to the Microbiome December 4, 2018

 

 

IN THE NEWS

 

NPR, March 6th, 2018
Mysteries of the Moo-crobiome: Could Tweaking Cow Gut Bugs Improve Beef?

 

GeekWire, June 27th, 2018
Phase Genomics wins $1.5M grant to peer inside microorganisms’ DNA

 

GeekWire, August 1st, 2018
Cat celebrity Lil Bub lends poop to Seattle startup, leading to discovery of new kinds of bacteria

 

Market Watch, August 10th, 2018
$500+ Million Human Microbiome Market Scenario, 2018-2022

 

GenomeWeb, September 13th, 2018
Vertebrate Genomes Project Releases First Assemblies; Describes Challenges, Plans

 

Bio-IT World, October 9th, 2018
Pacific Biosciences Releases Highest-Quality, Most Contiguous Individual Human Genome Assembly To Date

 

Genetic Engineering and Biotechnology News (GEN), November 14th, 2018
Precision Medicine Looks beyond DNA Sequences

 

Boise State Radio, December 18th, 2018
University Of Idaho Scientists Put Crosshairs On Antibiotic-Resistant Bacteria

Hi-C Technology Links Antimicrobial Resistance Genes to the Microbiome

 

Antibiotic resistance is a rapidly growing global health threat as bacteria share and spread resistance genes via plasmids and other mobile genetic elements. Several teams of researchers applied a new method to understand which microorganisms house genes for antibiotic resistance within complex microbiome communities.
Read the paper, Linking the Resistome and Plasmidome to the Microbiome.

 

ANTIMICROBIAL RESISTANCE ON THE RISE

 

According to the World Health Organization, antimicrobial resistance (AMR) in microbial pathogens is expected to take 10 million lives by 2050 if there are no new pharmaceutical or technological advancements dedicated to combating this pressing problem. For almost a century, medicine has had a remarkable impact on human life by using antibiotics to treat infections, but this has led to a very concerning overuse problem, stoking an arms race between antibiotics and the pathogens they target. The CDC points out that at least 30% of antibiotic prescriptions are unnecessary. The food and agriculture industry also contributes massively to antibiotic overuse: each year, 130,000 tons of antibiotics are given to livestock raised for food. Both of these problems correlate with the rise of AMR.

 

Though there are naturally occurring antibiotic-resistant bacteria, there are two mechanisms by which bacteria can acquire antimicrobial resistance genes (ARGs) and become resistant: 1) through spontaneous genetic mutations and/or 2) by acquiring genetic material from other microbes via plasmids, viruses, or other means of horizontal gene transfer. Due to the evolutionary pressure exerted on microbes by antibiotic overuse, pathogens resistant to these antibiotics within our body, hospitals, and the environment become reservoirs of transmittable AMR genes that can rapidly spread and accumulate within a single microbe contributing to the emergence of multidrug-resistant microbes commonly known as superbugs.

 

PROXIMITY-LIGATION (HI-C) LINKS ARG AND PLASMIDS TO THEIR HOSTS

 

One of the biggest obstacles faced by scientists when studying AMR is the inability to determine which microbes are carrying and spreading specific ARGs. Because these genes often travel on mobile elements, they can move dynamically between different species and can therefore be found in numerous organisms without one clear parental host. When attempting to sequence the DNA of a mixed microbial sample, all the DNA is purified from all the cells at the same time and the host-plasmid connection is severed, making it nearly impossible to determine where each mobile element came from or if they were shared among several species. In this newly published paper, researchers highlight a novel method for linking ARGs and other mobile genetic elements to their hosts directly from microbiome samples using the latest version of the proximity-ligation (Hi-C) data analysis tool, ProxiMeta Hi-C.

 

Phase Genomics CEO, Dr. Ivan Liachko, describes how our Hi-C platform solves one of microbiologists’ greatest problems pertaining to the linking of plasmids with their hosts.

 

Hi-C utilizes in vivo proximity ligation, which makes it possible to assemble complete genomes down to the strain level directly from mixed-population samples and to physically link plasmids and ARGs to their hosts. This method is particularly useful for researchers studying the “dark matter” of the microbiome because it requires neither culturing nor a priori information about a sample.

 

USING HI-C TO TRACK ARGs IN THE MICROBIOME

 

Lead author Thibault Stalder from the University of Idaho used the ProxiMeta Hi-C kit on a complex wastewater microbiome community, a suspected AMR reservoir, to learn more about which bacteria carry ARGs. After the Hi-C library was sequenced, Phase Genomics used the data to inform contig clustering of hundreds of genomes, most of which are novel, with our cloud-based software, ProxiMeta. Using the genome clusters found by ProxiMeta, the Hi-C linkages of each ARG-, plasmid-, and integron-bearing contig to each genome were measured to determine which species physically hosted the relevant mobile elements.

 

ProxiMeta was able to cluster contigs into more than 1,000 genome clusters and search for over 30 groups of ARGs, plasmids, and integrons, which speed up the adaptive process of newly integrated ARGs (Figure 1, circle plot). For each of these genes, we inferred hosts (Figure 2). Moreover, these organisms generally belonged to families known to host each known gene (marked with an “X” in Figure 2), supporting the accuracy of the analysis. In the future, this information will allow us to track the spread of AMR in complex communities consisting of many diverse organisms.
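
The idea of inferring hosts from Hi-C linkage can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ProxiMeta algorithm: the function `infer_hosts`, the links-per-kilobase normalization, the `min_links` threshold, and all contig names, cluster names and counts are hypothetical.

```python
from collections import defaultdict


def infer_hosts(links, cluster_sizes, min_links=5):
    """Rank candidate host genome clusters for each mobile-element contig.

    links: iterable of (element_contig, genome_cluster, n_hic_links).
    cluster_sizes: dict mapping genome cluster name to its length in bp.
    min_links: assumed noise threshold; weaker links are ignored.

    Returns a dict mapping each element contig to a list of
    (genome_cluster, links_per_kb) pairs, strongest link first.
    """
    hosts = defaultdict(list)
    for element, cluster, n_links in links:
        if n_links < min_links:
            continue  # too few links to distinguish from background
        # Normalize by cluster length so large genomes are not favored
        # just because they offer more sequence to link to.
        density = n_links / (cluster_sizes[cluster] / 1000)
        hosts[element].append((cluster, density))
    return {
        element: sorted(candidates, key=lambda pair: pair[1], reverse=True)
        for element, candidates in hosts.items()
    }


# Made-up example data, for illustration only.
links = [
    ("arg_contig_1", "cluster_A", 40),
    ("arg_contig_1", "cluster_B", 3),
    ("plasmid_contig_7", "cluster_A", 12),
    ("plasmid_contig_7", "cluster_C", 30),
]
cluster_sizes = {"cluster_A": 4_000_000, "cluster_B": 2_000_000, "cluster_C": 3_000_000}

hosts = infer_hosts(links, cluster_sizes)
for element, candidates in hosts.items():
    print(element, [name for name, _ in candidates])
```

In this toy input the plasmid contig keeps two candidate hosts, which mirrors the point made earlier that a mobile element can be shared among several species rather than belonging to one clear parental host.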

 

Microbiome Antibiotic Resistance Genes and Plasmids

Figure 1: Hi-C linkage between ARGs, plasmid markers, and integrons among clusters belonging to Alpha, Beta, Gamma and Delta Proteobacteria.

 

Over 200 genome clusters had strong Hi-C links to ARGs, of which 12 had high-quality assemblies. The resulting genomes include both Gram-positive and Gram-negative bacteria, and most belonged to species that were previously unsequenced. ARGs were mostly linked to genome clusters belonging to the Gammaproteobacteria, Betaproteobacteria and Bacteroidetes (Figure 2, below).

 

Microbiome Antibiotic Resistance Genes AMR and Plasmids

Figure 2: Normalized Hi-C links between ARGs, plasmids, and families of bacteria.

 

 

FUTURE DIRECTIONS

 

This method can be useful for researchers studying not only the microbiome but the virome as well. Phages, the viruses that infect bacteria, also distribute genetic information amongst bacteria to influence host biology, much like plasmids. Several previous studies showed that in vivo proximity-ligation can be used to link phages with their hosts directly from mixed complex samples, much as was done with plasmids and AMR genes in this study. This information could be crucial to labs and companies that are now engineering phages that could replace the widespread use of antibiotics and combat AMR.

 

This year, antibiotic-resistant bugs have infected more than 2 million people globally; 23,000 of those individuals will die because of our inability to fight these superbugs. By using ProxiMeta Hi-C to better understand the genomics of microbial communities suspected to be AMR reservoirs, researchers can identify ARG carriers down to the strain level and quantify how prevalent these genes are. With further exploration, this tool could one day offer a new way to limit the spread of these genes, reverse the trend of increasing antibiotic resistance, and save lives.

 

BRING A HI-C KIT INTO YOUR LAB TODAY

 

Phase Genomics offers a wide variety of proximity-ligation products and services including Hi-C preparation kits and a range of different cloud-based bioinformatic analysis platforms. Power your microbiome research with ProxiMeta Hi-C and our easy Hi-C kits; assemble hundreds of complete genomes for novel, unculturable microbes, and associate plasmids with hosts directly from raw microbiome samples using ProxiMeta Hi-C.

Earth’s Wine Cellar: Digging into the Microbiome of Vineyards

 

Phase Genomics partnered with Browne Family Vineyards to begin to understand the microbiome makeup of soils within different vineyards across the state of Washington. The findings were unveiled at the Pacific Science Center’s “STEM: Science Uncorked” wine-tasting event.

There are many different factors that contribute to soil composition, such as parent material, topography, climate, geological time, and the thousands of different and undiscovered microbes living in the soil—the least understood factor. In April of 2018, Browne Family Vineyards staff visited five of their vineyards, filled a bag with soil from each site, and sent it to Phase Genomics to analyze the microbiome in each of the soil samples.

SYMBIOSIS BETWEEN PLANTS AND MICROBES

Plants rely heavily on their microbiome to live, grow, and protect themselves from pathogens. One example of this symbiotic relationship is that plants release chemicals into the soil in order to attract microbes. These microbes bring nutrients such as nitrogen, iron, potassium, and phosphorus to the plants in exchange for sugar, which the microbes require to survive. Microbes also play an important role in nitrogen fixation, organic decay, and biofilm production to protect the plant roots from drought. It is evident that this symbiotic relationship between microbes and plants is critical to the health and survival of both, but further research into this complex community is inhibited by two main problems: it is impossible to isolate microbes in such a complex mix, and most of the microbes have never been discovered before.

THE DARK MATTER OF THE MICROBIOME

Microbes live in communities where they rely on each other. This makes it difficult to isolate or culture (i.e. grow) microbes without killing them or altering their genetic makeup. Moreover, there can be millions of microbes living in a single teaspoon of soil, making these samples extremely complex environments. As a result, most of the microbial world remains unknown and is sometimes referred to as the “Dark Matter of the Microbiome”.

The most effective way to identify the microbes in the community is to look at the genetic makeup of the microbiome and try to classify the microbial genomes present. Standard practices include sequencing of the 16S rRNA gene (a marker gene containing hypervariable regions) and shotgun sequencing. By combining these standard practices with Hi-C, researchers are now able to fully reconstruct genomes from a mix because Hi-C captures the DNA within each microbe to exploit key genetic features unique to each individual in the community. The Phase Genomics Hi-C kit and software, ProxiMeta™, use this information to capture even novel genomes straight from the sample without culturing, illuminating the dark matter of the microbiome.

THE PROCEDURE

Figure 1: Shotgun Sequencing Procedure and Difficulties

Once the soil samples were collected from the five vineyards, Phase Genomics produced shotgun libraries to obtain DNA from all of the microbes in each sample (Figure 1). In essence, this means taking the soil sample, breaking open all of the microbial cells, then purifying the DNA (1.A). Since DNA is fragile, most of it gets broken into smaller pieces during this process, leaving a mix of many DNA fragments from all of the microbes that were present in the original soil sample. The fragmented DNA is then sequenced and the “sequence reads” are compared against a database of known microbial genomes (1.B). The search reports matches, or “hits,” showing whether the reads are similar to anything in the database (1.C).

A problem with relying on shotgun data is that it’s unclear which DNA fragments belong to which microbe, thus relying heavily on computational techniques and the accuracy of the reference database for classification. This results in little improvement or clarity on the makeup of the sample, again, leaving the microbiome in the dark. Though shotgun sequencing only provides a glimpse into the microbial community, this data allows scientists to differentiate the taxonomy (phyla, genera, species) of the microorganisms living in the soil.

THE RESULTS

Shotgun sequencing identified over 10,000 different species from each of the vineyard soil samples; however, it is impossible to know if this is the true number of species because only ~20% of the reads matched the database, indicating that ~80% were either incomplete or undiscovered (see table below).

Table 1: Vineyard Read Classification
Vineyard     | Total Reads | Percent of Reads Classified | Number of Organisms Found | Percent of Unknown Organisms
Canyon       | 19,001,222  | 15.95%                      | 10,726                    | 73.32%
Canoe Ridge  | 21,214,190  | 17.66%                      | 11,721                    | 55.55%
Waterbrook   | 19,469,954  | 19.6%                       | 10,782                    | 50.58%
Skyfall      | 63,850,810  | 16.17%                      | 15,101                    | 80.08%
Willow Crest | 43,941,026  | 17.13%                      | 13,914                    | 71.84%
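
As a quick check on the “~20%” figure, the numbers in Table 1 can be turned into read counts and an average. The short sketch below only restates the table; the variable and function names are made up for this example.

```python
# Total reads and percent of reads classified, copied from Table 1.
vineyards = {
    "Canyon": (19_001_222, 15.95),
    "Canoe Ridge": (21_214_190, 17.66),
    "Waterbrook": (19_469_954, 19.6),
    "Skyfall": (63_850_810, 16.17),
    "Willow Crest": (43_941_026, 17.13),
}

def classified_reads(total_reads, percent_classified):
    """Approximate number of reads that matched the reference database."""
    return round(total_reads * percent_classified / 100)

# Unweighted mean of the per-vineyard percentages.
mean_percent = sum(pct for _, pct in vineyards.values()) / len(vineyards)
mean_unclassified = 100 - mean_percent

for name, (total, pct) in vineyards.items():
    print(f"{name}: ~{classified_reads(total, pct):,} of {total:,} reads classified")
print(f"Mean classified: {mean_percent:.1f}%, unclassified: {mean_unclassified:.1f}%")
```

Averaged across the five vineyards, roughly 17% of reads were classified and roughly 83% were not, which is in line with the rounded figures quoted above.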

 

Moreover, of the assigned reads, >50% did not match a genus or species, hinting that many of the organisms found are novel. Without digging too deep into the microbiome analysis, it is evident that the microbial makeup is different for each of the samples. The fraction of reads that could be classified varied between vineyards (Table 1), and among the classified reads, the vineyards have 3-4 microbes in common that vary in abundance. These microbes, such as Proteobacteria, Rhizobacteria, and Actinobacteria, are generally very common in soil.

Proteobacteria

There are obvious differences in the biodiversity of the soil samples, both in number of species and relative abundance. For example, the Canoe Ridge and Waterbrook samples were >20% Delftia, while the microbes in the other vineyards were more evenly distributed, with abundances closer to 1-5%. Interestingly, Delftia, a rod-shaped bacterium, has the ability to break down toxic chemicals and to produce gold.

Actinobacteria

There are two main components that influence microbe classification in these samples: the desired taxonomy level, and the statistical threshold, or minimum number of reads, set to define it. Much like zooming in and out, the most “zoomed out” analysis is achieved by a stringent threshold and will reveal phylum, while the most “zoomed in” analysis is achieved by a more lenient threshold and will reveal genus and species.
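
The interplay between taxonomy level and read threshold can be sketched in a few lines. This is a toy illustration with invented read counts, not the classifier used for the vineyard data; `taxa_at_rank`, `RANKS`, and the thresholds are assumptions for the example.

```python
from collections import Counter

RANKS = ("phylum", "genus")

def taxa_at_rank(read_lineages, rank, min_reads):
    """Return taxa at `rank` supported by at least `min_reads` reads.

    read_lineages: one (phylum, genus) tuple per classified read;
    genus is None when a read could only be placed at phylum level.
    """
    idx = RANKS.index(rank)
    counts = Counter(
        lineage[idx] for lineage in read_lineages if lineage[idx] is not None
    )
    return {taxon: n for taxon, n in counts.items() if n >= min_reads}

# Invented toy reads (real samples contain millions).
reads = (
    [("Proteobacteria", "Delftia")] * 4
    + [("Proteobacteria", "Bradyrhizobium")] * 3
    + [("Actinobacteria", "Streptomyces")] * 2
    + [("Actinobacteria", None)] * 3
)

zoomed_out = taxa_at_rank(reads, "phylum", min_reads=5)   # stringent threshold
too_strict = taxa_at_rank(reads, "genus", min_reads=5)    # same threshold, finer rank
zoomed_in = taxa_at_rank(reads, "genus", min_reads=2)     # lenient threshold
```

With the stringent threshold only the two phyla pass, and no genus does. Lowering the threshold brings the three genera into view, at the cost of accepting calls backed by fewer reads.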

If the data are “zoomed in” further, about 37% of the microbes in each community can be identified by genus. On average, 63% of the communities do not match to a genus at all, hinting that these microbes may have never been sequenced. The most abundant microbial genera present in these samples are Bradyrhizobium, Streptomyces, and Nocardioides.

As discussed earlier, these data highlight the issues with shotgun data and the corresponding analysis: there is still far too much that is unknown. In order to better understand these samples, we also performed Hi-C on two of the samples, which is discussed in further detail in the next section.

 

HI-C AND FINDING NOVEL GENOMES

One thing all these soil samples have in common is that they are composed of numerous novel species. To obtain more information on the microbes present in these samples, and solve the issue discussed earlier surrounding shotgun data, Hi-C was performed on two of the soil samples, Skyfall and Willow Crest. Essentially, Hi-C assigns DNA fragments from shotgun sequencing to the correct species by connecting DNA while the cells are still intact.

Hi-C enables clustering of shotgun assemblies and subsequently yields complete genomes from a microbiome, even if the genome has never been sequenced before. With complete microbe genomes, it becomes easier to classify organisms down to the strain-level—a step even further than species. By having the genome, we can essentially read a microorganism’s blueprint and learn more about its genes, evolution, and even function once the genome is annotated.

For example, preliminary data from the Willow Crest soil sample yielded 400 different genome clusters. When compared to known bacterial genomes in the RefSeq database, which aggregates all published microbial genomic data, over half of the extracted genomes are unable to be identified at a genus level and thus likely represent newly discovered bacterial organisms.

SCIENCE UNCORKED

When the microbiome data from the vineyards were presented to the public at the Pacific Science Center, two questions consistently arose: How does this influence wine taste, and how can growers select for a healthy microbiome? These very forward-thinking questions unfortunately cannot be answered—yet.

Scientists do know that soil plays a big role in plant health, and this could in part be due to the plants’ symbiotic relationship with microbes, as discussed earlier. It has also been shown that biodiversity can benefit plants because of the diverse functions individual microbes have: with more microbes, more potential functions are served than when one microbe serves one function. However, nailing down answers to these questions will take a lot of research. With emerging technologies like Hi-C, the answers have become much more obtainable.

Though the term “microbiome” may not be household vocabulary, many of the attendees were well aware of the role that microbes play in human health, and how they influence the world around us. It goes to show that the rapid developments in the microbiome field are reaching beyond just research and becoming more tangible for the general public. Relevant stories, like looking into the microbiome of vineyards, are helping people understand the intricate concept of microbial life.

Learn more about ProxiMeta Hi-C and the microbiome by visiting our website www.phasegenomics.com and connect with us on Twitter by following @PhaseGenomics.

Hi-C solves the problem of linking plasmids to hosts in microbiome samples

Plasmids are hard!

Plasmids are an important part of microbial biology. Plasmid-borne genes can have serious public health consequences by conferring virulence traits or resistance to antibiotic drugs, and can be readily shared among bacterial cells through cell-cell conjugation or other means. In principle, any gene that gives bacterial cells a selective advantage is likely to be shared via plasmids among related cells. For example, so-called “epidemic resistance plasmids” have been instrumental in the rise of multi-drug resistance in pathogenic E. coli and Klebsiella pneumoniae.

However, determining the bacterial hosts of any given plasmid in a sample can be difficult. The classic approach is to isolate host and plasmid together and culture them in the lab. In complex samples with numerous organisms, however, this approach is often impossible: many of the organisms cannot be cultured readily, and culturing may alter the selection pressure on the organisms of interest. Alternatives also have drawbacks. Statistical metagenomic approaches have difficulty with plasmid-host association because plasmids do not necessarily resemble their host genomes in either abundance or nucleotide composition, and single-cell sequencing approaches are expensive and work on a limited range of samples and species.

Hi-C to the rescue

Fortunately, recent developments in genomic technology have yielded some novel tools that allow us to circumvent this limitation. Hi-C is a method that allows us to measure 3-dimensional distances between sequences inside intact cells and was originally developed to model 3D folding of genomes inside cells. These structural measurements include a clear signal about which sequences originated inside the same cell, simply because the cell membrane generally prevents sequences in different cells from coming into contact. Hi-C therefore provides direct physical evidence of DNA sequences originating from the same cell.

Phase Genomics has developed the ProxiMeta™ Hi-C metagenome deconvolution method, which is specifically optimized for metagenomic applications (Figure 1). At Phase Genomics we use ProxiMeta Hi-C to reconstruct whole genomes from a variety of complex samples such as human fecal, wastewater, soil, and co-culture communities (for more information, see our paper about ProxiMeta).

 

Figure 1. Schematic of ProxiMeta Hi-C. (a) Hi-C crosslinking junctions will form only between sequences in the same cell. (b) Proximity-ligation creates chimeric Hi-C junctions between adjacent DNA molecules which can be directly observed by paired-end sequencing. (c) Clustering methods can be used to infer the starting genomes based on the Hi-C junction information. Originally published here.
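
Step (c) in the schematic can be illustrated with a deliberately simple stand-in: treat contigs as nodes, join any two contigs that share enough Hi-C junctions, and read off the connected groups. ProxiMeta's actual clustering is not described here, so the union-find approach, the function name `cluster_contigs`, and the `min_links` cutoff below are assumptions for illustration only.

```python
def cluster_contigs(link_counts, min_links=3):
    """Group contigs into clusters using Hi-C junction counts.

    link_counts: {(contig_a, contig_b): number of Hi-C junctions}.
    Contigs joined by at least `min_links` junctions (a hypothetical
    cutoff) end up in the same cluster.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), n in link_counts.items():
        root_a, root_b = find(a), find(b)
        if n >= min_links and root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for contig in list(parent):
        clusters.setdefault(find(contig), set()).add(contig)
    return sorted(sorted(members) for members in clusters.values())

# Toy junction counts: c3-c4 share a single junction, likely noise.
toy_links = {
    ("c1", "c2"): 10,
    ("c2", "c3"): 4,
    ("c3", "c4"): 1,
    ("c4", "c5"): 8,
}
genome_clusters = cluster_contigs(toy_links)
```

Here the weak c3-c4 junction is ignored, so the five contigs fall into two clusters. Real metagenomes need normalization and more careful clustering than a single cutoff, but the principle is the same: contigs from the same cell are linked far more often than contigs from different cells.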

 

As a necessary part of their life cycle, plasmids need to pass through their bacterial host cells to replicate. Therefore, plasmids typically form Hi-C links to their host genomes simply by virtue of being inside the same cell as their host genome. So, to find the hosts of a given plasmid,  one only needs to find these plasmid-genome links. Our analysis of metagenomic Hi-C data bears this conclusion out repeatedly through multiple publications, as described below.

Hi-C links plasmids and hosts

A pair of early publications showed that, using an early version of the Hi-C method, we could correctly associate several plasmids with their bacterial hosts in an artificial community.

In our more recent paper, we demonstrated that, in a complex human fecal sample, Hi-C links connect previously described plasmids to their known hosts. Excitingly, in a single experiment we can now assemble numerous novel microbial genomes, complete with plasmid content, from a complex sample with hundreds of different species present.

An exciting finding from this complex community is that we can directly visualize how plasmids are shared between bacteria in a community (Figure 2). Recall from above that the sharing and spread of plasmids is a serious problem in the epidemiology of antibiotic resistance and infectious disease. For example, the sequence marked with “*” in Figure 2 shows substantial similarity to a plasmid called pBUN24, in addition to other plasmids with unknown hosts. It is clear that this plasmid shows contacts with a variety of genome clusters corresponding to different organisms, suggesting that all of these organisms can act as hosts for this plasmid.

 

Figure 2. Heatmap representing quantitative Hi-C links between plasmids (columns) and genome clusters (rows) in a human fecal metagenome. For scale see top right key (blue=no contact). Columns where more than one cell shows signal are possible instances of plasmid sharing. All genome cluster rows are near-complete genomes, i.e. they have >90% completeness and <10% redundancy according to CheckM analysis.
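
Reading the heatmap for plasmid sharing amounts to two steps: keep only near-complete genome clusters, then look for plasmids with signal in more than one of them. The sketch below mimics that with invented numbers. The completeness and redundancy values are placeholders standing in for CheckM results, and `min_signal` is a hypothetical cutoff.

```python
def near_complete(quality_stats, min_completeness=90.0, max_redundancy=10.0):
    """Keep clusters with >90% completeness and <10% redundancy."""
    return {
        name
        for name, (completeness, redundancy) in quality_stats.items()
        if completeness > min_completeness and redundancy < max_redundancy
    }

def shared_plasmids(link_matrix, clusters, min_signal=1.0):
    """Return plasmids with Hi-C signal in more than one retained cluster.

    link_matrix: {plasmid: {cluster: normalized Hi-C link value}}.
    """
    shared = {}
    for plasmid, row in link_matrix.items():
        hosts = sorted(
            c for c, value in row.items() if c in clusters and value >= min_signal
        )
        if len(hosts) > 1:
            shared[plasmid] = hosts
    return shared

# Invented example values: (completeness %, redundancy %).
quality = {"g1": (98.2, 1.5), "g2": (93.0, 4.0), "g3": (71.0, 2.0)}
links = {
    "pA": {"g1": 12.0, "g2": 7.5, "g3": 9.0},
    "pB": {"g1": 0.0, "g2": 5.0},
}
good_clusters = near_complete(quality)
sharing = shared_plasmids(links, good_clusters)
```

In this toy case `pA` shows up in two near-complete genomes and is flagged as possibly shared, while `pB` has a single host. A flagged column is a candidate for sharing, not proof of it.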

 

In a more recent collaboration with Mick Watson’s group at the Roslin Institute, we applied ProxiMeta Hi-C to the cow rumen microbiome, a very complex microbial community. In this peer-reviewed paper, we were able to not only discover scores of novel genomes in this community, but also to profile plasmid-genome linkages for these genomes. Thus, Hi-C linkages of plasmids to genomes are robust even to very high complexity of the community.

Looking to the future: Plasmid Biology conference and more.

We have multiple exciting ongoing collaborations using Hi-C to understand the host range and biology of plasmids and other mobile elements; the best is yet to come! To see several examples of our Hi-C technology applied to this problem, you need only read the abstracts for the 2018 Plasmid Biology conference, August 5-10 at the University of Washington in Seattle.

We will be writing more about the uses of ProxiMeta and metagenomic Hi-C on this blog in the future, so stay tuned.

Lil BUB Aids in Discovery of New Bacteria

Published author, talk show host, movie star, musician, and philanthropist: Lil BUB has now also helped to discover novel microbial life living in her gut, in collaboration with AnimalBiome, KittyBiome, and Phase Genomics. Enter to sequence your cat’s microbiome in our #Meowcrobiome Twitter raffle!

 

We live in an era of discovery, especially as it relates to the microbiome and how microbial diversity influences our world, our health, and even our pets’ health. To better understand the microbial life of our feline friends, Lil BUB volunteered to sequence her gut microbiome. Thanks to a recent collaboration with AnimalBiome, KittyBiome, and Phase Genomics, Lil BUB helped discover 22 new microbes living in cats which, in time, could reveal new insights into cat health and happiness.

When KittyBiome started back in 2015 with an intent to understand the cat microbiome,  Lil BUB’s owner Mike “Dude” Bridavsky provided a sample of her poop to be analyzed. Because of Lil BUB and over 1,000 other cats, KittyBiome’s microbial census will help us identify what microbes are associated with healthy cats and work towards helping cats with Inflammatory Bowel Disease (IBD), diabetes and other ailments likely to be associated with the microbiome.

 

USING GENOMICS TO FIND MICROBES

Late last year, Phase Genomics offered to analyze samples from Lil BUB and another cat, Danny (belonging to Jennifer Gardy—a microbiologist at the University of British Columbia and science TV host), using our ProxiMeta™ Hi-C Metagenomic Deconvolution platform to obtain complete microbial genomes from their samples.  This method solves a huge problem in microbiome research—how to tell apart different species when their DNA is all mixed up in one sample (imagine a thousand jigsaw puzzles mixed together).

ProxiMeta Hi-C revealed about two hundred different species of microorganisms living in Lil BUB and Danny’s poop, many of which have never been seen before. The genome sequences of the microorganisms found in these samples were analyzed using our software and other microbiome analysis tools to measure the quality of the different assembled genomes and to see if those genomes matched any known microbes (Lil BUB’s and Danny’s data are available for free on our website). Without using our ProxiMeta Hi-C platform to extract these genomes, many of them would have been undetectable and gone unseen.

Lil BUB and Danny the Cat

Phase Genomics sequenced both Lil BUB (left) and Danny’s (right) poop samples.

 

OVER 20 NEW BACTERIAL GENOMES DISCOVERED

Together, Lil BUB and her buddy Danny carry 22 previously undescribed bacterial species in their guts.  Lil BUB’s poop sample had 13 species and Danny’s sample had 9 species that have never before been fully sequenced or characterized.

These new bacterial species mostly belong to the order Clostridiales, and the team is currently analyzing the genomes to better characterize them. This discovery will help continue to build a database that contains cat bacteria that are new to science, so we can better identify the contributions of the microbiome to various health conditions.

This cool discovery, made with the help of Lil BUB and Danny, highlights that there’s a  universe of undiscovered microbial life out there. If we found 22 potentially novel species in only two cats, just imagine what else is out there, and what the implications might be for new ways to support and improve the health of our pets.

 

WHO ARE OUR HERO CATS?

Lil BUB is a one of a kind critter, made famous on the Internet due to her adorable genetic anomalies. She is a “perma-kitten”, which means she will stay kitten-sized and maintain kitten-like features her entire life. She has an extreme case of dwarfism, which means her limbs are disproportionately small relative to the rest of her body. Her lower jaw is significantly shorter than her upper jaw, and her teeth never grew in so her tongue is always hanging out. Lil BUB is also a polydactyl cat, meaning she has extra toes – 22 toes total!  Lil BUB and Her Dude travel all over the country raising hundreds of thousands of dollars for animals in need.

Danny, an exotic shorthair with a face much like Grumpy Cat, is equally adorable.  He is the companion cat of one of KittyBiome’s original researchers, Jennifer Gardy, and was one of the very first cats to lend his poop profile to the KittyBiome initiative.  He is a very healthy cat and his microbial profile has helped us learn what a balanced gut in cats looks like.

WHAT’S NEXT?

Phase Genomics and AnimalBiome are eager to learn more about these newly-discovered bacterial species. They hope to work with the scientific community to analyze, identify, characterize and publish these genomes, starting with exploring their identities based on 16S rRNA and other marker genes.

HOW TO GET INVOLVED

  • Help characterize the new bacteria: If you know of a researcher, scientist or cat-lover who would like to help us, we are soliciting input on the analysis that needs to be done to properly characterize and publish these genomes. Participants who contribute in a substantive manner to the project will be co-authors on the publication. All data associated with the project will be deposited into publicly available databases and we will publish the findings in open access journals, so all pet lovers can read them. We will hold a raffle to award one lucky contributor a free Hi-C sample kit from Phase Genomics. If interested, contact us at team@animalbiome.com to learn more.
  • Name the new bacteria: We’re looking for input from the community on what we should name these 22 new bacteria, so if you have any fun ideas, please drop us an email at team@animalbiome.com. The format should follow standard practices of scientific nomenclature, so it should be constructed like this: “Clostridium _________.”
  • Submit your pet’s sample for genomic research: If you don’t win the raffle and still want your pet to contribute to scientific knowledge through the identification of new bacterial species, please contact us at team@animalbiome.com. We can provide you with the details and pricing involved for us to identify new species in your cat or dog through in depth analyses like we did for Lil BUB and Danny using the Hi-C approach pioneered by Phase Genomics, which would also result in a publication.

Improving databases of the microbiome of cats (and dogs) with new bacteria like this could help us learn more about how the gut microbiome helps support the digestive health of all pets.

ENTER YOUR CAT IN OUR TWITTER RAFFLE

Phase Genomics, AnimalBiome and KittyBiome are hosting a Twitter raffle where you can enter to sequence your cat’s microbiome! All you have to do is go to either Phase Genomics’ or AnimalBiome’s original tweet of this blog and retweet it with a picture and introduction of your cat and the hashtag #Meowcrobiome. On August 8th, 2018, we will randomly draw one (1) winner whose cat’s poop will be scientifically analyzed by Phase Genomics with ProxiMeta Hi-C to search for novel microbes, and three (3) additional winners who will receive a Kitty Kit to have their cat’s poop analyzed by AnimalBiome to compare their cat’s gut to healthy cat guts.  Send in your cat’s poop, and you too can help discover new microbial life!

LIL BUB AND DANNY’S STORY FEATURED ON GEEKWIRE PODCAST

GeekWire discussed Lil BUB, Danny, and the new bacteria found in their poop in their weekly Week in Geek podcast. Check out the full podcast on their website (the segment begins around 22:58), or play just the segment about Lil BUB and Danny below.

 

 

Phase Genomics and Pacific Biosciences Co-Developing new Genome Assembly Phasing Software

Phase Genomics and Pacific Biosciences logos

“FALCON-Phase” – an algorithm for producing diploid genomes.

 

Phase Genomics has entered into a co-development agreement with Pacific Biosciences to develop FALCON-Phase, a software module that combines Hi-C and PacBio® highly-accurate, long read sequencing data to produce fully-phased diploid genome assemblies. The software will be released later this spring.

FALCON-Phase augments PacBio Single Molecule, Real-Time (SMRT®) assemblies with Hi-C proximity-ligation data, generating accurate, fully-phased diploid assemblies. Specifically, it uses Hi-C’s chromatin proximity information to identify sequences belonging to the same parental chromosome in genome assemblies produced by PacBio’s FALCON-Unzip software, greatly reducing haplotype switching along the primary assembly.

Furthermore, by combining Phase Genomics’ Proximo Hi-C genome scaffolding technology with FALCON-Phase, users can fully reconstruct maternal and paternal haplotypes on a chromosomal scale. The end result is a diploid set of chromosome-scale scaffolds, or two fully-phased genomes for the same data and labor cost typical for a single genome project.

FALCON-Phase genome Phasing Graph

FALCON-Phase groups long-read contigs into two separate haplotypes based on Hi-C data. Red and blue edges show contigs connected to the same haplotype, while black edges show homologous contigs connected to both haplotypes. Colors were assigned based on known phasing of assembly, which was not otherwise used to inform FALCON-Phase analysis.
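
The grouping idea in the figure can be sketched with a toy greedy procedure: for each pair of homologous contigs, choose the orientation that puts each contig on the haplotype it shares more Hi-C links with. This is not the FALCON-Phase algorithm, which is not described here; `phase_pairs`, the greedy strategy, and the link counts are assumptions made purely for illustration.

```python
def phase_pairs(pairs, links):
    """Greedily sort homologous contig pairs into two haplotypes.

    pairs: list of (contig_x, contig_y) homologous pairs, in order.
    links: {(contig_a, contig_b): Hi-C link count}, in either key order.
    """
    def n_links(a, b):
        return links.get((a, b), 0) + links.get((b, a), 0)

    # The first pair fixes the (arbitrary) naming of the two haplotypes.
    hap0, hap1 = [pairs[0][0]], [pairs[0][1]]
    for x, y in pairs[1:]:
        keep = sum(n_links(x, h) for h in hap0) + sum(n_links(y, h) for h in hap1)
        swap = sum(n_links(y, h) for h in hap0) + sum(n_links(x, h) for h in hap1)
        if swap > keep:
            x, y = y, x
        hap0.append(x)
        hap1.append(y)
    return hap0, hap1

# Toy data: the true haplotypes are {a1, b2, c1} and {a2, b1, c2}.
pairs = [("a1", "a2"), ("b1", "b2"), ("c1", "c2")]
links = {
    ("a1", "b2"): 9, ("a2", "b1"): 8, ("a1", "b1"): 2,
    ("b2", "c1"): 7, ("b1", "c2"): 6, ("a1", "c1"): 3, ("b2", "c2"): 1,
}
hap0, hap1 = phase_pairs(pairs, links)
```

The second pair is flipped because `b2` links strongly to `a1`, and the third is kept as given. The point of the sketch is only that contigs from the same parental chromosome share more Hi-C contacts than contigs from different ones.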

These high-quality phased haplotypes can be leveraged to improve the efficiency of agricultural breeding programs, and could help identify disease-causing genomic variations in humans.

Prof. John Williams, Director of the Davies Research Centre at the University of Adelaide, Australia, wrote, “We are interested in expression of imprinted genes and for this work the availability of haplotype-resolved genome assemblies is an important advance. The release of software that enables the creation of haplotyped genome sequence assembly will revolutionize exploration of genome function. The FALCON-Phase software has this ability and can be applied retroactively to SMRT assemblies, as long as Hi-C data are available. Therefore, even pre-existing genomes can potentially be upgraded to haplotyped assemblies for little or no cost.”

Haplotype-resolved genome assembly is an exciting emerging field. Currently, there is only one other method, Trio Canu, which, unlike FALCON-Phase, requires the parents and offspring to be sequenced, adding an additional cost. For many species, it is not possible to collect a trio in the wild and breeding is often not an option. Other Hi-C phasing techniques exist, but they phase genetic variants, not genome assemblies.

The addition of ultra-long genomic interactions captured by Hi-C to PacBio assemblies is very powerful and presents a straightforward solution to a problem experienced by almost all genomic researchers working with diploid organisms.

A formal announcement with more information is coming in the next month. For more information, email us at info@phasegenomics.com.

 

Pacific Biosciences, the Pacific Biosciences logo, PacBio and SMRT are trademarks of Pacific Biosciences of California, Inc.

A sweet new genome for the black raspberry using Proximo™ Hi-C

Black raspberries

The black raspberry, known for its sweetness and health benefits, has been studied further to reveal its chromosome-scale genome.

What is a black raspberry, you may ask? Jams, preserves, pies, and liqueur are just a few of the delicious products made with black raspberry. The black raspberry offers much more beyond its exquisite flavors. For instance, did you know it contains compounds called anthocyanins that are used as a dye? It is also used in anti-aging beauty products and contains compounds that may help fight cancer. The useful properties of black raspberry are encoded within the genome.

A multi-national team of scientists has built a full map of the black raspberry genome. Teams from New Zealand, Canada, and the U.S.A. contributed to the project led by Drs. Rubina Jibran and David Chagné. The work was published in Nature, Horticulture Research. In the project, they leveraged Proximo™ Hi-C to order and orient short-read contigs into chromosome-scale scaffolds.

A chromosome-scale reference genome is an important step for basic biology and for breeding programs. Breeders can use this genome while crossing plants to select for traits like color or taste.  To learn more about how Hi-C technology was used to improve the black raspberry genome we contacted Dr. Chagné and Dr. Jibran for a Q&A session. We also wanted their take on the scientific value of Proximo Hi-C and to share their experiences working with us.

 

What is a black raspberry? How is it different from the blackberries we have in Seattle?

The black raspberry we used is no different from the ones found in Seattle. Actually, I remember seeing some black raspberries (also called black-caps) at Pike market a few years ago! Washington and Oregon are the largest producers of this delicious crop. Raspberries belong to the genus Rubus, which includes red (Rubus idaeus) and black (R. occidentalis) raspberries, blackberries, loganberries and boysenberries.

 

There are many curious uses of black raspberries, what’s yours?

Black and red raspberries are great on top of Pavlova, alongside slices of kiwifruit. Pavlova is New Zealand’s iconic dessert served around Christmas time, which is the berry fruit season down under here.

 

What are molecular breeding technologies? What are some of the traits in black raspberry you’d like to breed for?

Molecular Breeding techniques use DNA to inform selection decisions. My colleague Cameron Peace from Washington State University did a very good review about the use of DNA-informed breeding in fruit tree.  Plant & Food Research is leading in the use of molecular tools for breeding fruit species, for example we are using genetic markers to predict if apple seedlings carry certain loci for black spot resistance or if they are likely to be red fruited. The breeding goals for Plant & Food Research’s raspberry breeding programme are high fruit flavour, berry anti-oxidant content, pest and disease resistance and higher productivity.

 

The initial black raspberry genome assembly was built from short-read data. Why did you choose to scaffold the short-read contigs rather than create a new long-read assembly? Would you get chromosome scale contigs from a long-read assembly? 

Actually, we took both approaches. We wanted to see how much of the short-read assembly we could put together using Proximo Hi-C. A long-read based assembly will be released soon, and the comparison of both assemblies will be extremely informative about which strategy to use for future assemblies of other crop species.

 

How did you validate the Proximity Guided Assembly (PGA) scaffolds? How did you correct errors in the scaffolds?

The PGA for black raspberry was first validated by aligning it to a linkage map and then by aligning it to the genome of strawberry (Fragaria vesca) as they have syntenic genomes.

 

What was the process like in working with Phase Genomics? Would you recommend them to your colleagues?

I enjoy working with Phase Genomics a lot. Black raspberry is not the first genome on which we have collaborated with Phase Genomics, as we had previously assembled genomes for kiwifruit and New Zealand manuka. The way we work with Phase Genomics is very iterative, and they are excellent at trying new methods and assembly parameters until we are satisfied with our assemblies. Every organism has its own challenges when it comes to genome assembly, and working with Phase Genomics in a very collaborative way is extremely useful. I have recommended Phase Genomics to colleagues.

New Video: From Contigs to Chromosomes

Phase Genomics CEO and Founder Ivan Liachko, Ph.D., offers an inside look at our ProxiMeta™ Hi-C and Proximo™ Hi-C technology. In this 40-minute presentation, he explains how Hi-C is revolutionizing genome and metagenome assembly. Watch “From Contigs to Chromosomes” now and reach out to http://phasegenomics.com/contact-us/ with any questions.

Thanks to IMMSA for hosting this webinar.

Uncovering the microbiome: What will you do with metagenomics?

In this Nature Microbiology blog post, Mick Watson shares his journey into the depths of the rumen microbiome. Read more here to learn how Phase Genomics ProxiMeta Hi-C Metagenomic Deconvolution techniques are helping investigators advance their metagenomic research in complex samples. This study successfully assembled 913 genomes and will improve our understanding of the microbial population in the cow rumen in an unprecedented way using these new metagenomics techniques. We look forward to seeing what else comes from Microbiome 2.0 and are proud to be a part of this impressive piece of work.

Hundreds of Genomes Isolated from Single Fecal Sample with Hi-C Kit

 


Phase Genomics Hi-C kits for any sample type are now available!

Phase Genomics recently launched its ProxiMeta™ Hi-C metagenome deconvolution kit + software product, enabling researchers to bring this powerful technology (previously only available through the ProxiMeta service) into their own labs. A new paper posted to bioRxiv describes the results of employing ProxiMeta technology to deconvolute a human gut microbiome sample.

 

In the paper, ProxiMeta was used on a single human gut microbiome sample and isolated 252 individual microbial genomes or genome fragments, with 50 of these genomes meeting the “near-complete” threshold typically used as the standard according to the CheckM tool (>90% complete, <10% contaminated). An examination of the tRNA and rRNA content of the genomes found 10 that met “high-quality” and 75 that met “medium-quality” thresholds. Additionally, 14 of the genomes represent near-complete assemblies of novel species or strains not found in RefSeq, showing that even after many years of research, there remain numerous unknown microbes in the human gut that are discoverable with new approaches.
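For readers who want to apply the same “near-complete” cutoff to their own bins, here is a minimal sketch in Python. It assumes you already have per-bin completeness and contamination percentages (for example, from a CheckM report); the `Bin` class, the bin names, and the numbers are made up for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Bin:
    """One genome bin with quality estimates, both in percent (hypothetical data model)."""
    name: str
    completeness: float
    contamination: float


def is_near_complete(b, min_completeness=90.0, max_contamination=10.0):
    """Apply the >90% complete, <10% contaminated cutoff quoted in the text."""
    return b.completeness > min_completeness and b.contamination < max_contamination


# Illustrative values only, not results from the study.
bins = [
    Bin("bin_001", 97.3, 1.2),   # passes both cutoffs
    Bin("bin_002", 92.0, 14.5),  # complete enough, but too contaminated
    Bin("bin_003", 61.8, 0.4),   # clean, but too fragmentary
]

near_complete = [b.name for b in bins if is_near_complete(b)]
print(near_complete)
```

The tRNA and rRNA checks behind the “high-quality” and “medium-quality” labels are a separate step and are not modeled here.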

 

ProxiMeta’s results were compared to those achieved with MaxBin, a common tool used to perform metagenomic binning based on heuristics such as shotgun read depth and tetranucleotide profiles. MaxBin was able to create 29 near-complete genomes (cf. 50 for ProxiMeta), with only 5 meeting high-quality (cf. 10) and 44 meeting medium-quality (cf. 75) thresholds based on tRNA and rRNA content. In terms of ability to construct similar sets of near-complete genomes, ProxiMeta and MaxBin constructed 27 of approximately the same genomes, with ProxiMeta constructing an additional 32 genomes that MaxBin did not, and MaxBin constructing 9 genomes that ProxiMeta did not. ProxiMeta’s assembled genomes also exhibited a much lower amount of contamination than MaxBin’s assembled genomes, with 43% of MaxBin’s assemblies exceeding the 10% contamination limit that is the typical standard for genome quality, compared to only 2% of ProxiMeta’s assemblies.

 

Other results unique to ProxiMeta include the discovery of near-complete genomes for 14 novel species or strains and various associations of plasmids with their hosts. Of the 14 novel genomes, 10 appear to be of the class Clostridia, a common group of gut microbes that are poorly characterized because they are difficult to culture. ProxiMeta also assigned 137 contigs containing plasmid content to a cluster and identified candidate plasmid sequences as being present across multiple, distantly related bacteria. For example, ProxiMeta placed a known megaplasmid into an assembly for Eubacterium eligens and also placed homologous plasmid sequences into several other genomes, suggesting either the presence of the megaplasmid in other species or variants of the megaplasmid on other mobile elements spread through the metagenome.

 

The depth of the resulting data offers the opportunity to learn much more about this microbial niche, and research continues to unlock new discoveries about this community. Phase Genomics is thrilled to offer all researchers the same new power to dig deeper into their mixed samples than ever before, especially now with a product that puts the power of discovery in their hands.

 

To learn more about ordering our kits or services, just send us an email at info@phasegenomics.com

Orphan Crop Gains Reference Genome with Proximo Hi-C

Amaranth genome assembly brought to the chromosome-scale using Phase Genomics’ Proximo Hi-C technology. 

 

“Orphan crops” are growing in popularity because they have the potential to feed the world’s expanding population. You may have heard of orphan crops like quinoa or spelt, but have you heard of amaranth? The amaranth genus (Amaranthus) is a hardy group of plants that produce nutritious leaves and seeds, high in protein and vitamin content. Amaranth species grow strongly across a wide geographic range, including South America, Mesoamerica, and Asia. Amaranth was likely domesticated by the Aztec civilization and has been a staple food of Mesoamericans for thousands of years. Breeders wish to enhance amaranth’s beneficial properties like drought resistance, nutrition, and seed production to improve the usefulness of amaranth as a food source. However, effective plant husbandry requires genetic and genomic resources, and building these resources has been inhibited by the high cost of genome sequencing and assembly.

 


Dr. Jeff Maughan (left) and Dr. Damien Lightfoot (right) are the primary authors of the amaranth genome paper.

Dr. Jeff Maughan, professor at Brigham Young University, is a champion of orphan crop genomics. Over the past year, Dr. Maughan and his team built a reference-quality amaranth genome on a tight budget. They built upon an earlier short-read assembly by adding Hi-C data, which measures the conformation of chromatin in vivo, as well as low-coverage long reads and optical mapping data. After optical mapping was used to correct assembly errors in the short-read assembly, the Hi-C data was used to cluster the short genome fragments into nearly complete chromosomes using Phase Genomics’ Proximity-Guided Assembly platform, Proximo™ Hi-C. Then, the long reads were used to close remaining gaps on the chromosomes. This cost-effective strategy recovered over 98% of the 16 amaranth chromosomes.

 

The completed reference genome provides an important resource for the community and will boost the efforts of plant breeders to unlock more agricultural benefits for amaranth.  In their paper, Dr. Maughan’s team demonstrated the utility of the reference quality genome in at least two ways.  First, they looked at chromosomal evolution by comparing the amaranth genome to the beet genome, which enables researchers to better understand amaranth in the context of how plants evolved, and second, they mapped the genetic locus responsible for stem color, which clarifies the scientific understanding of a useful agricultural trait.  Dr. Maughan points out that both of these experiments would have been impossible without the chromosome-scale genome assembly afforded by Proximo Hi-C.

 

A high-quality reference genome is the first of many important steps towards creating a modern breeding program for amaranth. We contacted Dr. Maughan to learn more about how he is improving amaranth genomics and the importance of orphan crops.

 

What is an orphan crop? 

According to the FAO (Food and Agriculture Organization of the United Nations) the world has approximately 7,000 cultivated edible plant species, but just five of them (rice, wheat, corn, millet, and sorghum) are estimated to provide 60% of the world’s energy intake and just 30 species account for nearly all (95%) of all human food energy needs.  The remaining species are underutilized and often referred to as “orphan crops”.

 

How is genomics relevant to orphan crops?

Would you invest your entire 401K savings in just three stocks? In essence, that is what we are doing with world food security. This comes with tremendous risk. If we are going to diversify our food crops, it will be with these orphan crops. Modern plant breeding programs leverage genomics to significantly enhance genetic gain (yield); such methods will undoubtedly expedite the development of advanced varieties in orphan crop species.

 

What are the challenges facing researchers interested in orphan crop genomics?  How have you overcome them?

Funding has long been the main obstacle to developing genomic resources for orphaned crops.  The development of cheap, high-quality next-generation sequencing technology has dramatically ameliorated this problem – making genomics accessible for most plant species.

 

You used two scaffolding technologies for your assembly, Hi-C, and BioNano. How did they compare?

Both technologies are extremely useful and complementary but address different genome assembly challenges.  The Hi-C data allows for the production of chromosome length scaffolds, while the BioNano data allows for fine-tuning and verification of the assembly.

 

Beyond building a high-quality genome assembly, what other genomic resources are required to encourage the adoption of orphan crops?

While genomic resources (such as genome assemblies and genetic markers) are fundamental for developing a modern plant breeding program, often what is missing with orphan crops is the collection of diverse germplasm (or gene bank) that is the foundation of a hybrid breeding program.  The U.S. and other nations have extensive collections (tens of thousands of accessions) that serve as the genetic foundation for staple crop breeding programs – unfortunately, such collections are minimal or non-existent for orphan crops.

 

Who stands to benefit the most from a complete amaranth genome?  How do you disseminate your work to them?

We collaborate extensively with researchers throughout South and Central America, where amaranth is already valued as a regionally important crop. Dissemination of our research occurs through traditional methods (e.g., peer-reviewed publications) as well as through sponsored scientist and student exchanges.

 

Amaranth is used in a variety of interesting foods, what’s your favorite dish?

Alegría, which is made with popped amaranth and honey, and is common throughout Mexico.

 

Threespine Stickleback Genome Upgraded Using Proximo™ Hi-C


Proximo Hi-C genome scaffolding not only improved the well-studied threespine stickleback assembly, but also found structural differences that would have otherwise been missed. 

 

This week researchers from the University of Bern and the University of Georgia released a new high-quality reference genome for the threespine stickleback. The results of this project, a collaboration between Dr. Catherine Peichel, Dr. Michael White, and Phase Genomics, were published in the Journal of Heredity. By applying a relatively new scaffolding technology, Proximo Hi-C, the team was able to place 60% of previously unassigned sequence on chromosomes. These previously unplaced sequences make up ~5% (13 Megabases) of the stickleback genome and contain multiple genes and other functional DNA. The assembly was generated from an individual from a different lake than the previous stickleback reference genome, and the structural information generated by Proximo Hi-C allowed the team to identify novel structural variants between the two populations. These improvements and new structural information will benefit many research groups that use this model organism to study genetics and evolution.

 

The first effort to sequence and assemble the threespine stickleback genome, published in 2012, used a costly method called Sanger sequencing. This assembly was followed by two revisions in 2013 and 2015 that used standard short-read sequencing technologies. Short reads can be assembled into larger fragments of the genome called contigs, but some regions of the genome are difficult to assemble because they are long, highly repetitive, or otherwise ambiguous. In the end, these efforts left researchers with a decent yet highly fragmented picture of the stickleback’s chromosomes, with large portions of its genetic sequence left in individual contigs not associated with any chromosome.

 


Dr. Catherine (Katie) Peichel (left), Head of the Division Evolutionary Ecology, University of Bern, and Dr. Michael White, Assistant Professor, Department of Genetics, University of Georgia, used Proximo Hi-C genome scaffolding to make many improvements to the threespine stickleback genome and detect structural variation.

Dr. Peichel and Dr. White used Phase Genomics’ Proximo Hi-C genome scaffolding technology to resolve many of these issues and create the new reference genome. Proximo Hi-C genome scaffolding uses a protocol called Hi-C to measure the physical structure of an organism’s genome and then uses that information to place contigs into chromosome-scale de novo assemblies. Phase Genomics was founded by the inventors of this genome assembly approach and has been making its Proximo Hi-C genome scaffolding technology available to researchers since 2015. The company specializes in generating and analyzing Hi-C data for the scaffolding of genomes such as the Threespine stickleback, as well as for analyzing microbial communities and other metagenomic samples through its ProxiMeta™ Hi-C metagenomic deconvolution technology.
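To give a feel for the idea, here is a toy sketch in Python. It relies on one general property of Hi-C data: sequences on the same chromosome tend to share many more Hi-C links than sequences on different chromosomes. The function name, the contig names, the link counts, and the density cutoff are all hypothetical, and this simple single-linkage grouping is only an illustration of the principle, not the Proximo algorithm.

```python
from collections import defaultdict


def cluster_contigs(lengths, links, min_density):
    """Group contigs whose length-normalized Hi-C link density meets a cutoff.

    lengths: dict of contig name -> length in bp
    links:   dict of (contig_a, contig_b) -> number of Hi-C read pairs linking them
    """
    parent = {c: c for c in lengths}

    def find(c):
        # Union-find lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for (a, b), count in links.items():
        # Normalize by contig lengths so long contigs do not win by size alone.
        density = count / (lengths[a] * lengths[b])
        if density >= min_density:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for c in lengths:
        groups[find(c)].append(c)
    return sorted(sorted(g) for g in groups.values())


# Made-up example: two strongly linked pairs, weak links between the pairs.
lengths = {"ctg1": 100_000, "ctg2": 80_000, "ctg3": 120_000, "ctg4": 90_000}
links = {
    ("ctg1", "ctg2"): 400,
    ("ctg3", "ctg4"): 540,
    ("ctg1", "ctg3"): 12,
    ("ctg2", "ctg4"): 7,
}

clusters = cluster_contigs(lengths, links, min_density=1e-8)
print(clusters)
```

A real scaffolder also has to order and orient the contigs within each group, which this sketch leaves out.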

 

We know that scientific tools are only as good as the resulting scientific findings. We sent a brief Q&A to both Dr. Peichel and Dr. White to get their take on the scientific value of Proximo Hi-C and share their experiences in working with us.

 

Why is the stickleback genome important?

Sticklebacks are a “supermodel” for evolutionary genetics, in that they have been one of the leading model systems for identifying the genetic and molecular basis of phenotypic changes in natural populations. Thus, it is important to have a complete genome sequence so that one can correctly identify all the genes that are present in a genomic region that is associated with a phenotype of interest. -CP

Why did the original genome need improvement?

A high-quality Sanger-sequenced genome was published in 2012 and has undergone two revisions since this time. Despite incorporating dense linkage maps to help assign many of the unanchored scaffolds to linkage groups, over 26.7 Mb of the 460 Mb genome still remained unassigned to linkage groups. We needed to apply other approaches to try and assign these remaining scaffolds. -MW

How did Proximo Hi-C scaffolding improve the contiguity of the genome?

We were able to assign over 60% of the unassigned contigs to chromosomes. -CP

What other applications of the Hi-C data are useful to your biological questions?

Hi-C is a useful way to identify structural variation (like inversions) among stickleback populations. We are also excited about the possibility of using Hi-C for assembly of the hard-to-assemble regions of the genome like Y chromosomes. -CP

Why did you choose to work with Phase Genomics?

I was impressed by their interest in our biological questions and dedication to working with us until we were satisfied with the assembly. -CP

We chose to work with Phase Genomics because of the ease of the entire pipeline. Phase Genomics was fast and kept us updated at every step along the way. It was great to work with a group that was so communicative and open to trying different approaches to get the best assembly. -MW

Spotlight on Hi-C in Science: New Technologies Boost Genome Quality

Science writer Elizabeth Pennisi outlines available genomics technologies that are helping researchers improve genome assemblies, with a focus on Hi-C’s ability to bring genome assembly to the chromosome scale.

This article, by Elizabeth Pennisi, focuses on how new technologies are making genome quality much better. Long reads, optical maps, and Hi-C data are being synergistically applied to improve modern genome assemblies including goat (Dr. Tim Smith), hummingbird (Dr. Erich Jarvis), maize, and more. Importantly, Hi-C provides the finishing touch to these genomes by providing ultra-long contiguity information that can scaffold entire chromosomes. We at Phase Genomics are glad researchers have chosen Proximo Hi-C to scaffold the goat, hummingbird, and hundreds of other assemblies into contiguous chromosome-scale reference genomes.

 

Read the article here

Hi-C Used to Assemble Extremely Large, Difficult Barley Genome

Barley is the 4th most cultivated plant in the world and has been a reliable food source for over 10,000 years. GenomeWeb reports on the exceptional state of the genome assembly and how researchers used Hi-C technology to tackle this extremely complex genome.

 

The barley genome, like those of many other grains, is notorious for being extremely difficult to assemble due to extensive polyploidy, long repeat regions, and its large genome size (5.3 Gb). However, the International Barley Genome Sequencing Consortium (IBSC) used Hi-C to tackle this genome assembly, producing chromosome-level scaffolds representing over 95% of the genome in an attempt to understand the biology of this widely cultivated plant. After completing the assembly, the researchers began annotating the genome and identified over 87,000 different genes, publishing their findings in Nature.

 

Obtaining reference-quality assemblies for complex genomes, such as barley, used to be an extremely challenging endeavor. With Hi-C, obstacles like polyploidy and multi-Gb genomes are manageable due to its ability to capture ultra-long-range genomic contiguity information from unbroken chromosomes, replacing the need for genetic maps. This ability enables researchers to answer questions that would otherwise be difficult or impossible, including questions about structural variation, complex gene structure, gene linkage, gene regulation, and more. While the researchers performed the barley assembly themselves, Phase Genomics’ Proximo Hi-C service makes it easy for any researcher to obtain similar results and has been used to assemble hundreds of genomes to chromosome scale over the past two years, including complex genomes like barley.

 

Read more about the barley genome on GenomeWeb.

Spotlight on Hi-C in The Atlantic: The Game-Changing Technique That Cracked the Zika-Mosquito Genome

One of the most prolific science writers, Ed Yong, profiles how Hi-C sequencing technologies can make genome assembly easier and more cost-effective than ever before. 

Science writer Ed Yong recounts how researchers tackled the genome of the disease-carrying mosquito Aedes aegypti, and how Hi-C “knitted” the genome from 36,000 pieces into complete and contiguous chromosomes. Yong points out that the completed genome will not only help scientists understand the biology of the mosquito at a much deeper level, but also marks a technological pivot in genomics: Hi-C makes genome assembly cheaper, more accurate and faster than ever before. The article also cites our collaborator Dr. Catherine Peichel’s newly published threespine stickleback genome and Dr. Erich Jarvis’s hummingbird genome as examples of the power of Proximo Hi-C scaffolding.

 

Read the article here

Papadum’s Recipe for an Outstanding, Chromosome-Scale Genome with Hi-C

Meet Papadum the Goat! Papadum is a descendant of a rare population of goats that used to inhabit San Clemente Island, and notably, Papadum now also holds the world record for the most contiguous non-model mammalian genome. The recipe for his amazing de novo genome assembly? Long reads, optical mapping, and Proximo Hi-C genome scaffolding. Read NIH’s article about Papadum’s genome here.

 

The goat genome has been of scientific interest for several reasons: goats are important suppliers of milk, cloth, meat, and more. But prior to the Papadum genome, scientists’ ability to fully understand how the goat genome controls its biology was limited. As a part of the “Feed the Future” initiative, in 2014 the U.S. Agency for International Development awarded innovative scientists Dr. Tim Smith, Dr. Derek Bickhart and Dr. Adam Phillippy a grant to attempt to eliminate these limitations by assembling Papadum’s genome. As pioneers in the genomics field, the scientists teamed up to leverage two rather young technologies, long reads and Hi-C, to create an ultra-high-quality new assembly of the goat genome.

 

Their efforts ultimately led to the creation of the highest quality de novo genome assembly of a mammal to date and were published in Nature Genetics. With this new reference-quality goat genome, scientists will have a better understanding of goat biology and health to guide better breeding decisions, improving traits like milk production, meat quality, and resilience to disease.

 

The Papadum genome assembly includes large DNA sequences called “chromosome-scale scaffolds”, which are nearly complete representations of entire chromosomes from Papadum. These chromosome-scale scaffolds are a critical achievement that allows a far better understanding of the mechanics of the goat genome than earlier, less advanced results, which included thousands of tiny fragments of chromosomes and lacked the overall structure of the goat genome. The difference is not unlike having an entire intact book versus a jumble of all the individual words from the book.

 

The ability to reconstruct nearly complete chromosomes was made possible largely by a new technique called Proximity-Guided Assembly, performed with Phase Genomics’ Proximo™ Hi-C scaffolding technology. This process was followed by a tool called PBJelly, which identifies and closes gaps (regions of uncertainty) in the chromosome-scale scaffolds. After Proximo and PBJelly, the resulting assembly included 31 chromosome-scale scaffolds containing only 663 gaps total across the 3 billion base pair diploid genome. The Proximo Hi-C scaffolding method descends from research first published in 2013, and Phase Genomics has since demonstrated its success in the genomes of plants, animals, fungi and more.

 

Papadum’s genome marks the beginning of an era where reference-quality genomes are achievable and affordable for any organism, not just extensively studied model organisms like mice, fruit flies, and humans. The availability of these extraordinarily complete genomes enables scientists to answer many new biological questions that have the potential to help farmers, government agencies, agricultural companies, and developing countries solve a significant part of the food security problem.

 

Read more about the grant, the scientists, and Papadum’s genome on the NIH’s National Human Genome Research Institute website.