Plasmids are hard!
Plasmids are an important part of microbial biology. Plasmid-borne genes can have serious public health consequences by conferring virulence traits or resistance to antibiotic drugs, and can be readily shared among bacterial cells through cell-cell conjugation or other means. In principle, any gene that gives bacterial cells a selective advantage is likely to be shared via plasmids among related cells. For example, so-called “epidemic resistance plasmids” have been instrumental in the rise of multi-drug resistance in pathogenic E. coli and Klebsiella pneumoniae.
However, determining the bacterial hosts of any given plasmid in a sample can be difficult. The classic approach is to isolate host and plasmid together and culture them in the lab. In complex samples with numerous organisms, though, this approach is often impossible: many of the organisms cannot be cultured readily, and culturing itself may alter the selection pressure on the organisms of interest. Statistical metagenomic approaches also have difficulty with plasmid-host association, as plasmids do not necessarily resemble their host genomes in either abundance or nucleotide composition. Single-cell sequencing approaches are expensive and work on only a limited range of samples and species.
Hi-C to the rescue
Fortunately, recent developments in genomic technology have yielded some novel tools that allow us to circumvent this limitation. Hi-C is a method that captures which DNA sequences lie close together in three dimensions inside intact cells, and it was originally developed to model the 3D folding of genomes. These structural measurements include a clear signal about which sequences originated inside the same cell, simply because the cell membrane generally prevents sequences from different cells from coming into contact. Hi-C therefore provides direct physical evidence of DNA sequences originating from the same cell.
Phase Genomics has developed the ProxiMeta™ Hi-C metagenome deconvolution method, which is specifically optimized for metagenomic applications (Figure 1). At Phase Genomics we use ProxiMeta Hi-C to reconstruct whole genomes from a variety of complex samples such as human fecal, wastewater, soil, and co-culture communities (for more information, see our paper about ProxiMeta).
As a necessary part of their life cycle, plasmids need to pass through their bacterial host cells to replicate. Therefore, plasmids typically form Hi-C links to their host genomes simply by virtue of being inside the same cell as their host genome. So, to find the hosts of a given plasmid, one only needs to find these plasmid-genome links. Our analysis of metagenomic Hi-C data has borne this conclusion out repeatedly across multiple publications, as described below.
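To make the idea of "finding plasmid-genome links" concrete, here is a minimal Python sketch. It is an illustration only and not the ProxiMeta implementation: the function name, the input format (one tuple of contig names per Hi-C read pair), and the toy contig and cluster names are all assumptions made for this example.

```python
from collections import Counter

def count_plasmid_host_links(read_pairs, contig_to_cluster, plasmid_contigs):
    """Count Hi-C read pairs joining each plasmid contig to each genome cluster.

    read_pairs: iterable of (contig_a, contig_b), one tuple per Hi-C read pair
    contig_to_cluster: dict mapping chromosomal contig name -> genome cluster
    plasmid_contigs: set of contig names treated as plasmids
    (All three inputs are hypothetical formats chosen for illustration.)
    """
    links = {plasmid: Counter() for plasmid in plasmid_contigs}
    for a, b in read_pairs:
        # A read pair can list the plasmid end first or second, so check both.
        for plasmid, other in ((a, b), (b, a)):
            if plasmid in plasmid_contigs and other in contig_to_cluster:
                links[plasmid][contig_to_cluster[other]] += 1
    return links

# Toy data, invented for the example.
read_pairs = [
    ("plasmid_1", "contig_A"),
    ("contig_A", "plasmid_1"),
    ("plasmid_1", "contig_B"),
    ("contig_A", "contig_B"),    # genome-genome pair, ignored here
    ("plasmid_1", "plasmid_1"),  # plasmid self-contact, ignored here
]
contig_to_cluster = {"contig_A": "cluster_1", "contig_B": "cluster_2"}
links = count_plasmid_host_links(read_pairs, contig_to_cluster, {"plasmid_1"})
print(dict(links["plasmid_1"]))
```

In real data, raw counts like these would generally need to be weighed against factors such as contig length and abundance before drawing conclusions; the sketch leaves that out for brevity.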
Hi-C links plasmids and hosts
A pair of early publications showed that an early version of the Hi-C method could correctly associate several plasmids with their bacterial hosts in an artificial community.
In our more recent paper, we demonstrated that Hi-C links in a complex human fecal sample connect previously described plasmids to their known hosts. Excitingly, in a single experiment we can now assemble numerous novel microbial genomes, complete with plasmid content, from a complex sample with hundreds of different species present.
An exciting finding from this complex community is that we can directly visualize how plasmids are shared between bacteria in a community (Figure 2). Recall from above that the sharing and spread of plasmids is a serious problem in the epidemiology of antibiotic resistance and infectious disease. For example, the sequence marked with “*” in Figure 2 shows substantial similarity to a plasmid called pBUN24, in addition to other plasmids with unknown hosts. It is clear that this plasmid shows contacts with a variety of genome clusters corresponding to different organisms, suggesting that all of these organisms can act as hosts for this plasmid.
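One simple way to turn contact counts like these into a candidate host list is to apply thresholds, as in the Python sketch below. This is a hypothetical illustration rather than the analysis behind Figure 2: the function name, the threshold values, and the example counts are all invented for the example.

```python
def call_hosts(link_counts, min_links=5, min_fraction=0.1):
    """Return genome clusters with enough Hi-C link support to a plasmid.

    link_counts: dict mapping genome cluster -> number of Hi-C links to the plasmid
    min_links, min_fraction: illustrative cutoffs, not validated values
    """
    total = sum(link_counts.values())
    if total == 0:
        return []
    return sorted(
        cluster
        for cluster, n in link_counts.items()
        if n >= min_links and n / total >= min_fraction
    )

# Invented counts for a single plasmid.
example_counts = {"cluster_1": 120, "cluster_2": 45, "cluster_3": 2}
hosts = call_hosts(example_counts)
print(hosts)  # cluster_3 falls below min_links and is dropped
```

A plasmid for which this returns more than one cluster would be a candidate for a multi-host plasmid, subject to whatever normalization and noise filtering the real data require.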
In a more recent collaboration with Mick Watson’s group at the Roslin Institute, we applied ProxiMeta Hi-C to the cow rumen microbiome, a very complex microbial community. In this peer-reviewed paper, we were able not only to discover scores of novel genomes in this community, but also to profile plasmid-genome linkages for these genomes. Thus, Hi-C linkages of plasmids to genomes remain robust even in very complex communities.
Looking to the future: Plasmid Biology conference and more
We have multiple exciting ongoing collaborations using Hi-C to understand the host range and biology of plasmids and other mobile elements; the best is yet to come! To see several examples of our Hi-C technology applied to this problem, you need only read the abstracts for the 2018 Plasmid Biology conference, August 5-10 at the University of Washington in Seattle.
We will be writing more about the uses of ProxiMeta and metagenomic Hi-C on this blog in the future, so stay tuned.