Bacteriophages: Genes and Genomes

Transcript of Part 2: Bacteriophages: Genomic insights.

00:00:01.02		Hi. My name is Graham Hatfull. I am a professor at the University of Pittsburgh
00:00:04.15		and a Howard Hughes Medical Institute professor.
00:00:07.03		In part two we are going to talk about some of the insights that we can gain from comparing
00:00:14.02		the genomes of bacteriophages and perhaps learn something about how they are constructed and how they have evolved.
00:00:22.03		In part one we saw some morphologies of bacteriophages, what they look like in the electron microscope,
00:00:32.29		and I showed you some different types of structures
00:00:37.23		that could arguably reflect different ways in which those bacteriophages have evolved.
00:00:44.27		But we have to be very careful about interpreting differences in virion morphology, what the viruses look like,
00:00:53.09		and their evolutionary relationships and how the genomes compare to each other.
00:00:58.09		I've illustrated that in this particular slide
00:01:01.00		where I have shown five examples of bacteriophages.
00:01:05.06		These would all be classified according to their long flexible tails
00:01:10.14		as being members of the Siphoviridae, the Sipho viruses,
00:01:15.03		each with their heads and their tails attached.
00:01:18.18		It might be tempting to look at these and say well they all look very similar to each other,
00:01:25.06		almost indistinguishable, perhaps they all are genetically similar.
00:01:30.11		In fact this is an example where these five share essentially little or no sequence similarity at the genomic level whatsoever.
00:01:40.01		So if we want to understand how genomes have evolved
00:01:44.00		and how they are related to each other from a phylogenetic perspective,
00:01:48.23		we need to go in, isolate the DNA, and sequence those genomes and then compare them.
00:01:54.24		There are various ways in which we can compare the genomic sequences.
00:02:00.22		We can compare them by looking at the similarities of the nucleotide sequences,
00:02:07.26		essentially sequencing the DNA or if it's RNA, the RNA,
00:02:13.05		and then comparing them one to another and seeing what is shared.
00:02:16.08		A second way of doing that would be to look at the genes
00:02:21.09		and compare them through the predicted amino acid sequence similarities of the proteins that are encoded by those genes.
00:02:28.22		Right here I am showing you an example of what it looks like if we take two bacteriophages
00:02:34.09		and compare their nucleotide sequences.
00:02:37.20		And this particular representation is referred to as a dot plot.
00:02:42.05		And what we have done is to take two bacteriophage genomes, in this case Fruitloop and Boomer,
00:02:49.10		and we have aligned the two sequences, and we are going to slide one next to the other computationally,
00:02:58.03		and ask if there are segments that are similar to each other within a particular window of comparison.
00:03:06.07		And every time we see sequence similarity, a dot is presented on this dot plot.
00:03:12.11		And what you can see here is that
00:03:15.12		there's a rather complex series of relationships reflecting a quasi diagonal line
00:03:25.08		from the top left to the bottom right of this representation.
00:03:29.21		So where you can see a relatively solid line that means that there is a segment of DNA
00:03:35.24		which is substantially similar between the two.
00:03:38.23		Where you fail to see a line, such as in the top left hand corner, is a region where the two genomes
00:03:47.14		appear to be substantially dissimilar.
00:03:50.10		They don't have shared nucleotide sequences.
00:03:52.15		And then there's all sorts of complicated interruptions and shifts in the diagonal line as you look between these.
00:04:02.13		And this tells us an important aspect, a component, of what we see when we compare these types of genomes.
00:04:11.09		And that is, they are not simply completely similar from end to end or completely dissimilar from end to end.
00:04:18.20		But quite commonly we see these interrupted portions
00:04:22.22		where different segments of the genomes are related to each other in different ways
00:04:27.20		as though different parts of the genome have different evolutionary histories,
00:04:33.13		different ways of arriving in the genomes as we see them in Fruitloop and Boomer today.
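
The sliding-window comparison behind a dot plot can be sketched in a few lines. This is a minimal illustration, not the software used to make the slide: the two toy sequences, the function names `dot_plot` and `render`, and the window settings are all assumptions chosen for the example, and real tools work on whole genomes and usually tolerate mismatches.

```python
def dot_plot(seq_a, seq_b, window=4, min_matches=4):
    """Return the set of (i, j) where the window starting at seq_a[i]
    matches the window starting at seq_b[j] in at least min_matches positions."""
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                a == b
                for a, b in zip(seq_a[i:i + window], seq_b[j:j + window])
            )
            if matches >= min_matches:
                dots.add((i, j))
    return dots


def render(dots, n_rows, n_cols):
    """Draw the dots as text: rows follow seq_a, columns follow seq_b."""
    return "\n".join(
        "".join("*" if (i, j) in dots else "." for j in range(n_cols))
        for i in range(n_rows)
    )


# Hypothetical toy genomes: unrelated left parts, identical right parts.
genome_1 = "TTTTTTTT" + "ACGTGCATGGCA"
genome_2 = "CCCCCCCC" + "ACGTGCATGGCA"

dots = dot_plot(genome_1, genome_2)
print(render(dots, len(genome_1) - 3, len(genome_2) - 3))
```

In the printed grid the top left corner stays empty because the left parts share nothing, and a diagonal appears only where the two toy sequences are identical.
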
00:04:40.14		So from this type of analysis and looking at a number of bacteriophage genomes,
00:04:47.08		we can see the following general conclusions.
00:04:51.15		First of all, for the DNA that is isolated from these particular virions, these double stranded DNA virus types,
00:05:01.12		the genomes are linear. So they have a left end and a right end.
00:05:08.23		They tend to form predominantly two types of groups that we can see when we look at the linear genomes.
00:05:17.06		There are those that have defined ends.
00:05:20.03		That means that if you isolate the molecules from a million particles
00:05:25.10		of a particular phage type, each of the million DNA molecules that you get out
00:05:30.08		has the same left and right ends. In other cases that is not true.
00:05:35.26		The DNAs have the same overall genetic constitution,
00:05:41.14		but the specific physical ends of the left and the right can be positioned in different places.
00:05:47.27		And therefore they are referred to as being circularly permuted.
00:05:52.18		They are not circular. They are linear,
00:05:54.21		but they represent different positions of the ends relative to the genetic information.
00:06:01.28		Often these viruses also contain terminal redundancies,
00:06:06.22		which means that one segment of the genome is duplicated at both ends.
00:06:11.19		And so these two major types of genomes that you see either have defined ends
00:06:17.27		or terminally redundant and circularly permuted ends
00:06:20.24		and there are other viruses that have different variations on these themes.
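
The difference between defined ends and circularly permuted, terminally redundant ends can be made concrete with a toy model. The sketch below is only an illustration under stated assumptions: the ten-letter "genome", the function names, and the idea of cutting each molecule from a circular map at an arbitrary start are all hypothetical devices for the example, not a description of how any particular phage packages its DNA.

```python
def cut_linear_molecule(genetic_map, start, redundancy):
    """Cut one linear molecule from a circular genetic map.

    The molecule begins at `start` and is longer than one full map by
    `redundancy` letters, so its two ends repeat the same segment.
    """
    n = len(genetic_map)
    return "".join(genetic_map[(start + k) % n] for k in range(n + redundancy))


def is_terminally_redundant(molecule, redundancy):
    """True if the first and last `redundancy` letters are identical."""
    return redundancy > 0 and molecule[:redundancy] == molecule[-redundancy:]


# Hypothetical genetic map; each letter stands for one gene.
genetic_map = "ABCDEFGHIJ"

# Defined ends: every particle carries the same linear molecule.
defined = [cut_linear_molecule(genetic_map, 0, 0) for _ in range(3)]

# Circularly permuted with terminal redundancy: same genes, different ends.
permuted = [cut_linear_molecule(genetic_map, s, 2) for s in (0, 3, 7)]

for molecule in defined + permuted:
    print(molecule)
```

Every molecule in `permuted` is linear and contains all ten letters, but the physical ends fall at different places on the map and each molecule repeats two letters at both ends.
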
00:06:25.22		The sizes of bacteriophage genomes vary enormously.
00:06:31.05		There are those that are as small as perhaps 5000 bases, and there are those that are as large as 500 kilobases,
00:06:39.02		which is quite amazing when you think that 500 kilobases
00:06:44.22		is about the same size as the smaller of the free living bacterial genomes.
00:06:50.14		And so there are examples of viruses that are the same size genomically
00:06:55.00		and have the same or more genes as small bacterial genomes.
00:07:00.15		The phage genomes tend to be densely packed with genes.
00:07:05.19		And so most of the DNA is encoding genes.
00:07:10.16		And as I mentioned before in this section, the phages infecting bacteria from different genera
00:07:17.05		tend to be unrelated at the DNA level.
00:07:21.13		So this slide shows an example of what we see when we take a DNA sequence of a particular phage,
00:07:34.00		in this case it is a phage called Giles,
00:07:36.26		and we use computational approaches and a bioinformatic strategy to identify
00:07:44.07		the protein coding genes that are present within the virus.
00:07:49.02		And so the genome is largely filled with protein coding genes,
00:07:54.22		and they are shown here by these boxes, either colored or in white.
00:08:00.21		The genome is represented by what looks like this railroad track here
00:08:05.24		which has markers every kilobase and every 100 bases.
00:08:10.21		And the genome for Giles is linear with defined ends,
00:08:14.27		and so in this representation it begins in the top left hand corner and goes to the bottom right hand corner,
00:08:22.14		and each of the genes is shown in these boxes represented either above or below the DNA.
00:08:30.00		Genes that are shown above the DNA are transcribed in the rightwards direction,
00:08:36.03		coming this way, and those that are shown below such as a couple or three genes in the top left hand corner
00:08:44.23		are transcribed in the leftwards direction.
00:08:47.19		So those are the standards that we use for presenting the genes
00:08:52.25		and illustrating the direction that they are transcribed
00:08:56.17		relative to the overall genome structure.
00:08:59.19		You can see here from these genes that they are densely packed into this particular genome.
00:09:06.10		There are few non-coding spaces between the genes.
00:09:10.19		They essentially represent 95% or more of the genetic information that's available.
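
A figure such as "95% or more" is a coding fraction, and it can be computed from a list of gene coordinates. The sketch below assumes a made-up genome of 1,000 bases with three invented genes; the `Gene` class, the coordinates, and the function names are hypothetical and are not taken from the Giles annotation. It also records the direction of transcription, which is what decides whether a gene is drawn above or below the genome.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int   # first base of the gene, 1-based and inclusive
    end: int     # last base of the gene, 1-based and inclusive
    strand: str  # "+" = transcribed rightwards, "-" = transcribed leftwards


def coding_fraction(genes, genome_length):
    """Fraction of bases covered by at least one gene (overlaps counted once)."""
    covered = [False] * genome_length
    for gene in genes:
        for position in range(gene.start - 1, gene.end):
            covered[position] = True
    return sum(covered) / genome_length


def drawn_above(genes):
    """Names of genes transcribed rightwards, drawn above the genome."""
    return [gene.name for gene in genes if gene.strand == "+"]


# Hypothetical annotation for a 1,000-base toy genome.
toy_genes = [
    Gene("1", 1, 300, "-"),
    Gene("2", 310, 700, "+"),
    Gene("3", 690, 980, "+"),  # overlaps the end of gene 2
]

print(coding_fraction(toy_genes, 1000))
print(drawn_above(toy_genes))
```

Marking each base once means that overlapping genes, which are common in densely packed genomes, do not inflate the fraction.
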
00:09:19.12		In this particular representation we have colored the genes in such a way
00:09:26.12		as to reflect the relationships that some of these genes share with genes that you find in other bacteriophages.
00:09:34.27		The genes that are shown in white, and you can see some across the top here,
00:09:40.02		are simply genes for which we don't have any other close relatives in any of the databases.
00:09:46.03		And this illustrates the point that phages such as this can be replete with genes
00:09:52.11		that are not closely related to known genes
00:09:55.24		and for which we have rather little idea as to what they do.
00:09:59.04		I mentioned that when we compare the nucleotide sequences of phages we can see that it looks as though the parts
00:10:12.01		have evolved differently to each other.
00:10:15.00		And this leads to the idea that phage genomes are characteristically mosaic.
00:10:21.09		They are constructed architecturally from segments which have been put together in a particular way.
00:10:29.01		Modules if you like. And that each of these modules is in effect mobile
00:10:35.08		or can move around the population of bacteriophages
00:10:38.28		such that you can find it in more than one or perhaps several different genomic contexts.
00:10:45.27		And this slide illustrates how this might look when you see mosaicism
00:10:52.15		at the level of nucleotide sequence comparisons.
00:10:56.00		So this is showing a small segment of three phage genomes.
00:11:01.19		PG1 is the one at the top, Rosebush is in the middle, and Qyrzula is towards the bottom.
00:11:08.17		You can see the genome represented by the markers in the railroad tracks for each of these.
00:11:14.19		The genes that are encoded are shown by the color boxes with their gene names inside the boxes,
00:11:19.11		and where these genomes contain and share nucleotide sequence similarity
00:11:25.04		there is a color coded area shading between the two such as you can see here.
00:11:32.01		Now Rosebush and Qyrzula have very evident and strong nucleotide sequence similarity
00:11:39.22		both in the left part here and over here in the right part as well.
00:11:44.09		PG1 and Rosebush and Qyrzula have no sequence similarity that is evident by this comparison,
00:11:53.22		in this example, because there is no color shading over on this left part.
00:11:58.29		Nonetheless, in this middle segment things are different.
00:12:04.23		There appears to be very little sequence similarity between Rosebush and Qyrzula
00:12:10.16		because there is no shading in that area.
00:12:15.22		However, when we compare PG1 and Rosebush we can see in this central segment right here
00:12:22.22		that there is indeed a purple color shading that reflects strong sequence similarity
00:12:29.17		between these two genomes, PG1 and Rosebush, in this center portion.
00:12:35.04		So this is really important because it illustrates an example where the different segments of these genomes
00:12:42.09		particularly Rosebush appear to have had different evolutionary histories.
00:12:47.09		They've come from different places.
00:12:49.14		This segment that's in the middle of Rosebush clearly did not come from the same place as Qyrzula.
00:12:55.02		It appears to have come from a common ancestor which had more in common in this region with PG1.
00:13:02.15		So this is a good example of mosaicism, a key architectural feature of bacteriophage genomes.
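
The pattern in this three-genome comparison can be mimicked by scoring similarity segment by segment rather than over the whole genome. The sketch below is an assumption-laden toy: the three sequences are invented (they are not PG1, Rosebush, or Qyrzula), they are taken to be already aligned and of equal length, and the function names are hypothetical.

```python
def identity(a, b):
    """Fraction of aligned positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def segment_identities(seq_a, seq_b, segment_length):
    """Identity of each consecutive segment of two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("this toy comparison expects equal-length sequences")
    return [
        identity(seq_a[i:i + segment_length], seq_b[i:i + segment_length])
        for i in range(0, len(seq_a), segment_length)
    ]


# Hypothetical genomes, each built from three 8-base segments.
top = "TGCATGCA" + "GGCCTTAA" + "AACTGGTC"
middle = "ACGTACGT" + "GGCCTTAA" + "TTGACCAG"
bottom = "ACGTACGT" + "CATGCATG" + "TTGACCAG"

print("middle vs bottom:", segment_identities(middle, bottom, 8))
print("top vs middle:   ", segment_identities(top, middle, 8))
```

The middle genome matches the bottom one in its left and right segments but matches the top one only in its central segment, which is the signature of a mosaic: no single pairwise score describes the relationship.
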
00:13:08.26		When you look at the nucleotide sequence level you can see precisely where these types of events occur,
00:13:18.27		at the boundaries that must reflect where recombination occurred to give you this exchange of information.
00:13:26.08		And in this particular slide I am showing the detailed information of two genomes.
00:13:33.28		The one at the top here you can see the sequences,
00:13:36.27		and in blue the amino acid sequences of the predicted genes in that region.
00:13:41.18		In the bottom you can see a second genome that we are comparing.
00:13:46.00		And this red shading over on the right hand side is
00:13:50.28		simply reflecting a segment where these genomes are closely related.
00:13:54.22		The nucleotide sequences, the DNAs are extremely similar if not identical in this red part,
00:14:03.09		but over here, they are completely different. They are completely dissimilar.
00:14:08.07		And so the key point that you can see from this type of comparison is that this module boundary, this junction
00:14:15.20		between the red and the white parts where recombination must have happened,
00:14:21.13		this module boundary, corresponds precisely to the boundaries of the genes.
00:14:27.16		It is this boundary which is where this gene starts up here, and its comparable gene begins down here.
00:14:38.04		These genes to the left are very different, and to the right they are identical.
00:14:42.23		So the module boundary, or the recombinant joint which must have brought these together
00:14:48.15		coincides with the gene boundaries themselves.
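
Checking whether a module boundary coincides with a gene boundary comes down to comparing two coordinates. The sketch below assumes two short invented sequences that are already aligned, with unrelated left parts and identical right parts, and an invented gene start; the function name and all values are hypothetical. Real comparisons work from alignments and tolerate occasional mismatches.

```python
def similarity_boundary(seq_a, seq_b):
    """Leftmost index from which two aligned sequences are identical
    all the way to the right end (len(seq_a) if the last bases differ)."""
    boundary = len(seq_a)
    for i in range(len(seq_a) - 1, -1, -1):
        if seq_a[i] != seq_b[i]:
            break
        boundary = i
    return boundary


# Hypothetical aligned sequences: different to the left, identical to the right.
upper = "CCTAGGTA" + "ATGGCTAAA"
lower = "GGATCCAT" + "ATGGCTAAA"

# Hypothetical annotation: the shared gene starts at index 8 (0-based).
gene_start = 8

boundary = similarity_boundary(upper, lower)
print(boundary, boundary == gene_start, upper[boundary:boundary + 3])
```

Here the point where similarity begins and the first base of the gene's start codon are the same position, which is the coincidence described for the slide.
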
00:14:52.25		And this is a common and important observation and it helps us to think about how mosaicism can be generated.
00:15:01.14		And there are two fundamental models.
00:15:03.05		The first is that recombination happens at targeted, short, conserved boundary sequences.
00:15:13.06		The idea that there are some short conserved segments of sequences,
00:15:17.00		perhaps a dozen or a couple of dozen nucleotides in length
00:15:19.23		which corresponds to those boundary regions.
00:15:23.05		And that homologous recombination perhaps encoded by host enzymes
00:15:26.26		catalyzes exchange at that region in order to promote recombination
00:15:32.15		at places where genes themselves in their entirety get exchanged.
00:15:36.22		There are some examples of that that have been reported in the literature.
00:15:41.29		So this is certainly an event that can happen.
00:15:45.05		We think however that it is more likely that much of the mosaicism that you see
00:15:51.02		because it is this pervasive feature throughout phage genomes
00:15:55.07		can occur by an alternative mechanism which is by illegitimate recombination
00:15:59.19		at what are essentially randomly chosen sequences.
00:16:04.04		In other words, that even though we see a close correspondence
00:16:08.08		between the point of recombination and the gene boundaries,
00:16:12.03		this does not result by this model from targeted exchange at that point.
00:16:19.03		Rather that the exchange positions are random
00:16:22.28		and the reason why that correspondence occurs is because of
00:16:26.12		selection for gene function for those genes that can actually work.
00:16:32.03		And so this just illustrates the different types of examples of recombination.
00:16:39.08		In the top panel one could imagine the targeted recombination, targeted homologous recombination,
00:16:45.03		could occur at these short black segments. Short segments of DNA that are conserved at gene boundaries
00:16:53.28		in order to give you these exchange events in these recombinants.
00:16:57.08		This middle panel here shows an example of illegitimate recombination
00:17:02.06		where recombination has essentially happened anywhere.
00:17:05.20		It has happened between sequences that are not related to each other.
00:17:09.02		And you get whatever gobbledygook may arise from just a random exchange in the process.
00:17:15.15		And at the bottom here I want to emphasize that we do expect recombination to occur between shared sequences
00:17:25.04		such as whole genes that are shared.
00:17:27.01		Homologous recombination of this sort always happens,
00:17:31.19		and it gives you new combinations of flanking genes,
00:17:34.28		such as A now joined together with C.
00:17:38.03		Ok. So homologous recombination is always going to play a role in reassorting
00:17:43.07		the types of genes that can be present in the modules.
00:17:47.03		But homologous recombination of this general type does not generate new recombinant boundaries,
00:17:55.18		new module boundaries unless it is in this targeted approach.
00:18:00.27		So as I mentioned we think that whereas there are a small number of examples
00:18:06.26		that would support the exchange of boundary sequences,
00:18:11.02		by far the majority of the boundaries that we see when we compare phage genomes
00:18:16.13		show no evidence of such boundary sequences, lending support to the idea that illegitimate recombination
00:18:23.27		is playing a key role. But there are some really important consequences
00:18:28.26		that we have to think about as a model for illegitimate recombination in this process.
00:18:33.16		First of all, illegitimate recombination, recombination between sequences
00:18:38.15		that don't share anything or very little in common,
00:18:41.11		is likely to happen at rather low frequencies.
00:18:45.02		It is going to occur at random positions, for the most part,
00:18:49.26		and that when you put together two pieces of DNA randomly,
00:18:55.00		for the most part it is just going to generate genomic garbage.
00:18:59.17		Material which may not have a genome of the appropriate length,
00:19:04.24		and will have lost some genes and is liable to be non-functional.
00:19:11.03		So in its essence we can think of it as a rather disruptive or destructive type of process.
00:19:19.21		And one can imagine that if this was going to play an important role,
00:19:24.24		that you would probably need multiple low frequency events in order to actually generate survivors,
00:19:34.07		the phoenix that can rise from the ashes with a full complement
00:19:37.04		of functional sequences that can function as a virus.
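
The argument that random junctions mostly produce garbage, and that selection leaves only the junctions at gene boundaries, can be illustrated with a deliberately crude simulation. Everything here is an assumption made for the example: the toy gene coordinates, the function names, and above all the survival rule, which counts a recombinant as a survivor only if neither random cut point falls inside a gene. Genome length, gene loss, and actual gene function are ignored.

```python
import random


def in_gene(position, genes):
    """True if the position lies inside any (start, end) gene; end is exclusive."""
    return any(start <= position < end for start, end in genes)


def junction_survival_fraction(genes_a, genes_b, length_a, length_b,
                               trials, seed=0):
    """Fraction of random junctions that interrupt no gene in either parent."""
    rng = random.Random(seed)
    survivors = 0
    for _ in range(trials):
        cut_a = rng.randrange(length_a)
        cut_b = rng.randrange(length_b)
        if not in_gene(cut_a, genes_a) and not in_gene(cut_b, genes_b):
            survivors += 1
    return survivors / trials


# Hypothetical parents: 1,000 bases each, 95% of which is coding.
dense_genes = [(0, 475), (500, 975)]

fraction = junction_survival_fraction(dense_genes, dense_genes,
                                      1000, 1000, trials=10_000)
print(fraction)
```

With 95% of each genome coding, only a small fraction of random junctions avoid both sets of genes in this model, and every survivor by construction has its junction between genes. That is the sense in which random exchange plus selection can yield boundaries that look targeted.
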
00:19:41.27		If sequences are going to recombine randomly with each other
00:19:49.07		then there is no necessity to think of these events as being predominantly involving two phage genomes.
00:19:58.12		The bacterial chromosome is about a hundred times the size of an average bacteriophage genome,
00:20:03.22		and therefore there is going to be a strong propensity or at least an opportunity
00:20:08.05		for the phage genome to recombine with the bacterial chromosome.
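
The size argument can be put as simple arithmetic. The sketch below assumes, purely for illustration, that the partner DNA for a random recombination event is chosen in proportion to the amount of each kind of DNA in the cell, with one bacterial chromosome and a chosen number of other phage genomes present; the function name and this proportionality rule are assumptions, not a measured result.

```python
def host_partner_fraction(host_to_phage_size_ratio, other_phage_genomes=1):
    """Share of available partner DNA that is bacterial chromosome,
    measured in units of one phage genome length."""
    host_dna = host_to_phage_size_ratio
    phage_dna = other_phage_genomes
    return host_dna / (host_dna + phage_dna)


# Chromosome about a hundred times the size of an average phage genome.
print(host_partner_fraction(100))
```

Under these assumptions, with a ratio of 100 and one other phage genome in the cell, the chromosome makes up 100/101 of the available partner DNA.
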
00:20:13.21		The process we can think of as being one that is infrequent and yet extremely creative.
00:20:22.08		This is the way in which you can take pieces of DNA
00:20:26.08		and put them together in a way which has perhaps never been seen before in nature.
00:20:32.10		That's a way of making new genes, or perhaps putting domains together in novel combinations,
00:20:40.06		and generating new types of functions which perhaps have not been seen in nature before.
00:20:47.08		And so this fits in very much with our model as described by Darwin for the process of the origin of species,
00:20:58.09		where we can think of the variation being generated by these illegitimate recombination events
00:21:05.26		and then natural selection working on what is essentially this garbage
00:21:11.14		in order to select from that those components that work.
00:21:16.15		Even though we would think of this as being a very low frequency event,
00:21:21.25		requiring infrequent recombination events and multiple numbers of them, it is nonetheless creative.
00:21:30.27		And as we saw previously, phages are likely to have been evolving for many, many years
00:21:40.19		in a very dynamic population very successfully.
00:21:44.26		So this will give us these recombinant joints.
00:21:49.09		These recombinant joints once they are formed are likely to be stably maintained.
00:21:53.07		There's no mechanism necessarily for undoing them and therefore
00:21:57.14		these survive as we see today as the fossilized relics of recombination events
00:22:02.29		that probably happened many hundreds of millions or even billions of years ago.
00:22:08.27		And thinking about the mechanisms by which this might happen,
00:22:12.00		it's been shown that many bacteriophages encode recombinase enzymes
00:22:19.25		which have the capability to recombine genomes at least at very short sequences
00:22:26.23		that don't have to be completely identical to each other,
00:22:32.00		raising the interesting possibility that bacteriophages actually encode their own machinery
00:22:36.20		that can facilitate this type of recombination,
00:22:40.27		and indeed the generation of the mosaic genomes as we see them.
00:22:44.10		Comparing bacteriophages that are very different in their sequences
00:22:53.03		to each other is quite limited in what it can tell us, and this shows us that if we really want to learn more
00:23:00.14		about the details about how mosaicism is created and how it works,
00:23:04.29		we really have to think, and very carefully, about what types of genomes we want to compare with each other.
00:23:12.04		And we will see an example of that in part three.
00:23:15.09		So, from this genomic comparison of phages,
00:23:24.14		we can conclude that phage genomes are architecturally mosaic.
00:23:28.15		That mosaicism is fueled by this process of illegitimate recombination.
00:23:32.27		And that genome segments can eventually be reassorted by homologous recombination
00:23:39.13		once new joints between new genes are generated to form that mosaicism.
00:23:45.09		In part three, we'll look at a rather particular case
00:23:49.14		of the detailed analysis of bacteriophages that infect one particular common host
00:23:54.27		where all those bacteriophages can be argued
00:23:58.00		to be potentially in genetic communication with each other,
00:24:01.21		and we can therefore explore what they look like
00:24:04.06		and the insights that they can give us in bacteriophage evolution.

This material is based upon work supported by the National Science Foundation and the National Institute of General Medical Sciences under Grant No. 2122350 and 1 R25 GM139147. Any opinion, finding, conclusion, or recommendation expressed in these videos are solely those of the speakers and do not necessarily represent the views of the Science Communication Lab/iBiology, the National Science Foundation, the National Institutes of Health, or other Science Communication Lab funders.

© 2023 - 2006 iBiology · All content under CC BY-NC-ND 3.0 license