Genomics and Cell Biology of the Apicomplexa

Transcript of Part 4: Designing and Mining Pathogen Genome Databases: From Genes to Drugs and Vaccines II

00:00:02.06	We're back,
00:00:03.29	and we're now looking live at the Plasmodium genome database
00:00:06.26	at PlasmoDB.org.
00:00:10.16	And before we turn to the question
00:00:13.10	that we raised on trying to identify candidate vaccine targets
00:00:17.06	for malaria,
00:00:19.16	let me just provide a little bit of context
00:00:22.26	that may be a little easier to see
00:00:26.05	than what we'd seen before
00:00:28.29	in those canned screen dumps.
00:00:32.10	The first point I'd like to make is that
00:00:34.25	the success of the Plasmodium genome database
00:00:37.12	has been such that it has led to
00:00:40.13	its expansion to encompass a variety of other organisms.
00:00:43.29	The PlasmoDB project
00:00:47.10	morphed into the Apicomplexan Genome Database,
00:00:50.06	ApiDB,
00:00:52.10	which was itself expanded still further
00:00:54.03	into the Eukaryotic Pathogen Genome Database,
00:00:58.17	encompassing a wide range of organisms --
00:01:02.28	not only apicomplexan parasites,
00:01:05.28	such as Cryptosporidium and Plasmodium
00:01:08.18	and Toxoplasma and Theileria,
00:01:10.20	but also other species as well,
00:01:12.23	such as Giardia and Trichomonas,
00:01:14.24	which we won't be talking about today.
00:01:17.02	This project is in fact just part of a larger
00:01:20.22	Bioinformatics Resource Center project
00:01:23.04	funded by the US NIAID
00:01:25.09	that includes several different genome databases from...
00:01:30.11	dealing with a variety of pathogens.
00:01:33.00	And those of you who are interested in other pathogens
00:01:36.11	might want to explore this further.
00:01:38.24	Now, the purpose of having an overar...
00:01:40.28	overarching website for exploration
00:01:43.02	not only of malaria parasites
00:01:45.06	but many eukaryotic pathogens
00:01:48.05	is that there are a variety of questions
00:01:49.28	that you might want to ask
00:01:52.12	that extend beyond an individual species.
00:01:57.21	And these are several ways that you can explore that.
00:02:00.02	I should also point out that this homepage
00:02:02.13	also provides, in addition to links
00:02:05.02	to some of these other pages,
00:02:07.10	other bits of information.
00:02:09.00	You might, for example, be interested in tutorials
00:02:11.18	that highlight some of the features
00:02:13.15	we'll be talking about.
00:02:15.11	And further down the page, here,
00:02:17.02	you can see links to...
00:02:19.06	links to individual tutorials,
00:02:21.14	links to publications,
00:02:23.09	workshops, exercises, and so on.
00:02:27.25	We can run questions across
00:02:31.00	a variety of these organisms.
00:02:32.17	If we were interested in apicomplexan parasites, for example,
00:02:34.22	we might want to take a look at metabolic pathway maps
00:02:37.18	for these organisms.
00:02:39.20	And in this case, we've taken annotations
00:02:43.01	from the Plasmodium genome database,
00:02:45.11	the Toxoplasma genome database,
00:02:47.08	the Cryptosporidium genome database,
00:02:49.20	and mapped those on top of
00:02:52.23	the KEGG metabolic pathway projects
00:02:54.29	emerging from database projects in Japan.
00:02:57.18	If, for example, we take a look at carbohydrate metabolism,
00:02:59.28	and dive in further to see the glycolytic pathway,
00:03:03.06	the key to this analysis is indicated up at the top,
00:03:07.14	in which Toxoplasma is indicated in red,
00:03:09.20	Plasmodium is indicated in green,
00:03:12.24	Cryptosporidium in yellow,
00:03:14.16	and human in blue.
00:03:16.09	And so, we can see, by looking at the painting
00:03:18.14	of this metabolic pathway,
00:03:20.01	that from top to bottom
00:03:22.22	all of these organisms are capable
00:03:25.06	of carrying out glycolysis.
00:03:27.16	Now, that might not sound very surprising.
00:03:29.28	But we can dive in a little bit deeper
00:03:32.11	and take a look, for example, at the TCA cycle.
00:03:34.25	And now we see a somewhat different pattern,
00:03:37.05	in which the yellow bug
00:03:39.21	-- in this case, Cryptosporidium --
00:03:42.12	doesn't do this pathway.
00:03:45.03	And indeed, that's the case.
00:03:47.02	Cryptosporidium is an anaerobe, which doesn't carry out...
00:03:49.21	which doesn't carry out oxidative phosphorylation.
00:03:54.18	There are many other pathways we could look at
00:03:58.18	if we were interested, for example,
00:04:00.17	in some of those metabolic pathways,
00:04:02.16	which we now know are associated with the apicoplast --
00:04:04.27	for example, pathways involved in the biosynthesis of steroids.
00:04:08.25	We could see that purely from the pattern
00:04:11.26	of gene presence and absence,
00:04:14.07	the red and green organisms
00:04:17.28	-- Toxoplasma and Plasmodium --
00:04:19.29	clearly use a different pathway
00:04:22.24	for synthesizing isoprenoids
00:04:24.25	than the blue organism -- human.
00:04:26.13	And indeed, this is the case.
00:04:28.10	One of the most striking findings from the biochemistry of the apicoplast
00:04:31.27	is that these parasites synthesize isoprenoid subunits
00:04:37.12	via a deoxyxylulose pathway
00:04:40.15	typically associated with chloroplasts --
00:04:42.25	quite distinct from the HMG CoA-reductase pathway
00:04:45.26	found in humans.
00:04:48.00	Cryptosporidium does neither,
00:04:49.29	presumably salvaging isoprenoid units,
00:04:52.17	which it can use apparently
00:04:54.24	to produce squalene,
00:04:56.15	but none of these organisms is capable of converting that squalene
00:05:00.10	all the way into cholesterol,
00:05:02.25	so this is not a sterol biosynthesis pathway,
00:05:05.26	but it's certainly a pathway for the production
00:05:08.16	of isoprenoid precursors.
00:05:11.18	And there are many other fascinating aspects of parasite biochemistry
00:05:15.06	to take a look at.
00:05:17.00	So, let's return to the Eukaryotic Pathogen Genome Database,
00:05:20.18	and return further to the Plasmodium genome database
00:05:23.20	that we...
00:05:25.08	that is the subject of our discussion today.
00:05:28.22	Now, as you've already seen,
00:05:30.18	we can explore this database in many different ways.
00:05:33.01	We can look, for example, at individual genes,
00:05:36.09	and we'll just take a look at a single gene listed here,
00:05:38.29	the default gene on the...
00:05:41.09	on this page
00:05:44.06	known as the apical membrane antigen 1,
00:05:46.27	a famous gene in the world of malaria biology
00:05:49.23	because this has been advanced as one of the leading
00:05:55.19	vaccine candidates for malaria parasites,
00:05:58.07	although there are a variety of concerns about AMA1
00:06:01.04	which lead investigators
00:06:04.10	to be interested in identifying other candidates
00:06:07.09	that might also be worth exploration.
00:06:10.02	We can see that AMA1 is present on chromosome 11
00:06:13.05	and, as you've already seen in the illustration
00:06:15.15	from a chromosome-based view,
00:06:17.23	is a highly polymorphic antigen.
00:06:20.22	Many dozens of polymorphisms
00:06:23.16	known to be associated with this gene,
00:06:25.10	and we can see what those polymorphisms are
00:06:28.09	from different species.
00:06:29.24	We can see, for example, that this particular polymorphism
00:06:31.26	changes the coding potential,
00:06:34.13	such that in the reference 3D7 strain
00:06:36.27	the nucleotide C corresponds to a proline
00:06:39.26	whereas in many of the other species on this list...
00:06:42.29	many of the other isolates...
00:06:46.03	a T... C-to-T nucleotide polymorphism
00:06:49.25	results in a proline-to-serine mutation.
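
The coding change described here can be checked against the standard genetic code, in which all four CCN codons encode proline and all four TCN codons encode serine. The sketch below is only an illustration: the reference codon is hypothetical, since the talk does not give the actual AMA1 codon, and the table is deliberately limited to the proline and serine codons.

```python
# Minimal codon table covering only the proline (CCN) and serine (TCN) codons.
CODON_TABLE = {
    "CCA": "Pro", "CCC": "Pro", "CCG": "Pro", "CCT": "Pro",
    "TCA": "Ser", "TCC": "Ser", "TCG": "Ser", "TCT": "Ser",
}


def substitute(codon, position, new_base):
    """Return the codon with the base at `position` (0-based) replaced."""
    return codon[:position] + new_base + codon[position + 1:]


def classify_snp(ref_codon, position, new_base):
    """Report the amino acid change caused by a single-base substitution."""
    alt_codon = substitute(ref_codon, position, new_base)
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind


# Hypothetical reference codon; the real AMA1 codon is not stated in the talk.
print(classify_snp("CCA", 0, "T"))  # C-to-T at the first codon position
print(classify_snp("CCA", 2, "T"))  # a third-position change, for contrast
```

Only a C-to-T change at the first codon position converts proline to serine; the third-position change in the second call leaves the amino acid unchanged.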
00:06:52.07	We can see user comments which have been entered,
00:06:55.28	providing additional information on these genes;
00:06:58.01	links to a variety of other gene pages;
00:07:00.28	protein features which have been identified
00:07:04.13	by a variety of means;
00:07:06.14	predicted structural information;
00:07:09.09	proteomic data indicating
00:07:12.21	that there's evidence for expression at the protein level;
00:07:15.12	microarray analysis on several different platforms
00:07:18.15	indicating that this gene
00:07:21.24	is most abundantly expressed late in the intraerythrocytic life cycle,
00:07:25.25	as one might expect for a gene
00:07:28.06	that is present in merozoites, the extracellular stage,
00:07:32.22	that one might want to target in a vaccine
00:07:36.16	that would be effective against
00:07:39.20	the disease-causing stage of malaria parasites;
00:07:43.24	additional information from other expression studies,
00:07:46.14	from knockout studies;
00:07:48.16	sequence information that can be shown here;
00:07:51.06	and so forth.
00:07:53.20	But as we described ear... discussed earlier,
00:07:56.21	the real power of this database
00:07:59.15	comes not solely from viewing it as a catalogue
00:08:02.25	of available information
00:08:05.08	but as an opportunity for being able
00:08:09.03	to ask your own questions.
00:08:11.28	So, what kinds of questions can we ask?
00:08:14.03	Here, under the queries and tools link
00:08:18.00	indicated at the upper left-hand corner of your screen,
00:08:21.26	we can see a grid describing
00:08:24.05	a wide range of questions
00:08:26.16	that one might choose to ask.
00:08:28.18	For example, we might imagine that chromosomal location
00:08:32.26	was in some way informative for candidate vaccine targets.
00:08:36.07	I'm not quite sure how that would work;
00:08:38.21	it's not really clear to me
00:08:41.24	how proximity to a centromere
00:08:44.00	might be indicative of a good target for vaccine development,
00:08:47.03	so I'm not going to pursue that line of inquiry,
00:08:49.27	but you might want to if you have reason for thinking
00:08:53.15	that chromosomal location is informative
00:08:55.21	for effective vaccine targets.
00:08:58.02	Let's instead start with some more obvious kinds of approaches.
00:09:02.18	We certainly would expect that a target
00:09:05.27	for vaccine development
00:09:08.14	would have to be antigenic in some way.
00:09:10.20	And so, here we can take advantage
00:09:13.10	of extensively curated information
00:09:15.26	that comes from the Immune Epitope Database Project,
00:09:18.05	whose research on other databases
00:09:20.21	has been incorporated into this database.
00:09:22.28	And we can look for genes that have been annotated
00:09:25.06	with a very high confidence of antigenic function
00:09:29.15	against Plasmodium falciparum.
00:09:32.05	And what we see is several genes --
00:09:35.13	41 to be exact.
00:09:38.18	Now, that's a disappointingly small number.
00:09:41.10	It includes the merozoite surface protein 1, which,
00:09:45.10	with AMA1,
00:09:47.15	is also viewed as a promising candidate for antimalarial vaccine development,
00:09:50.20	and a variety of other merozoite surface proteins,
00:09:53.01	as one might expect.
00:09:55.23	But, surely, there must be more than 41 candidate targets
00:10:00.20	in a genome of many thousands of genes.
00:10:03.29	So, let's modify this query
00:10:07.06	to ask a different question,
00:10:10.00	ask not just those antigens
00:10:12.22	that have a high confidence of immune reactivity
00:10:17.04	but those that have any confidence of immune reactivity.
00:10:21.09	And now we come up with a much larger list of genes,
00:10:24.05	which may have lower...
00:10:26.06	for which we may have lower confidence,
00:10:28.12	but things that we might want to explore further.
00:10:32.12	What else might we want to ask?
00:10:34.15	We'll return to our query grid here
00:10:36.29	and ask about other information.
00:10:38.27	So, we've asked about genes
00:10:41.19	that show some evidence -- based on manual curation --
00:10:44.09	of being effective epitopes.
00:10:47.08	We might also imagine that genes
00:10:49.20	that would be effective targets
00:10:53.10	for vaccine development
00:10:58.13	would have to be expressed in the right place and at the right time.
00:11:03.08	By in the right place, we mean presumably
00:11:05.27	on the surface of an infected red blood cell
00:11:08.22	or on the surface of the parasite itself,
00:11:11.08	and we can gauge that information
00:11:14.11	by looking at cellular location, here.
00:11:16.17	We know that signal peptides
00:11:18.29	are likely to be involved in targeting proteins outside of the cell,
00:11:21.10	so let's ask for all proteins in a malaria parasite
00:11:24.01	that have a predicted signal sequence.
00:11:27.07	And we're interested, in this case, in Plasmodium falciparum,
00:11:29.24	although we could interrogate other malaria parasites as well.
00:11:33.04	And when we ask a question like that,
00:11:35.19	we get a list of many hundreds of genes,
00:11:39.05	including genes that certainly wouldn't be at the top
00:11:43.01	of anyone's list as a vaccine target,
00:11:45.28	such as this pseudogene that's listed...
00:11:48.04	that's indicated here.
00:11:50.28	Interestingly, as I look at this number,
00:11:54.02	while we find many hundreds of genes,
00:11:56.25	there are not that many hundreds of genes.
00:11:59.01	I would have naively expected that for an organism
00:12:01.12	that makes its living by secreting proteins to modify the host cell,
00:12:05.14	using those specialized apical secretory organelles
00:12:08.20	that we discussed in the first lecture of the series,
00:12:11.24	surely more than 10% of the parasite genome would be...
00:12:16.10	would be secreted.
00:12:19.08	What could possibly account for this...
00:12:22.19	for this shortfall?
00:12:24.22	These organisms, remember, are eukaryotic organisms.
00:12:27.26	And this brings us face to face
00:12:30.08	with the bane of genome annotation
00:12:34.10	in eukaryotic species,
00:12:36.23	and that is the following...
00:12:39.08	that while we are quite good at identifying coding sequence,
00:12:42.19	it's quite difficult to identify every single exon
00:12:46.19	that is encoded into a...
00:12:48.25	that's translated into a protein in eukaryotic species.
00:12:52.02	And that's particularly true
00:12:56.22	at the extreme 5' end of the gene;
00:12:59.03	the first exon is the most difficult to identify.
00:13:02.01	And that, in turn, manifests itself
00:13:04.24	as an inability to accurately predict signal sequences.
00:13:10.10	So, we can imagine expanding our search
00:13:13.23	a little more broadly
00:13:16.15	to identify more proteins.
00:13:18.10	Let's imagine, for example,
00:13:20.10	if we go back to the grid of questions that we've asked,
00:13:22.29	that we might want to ask
00:13:26.03	not only for proteins that have a recognizable signal peptide
00:13:29.27	but also for proteins that have recognizable transmembrane domains,
00:13:34.12	anticipating that those that have a transmembrane domain
00:13:36.19	without a signal sequence
00:13:39.14	are probably proteins for which we weren't really able
00:13:42.10	to recognize the signal sequence accurately.
00:13:44.29	Once again, we will look only in Plasmodium falciparum.
00:13:47.24	I'm not interested in proteins
00:13:50.06	with two or ten transmembrane domains.
00:13:52.06	I'm really interested in proteins that have at least one...
00:13:54.12	I don't care how many...
00:13:56.05	you know, at least one,
00:13:58.01	and no more than a thousand transmembrane domains.
00:14:00.22	And now we see a slightly larger number --
00:14:03.11	actually, about double the number... 1,700+ proteins.
00:14:07.23	Now, some of those proteins...
00:14:09.29	so, this will presumably include many of those proteins
00:14:12.28	with signal peptides.
00:14:14.17	Some of them will be secreted without a transmembrane domain.
00:14:17.01	But it includes many other proteins
00:14:19.14	that we have some confidence are associated at least with a membrane,
00:14:22.09	although we have no confidence that it's associated
00:14:26.03	with the surface membrane of those proteins.
00:14:28.18	Now, those of you with sharp eyes may have noticed
00:14:31.24	that over on the far left-hand end of the screen
00:14:34.28	is a box indicated as "My Query History,"
00:14:38.17	and this is a history of all of the questions
00:14:42.27	that we've asked in the context of this session.
00:14:45.27	We asked, first of all,
00:14:48.14	for this individual gene, AMA1.
00:14:51.06	Secondly, we asked for the...
00:14:53.24	for the high confidence epitopes,
00:14:56.28	and here for epitopes with even low confidence,
00:15:01.28	proteins with signal peptides or with transmembrane domains.
00:15:04.22	And what we're really interested in for...
00:15:07.08	from the standpoint of location
00:15:11.13	is genes that have either a signal peptide or a transmembrane domain,
00:15:14.24	and so I'm going to combine these queries using a combination, here,
00:15:18.14	to look for the results of prote...
00:15:20.22	of our search for a signal peptide --
00:15:23.22	that is, query 4 --
00:15:26.01	or a transmembrane domain.
00:15:28.10	And the result that I get will be a much larger set
00:15:32.16	-- or a somewhat larger set --
00:15:35.04	of about 2,000 proteins that have either a signal peptide
00:15:37.05	or a transmembrane domain.
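
Combining two saved queries with "or" amounts to a set union over gene identifiers. The following is a minimal sketch, assuming each query result is available as a set of IDs; the gene names are made up for illustration and are not real PlasmoDB identifiers.

```python
# Hypothetical gene IDs standing in for the two saved query results.
signal_peptide = {"gene_A", "gene_B", "gene_C"}
transmembrane = {"gene_B", "gene_C", "gene_D", "gene_E"}

# "Signal peptide OR transmembrane domain" is a set union.
sp_or_tm = signal_peptide | transmembrane

# Genes with a transmembrane domain but no predicted signal peptide: the
# group suspected in the talk of having a missed first exon.
tm_without_sp = transmembrane - signal_peptide

# The union is smaller than the sum of the two lists because genes found
# by both queries are counted once.
overlap = signal_peptide & transmembrane
assert len(sp_or_tm) == len(signal_peptide) + len(transmembrane) - len(overlap)

print(sorted(sp_or_tm))
print(sorted(tm_without_sp))
```

This is why the combined result is only somewhat larger than the transmembrane list alone: many signal-peptide genes already appear in it.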
00:15:38.23	Once again, it includes many proteins
00:15:41.09	that I don't think any of us would advocate
00:15:43.07	as vaccine targets...
00:15:45.02	the cytochrome oxidase genes
00:15:46.28	associated presumably with the parasite mitochondrion.
00:15:49.24	But we can see, now, the results of this query,
00:15:51.28	a new question which I'm going to rename,
00:15:53.27	just so I don't lose track of it.
00:15:56.02	And I'll just call this "signal peptide or transmembrane domain,"
00:16:01.00	just so I don't forget about what the question is that I've asked.
00:16:06.01	So, we can imagine a wide range of other questions,
00:16:08.11	and I would encourage any of you who have questions
00:16:11.02	you would like to ask
00:16:13.28	to explore this query grid
00:16:17.15	for accessible questions that may be relevant to the ways
00:16:22.06	that you choose to interrogate the database.
00:16:24.17	We might also want to know about proteins
00:16:27.07	that are not only in the right place on the surface
00:16:30.00	but also at the right time.
00:16:31.26	Remember that intraerythrocytic life cycle --
00:16:34.16	in which a parasite invades into an erythrocyte,
00:16:38.21	sets up its home as a ring stage parasite,
00:16:40.25	develops and metabolizes
00:16:43.09	and grows as a trophozoite,
00:16:44.28	finally emerging by assembling daughter parasites
00:16:47.18	as a schizont,
00:16:49.15	before rupturing outside of the red blood cell to release merozoites --
00:16:53.08	we might expect that if we were interested in a vaccine that targeted
00:16:56.27	the red blood cell stage of malaria
00:17:00.12	that's responsible for the clinical symptoms,
00:17:03.04	we'd be interested in targeting those merozoites,
00:17:05.27	a very short-lived form about which it's difficult
00:17:09.14	to gather detailed information.
00:17:11.16	We could ask, for example, for proteomic da...
00:17:14.05	for protein data,
00:17:16.29	looking for mass spec-based data...
00:17:19.27	evidence of expression on merozoites.
00:17:23.03	And you may wish to explore that...
00:17:25.10	the datasets associated with transcription,
00:17:28.03	some of which were described in Joseph DeRisi's iBio seminar
00:17:31.27	on malaria...
00:17:35.05	are probably more extensive and more...
00:17:37.13	and more comprehensive.
00:17:39.04	So, I'm going to, instead,
00:17:42.13	interrogate the expression profile for... from trans...
00:17:46.22	from transcript levels.
00:17:49.25	And there are a number of queries that can be used
00:17:52.14	against various different organisms using various different datasets.
00:17:54.17	Since you may be familiar with the data set generated in the DeRisi lab,
00:17:57.05	we'll take a look at that data here,
00:17:59.03	looking at expression timing
00:18:01.21	based on glass slide microarrays,
00:18:03.20	although there are other ways that you can interrogate this data as well.
00:18:06.15	Now, we've already seen,
00:18:09.01	from looking at the expression profile of AMA1,
00:18:11.24	that the transcripts were most abundant
00:18:15.17	towards the end of that 48-hour window
00:18:18.22	of replication inside an erythrocyte.
00:18:21.00	And that makes sense if you imagine
00:18:23.23	that transcription is going to precede translation,
00:18:26.17	and so we might imagine that in that schi...
00:18:29.12	stage of schizogony,
00:18:32.10	proteins are most like... is the most likely time to transcribe genes
00:18:36.24	that are going to be translated for protein in merozoites.
00:18:40.21	And so, I'm going to ask for genes
00:18:42.26	that are maximally expressed at...
00:18:45.18	in the last third of that intraerythrocytic...
00:18:48.28	of that intraerythrocytic life cycle,
00:18:51.20	that is, the last 16 hours.
00:18:53.13	In other words, we're looking for things...
00:18:56.06	genes where expression is maximal at
00:19:01.03	40 plus or minus 8 hours.
00:19:04.01	I don't care when the gene is turned off.
00:19:06.11	But I'm going to look for genes that are upregulated by 4-fold
00:19:11.01	-- you can change these parameters if you wish --
00:19:14.03	and that are reasonably abundant,
00:19:16.19	let's say in the top 60th percentile
00:19:20.08	of all genes in the genome.
00:19:23.16	And now we'll run this query, and presumably return hundreds of genes that are...
00:19:28.26	that fulfill those criteria --
00:19:31.24	600 genes in this particular question.
00:19:38.04	600 genes.
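
The expression-timing question can be pictured as a filter over time-course profiles. This is a sketch under stated assumptions, not the database's own implementation: fold induction is taken here as maximum over minimum expression, the abundance percentile is treated as a precomputed number where higher means more abundant, and all profiles are invented.

```python
from dataclasses import dataclass


@dataclass
class Profile:
    gene_id: str
    hours: list        # sampling times, hours post-invasion
    levels: list       # expression at each time point (linear scale, > 0)
    percentile: float  # hypothetical abundance percentile, 0-100


def peaks_late(p, center=40.0, window=8.0, min_fold=4.0, min_percentile=60.0):
    """True if expression peaks within center +/- window hours, is induced
    at least min_fold over its minimum, and is abundant enough."""
    peak_level = max(p.levels)
    peak_hour = p.hours[p.levels.index(peak_level)]
    fold = peak_level / min(p.levels)
    return (abs(peak_hour - center) <= window
            and fold >= min_fold
            and p.percentile >= min_percentile)


HOURS = [8, 16, 24, 32, 40, 48]
profiles = [
    Profile("gene_late", HOURS, [10, 12, 15, 40, 90, 60], 85.0),
    Profile("gene_early", HOURS, [90, 60, 30, 15, 12, 10], 85.0),
    Profile("gene_flat", HOURS, [20, 22, 21, 25, 30, 28], 85.0),
]

late_genes = [p.gene_id for p in profiles if peaks_late(p)]
print(late_genes)
```

The early-peaking gene fails the timing test and the flat gene fails the fold-induction test, so only the schizont-stage profile is kept. Loosening `min_fold` or `window` would return a longer list.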
00:19:39.14	And we can see, if I stand aside, the actual expression profile.
00:19:41.27	This hypothetical protein shows, indeed, the pattern that we expect:
00:19:46.16	maximally expressed towards the end of the intraerythrocytic life cycle
00:19:49.14	in all three of these strains
00:19:52.22	symbolized by the red, blue, and yellow curves.
00:19:56.21	Alright.
00:19:59.01	Are there other questions that we... that we might want to address?
00:20:03.05	Well, as a geneticist,
00:20:05.27	I guess I would be particularly interested in taking advantage of some
00:20:08.11	of the most exciting new datasets
00:20:10.16	that have emerged for malaria parasites,
00:20:12.24	from resequencing projects designed to assess the diversity of parasites
00:20:16.18	throughout the world.
00:20:18.15	And these have... and as a result of such studies,
00:20:21.28	we can identify polymorphisms,
00:20:24.08	single nucleotide polymorphisms, or SNPs,
00:20:28.04	that distinguish one gene from another.
00:20:30.09	And so, we'll consider comparing
00:20:35.11	any two strains of our choosing,
00:20:37.01	and I'm going to compare the reference strain, 3D7,
00:20:39.02	the strain whose complete genome was first sequenced,
00:20:43.01	with a field isolate,
00:20:46.24	a field isolate from Ghana, the GHANA1 strain.
00:20:49.23	And we can set our parameters in various different ways.
00:20:53.10	We could ask, for example,
00:20:55.16	for polymorphisms that are known to affect
00:20:59.00	coding potential
00:21:00.22	or for the density of polymorphisms.
00:21:02.25	Just to keep things simple,
00:21:04.22	I'm going to ask for any gene that has
00:21:07.08	at least five known polymorphisms.
00:21:10.06	But once again, you may want to manipulate these parameters.
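
Counting polymorphisms per gene between two strains can be sketched as follows. The SNP calls, positions, and gene names are hypothetical, and the real query offers more parameters (for example, coding versus non-coding changes) than this simple threshold.

```python
from collections import Counter

# Hypothetical SNP calls: (gene_id, position, allele in 3D7, allele in GHANA1).
snp_calls = [("gene_X", pos, "C", "T") for pos in (10, 55, 120, 300, 410)] + [
    ("gene_Y", 20, "A", "G"),
    ("gene_Y", 90, "G", "G"),   # identical in both strains: not a difference
    ("gene_Y", 150, "T", "C"),
]


def genes_with_min_snps(calls, min_snps=5):
    """Return genes with at least min_snps positions where the strains differ."""
    counts = Counter(gene for gene, _pos, ref, alt in calls if ref != alt)
    return {gene for gene, n in counts.items() if n >= min_snps}


print(sorted(genes_with_min_snps(snp_calls)))
print(sorted(genes_with_min_snps(snp_calls, min_snps=2)))
```

Lowering the threshold from five to two brings in the second gene, which shows how sensitive the size of the result is to this parameter.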
00:21:14.05	And asking a question like this
00:21:16.27	gives us back several hundred genes,
00:21:19.09	including, as we might expect,
00:21:21.16	the variant surface antigens, PfEMP1 genes
00:21:24.08	that I don't think are...
00:21:27.15	would be likely advocated as a single-subunit vaccine,
00:21:32.13	but certainly genes that are likely to be highly polymorphic.
00:21:39.10	There are many other questions you can consider asking,
00:21:41.22	and this is...
00:21:43.15	this has already become a fairly long session,
00:21:45.10	so I'm going to just limit myself to one more question,
00:21:47.09	a question related to the evolutionary biology of these parasites.
00:21:51.11	One might imagine,
00:21:54.08	if we were looking for candidate vaccine targets,
00:21:58.21	that we'd be interested in genes that are specific to malaria parasites,
00:22:03.16	so we can interrogate for genes
00:22:06.16	across the range of life,
00:22:08.25	for where those genes are found.
00:22:11.07	And we might imagine,
00:22:13.20	as we scroll down to look at eukaryotic organisms,
00:22:16.03	and the apicomplexa in particular,
00:22:18.10	that we'd be interested in genes that are found in Plasmodium falciparum -- of course --
00:22:22.15	and perhaps, if we wanted to consider a candidate target
00:22:25.07	with broad-spectrum activity,
00:22:27.17	we might want to look for genes that are also present in Plasmodium vivax,
00:22:30.12	the second leading cause of malaria in humans.
00:22:35.21	But we're certainly not interested in genes
00:22:38.04	that are present in humans.
00:22:39.22	And so, I'm going to...
00:22:41.10	I'm going to ask in this particular question
00:22:43.23	for genes that are absent from humans
00:22:45.27	or maybe absent from mammals altogether.
00:22:49.25	And running a question like this gives me a large fraction of the genome,
00:22:55.09	a third of the parasite genome,
00:22:57.07	which is distinctive in being present in Plasmodium falciparum.
00:23:00.11	Most of these proteins, we know nothing about...
00:23:02.21	hypothetical proteins.
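
A phylogenetic-profile question of this kind reduces to two tests on each gene: orthologs must be present in every required species and absent from every excluded one. The profiles below are invented for illustration; the species choices follow the talk.

```python
# Hypothetical profiles: gene -> species in which an ortholog is found.
ortholog_profiles = {
    "gene_A": {"P. falciparum", "P. vivax"},
    "gene_B": {"P. falciparum", "P. vivax", "H. sapiens"},
    "gene_C": {"P. falciparum"},
    "gene_D": {"P. falciparum", "P. vivax", "T. gondii"},
}


def matches_profile(species, required, excluded):
    """True if every required species is present and no excluded one is."""
    return required <= species and not (excluded & species)


required = {"P. falciparum", "P. vivax"}
excluded = {"H. sapiens"}

hits = sorted(
    gene for gene, species in ortholog_profiles.items()
    if matches_profile(species, required, excluded)
)
print(hits)
```

Species that are neither required nor excluded, such as Toxoplasma in this example, are treated as "don't care", so a gene shared with other apicomplexans still passes.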
00:23:04.18	So, let's return to our now rather long list of questions
00:23:09.13	that we suggest might be relevant
00:23:11.27	to vaccine development.
00:23:17.13	We've asked for proteins that have antigenicity.
00:23:23.00	That was our question number 3.
00:23:26.01	And I'm going to try to combine that information with the other questions I've asked.
00:23:32.16	I'm going to ask for proteins now,
00:23:34.25	not just for... not for the union of proteins with signal peptides
00:23:38.05	and transmembrane domains,
00:23:40.04	as we asked earlier,
00:23:41.25	but for the intersection of these various different queries.
00:23:43.21	I'm going to ask for genes
00:23:46.12	that have some level of immunogenicity
00:23:48.03	and also have either a signal peptide
00:23:50.17	or a transmembrane domain
00:23:53.16	-- so, that was my question number 6 --
00:23:56.01	were also present at the right time,
00:23:58.25	expressed abundantly in schizonts;
00:24:02.14	and were highly polymorphic,
00:24:05.24	indicating diversifying selection,
00:24:08.00	presumably under control of the immune system;
00:24:10.01	and also showed this evolutionary profile
00:24:13.15	that was present in these particular species.
00:24:16.02	So, this is a question
00:24:18.20	that I've actually never asked in exactly this same way,
00:24:21.11	although I've certainly run many similar sorts
00:24:25.25	of questions in the past.
00:24:27.28	And I can see that in this particular set of queries,
00:24:30.18	I come up with a list of 23 proteins.
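
The final step is an intersection across all of the saved queries: a gene survives only if it appears in every list. A minimal sketch with hypothetical gene IDs and query labels:

```python
# Hypothetical saved query results, keyed by a descriptive label.
queries = {
    "any-confidence epitope": {"gene_1", "gene_2", "gene_3", "gene_4"},
    "signal peptide or TM domain": {"gene_1", "gene_2", "gene_3", "gene_5"},
    "peak expression in schizonts": {"gene_1", "gene_2", "gene_6"},
    "at least five SNPs": {"gene_1", "gene_2", "gene_3"},
    "in P. falciparum and P. vivax, not human": {"gene_1", "gene_2", "gene_4"},
}

# A candidate must appear in every query result.
candidates = set.intersection(*queries.values())


def failed_filters(gene):
    """List the queries that a given gene does not satisfy."""
    return [name for name, genes in queries.items() if gene not in genes]


print(sorted(candidates))
print(failed_filters("gene_3"))
```

Checking which filters exclude a gene of interest is a useful way to see whether a candidate was dropped for a biological reason or because of an annotation gap such as a missed signal peptide.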
00:24:34.15	Let me turn off this track ind...
00:24:38.20	showing the expression profiling
00:24:40.26	so we can see these a little bit better,
00:24:42.23	and I'm going to display all of them on one page.
00:24:47.05	And now, we can ask a little more readily
00:24:50.04	about the various proteins we've looked at.
00:24:53.02	So, let's scroll down the list.
00:24:57.01	First on the list -- just first alphabetically --
00:24:58.27	is a hypothetical protein.
00:25:00.23	It's a conserved hypothetical protein.
00:25:02.21	Is this a vaccine antigen... I don't know.
00:25:05.14	But my eye is immediately drawn to what you might think of
00:25:08.24	for this computational experiment as a positive control,
00:25:13.20	that AMA1 protein,
00:25:16.22	the protein that is one of the leading vaccine targets
00:25:19.00	for antimalarial vaccine development.
00:25:22.00	A number of other proteins: a guanylyl cyclase, a kinase,
00:25:26.09	several other hypothetical proteins.
00:25:28.22	It's hard for me to believe that a tRNA ligase
00:25:31.17	would be a good vaccine candidate,
00:25:34.03	but here's the second of my positive controls,
00:25:37.06	MSP1, the other of these leading candidates
00:25:40.06	for an intraerythrocytic or an erythrocytic stage vaccine,
00:25:44.05	and several other proteins which have certainly been considered.
00:25:47.14	This CLAG9 protein has been advanced, for example,
00:25:50.12	as a candidate target for vaccine development.
00:25:54.10	So, my point here is not to argue
00:25:57.16	that computational approaches,
00:26:00.03	considered in and of themselves,
00:26:02.06	are ever going to be sufficient
00:26:05.07	for identifying successful vaccine targets.
00:26:07.29	That would be as absurd as saying
00:26:11.23	that we can identify the function of the apicoplast
00:26:14.24	solely by using cell biological approaches
00:26:19.08	of organelle purification without any biochemical or genetic characterization.
00:26:25.09	But certainly, in a few minutes sitting here at the computer,
00:26:28.04	we've been able to filter the many thousands of genes
00:26:32.24	in the parasite genome
00:26:35.13	down to a rather short list, a list of 23,
00:26:37.12	that includes both of our positive control antigens,
00:26:41.04	AMA1 and MSP1.
00:26:43.05	And I would imagine that if I were interested
00:26:46.02	in vaccine development for malaria,
00:26:49.01	I would certainly want to explore further
00:26:51.22	the other 21 proteins on this list,
00:26:54.07	as a manageable set that might be worth exploring
00:26:57.16	for candidate genes
00:27:01.02	that may be as good as or even better
00:27:04.25	than AMA1 or MSP1 as vaccine targets
00:27:08.23	for antimalarial development.

This material is based upon work supported by the National Science Foundation and the National Institute of General Medical Sciences under Grant No. 2122350 and 1 R25 GM139147. Any opinions, findings, conclusions, or recommendations expressed in these videos are solely those of the speakers and do not necessarily represent the views of the Science Communication Lab/iBiology, the National Science Foundation, the National Institutes of Health, or other Science Communication Lab funders.

© 2023 - 2006 iBiology · All content under CC BY-NC-ND 3.0 license · Privacy Policy · Terms of Use · Usage Policy
 
