Session 6: Synthetic Biology and Metabolic Engineering

Transcript of Part 2: Teaching an Old Bacterium New Tricks

00:00:06.07		My name is Kristala Prather and I'm an associate professor of chemical engineering at MIT.
00:00:11.18		In the first part of my presentation, I gave an overview of metabolic engineering and synthetic biology
00:00:16.23		and now I'd like to talk more specifically about work being done in my lab
00:00:20.24		towards expanding the capacity of biology for chemistry, or as I've titled it here,
00:00:25.22		teaching an old bacterium new tricks.
00:00:28.04		So, in my introduction, I gave this maze as an example of how we think about metabolic engineering,
00:00:34.23		where you have the example here of a mouse Wemberly
00:00:38.08		that's lost its pet rabbit Petal and there's a maze of possibilities
00:00:42.16		of how the mouse might get to the rabbit it's looking for.
00:00:45.22		And our goal is to be able to block off, or to obstruct,
00:00:49.24		those pathways which are not going to be productive,
00:00:52.06		or to stimulate, in this case, the mouse to run faster, or in biological terms,
00:00:57.11		to increase the rate at which material will flow through our maze
00:01:00.15		so that we get to the product that we're interested in more quickly.
00:01:03.17		Now, there's another way that we actually think about engineering pathways
00:01:07.15		that also looks at this maze analogy.
00:01:09.25		In this case, our goal is completely different. Now, we actually want to blow the maze up
00:01:16.00		so that rather than forcing the mouse to run from one to the other through all these obstacles,
00:01:21.20		we have a more direct way to get from point A to point B.
00:01:25.00		And that's actually the focus of much of the work that goes on in my lab.
00:01:28.18		When we started thinking about this problem,
00:01:31.23		that is how do we actually get biology to do more chemistry,
00:01:35.19		the question was, well, what kind of targets could we look at?
00:01:38.09		What are good molecules to look at that might be produced by biology?
00:01:42.00		And in 2004, the US Department of Energy put together a report
00:01:46.02		called "Top Value Added Chemicals from Biomass"
00:01:48.21		where they actually sought to answer that very question.
00:01:51.09		That is to say, if you're using biology, or biomass,
00:01:55.01		as the input for chemicals, what are the right molecules that you'd want to produce.
00:01:59.20		And they came up with a list which is actually called the "top 10" list in the literature
00:02:04.21		of building block molecules.
00:02:06.08		Now, I always find this interesting because it turns out the top ten list actually has 12 lines
00:02:11.24		and a few of these lines have more than one molecule,
00:02:14.09		but nevertheless, it's called the top ten list because that sounds a lot better than the top 14 or 15 list.
00:02:19.12		If we look at this list, we see things for example like glutamic acid,
00:02:23.21		and that's an amino acid. We see aspartic acid, which is also an amino acid,
00:02:28.15		and those were compounds that we weren't really interested in working on
00:02:31.28		because our challenge was to find a pathway that either didn't exist
00:02:36.00		or one that was really, really complicated that we could, again, blow up our maze,
00:02:40.07		if we use that analogy, in order to get to the compound that we're interested in.
00:02:43.27		So, when we looked at this list, we eliminated compounds like that.
00:02:47.22		We also eliminated compounds like glycerol, which it turns out is actually relatively cheap today,
00:02:53.16		but wasn't when this report was first produced.
00:02:56.16		So, once we went through this process,
00:02:58.12		of saying, well here are things that we're not intellectually interested in,
00:03:01.17		and here are compounds that we don't think really give us the value that we want,
00:03:05.08		we began to focus on a couple of different compounds,
00:03:08.16		one of which is glucaric acid, that's shown here,
00:03:10.25		and I'd like to talk to you today about our work that we've done
00:03:14.06		to be able to produce this compound in a microbe, namely, E. coli.
00:03:19.00		Glucaric acid, as it's shown again here, is a structure that has 6 carbons,
00:03:24.13		so it's actually pretty similar to glucose in how it's arranged, and it is actually a natural product
00:03:30.07		and I mentioned natural products in the first half of my talk
00:03:32.24		as being compounds that are naturally produced by nature.
00:03:35.26		It turns out this is a compound that's produced in fruits and vegetables
00:03:39.09		and also in mammals, but there's no known microbial pathway for it,
00:03:43.10		meaning that if we look at the simplest organisms,
00:03:45.22		the ones that are easiest to think about putting into a factory,
00:03:49.13		there are no microbes like that where we know that glucaric acid is produced.
00:03:53.02		This compound has been studied for therapeutic purposes
00:03:56.10		either as an agent to reduce cholesterol, or even possibly to fight cancer,
00:04:00.25		but we've actually been more interested in its properties as a monomer
00:04:04.04		for different kinds of materials, or as detergents.
00:04:07.11		And the final bullet point on this slide just emphasizes the fact
00:04:10.10		that we actually know how to make this compound chemically, from glucose,
00:04:14.19		but it turns out that that process, the way it exists now,
00:04:17.13		is pretty messy, it requires a lot of harsh materials,
00:04:20.21		and so it's both not economical and not environmentally friendly.
00:04:24.22		So, we set out to come up with a way to make glucaric acid using biology.
00:04:29.05		This is actually what the natural pathway looks like
00:04:33.10		and hopefully you can see a lot of arrows here that might make you a little bit squeamish
00:04:37.15		if you were going to graduate school and your advisor said,
00:04:40.09		you've got to get this whole thing to work in E. coli.
00:04:43.08		Just to give you a quick overview of this pathway,
00:04:46.10		the compound we're interested in, glucaric acid, is in this box at the top.
00:04:50.00		I mentioned that this is something that could come from glucose
00:04:52.18		and we actually will use glucose as our starting compound as well,
00:04:55.20		and glucose is on this figure.
00:04:57.20		I'll give you a second to look and see if you can find it,
00:04:59.27		because it turns out there are quite a bit of arrows here,
00:05:02.17		but if you look very closely along the left-hand side, then you can see glucose right here.
00:05:07.15		You can also see that all these arrows are going back and forth,
00:05:10.22		you have this interaction with the pentose phosphate pathway,
00:05:13.17		you have another sugar, galactose, which is one input,
00:05:16.13		and you actually have an additional output, which is ascorbic acid.
00:05:19.23		This is a mess,
00:05:20.27		and this is not something that we would really want to think about putting into E. coli.
00:05:25.14		So, our challenge was to figure out, is there a different way for us to get from glucose
00:05:30.26		to the molecule that we're interested in, that would be much simpler,
00:05:33.25		that would have much, much less of this maze-like effect.
00:05:36.22		One of the nice things about having a molecule, however, that is a natural product,
00:05:42.25		is that we could go to the databases and say, is glucaric acid there?
00:05:46.26		That is, in known metabolism, is there an example where glucaric acid has been found
00:05:52.04		to be associated with biology.
00:05:54.16		And in fact, what we found is that glucaric acid could be produced from a compound
00:05:57.29		called glucuronic acid and it can be produced using an enzyme called uronate dehydrogenase
00:06:03.03		that's actually found in a bacterium called Pseudomonas syringae.
00:06:06.15		But that was sort of the end of the story as far as Pseudomonas was concerned.
00:06:11.00		With our glucuronic acid, now, we could go back to the databases again
00:06:14.20		and ask the same question, that is, do we see glucuronic acid being produced by nature,
00:06:18.23		and in fact what we could find is that glucuronic acid could be produced
00:06:22.00		from a compound called myo-inositol with an enzyme called myo-inositol oxygenase.
00:06:26.28		And that enzyme is found in a number of sources, a number of mammalian sources,
00:06:31.12		and fungal sources, and we actually chose the variant from mouse
00:06:35.04		because it was one that had been shown to work well when it was expressed in E. coli.
00:06:39.15		But that was really the end of the story as far as mammalian biology was concerned,
00:06:43.11		but if we said where else does myo-inositol show up in metabolism,
00:06:47.21		we could actually find a linkage directly from myo-inositol, or to myo-inositol,
00:06:52.00		from glucose, and that was work done by John Frost's lab at Michigan State,
00:06:55.18		where he showed that you could use glucose as the input,
00:06:58.15		you would go through glucose-6-phosphate,
00:07:00.14		and then you would have just a single recombinant enzyme,
00:07:03.09		that is, a yeast myo-inositol-1-phosphate synthase that would produce
00:07:07.08		myo-inositol-1-phosphate and that, in E. coli, was naturally dephosphorylated
00:07:12.06		in order to give the myo-inositol compound that we're interested in.
00:07:14.24		So, now, rather than having this very complex network
00:07:17.24		of 11 or 12 steps, we really only need 3 different enzymes
00:07:21.29		to be expressed in E. coli, although from three very different sources,
00:07:25.24		in order to get the compound that we're interested in.
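
[Editorial sketch, not part of the talk] The shortened route described above can be written down as a small data structure, which makes it easy to check that the steps chain together and to count how many enzymes have to be brought in from other organisms. This is only an illustration under stated assumptions: the `Step` class, the helper names, and the wording of the two native steps are hypothetical; the metabolites, the three recombinant enzymes, and their source organisms are taken from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    substrate: str
    product: str
    enzyme: str
    source: str  # organism that supplies the activity


# Route from glucose to glucaric acid as described in the talk.
# The wording of the two native steps is a simplification for illustration.
PATHWAY = [
    Step("glucose", "glucose-6-phosphate",
         "native uptake and phosphorylation", "E. coli (native)"),
    Step("glucose-6-phosphate", "myo-inositol-1-phosphate",
         "myo-inositol-1-phosphate synthase (INO1)", "yeast"),
    Step("myo-inositol-1-phosphate", "myo-inositol",
         "endogenous dephosphorylation", "E. coli (native)"),
    Step("myo-inositol", "glucuronic acid",
         "myo-inositol oxygenase (MIOX)", "mouse"),
    Step("glucuronic acid", "glucaric acid",
         "uronate dehydrogenase", "Pseudomonas syringae"),
]


def is_connected(pathway):
    """True if each step's product is the next step's substrate."""
    return all(a.product == b.substrate for a, b in zip(pathway, pathway[1:]))


def heterologous_steps(pathway):
    """Steps whose enzyme must be expressed from a foreign gene."""
    return [s for s in pathway if "native" not in s.source]


if __name__ == "__main__":
    print("connected:", is_connected(PATHWAY))
    for step in heterologous_steps(PATHWAY):
        print(f"{step.enzyme} from {step.source}")
```

Counting the non-native entries gives the three recombinant enzymes mentioned in the talk, each from a different kind of organism.
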
00:07:27.26		And so we could take advantage of that to actually have the first gene
00:07:31.17		directly PCR-amplified because we knew that would work in E. coli from John Frost's work.
00:07:36.11		The second gene we could take advantage of this DNA synthesis
00:07:39.23		that I talked about in the first part to be able to have this version of the gene synthesized
00:07:44.21		but synthesized in a way that E. coli would be able to produce it more easily
00:07:49.11		than the natural sequence of DNA that would come from mouse,
00:07:52.06		and then we actually had to do a little bit of work to figure out what was the sequence of DNA,
00:07:57.05		or the gene, encoding the uronate dehydrogenase in bacteria.
00:08:01.23		But once we were able to do that, we now had all three of the genes that we needed
00:08:05.18		to put into E. coli to see whether or not it could make glucaric acid.
00:08:09.19		So, when we co-expressed all three of these genes,
00:08:13.28		what we found was exactly what we hoped to find.
00:08:15.28		And that is that we got glucaric acid being produced.
00:08:19.03		And the figure that's shown here shows the titer, or the concentration, in grams per liter,
00:08:24.05		of glucaric acid that we can measure in the culture medium.
00:08:27.13		So, this is actually spit out by the cell into the surrounding medium.
00:08:31.20		And I have two different bars that are shown here, one that has 0.1 millimolar IPTG
00:08:36.10		and one that has 0.05 millimolar IPTG.
00:08:39.04		I want to take just a second and explain what that really means.
00:08:42.07		IPTG in this case is what we'd call an inducer;
00:08:45.15		that means that it's something that we add to the culture that tells the cells
00:08:48.26		you should start making the proteins, or the enzymes, that we're interested in.
00:08:52.16		And what's shown now is the result on this slide, as something that we see a lot of times,
00:08:56.27		which is that if we have a somewhat higher concentration of our inducer,
00:09:00.21		where we're making more protein, you see that we actually have less of the product
00:09:04.27		than if we have a lower concentration of our inducer.
00:09:07.00		And that's really a core principle of metabolic engineering, which is that
00:09:10.18		changes that we make to the cell have these very broad systems-wide effects
00:09:14.23		that we don't always understand.
00:09:16.08		And so every time we seek to engineer an organism to make a compound we're interested in,
00:09:21.17		we have to go through this trial and error process of trying to identify
00:09:24.24		what really are the best conditions to make the compound that we're interested in.
00:09:28.25		The second thing that I want to point out is that we see,
00:09:31.22		besides glucaric acid being produced, we also find that we have myo-inositol,
00:09:36.13		which is accumulating, meaning we can measure that in the culture medium.
00:09:39.14		And the fact that that myo-inositol is there, it lets us know that the enzyme
00:09:44.06		which is converting myo-inositol to glucuronic acid is a limitation in the system.
00:09:48.22		That is, it's not working the way it's supposed to work, such that all the myo-inositol that's produced
00:09:53.24		is converted to glucuronic acid, and then on to glucaric acid.
00:09:57.20		I always think at this point, there must be a joke in here somewhere.
00:10:01.24		We have a yeast, a mouse and a bacterium
00:10:05.07		and they all go into a bar and I'm not really sure what the end result is here,
00:10:08.27		but we know that glucaric acid comes out somewhere.
00:10:10.26		Unfortunately, it's not quite that easy and we have a lot of challenges that we have to try to address
00:10:16.12		in trying to actually get the cells to make a lot more of this product that we're interested in.
00:10:20.20		The first of those challenges actually comes into place
00:10:24.26		when we actually look at the fact that we have this myo-inositol accumulating,
00:10:28.11		as I pointed out in the first graph, that showed glucaric acid being produced.
00:10:31.25		And in this case now, if we take a closer look at this enzyme,
00:10:35.00		all we're focused on is this one reaction.
00:10:37.21		We can see that this MIOX gene, the myo-inositol oxygenase,
00:10:41.12		takes myo-inositol as its input. It also uses molecular oxygen
00:10:45.19		and the product that's produced now is glucuronic acid.
00:10:48.15		And so we know that the cells are not actually doing this reaction,
00:10:52.29		that is, converting myo-inositol to glucuronic acid, at a fast enough rate
00:10:57.08		to consume all of it. So, if we study that enzyme by itself,
00:11:00.26		the experiment we did in this case was to look at cells producing just this enzyme,
00:11:05.01		so it doesn't have the first enzyme, which gives us myo-inositol,
00:11:08.09		it doesn't have the third enzyme, which actually takes that glucuronic acid
00:11:12.05		and converts it to glucaric acid.
00:11:13.19		Instead, we're looking at this in isolation, and we looked at two different conditions:
00:11:17.20		one where we actually have myo-inositol present in the culture medium
00:11:21.23		as we're growing up the cells and making the protein,
00:11:24.06		and one where it's missing.
00:11:26.04		And the only difference now is that at a point where we measure the activity of the cells,
00:11:30.22		we actually have some cells that saw substrate, that is the myo-inositol,
00:11:34.18		and some that didn't, but at the same time, when we would go to analyze them,
00:11:38.23		we take the cells away, so now there's no myo-inositol,
00:11:42.05		we break open the cells and release the protein and we expose those cells
00:11:46.18		to the same concentration of the substrate.
00:11:48.25		And in doing that and measuring the activity,
00:11:51.10		what we find is that for the cells that were able to previously see the substrate,
00:11:55.23		the activity of that protein is about an order of magnitude higher
00:11:59.03		than the cells that only saw substrate for the first time after the protein had actually been produced.
00:12:04.20		Well, so this actually raised an interesting question for us.
00:12:08.09		And we thought about how we actually solve this problem,
00:12:10.27		and I can tell you the answer is not toss in a lot of myo-inositol,
00:12:14.11		because that's actually cheating. What we want to do is start from glucose,
00:12:17.19		which is going to be a more cheaply available substrate,
00:12:20.02		and make the product that we're interested in.
00:12:22.00		But now we can think about this as engineers and say, well,
00:12:26.03		what information do we have that actually gives us some guidance
00:12:29.14		on how we might actually be able to solve this problem,
00:12:32.04		even if we don't exactly understand the underlying reasons for the phenomenon that we see.
00:12:37.07		And so the first thing that we thought is, ok, what we want then
00:12:40.15		is for that first enzyme, the INO1, to make a lot of the myo-inositol,
00:12:45.14		and then that would be really good
00:12:47.06		because that's what we need for the second enzyme to be effective.
00:12:50.07		The only problem with that is that it sounds really good to say that,
00:12:53.06		but as we've worked on that, that turned out to be a lot easier said than done.
00:12:56.26		At the same time as we were looking at this,
00:12:59.12		we actually came up with another idea.
00:13:02.28		In this case, the idea came from a collaborator, John Dueber,
00:13:07.07		in SynBERC, which is the Synthetic Biology Engineering Research Center,
00:13:10.18		and John's work was looking at something called enzyme colocalization,
00:13:15.05		where the goal here was to be able to take enzymes
00:13:17.22		that normally might be freely dispersed throughout the cell,
00:13:20.15		with no reason for them to be together,
00:13:22.15		and to cause a way for those enzymes to be physically located next to each other.
00:13:26.24		In fact, what happens in this case is that the enzymes, shown here now as MIOX and INO1,
00:13:32.19		are actually exposed, or they have covalently attached to them these tags,
00:13:37.01		and those tags fold into a certain 3-dimensional conformation
00:13:40.28		that can then be recognized by a different piece of a protein.
00:13:45.08		That piece of a protein can then be put into something that we call a scaffold,
00:13:48.16		and if you now have the scaffold in the cell,
00:13:51.08		and you have these enzymes that are tagged with pieces that will recognize that scaffold,
00:13:56.04		that actually causes two enzymes to become located close to each other within the cell.
00:14:01.22		So, our idea here was very simple, that if we couldn't actually change the activity of the enzyme
00:14:06.19		and the way that we could get the upstream enzyme to make much more product,
00:14:10.11		if we actually reduced the distance between the two enzymes,
00:14:13.26		that would give us a higher local concentration of myo-inositol,
00:14:17.18		and maybe if that local concentration was higher,
00:14:19.21		that would give us the higher activity that we had seen before,
00:14:22.13		and that would actually give us higher yields and productivities.
00:14:25.15		And the first way that we tested this was exactly as it's diagrammed on this slide,
00:14:31.02		where we actually had just these two enzymes being recruited to the scaffold,
00:14:35.01		in a one to one ratio, and in doing that, we actually got an increase of about a factor of 3
00:14:40.29		in the amount of glucaric acid that we were producing.
00:14:43.17		Now, as all good scientists, we have to ask ourselves,
00:14:46.27		is this working the way that we want it to work?
00:14:49.10		And I'll remind you that our theory here was that what we would get was not just more glucaric acid,
00:14:55.05		but that that would happen because we would have a higher activity of MIOX,
00:14:58.23		that is, we would have better activation, and that would result in this faster conversion
00:15:02.27		that would give us more of the product that we're interested in.
00:15:05.15		So, we actually needed to test that theory,
00:15:08.07		that is, to measure the activity of this MIOX protein and find out
00:15:12.00		whether or not it actually had higher activity, as we supposed that it might.
00:15:16.25		What's shown now in the upper left-hand corner is the data for the product, or the glucaric acid titer,
00:15:22.19		where the lighter bars here are, well, on the left hand side, I should say,
00:15:26.15		without scaffold, and then on the right hand side, with scaffold.
00:15:29.10		And you can see again, these are two different conditions in terms of how much of this IPTG we use
00:15:34.14		to induce the expression of the proteins.
00:15:36.28		And in the first case now, of these lighter bars, there's no real difference
00:15:41.01		between not having scaffold and having scaffold,
00:15:43.22		on the amount of product that's being produced,
00:15:46.03		and if we actually look at the activity of the protein,
00:15:48.17		there's also no significant difference between the protein activity here and the protein activity in this case as well.
00:15:54.25		However, in our best case, where we actually had an increase of 3-fold
00:15:58.08		in the amount of glucaric acid being produced, that's the darker bar in this case,
00:16:01.29		we can look at the specific activity of the protein and we see about a 30% improvement
00:16:07.15		in the activity of this protein relative to when the scaffolds aren't present.
00:16:11.15		And the p-value is here just to show you that that difference is actually significant.
00:16:15.15		So, now we've actually verified that we have not just higher production of the product that we're interested in,
00:16:21.08		but we're getting that higher production by the mechanism that we had supposed
00:16:25.17		would actually happen.
00:16:27.12		Now, one of the nice things about these scaffolds is that what it allows you to do
00:16:31.16		is to explore different stoichiometries.
00:16:33.22		What I mean by that is you don't just have to have one of one protein
00:16:37.19		and one of a second protein coming together, but you can actually, in that scaffold,
00:16:41.27		dial in the stoichiometry by specifying the number of binding domains that you have
00:16:47.02		for each particular protein. So, this is an example of a different scaffold,
00:16:50.20		where you can see two binding domains for one of the proteins,
00:16:53.29		four binding domains for another protein and a single binding domain for the last protein.
00:16:58.12		And if we put that together, what it actually means is that we have,
00:17:01.10		in this case, four copies of the first gene, the INO1 enzyme, that is,
00:17:06.05		two copies of the second enzyme, and only one copy of that third enzyme.
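
[Editorial sketch, not part of the talk] The idea of dialing in stoichiometry through the number of binding domains can be illustrated with a few lines of code. This is a minimal sketch under stated assumptions: a scaffold is modelled as a plain dictionary from enzyme name to number of binding domains, and the function names are hypothetical. The 4:2:1 example and the one-to-one example are the ones described in the talk; taking the "third enzyme" to be uronate dehydrogenase follows the pathway order given earlier.

```python
from functools import reduce
from math import gcd


def stoichiometry(binding_domains):
    """Reduce binding-domain counts to the smallest whole-number ratio."""
    divisor = reduce(gcd, binding_domains.values())
    return {enzyme: count // divisor for enzyme, count in binding_domains.items()}


def ratio_label(binding_domains):
    """Format the reduced ratio as a string such as '4:2:1'."""
    return ":".join(str(n) for n in stoichiometry(binding_domains).values())


# Scaffold from the talk: four INO1 sites, two for the second enzyme (MIOX),
# and one for the third enzyme.
scaffold_421 = {"INO1": 4, "MIOX": 2, "uronate dehydrogenase": 1}

# The first scaffold tested recruited only two enzymes, one to one.
scaffold_11 = {"INO1": 1, "MIOX": 1}

if __name__ == "__main__":
    print(ratio_label(scaffold_421))
    print(ratio_label(scaffold_11))
```

Representing each design this way makes it straightforward to enumerate and compare a library of scaffold configurations, which is the kind of exploration described next.
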
00:17:09.23		This actually allows us to look at a wide variety of different configurations
00:17:14.20		as well as look at varying the amount of the scaffold that we have
00:17:18.11		and the amount of the enzyme that we have,
00:17:20.03		to look at the effect of that on the productivity.
00:17:22.23		And the result of that exercise is shown here,
00:17:25.12		where each of those dots is the average of a triplicate experiment
00:17:29.07		where we have the same amount of enzyme being produced in all cases,
00:17:32.20		but we're looking at a wide variety of scaffold induction levels
00:17:36.07		and also looking at a very wide configuration of different scaffolds themselves,
00:17:41.10		meaning different numbers of binding domains for these enzymes that we're interested in.
00:17:44.18		What we see is that we actually are able to change
00:17:49.00		the activity of this enzyme over a factor of about 7-fold
00:17:52.21		and that actually results in a change in the amount of glucaric acid that we have
00:17:56.24		in a factor of about 5-fold. So, we really have shown that we can use,
00:18:01.04		in this case what's called a synthetic biology device,
00:18:04.02		that is, these protein-protein co-localization mechanisms,
00:18:07.08		to be able to solve a problem with an engineering approach,
00:18:11.05		even if we still don't understand exactly what is it that leads to these differences
00:18:15.21		that we see in the activity of the protein.
00:18:18.08		Now, I want to remind you again of this maze analogy that we had before
00:18:24.03		of a protein, or rather a compound, coming into a maze
00:18:28.02		and having a number of different places that it could go.
00:18:30.17		And I showed a very simple diagram before of the maze having four different entry points.
00:18:35.11		Well, the reality is that this is really what the maze looks like inside the cell,
00:18:39.28		where each of the individual dots in this figure represents a particular chemical,
00:18:44.10		and each of the lines between those dots represents an enzyme
00:18:48.02		that can convert that chemical into something else.
00:18:50.20		So, that means that the networks that we're really talking about are very, very large mazes,
00:18:55.16		not these very simplified ones that I showed you.
00:18:57.26		And if our goal is to have glucose, for example, as a starting molecule,
00:19:01.06		work its way through this maze, and end up with a final compound that we're interested in,
00:19:05.28		we can often have by-products that are being produced.
00:19:08.24		And ideally what we'd like is to, again, knock out those unproductive routes,
00:19:13.19		which are going to lead to byproduct formation, but the question becomes,
00:19:17.05		what if your byproduct is actually growth?
00:19:19.11		And growth in this case also means the ability to make the enzymes that you need
00:19:24.19		in order to catalyze all these chemical reactions
00:19:27.12		that are going to give you conversion of your starting substrate, glucose,
00:19:30.21		down to your final product, glucaric acid.
00:19:32.28		In this case now, we don't have the option of simply knocking out or deleting growth,
00:19:38.16		because now we're not actually going to make the enzymes that we need
00:19:41.10		and this means that we have to have a different way of solving this problem,
00:19:44.23		or a different approach to dealing with the byproduct that we have.
00:19:48.08		So, what we can do in this case is again, take advantage of these principles of synthetic biology,
00:19:54.05		which are based on design, to think about a control system.
00:19:57.17		In particular what we want is dynamic control of these activities.
00:20:01.22		We like to have our initial condition be fast growth, or growth being favored,
00:20:06.02		such that we actually make not just the cells, but again the proteins that we need,
00:20:10.03		that are going to give us the enzymes that give us the chemical reactions
00:20:13.02		that we need to make the product that we're interested in.
00:20:15.07		And then we want to trigger a switch to a production phase
00:20:18.16		where we say, stop growing now, and instead of growing,
00:20:21.15		use all of that glucose to make the molecule that we want you to make.
00:20:25.00		I can represent that diagrammatically like this,
00:20:27.26		where if we have our competing activity, initially, when the input is low,
00:20:32.04		that activity will be high, and at some point, I'm going to now add an input
00:20:36.03		that causes the competing activity to be low.
00:20:38.14		You can see that now, specifically, in what we're interested in, which is growth versus production,
00:20:43.24		which is that we want growth to actually start high,
00:20:46.29		and then after a while, we want growth to go down, and instead we want the production here
00:20:52.21		to actually start to go up. This is actually something called a genetic inverter.
00:20:57.10		It's an inverter because when the input is low, the output is high.
00:21:01.17		When the input is high, the output is low.
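
[Editorial sketch, not part of the talk] The inverter behaviour just described (low input gives high output, high input gives low output) is often approximated with a repressing Hill function. The code below is only a sketch under that assumption; the talk does not specify a model, and the function name and the parameters `k`, `n`, and `max_output` are hypothetical, in arbitrary units.

```python
def inverter_output(inducer, k=1.0, n=2.0, max_output=1.0):
    """Repressing Hill function (an assumed model, arbitrary units).

    inducer    -- input level (for example, an inducer concentration)
    k          -- input level at which the output is half of max_output
    n          -- Hill coefficient; larger values give a sharper switch
    max_output -- output when no input is present
    """
    if inducer < 0:
        raise ValueError("inducer level must be non-negative")
    return max_output / (1.0 + (inducer / k) ** n)


if __name__ == "__main__":
    # The competing activity (for example, growth) falls as the input rises.
    for level in [0.0, 0.5, 1.0, 2.0, 10.0]:
        print(f"input={level:5.1f}  output={inverter_output(level):.3f}")
```

In this picture the competing activity follows the inverter output, while production would be expected to rise as that output falls.
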
00:21:04.05		And there is actually a precedent for this in nature, namely in secondary metabolite production.
00:21:09.03		Now, for secondary metabolites, these are natural products
00:21:12.25		where growth first is favored, and then the cell will naturally make this switch
00:21:17.12		such that you then will have the metabolites being produced later.
00:21:21.00		So, how do we actually make this process happen
00:21:25.20		when we're talking about having a switch for growth
00:21:28.22		where ideally what we're doing is having the cells use glucose for growth initially,
00:21:32.28		and then change that in order to use glucose for product formation
00:21:36.06		at some point after which we apply our trigger.
00:21:39.02		If we look at how glucose is normally used in our cells,
00:21:42.05		it comes in in what's called the PTS system,
00:21:44.11		and that PTS system brings in glucose as glucose-6-phosphate.
00:21:48.21		And it has two different routes that it can go into;
00:21:50.27		glycolysis or the pentose-phosphate pathway
00:21:54.02		and that's actually how that glucose is used by the cells for growth.
00:21:58.06		That's how the glucose is eaten, if we want to think about it that way.
00:22:01.16		And that's the process that we want to compete against.
00:22:03.24		Well, glucose-6-phosphate is the original substrate of our glucaric acid pathway,
00:22:08.19		but we didn't really want to deal with quite this complexity to start with,
00:22:12.12		so we decided to start on a simpler scale
00:22:14.22		and see if we could just address the glucose utilization issue
00:22:17.19		and then what we're doing now is to try to work up to the increasing complexity
00:22:21.27		that's required to deal with glucose-6-phosphate specifically.
00:22:25.11		That can be addressed by the fact that there is actually another way that glucose can come into the cell.
00:22:30.00		It can come in through what's called the galP, or galactose permease,
00:22:33.29		and in this case, it comes in as free glucose.
00:22:36.11		That glucose now has to be converted to glucose-6-phosphate
00:22:39.21		with an enzyme called glucokinase that uses ATP.
00:22:43.15		And because now the glucose has to go through that route,
00:22:46.17		it gives us just a single control point for being able to regulate,
00:22:51.01		that is control, how much of the glucose goes into our endogenous metabolism, or growth,
00:22:56.09		versus what goes into the product that we're interested in.
00:22:58.18		So, we can actually have this system now where we knock out the PTS system,
00:23:02.25		we apply what we describe as a valve to regulate Glk activity,
00:23:07.06		and in doing that, we're able to modulate how much of the glucose is available
00:23:12.00		for endogenous metabolism, that is for growth,
00:23:14.21		versus how much is available for productivity.
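
[Editorial sketch, not part of the talk] The "valve" on Glk can be pictured with a toy mass balance. This is a deliberately crude sketch under loud assumptions: glucose enters as free glucose at a fixed uptake rate, Glk can phosphorylate at most `glk_capacity` of it toward growth, and whatever is not phosphorylated is treated as available for product formation. None of these numbers or names come from the talk; the units are arbitrary.

```python
def split_glucose(uptake, glk_capacity):
    """Toy split of glucose flux between growth and product (arbitrary units).

    uptake       -- rate at which free glucose enters the cell
    glk_capacity -- maximum rate at which Glk can send glucose toward growth
    Returns (to_growth, to_product). Assumes, for illustration only, that all
    glucose not phosphorylated by Glk is available to the product pathway.
    """
    if uptake < 0:
        raise ValueError("uptake must be non-negative")
    to_growth = min(uptake, max(glk_capacity, 0.0))
    to_product = uptake - to_growth
    return to_growth, to_product


if __name__ == "__main__":
    # Closing the valve (lower Glk capacity) shifts glucose toward product.
    for capacity in [20.0, 10.0, 4.0, 0.0]:
        growth, product = split_glucose(10.0, capacity)
        print(f"Glk capacity={capacity:5.1f}  growth={growth:5.1f}  product={product:5.1f}")
```

The point of the sketch is only that a single control point sets the partition; with the valve fully closed the model sends nothing to growth, which is why the talk goes on to discuss tuning expression rather than deleting it.
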
00:23:17.06		And I just want to remind you that when we're talking about modulating the protein,
00:23:21.10		that is, how much of the glucokinase that's available, what we're really talking about
00:23:26.03		is controlling how much of the DNA, or how that DNA is being expressed.
00:23:30.18		So, we're actually doing all of our manipulations at the level of DNA synthesis,
00:23:34.21		which comes back to how we think about synthetic biology.
00:23:37.14		So, one way that we can actually test this,
00:23:41.27		rather than immediately going to a process where we have to worry about dynamic control,
00:23:46.18		is to look at what we would call static control of the system.
00:23:49.28		And that is that we can replace the natural glucokinase operon,
00:23:54.01		or production system,
00:23:55.20		which naturally consists of two different promoters that are negatively regulated
00:24:00.01		by this protein called FruR, we can get rid of all of that regulation,
00:24:04.16		that is we can replace that DNA, and instead have a library of different promoters
00:24:09.26		where the binding site for FruR is gone,
00:24:12.19		so the only thing that's regulating how much of this protein is produced
00:24:15.25		is the kind of promoter that we use.
00:24:17.26		And by varying the strength of these promoters, by using different variations here,
00:24:22.21		then we can end up with a library of different expression states
00:24:25.16		and ask the question, does that actually affect how much of a heterologous product
00:24:30.08		we could actually produce.
00:24:31.27		Here's now a little bit of characterization of this library.
00:24:35.16		The first thing that we're looking at in this slide is whether or not we actually do have increases in the mRNA,
00:24:40.28		that is, whether or not changing the promoter strength
00:24:43.10		changed the transcription, and then if that corresponded to increases in the protein being produced.
00:24:48.09		And what's shown in this case now, along the x-axis, is the relative promoter strength,
00:24:52.29		from very low strength, or weak promoters, up to very high strength promoters,
00:24:57.15		and then what's shown on the y-axis, on the left-hand side,
00:25:00.19		is the activity of the protein that we're interested in, glucokinase,
00:25:04.00		and what's shown on the right hand side is the mRNA levels.
00:25:07.05		And you can see now, that activity, which is in the solid circles,
00:25:11.19		does actually go up as we go from low promoter strengths
00:25:15.14		up to high promoter strengths, but it only goes up to a certain point,
00:25:19.01		after which we see it start to decline.
00:25:21.01		The same thing is true for the mRNA, that it actually will go up as we go along this axis here,
00:25:25.20		and it only will go up to a certain point and then it starts to decline as well.
00:25:30.06		These measurements were all done where we use glycerol
00:25:33.20		as a carbon source instead of glucose and that's actually to allow us
00:25:37.00		to decouple growth from measuring the properties of this enzyme
00:25:40.29		just to see if the library is working.
00:25:42.26		And what we actually found when we went to glucose
00:25:45.10		is that when the expression levels were too high here,
00:25:48.07		then these cells no longer grew. So, this cell has high mRNA,
00:25:52.03		but you can see the protein levels are pretty low.
00:25:54.10		And these cells would not grow on glucose.
00:25:56.29		The ones where the protein levels were still pretty high
00:26:00.01		would grow on glucose, except that we did have this gray region here,
00:26:03.29		this stippled region, where we saw the cells could grow, but only very, very poorly.
00:26:09.17		We could then take the cells that we knew were growing well, in this region here,
00:26:13.25		and then ask, can we actually now, in glucose,
00:26:17.03		relate the growth rate to the activity of this protein, which tells us whether or not it really can control
00:26:23.16		how much of the substrate is available for endogenous growth.
00:26:26.25		The result of that experiment is shown on this slide,
00:26:30.24		where again what we have now is expressed in terms of Glk activity
00:26:34.15		where it goes from a very low activity up to our higher activity
00:26:37.29		and then what's shown on the x-axis is the growth rate of the cells.
00:26:41.15		The native promoter is shown right here in this open triangle
00:26:44.25		and the filled squares will tell you that we're able to actually increase the growth rate
00:26:49.13		of this cell. We can also decrease the growth rate of the cell by changing the glucokinase activity.
00:26:55.08		So, that confirms for us that we actually do have a control point
00:26:58.12		or a specific protein where if we vary the activity of that protein,
00:27:02.22		that actually will tell us, or allow us to control, rather,
00:27:05.26		how the cells are growing. The next question then is,
00:27:09.02		if you can control the growth of the cells, does that actually result in more product being produced.
00:27:14.13		So, in this case we have an example molecule, or a test molecule, gluconate,
00:27:18.02		this can be produced in one single enzymatic step from glucose,
00:27:21.25		and again, the competing reaction here is glucose-6-phosphate,
00:27:24.29		which is actually going to be produced by glucokinase.
00:27:27.18		What's shown now here is 5-KG, this is 5-ketogluconate,
00:27:31.05		which is just a spontaneous product that we actually get in very, very small amounts,
00:27:35.02		but we want to account for that by making sure that we look at the sum of both of these products,
00:27:39.18		to give us a sense of how much of the flux is coming through this side
00:27:43.17		versus this side of our pathway.
00:27:45.23		And now what's shown in this slide is actually the result of that experiment,
00:27:49.23		where what's shown now is the Glk activity, that is, from lower to higher amounts of that protein,
00:27:55.17		which is controlling how much glucose goes into endogenous metabolism
00:27:59.18		and what's shown on the y-axis is the molar yield,
00:28:02.29		and this is really how much of the glucose that we start with
00:28:06.04		goes into the compound that we're interested in, versus goes into other byproducts,
00:28:10.11		or into cellular growth.
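
As a minimal sketch of the molar yield described above, the calculation sums gluconate and 5-ketogluconate and divides by the glucose basis. The function and argument names are hypothetical, and the talk does not say whether the basis is glucose supplied or glucose consumed, so the denominator here is simply whichever basis you choose.

```python
def molar_yield(gluconate_mmol, five_kg_mmol, glucose_mmol):
    """Moles of product (gluconate + 5-KG) per mole of glucose.

    `glucose_mmol` is the glucose basis; whether that means glucose
    supplied or glucose consumed is an assumption left to the caller.
    """
    if glucose_mmol <= 0:
        raise ValueError("glucose basis must be positive")
    if gluconate_mmol < 0 or five_kg_mmol < 0:
        raise ValueError("product amounts must be non-negative")
    # 5-KG forms spontaneously from the product, so it is counted as
    # flux through the product side of the pathway.
    return (gluconate_mmol + five_kg_mmol) / glucose_mmol


# Made-up numbers, for illustration only.
print(molar_yield(7.0, 1.0, 10.0))
```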
00:28:12.04		And we see this very nice relationship where, when the activity is very low,
00:28:15.23		then we can see that we have a moderate amount of the yield, in this case,
00:28:21.12		that is the product that we're interested in. As we increase the activity,
00:28:25.00		then we get a slight bump, but as the activity goes higher and higher,
00:28:28.21		what we actually find is that we are decreasing the yield,
00:28:31.29		which basically tells us that as we get, now,
00:28:34.17		to the point where we're making more and more of this glucokinase,
00:28:38.06		we have more of the glucose going into growth
00:28:40.17		and less of it going into the product that we're interested in.
00:28:43.00		So, that actually gave us the validation that we needed
00:28:46.18		that the system design that we had envisioned,
00:28:49.12		one in which we could control the activity of this enzyme,
00:28:52.20		was going to be useful. What I haven't told you so far is that these cells here,
00:28:56.27		although they had the highest yield, did not have the highest concentration.
00:29:00.27		The concentration wasn't very different from the cultures that surrounded it
00:29:04.27		as far as yield was concerned, and they also didn't grow very well.
00:29:08.05		So, that just meant that the cells overall were not happy,
00:29:11.07		and that our original design of having them have a state where they grow very well first,
00:29:15.29		would probably work better in terms of giving us the maximum yield possible.
00:29:20.03		So, the system that we wanted to design here, again, is an inverter,
00:29:24.17		and the way that this will work is that we have this protein,
00:29:27.25		now as an example, GFP,
00:29:29.15		which is being produced by a promoter which is regulated by the lacI protein,
00:29:34.02		or the lacI operator. When lacI is not present, GFP is turned on.
00:29:39.11		We then have lacI, however, under the control of something called the tet promoter,
00:29:43.27		and the tet promoter is responsive to a small molecule
00:29:46.29		such that when you add aTc, this small molecule, it would turn on
00:29:51.10		the expression of lacI and that would turn off our GFP.
00:29:53.29		Now, that you can see by looking at the graph; the first point that we actually have
00:29:58.04		is the fact that in the absence of any aTc, then we have a very high fluorescence,
00:30:03.03		which means that the whole system is on.
00:30:04.27		If we then move to a point where we add aTc,
00:30:08.13		what you'll find in that case is that you can see the GFP levels start to go down
00:30:12.16		as a function of how much aTc we add,
00:30:15.10		and at the point where we've added 100 ng/ml of aTc, we have very little GFP being produced.
00:30:20.27		We can show that the mechanism of this is working the way we intend it to work
00:30:25.04		by adding an additional small molecule called IPTG, and what IPTG actually does
00:30:29.23		is to interfere with this lacI binding
00:30:32.13		such that you can recover some of the GFP expression
00:30:34.28		and that's actually shown in the last two points of this graph here,
00:30:38.07		that show that GFP can go back up.
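
The inverter logic just described can be sketched with two Hill functions: aTc turns on lacI, lacI turns off GFP, and IPTG removes part of the active lacI. This is a qualitative illustration only. Every constant below (`k_atc`, `k_laci`, `n`, `iptg_relief`) is a made-up value chosen to reproduce the shape of the response, not a measured parameter, and the function names are hypothetical.

```python
def hill_activation(x, k, n):
    """Fraction of maximal activation at input level x."""
    return x**n / (k**n + x**n)


def hill_repression(x, k, n):
    """Fraction of maximal output remaining at repressor level x."""
    return k**n / (k**n + x**n)


def inverter_gfp(atc_ng_ml, iptg_present=False,
                 k_atc=10.0, k_laci=0.1, n=2.0, iptg_relief=0.9):
    """Relative GFP output (0 to 1) of the aTc -> lacI -| GFP inverter.

    All default constants are illustrative assumptions.
    """
    laci = hill_activation(atc_ng_ml, k_atc, n)      # aTc induces lacI
    if iptg_present:
        laci *= (1.0 - iptg_relief)                  # IPTG blocks lacI binding
    return hill_repression(laci, k_laci, n)          # lacI represses GFP


for atc in (0.0, 10.0, 100.0):
    print(atc, round(inverter_gfp(atc), 3))
print("100 + IPTG", round(inverter_gfp(100.0, iptg_present=True), 3))
```

With these assumed constants the output is fully on with no aTc, falls as aTc rises, and partly recovers when IPTG is present, which is the qualitative pattern described for the graph.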
00:30:40.23		So, now we know that our system, our basic inverter is working,
00:30:44.22		and what we have to do in this case now is to integrate that
00:30:47.12		into our cell, that is to change now Glk activity so that it responds in this same way.
00:30:54.06		And what's shown now in this slide is the result of having done exactly that.
00:30:58.25		So, here's now the construct of our inverter,
00:31:01.03		where again, this is really just how the DNA is being constructed,
00:31:04.26		and we're using that to control how Glk is being produced
00:31:07.20		and we can look at the same two properties that we looked at before,
00:31:10.21		which is, is the mRNA changing, that is, is the DNA to mRNA, that transcription process,
00:31:17.05		is that being regulated the way we want it to,
00:31:19.03		and does that correspondingly result in differences in the Glk activity?
00:31:22.25		And the mRNA levels are actually shown at the bottom,
00:31:25.04		where you can see that as we increase the amount of aTc,
00:31:28.00		we actually do see that we start initially with high levels of mRNA,
00:31:31.05		and then those levels of mRNA eventually come down.
00:31:34.05		The top graph here actually shows the response of Glk,
00:31:37.02		where it also starts very high, and then it also will come down to a very, very low level.
00:31:42.09		This is again a characterization in glycerol, where we don't have glucose present,
00:31:46.29		so we're only able to see the response of the cells to Glk
00:31:51.18		when they don't really need Glk, and that actually tells us whether the system is really working.
00:31:55.21		Now, we also want to know that it's actually dynamic.
00:32:00.01		So, the way we tested our static system before was just to change the promoters
00:32:05.13		that were encoding for Glk and then to ask, does that actually give us differences?
00:32:09.13		We now want to know if we have a switch.
00:32:11.13		If we start off with it on and then add this inducer so that we turn it off,
00:32:16.07		that is, we invert the response, do we actually get what we're interested in.
00:32:19.28		And the top graph that's shown here is the response of what happens
00:32:23.02		to the cell growth as we actually add our inducer,
00:32:26.10		where the top part of this is now uninduced,
00:32:28.24		that means that we're not adding anything chemically,
00:32:31.12		and we see that the cells are continuing to grow.
00:32:33.17		If we compare that now to the second line here,
00:32:35.26		where initially they both started off at the same point,
00:32:38.18		we add our inducer, we can see that the cells where we now have turned the gene off
00:32:43.10		by activating our inverter, are growing to a lower point.
00:32:47.00		We can also see a control plot in this very bottom here, which is what happens
00:32:51.14		if we add inducer from the very beginning.
00:32:53.20		That actually means that it turns gene expression down so low
00:32:56.17		that those cells never grow. You can see that the OD stays flat
00:32:59.22		and pretty much close to zero the whole time.
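
The three growth curves just described (uninduced, induced partway through, induced from the start) can be mimicked with a very rough simulation. This sketch assumes exponential growth that stops instantly at induction. That is a strong simplification, since in a real culture the existing Glk would take time to dilute out. The starting OD, growth rate, and time step are arbitrary illustrative values.

```python
import math


def simulate_od(hours, induce_at=None, od0=0.05, mu_on=0.6, mu_off=0.0, dt=0.1):
    """Final OD after `hours` of growth.

    Growth rate is `mu_on` (per hour) until the inducer is added at
    `induce_at` hours, then `mu_off`. `induce_at=None` means uninduced.
    All defaults are illustrative assumptions.
    """
    od = od0
    for step in range(int(round(hours / dt))):
        t = step * dt
        induced = induce_at is not None and t >= induce_at
        mu = mu_off if induced else mu_on
        od *= math.exp(mu * dt)   # exponential growth over one time step
    return od


print("uninduced:        ", round(simulate_od(8.0), 2))
print("induced at 4 h:   ", round(simulate_od(8.0, induce_at=4.0), 2))
print("induced at start: ", round(simulate_od(8.0, induce_at=0.0), 2))
```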
00:33:02.04		So, we know that again, the response we're looking for, growth,
00:33:05.14		is changing the way we want it to,
00:33:07.05		and just very briefly, what's shown in these bottom slides
00:33:09.21		is that the growth rate again is changing,
00:33:12.00		this is now relative OD between those two.
00:33:14.17		The activity is also changing, it's decreasing,
00:33:17.10		and the mRNA levels are going down as well.
00:33:19.22		OK, so now we know the system is working exactly the way we want it to work,
00:33:23.19		it was designed in a certain way, we seem to have the output that we're interested in
00:33:27.13		from the design perspective of growth.
00:33:29.06		The question now is, does it actually give us the productivity enhancements that we were looking for.
00:33:34.09		And we're now again looking at the same system as before,
00:33:37.23		where our goal is to make this compound gluconate,
00:33:40.16		and the only difference now on this slide is that I've added now this product acetate
00:33:45.01		which is a byproduct of metabolism,
00:33:47.04		and is a representation of how much glucose flux is actually going down into endogenous metabolism.
00:33:52.28		And if you look at now the charts here on the right hand side,
00:33:56.06		the top one gives us the titers, or the concentrations,
00:33:59.07		and it shows that more glucose is being consumed, that's what's in this white bar here,
00:34:03.05		is how much glucose is consumed.
00:34:04.21		More of that is consumed when the inverter is on;
00:34:07.20		the gray bar is how much product is being produced.
00:34:10.04		We have substantially more product being produced here as well,
00:34:13.05		and then these smaller bars here, the lightest kind of dark gray and the very, very black bar,
00:34:18.29		give us an indication of some of the minor byproducts.
00:34:21.19		And that's actually represented more easily in the bottom graph here,
00:34:25.04		where again, I'm showing the yield, that is how much of what goes in as glucose
00:34:29.15		is being converted to the glucaric acid product that we're interested in,
00:34:32.18		sorry, in this case the gluconate, or the gluconic acid product that we're interested in.
00:34:36.14		And the open white bars here give us the yield measurements
00:34:40.03		and in this case we've actually increased our yield from about 0.7,
00:34:43.23		and this was actually higher than what we had seen with the other system,
00:34:47.17		which tells us the cells are happier now,
00:34:49.06		and our yield in this case goes up to about 0.8, or a little bit higher than 0.8,
00:34:54.08		so we have about a twenty percent increase in the yield.
00:34:56.17		The gray bars that are shown here are this acetate byproduct,
00:34:59.28		and you can see an even larger reduction in the waste going to acetate.
00:35:04.29		So, we have again a twenty percent increase in the yield here,
00:35:08.13		but we also have almost a fifty percent decrease in waste,
00:35:12.12		that is this acetate waste.
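
The percentages quoted above are relative changes, and a short calculation makes the arithmetic explicit. The helper name is hypothetical, and the value 0.84 is an assumed reading of "a little bit higher than 0.8", not a number taken from the slide.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    if before == 0:
        raise ValueError("`before` must be non-zero")
    return (after - before) / before * 100.0


# A yield of exactly 0.8 would be roughly a 14% increase over 0.7;
# an assumed final yield of 0.84 gives the quoted figure of about 20%.
print(round(percent_change(0.7, 0.8), 1))
print(round(percent_change(0.7, 0.84), 1))
```

So the twenty percent figure is consistent with a final yield somewhat above 0.8, as stated in the talk.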
00:35:14.03		Now, the last thing that we wanted to look at was the timing of the induction
00:35:19.05		because we do know that based on exactly when we add this inducer to turn off Glk expression,
00:35:24.26		we could have the cell growth go way, way down,
00:35:27.15		I showed you that as a control plot, or if we wait too late,
00:35:30.22		then the cell is not actually able to respond
00:35:33.05		because it's going to stop being very active.
00:35:35.19		So, what we're looking at here now is the OD, or that is the growth,
00:35:39.06		at which we induce, starting from very early induction times,
00:35:42.08		up to later induction times, and then what's shown on the y-axis is the yield
00:35:46.19		relative to an uninduced culture. And we have two different yields that we're looking at,
00:35:50.27		one is the yield of product, and that's shown in the top here,
00:35:53.15		with the squares, and the second is the acetate yield,
00:35:56.22		or, again, a measure of waste that we have here.
00:35:59.29		What we find in this case is that the yield improvements are actually best
00:36:03.13		when we induce earlier. That means give the cells a little bit of time to grow,
00:36:07.03		but don't let them grow too far, and we can see in our best case
00:36:10.20		about a 70% reduction in waste and a 20% increase in product being produced.
00:36:15.29		Let me summarize the story that I've given you about glucaric acid.
00:36:21.06		I started by talking about how we could come up with a new pathway
00:36:24.18		to be able to make this compound that was still a natural product,
00:36:28.16		but whose natural pathway was too cumbersome to be produced in E. coli.
00:36:33.09		What we used in this case is part selection, or bioprospecting,
00:36:36.25		to find the enzymes that we could move from one source into another source,
00:36:41.04		and we're able to do this because once we know the DNA that encodes for those enzymes,
00:36:45.07		we can synthesize that DNA, and easily move it around between organisms.
00:36:49.15		And the second thing I showed you was this example of a synthetic biology device,
00:36:54.02		that was the protein-protein colocalization study,
00:36:56.26		which gave us increases in productivity.
00:36:59.04		And those protein-protein colocalization devices, or the scaffolds,
00:37:02.28		have been shown to be useful in other projects as well,
00:37:05.18		so that we know that they are reusable
00:37:07.14		and modular in a way that makes them very useful
00:37:10.10		for thinking about how do we actually engineer the metabolism of cells
00:37:13.26		to make the products that we're interested in.
00:37:15.16		And the last part that I showed you was an example
00:37:18.00		of how we might engineer the host, or chassis in the language of synthetic biology
00:37:22.10		to give us further improvements both in the titers,
00:37:25.16		that is the concentrations that we're interested in,
00:37:27.18		and in the flux, or the yield of the product that we want,
00:37:31.00		such that we get more of the substrate that we start with
00:37:33.23		going into more of the product that we're interested in.
00:37:35.29		I'd actually like to end this whole iBio seminar by acknowledging the folks that did the work.
00:37:41.26		I won't go through all the names,
00:37:43.12		but you can see them highlighted here in red,
00:37:45.13		as students who are both currently in the group working on these projects,
00:37:48.26		as well as former students and postdocs in the groups.
00:37:51.18		I've recognized John Dueber as a collaborator, he is still at the University of California at Berkeley,
00:37:56.13		and this work was primarily funded by the National Science Foundation
00:37:59.19		through SynBERC and through the Office of Naval Research through the Young Investigator Program
00:38:03.13		with the last part of it being funded primarily by the National Science Foundation
00:38:07.06		through the CAREER program. I hope you've enjoyed the iBio seminar
00:38:10.18		and thank you very much.