In the first part of her lecture, Dr. Prather explains that synthetic biology involves applying engineering principles to biological systems to build “biological machines”. The key material in building these machines is synthetic DNA. Synthetic DNA can be added in different combinations to biological hosts, such as bacteria, turning them into chemical factories that can produce small molecules of choice.
In Part 2, Prather describes how her lab used design principles to engineer E. coli that produce glucaric acid from glucose. Glucaric acid is not naturally produced in bacteria, so Prather and her colleagues “bioprospected” enzymes from other organisms and expressed them in E. coli to build the needed enzymatic pathway. Prather walks us through the many steps of optimizing the timing, localization and levels of enzyme expression to produce the greatest yield.
Introduction to Synthetic Biology and Metabolic Engineering
Concepts: Introduction to synthetic biology, synthetic DNA
00:00:06.09 My name is Kristala Prather, or Kris. I'm an associate professor
00:00:11.15 in the Department of Chemical Engineering at MIT
00:00:14.00 and today I'm going to talk to you about metabolic engineering and synthetic biology,
00:00:17.08 two fields in which I work and which have a very, very strong influence from biology.
00:00:22.01 If you think about biology in the world around you,
00:00:25.28 if you look around at nature, you'll see a lot of beauty and a lot of diversity.
00:00:30.17 You might see feathers on a peacock.
00:00:34.00 You can look at a nautilus shell
00:00:36.06 and see the structures, the symmetry, all of the details that are there.
00:00:40.08 You might look at the wing of a butterfly, for example,
00:00:42.28 and you see colors, and patterns, and lots of richness,
00:00:46.24 or even as you look at the leaf of a tree, you'll see lots of wonderful details and structures.
00:00:52.10 And all of that we see on a very large scale,
00:00:54.27 thinking about how nature gives us a lot of beauty, a lot of diversity,
00:00:59.01 a lot of function. If we actually look deeper, though, at that same leaf that came from the tree,
00:01:05.03 you'll notice that as you go closer and closer in scope,
00:01:08.02 or in scale, you'll see details that you couldn't see before.
00:01:11.20 You'll see that that leaf is actually composed of a series of individual cells,
00:01:15.15 and even within those cells, you see even smaller structures, structures we call organelles
00:01:20.25 that are all giving you this macroscopic property, or phenotype you see,
00:01:25.00 that you recognize as a leaf. And when we look at structures like this,
00:01:29.10 what we really realize is that it's really all about the DNA.
00:01:33.24 Everything that gives us what we see from nature,
00:01:37.00 that gives us the color, the structure, and the function,
00:01:40.11 the things that we think are beautiful, and the things that we think are useful,
00:01:44.08 really come back down to the DNA.
00:01:46.18 And it's really this focus on the DNA which is a hallmark of synthetic biology.
00:01:52.01 Synthetic biology has been defined in many different ways,
00:01:56.15 and it's actually interesting to think about some of the older definitions.
00:02:00.04 One of my favorites actually comes from 2000,
00:02:02.20 in a publication called Chemical and Engineering News.
00:02:05.21 And in that publication, synthetic biology was defined in the way that you can see on the screen now.
00:02:10.06 In particular, there's a focus on the use of non-natural, synthetic molecules.
00:02:15.24 That is, things that aren't really of biological origin,
00:02:19.05 and being able to use those molecules in order to give you function.
00:02:22.18 Ok, so the keys here are things being non-natural,
00:02:25.08 and the function that you would get in biological systems.
00:02:27.26 In the 10 to 15 years since this definition came out,
00:02:31.26 really synthetic biology has changed and the work that's going on in synthetic biology has become much broader.
00:02:38.13 And the definitions that are used more commonly than this first one that's given
00:02:42.17 are the following. One is that synthetic biology is the design and construction
00:02:47.15 of new biological parts, devices and systems.
00:02:52.01 And those parts, devices and systems are words that actually come from engineering.
00:02:57.07 So, one of the ways that we also think about synthetic biology, then,
00:03:00.22 is to re-design existing, natural biological systems for useful purposes,
00:03:06.06 which is really what engineering is all about.
00:03:08.12 Engineering is about design, it's about re-design, for useful purposes,
00:03:12.26 and that is, for specific applications.
00:03:14.25 So, you can see in just looking at how the definitions have changed,
00:03:18.07 from one in 2000, to the working definitions that are used more today,
00:03:22.15 we don't have to focus on unnatural pieces giving us natural or biological functions,
00:03:27.25 but rather, there's more of a focus on taking what nature has given us,
00:03:31.27 repurposing or re-using that for intentions that we as engineers
00:03:36.21 actually design.
00:03:38.02 So, for us, within what's called the Synthetic Biology Engineering Research Center,
00:03:43.24 this is also a center in which I work,
00:03:45.17 we have what I describe as a practical definition of synthetic biology.
00:03:49.00 And that is synthetic biology is the effort to make biology easier to engineer.
00:03:54.09 And it's this fusion of engineering principles with biology
00:03:58.01 that really gives synthetic biology its heart and its purpose.
00:04:02.04 And here are some of the engineering principles that we think about as engineers.
00:04:05.27 Things like design; by design, what we mean in that case is saying,
00:04:09.27 I want to build a certain machine that has this specific function,
00:04:14.16 and I know how to draw out or sketch out a way in which I get there.
00:04:18.17 If I think about modeling in an engineering sense, modeling is really about mathematics.
00:04:23.07 That means I can write an equation that actually will support my design,
00:04:27.25 and it represents as well the understanding I have of the underlying principles
00:04:32.24 that allow me to have that design.
00:04:34.17 And then we have these principles of characterization and abstraction,
00:04:37.23 and that really means the practice of going through your design,
00:04:42.02 what you have actually designed, to the point where you build that and then you test it,
00:04:45.22 and in the process of testing it, you characterize the system as a whole,
00:04:49.16 as well as the individual parts.
00:04:51.20 And finally, abstraction means actually being able to take, now,
00:04:55.09 a larger view and if I go back to my definitions of parts and devices,
00:04:59.18 it means not always having to look at the very specific level of detail,
00:05:03.26 but knowing that if I want some bigger function,
00:05:06.18 I could encode that in a simpler way.
00:05:09.08 So, the key technology in synthetic biology for all of this
00:05:13.08 is DNA synthesis.
00:05:14.27 And DNA synthesis is really about having biology or biological function
00:05:20.07 but taking a step where you really remove biology from that process.
00:05:24.01 Here's actually an example of how that works;
00:05:27.28 so if we think about biology and think about DNA,
00:05:30.27 I've already told you that all of biology is really about the underlying DNA,
00:05:36.02 there's a sequence of A's and G's and C's and T's
00:05:39.03 in the natural system that is the DNA and how those strings of nucleotides, as they're called,
00:05:44.20 are strung together, actually gives us the function that we're interested in.
00:05:48.14 So, I may study the biological system,
00:05:51.05 and then figure out, what is that sequence that gives me the function that I actually want,
00:05:55.22 or the function that I'm looking to be able to now design,
00:05:58.21 into a new system,
00:06:00.03 I can then go now to a computer, store that information digitally
00:06:04.25 and go through the design process that I talked about
00:06:07.27 where I can specify now my own sequence of those A's and G's and C's and T's,
00:06:13.01 to give me the function that I want and then rather than having to go back into the biological host,
00:06:17.28 I can take advantage of DNA synthesis to make the DNA that I want,
00:06:21.20 without actually having to go back into a biological host to do this,
00:06:26.06 but by rather recognizing that those sequences of A's and G's and C's and T's
00:06:31.02 are just chemicals and those chemicals can be synthesized without biology,
00:06:34.28 and they can be strung together without biology.
00:06:37.05 I can, though, once I've made those, put them back into a biological host,
00:06:41.26 and that gives me then the function, as far as biology is concerned, that I'm interested in.
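To make the idea of storing and specifying a sequence digitally concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the helper names (`validate`, `gc_content`, `reverse_complement`) and the example sequence are hypothetical and do not come from the talk.

```python
# Hypothetical helpers for treating a designed DNA sequence as plain text.
VALID_BASES = set("ACGT")
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def validate(sequence: str) -> str:
    """Return the sequence in upper case, or raise if it contains non-ACGT characters."""
    seq = sequence.upper()
    unexpected = set(seq) - VALID_BASES
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return seq


def gc_content(sequence: str) -> float:
    """Fraction of the bases that are G or C (0.0 for an empty sequence)."""
    seq = validate(sequence)
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(sequence: str) -> str:
    """Sequence of the opposite strand, read in the conventional direction."""
    seq = validate(sequence)
    return "".join(COMPLEMENT[base] for base in reversed(seq))


if __name__ == "__main__":
    designed = "ATGGCTAGCAAAGGAGAA"  # made-up sequence, for illustration only
    print(gc_content(designed))
    print(reverse_complement(designed))
```

Once a sequence is held as text like this, it can be checked and edited on a computer before anything is synthesized, which is the design step described above.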
00:06:46.24 So, this is a nice representation of this type of process,
00:06:51.27 from Seed magazine, this was drawn by Drew Endy,
00:06:54.18 who's now at Stanford University,
00:06:56.15 and it's just a cartoon representation of exactly what I described,
00:07:00.01 where you start at the beginning, with actually defining what that sequence is,
00:07:03.17 of the A's, the G's, the T's and the C's, from now a natural host,
00:07:08.04 you can then reconstruct those now as synthetic DNA,
00:07:11.19 and then this abstraction is actually the line that I'm crossing here,
00:07:15.10 where we think about now that I have that DNA, if you will,
00:07:18.28 encoding a function that I'm interested in,
00:07:20.24 and that may result in taking a certain input, converting it to a different output,
00:07:25.16 or stringing together different devices here now,
00:07:28.27 one that may have one function, one that has another function,
00:07:31.21 and having now a composite device,
00:07:34.20 as we would call it, that would give us this feature that we're interested in.
00:07:38.05 And I might be able to string these together in many different ways,
00:07:41.02 in order to get different kinds of functions now,
00:07:43.07 putting two devices together, or maybe multiple devices together that may give me a certain structure,
00:07:48.14 or feature, that has the function that I'm interested in.
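The idea of stringing devices together into a composite device can be sketched as function composition. This is a loose analogy, not a model of any real genetic part: the `inverter` and `amplifier` devices, and the choice of signals between 0 and 1, are assumptions made up for illustration.

```python
from typing import Callable

# A "device" is modeled as a function that converts an input signal to an output signal.
Device = Callable[[float], float]


def compose(*devices: Device) -> Device:
    """Chain devices so the output of each one becomes the input of the next."""
    def composite(signal: float) -> float:
        for device in devices:
            signal = device(signal)
        return signal
    return composite


def inverter(signal: float) -> float:
    """Hypothetical device: a high input gives a low output, and vice versa."""
    return 1.0 - signal


def amplifier(signal: float) -> float:
    """Hypothetical device: doubles the signal, saturating at 1.0."""
    return min(1.0, 2.0 * signal)


if __name__ == "__main__":
    composite_device = compose(inverter, amplifier)
    print(composite_device(0.8))
```

The point of the abstraction is that `composite_device` can be used without looking inside `inverter` or `amplifier`, in the same way that a composite device can be used without going back to the sequence level.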
00:07:50.29 So, if we now think about some examples of how we could make this work,
00:07:56.08 that is designing different pieces of DNA such that we put them together and get different functions,
00:08:01.11 you can see a movie that's playing on the screen now
00:08:04.23 where there are individual cells that are growing, they're dividing,
00:08:07.14 and you can see that they're actually blinking.
00:08:09.20 They're sometimes having light turned on and sometimes having light turned off.
00:08:13.22 This is actually an example of something that's called an oscillator,
00:08:16.27 that process of turning on and off means that the expression, in this case of this protein,
00:08:21.27 is oscillating. And that feature of being able to have a system now that blinks, if you will,
00:08:26.25 is something that can be encoded in the DNA,
00:08:29.03 by taking advantage of different parts,
00:08:31.08 for example, what's shown here is a Tet repressor,
00:08:34.05 a Lac repressor and a lambda repressor,
00:08:36.12 that all work together in a way that you have now a system that gives you sometimes the gene expression being on,
00:08:43.13 that is the light is on,
00:08:44.26 and sometimes the light being off.
00:08:46.15 So, this is an example of how a specific function, that is, oscillation, or blinking,
00:08:51.24 could be designed, there were models that were built in this system,
00:08:54.28 that is mathematical equations to describe how that had to happen,
00:08:58.16 and then you could actually build in those pieces with DNA,
00:09:01.10 these circles here are plasmid DNA, and the output, finally, is GFP.
00:09:05.22 And that's actually the protein that gives you the lightness or the darkness,
00:09:08.22 that is the blinking pattern.
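The kind of mathematical model mentioned here can be sketched in a few lines of Python. This is a simplified, dimensionless model of three repressors arranged in a ring, where each one shuts off production of the next; it is not the published model for this system. The parameter values (`alpha`, `n`), the step size, and the function names are assumptions chosen so that the sketch oscillates, and the reporter is assumed to simply follow one of the repressors.

```python
def simulate_oscillator(alpha=50.0, n=3, dt=0.005, steps=40000):
    """Euler integration of a three-repressor ring; returns the level of repressor A over time.

    alpha is the maximum production rate and n the steepness of repression
    (both assumed values). Each repressor decays at rate 1.
    """
    # Start the three levels unequal so the system does not sit on the symmetric steady state.
    a, b, c = 1.0, 2.0, 3.0
    trace = []
    for _ in range(steps):
        # Production of each repressor falls as the repressor before it in the ring rises.
        da = alpha / (1.0 + c ** n) - a
        db = alpha / (1.0 + a ** n) - b
        dc = alpha / (1.0 + b ** n) - c
        a, b, c = a + dt * da, b + dt * db, c + dt * dc
        trace.append(a)
    return trace


def count_upward_crossings(trace):
    """Count how many times the trace rises through its own mean value."""
    mean = sum(trace) / len(trace)
    return sum(1 for prev, cur in zip(trace, trace[1:]) if prev < mean <= cur)


if __name__ == "__main__":
    trace = simulate_oscillator()
    later_half = trace[len(trace) // 2:]
    print("upward crossings of the mean:", count_upward_crossings(later_half))
```

Each upward crossing corresponds to one "blink": the level of the tracked repressor, and with it the assumed reporter signal, rises and falls repeatedly rather than settling to a constant value.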
00:09:11.21 Here's a different example of being able to string those pieces of DNA together
00:09:15.25 in order to get a particular type of function that we want,
00:09:18.24 and in this case, the goal was to have effectively a bacterial photography system.
00:09:23.11 And in this case, there was DNA taken from many different pieces,
00:09:26.15 there was something called phytochromes, or light sensors, from an organism called Synechocystis.
00:09:31.23 There was something called an osmoregulation system,
00:09:34.23 and this is really just a way to make proteins from E. coli
00:09:38.15 and then a protein called LacZ, which has really been around for quite a long time,
00:09:43.18 by biological standards, since the late 1970s,
00:09:46.06 which allows you to either have color or not have color.
00:09:49.14 And you can see now in the pictorial diagram here the way this is supposed to work
00:09:53.14 is that when light is present, you're going to have now activation of this osmoregulation system,
00:09:58.27 that gives you an output which in this case is going to be black.
00:10:02.11 If there's no light that's present, then you'll have an output that's going to be light.
00:10:06.07 So, we can have a table that's written here, that would be our design table that says,
00:10:10.14 the first condition that we want is a light condition,
00:10:13.07 and in that case the LacZ is going to be low,
00:10:16.07 and the result is a light color.
00:10:17.15 The second condition now would be a condition that's dark.
00:10:20.08 In that case, the LacZ output is going to be high, and that's actually going to give us a dark color.
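The design table just described can be written down directly as a lookup. The Python sketch below follows that table (light gives low LacZ and a light color, dark gives high LacZ and a dark color) and applies it to a small mask. The names `DESIGN_TABLE`, `expose`, and `render`, and the toy mask, are hypothetical and only for illustration.

```python
# Design table: light condition -> LacZ output and resulting color.
DESIGN_TABLE = {
    True: {"lacz": "low", "color": "light"},    # light reaches the bacteria
    False: {"lacz": "high", "color": "dark"},   # bacteria are kept in the dark
}


def expose(mask):
    """mask is a grid of booleans, True where light reaches the plate.

    Returns a grid of the same shape holding the expected color at each position.
    """
    return [[DESIGN_TABLE[lit]["color"] for lit in row] for row in mask]


def render(image):
    """Draw dark positions as '#' and light positions as '.'."""
    return "\n".join(
        "".join("#" if color == "dark" else "." for color in row) for row in image
    )


if __name__ == "__main__":
    # Toy mask: False marks the positions that are shaded from the light.
    mask = [
        [True, False, True],
        [True, False, True],
        [True, False, True],
    ]
    print(render(expose(mask)))
```

Under this table, the shaded positions of the mask are the ones that come out dark.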
00:10:25.00 How does this actually work?
00:10:27.21 Well, what you can do now is to create a mask where if you look on the left-hand side here,
00:10:33.16 what's shown is 'Hello World,' where everything now that's white would be white,
00:10:37.24 and you would actually be able to shine light through the words hello and world
00:10:41.29 and you can see next to that then the result of what happens.
00:10:45.02 When the light output was low,
00:10:47.10 you have no color, when it's high, you have a dark color,
00:10:50.15 and that actually gives you now, in bacteria, bacteria that are dark that say hello,
00:10:55.07 bacteria that are dark that say world, and will actually recapitulate, or give you that image,
00:11:00.05 much like a camera would.
00:11:01.26 Here's an example of this same system now, taking it a little bit further,
00:11:07.01 with images that are even more complex.
00:11:09.05 And you see in this case, now, from a paper published in Nature from the same group,
00:11:13.08 that you can actually end up with a picture of a bacteriophage
00:11:16.14 based on this same principle of having a mask and exposing light,
00:11:20.27 and in the places where the light is there, you have a dark color,
00:11:23.21 when it's not, you have a lighter color.
00:11:25.25 And you can even go even further and make a picture of Andy Ellington,
00:11:29.02 who is the professor in whose lab this was developed.
00:11:32.08 These are examples, now, of being able to put biology, or biological pieces together
00:11:38.19 for functions, but as engineers, we often want to think about how do we actually solve problems,
00:11:43.26 whether they be problems in healthcare, in energy or the environment.
00:11:47.29 And so I'd like to give you a few examples of applications
00:11:51.12 that are emerging from synthetic biology
00:11:53.23 where researchers are actively working to build these biological systems
00:11:57.17 to address some of these global problems.
00:11:59.26 And the first example I'm going to give you is from the lab of Professor Ron Weiss
00:12:03.18 who's in biological engineering at MIT,
00:12:05.16 and he's been looking at the issue of diabetes.
00:12:08.19 There are two types of diabetes: type I and type II.
00:12:11.18 In type I diabetes, what actually happens is that your body destroys
00:12:16.08 the cells that make the insulin that you need
00:12:18.17 to control your glucose levels in the blood.
00:12:21.24 And so, you may have seen an image like this before,
00:12:25.02 where patients who have diabetes have to check their blood glucose levels,
00:12:29.07 they actually have to prick themselves to extract blood,
00:12:32.19 expose that to a glucose meter, and then based on their glucose levels,
00:12:36.01 decide to dose themselves with insulin or not.
00:12:39.05 Well, if we say the problem is in the pancreas,
00:12:41.26 the question is, can you actually engineer an artificial pancreas
00:12:46.11 or engineer cells that will perform the function of the pancreas
00:12:49.21 so that you now no longer need to have this process of measuring blood glucose levels,
00:12:55.00 and then actually dosing yourself with insulin.
00:12:57.09 So, what Professor Weiss is doing is looking at engineering stem cells
00:13:01.01 to be able to stay in an undifferentiated state
00:13:03.19 to then sense when the presence of these insulin producing cells has gone very low,
00:13:09.00 and then to differentiate and produce new cells, only up to a point,
00:13:12.18 and then to stay quiet again, or quiescent,
00:13:14.29 such that you maintain this population of cells
00:13:17.18 that can spontaneously produce new insulin producing cells whenever your body needs them.
00:13:22.19 That's an application in health.
00:13:25.23 There are other applications, for example, in the environment,
00:13:29.05 and this is actually a significant problem in agriculture,
00:13:32.02 which is that you have to provide a lot of nitrogen
00:13:35.09 to plants in order to get them to grow.
00:13:37.18 Proteins, for example, have a lot of nitrogen in them, and so it's necessary to provide that
00:13:42.03 because it can be difficult to actually extract it in a way that's usable.
00:13:46.04 But it turns out that there are certain organisms that will actually live on the roots of plants
00:13:51.02 that have the ability to fix nitrogen, that means they can take nitrogen from the atmosphere,
00:13:56.00 and convert it into the kind of nitrogen which is useful for plants.
00:14:00.01 And so Professor Chris Voigt, who's in biological engineering at MIT
00:14:03.28 has been looking at whether or not you could take that ability to fix nitrogen, as we call it,
00:14:08.21 that is to take nitrogen out of the atmosphere and put it into a usable form for plants,
00:14:13.25 can you actually take that capacity from these microorganisms and put it directly into the plants
00:14:20.10 so that you actually have a need for much less fertilizer in the environment.
00:14:25.13 Here's a third example of a way that now a group of students
00:14:30.23 were looking at using synthetic biology to be able to really address a critical problem in both health and the environment.
00:14:37.14 And this is actually part of the iGEM program, you can see the URL for that at the bottom of the screen,
00:14:41.28 where iGEM stands for International Genetically Engineered Machine
00:14:46.01 and the iGEM competition is an opportunity for students from all over the world to come together
00:14:51.15 and decide for themselves, here's a problem that we want biology to try to solve
00:14:55.21 and then to go through this process of designing, modeling, characterizing and building these systems
00:15:00.29 to see if they can address those problems.
00:15:02.29 The University of Edinburgh iGEM team in 2006
00:15:07.15 decided to try to tackle the problem of groundwater contaminated by arsenic in Bangladesh.
00:15:13.03 They studied the problem, found that it really is significant,
00:15:15.29 in terms of a lot of the groundwater being contaminated
00:15:18.15 and there not being really any easy systems for villagers to know whether or not a source of water
00:15:24.14 was safe to drink or not.
00:15:26.03 So, they decided to take pieces from biology that naturally responded to arsenic
00:15:31.07 and to build a sensor that would tell them whether or not there was arsenic in the water or not.
00:15:35.29 And it was actually designed after something you may have seen,
00:15:38.29 which is just a sensor that tells you, for example, the chlorine level and the pH level in a pool.
00:15:43.14 The idea being that you could take a sample of water, you could add now this sample of bacteria,
00:15:48.19 E. coli in this case, that could detect the arsenic.
00:15:51.19 If the arsenic was present at a certain level, the colors would become very bright,
00:15:55.13 and you would know that that water was not safe to drink.
00:15:58.02 Now, I want to actually switch gears a little bit and talk about metabolic engineering,
00:16:04.07 which is an area that's been around for a while,
00:16:06.10 but we're increasingly seeing a merger between principles of metabolic engineering and those of synthetic biology.
00:16:12.16 And metabolic engineering is really about the fact
00:16:15.17 that biology is very good at doing chemistry;
00:16:18.09 that is, from biological systems, you can get a wide range of chemical molecules
00:16:23.10 that have useful functions. And I've shown two of them here.
00:16:26.02 The first one is called caspofungin, and then there's another one that's shown here that's called lovastatin.
00:16:31.15 Caspofungin is actually an antifungal compound, that is, it's used to treat fungal infections,
00:16:37.00 and lovastatin is one of the first cholesterol lowering drugs.
00:16:40.11 So, you've heard about statins, perhaps, and there are lots of them now,
00:16:43.14 but lovastatin was one of the first that was discovered.
00:16:45.29 Both of these are naturally produced by biological systems
00:16:49.20 and they've been very useful as natural products, we would call them in this case,
00:16:53.21 to have therapeutic functions. And traditionally, when we think about biology being used for chemistry,
00:16:59.06 it's usually for molecules like this.
00:17:01.06 If you look at caspofungin, for example, you can see that it has complexity
00:17:04.20 both in terms of just the number of atoms that it has, it's a pretty big molecule,
00:17:09.17 and then you'll also see a lot of these hydroxyl groups, you'll also see chiral centers,
00:17:13.16 which are shown now by the bold, or the arrows that go back and forth.
00:17:18.14 And so traditionally, if you think about how synthetic chemistry works
00:17:22.18 it's not that chemistry can't make a molecule like that,
00:17:25.09 but the yields would be very low, it would take a large number of steps
00:17:29.03 to get to that compound, whereas you have a biological organism
00:17:32.19 that can make these molecules very easily.
00:17:35.01 And so it's molecules like this that traditionally have been made by biology.
00:17:39.28 Now, I want to introduce as well a couple of other molecules, one being an amino acid, glutamic acid,
00:17:45.12 and the other being an organic acid, malic acid.
00:17:48.12 And these are also molecules that biology can make efficiently
00:17:52.09 using biological means as opposed to chemistry.
00:17:55.14 And what I mean by that is they can be produced commercially through fermentation.
00:17:59.16 So, you have an organism that's capable of making these compounds,
00:18:03.12 you can grow them up in very large quantities,
00:18:05.12 and now you have a product that you can bring to market.
00:18:08.02 What's true about all of these molecules is that they are produced by organisms
00:18:13.11 that naturally make them and the goal when metabolic engineering first arose
00:18:17.25 was to figure out how do you actually get these organisms
00:18:20.28 to do what they do better.
00:18:23.11 And better, in an engineering sense, means to make more of the molecule, to make it faster,
00:18:28.13 and to make it more efficiently and the efficiency part, it's typically considered as yield.
00:18:33.19 That is, how much of the starting material that goes into the system
00:18:37.06 ends up in the product that you're interested in.
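The three meanings of "better" can be put into numbers with a short Python sketch. The labels titer (how much), productivity (how fast), and yield (how efficiently) are common names assumed here, yield is computed on a mass basis as an assumption, and all of the numbers are invented for illustration.

```python
def titer(product_g: float, volume_l: float) -> float:
    """How much: grams of product per liter of culture."""
    return product_g / volume_l


def productivity(product_g: float, volume_l: float, hours: float) -> float:
    """How fast: grams of product per liter per hour."""
    return product_g / (volume_l * hours)


def yield_fraction(product_g: float, substrate_g: float) -> float:
    """How efficiently: grams of product per gram of starting material consumed."""
    if substrate_g <= 0:
        raise ValueError("substrate_g must be positive")
    return product_g / substrate_g


if __name__ == "__main__":
    # Invented example: 100 g of glucose fed, 25 g of product made in 1 L over 50 h.
    print(titer(25.0, 1.0))               # 25.0 g/L
    print(productivity(25.0, 1.0, 50.0))  # 0.5 g/L/h
    print(yield_fraction(25.0, 100.0))    # 0.25 g product per g glucose
```

In this made-up example a yield of 0.25 means that a quarter of the glucose mass that went into the system ended up in the product of interest.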
00:18:39.00 So, I have a graduate student who once came up with this analogy, or this cartoon,
00:18:44.09 to describe how metabolic engineering actually works in terms of improving these natural producers.
00:18:49.14 And what you see here is a maze, where you have this poor mouse, Wemberly,
00:18:53.28 that's lost its pet rabbit Petal, and Wemberly has to figure out how to get to Petal.
00:18:58.20 And you can see, as with any maze, there are a number of different starting points
00:19:02.12 that the mouse could use in order to get to the end point.
00:19:05.00 However, we know not all of those are going to be productive.
00:19:07.22 So, with metabolic engineering, what you want to do is to remove those routes
00:19:12.05 that are going to be non-productive.
00:19:13.11 That means to actually knock out, or delete, competing pathways.
00:19:17.07 Pathways that would actually take your substrate, your intermediate or your carbon
00:19:21.29 a place that you don't want it to go.
00:19:23.23 The other thing that you might want to have in order to have this faster objective met
00:19:28.13 is a little bit of a stimulation or motivation
00:19:31.17 for the enzymes to be overexpressed.
00:19:33.13 And overexpressing those enzymes, you can increase the amount of a limiting enzyme
00:19:40.01 in order to get more of that through the system.
00:19:42.26 And now again, in our cartoon fashion, what that means is encouraging the mouse to run a little bit faster
00:19:47.10 and to get through the maze quicker than it otherwise would.
00:19:50.12 So, I finally want to introduce just as background two other molecules that are interesting
00:19:57.06 both from a metabolic engineering and a synthetic biology standpoint.
00:20:00.23 And these are 1,3-propanediol and artemisinic acid
00:20:04.03 and you can see on the slide the uses for them.
00:20:06.20 1,3-propanediol, or PDO, as it's called, is an industrial chemical that's also used for materials production,
00:20:12.09 and artemisinic acid is a precursor to an anti-malarial drug.
00:20:16.19 Now, these are also compounds that are produced by biology,
00:20:20.00 meaning that we can make them through fermentation,
00:20:22.07 that is growing up a large number of microorganisms to produce the compound that we're interested in.
00:20:27.14 They're also natural products, meaning that they're naturally produced by organisms.
00:20:32.01 But the difference between these two molecules and the first four examples that I gave
00:20:36.05 is that those molecules are produced naturally by one particular host,
00:20:41.17 but it's actually a different host that's been able to be used to have them produced economically.
00:20:46.28 And this allows us now to think about that DNA that we talked about, in terms of moving that around,
00:20:52.19 to be able to move it to reconstitute natural pathways in heterologous hosts,
00:20:57.19 or in hosts that don't normally contain that pathway.
00:21:01.10 Here's actually an example of doing just this thing.
00:21:05.14 So, the artemisinic acid that I told you about is a precursor to the drug that's shown here,
00:21:09.17 which is called artemisinin. It's naturally produced in a plant that's called Artemisia annua
00:21:14.17 and the goal is to be able to have, rather than that plant, a yeast cell
00:21:19.10 make this same compound. The reason for that is that you can put yeast cells into a factory
00:21:24.11 that looks much like factories that you may have seen before,
00:21:27.00 or, if you think about yeast and fermentation, this might actually be a brewery,
00:21:31.13 or a beer manufacturing unit.
00:21:33.00 And you can't take plants and actually scale them up in that same way.
00:21:37.02 Instead, you have to plant plants in the ground,
00:21:39.13 and wait for the proper amount of sunlight and nutrition in order for them to grow.
00:21:43.11 So, if my goal is to actually have a compound that's produced in Artemisia annua,
00:21:48.27 to have that be produced in a yeast, so that I can put it into a factory,
00:21:53.06 what that really means is identifying the DNA that encodes for those enzymes
00:21:57.25 that gives me the chemical that I'm interested in. I now can go through this process that I talked about before,
00:22:03.01 of sequencing that DNA and then synthesizing the DNA to get just those pieces that I need,
00:22:08.16 and then I can move that DNA now into my unnatural, or my heterologous host,
00:22:13.25 and that host, once it's properly engineered, is able to make the compound that I'm interested in,
00:22:18.22 and I can actually grow it up now in a large factory.
00:22:21.20 And this is work that's been done by Professor Jay Keasling, in chemical engineering at UC-Berkeley.
00:22:26.20 So, the work that's done in my lab is really focused on expanding this capacity of biology
00:22:33.04 to do chemistry. And we're motivated by the diagram that's shown here,
00:22:37.01 where if we think about the materials that we get in our world today,
00:22:40.16 where they come from and what they're used for, most of it comes from crude oil as the input,
00:22:45.15 and the outputs are things that you're familiar with, which include fuels,
00:22:49.05 which I think is mostly what we think about in terms of oil being used for,
00:22:52.27 but also quite a large bit of petrochemicals.
00:22:55.22 And these are actually the molecules, olefins and aromatics are highlighted here as examples,
00:22:59.24 that are used for polymers, for resins, for adhesives, et cetera.
00:23:03.28 That is, those are molecules where the chemicals that are being produced
00:23:08.05 are being used for their mass properties, or their properties as chemicals
00:23:11.14 and not for their energetic properties, which is what we use them for in fuels.
00:23:15.18 And we've talked a lot in this country and across the world
00:23:18.19 about replacing crude oil as the input for this process,
00:23:21.29 and instead we can think about creating what we might describe as a biorefinery,
00:23:26.05 where the input in that case, rather than being oil, is glucose or other sugars,
00:23:31.07 that might come from biomass, in the same way that we now want to be able to make biofuels,
00:23:36.01 we want to be able to make more chemicals,
00:23:38.21 that is, the same chemicals that give us the function that we're used to from petrochemicals,
00:23:42.23 we want to be able to access those from biomass as well.
00:23:46.17 In the second part of my talk, I'm actually going to give you examples from my lab,
00:23:50.14 where we focus on exactly this, that is, building new kinds of chemical molecules
00:23:55.02 from biology in different ways that really take advantage of the key principles of synthetic biology,
00:24:00.25 but also are very firmly rooted within metabolic engineering as well.
00:24:05.23 So, this is actually our vision of how that happens, this is a cartoon representation from an artist at MIT,
00:24:13.16 where really what we're looking at doing in expanding the capacity of biology to do chemistry,
00:24:18.03 is to think about these microbes, as it were, this is an E. coli representation,
00:24:22.21 as little chemical factories, where we can now, inside the cell, engineer different pathways
00:24:28.12 to make different products and that same image that I showed you before
00:24:31.28 of a large factory, we can think about that on a greatly, greatly magnified scale
00:24:36.23 or a greatly miniaturized scale, I should say, in terms of having now small microbes give us this same capacity.
00:24:43.00 So, let me give you my final thoughts about the field of synthetic biology
00:24:47.10 and a little bit about metabolic engineering. Synthetic biology is a very diverse field
00:24:51.21 and it's actually composed of very diverse individuals as well,
00:24:55.06 and so people like myself, who work in metabolic engineering are in that field,
00:24:58.13 those who are trained as electrical engineers, as computer scientists, as biological engineers,
00:25:03.11 as physicists, there are a lot of different people in this area who are looking at how do you actually use DNA
00:25:08.21 in order to get important functions of interest to solve the problems that we have to solve in the world.
00:25:13.25 The problems that are being worked on are very diverse problems;
00:25:16.25 I gave you examples that come from health, from the environment, from energy,
00:25:21.02 and again, this diverse set of people are working on this diverse set of problems
00:25:24.21 and are also taking diversity of approaches towards solving those problems.
00:25:28.27 And I would say that the goal for all of us as we go through this
00:25:32.10 is to actually make biology easier to engineer,
00:25:35.02 so that we really can bring solutions to some of our most pressing global problems.
00:25:39.22 In the second half of my talk, I'll talk much more about examples from my lab,
00:25:43.10 but this is my overview for metabolic engineering and for synthetic biology.
- What are some key differences between science and engineering? How does synthetic biology bridge these fields?
- What principles of engineering are used in synthetic biology?
- Map these onto another engineering process.
- How are these principles applied in other areas of biology research?
- What are some real world applications of synthetic biology, particularly using hosts other than yeast and bacteria?
- What are additional challenges to implementing synthetic biology outside of the lab?
Teaching an Old Bacterium New Tricks
Concepts: Engineering the production of glucaric acid from glucose in E. coli
00:00:06.07 My name is Kristala Prather and I'm an associate professor of chemical engineering at MIT.
00:00:11.18 In the first part of my presentation, I gave an overview of metabolic engineering and synthetic biology
00:00:16.23 and now I'd like to talk more specifically about work being done in my lab
00:00:20.24 towards expanding the capacity of biology for chemistry, or as I've titled it here,
00:00:25.22 teaching an old bacterium new tricks.
00:00:28.04 So, in my introduction, I gave this maze as an example of how we think about metabolic engineering,
00:00:34.23 where you have the example here of a mouse Wemberly
00:00:38.08 that's lost its pet rabbit Petal and there's a maze of possibilities
00:00:42.16 of how the mouse might get to the rabbit it's looking for.
00:00:45.22 And our goal is to be able to block off, or to obstruct,
00:00:49.24 those pathways which are not going to be productive,
00:00:52.06 or to stimulate, in this case, the mouse to run faster, or in biological terms,
00:00:57.11 to increase the rate at which material will flow through our maze
00:01:00.15 so that we get to the product that we're interested in more quickly.
00:01:03.17 Now, there's another way that we actually think about engineering pathways
00:01:07.15 that also looks at this maze analogy.
00:01:09.25 In this case, our goal is completely different. Now, we actually want to blow the maze up
00:01:16.00 so that rather than forcing the mouse to run from one to the other through all these obstacles,
00:01:21.20 we have a more direct way to get from point A to point B.
00:01:25.00 And that's actually the focus of much of the work that goes on in my lab.
00:01:28.18 When we started thinking about this problem,
00:01:31.23 that is how do we actually get biology to do more chemistry,
00:01:35.19 the question was, well, what kind of targets could we look at?
00:01:38.09 What are good molecules to look at that might be produced by biology?
00:01:42.00 And in 2004, the US Department of Energy put together a report
00:01:46.02 called "Top Value Added Chemicals from Biomass"
00:01:48.21 where they actually sought to answer that very question.
00:01:51.09 That is to say, if you're using biology, or biomass,
00:01:55.01 as the input for chemicals, what are the right molecules that you'd want to produce.
00:01:59.20 And they came up with a list which is actually called the "top 10" list in the literature
00:02:04.21 of building block molecules.
00:02:06.08 Now, I always find this interesting because it turns out the top ten list actually has 12 lines
00:02:11.24 and a few of these lines have more than one molecule,
00:02:14.09 but nevertheless, it's called the top ten list because that sounds a lot better than the top 14 or 15 list.
00:02:19.12 If we look at this list, we see things for example like glutamic acid,
00:02:23.21 and that's an amino acid. We see aspartic acid, which is also an amino acid,
00:02:28.15 and those were compounds that we weren't really interested in working on
00:02:31.28 because our challenge was to find a pathway that either didn't exist
00:02:36.00 or one that was really, really complicated that we could, again, blow up our maze,
00:02:40.07 if we use that analogy, in order to get to the compound that we're interested in.
00:02:43.27 So, when we looked at this list, we eliminated compounds like that.
00:02:47.22 We also eliminated compounds like glycerol, which it turns out is actually relatively cheap today,
00:02:53.16 but wasn't when this report was first produced.
00:02:56.16 So, once we went through this process,
00:02:58.12 of saying, well here are things that we're not intellectually interested in,
00:03:01.17 and here are compounds that we don't think really give us the value that we want,
00:03:05.08 we began to focus on a couple of different compounds,
00:03:08.16 one of which is glucaric acid, that's shown here,
00:03:10.25 and I'd like to talk to you today about our work that we've done
00:03:14.06 to be able to produce this compound in a microbe, namely, E. coli.
00:03:19.00 Glucaric acid, as it's shown again here, is a structure that has 6 carbons,
00:03:24.13 so it's actually pretty similar to glucose in how it's arranged, and it is actually a natural product
00:03:30.07 and I mentioned natural products in the first half of my talk
00:03:32.24 as being compounds that are naturally produced by nature.
00:03:35.26 It turns out this is a compound that's produced in fruits and vegetables
00:03:39.09 and also in mammals, but there's no known microbial pathway for it,
00:03:43.10 meaning that if we look at the simplest organisms,
00:03:45.22 the ones that are easiest to think about putting into a factory,
00:03:49.13 there are no microbes like that where we know that glucaric acid is produced.
00:03:53.02 This compound has been studied for therapeutic purposes
00:03:56.10 either as an agent to reduce cholesterol, or even possibly to fight cancer,
00:04:00.25 but we've actually been more interested in its properties as a monomer
00:04:04.04 for different kinds of materials, or as detergents.
00:04:07.11 And the final bullet point on this slide just emphasizes the fact
00:04:10.10 that we actually know how to make this compound chemically, from glucose,
00:04:14.19 but it turns out that that process, the way it exists now,
00:04:17.13 is pretty messy, it requires a lot of harsh materials,
00:04:20.21 and so it's both not economical and not environmentally friendly.
00:04:24.22 So, we set out to come up with a way to make glucaric acid using biology.
00:04:29.05 This is actually what the natural pathway looks like
00:04:33.10 and hopefully you can see a lot of arrows here that might make you a little bit squeamish
00:04:37.15 if you were going to graduate school and your advisor said,
00:04:40.09 you've got to get this whole thing to work in E. coli.
00:04:43.08 Just to give you a quick overview of this pathway,
00:04:46.10 the compound we're interested in, glucaric acid, is in this box at the top.
00:04:50.00 I mentioned that this is something that could come from glucose
00:04:52.18 and we actually will use glucose as our starting compound as well,
00:04:55.20 and glucose is on this figure.
00:04:57.20 I'll give you a second to look and see if you can find it,
00:04:59.27 because it turns out there are quite a bit of arrows here,
00:05:02.17 but if you look very closely along the left-hand side, then you can see glucose right here.
00:05:07.15 You can also see that all these arrows are going back and forth,
00:05:10.22 you have this interaction with the pentose phosphate pathway,
00:05:13.17 you have another sugar, galactose, which is one input,
00:05:16.13 and you actually have an additional output, which is ascorbic acid.
00:05:19.23 This is a mess,
00:05:20.27 and this is not something that we would really want to think about putting into E. coli.
00:05:25.14 So, our challenge was to figure out, is there a different way for us to get from glucose
00:05:30.26 to the molecule that we're interested in, that would be much simpler,
00:05:33.25 that would have much, much less of this maze-like effect.
00:05:36.22 One of the nice things about having a molecule, however, that is a natural product,
00:05:42.25 is that we could go to the databases and say, is glucaric acid there?
00:05:46.26 That is, in known metabolism, is there an example where glucaric acid has been found
00:05:52.04 to be associated with biology.
00:05:54.16 And in fact, what we found is that glucaric acid could be produced from a compound
00:05:57.29 called glucuronic acid and it can be produced using an enzyme called uronate dehydrogenase
00:06:03.03 that's actually found in a bacterium called Pseudomonas syringae.
00:06:06.15 But that was sort of the end of the story as far as Pseudomonas was concerned.
00:06:11.00 With our glucuronic acid, now, we could go back to the databases again
00:06:14.20 and ask the same question, that is, do we see glucuronic acid being produced by nature,
00:06:18.23 and in fact what we could find is that glucuronic acid could be produced
00:06:22.00 from a compound called myo-inositol with an enzyme called myo-inositol oxygenase.
00:06:26.28 And that enzyme is found in a number of sources, a number of mammalian sources,
00:06:31.12 and fungal sources, and we actually chose the variant from mouse
00:06:35.04 because it was one that had been shown to work well when it was expressed in E. coli.
00:06:39.15 But that was really the end of the story as far as mammalian biology was concerned,
00:06:43.11 but if we said where else does myo-inositol show up in metabolism,
00:06:47.21 we could actually find a linkage directly from myo-inositol, or to myo-inositol,
00:06:52.00 from glucose, and that was work done by John Frost's lab at Michigan State,
00:06:55.18 where he showed that you could use glucose as the input,
00:06:58.15 you would go through glucose-6-phosphate,
00:07:00.14 and then you would have just a single recombinant enzyme,
00:07:03.09 that is, a yeast myo-inositol-1-phosphate synthase that would produce
00:07:07.08 myo-inositol-1-phosphate and that, in E. coli, was naturally dephosphorylated
00:07:12.06 in order to give the myo-inositol compound that we're interested in.
00:07:14.24 So, now, rather than having this very complex network
00:07:17.24 of 11 or 12 steps, we really only need 3 different enzymes
00:07:21.29 to be expressed in E. coli, although from three very different sources,
00:07:25.24 in order to get the compound that we're interested in.
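The three-enzyme route just described can be written down as a small data structure. This is only a minimal sketch for illustration: the `Step` class, its field names, and the helper functions are hypothetical, and the only facts encoded are the substrates, products, enzymes, and source organisms named in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    substrate: str
    product: str
    enzyme: str   # recombinant enzyme catalyzing this step
    source: str   # organism the gene was taken from


# Only the three heterologous steps are listed. Two steps are left to the
# host: E. coli itself makes glucose-6-phosphate from glucose, and it
# naturally dephosphorylates myo-inositol-1-phosphate to myo-inositol.
PATHWAY = [
    Step("glucose-6-phosphate", "myo-inositol-1-phosphate",
         "myo-inositol-1-phosphate synthase", "yeast"),
    Step("myo-inositol", "glucuronic acid",
         "myo-inositol oxygenase", "mouse"),
    Step("glucuronic acid", "glucaric acid",
         "uronate dehydrogenase", "Pseudomonas syringae"),
]


def recombinant_enzymes(pathway):
    """Names of the enzymes that must be expressed in the host."""
    return [step.enzyme for step in pathway]


def sources(pathway):
    """Distinct organisms the genes come from."""
    return {step.source for step in pathway}


if __name__ == "__main__":
    for step in PATHWAY:
        print(f"{step.substrate} -> {step.product}: {step.enzyme} ({step.source})")
```

Listing the steps this way makes the contrast with the natural pathway concrete: three recombinant enzymes from three different sources, with everything else supplied by the host.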
00:07:27.26 And so we could take advantage of that to actually have the first gene
00:07:31.17 directly PCR-amplified because we knew that would work in E. coli from John Frost's work.
00:07:36.11 The second gene we could take advantage of this DNA synthesis
00:07:39.23 that I talked about in the first part to be able to have this version of the gene synthesized
00:07:44.21 but synthesized in a way that E. coli would be able to produce it more easily
00:07:49.11 than the natural sequence of DNA that would come from mouse,
00:07:52.06 and then we actually had to do a little bit of work to figure out what was the sequence of DNA,
00:07:57.05 or the gene, encoding the uronate dehydrogenase in bacteria.
00:08:01.23 But once we were able to do that, we now had all three of the genes that we needed
00:08:05.18 to put into E. coli to see whether or not it could make glucaric acid.
00:08:09.19 So, when we co-expressed all three of these genes,
00:08:13.28 what we found was exactly what we hoped to find.
00:08:15.28 And that is that we got glucaric acid being produced.
00:08:19.03 And the figure that's shown here shows the titer, or the concentration, in grams per liter,
00:08:24.05 of glucaric acid that we can measure in the culture medium.
00:08:27.13 So, this is actually spit out by the cell into the surrounding medium.
00:08:31.20 And I have two different bars that are shown here, one that has 0.1 millimolar IPTG
00:08:36.10 and one that has 0.05 millimolar IPTG.
00:08:39.04 I want to take just a second and explain what that really means.
00:08:42.07 IPTG in this case is what we'd call an inducer;
00:08:45.15 that means that it's something that we add to the culture that tells the cells
00:08:48.26 you should start making the proteins, or the enzymes, that we're interested in.
00:08:52.16 And what's shown now is the result on this slide, as something that we see a lot of times,
00:08:56.27 which is that if we have a somewhat higher concentration of our inducer,
00:09:00.21 where we're making more protein, you see that we actually have less of the product
00:09:04.27 than if we have a lower concentration of our inducer.
00:09:07.00 And that's really a core principle of metabolic engineering, which is that
00:09:10.18 changes that we make to the cell have these very broad systems-wide effects
00:09:14.23 that we don't always understand.
00:09:16.08 And so every time we seek to engineer an organism to make a compound we're interested in,
00:09:21.17 we have to go through this trial and error process of trying to identify
00:09:24.24 what really are the best conditions to make the compound that we're interested in.
00:09:28.25 The second thing that I want to point out is that we see,
00:09:31.22 besides glucaric acid being produced, we also find that we have myo-inositol,
00:09:36.13 which is accumulating, meaning we can measure that in the culture medium.
00:09:39.14 And the fact that that myo-inositol is there, it lets us know that the enzyme
00:09:44.06 which is converting myo-inositol to glucuronic acid is a limitation in the system.
00:09:48.22 That is, it's not working the way it's supposed to work, such that all the myo-inositol that's produced
00:09:53.24 is converted to glucuronic acid, and then on to glucaric acid.
00:09:57.20 I always think at this point, there must be a joke in here somewhere.
00:10:01.24 We have a yeast, a mouse and a bacterium
00:10:05.07 and they all go into a bar and I'm not really sure what the end result is here,
00:10:08.27 but we know that glucaric acid comes out somewhere.
00:10:10.26 Unfortunately, it's not quite that easy and we have a lot of challenges that we have to try to address
00:10:16.12 in trying to actually get the cells to make a lot more of this product that we're interested in.
00:10:20.20 The first of those challenges actually comes into place
00:10:24.26 when we actually look at the fact that we have this myo-inositol accumulating,
00:10:28.11 as I pointed out in the first graph, that showed glucaric acid being produced.
00:10:31.25 And in this case now, if we take a closer look at this enzyme,
00:10:35.00 all we're focused on is this one reaction.
00:10:37.21 We can see that this MIOX gene, the myo-inositol oxygenase,
00:10:41.12 takes myo-inositol as its input. It also uses molecular oxygen
00:10:45.19 and the product that's produced now is glucuronic acid.
00:10:48.15 And so we know that the cells are not actually doing this reaction,
00:10:52.29 that is, converting myo-inositol to glucuronic acid, at a fast enough rate
00:10:57.08 to consume all of it. So, if we study that enzyme by itself,
00:11:00.26 the experiment we did in this case was to look at cells producing just this enzyme,
00:11:05.01 so it doesn't have the first enzyme, which gives us myo-inositol,
00:11:08.09 it doesn't have the third enzyme, which actually takes that glucuronic acid
00:11:12.05 and converts it to glucaric acid.
00:11:13.19 Instead, we're looking at this in isolation, and we looked at two different conditions:
00:11:17.20 one where we actually have myo-inositol present in the culture medium
00:11:21.23 as we're growing up the cells and making the protein,
00:11:24.06 and one where it's missing.
00:11:26.04 And the only difference now is that at a point where we measure the activity of the cells,
00:11:30.22 we actually have some cells that saw substrate, that is the myo-inositol,
00:11:34.18 and some that didn't, but at the same time, when we would go to analyze them,
00:11:38.23 we take the cells away, so now there's no myo-inositol,
00:11:42.05 we break open the cells and release the protein and we expose those cells
00:11:46.18 to the same concentration of the substrate.
00:11:48.25 And in doing that and measuring the activity,
00:11:51.10 what we find is that for the cells that were able to previously see the substrate,
00:11:55.23 the activity of that protein is about an order of magnitude higher
00:11:59.03 than the cells that only saw substrate for the first time after the protein had actually been produced.
00:12:04.20 Well, so this actually raised an interesting question for us.
00:12:08.09 And we thought about how we actually solve this problem,
00:12:10.27 and I can tell you the answer is not toss in a lot of myo-inositol,
00:12:14.11 because that's actually cheating. What we want to do is start from glucose,
00:12:17.19 which is going to be a more cheaply available substrate,
00:12:20.02 and make the product that we're interested in.
00:12:22.00 But now we can think about this as engineers and say, well,
00:12:26.03 what information do we have that actually gives us some guidance
00:12:29.14 on how we might actually be able to solve this problem,
00:12:32.04 even if we don't exactly understand the underlying reasons for the phenomenon that we see.
00:12:37.07 And so the first thing that we thought is, ok, what we want then
00:12:40.15 is for that first enzyme, the INO1, to make a lot of the myo-inositol,
00:12:45.14 and then that would be really good
00:12:47.06 because that's what we need for the second enzyme to be effective.
00:12:50.07 The only problem with that is that it sounds really good to say that,
00:12:53.06 but as we've worked on that, that turned out to be a lot easier said than done.
00:12:56.26 At the same time as we were looking at this,
00:12:59.12 we actually came up with another idea.
00:13:02.28 In this case, the idea came from a collaborator, John Dueber,
00:13:07.07 in SynBERC, which is the Synthetic Biology Engineering Research Center,
00:13:10.18 and John's work was looking at something called enzyme colocalization,
00:13:15.05 where the goal here was to be able to take enzymes
00:13:17.22 that normally might be freely dispersed throughout the cell,
00:13:20.15 with no reason for them to be together,
00:13:22.15 and to cause a way for those enzymes to be physically located next to each other.
00:13:26.24 In fact, what happens in this case is that the enzymes, shown here now as MIOX and INO1,
00:13:32.19 are actually exposed, or they have covalently attached to them these tags,
00:13:37.01 and those tags fold into a certain 3-dimensional conformation
00:13:40.28 that can then be recognized by a different piece of a protein.
00:13:45.08 That piece of a protein can then be put into something that we call a scaffold,
00:13:48.16 and if you now have the scaffold in the cell,
00:13:51.08 and you have these enzymes that are tagged with pieces that will recognize that scaffold,
00:13:56.04 that actually causes two enzymes to become located close to each other within the cell.
00:14:01.22 So, our idea here was very simple, that if we couldn't actually change the activity of the enzyme
00:14:06.19 and the way that we could get the upstream enzyme to make much more product,
00:14:10.11 if we actually reduced the distance between the two enzymes,
00:14:13.26 that would give us a higher local concentration of myo-inositol,
00:14:17.18 and maybe if that local concentration was higher,
00:14:19.21 that would give us the higher activity that we had seen before,
00:14:22.13 and that would actually give us higher yields and productivities.
00:14:25.15 And the first way that we tested this was exactly as it's diagrammed on this slide,
00:14:31.02 where we actually had just these two enzymes being recruited to the scaffold,
00:14:35.01 in a one to one ratio, and in doing that, we actually got an increase of about a factor of 3
00:14:40.29 in the amount of glucaric acid that we were producing.
00:14:43.17 Now, as all good scientists, we have to ask ourselves,
00:14:46.27 is this working the way that we want it to work?
00:14:49.10 And I'll remind you that our theory here was that what we would get was not just more glucaric acid,
00:14:55.05 but that that would happen because we would have a higher activity of MIOX,
00:14:58.23 that is, we would have better activation, and that would result in this faster conversion
00:15:02.27 that would give us more of the product that we're interested in.
00:15:05.15 So, we actually needed to test that theory,
00:15:08.07 that is, to measure the activity of this MIOX protein and find out
00:15:12.00 whether or not it actually had higher activity, as we supposed that it might.
00:15:16.25 What's shown now in the upper left-hand corner is the data for the product, or the glucaric acid titer,
00:15:22.19 where the lighter bars here are, well, on the left hand side, I should say,
00:15:26.15 without scaffold, and then on the right hand side, with scaffold.
00:15:29.10 And you can see again, these are two different conditions in terms of how much of this IPTG we use
00:15:34.14 to induce the expression of the proteins.
00:15:36.28 And in the first case now, of these lighter bars, there's no real difference
00:15:41.01 between not having scaffold and having scaffold,
00:15:43.22 on the amount of product that's being produced,
00:15:46.03 and if we actually look at the activity of the protein,
00:15:48.17 there's also no significant difference between the protein activity here and the protein activity in this case as well.
00:15:54.25 However, in our best case, where we actually had an increase of 3-fold
00:15:58.08 in the amount of glucaric acid being produced, that's the darker bar in this case,
00:16:01.29 we can look at the specific activity of the protein and we see about a 30% improvement
00:16:07.15 in the activity of this protein relative to when the scaffolds aren't present.
00:16:11.15 And the p-value is here just to show you that that difference is actually significant.
00:16:15.15 So, now we've actually verified that we have not just higher production of the product that we're interested in,
00:16:21.08 but we're getting that higher production by the mechanism that we had supposed
00:16:25.17 would actually happen.
00:16:27.12 Now, one of the nice things about these scaffolds is that what it allows you to do
00:16:31.16 is to explore different stoichiometries.
00:16:33.22 What I mean by that is you don't just have to have one of one protein
00:16:37.19 and one of a second protein coming together, but you can actually, in that scaffold,
00:16:41.27 dial in the stoichiometry by specifying the number of binding domains that you have
00:16:47.02 for each particular protein. So, this is an example of a different scaffold,
00:16:50.20 where you can see two binding domains for one of the proteins,
00:16:53.29 four binding domains for another protein and a single binding domain for the last protein.
00:16:58.12 And if we put that together, what it actually means is that we have,
00:17:01.10 in this case, four copies of the first gene, the INO1 enzyme, that is,
00:17:06.05 two copies of the second enzyme, and only one copy of that third enzyme.
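The stoichiometry idea can be made concrete with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the function names are hypothetical, the second and third enzymes are taken to be myo-inositol oxygenase (MIOX) and uronate dehydrogenase in pathway order, and every binding domain is assumed to be occupied, which a real cell need not achieve.

```python
from functools import reduce
from math import gcd


def enzyme_ratio(binding_domains):
    """Reduce binding-domain counts to the smallest whole-number ratio."""
    divisor = reduce(gcd, binding_domains.values())
    return {enzyme: count // divisor for enzyme, count in binding_domains.items()}


def enzymes_recruited(binding_domains, scaffold_copies):
    """Enzymes held by a number of scaffold copies.

    Assumes (for illustration only) that every binding domain is occupied.
    """
    return {enzyme: count * scaffold_copies
            for enzyme, count in binding_domains.items()}


# The example scaffold from the talk: four domains for INO1, two for the
# second enzyme, and one for the third.
scaffold = {"INO1": 4, "MIOX": 2, "uronate dehydrogenase": 1}

if __name__ == "__main__":
    print(enzyme_ratio(scaffold))
    print(enzymes_recruited(scaffold, scaffold_copies=10))
```

Changing the counts in `scaffold` corresponds to swapping in a different scaffold design, and changing `scaffold_copies` corresponds to varying how much scaffold is induced.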
00:17:09.23 This actually allows us to look at a wide variety of different configurations
00:17:14.20 as well as look at varying the amount of the scaffold that we have
00:17:18.11 and the amount of the enzyme that we have,
00:17:20.03 to look at the effect of that on the productivity.
00:17:22.23 And the result of that exercise is shown here,
00:17:25.12 where each of those dots is the average of a triplicate experiment
00:17:29.07 where we have the same amount of enzyme being produced in all cases,
00:17:32.20 but we're looking at a wide variety of scaffold induction levels
00:17:36.07 and also looking at a very wide configuration of different scaffolds themselves,
00:17:41.10 meaning different numbers of binding domains for these enzymes that we're interested in.
00:17:44.18 What we see is that we actually are able to change
00:17:49.00 the activity of this enzyme over a factor of about 7-fold
00:17:52.21 and that actually results in a change in the amount of glucaric acid that we have
00:17:56.24 in a factor of about 5-fold. So, we really have shown that we can use,
00:18:01.04 in this case what's called a synthetic biology device,
00:18:04.02 that is, these protein-protein co-localization mechanisms,
00:18:07.08 to be able to solve a problem with an engineering approach,
00:18:11.05 even if we still don't understand exactly what is it that leads to these differences
00:18:15.21 that we see in the activity of the protein.
00:18:18.08 Now, I want to remind you again of this maze analogy that we had before
00:18:24.03 of a protein, or rather a compound, coming into a maze
00:18:28.02 and having a number of different places that it could go.
00:18:30.17 And I showed a very simple diagram before of the maze having four different entry points.
00:18:35.11 Well, the reality is that this is really what the maze looks like inside the cell,
00:18:39.28 where each of the individual dots in this figure represents a particular chemical,
00:18:44.10 and each of the lines between those dots represents an enzyme
00:18:48.02 that can convert that chemical into something else.
00:18:50.20 So, that means that the networks that we're really talking about are very, very large mazes,
00:18:55.16 not these very simplified ones that I showed you.
00:18:57.26 And if our goal is to have glucose, for example, as a starting molecule,
00:19:01.06 work its way through this maze, and end up with a final compound that we're interested in,
00:19:05.28 we can often have by-products that are being produced.
00:19:08.24 And ideally what we'd like is to, again, knock-out those unproductive routes,
00:19:13.19 which are going to lead to byproduct formation, but the question becomes,
00:19:17.05 what if your byproduct is actually growth?
00:19:19.11 And growth in this case also means the ability to make the enzymes that you need
00:19:24.19 in order to catalyze all these chemical reactions
00:19:27.12 that are going to give you conversion of your starting substrate, glucose,
00:19:30.21 down to your final product, glucaric acid.
00:19:32.28 In this case now, we don't have the option of simply knocking out or deleting growth,
00:19:38.16 because now we're not actually going to make the enzymes that we need
00:19:41.10 and this means that we have to have a different way of solving this problem,
00:19:44.23 or a different approach to dealing with the byproduct that we have.
00:19:48.08 So, what we can do in this case is again, take advantage of these principles of synthetic biology,
00:19:54.05 which are based on design, to think about a control system.
00:19:57.17 In particular what we want is dynamic control of these activities.
00:20:01.22 We like to have our initial condition be fast growth, or growth being favored,
00:20:06.02 such that we actually make not just the cells, but again the proteins that we need,
00:20:10.03 that are going to give us the enzymes that give us the chemical reactions
00:20:13.02 that we need to make the product that we're interested in.
00:20:15.07 And then we want to trigger a switch to a production phase
00:20:18.16 where we say, stop growing now, and instead of growing,
00:20:21.15 use all of that glucose to make the molecule that we want you to make.
00:20:25.00 I can represent that diagrammatically like this,
00:20:27.26 where if we have our competing activity, initially, when the input is low,
00:20:32.04 that activity will be high, and at some point, I'm going to now add an input
00:20:36.03 that causes the competing activity to be low.
00:20:38.14 You can see that now, specifically, in what we're interested in, which is growth versus production,
00:20:43.24 which is that we want growth to actually start high,
00:20:46.29 and then after a while, we want growth to go down, and instead we want the production here
00:20:52.21 to actually start to go up. This is actually something called a genetic inverter.
00:20:57.10 It's an inverter because when the input is low, the output is high.
00:21:01.17 When the input is high, the output is low.
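The input-output behavior of an inverter can be sketched numerically. The talk does not give a mathematical form, so the repressing Hill function used here, along with the parameter names `k`, `n`, and `max_output`, is an assumption chosen only because it is a common way to model repression.

```python
def inverter_output(inducer, k=1.0, n=2.0, max_output=1.0):
    """Repressing Hill function: the output falls as the input rises.

    inducer    -- input level (arbitrary units, must be >= 0)
    k          -- input level at which the output is half of max_output
    n          -- steepness of the switch
    max_output -- output when no input is present
    """
    return max_output / (1.0 + (inducer / k) ** n)


if __name__ == "__main__":
    for level in (0.0, 0.5, 1.0, 2.0, 10.0):
        print(f"input={level:5.1f}  output={inverter_output(level):.3f}")
```

In the growth-versus-production picture, the output would stand for the competing activity (growth), which stays high until the input is added and then drops.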
00:21:04.05 And there is actually a precedent for this in nature, namely in secondary metabolite production.
00:21:09.03 Now, for secondary metabolites, these are natural products
00:21:12.25 where growth first is favored, and then the cell will naturally make this switch
00:21:17.12 such that you then will have the metabolites being produced later.
00:21:21.00 So, how do we actually make this process happen
00:21:25.20 when we're talking about having a switch for growth
00:21:28.22 where ideally what we're doing is having the cells use glucose for growth initially,
00:21:32.28 and then change that in order to use glucose for product formation
00:21:36.06 at some point after which we apply our trigger.
00:21:39.02 If we look at how glucose is normally used in our cells,
00:21:42.05 it comes in in what's called the PTS system,
00:21:44.11 and that PTS system brings in glucose as glucose-6-phosphate.
00:21:48.21 And it has two different routes that it can go into;
00:21:50.27 glycolysis or the pentose-phosphate pathway
00:21:54.02 and that's actually how that glucose is used by the cells for growth.
00:21:58.06 That's how the glucose is eaten, if we want to think about it that way.
00:22:01.16 And that's the process that we want to compete against.
00:22:03.24 Well, glucose-6-phosphate is the original substrate of our glucaric acid pathway,
00:22:08.19 but we didn't really want to deal with quite this complexity to start with,
00:22:12.12 so we decided to start on a simpler scale
00:22:14.22 and see if we could just address the glucose utilization issue
00:22:17.19 and then what we're doing now is to try to work up to the increasing complexity
00:22:21.27 that's required to deal with glucose-6-phosphate specifically.
00:22:25.11 That can be addressed by the fact that there is actually another way that glucose can come into the cell.
00:22:30.00 It can come in through what's called the galP, or galactose permease,
00:22:33.29 and in this case, it comes in as free glucose.
00:22:36.11 That glucose now has to be converted to glucose-6-phosphate
00:22:39.21 with an enzyme called glucokinase that uses ATP.
00:22:43.15 And because now the glucose has to go through that route,
00:22:46.17 it gives us just a single control point for being able to regulate,
00:22:51.01 that is control, how much of the glucose goes into our endogenous metabolism, or growth,
00:22:56.09 versus what goes into the product that we're interested in.
00:22:58.18 So, we can actually have this system now where we knock-out the PTS system,
00:23:02.25 we apply what we describe as a valve to regulate Glk activity,
00:23:07.06 and in doing that, we're able to modulate how much of the glucose is available
00:23:12.00 for endogenous metabolism, that is for growth,
00:23:14.21 versus how much is available for productivity.
00:23:17.06 And I just want to remind you that when we're talking about modulating the protein,
00:23:21.10 that is, how much of the glucokinase that's available, what we're really talking about
00:23:26.03 is controlling how much of the DNA, or how that DNA is being expressed.
00:23:30.18 So, we're actually doing all of our manipulations at the level of DNA synthesis,
00:23:34.21 which comes back to how we think about synthetic biology.
00:23:37.14 So, one way that we can actually test this,
00:23:41.27 rather than immediately going to a process where we have to worry about dynamic control,
00:23:46.18 is to look at what we would call static control of the system.
00:23:49.28 And that is that we can replace the natural glucokinase operon,
00:23:54.01 or production system,
00:23:55.20 which naturally consists of two different promoters that are negatively regulated
00:24:00.01 by this protein called FruR, we can get rid of all of that regulation,
00:24:04.16 that is we can replace that DNA, and instead have a library of different promoters
00:24:09.26 where the binding site for FruR is gone,
00:24:12.19 so the only thing that's regulating how much of this protein is produced
00:24:15.25 is the kind of promoter that we use.
00:24:17.26 And by varying the strength of these promoters, by using different variations here,
00:24:22.21 then we can end up with a library of different expression states
00:24:25.16 and ask the question, does that actually affect how much of a heterologous product
00:24:30.08 we could actually produce.
00:24:31.27 Here's now a little bit of characterization of this library.
00:24:35.16 The first thing that we're looking at in this slide is whether or not we actually do have increases in the mRNA,
00:24:40.28 that is, whether or not changing the promoter strength
00:24:43.10 changed the transcription, and then if that corresponded to increases in the protein being produced.
00:24:48.09 And what's shown in this case now, along the x-axis, is the relative promoter strength,
00:24:52.29 from very low strength, or weak promoters, up to very high strength promoters,
00:24:57.15 and then what's shown on the y-axis, on the left-hand side,
00:25:00.19 is the activity of the protein that we're interested in, glucokinase,
00:25:04.00 and what's shown on the right hand side is the mRNA levels.
00:25:07.05 And you can see now, that activity, which is in the solid circles,
00:25:11.19 does actually go up as we go from low promoter strengths
00:25:15.14 up to high promoter strengths, but it only goes up to a certain point,
00:25:19.01 after which we see it start to decline.
00:25:21.01 The same thing is true for the mRNA, that it actually will go up as we go along this axis here,
00:25:25.20 and it only will go up to a certain point and then it starts to decline as well.
00:25:30.06 These measurements were all done where we use glycerol
00:25:33.20 as a carbon source instead of glucose and that's actually to allow us
00:25:37.00 to decouple growth from measuring the properties of this enzyme
00:25:40.29 just to see if the library is working.
00:25:42.26 And what we actually found when we went to glucose
00:25:45.10 is that when the expression levels were too high here,
00:25:48.07 then these cells no longer grew. So, this cell has high mRNA,
00:25:52.03 but you can see the protein levels are pretty low.
00:25:54.10 And these cells would not grow on glucose.
00:25:56.29 The ones where the protein levels were still pretty high
00:26:00.01 would grow on glucose, except that we did have this gray region here,
00:26:03.29 this stippled region, where we saw the cells could grow, but only very, very poorly.
00:26:09.17 We could then take the cells that we knew were growing well, in this region here,
00:26:13.25 and then ask, can we actually now, in glucose,
00:26:17.03 relate the growth rate to the activity of this protein, which tells us whether or not it really can control
00:26:23.16 how much of the substrate is available for endogenous growth.
00:26:26.25 The result of that experiment is shown on this slide,
00:26:30.24 where again what we have now is expressed in terms of Glk activity
00:26:34.15 where it goes from a very low activity up to our higher activity
00:26:37.29 and then what's shown on the y-axis is the growth rate of the cells.
00:26:41.15 The native promoter is shown right here in this open triangle
00:26:44.25 and the filled squares will tell you that we're able to actually increase the growth rate
00:26:49.13 of this cell. We can also decrease the growth rate of the cell by changing the glucokinase activity.
00:26:55.08 So, that confirms for us that we actually do have a control point
00:26:58.12 or a specific protein where if we vary the activity of that protein,
00:27:02.22 that actually will tell us, or allow us to control, rather,
00:27:05.26 how the cells are growing. The next question then is,
00:27:09.02 if you can control the growth of the cells, does that actually result in more product being produced.
00:27:14.13 So, in this case we have an example molecule, or a test molecule, gluconate,
00:27:18.02 this can be produced in one single enzymatic step from glucose,
00:27:21.25 and again, the competing reaction here is glucose-6-phosphate,
00:27:24.29 which is actually going to be produced from glucokinase.
00:27:27.18 What's shown now here is 5-KG, this is 5-ketogluconate,
00:27:31.05 which is just a spontaneous product that we actually get in very, very small amounts,
00:27:35.02 but we want to account for that by making sure that we look at the sum of both of these products,
00:27:39.18 to give us a sense of how much of the flux is coming through this side
00:27:43.17 versus this side of our pathway.
00:27:45.23 And now what's shown in this slide is actually the result of that experiment,
00:27:49.23 where what's shown now is the Glk activity, that is from, lower to higher amounts of that protein,
00:27:55.17 which is controlling how much glucose goes into endogenous metabolism
00:27:59.18 and what's shown on the y-axis is the molar yield,
00:28:02.29 and this is really how much of the glucose that we start with
00:28:06.04 goes into the compound that we're interested in, versus goes into other byproducts,
00:28:10.11 or into cellular growth.
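[Editorial sketch] The molar yield described here can be written as a one-line calculation. The version below is a minimal illustration, not the lab's analysis code: it assumes yield is computed as moles of gluconate plus 5-ketogluconate per mole of glucose, as the talk suggests when it says the two products are summed. The function name, parameter names, and the numbers in the usage line are all hypothetical.

```python
def molar_yield(gluconate_mmol, five_kg_mmol, glucose_mmol):
    """Molar yield of product on glucose (dimensionless).

    Assumption: gluconate and 5-ketogluconate (5-KG) are summed, since
    5-KG forms spontaneously from the product of interest.
    """
    if glucose_mmol <= 0:
        raise ValueError("glucose_mmol must be positive")
    return (gluconate_mmol + five_kg_mmol) / glucose_mmol


if __name__ == "__main__":
    # Hypothetical amounts, chosen only to show the arithmetic.
    print(molar_yield(gluconate_mmol=7.0, five_kg_mmol=0.5, glucose_mmol=10.0))
```

A yield near 1 would mean almost all of the glucose ends up in product; a lower value means more of it went to growth or to byproducts.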
00:28:12.04 And we see this very nice relationship where, when the activity is very low,
00:28:15.23 then we can see that we have a moderate amount of the yield, in this case,
00:28:21.12 that is the product that we're interested in. As we increase the activity,
00:28:25.00 then we get a slight bump, but as the activity goes higher and higher,
00:28:28.21 what we actually find is that we are decreasing the yield,
00:28:31.29 which basically tells us that as we get, now,
00:28:34.17 to the point where we're making more and more of this glucokinase,
00:28:38.06 we have more of the glucose going into growth
00:28:40.17 and less of it going into the product that we're interested in.
00:28:43.00 So, that actually gave us the validation that we needed
00:28:46.18 that the system design that we had envisioned,
00:28:49.12 one in which we could control the activity of this enzyme,
00:28:52.20 was going to be useful. What I haven't told you so far is that these cells here,
00:28:56.27 although they had the highest yield, did not have the highest concentration.
00:29:00.27 The concentration wasn't very different from the cultures that surrounded it
00:29:04.27 as far as yield was concerned, and they also didn't grow very well.
00:29:08.05 So, that just meant that the cells overall were not happy,
00:29:11.07 and that our original design of having them have a state where they grow very well first,
00:29:15.29 would probably work better in terms of giving us the maximum yield possible.
00:29:20.03 So, the system that we wanted to design here, again, is an inverter,
00:29:24.17 and the way that this will work is that we have this protein,
00:29:27.25 now as an example, GFP,
00:29:29.15 which is being produced by a promoter which is regulated by the lacI protein,
00:29:34.02 or the lacI operator. When lacI is not present, GFP is turned on.
00:29:39.11 We then have lacI, however, under the control of something called the tet promoter,
00:29:43.27 and the tet promoter is responsive to a small molecule
00:29:46.29 such that when you add aTc, this small molecule, it would turn on
00:29:51.10 the expression of lacI and that would turn off our GFP.
00:29:53.29 Now, that you can see by looking at the graph; the first point that we actually have
00:29:58.04 is the fact that in the absence of any aTc, then we have a very high fluorescence,
00:30:03.03 which means that the whole system is on.
00:30:04.27 If we then move to a point where we add aTc,
00:30:08.13 what you'll find in that case is that you can see the GFP levels start to go down
00:30:12.16 as a function of how much aTc we add,
00:30:15.10 and at the point where we've added 100 ng/ml of aTc, we have very little GFP being produced.
00:30:20.27 We can show that the mechanism of this is working the way we intend it to work
00:30:25.04 by adding an additional small molecule called IPTG, and what IPTG actually does
00:30:29.23 is to interfere with this lacI binding
00:30:32.13 such that you can recover some of the GFP expression
00:30:34.28 and that's actually shown in the last two points of this graph here,
00:30:38.07 that show that GFP can go back up.
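[Editorial sketch] The inverter logic just described (aTc turns on lacI, lacI turns off GFP, IPTG relieves lacI) can be illustrated with a toy steady-state model. This is only a qualitative sketch under stated assumptions: the Hill-function form and every constant below are hypothetical and are not taken from the talk or fitted to the data on the slide.

```python
# Toy steady-state model of the aTc -> lacI -| GFP inverter.
# All constants are hypothetical and only set the qualitative shape.

K_ATC = 10.0         # aTc (ng/ml) giving half-maximal lacI expression (assumed)
K_LACI = 0.2         # relative lacI level giving half-maximal repression (assumed)
HILL_N = 2.0         # Hill coefficient used for both steps (assumed)
IPTG_RESIDUAL = 0.1  # fraction of lacI still able to repress with IPTG (assumed)


def hill_activation(x, k, n=HILL_N):
    """Fraction of maximal output for an activating input x."""
    return x**n / (k**n + x**n)


def hill_repression(x, k, n=HILL_N):
    """Fraction of maximal output remaining under a repressing input x."""
    return k**n / (k**n + x**n)


def inverter_gfp(atc_ng_per_ml, iptg_present=False):
    """Relative GFP output (0 to 1) for a given aTc dose."""
    # aTc induces the tet promoter, which drives lacI expression.
    laci = hill_activation(atc_ng_per_ml, K_ATC)
    # IPTG interferes with lacI binding, lowering the active fraction.
    if iptg_present:
        laci *= IPTG_RESIDUAL
    # Active lacI represses the promoter that drives GFP.
    return hill_repression(laci, K_LACI)


if __name__ == "__main__":
    for atc in (0, 1, 10, 100):
        print(atc, round(inverter_gfp(atc), 3))
    print("100 + IPTG", round(inverter_gfp(100, iptg_present=True), 3))
```

In this sketch GFP is fully on with no aTc, falls as the aTc dose rises, and partly recovers when IPTG is present, which is the same qualitative pattern described for the graph.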
00:30:40.23 So, now we know that our system, our basic inverter is working,
00:30:44.22 and what we have to do in this case now is to integrate that
00:30:47.12 into our cell, that is to change now Glk activity so that it responds in this same way.
00:30:54.06 And what's shown now in this slide is the result of having done exactly that.
00:30:58.25 So, here's now the construct of our inverter,
00:31:01.03 where again, this is really just how the DNA is being constructed,
00:31:04.26 and we're using that to control how Glk is being produced
00:31:07.20 and we can look at the same two properties that we looked at before,
00:31:10.21 which is, is the mRNA changing, that is, is the DNA to mRNA, that transcription process,
00:31:17.05 is that being regulated the way we want it to,
00:31:19.03 and does that correspondingly result in differences in the Glk activity?
00:31:22.25 And the mRNA levels are actually shown at the bottom,
00:31:25.04 where you can see that as we increase the amount of aTc,
00:31:28.00 we actually do see that we start initially with high levels of mRNA,
00:31:31.05 and then those levels of mRNA eventually come down.
00:31:34.05 The top graph here actually shows the response of Glk,
00:31:37.02 where it also starts very high, and then it also will come down to a very, very low level.
00:31:42.09 This is again a characterization in glycerol, where we don't have glucose present,
00:31:46.29 so we're only able to see the response of the cells to Glk
00:31:51.18 when it doesn't really need Glk and that actually tells us, is the system really working.
00:31:55.21 Now, we also want to know that it's actually dynamic.
00:32:00.01 So, the way we tested our static system before was just to change the promoters
00:32:05.13 that were encoding for Glk and then to ask, does that actually give us differences?
00:32:09.13 We now want to know if we have a switch.
00:32:11.13 If we start off with it on and then add this inducer so that we turn it off,
00:32:16.07 that is, we invert the response, do we actually get what we're interested in.
00:32:19.28 And the top graph that's shown here is the response of what happens
00:32:23.02 to the cell growth as we actually add our inducer,
00:32:26.10 where the top part of this is now uninduced,
00:32:28.24 that means that we're not adding anything chemically,
00:32:31.12 and we see that the cells are continuing to grow.
00:32:33.17 If we compare that now to the second line here,
00:32:35.26 where initially they both started off at the same point,
00:32:38.18 we add our inducer, we can see that the cells where we now have turned the gene off
00:32:43.10 by activating our inverter, are growing to a lower point.
00:32:47.00 We can also see a control plot in this very bottom here, which is what happens
00:32:51.14 if we add inducer from the very beginning.
00:32:53.20 That actually means that it turns off gene expression so low
00:32:56.17 that those cells never grow. You can see that the OD stays flat
00:32:59.22 and pretty much close to zero the whole time.
00:33:02.04 So, we know that again, the response we're looking for, growth,
00:33:05.14 is changing the way we want it to,
00:33:07.05 and just very briefly, what's shown in these bottom slides
00:33:09.21 is that the growth rate again is changing,
00:33:12.00 this is now relative OD between those two.
00:33:14.17 The activity is also changing, it's decreasing,
00:33:17.10 and the mRNA levels are going down as well.
00:33:19.22 Ok, so now we know the system is working exactly the way we want it to work,
00:33:23.19 it was designed in a certain way, we seem to have the output that we're interested in
00:33:27.13 from the design perspective of growth.
00:33:29.06 The question now is, does it actually give us the productivity enhancements that we were looking for.
00:33:34.09 And we're now again looking at the same system as before,
00:33:37.23 where our goal is to make this compound gluconate,
00:33:40.16 and the only difference now on this slide is that I've added now this product acetate
00:33:45.01 which is a byproduct of metabolism,
00:33:47.04 and is a representation of how much glucose flux is actually going down into endogenous metabolism.
00:33:52.28 And if you look at now the charts here on the right hand side,
00:33:56.06 the top one gives us the titers, or the concentrations,
00:33:59.07 and it shows that more glucose is being consumed, that's what's in this white bar here,
00:34:03.05 is how much glucose is consumed.
00:34:04.21 More of that is consumed when the inverter is on;
00:34:07.20 the gray bar is how much product is being produced.
00:34:10.04 We make substantially more product being produced here as well,
00:34:13.05 and then these smaller bars here, the lightest kind of dark gray and the very, very black bar,
00:34:18.29 give us an indication of some of the minor byproducts.
00:34:21.19 And that's actually represented more easily in the bottom graph here,
00:34:25.04 where again, I'm showing the yield, that is how much of what goes in as glucose
00:34:29.15 is being converted to the glucaric acid product that we're interested in,
00:34:32.18 sorry, in this case the gluconate, or the gluconic acid product that we're interested in.
00:34:36.14 And the open white bars here give us the yield measurements
00:34:40.03 and in this case we've actually increased our yield from about 0.7,
00:34:43.23 and this was actually higher than what we had seen with the other system,
00:34:47.17 which tells us the cells are happier now,
00:34:49.06 and our yield in this case goes up to about 0.8, or a little bit higher than 0.8,
00:34:54.08 so we have about a twenty percent increase in the yield.
00:34:56.17 The gray bars that are shown here are this acetate by-product,
00:34:59.28 and you can see an even larger reduction in the waste going to acetate.
00:35:04.29 So, we have again a twenty percent increase in the yield here,
00:35:08.13 but we also have almost a fifty percent decrease in waste,
00:35:12.12 that is this acetate waste.
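[Editorial sketch] The percentages quoted here are relative changes. The short example below shows the arithmetic, assuming a starting yield of 0.7 as stated in the talk. The final yield of 0.84 is an illustrative value chosen to be consistent with "a little bit higher than 0.8"; it is not a number given in the talk, and the acetate values are likewise hypothetical.

```python
def relative_change(before, after):
    """Fractional change from `before` to `after` (0.2 means +20%)."""
    if before == 0:
        raise ValueError("before must be non-zero")
    return (after - before) / before


if __name__ == "__main__":
    # Yield: 0.7 is from the talk; 0.84 is an assumed illustrative endpoint.
    print(f"yield change:   {relative_change(0.7, 0.84):+.0%}")
    # Acetate: hypothetical relative amounts showing a halving of waste.
    print(f"acetate change: {relative_change(2.0, 1.0):+.0%}")
```

Note that an endpoint of exactly 0.8 would correspond to an increase of about 14%, so the "about twenty percent" figure implies a final yield somewhat above 0.8.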
00:35:14.03 Now, the last thing that we wanted to look at was the timing of the induction
00:35:19.05 because we do know that based on exactly when we add this inducer to turn off Glk expression,
00:35:24.26 we could have the cell growth go way, way down,
00:35:27.15 I showed you that as a control plot, or if we wait too late,
00:35:30.22 then the cell is not actually able to respond
00:35:33.05 because it's going to stop being very active.
00:35:35.19 So, what we're looking at here now is the OD, or that is the growth,
00:35:39.06 at which we induce, starting from very early induction times,
00:35:42.08 up to later induction times, and then what's shown on the y-axis is the yield
00:35:46.19 relative to an uninduced culture. And we have two different yields that we're looking at,
00:35:50.27 one is the yield of product, and that's shown in the top here,
00:35:53.15 with the squares, and the second is the acetate yield,
00:35:56.22 or, again, a measure of waste that we have here.
00:35:59.29 What we find in this case is that the yield improvements are actually best
00:36:03.13 when we induce earlier. That means give the cells a little bit of time to grow,
00:36:07.03 but don't let them grow too far, and we can see in our best case
00:36:10.20 about a 70% reduction in waste and a 20% increase in product being produced.
00:36:15.29 Let me summarize the story that I've given you about glucaric acid.
00:36:21.06 I started by talking about how we could come up with a new pathway
00:36:24.18 to be able to make this compound that was still a natural product,
00:36:28.16 but whose natural pathway was too cumbersome to be produced in E. coli.
00:36:33.09 What we used in this case is part selection, or bioprospecting,
00:36:36.25 to find the enzymes that we could move from one source into another source,
00:36:41.04 and we're able to do this because once we know the DNA that encodes for those enzymes,
00:36:45.07 we can synthesize that DNA, and easily move it around between organisms.
00:36:49.15 And the second thing I showed you was this example of a synthetic biology device,
00:36:54.02 that was the protein-protein colocalization study,
00:36:56.26 which gave us increases in productivity.
00:36:59.04 And those protein-protein colocalization devices, or the scaffolds,
00:37:02.28 have been shown to be useful in other projects as well,
00:37:05.18 so that we know that they are reusable
00:37:07.14 and modular in a way that makes them very useful
00:37:10.10 for thinking about how do we actually engineer the metabolism of cells
00:37:13.26 to make the products that we're interested in.
00:37:15.16 And the last part that I showed you was an example
00:37:18.00 of how we might engineer the host, or chassis in the language of synthetic biology,
00:37:22.10 to give us further improvements both in the titers,
00:37:25.16 that is the concentrations that we're interested in,
00:37:27.18 and in the flux, or the yield of the product that we want,
00:37:31.00 such that we get more of the substrate that we start with
00:37:33.23 going into more of the product that we're interested in.
00:37:35.29 I'd actually like to end this whole iBio seminar by acknowledging the folks that did the work.
00:37:41.26 I won't go through all the names,
00:37:43.12 but you can see them highlighted here in red,
00:37:45.13 as students who are both currently in the group working on these projects,
00:37:48.26 as well as former students and postdocs in the groups.
00:37:51.18 I've recognized John Dueber as a collaborator, he is still at the University of California at Berkeley,
00:37:56.13 and this work was primarily funded by the National Science Foundation
00:37:59.19 through SynBERC and through the Office of Naval Research through the Young Investigator Program
00:38:03.13 with the last part of it being funded primarily by the National Science Foundation
00:38:07.06 through the CAREER program. I hope you've enjoyed the iBio seminar
00:38:10.18 and thank you very much.
- What was the impetus for synthesizing glucaric acid?
- What is the bioprospecting process? What techniques are needed to do this?
- What is a synthetic biology device? Can you imagine different types of devices?
- In this example, what is the inducer? What is the genetic inverter?
- In summary, what are the three facets of synthetic biology that are optimized in the synthesis of glucaric acid?
Paper for this Session’s Discussion
Sheppard MJ, Kunjapur AM, Wenck SJ, Prather KL. Retro-biosynthetic screening of a modular pathway design achieves selective route for microbial synthesis of 4-methyl-pentanol. Nat Commun. 2014. PMID: 25248664
Discussion Questions for the Paper
- Describe one synthetic biology application that would be useful for your personal life and another application that would be useful for your research.
- Summarize this paper using the 5-sentence grant-proposal structure outlined below:
- What is the problem?
- What are the knowledge gaps that limit current solutions?
- What is the specific insight/technology used that will overcome this?
- How do they solve the problem using this technology?
- What is the next problem?
- Many aspects of this artificial process need to be checked and optimized. Pick one of these processes (listed below) and explain the experiments involved.
- Substrate specificity (Fig. 2,3)
- Avoidance of futile cycles (Fig 3)
- Reducing products from shunts (Fig. 3)
- Efficiency (Fig. 4)
- Operon order (pg. 5)
Kristala Jones Prather received her S.B. degree from the Massachusetts Institute of Technology and her PhD at the University of California, Berkeley both in chemical engineering. Upon graduation, Prather joined the Merck Research Labs for 4 years before returning to academia. Prather is now an Associate Professor of Chemical Engineering at MIT and an investigator…