In this session, we investigate some of the methods used by scientists to measure evolution. A whiteboard animation introduces Hardy-Weinberg equilibrium (HWE) and explains how it can be used to calculate frequencies of specific alleles in a population. This whiteboard also suggests reasons why allele frequencies in a population may evolve or change over time. Three short video clips (from Drs. Hale, Tishkoff, and Newman) explain methods for generating phylogenetic trees and what they can tell us about the relatedness of species and how long ago they diverged. In our last video, David Haussler shares his excitement about the chance to compare multiple sequenced genomes and identify the genetic innovations that made us who we are today.
00:00:07.24 Hi. My name is Melina Hale.
00:00:09.08 I'm a professor at the University of Chicago.
00:00:11.14 In my lab, we study neurobiology,
00:00:14.22 and evolution.
00:00:16.09 I'm going to present two different topics.
00:00:18.12 The first is an introduction to evolution.
00:00:22.12 Then we'll go on to talk about
00:00:24.10 a specific example from my lab
00:00:26.26 of how we map the nervous system
00:00:29.20 and aspects of the nervous system
00:00:31.12 onto the evolution of animals.
00:00:33.18 We work in my lab, specifically,
00:00:35.06 on vertebrate animals,
00:00:36.18 things like fish and tetrapods,
00:00:38.17 mammals, and reptiles,
00:00:40.27 and so I'm going to focus on that
00:00:43.07 part of biodiversity
00:00:45.07 in my talks.
00:00:46.22 There's a lot of other organisms out there, of course,
00:00:49.08 invertebrates, and insects, and plants,
00:00:51.18 and microbes,
00:00:52.26 that we won't touch on in these lectures.
00:00:55.26 So, we'll start with this introduction into evolution.
00:00:58.10 What is evolution?
00:01:00.05 Now, Charles Darwin
00:01:02.13 originally proposed the theory of evolution,
00:01:05.00 which can be summarized
00:01:06.27 in a very succinct phrase:
00:01:08.14 descent with modification.
00:01:10.05 Now, let's break that down a little bit, though,
00:01:12.17 to a broader definition,
00:01:14.22 which is change in the heritable characteristics
00:01:17.10 of organisms
00:01:18.29 from generation to generation.
00:01:20.29 We can break that down
00:01:22.13 even further to look at
00:01:24.11 the component parts of that sentence.
00:01:25.18 First, if we think about
00:01:27.03 this idea of generation to generation,
00:01:29.12 that means that when we look at evolution,
00:01:31.05 we're really not talking about changes
00:01:33.00 in individuals
00:01:34.17 or over short time frames.
00:01:36.01 Instead, we're talking about
00:01:38.04 changes that we see
00:01:39.28 over a long history
00:01:41.23 of the descent of an organism over time.
00:01:45.11 What about heritable characteristics?
00:01:47.04 Well, we all have lots of characteristics
00:01:49.01 to our bodies.
00:01:51.02 We may have big muscles
00:01:52.24 if we exercise a lot,
00:01:54.09 we may have had injuries in our lifetime
00:01:56.11 that have given us scars.
00:01:57.21 Those are not heritable characteristics.
00:02:00.10 Heritable characteristics
00:02:02.02 are the types of traits
00:02:03.19 that we pass on to subsequent generations,
00:02:06.11 or that we inherited from our parents
00:02:08.29 and grandparents.
00:02:10.18 Heritable characteristics
00:02:12.15 are an important part of evolution,
00:02:14.11 because it allows transmission
00:02:17.12 from one generation to the next,
00:02:18.24 and on and on through evolutionary history.
00:02:21.23 Now, the last part of this is change,
00:02:23.29 and change is also really important.
00:02:26.08 There has to be the ability in evolution
00:02:29.05 for these heritable characteristics
00:02:30.29 to vary,
00:02:32.21 to change in response to environmental factors
00:02:35.13 that might favor one type of characteristic
00:02:38.09 or another,
00:02:39.16 and we'll come back to that.
00:02:40.25 And that's what Darwin was getting at
00:02:42.21 with this idea of modification,
00:02:44.13 that there's going to be change
00:02:46.01 in how organisms are organized
00:02:47.24 and how they look over time.
00:02:52.03 So, here's an example,
00:02:53.18 a cute picture of a pair of dogs
00:02:56.07 and their puppies,
00:02:57.19 where you can really see
00:02:59.07 the variation in characteristics,
00:03:00.25 even in one generation.
00:03:02.23 If you look at the parents
00:03:04.11 and you look at the pups,
00:03:05.21 you can see some of the puppies
00:03:07.03 look like one parent,
00:03:08.21 with, you know, pure light fur,
00:03:11.03 others look like the other parent,
00:03:12.27 with very dark fur around the face,
00:03:15.03 but yet there are other puppies in the litter
00:03:17.13 that look different yet again,
00:03:19.04 that have a mix of the characteristics
00:03:21.29 of those two adults.
00:03:24.05 So, you can get a sense
00:03:25.29 of the variation in this image
00:03:27.26 that can be explored in evolution
00:03:30.29 and capitalized upon
00:03:33.06 through evolutionary time.
00:03:35.29 One example of variation
00:03:38.05 that's been really important for us
00:03:40.11 to understand how we can
00:03:43.17 change the characteristics,
00:03:45.07 the features of a species,
00:03:47.14 over time,
00:03:48.28 is the peppered moth.
00:03:50.12 So, these two moths,
00:03:51.23 that look very, very different
00:03:53.06 -- the light one on the left
00:03:54.24 and the dark one on the right --
00:03:56.01 are the same species.
00:03:57.12 They can interbreed.
00:03:58.23 Now, the dark one and the light one,
00:04:00.04 as you might expect,
00:04:02.07 do better in different types of environments.
00:04:06.03 This color characteristic
00:04:08.03 varies, of course,
00:04:10.00 and in some environments
00:04:11.29 it benefits the organisms
00:04:13.21 to be light or to be dark.
00:04:15.02 In other environments,
00:04:16.21 that same characteristic
00:04:18.16 may be detrimental to the animal.
00:04:20.14 So, these peppered moths
00:04:22.10 provided a classic example
00:04:24.01 of how characteristics can vary
00:04:27.07 with environment,
00:04:28.12 and how populations of a particular species
00:04:30.28 can vary.
00:04:32.22 So, this was noted
00:04:34.07 particularly in the industrial revolution.
00:04:36.19 At that time,
00:04:38.00 we went from manufacturing
00:04:39.22 using people
00:04:41.21 sewing or creating objects
00:04:43.08 to using a lot of machines
00:04:44.26 to make products.
00:04:47.01 With the use of machines
00:04:48.21 came the use of coal,
00:04:50.25 and with coal came soot,
00:04:52.19 or pollution in the air.
00:04:54.06 Now, with that soot and pollution,
00:04:55.24 you could imagine that structures in the environment,
00:04:59.03 like trees,
00:05:00.19 would become darker,
00:05:01.28 and the peppered moth populations
00:05:04.05 changed in order to accommodate that.
00:05:06.27 And the darker morph
00:05:09.05 of the peppered moth
00:05:11.14 survived better. Right?
00:05:12.26 It was better camouflaged
00:05:14.23 against potential predators in the environment.
00:05:17.08 When the environment cleared up
00:05:19.21 and pollution decreased,
00:05:21.08 the tree barks became lighter
00:05:23.26 and the lighter version of the moth
00:05:25.25 actually survived better.
00:05:27.17 So, we can see variation
00:05:29.00 in the characteristics in a population,
00:05:31.19 even over this short amount of time,
00:05:34.20 and due to a human-induced
00:05:37.07 artifact in the environment,
00:05:38.09 this pollution from coal.
00:05:41.01 Now, just to show you how striking
00:05:43.05 this difference can be in the camouflage
00:05:45.05 of these moths on trees,
00:05:46.29 we can see some here.
00:05:48.19 So, here's our dark morph and our light morph,
00:05:50.17 and if we look at this tree,
00:05:51.28 we can see both the dark morph
00:05:53.15 and the light morph.
00:05:54.19 Here's the light one right down here,
00:05:56.20 and you can see it better camouflages
00:05:58.00 against the light bark
00:05:59.17 in this healthy tree.
00:06:01.05 The dark morph stands out against that light tree,
00:06:03.27 except in this area over here,
00:06:05.22 where it's against this injury to the tree,
00:06:08.09 which shows up darker.
00:06:10.03 Another example in variation in populations
00:06:14.08 that we've probably all had experience with
00:06:16.14 is in bacteria
00:06:18.19 and the treatment of bacteria with antibiotics.
00:06:21.06 So, when we go to our doctor's office
00:06:22.26 with a bacterial infection,
00:06:24.08 we're prescribed antibiotics,
00:06:26.06 medicine to kill those bacteria,
00:06:28.19 and doctors are often very specific
00:06:31.09 about the need to take that medicine
00:06:34.11 over a precise time course,
00:06:36.11 and in particular they say,
00:06:37.27 "Don't stop the medicine early.
00:06:40.12 You have to take the full course of medicine.
00:06:42.19 Even if you're feeling better,
00:06:44.21 take the full course of medicine."
00:06:46.07 It's important to do that.
00:06:47.25 Why is that?
00:06:49.02 It's because of the selection
00:06:51.00 that's acting on the variation in the population.
00:06:54.13 So, when we have a bacterial infection,
00:06:57.00 the species of bacteria
00:06:58.27 that's in our bodies
00:07:00.07 may have lots of variants to it,
00:07:02.04 and this is shown in number 1 on the left.
00:07:04.06 They might vary in aspects of their biology,
00:07:07.16 including how strong they are,
00:07:09.01 how resistant they are
00:07:11.01 to antibiotic medicines.
00:07:12.27 If we treat them,
00:07:15.06 shown in point 2 over here,
00:07:17.05 but we don't treat them long enough,
00:07:19.14 which are the bacteria
00:07:21.03 that are going to survive?
00:07:22.12 It's going to be the ones that are the strongest,
00:07:23.28 that are the most resistant
00:07:25.25 to the medication.
00:07:27.06 So, if we don't kill them
00:07:29.03 and we stop taking the medicine,
00:07:30.26 they'll be able to multiply
00:07:32.24 and will take on a larger part
00:07:34.28 of the population
00:07:36.20 of the bacteria.
00:07:37.25 It's not unless we kill them all
00:07:39.17 that we can prevent those resistant bacteria
00:07:42.06 from then multiplying
00:07:43.28 and becoming a problem
00:07:45.15 for our antibiotic medications
00:07:47.16 down the road.
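The selection argument above can be sketched as a toy model. This is a minimal illustration, not a clinical model: the resistance levels, starting counts, daily kill fraction, and course lengths are all assumptions chosen for the example, and the function names are hypothetical.

```python
def treat(population, days, base_kill=0.9):
    """Return surviving cell counts per resistance level after `days` of treatment.

    population maps a resistance level (0.0 = fully susceptible, 1.0 = unaffected)
    to a cell count. Each day a variant loses the fraction base_kill * (1 - resistance).
    Regrowth is ignored to keep the sketch short.
    """
    survivors = dict(population)
    for _ in range(days):
        survivors = {r: n * (1 - base_kill * (1 - r)) for r, n in survivors.items()}
    return survivors


def resistant_share(population, threshold=0.5):
    """Fraction of the population at or above the given resistance level."""
    total = sum(population.values())
    resistant = sum(n for r, n in population.items() if r >= threshold)
    return resistant / total if total else 0.0


# Hypothetical infection: mostly susceptible cells, a few resistant ones.
start = {0.0: 9000.0, 0.3: 900.0, 0.5: 100.0}
short_course = treat(start, days=3)    # medicine stopped early
full_course = treat(start, days=14)    # full course taken

print(resistant_share(start), resistant_share(short_course))
print(sum(full_course.values()))
```

Under these assumed numbers, stopping early leaves a surviving population in which the most resistant variant makes up a much larger share than it did at the start, while the full course drives the expected total below one cell.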
00:07:49.07 So, I've shown you several examples
00:07:51.08 of how populations of a species can vary,
00:07:55.07 whether it's peppered moths or bacteria,
00:07:58.05 but how do we go from that
00:07:59.23 population-level variation
00:08:01.26 to the evolution of new species?
00:08:05.11 This is called speciation,
00:08:07.13 and in general
00:08:09.15 what happens is that populations
00:08:11.08 of a species
00:08:12.28 will be separated
00:08:14.13 and unable to interbreed,
00:08:16.03 and if they're separated
00:08:18.01 for a long enough period of time,
00:08:19.17 when they come back together
00:08:21.02 they may not be able to interbreed,
00:08:23.27 and then we would call them
00:08:25.25 different species.
00:08:27.02 One of the ways
00:08:28.25 that interbreeding is prevented
00:08:30.16 is through geographic isolation.
00:08:35.06 One of the students in my lab,
00:08:36.18 Andrew Trandai,
00:08:38.07 actually helped me out
00:08:40.13 by developing this hypothetical example
00:08:42.14 that I'm going to show you
00:08:44.23 on what a speciation event
00:08:46.13 might look like,
00:08:47.23 so I have to thank Andrew
00:08:49.19 for all of the images
00:08:50.28 that are coming up in the next series.
00:08:54.06 Okay, so in our hypothetical example,
00:08:56.29 what we're looking at is
00:08:58.29 some rodent squirrel-like animal
00:09:01.00 in an environment
00:09:02.19 -- one species --
00:09:04.06 all together as one population.
00:09:07.20 So, how do we separate them
00:09:09.16 and get new populations to evolve?
00:09:11.17 Well, in Andrew's example, here,
00:09:13.23 we have flooding
00:09:15.29 and an aquatic barrier
00:09:17.27 that these animals cannot cross,
00:09:20.01 so effectively
00:09:21.29 the population in the trees
00:09:23.12 and the population in the sand
00:09:25.19 are separated now
00:09:27.16 and will be evolving independently.
00:09:30.21 Over time, if we look at each of them,
00:09:32.23 we may see differences
00:09:34.12 being incorporated
00:09:37.17 into their biology.
00:09:38.25 Just superficially,
00:09:40.06 we might see the animals
00:09:41.29 that are in the forest
00:09:43.23 turning a different color,
00:09:45.19 other aspects of their anatomy
00:09:47.15 might change
00:09:49.17 to live in the trees.
00:09:51.04 On the opposite side of our river,
00:09:54.18 we may see the populations
00:09:56.07 that are in more of a sandy desert environment
00:09:59.18 change coat color
00:10:01.09 to match that environment,
00:10:02.20 or change size
00:10:04.15 to better adjust physiologically
00:10:06.10 to this drier environment.
00:10:08.13 Then ultimately,
00:10:10.00 once these differences have occurred
00:10:12.06 over, again, a very, very long period of time,
00:10:15.01 through evolution,
00:10:16.08 what would happen if the river dried up
00:10:18.16 and these animals
00:10:20.24 were able to come back together?
00:10:22.27 Well, they might come back together
00:10:25.07 and be able to interbreed,
00:10:27.10 but they may come back together
00:10:28.29 and not recognize each other
00:10:30.18 as the same species,
00:10:32.00 and therefore,
00:10:33.16 even though they're together
00:10:34.25 in this environment,
00:10:35.29 they would not interbreed
00:10:37.18 and their independent characteristics
00:10:39.04 would be carried on
00:10:40.25 from generation to generation
00:10:42.12 in those species.
00:10:47.14 So, that was an example
00:10:49.04 of geographic isolation,
00:10:51.04 and the biggest example of geographic isolation
00:10:53.22 happened about 200 million years ago,
00:10:56.18 when Pangaea,
00:10:58.06 which was this big super continental landmass,
00:11:00.25 broke apart to give us
00:11:03.16 the different continents that we know today.
00:11:05.28 So, South America and Africa
00:11:09.16 broke apart from North America and Europe,
00:11:13.08 and those continents
00:11:15.03 moved and separated around the globe.
00:11:17.22 With that separation,
00:11:20.00 the species that were together
00:11:22.08 prior to this breakup
00:11:23.25 then became separated,
00:11:25.16 and so if we look at species
00:11:27.10 that are in Africa versus South America,
00:11:29.20 for example,
00:11:30.28 we can see animals that
00:11:32.27 perhaps came from the same lineage,
00:11:34.14 but now are very, very different,
00:11:37.06 and are in fact different species.
00:11:43.20 Okay, so we've talked about this
00:11:46.04 process of evolution
00:11:47.16 and how it can occur.
00:11:49.28 What if we want to understand
00:11:51.16 the evolutionary history
00:11:53.11 of the animals that are
00:11:55.22 alive on earth today?
00:11:58.09 Well, we have to use a different set of techniques
00:12:00.28 to do that.
00:12:02.14 Here's just some of vertebrate diversity
00:12:04.16 and, as I said at the beginning of the lecture,
00:12:07.11 we also have lots of plants
00:12:09.16 and invertebrates and insects.
00:12:11.10 So I'm just showing you a very small part
00:12:12.14 of biodiversity here.
00:12:14.14 How do we figure out,
00:12:16.06 with animals so diverse as these,
00:12:18.17 how they're related to one another?
00:12:20.16 And how they evolved through time?
00:12:23.10 Well, we can take
00:12:25.04 a very simple example
00:12:27.03 of how we construct our own family trees
00:12:29.24 over very short time periods,
00:12:31.23 over several generations, say.
00:12:33.21 We research our genealogy,
00:12:35.11 we use birth notices and death notices,
00:12:38.27 and we recall history
00:12:40.27 from our parents or grandparents,
00:12:42.29 and we can use that
00:12:45.01 to construct relationships
00:12:46.21 among our relatives and ourselves.
00:12:49.13 This is a really interesting family tree
00:12:51.25 that's on the wall of a Czech castle, actually,
00:12:55.01 and shows the relatedness
00:12:56.19 of this family,
00:12:58.04 going from a founder
00:12:59.17 down at the base of the tree, in the trunk,
00:13:01.20 up to the descendants at the top of the tree.
00:13:05.24 So, if we take a hypothetical example,
00:13:09.01 of building a family tree,
00:13:11.05 and we start with
00:13:13.06 this family of green-ish and blue-ish,
00:13:15.13 big-eared and small-eared organisms,
00:13:18.04 and try to construct how they're related,
00:13:20.21 we can just look and see how family trees
00:13:23.05 are organized.
00:13:26.01 So, here I've taken that population
00:13:27.29 and put them onto their tree
00:13:29.25 -- that I made up --
00:13:32.01 and we can see that they're related to one another.
00:13:36.10 So, the individuals
00:13:39.12 that are connected at the first branch
00:13:41.21 are siblings.
00:13:43.11 They have the same parents.
00:13:45.21 If we move back in the tree,
00:13:48.01 we're looking at the different common ancestors
00:13:51.23 of these individuals.
00:13:54.17 So, if we go back,
00:13:56.21 these groups that are bracketed
00:13:58.23 in the orange boxes
00:14:00.14 are shared pairs of grandparents,
00:14:03.20 so they'd be cousins.
00:14:06.19 And if we look down near the base,
00:14:08.28 we can see that all of these organisms
00:14:11.07 share a pair of grandparents.
00:14:13.28 Now, because we're in recent history
00:14:16.25 and we have all sorts of ways
00:14:18.13 to record our history,
00:14:19.25 we may even know what these grandparents look like,
00:14:22.19 what the common ancestors of us,
00:14:24.15 and our siblings, and cousins, look like,
00:14:28.04 and I've reconstructed them this way.
00:14:30.01 If we look at a group of animals
00:14:31.27 that's as broad as fish and mammals
00:14:33.27 and amphibians and reptiles, though,
00:14:36.04 we don't have that record,
00:14:38.18 to know what those common ancestors are
00:14:41.17 or what they looked like.
00:14:43.03 We have to use other types of approaches,
00:14:44.29 called phylogenetic approaches,
00:14:46.16 to basically try
00:14:48.26 to reconstruct the common ancestor
00:14:51.06 and how those species are related.
00:14:53.22 So, if we take this set of vertebrates,
00:14:56.22 this small number of animals,
00:14:58.20 and try to put them on a tree,
00:15:00.21 this is what it would look like,
00:15:02.04 and this is based on lots of peoples' research
00:15:04.08 over many, many years,
00:15:06.06 and I'll run you through it quickly.
00:15:08.20 On the far left,
00:15:10.18 we have the base of the vertebrate tree,
00:15:13.12 and these are lampreys,
00:15:14.28 these are animals that don't even have, really, jaws.
00:15:18.20 They have these suction discs
00:15:20.02 that rasp and grip onto other species.
00:15:22.26 As we move up the tree,
00:15:24.13 we get into things like sharks,
00:15:26.00 and skates, and rays,
00:15:27.16 that have jaws,
00:15:30.02 but they have a cartilaginous skeleton.
00:15:31.29 When we move up yet again,
00:15:33.14 we get to the bony organisms
00:15:35.03 that include the fishes,
00:15:36.18 shown with these anemone fish,
00:15:38.14 the third image from the left,
00:15:40.08 and then we get up into the tetrapods,
00:15:42.26 that include amphibians,
00:15:45.17 reptiles, birds, and mammals.
00:15:48.14 Now, how do we construct
00:15:50.10 this kind of tree when
00:15:52.14 we don't have these detailed records
00:15:53.29 that we have of families?
00:15:55.12 Well, we do it by looking at
00:15:57.23 what characteristics these organisms share
00:16:00.14 and what characteristics vary between them.
00:16:02.26 There are lots of different types of characteristics
00:16:04.15 that we can use.
00:16:08.04 So, one of the features that we look for
00:16:10.20 when we're looking at shared characteristics,
00:16:12.22 or similarities and differences among organisms,
00:16:15.20 are anatomical features,
00:16:17.22 things like the shape of bones
00:16:20.06 or where sutures
00:16:21.14 -- where bones connect to one another --
00:16:23.04 or where we see holes through our skull
00:16:25.03 or other parts of our anatomy.
00:16:27.12 Bone and other structures
00:16:29.09 from the body
00:16:30.26 provide really nice characters
00:16:32.11 that we can use to try to figure
00:16:34.03 the relatedness of organisms.
00:16:36.11 In addition to using anatomical features
00:16:39.10 to try to understand the evolutionary history
00:16:41.20 of organisms and their relatedness,
00:16:43.24 DNA is now also
00:16:46.14 providing a really powerful way
00:16:48.24 of generating characters
00:16:50.17 to try to understand
00:16:52.20 how organisms have evolved.
00:16:54.08 In particular, we can compare a single gene
00:16:56.24 among different organisms,
00:16:58.14 different animals and species,
00:17:00.17 and see how it varies and how it's similar,
00:17:03.05 and look for changes in that
00:17:07.17 organization of the gene itself
00:17:09.05 that might give us signals
00:17:11.07 about how close a species is
00:17:12.27 to another species
00:17:14.20 and the relationship among them
00:17:16.28 and to different species.
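One simple way to turn a single-gene comparison like this into numbers is to count the fraction of aligned positions at which two sequences differ. The sketch below assumes the sequences are already aligned to the same length. The three short sequences and species names are made up for illustration, and real phylogenetic methods are considerably more sophisticated than this distance.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(a != b for a, b in zip(seq_a, seq_b))
    return mismatches / len(seq_a)


# Hypothetical aligned fragments of the same gene in three species.
aligned = {
    "species_A": "ATGGCCATTGTA",
    "species_B": "ATGGCCATTGTT",
    "species_C": "ATGACCGTTCTT",
}

# Pairwise distances between every pair of species.
distances = {
    (x, y): p_distance(aligned[x], aligned[y])
    for x, y in combinations(aligned, 2)
}

# The pair with the fewest differences is the candidate for closest relatives.
closest_pair = min(distances, key=distances.get)
print(closest_pair, distances[closest_pair])
```

With these made-up fragments, species_A and species_B differ at the fewest positions, which is the kind of signal that would suggest they share a more recent common ancestor with each other than either does with species_C.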
00:17:18.18 Now, another set of data
00:17:20.12 that's been useful in understanding evolutionary history,
00:17:22.28 of course, is fossils.
00:17:24.22 They're really important.
00:17:26.01 Now, fossils provide information
00:17:28.16 about when and how features arose.
00:17:30.26 They won't, though,
00:17:32.20 provide the common ancestor.
00:17:34.07 It would be very unlikely
00:17:35.21 to actually dig up a fossil
00:17:37.12 that gives you the exact common ancestor of a species
00:17:39.24 but, nevertheless,
00:17:41.20 what they can provide us,
00:17:42.28 how they can ground our understanding
00:17:45.29 of when an organism
00:17:47.24 or particular elements and characteristics
00:17:49.11 of an organism arose,
00:17:50.28 is incredibly important.
00:17:53.25 So, to summarize
00:17:56.23 our introduction to evolution
00:17:58.09 and some of the major points we've talked about...
00:18:00.11 first, evolution is change
00:18:02.13 in the heritable characteristics of organisms
00:18:05.00 from generation to generation,
00:18:06.17 descent with modification
00:18:08.21 as proposed by Darwin.
00:18:11.08 Variation in characteristics
00:18:13.09 allows some subsets of populations
00:18:15.18 to be selected for or against.
00:18:18.20 And selection can cause change
00:18:20.14 in the characteristics
00:18:22.11 that persist in a population,
00:18:23.26 and this can allow for populations to diverge.
00:18:28.27 Reconstructing how the diversity of organisms evolved
00:18:33.09 involves making trees,
00:18:34.23 or these phylogenies that I talked about,
00:18:37.09 that show how different organisms
00:18:39.12 are related to one another.
00:18:41.19 And phylogenies, though,
00:18:43.08 depend on identifying characteristics
00:18:45.29 that are shared between organisms
00:18:48.06 and that can suggest their common ancestry.
00:18:50.07 And, again, we can get those characteristics
00:18:52.22 from morphology, from genes,
00:18:54.24 from all sorts of different sources.
00:18:57.19 Thank you.
00:00:07.17 For the last part of my lecture series,
00:00:10.11 I wanna talk about examples of natural selection in humans,
00:00:14.29 and the two particular examples
00:00:17.01 that I'm going to be talking about
00:00:19.00 are the evolution of lactose tolerance in east Africa,
00:00:22.10 and of pygmy short stature.
00:00:25.04 So if we're going to be talking about natural selection,
00:00:27.11 we have to first of course
00:00:28.29 acknowledge Charles Darwin,
00:00:31.12 who came up with the theory of natural selection.
00:00:36.10 In fact, to quote from Darwin, he said,
00:00:39.13 "This preservation of favourable variations
00:00:42.03 and the rejection of injurious variations,
00:00:44.21 I call Natural Selection.
00:00:47.06 Variations neither useful nor injurious
00:00:49.27 would not be affected by natural selection,
00:00:52.18 and would be left a fluctuating element,
00:00:55.01 as perhaps we see in the species called polymorphic."
00:00:58.10 And that was from his classic book
00:01:00.06 On The Origin of Species,
00:01:01.24 published in 1859,
00:01:03.25 and you might recognize from our first lecture,
00:01:07.03 that this is really talking about genetic drift,
00:01:09.27 random fluctuations.
00:01:13.13 However, part of the evolutionary change that we see
00:01:18.12 is not just going to be due to random genetic drift,
00:01:21.03 it's also going to be due to natural selection.
00:01:24.18 And so, according to that theory,
00:01:27.10 natural variation exists and is heritable,
00:01:30.02 more organisms are born than can survive,
00:01:32.11 and therefore organisms best suited to the environment
00:01:35.07 survive more often,
00:01:36.25 and slight differences can accumulate in a species over time.
00:01:40.24 So this is the idea of gradual evolution of a species
00:01:43.27 by natural selection.
00:01:45.18 And this is Huxley,
00:01:47.03 who was also known as Darwin's bulldog
00:01:50.11 because he was the big proponent of his theory,
00:01:52.17 and he said,
00:01:54.05 "How extremely stupid not to have thought of that!"
00:01:57.12 So when Darwin first came up with his theory of natural selection,
00:02:01.08 there was really no concept of genetics
00:02:04.12 as we know it today.
00:02:06.02 In fact, it wasn't until the late 1800s
00:02:08.12 that Mendel proposed his theory of genetics.
00:02:13.02 So in the 1930s and 1940s
00:02:15.22 there was sort of a synthesis of natural selection
00:02:19.09 and genetics and mathematics,
00:02:22.16 population genetics,
00:02:23.25 and at that time it was proposed that genetic variation in populations
00:02:27.05 arises by chance through mutation and recombination,
00:02:31.15 that evolution consists primarily of changes in the
00:02:34.12 frequencies of alleles between one generation and another,
00:02:37.25 largely as a result of genetic drift,
00:02:40.24 gene flow,
00:02:42.05 and natural selection.
00:02:43.27 And that speciation occurs gradually when populations
00:02:46.00 are reproductively isolated, for example,
00:02:48.20 by geographic barriers.
00:02:52.14 And so if we look at this timeline,
00:02:54.12 starting with the Origin of Species,
00:02:56.21 and then Mendelian inheritance
00:02:59.04 is actually rediscovered in 1900,
00:03:01.19 it was first proposed in the 1860s,
00:03:04.01 but very few people knew about it at that time.
00:03:06.27 And then in the early 1900s
00:03:09.09 we have the theoretical foundations of population genetics
00:03:12.21 and then, as I mentioned,
00:03:14.09 the modern synthesis in the 30s.
00:03:16.19 And then in the 70s we have Kimura's theory of neutral evolution,
00:03:21.19 which was proposing that most changes and speciation events
00:03:25.07 are simply due to random genetic drift
00:03:27.22 and to new mutation events.
00:03:29.20 And I think that today we would say
00:03:31.14 it's a combination of all of the above.
00:03:33.19 There's certainly a lot of genetic drift that occurs,
00:03:36.02 but we know that natural selection
00:03:37.29 is having a very important influence
00:03:40.26 on the variation that we see
00:03:43.05 in terms of phenotypic variation and even disease susceptibility.
00:03:48.04 So let's look what happens
00:03:49.08 when a neutral mutation occurs in a population,
00:03:52.07 as indicated by this individual in green.
00:03:55.25 Let's look what happens as we proceed forward in generations,
00:03:59.08 and you can see there's not too many changes
00:04:01.21 in allele frequency.
00:04:03.29 But what happens when we have a beneficial mutation,
00:04:09.05 which means that it increases the fitness of the individual,
00:04:12.20 meaning that they're more likely to produce children,
00:04:16.12 and their children are more likely to produce more children,
00:04:19.00 and so on and so forth.
00:04:21.15 And so we can see that each generation,
00:04:24.04 this beneficial mutation is going to spread,
00:04:27.22 until eventually it may be nearly fixed
00:04:32.06 in the population.
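The contrast between the neutral and the beneficial mutation can be sketched with the standard deterministic recursion for selection on a single allele. This is a simplified model under stated assumptions: it tracks only the expected allele frequency, so it leaves out the random drift that makes a real neutral allele wander, and the starting frequency and the selection coefficient `s` are illustrative values, not numbers from the talk.

```python
def next_frequency(p, s):
    """Expected allele frequency after one generation.

    Carriers of the allele have relative fitness 1 + s; everyone else has fitness 1.
    With s = 0 (a neutral mutation) the expected frequency does not change.
    """
    return p * (1 + s) / (1 + p * s)


def trajectory(p0, s, generations):
    """List of expected allele frequencies, starting at p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], s))
    return freqs


neutral = trajectory(p0=0.01, s=0.0, generations=200)
beneficial = trajectory(p0=0.01, s=0.1, generations=200)

print(neutral[-1])     # stays at the starting frequency in expectation
print(beneficial[-1])  # approaches 1, that is, near fixation
```

In this sketch the neutral allele stays at its starting frequency, while the allele with a 10% fitness advantage rises every generation and ends close to fixation.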
00:04:34.17 So I want to tell you about some of our studies
00:04:37.20 focused in African populations
00:04:39.14 in which we're trying to identify
00:04:41.02 genetic signatures of natural selection,
00:04:43.19 and regions of the genome that are targets of natural selection.
00:04:47.29 And this is important
00:04:49.22 because it's thought that mutations associated with diseases
00:04:52.16 in modern populations,
00:04:54.13 like hypertension,
00:04:56.03 diabetes,
00:04:58.10 and asthma,
00:04:59.08 may have been selectively advantageous or adaptive
00:05:01.23 in past hunter-gatherer environments.
00:05:04.04 So if we can identify these regions
00:05:06.25 that are targets of selection, or actual variable sites
00:05:09.18 that are targets of selection,
00:05:11.16 those may be functionally important
00:05:13.14 and may give us a clue about disease risk.
00:05:16.11 So here I'm showing you a few of the populations
00:05:18.12 that we've studied in Africa,
00:05:20.23 and we have people who are living at very different climates,
00:05:23.15 high altitude, low altitude,
00:05:26.03 savannah, and tropical environments, for example.
00:05:30.10 We have people who have very different diets,
00:05:32.22 so agriculturalists,
00:05:35.23 or pastoralists.
00:05:37.14 And they have very different infectious disease exposures,
00:05:40.05 so they've likely undergone local adaptation
00:05:42.13 to different environments.
00:05:45.25 And I'm going to, as I mentioned,
00:05:47.16 tell you about two examples today.
00:05:49.15 The first one is the evolution of lactose tolerance
00:05:51.29 in east African pastoralist populations.
00:05:57.07 So, the ability to digest the sugar lactose,
00:06:00.21 which is quite common in milk,
00:06:03.07 is due to an enzyme called lactase-phlorizin hydrolase,
00:06:07.15 or known as lactase for short.
00:06:09.28 And lactase is expressed specifically
00:06:13.01 in the brush border cells of the small intestine,
00:06:16.25 and in individuals who maintain high levels of this enzyme
00:06:20.29 as adults,
00:06:22.23 they're able to break down the complex sugar lactose
00:06:26.16 into glucose and galactose,
00:06:29.14 which is rapidly taken up into the bloodstream.
00:06:35.19 Most mammals, and most humans,
00:06:38.19 shut down lactase activity
00:06:40.23 shortly after weaning.
00:06:42.28 So, as adults, they do not have an active form of this enzyme.
00:06:46.24 And what's going to happen is
00:06:48.19 they're not going to be able to break down that complex sugar.
00:06:51.26 It's going to go down into the lower gut,
00:06:54.13 it's going to be attacked by bacteria,
00:06:56.25 and you're going to have severe intestinal distress.
00:07:01.00 Now, it has been noted for many years by anthropologists
00:07:04.27 that there is a very strong correlation
00:07:06.27 between the lactose tolerance trait,
00:07:09.19 or you could think of it also as the lactase persistence trait,
00:07:13.26 because there's persistence of the enzyme activity as adults.
00:07:18.15 And they've seen a strong correlation
00:07:20.20 between the prevalence of that trait
00:07:23.14 with populations who traditionally practice cattle domestication
00:07:28.04 and dairying.
00:07:30.05 So for example, this trait is most common in northern Europe,
00:07:33.19 it decreases in frequency as one moves
00:07:36.23 into southern Europe
00:07:38.29 and into the Middle East.
00:07:40.24 It's very uncommon in eastern Asia
00:07:43.26 and in the Americas,
00:07:46.13 and it's uncommon in western Africa,
00:07:48.26 which is one of the reasons that we see high levels
00:07:51.17 of lactose intolerance in African Americans, for example.
00:07:55.24 But in regions of Africa where there's a high prevalence
00:07:59.04 of cattle domestication, pastoralism, and dairying,
00:08:03.14 we see a high prevalence of this trait.
00:08:07.15 So, in 2002,
00:08:10.18 there was an elegant study done
00:08:12.20 by Leena Peltonen's group in Finland,
00:08:14.28 in which they identified a genetic mutation
00:08:17.08 that regulates lactose tolerance in Europeans.
00:08:20.20 And it was located near the...
00:08:23.25 upstream of the lactase gene.
00:08:26.01 When we sequenced that region in east African pastoralists,
00:08:29.04 they didn't have it,
00:08:31.10 so we knew they must have something else.
00:08:33.13 So in order to identify those mutations,
00:08:35.21 we did something that's called a lactose tolerance test.
00:08:38.29 So, basically what we do is
00:08:42.11 we give people the sugar lactose in a powdered form,
00:08:46.20 we add water, and it basically tastes like orange Kool-Aid,
00:08:51.09 and then we have to line people up
00:08:54.17 and have them drink the lactose at the same time.
00:08:57.27 This is a group of Maasai women from Tanzania.
00:09:03.28 This is a group of pastoralists from southern Ethiopia.
00:09:11.23 And then we can use a standard diabetes monitoring kit,
00:09:16.03 and what we can do is to measure the blood glucose,
00:09:19.29 starting at baseline before they drink the lactose,
00:09:23.29 and then every 20 minutes we're gonna measure this,
00:09:27.06 over a period of about an hour.
00:09:30.02 And then we're gonna look at the maximum rise
00:09:32.25 in blood glucose.
00:09:35.15 If individuals have a rise
00:09:37.09 that is greater than 1.7 millimolar (mM)
00:09:39.20 we consider them to be lactose tolerant,
00:09:42.20 or to have the lactase persistence trait,
00:09:45.03 shown in light blue.
00:09:47.07 And if they have a rise that is less than 1.1 mM,
00:09:51.12 they're considered to be intolerant,
00:09:53.25 shown in dark blue.
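The scoring rule just described can be sketched in a few lines of Python. This is a minimal illustration, not the study's own analysis code: the function name and the "intermediate" label are assumptions, since the talk does not say how a rise between 1.1 and 1.7 mM is handled.

```python
def classify_lactose_response(glucose_mM):
    """Classify one person from a series of blood glucose readings in mM.

    glucose_mM[0] is the baseline taken before the lactose drink;
    the remaining values are the follow-up readings (every 20 minutes
    in the talk).
    """
    if len(glucose_mM) < 2:
        raise ValueError("need a baseline and at least one follow-up reading")
    baseline = glucose_mM[0]
    # The test statistic is the maximum rise above baseline.
    max_rise = max(glucose_mM[1:]) - baseline
    if max_rise > 1.7:
        return "tolerant"      # lactase persistent
    if max_rise < 1.1:
        return "intolerant"    # lactase non-persistent
    # Assumption: the talk gives no label for rises between 1.1 and 1.7 mM.
    return "intermediate"
```

A call such as `classify_lactose_response([5.0, 6.0, 7.2, 6.5])` uses the first value as baseline and the largest later value to compute the rise.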
00:09:55.18 So, we measured this trait
00:09:57.12 in nearly 500 individuals
00:09:59.17 from Tanzania, Kenya, and the Sudan,
00:10:02.00 and then we looked for association
00:10:04.10 with genetic variation that we identified
00:10:06.21 by resequencing the region
00:10:08.28 where the European variant had been identified.
00:10:13.13 And in doing so we identified
00:10:15.12 three novel genetic polymorphisms
00:10:18.21 that are associated with the lactose tolerance trait in east Africa,
00:10:22.17 and those are shown here by the boxes.
00:10:26.29 The most common was this one at position 14010,
00:10:31.06 but we also saw those others
00:10:32.24 at positions 13915 and 13907,
00:10:36.03 located roughly 14,000 basepairs
00:10:38.25 upstream of the lactase gene
00:10:41.18 which is located on chromosome 2.
00:10:44.07 Now, one of the really interesting things about this is that,
00:10:48.11 one, these regulatory mutations were pretty far away,
00:10:51.28 about 14,000 basepairs from the gene,
00:10:54.26 and they were located in an intron
00:10:57.25 in a non-coding region of a neighboring gene called MCM6.
00:11:03.00 So this is demonstrating that
00:11:04.25 functionally important variation
00:11:07.13 can actually be located in non-coding regions,
00:11:10.21 and we were able to show,
00:11:13.13 using in vitro cell line studies,
00:11:16.20 that these variants that are derived,
00:11:20.19 shown in the different colors here,
00:11:23.22 that they regulate expression
00:11:26.15 of the lactase gene using the lactase promoter.
00:11:31.10 Now, they're located very close to the mutation
00:11:35.01 associated with lactose tolerance in Europeans,
00:11:38.20 located at position 13910,
00:11:41.14 but they arose independently
00:11:43.17 due to a process called convergent evolution,
00:11:46.13 and probably due to a very strong
00:11:48.27 selective force to be able to drink milk that contains lactose,
00:11:56.15 in these different regions of the world.
00:12:00.27 What's also interesting
00:12:02.19 is that the variants that we identified
00:12:04.16 have a very distinct geographic distribution.
00:12:07.09 So the one that we found that was most common in our study
00:12:10.06 was at position 14010,
00:12:12.08 and we can see that it is pretty localized
00:12:15.01 to east Africa, to Tanzania and Kenya,
00:12:17.26 and that's the most likely site of origin of that mutation.
00:12:21.11 Interestingly, we also see it a bit in south Africa,
00:12:26.01 probably reflecting migration of pastoralists
00:12:28.29 from east Africa into that region.
00:12:32.16 The variant at position 13915
00:12:35.08 appears to have originated in the Middle East,
00:12:37.21 and we could see that it was introduced into northeast Africa,
00:12:40.17 probably by migration.
00:12:43.10 And then the variant at position 13907
00:12:46.26 likely arose in northeast Africa.
00:12:49.21 But again, one of the important take-home points is that
00:12:53.02 we have a functionally important variant
00:12:55.07 that's occurring at high frequency, sometimes as high as 40%,
00:12:59.12 and it's very geographically restricted,
00:13:02.22 and there are likely to be other mutations like that,
00:13:05.06 some of which may have implications for disease susceptibility,
00:13:09.11 again emphasizing the importance
00:13:11.23 to look amongst ethnically diverse Africans.
00:13:16.29 So the next thing we wanted to do
00:13:19.21 was to look for a signature of positive selection,
00:13:23.17 and this is the method in which we can do that.
00:13:27.21 So imagine, here in red,
00:13:30.25 imagine that this is a new mutation that has occurred, say,
00:13:34.15 one of the mutations associated with lactose tolerance.
00:13:38.00 And it's adaptive,
00:13:39.10 meaning that it increases the fitness of individuals who have it,
00:13:43.25 meaning that they're more likely to have children,
00:13:45.27 and their children are more likely to have children,
00:13:47.22 and so on.
00:13:49.28 And so it's going to increase in frequency
00:13:52.24 in the population,
00:13:54.29 and it's going to drag with it
00:13:57.15 the neighboring variants nearby.
00:14:00.03 So, you can see that when it originated, it had...
00:14:02.29 it was on a chromosome with a green variant
00:14:05.18 and a black variant.
00:14:08.03 And now these got dragged along to high frequency,
00:14:11.04 through a process known as hitchhiking.
00:14:14.07 Now, if this had gone to fixation,
00:14:17.12 meaning that everybody has it,
00:14:19.00 we would have called it a full selective sweep.
00:14:21.20 In this case, it hasn't quite reached a full selective sweep,
00:14:26.15 so we call it a partial sweep.
00:14:29.24 Now, that could just mean that
00:14:31.17 there hasn't been time for it to go to a full sweep,
00:14:33.06 or it could be that for some reason
00:14:35.17 there may be some negative aspects of having it,
00:14:38.02 and there's a reason that both variants are maintained in the population.
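To make the idea of an adaptive variant rising in frequency concrete, here is a toy deterministic model. It assumes a simple haploid selection scheme in which carriers have relative fitness 1 + s; the function name, the model, and any parameter values you pass in are illustrative assumptions, not estimates from the study.

```python
def sweep_trajectory(p0, s, generations):
    """Frequency of an advantageous variant over time.

    p0          -- starting frequency of the variant (0 < p0 < 1)
    s           -- selective advantage of carriers (hypothetical value)
    generations -- number of generations to simulate
    Uses the haploid recursion p' = p(1 + s) / (1 + s*p).
    """
    freqs = [p0]
    p = p0
    for _ in range(generations):
        p = p * (1 + s) / (1 + s * p)
        freqs.append(p)
    return freqs
```

In this sketch a partial sweep corresponds to stopping the trajectory before the frequency reaches 1; variants linked to the selected site would rise with it until recombination separates them.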
00:14:43.17 Now, after the sweep occurs,
00:14:45.19 you're going to have new mutation events
00:14:47.26 and new recombination events
00:14:49.21 shuffling up the variants
00:14:52.04 that are linked to the mutation that's adaptive.
00:14:56.12 And so that will decrease the association
00:15:01.00 observed between the mutation and the flanking variation.
00:15:05.00 And in fact,
00:15:06.15 if we have an estimate of the recombination rate,
00:15:08.27 we can use computational methods
00:15:10.24 to estimate how old this mutation is.
00:15:14.18 And that's exactly what we did here.
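The logic of dating a mutation from the decay of its association with flanking variants can be illustrated with a deliberately simple model. Assume that a flanking marker at recombination fraction c stays on the original haplotype with probability (1 - c) per generation, so after t generations the expected intact fraction is (1 - c)^t. The talk does not specify the computational method actually used, so treat this as a sketch under that assumption, with hypothetical names throughout.

```python
import math

def estimate_age_generations(fraction_intact, recomb_fraction):
    """Solve (1 - c)^t = fraction_intact for t.

    fraction_intact -- share of mutation-carrying chromosomes that still
                       carry the original flanking allele
    recomb_fraction -- per-generation recombination fraction c between
                       the mutation and the flanking marker
    """
    if not 0 < fraction_intact <= 1:
        raise ValueError("fraction_intact must be in (0, 1]")
    if not 0 < recomb_fraction < 1:
        raise ValueError("recomb_fraction must be in (0, 1)")
    return math.log(fraction_intact) / math.log(1 - recomb_fraction)

def estimate_age_years(fraction_intact, recomb_fraction, years_per_generation):
    """Convert the estimate to years; the generation time is an assumption."""
    return estimate_age_generations(fraction_intact, recomb_fraction) * years_per_generation
```

Real estimates account for many markers at once, uncertainty in the recombination map, and population history, none of which this toy version attempts.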
00:15:17.12 So shown on top
00:15:19.28 is an example from the most common mutation
00:15:22.27 that we found associated with lactose tolerance,
00:15:25.03 at position 14010.
00:15:27.18 Individuals who have the C variant
00:15:29.26 are able to digest milk,
00:15:31.22 and individuals who are homozygous are shown as red.
00:15:35.28 And what we did is we genotyped markers
00:15:38.28 going a distance of about 3 million nucleotides,
00:15:42.26 and what we would do is that if someone is homozygous,
00:15:46.20 starting at the lactose tolerance mutation,
00:15:49.02 and then we go to the next mutation.
00:15:51.02 If they're homozygous,
00:15:53.00 then we continue going.
00:15:55.13 If they underwent a recombination,
00:15:57.05 we stop the line.
00:15:59.13 And what we can basically see is that homozygosity
00:16:02.28 extends about 2 million basepairs
00:16:06.01 on chromosomes that have the lactose tolerance mutation.
00:16:09.15 But if we look at chromosomes that have the ancestral variant,
00:16:14.00 they have almost no extended haplotype homozygosity.
00:16:19.07 And so this is a classic signature of a selective sweep.
00:16:22.25 It means that this variant
00:16:24.16 was under very strong positive selection
00:16:28.14 and it rapidly increased in frequency in the population,
00:16:32.04 dragging with it the neighboring variation.
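The marker-by-marker walk described above can be sketched as follows. This is a simplified illustration: genotypes are coded as the number of copies of one allele (0, 1, or 2), a heterozygous marker (code 1) is treated as the point where the line stops, and the function name and coding scheme are assumptions rather than the study's pipeline.

```python
def homozygous_tract(positions, genotypes, core_index):
    """Length in base pairs of the homozygous run around a core marker.

    positions  -- marker coordinates in base pairs, sorted ascending
    genotypes  -- one person's genotype at each marker (0, 1, or 2)
    core_index -- index of the marker of interest (e.g. the selected site)
    """
    if genotypes[core_index] == 1:
        return 0  # heterozygous at the core itself: no tract to extend
    left = core_index
    while left > 0 and genotypes[left - 1] != 1:
        left -= 1   # keep going while the next marker is homozygous
    right = core_index
    while right < len(genotypes) - 1 and genotypes[right + 1] != 1:
        right += 1
    return positions[right] - positions[left]
```

Comparing the distribution of these tract lengths between people homozygous for the derived variant and people homozygous for the ancestral variant gives the contrast shown on the slide.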
00:16:38.16 Now, here I'm showing the European variant,
00:16:41.17 in this case the T variant
00:16:43.20 is associated with lactose tolerance,
00:16:45.27 and it shows a very similar pattern.
00:16:50.22 So using computational approaches,
00:16:52.27 we were able to estimate the age of the African mutation
00:16:57.22 to be somewhere between about 3,000-7,000 years of age.
00:17:01.28 These are the populations
00:17:03.25 that had the oldest age estimates,
00:17:06.08 and they include individuals
00:17:08.07 who speak Cushitic languages.
00:17:10.04 They came from Ethiopia,
00:17:12.10 and they practiced agro-pastoralism.
00:17:15.01 They came into Kenya and Tanzania
00:17:17.16 within the past 5,000 years.
00:17:20.23 And then we saw it at very high prevalence
00:17:23.26 and an old age estimate in Nilo-Saharan-speaking groups,
00:17:27.11 and these would include, for example, the Maasai.
00:17:30.09 Now, they came into the region more recently,
00:17:32.17 from southern Sudan,
00:17:34.08 within the past 3,000 years, so if I were to guess,
00:17:37.05 I would think perhaps this mutation
00:17:39.04 arose in the Cushitic speaking populations.
00:17:42.03 But regardless, it rapidly spread
00:17:45.07 to all of the populations in the area
00:17:47.21 because it was so selectively advantageous
00:17:51.22 and adaptive to have this mutation.
00:17:55.03 Now, because we see the correlation
00:17:59.18 between the practice of cattle domestication and pastoralism
00:18:04.11 and the rise in these mutations,
00:18:06.16 this is a really excellent example
00:18:08.22 of gene-culture co-evolution.
00:18:12.03 And in fact, what's really interesting is
00:18:15.01 that the date estimates that we came up with correlate really well
00:18:18.25 with the archaeological data,
00:18:20.17 which shows that cattle domestication
00:18:22.14 arose in the Middle East or north Africa
00:18:27.04 somewhere between 8,000-10,000 years ago,
00:18:29.26 and that corresponds with the age estimate for the European mutation,
00:18:33.18 which we inferred to be about 9,000 years old.
00:18:37.25 But cattle domestication was not introduced
00:18:40.25 south of the Saharan desert
00:18:44.20 until roughly 5,000 to 5,500 years ago,
00:18:48.21 correlating very well with the age estimate
00:18:52.03 for the mutation we found in eastern Africa.
00:18:54.24 And then it was introduced
00:18:56.13 much more recently into southern Africa.
00:19:00.12 But one could argue that perhaps Mendelian traits like lactose tolerance,
00:19:05.04 which are regulated by a single locus or gene of major effect,
00:19:10.23 are in a sense the low hanging fruit;
00:19:12.20 they're the easiest to identify.
00:19:15.04 So one thing that my lab is interested in doing
00:19:17.10 is looking at more complex traits,
00:19:19.23 and perhaps one of the most classic complex traits is height.
00:19:23.28 So, height is highly heritable,
00:19:26.19 genome wide association studies in tens of thousands of Europeans
00:19:30.12 have identified hundreds of loci,
00:19:33.06 each of very small effect,
00:19:35.06 and explaining only a very small proportion of the variation in height.
00:19:39.20 Now, interestingly, most of these are not part of
00:19:42.22 the growth hormone/IGF1 pathway,
00:19:45.07 which we know plays a very important role in idiopathic short stature,
00:19:49.16 for example.
00:19:53.05 Now, in Africa, we see some of the broadest distributions,
00:19:56.21 or ranges in height,
00:19:59.02 ranging from the very short statured Pygmies in central Africa,
00:20:03.28 and then we see some of the tallest individuals
00:20:07.13 in the Sudan and in eastern Africa.
00:20:10.23 And it's thought that these differences
00:20:12.14 may be partly due to adaptation
00:20:14.29 to different environments.
00:20:16.27 So what I want to tell you today is about
00:20:18.19 our genetic studies of short stature
00:20:22.01 in Pygmy populations from central Africa.
00:20:25.14 And, for you to fully understand and appreciate the work we've done,
00:20:29.17 I think I should first tell you a little bit about
00:20:32.07 how we went about collecting these samples
00:20:34.07 and how challenging it could be.
00:20:35.25 So, this is...
00:20:37.18 to get to one of the groups that we studied in Cameroon,
00:20:39.27 you have to cross this river,
00:20:41.28 and you have a person who has a ferry,
00:20:44.05 he's actually using a hand crank here
00:20:47.13 to get us across.
00:20:50.12 And I guess I'm very fortunate
00:20:52.19 because as a woman, I was able to get shade,
00:20:54.15 but not everybody was that lucky.
00:20:56.28 And here are some other hazards that we run into,
00:20:59.15 but I'm smiling because the head is cut off of this snake.
00:21:03.01 But I actually have to give credit to Dr. Alain Froment,
00:21:06.24 who has been studying the Pygmy populations in Cameroon
00:21:09.16 for greater than 30 years,
00:21:11.20 and he did the majority of the sample collection
00:21:14.01 in this case.
00:21:16.21 So, the genetic basis of short stature in Pygmies
00:21:19.26 is a question that's been of tremendous interest
00:21:22.08 to endocrinologists and human geneticists alike
00:21:25.07 for more than 50 years.
00:21:27.08 The particular populations that we studied
00:21:29.25 are located in Cameroon, three different groups from Cameroon,
00:21:34.27 whose mean male height is 152 cm.
00:21:40.11 And they live in very close connection and interaction
00:21:44.29 with neighboring populations who speak Bantu languages
00:21:47.29 and practice agriculture,
00:21:50.06 and their mean male height is 170 cm,
00:21:54.04 so that's quite a difference between the two.
00:21:58.16 So, the Pygmy short statured phenotype in humans
00:22:01.26 has arisen independently in different global populations.
00:22:05.12 Typically, these are populations
00:22:07.03 that live in tropical environments,
00:22:09.14 so there have been a number of hypotheses
00:22:11.11 about why this trait might be adaptive.
00:22:14.28 And these include thermoregulation,
00:22:19.04 limited food resources,
00:22:21.19 locomotion - that it may be easier to move
00:22:23.28 in a dense tropical environment if you're short,
00:22:26.18 and more recently there's a theory
00:22:30.10 that this is due to a life-history tradeoff,
00:22:32.10 and I'm going to focus on that theory.
00:22:35.08 And that has to do with the fact that
00:22:37.14 Pygmies have a remarkably short lifespan.
00:22:40.11 Their chance of living to age 15
00:22:42.06 is only about 40%,
00:22:44.18 and if they make it to age 15,
00:22:46.23 the expected lifespan is only around 25 years of age.
00:22:50.01 Now, that is due largely to very high infectious disease burden
00:22:54.05 and a very challenging life in dense tropical forests.
00:22:59.24 Now, what the study showed is that
00:23:02.18 Pygmies appear to be reaching reproduction...
00:23:05.25 they appear to be reproducing and reaching puberty
00:23:08.09 at a significantly earlier age
00:23:11.01 than other Africans.
00:23:13.13 And the growth trajectory in Pygmies
00:23:14.28 appears to be similar to other populations until the point of puberty,
00:23:19.21 and then they lack the adolescent growth spurt.
00:23:22.15 So this may be some sort of a tradeoff:
00:23:24.18 there's selection to reproduce earlier
00:23:26.22 because they're dying very young,
00:23:28.22 but that may be a tradeoff,
00:23:30.24 in that they're not undergoing the adolescent growth spurt.
00:23:35.20 Now, there have been only a handful
00:23:37.28 of physiologic and metabolic studies in Pygmies,
00:23:42.00 but nearly all of these are pointing towards
00:23:44.18 disruptions of the growth hormone/IGF1 pathway,
00:23:47.19 so this is in contrast to what we're seeing in European populations.
00:23:52.05 However, there's been quite a bit of dispute of
00:23:54.28 where along this pathway these disruptions are occurring.
00:24:00.04 So, in order to try to address these questions,
00:24:03.06 we genotyped one million single nucleotide polymorphisms
00:24:08.06 in 67 Pygmy individuals
00:24:10.23 and 58 of the neighboring Bantu individuals.
00:24:14.14 And here we can see a plot,
00:24:17.10 similar to what I've shown you before,
00:24:19.09 based on structure analysis.
00:24:21.09 And to remind you,
00:24:23.02 this is composed of a series of lines,
00:24:24.23 and each line represents a person,
00:24:26.16 and they can have ancestry
00:24:28.08 from different ancestral populations,
00:24:31.01 represented by the different colors.
00:24:33.04 So here in orange
00:24:34.29 are individuals who speak the Bantu language
00:24:38.00 and practice agriculture,
00:24:40.08 and in dark green are individuals who self-identify as Pygmies.
00:24:44.19 And what you can see is that there's been
00:24:46.22 a lot of admixture between the Pygmies
00:24:49.29 and the neighboring Bantu people.
00:24:52.03 Now, interestingly, this tends to be unidirectional,
00:24:54.28 and it tends to be gene flow between males
00:24:57.27 from the Bantu population
00:25:00.03 with females of the Pygmy population.
00:25:02.22 This is largely due to socioeconomic factors.
00:25:06.23 Now, when we look at a correlation
00:25:08.26 between ancestry and height,
00:25:11.03 we observed a very strong and significant positive correlation.
00:25:15.04 So, we can see that Pygmies who have more of the Bantu ancestry
00:25:19.23 tend to be taller.
00:25:21.19 And, so this is showing
00:25:22.25 that there's a strong genetic component to this trait.
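The ancestry-height relationship described above is a correlation between two per-person measurements, which can be computed as a Pearson coefficient. The sketch below implements it directly; the function name is an assumption, and the numbers in the example are hypothetical values chosen for illustration, not data from the study.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical values for illustration only; not data from the study.
bantu_ancestry_fraction = [0.05, 0.10, 0.20, 0.35, 0.50]
height_cm = [148.0, 150.5, 153.0, 157.0, 161.5]
r = pearson_r(bantu_ancestry_fraction, height_cm)
```

A positive `r` means that, in the sample, people with a larger estimated share of Bantu ancestry tend to be taller; a significance test would be a separate step.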
00:25:26.17 We've also worked with collaborators
00:25:28.11 to develop methods
00:25:30.19 to infer tracts of Pygmy and Bantu ancestry
00:25:35.11 across the chromosome.
00:25:36.29 So here, these are the different chromosomes,
00:25:38.18 starting with chromosome 1
00:25:40.05 and going up to chromosome 22,
00:25:42.25 and here I'm showing you an example from chromosome 3.
00:25:46.04 And in blue is showing tracts of the genome
00:25:49.03 that are Pygmy ancestry,
00:25:50.24 and in red are tracts of the genome that are Bantu ancestry,
00:25:54.23 and what we tend to see are very, very short tracts of Bantu ancestry.
00:25:58.28 And that's reflected in the fact that admixture
00:26:01.08 has been occurring over thousands of years.
00:26:06.11 Now, the next question that we wanted to address
00:26:08.17 is how do the genomes of the Pygmy hunter-gatherers
00:26:12.04 differ from the genomes of the Bantu agriculturalists
00:26:17.00 and from other groups, such as the Maasai pastoralists
00:26:20.28 from east Africa.
00:26:22.28 And to do that,
00:26:25.03 we use a number of scans of natural selection
00:26:27.28 across the genome.
00:26:29.29 Without getting into detail about the methods,
00:26:32.26 I'll just point out that you can see by the different colors here
00:26:37.00 across the different chromosomes,
00:26:39.00 here's chromosome 22 and going down to chromosome 1,
00:26:42.04 that we found a number of regions of the genome
00:26:44.22 that are targets of selection.
00:26:47.05 But there was one region in particular,
00:26:49.26 on chromosome 3,
00:26:52.04 where we saw a cluster of targets of natural selection.
00:26:57.01 And this was over about a 15 million basepair region.
00:27:01.14 Now, given our small sample size,
00:27:03.20 we have very little power
00:27:05.15 to detect a genome-wide association.
00:27:09.04 And so what we did is,
00:27:10.26 under the hypothesis that this is an adaptive trait,
00:27:13.17 we just focused on the regions of the genome
00:27:16.07 that are targets of selection, shown here,
00:27:19.11 and then we looked for an association with height.
00:27:22.10 And one of the strongest, most significant associations
00:27:25.09 was exactly in that same 15 million basepair region
00:27:29.19 of chromosome 3.
00:27:31.23 And indeed, it encompassed several genes,
00:27:34.15 one of which is DOCK3,
00:27:36.18 which has been shown to be associated with height
00:27:39.09 in non-African populations,
00:27:41.08 so we replicated that finding.
00:27:43.20 But nearby was another gene called CISH,
00:27:47.09 which is a member of the cytokine signaling family,
00:27:50.10 plays a very important role in regulating
00:27:52.28 IL-2 cytokine signaling pathway,
00:27:56.18 and studies have shown that it's associated
00:27:58.26 with resistance to a number of infectious diseases
00:28:01.18 in Africa.
00:28:04.01 Now, interestingly,
00:28:05.29 CISH also directly inhibits
00:28:07.14 human growth hormone receptor action
00:28:10.06 by blocking the STAT5 phosphorylation pathway.
00:28:13.15 And so we know that studies in mice
00:28:15.17 show that when this gene is overexpressed,
00:28:18.06 the mice are short statured.
00:28:20.23 Now, this led me to the hypothesis that,
00:28:24.14 could it be that there could actually be selection
00:28:26.19 for immune function
00:28:28.11 that is indirectly resulting
00:28:30.05 in short stature in Pygmies,
00:28:32.05 because that gene plays an important role in both.
00:28:35.29 And we need to do further functional studies,
00:28:38.20 and look at differences in gene expression
00:28:40.13 to test this hypothesis.
00:28:44.04 The last study I wanna tell you about is a study
00:28:46.20 in which we sequenced the entire genomes,
00:28:49.15 at high coverage,
00:28:51.07 of 15 African hunter-gatherers,
00:28:53.22 including 5 Pygmies,
00:28:55.28 5 Hadza,
00:28:57.10 and 5 Sandawe.
00:28:59.26 We identified over 13 million variants,
00:29:02.29 3 million of which are completely novel;
00:29:05.29 they have never previously been identified.
00:29:08.13 And that's just from 15 individuals,
00:29:10.14 so you can imagine how much variation is out there.
00:29:13.16 Many of these are novel variants...
00:29:15.27 many of these novel variants are in known regulatory sites.
00:29:21.04 So now, combining the two studies,
00:29:24.08 we wanted to ask the question,
00:29:26.03 which pathways are enriched for genes near targets of selection?
00:29:29.16 And these enriched pathways
00:29:31.25 include genes involved in neuro-endocrine signaling,
00:29:37.11 and immune function,
00:29:38.22 and interestingly, based on the whole genome sequencing study,
00:29:42.08 we saw an enrichment for genes
00:29:44.06 that play a role in pituitary function in Pygmies,
00:29:47.13 including follicle-stimulating hormone receptor,
00:29:50.13 growth hormone receptor,
00:29:52.11 HESX1, which I'll tell you more about in a moment,
00:29:55.11 and thyrotropin-releasing hormone receptor.
00:29:58.15 In fact, TRHR was one of the biggest hits
00:30:02.13 that we saw in terms of these studies of selection.
00:30:05.17 And what's interesting is that this gene
00:30:08.22 plays an important role in the hypothalamic-pituitary-thyroid axis,
00:30:12.28 influencing a number of traits that could potentially
00:30:15.14 be of adaptive significance in Pygmies.
00:30:18.26 And also of interest was that anthropologists
00:30:21.18 have noted that there is a significant difference
00:30:24.23 in the prevalence of goiter
00:30:27.00 among Pygmies and neighboring Bantu groups.
00:30:29.24 So the Pygmies have a much lower frequency of goiter
00:30:33.16 compared to the neighboring Bantu populations,
00:30:36.16 and this could reflect a biological adaptation in Pygmies
00:30:41.20 to a low iodine environment.
00:30:43.24 It's very deleterious to get goiter
00:30:46.22 because it can also lead to a disease called cretinism,
00:30:49.27 which of course is going to be very deleterious.
00:30:52.18 So again, here's an example
00:30:54.10 where something like adaptation to diet
00:30:56.13 could indirectly influence growth
00:30:58.28 or other phenotypes in the Pygmy population.
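The enrichment question raised earlier (which pathways contain more genes near targets of selection than expected by chance) is commonly framed as a one-sided hypergeometric test. The talk does not say which test was used, so the sketch below is an assumption about one reasonable approach, with hypothetical parameter names.

```python
from math import comb

def enrichment_p_value(total_genes, pathway_genes, selected_genes, overlap):
    """Probability of seeing at least `overlap` pathway genes among the
    selected genes if the selected genes were drawn at random.

    total_genes    -- number of genes in the background set
    pathway_genes  -- number of background genes annotated to the pathway
    selected_genes -- number of genes near targets of selection
    overlap        -- number of selected genes that are in the pathway
    """
    if overlap < 0:
        raise ValueError("overlap must be non-negative")
    upper = min(pathway_genes, selected_genes)
    tail = sum(
        comb(pathway_genes, k) * comb(total_genes - pathway_genes, selected_genes - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total_genes, selected_genes)
```

In practice many pathways are tested at once, so the resulting p-values would need a multiple-testing correction.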
00:31:04.01 The last thing we wanted to do
00:31:06.01 was to look for regions of the genome,
00:31:08.08 using the whole genome sequencing data,
00:31:10.13 that are specific to Pygmies,
00:31:12.20 and those are shown in green here.
00:31:16.02 Now, we identified 25 clusters in the genome,
00:31:19.23 and the largest cluster
00:31:22.27 was right in that same region of chromosome 3
00:31:25.14 that we had previously identified.
00:31:28.00 But we had missed it in the prior study,
00:31:30.11 and the reason why is because
00:31:32.17 it contains these Pygmy-specific variants,
00:31:35.08 that were not captured by the SNP array that we used,
00:31:39.17 and thus demonstrating the great importance
00:31:42.00 of doing resequencing for identifying novel
00:31:44.24 and potentially functionally important variation
00:31:47.15 in ethnically diverse populations.
00:31:50.28 Now, this cluster consisted of
00:31:55.10 44 SNPs in 100% association with each other
00:31:59.16 over 170,000 nucleotides,
00:32:03.06 shown here,
00:32:05.24 and it contained a very interesting candidate gene called HESX1.
00:32:10.10 HESX1 codes for a transcription factor
00:32:13.05 that plays a very important role
00:32:15.04 in regulating the development
00:32:17.15 of the anterior pituitary in the brain,
00:32:20.14 and that's the site of production of growth hormone,
00:32:22.23 as well as other reproductive hormones.
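The phrase "44 SNPs in 100% association with each other" corresponds to pairwise linkage disequilibrium of r² = 1 between the SNPs. A minimal sketch of that statistic for two SNPs is shown here, assuming phased haplotypes coded 0/1; the function name and coding are illustrative assumptions.

```python
def ld_r_squared(hap_a, hap_b):
    """r^2 between two SNPs from phased haplotypes.

    hap_a, hap_b -- allele (0 or 1) carried at SNP A and SNP B on each
                    chromosome, in the same chromosome order
    """
    n = len(hap_a)
    if n == 0 or n != len(hap_b):
        raise ValueError("need two non-empty haplotype lists of equal length")
    p_a = sum(hap_a) / n
    p_b = sum(hap_b) / n
    p_ab = sum(1 for a, b in zip(hap_a, hap_b) if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic site")
    d = p_ab - p_a * p_b  # the disequilibrium coefficient D
    return d * d / denom
```

Two SNPs whose alleles always travel together give r² = 1, while SNPs that assort independently give a value near 0.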
00:32:25.11 Now, interestingly,
00:32:27.06 we identified a non-synonymous,
00:32:29.28 so an amino acid change, basically,
00:32:33.23 in this gene
00:32:36.03 that had been previously associated
00:32:38.13 with idiopathic short stature in humans.
00:32:41.26 But it turns out that this variant
00:32:44.01 is present at about a 20% frequency in other Africans.
00:32:47.12 So what we hypothesize is that
00:32:49.13 there's something about this region
00:32:51.22 that may be altering gene expression of HESX1
00:32:55.07 or other genes in that region.
00:32:58.01 Upstream, we found another cluster
00:33:01.18 near this gene POU1F1, also known as Pit-1 in mouse,
00:33:07.13 and again this codes for a transcription factor
00:33:09.18 that plays a critical role in regulating growth hormone expression.
00:33:14.23 So another excellent candidate gene.
00:33:17.28 Now, what is interesting is that
00:33:19.27 both of these clusters, or genes,
00:33:23.18 are amongst the most differentiated regions
00:33:26.27 of the Pygmy genomes,
00:33:28.27 compared to genomes from elsewhere in Africa.
00:33:31.29 So we then picked out some of the SNPs in these regions
00:33:37.13 and genotyped them in a larger set
00:33:39.19 of western and eastern Pygmies,
00:33:41.26 and we showed that they are statistically
00:33:44.02 associated with short stature in Pygmies.
00:33:47.29 So the next step is going to be
00:33:49.24 to try to make models
00:33:52.01 that express these variants using transgenic mouse models,
00:33:56.06 and see what the phenotype looks like.
00:34:00.19 So that leads us to a number of hypotheses.
00:34:03.19 One, is that alterations in the growth hormone/IGF1 pathway
00:34:07.15 play a role in the short stature trait in Pygmies.
00:34:13.01 Two, is that anterior pituitary hormones
00:34:15.10 may play a central role in the Pygmy phenotype,
00:34:18.09 influencing growth, reproduction,
00:34:20.15 metabolism, and immunity.
00:34:24.00 And thirdly, that short stature
00:34:26.16 could be a byproduct of selection
00:34:28.11 acting on pleiotropic loci.
00:34:31.04 So if we look here,
00:34:32.21 one of the candidate loci that we identified is HESX1.
00:34:36.13 That's going to influence expression and development
00:34:39.20 of the anterior pituitary,
00:34:42.02 site of production of growth hormone.
00:34:44.20 Growth hormone expression is also regulated
00:34:46.23 by this other gene we found, POU1F1.
00:34:50.04 And this CISH regulates growth hormone receptor.
00:34:54.17 Now, if we look at the downstream effects
00:34:56.24 of growth hormone,
00:34:59.07 growth hormone, when it binds to growth hormone receptor,
00:35:02.18 will trigger off expression of IGF1,
00:35:06.12 predominantly from the liver, but from other tissues as well.
00:35:10.06 IGF1 will have an effect on muscle growth
00:35:13.14 and also on bone growth and height,
00:35:16.02 but the other impact, or the other role of growth hormone
00:35:20.12 is that it also influences insulin metabolism,
00:35:24.06 it influences fat metabolism.
00:35:28.01 And then we know that infectious disease
00:35:30.01 alters immune response and cytokine levels,
00:35:33.08 and that these can influence gene expression from CISH,
00:35:36.11 or other genes that are in this pathway.
00:35:40.09 So, when we go back to Africa to study the Pygmies,
00:35:42.28 what we would ultimately like to do next
00:35:45.16 is to measure all of the phenotypes,
00:35:48.01 because if you want to understand something
00:35:50.04 like the evolution of short stature in Pygmies,
00:35:52.19 I think you can't just be looking at stature
00:35:55.09 because the growth hormone pathway
00:35:58.25 plays a role in all of these different traits,
00:36:01.01 so we need to be looking at this as an integrative picture.
00:36:06.01 And in fact, our approach in the future
00:36:08.26 is to use an integrative genomics approach
00:36:11.24 combining whole genome data,
00:36:14.15 data on protein variation from blood,
00:36:17.25 epigenetic variation,
00:36:19.21 which can be influenced by diet and environment,
00:36:22.12 gene expression,
00:36:24.10 we're starting to look at the microbiome,
00:36:27.16 which is the spectrum of bacteria in the gut,
00:36:32.05 because that can not only be influenced by diet,
00:36:35.16 it can also have an influence on the metabolome,
00:36:38.12 or the set of all the metabolites, for example,
00:36:40.27 in blood.
00:36:42.20 And we want to combine that information
00:36:44.25 together with information on diet
00:36:46.22 and other environmental factors,
00:36:48.29 to try to identify genetic and environmental factors
00:36:52.15 that play a role in short stature
00:36:55.05 and in other anthropometric,
00:36:58.01 and metabolic traits.
00:37:00.20 One of the other approaches we can take
00:37:02.20 to distinguish the role of genetics and environment is, for example,
00:37:06.00 to look at individuals of the same or similar ethnic background,
00:37:10.29 but living in an urban versus a rural environment.
00:37:16.21 We can also take a different...
00:37:18.14 the opposite approach.
00:37:20.00 We can look at individuals who have
00:37:22.06 very different genetic ancestries,
00:37:25.03 but live in similar environments.
00:37:27.13 So for example,
00:37:29.20 this is a girl who is from the Fulani population,
00:37:33.17 and here's a neighboring...
00:37:35.19 an individual from the Tupuri population.
00:37:38.26 So they are genetically very differentiated,
00:37:41.20 but live in a similar environment,
00:37:43.16 yet the Fulani seem to have some innate resistance
00:37:47.06 to malaria infection.
00:37:50.03 By contrast, the San,
00:37:53.09 from southern Africa,
00:37:54.29 are very differentiated from the Bantu,
00:37:57.15 but the San seem to have an innate susceptibility
00:38:01.09 to TB infection.
00:38:03.20 So again, by contrasting populations with different ancestry,
00:38:07.26 and living in different environments,
00:38:09.11 we may identify clues about the genetic basis
00:38:12.10 of differences in phenotypic variation
00:38:14.26 and disease susceptibility.
00:38:17.23 So in conclusion,
00:38:20.20 Africans have the highest levels of genetic diversity
00:38:23.04 within and among populations.
00:38:26.28 The demographic history of Africans
00:38:29.00 and local adaptation to different environments
00:38:31.04 have resulted in population-specific
00:38:33.01 or region-specific genetic variation.
00:38:36.25 And we need to be including
00:38:38.21 ethnically diverse Africans in genomic studies
00:38:41.17 to better identify both unique rare and common variants
00:38:45.28 which may be of functional importance,
00:38:47.28 including those that play a role in disease risk
00:38:50.13 in these populations.
00:38:52.14 And I will just end by thanking
00:38:54.04 the many individuals
00:38:55.25 who contributed to these studies,
00:38:57.29 and my funding agencies,
00:39:00.16 and particular thanks to the Africans
00:39:02.20 who have contributed to these studies.
- If you want to learn more about how scientists measure trait heritability, please review the following paper: Wray, N. & Visscher, P. (2008) Estimating trait heritability. Nature Education 1(1):29.
- Youreka Science: Hardy-Weinberg Equilibrium: Combining Darwinian Evolution and Mendelian Genetics to Study Population Genetics
- Melina Hale iBioSeminar: The Evolution of Neural Circuits and Behaviors
- Dianne Newman iBioSeminar: Microbial Diversity and Evolution
- Sarah Tishkoff iBioSeminar: African Genomics: Human Evolution and Migration
- David Haussler iBioMagazine: What Can We Learn From Sequencing Our Genomes?
Youreka Science was created by Florie Mar, PhD, while she was a cancer researcher at UCSF. While teaching 5th graders about the structure of a cell, Mar realized the importance of incorporating scientific findings into the classroom in an easy-to-understand way. From that, she started creating whiteboard drawings that explained recent papers in the scientific literature…
Sarah Tishkoff studied anthropology and genetics as an undergraduate at the University of California, Berkeley. She received her PhD in genetics from Yale University and was a post-doctoral fellow at Pennsylvania State University. From 2000-2007, she was a faculty member in the Department of Biology at the University of Maryland. Currently, Dr. Tishkoff is the…
Dr. Newman is a Professor in the Divisions of Biology and Geological and Planetary Sciences at the California Institute of Technology. When Newman began her undergraduate studies at Stanford University she wasn’t sure she was going to be a scientist because she was interested in a variety of different fields. In fact, she received her…
David Haussler is Scientific Director of the University of California Santa Cruz (UCSC) Genomics Institute and Investigator of the Howard Hughes Medical Institute (HHMI). Haussler uses mathematics, computer science, and biology to study the genomes of organisms with the goal of understanding disease and evolution. As part of the Human Genome Project, he led the…
Melina Hale is a professor of Organismal Biology and Anatomy and Neurobiology and Computational Neuroscience at the University of Chicago. Using predominantly zebrafish, Hale’s lab studies neural circuits that control limb and axis movement and how that movement changes over time. Movement changes can be seen both in the short time frame of development…