Some of Darwin’s earliest supporters found evidence of evolution in fossils. Fossils continue to inform our ideas of evolution as Neil Shubin tells us. He describes his search for a fossil link between fish and land animals and the excitement of finally finding it. In a review of session 2, Sarah Tishkoff reminds us of what fossils tell us about human evolution and migration. Further evidence of evolution comes from comparing DNA sequences. Hopi Hoekstra studies the impact of genes and the environment on the evolution of specific phenotypes. She found that populations of mice living on white sand beaches were more likely to evolve light colored fur than populations in areas with dark soil. (Why do you think this happens?) In about half of the populations, this change was due to a single mutation in the MC1 receptor. In an interesting example of convergent evolution, the same mutation causes coat color variation in some dogs and may have caused coat colors to vary in woolly mammoths!
00:00:07.10 Hi, I'm Sarah Tishkoff.
00:00:08.23 I'm a professor at the University of Pennsylvania
00:00:11.02 in the Departments of Biology and Genetics,
00:00:13.24 and today I'm gonna tell you about my research
00:00:15.18 on African integrative genomics,
00:00:17.29 and implications for human origins and disease.
00:00:21.17 So in Part 1, I'm gonna tell you a bit about
00:00:23.24 human evolutionary history,
00:00:25.24 and what the implications are of that
00:00:27.20 on the patterns of genomic variation
00:00:29.18 that we see in populations today.
00:00:34.05 So I want to start by talking about some of the
00:00:35.26 key challenges in human genomics research.
00:00:38.19 And the first one is to characterize
00:00:40.27 the immense array of genomic and phenotypic diversity
00:00:44.29 across ethnically diverse human populations.
00:00:48.14 Secondly, to understand what the evolutionary processes are
00:00:51.16 that are generating and maintaining that variation.
00:00:54.14 And third, to better understand how
00:00:56.04 gene-gene, gene-protein, and gene-environment interactions
00:00:58.28 contribute to phenotypic variability.
00:01:01.27 So first let's start with the evolutionary history
00:01:05.00 of the hominin lineage
00:01:06.26 that's leading to modern humans,
00:01:10.13 which begins around the time that we
00:01:12.03 diverged from our closest genetic relative
00:01:14.04 the Chimpanzee,
00:01:15.18 sometime between 5-7 million years ago.
00:01:18.14 So shown here are some of the fossils
00:01:20.07 from the different species
00:01:22.17 preceding anatomically modern humans.
00:01:25.16 In blue are shown fossils from the oldest lineages,
00:01:30.06 and in fact one of the oldest is Sahelanthropus,
00:01:34.10 which has been dated to at least 7 million years ago,
00:01:37.29 and there's some debate about whether it even
00:01:39.14 belongs on the hominin lineage
00:01:41.09 or if it actually preceded the Chimpanzee and human divergence.
00:01:45.26 After that, in green,
00:01:47.14 we see the Australopithecus genus.
00:01:50.14 In yellow, we see Paranthropus genus.
00:01:54.09 In orange, we have the genus Homo
00:01:56.24 and the species preceding anatomically modern humans
00:02:01.13 is Homo erectus, dated to about 2 million years ago.
00:02:06.14 And then we have the origins of
00:02:08.15 Homo neanderthalensis
00:02:11.02 and of anatomically modern humans.
00:02:13.24 Neanderthals are thought to have originated
00:02:16.00 somewhere between 300,000-400,000 years ago,
00:02:19.12 and modern humans originated
00:02:20.27 approximately 200,000 years ago.
00:02:24.03 Here's one of the best examples
00:02:26.11 of Australopithecus afarensis.
00:02:29.07 This was a set of fossils that was
00:02:31.24 discovered in the 1970s by Johanson and Gray,
00:02:36.02 named Lucy,
00:02:38.00 and Lucy was about...
00:02:41.04 she lived about 3.2 million years ago.
00:02:43.29 She was very small, only about 3 feet tall,
00:02:46.13 she had a very small brain,
00:02:48.07 and she was bipedal.
00:02:49.27 And being bipedal, in fact,
00:02:51.07 is one of the characteristics of the hominin lineage.
00:02:57.12 And, interestingly,
00:02:59.17 there have been some fossilized footprints
00:03:01.21 identified in Tanzania,
00:03:03.24 and we can see from these that there
00:03:06.08 appears to have been a mother,
00:03:08.27 from the species Australopithecus afarensis,
00:03:12.08 and she was holding the hands of her child.
00:03:14.29 And they must have been walking
00:03:16.15 in ash from recent volcanic activity,
00:03:20.06 and then that ash hardened and preserved these footprints
00:03:23.06 so that we can see them today,
00:03:24.21 and we can clearly see that they were bipedal.
00:03:29.08 So the species preceding modern humans
00:03:31.28 is called Homo erectus.
00:03:33.24 Homo erectus evolved around 2 million years ago,
00:03:39.02 and then after the origin of Homo erectus in Africa,
00:03:42.24 Homo erectus spread across Eurasia
00:03:47.17 and, indeed, shown here are some of the
00:03:49.21 oldest fossils of Homo erectus,
00:03:52.18 dated to as early as 1.9 million years ago (MYA) in Indonesia.
00:04:00.15 And this species was very successful,
00:04:03.14 lasting to as recently as 25,000 years ago
00:04:06.17 in Southeast Asia.
00:04:09.08 A very interesting recent finding was
00:04:11.20 a set of fossils identified on the island of Flores,
00:04:14.26 which is within Indonesia,
00:04:17.25 and these fossils actually show some characteristics
00:04:21.22 that look very similar to Homo erectus,
00:04:24.19 and for that reason it was proposed that
00:04:27.09 this species may have directly evolved
00:04:30.23 from a Homo erectus ancestor
00:04:33.20 that arrived on that island
00:04:36.07 about 1 million years ago
00:04:37.28 and then evolved in isolation.
00:04:39.25 And two of the very unique features of this species
00:04:42.17 is that they were very short, so again,
00:04:46.01 about the same size as Lucy, around 3 feet tall,
00:04:50.15 and secondly, that they had tiny brains.
00:04:53.14 And there's been a lot of debate about
00:04:55.01 whether this is an adaptation or in fact a pathology,
00:04:58.09 and there's still a lot of research being done,
00:05:01.03 but what was clear is that there were multiple species
00:05:04.01 outside of Africa
00:05:05.29 within the past 2 million years.
00:05:08.20 So now let's move on to the origins of
00:05:10.15 Homo neanderthalensis and Homo sapiens.
00:05:13.12 There's some question about the species preceding
00:05:16.28 Neanderthal and Homo sapiens.
00:05:19.17 Some say that it was heidelbergensis,
00:05:22.04 but there's debate about that.
00:05:24.15 However, what is clear is that the Neanderthal species
00:05:28.10 arose somewhere within the past 300,000-400,000 years,
00:05:32.15 and Homo sapiens arose within the past 200,000 years.
00:05:38.04 And this is a fossil from Neanderthals,
00:05:40.29 we can see a few features such as
00:05:44.02 the double arched and very wide brow ridges,
00:05:47.08 a broad nose,
00:05:48.28 a very large brain size,
00:05:50.27 and a retromolar space,
00:05:52.21 and in fact these species were very robust.
00:05:55.16 The males would have been over 6 feet tall,
00:05:57.15 they had very big bones,
00:05:59.19 and they had rather big brains.
00:06:02.20 In fact, here are some reconstructions of Neanderthal.
00:06:06.28 We have the old reconstruction
00:06:09.03 and then the more recent one as well.
00:06:12.11 So, anatomically modern humans, Homo sapiens sapiens,
00:06:16.06 arose approximately 200,000 years ago.
00:06:19.02 In fact, here these red dots
00:06:21.09 are representing locations where fossils have been found
00:06:24.11 of anatomically modern humans,
00:06:26.27 and the oldest fossil is
00:06:28.22 dated to around 150,000-195,000 years ago,
00:06:32.19 in Southern Ethiopia.
00:06:36.23 We also see evidence of early modern human behavior
00:06:40.10 dated to 70,000 years ago,
00:06:42.11 or even as old as 120,000 years ago,
00:06:45.16 in caves in South Africa
00:06:47.13 and also some from east Africa as well.
00:06:51.05 So after modern humans arose in Africa within the past 200,000 years,
00:06:55.08 one or a few small groups of individuals
00:06:57.25 migrated across the rest of the globe
00:07:00.11 within the past 50,000-100,000 years.
00:07:03.23 Indeed, we think that Europeans...
00:07:07.15 there were no people in Europe, actually,
00:07:09.06 until about 40,000 years ago,
00:07:11.13 and then modern humans crossed the Bering Straits
00:07:14.15 and went into the Americas
00:07:16.28 within the past 30,000 years.
00:07:19.05 The earliest migration event was actually into Australo-Melanesia,
00:07:23.11 dated to about 40,000-60,000 years ago.
00:07:26.14 And then we have much more recent migration events,
00:07:29.03 such as into the Pacific Islands,
00:07:31.12 within the past few thousand years.
00:07:34.11 Now, interestingly,
00:07:36.16 when modern humans migrated out of Africa
00:07:39.08 within the past 50,000-100,000 years,
00:07:42.05 they would have run into Neanderthals,
00:07:44.10 in fact they overlapped in their distribution.
00:07:47.08 So shown here is the distribution of Neanderthals,
00:07:50.22 and the modern humans who lived at that time
00:07:52.25 were referred to as Cro-Magnon,
00:07:55.17 and in fact we did not see anatomically modern humans
00:07:59.09 in this region, in Europe, until about 40,000 years ago.
00:08:03.03 They would have been in the Middle East a little bit earlier,
00:08:05.23 but it appears they overlapped
00:08:08.18 for about at least 10,000 years with Neanderthals.
00:08:12.13 And as we'll discuss later,
00:08:13.27 there is some evidence that there could have been actual admixture
00:08:17.05 between Neanderthal and anatomically modern humans
00:08:20.18 during that time.
00:08:22.26 So now I want to discuss the evolutionary forces
00:08:25.27 that influence the patterns of genetic variation
00:08:28.08 that we see today.
00:08:30.04 And these include mutation,
00:08:32.14 genetic drift,
00:08:35.09 and natural selection.
00:08:37.16 So let's first introduce some terminology.
00:08:40.05 The gene pool refers to the set of all genomes
00:08:42.25 in a specified population,
00:08:44.10 and here we have an example from a population of warthogs.
00:08:47.22 So where we have at a single genetic locus
00:08:51.03 two alleles, big B or little b,
00:08:54.17 and here's an example of an individual
00:08:56.11 who is homozygous for the big B allele,
00:08:59.07 and an individual homozygous for the little b allele,
00:09:02.12 and here's an individual who is heterozygous
00:09:05.08 for big B and little b.
00:09:07.12 And together, the set of alleles in that population
00:09:10.19 represents the gene pool.
00:09:13.28 So when we are doing population genetics analyses,
00:09:16.25 we can't actually go out and look at every genotype
00:09:21.00 for every individual in the population,
00:09:23.14 that would be unfeasible.
00:09:25.13 So what we typically do is to
00:09:26.23 infer frequencies by estimating them
00:09:30.10 from a random sample.
00:09:32.25 So in population genetics,
00:09:35.01 in each generation, each new individual
00:09:37.16 is viewed as drawing from a set of gametes
00:09:39.20 with alternative alleles,
00:09:41.08 so let's use an example here
00:09:43.01 in which we have a set of marbles in a bowl.
00:09:46.05 And initially, we have a distribution of
00:09:51.26 60 of the white marbles
00:09:54.13 relative to 40 of the green marbles,
00:09:56.27 and these, the white and the green,
00:09:58.08 are representing different alleles.
00:10:00.14 So let's say that we're gonna pick...
00:10:02.04 we're gonna reach into this bag
00:10:04.04 and we're gonna randomly draw out
00:10:06.09 another hundred of these marbles.
00:10:09.01 And now in the next generation
00:10:10.26 we have 80 of the white and we have 20 of the green.
00:10:15.02 We're gonna reach back in,
00:10:16.01 we're gonna grab another set of a hundred,
00:10:18.09 and now in the next generation
00:10:20.15 we have 100 of the white alleles and 0 of the green.
00:10:26.08 And this is a demonstration of
00:10:27.15 how we get changes in allele frequency over time.
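The marble example can be sketched in a few lines of Python. This is a minimal illustration, not the procedure behind the slide: it assumes each new generation of 100 marbles is drawn at random with replacement, in proportion to the colors currently in the bowl. The function name `next_generation` and the seed are made up for the example. In a bowl of 100, a single draw typically shifts the counts by only a few marbles, so the jumps quoted in the talk are best read as illustrative.

```python
import random


def next_generation(white, green, rng):
    """Draw a new bowl of the same size, sampling colors with replacement
    in proportion to their current frequencies (an assumed sampling scheme)."""
    total = white + green
    p_white = white / total
    new_white = sum(1 for _ in range(total) if rng.random() < p_white)
    return new_white, total - new_white


rng = random.Random(1)  # arbitrary seed so the run is repeatable
white, green = 60, 40   # starting counts from the talk
for generation in range(1, 4):
    white, green = next_generation(white, green, rng)
    print(f"generation {generation}: {white} white, {green} green")
```

Running it with different seeds gives different trajectories, which is the point: the frequencies change even though nothing favors either color.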
00:10:31.25 Allele frequencies will also change over time
00:10:34.23 due to genetic drift,
00:10:36.21 which is defined as random fluctuations
00:10:39.01 of allele frequencies from generation to generation,
00:10:42.03 simply due to chance.
00:10:44.19 So as we see, sometimes things could happen,
00:10:47.16 like these bugs are getting squashed,
00:10:50.00 and that's gonna change, perhaps,
00:10:52.07 the allele frequency in the next generation.
00:10:55.19 Here's another example from some ladybugs,
00:10:58.23 and we can see that, perhaps,
00:11:01.03 in the next generation, just by chance,
00:11:03.10 we're gonna see more of these ladybugs
00:11:04.29 with the dark colors,
00:11:06.12 or we might see more that are with the medium colors and dots.
00:11:10.16 And the fact is that drift is just an inevitable fact of life.
00:11:16.15 I also want to define what we mean by neutral evolution.
00:11:20.08 So we define a selectively neutral allele
00:11:22.10 as one that does not affect reproductive fitness of individuals
00:11:25.20 who carry that allele,
00:11:27.20 so its frequency in the population
00:11:29.25 changes by chance or genetic drift alone.
00:11:32.18 And here we have an example:
00:11:35.04 this is just a substitution
00:11:37.22 in the third position of the codon,
00:11:41.02 and when we have substitutions
00:11:44.09 of nucleotides in the third position,
00:11:46.20 very typically they result in a silent or synonymous change.
00:11:51.05 So here there's been a substitution,
00:11:53.00 but there's no change in the amino acid;
00:11:55.02 it remains as valine.
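A small check makes the valine case concrete. In the standard genetic code, all four DNA codons beginning with GT encode valine, so any change at the third position is silent. The sketch below assumes the codon GTT as a starting point, since the transcript does not say which codon is on the slide; the helper name `third_position_variants` is made up for the example.

```python
# Valine codons in the standard genetic code (written as DNA).
VALINE_CODONS = {"GTT", "GTC", "GTA", "GTG"}


def third_position_variants(codon):
    """Return the three codons that differ from `codon` only at position 3."""
    return [codon[:2] + base for base in "ACGT" if base != codon[2]]


# GTT is an assumed example codon, not necessarily the one shown in the talk.
variants = third_position_variants("GTT")
all_synonymous = all(v in VALINE_CODONS for v in variants)
print(variants, all_synonymous)
```

Not every amino acid behaves this way: for some codons a third-position change does alter the amino acid, which is why the talk says such changes are "very typically" silent.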
00:11:57.26 So the rate at which genetic drift occurs
00:12:00.01 is going to be inversely proportional to the population size, N,
00:12:03.23 and it's going to be very fast in small populations.
00:12:06.27 And here's an example that we can look at
00:12:08.23 based on computer simulation.
00:12:11.20 So let's assume here that we're looking at a single locus
00:12:15.15 and it has two alleles
00:12:18.06 that are at 50% frequency each,
00:12:21.25 as we can see here.
00:12:23.22 We have a sample size of 25,
00:12:27.06 and we're going to do the simulation
00:12:29.03 over 80 generations.
00:12:31.14 Now, each of these lines here
00:12:34.03 represents a different simulation,
00:12:36.27 and what we can see is that
00:12:38.23 over time alleles are either going to
00:12:44.02 be lost from the population
00:12:46.08 or they're going to reach fixation,
00:12:48.17 which means that they go to 100% frequency.
00:12:52.10 And the rate at which this occurs
00:12:54.00 is going to depend on the sample size.
00:12:56.09 So in a small sample it's gonna be very rapid,
00:12:59.19 but in this example where we have a larger sample, now N=300,
00:13:03.26 you can see that it just takes more time.
00:13:05.23 There's not as much genetic drift occurring.
00:13:08.19 Now, the end result is gonna be the same,
00:13:10.15 it just takes more time.
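A simulation of this kind can be sketched as follows. It is a minimal Wright-Fisher-style model under stated assumptions, not the program used for the slides: N is treated as the number of diploid individuals (so 2N gene copies are resampled each generation), the starting frequency is 50%, and each run lasts 80 generations. The function names, the number of runs, and the seed are all made up for the example.

```python
import random


def simulate_drift(n_individuals, p0, generations, rng):
    """Track one allele's frequency while 2N gene copies are resampled
    at random each generation (no mutation, no selection)."""
    copies = 2 * n_individuals
    count = round(p0 * copies)
    trajectory = [count / copies]
    for _ in range(generations):
        p = count / copies
        count = sum(1 for _ in range(copies) if rng.random() < p)
        trajectory.append(count / copies)
    return trajectory


def fraction_lost_or_fixed(n_individuals, runs=100, seed=0):
    """Share of runs in which the allele reached 0% or 100% within 80 generations."""
    rng = random.Random(seed)
    done = 0
    for _ in range(runs):
        final = simulate_drift(n_individuals, 0.5, 80, rng)[-1]
        if final in (0.0, 1.0):
            done += 1
    return done / runs


small = fraction_lost_or_fixed(25)    # small population, as in the first slide
large = fraction_lost_or_fixed(300)   # larger population, as in the second slide
print(f"N=25: {small:.2f} of runs lost or fixed; N=300: {large:.2f}")
```

With the small population, many runs have already ended in loss or fixation after 80 generations, while with N=300 most are still polymorphic, which matches the qualitative picture described in the talk.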
00:13:14.09 The change in allele frequency also is going to depend
00:13:17.27 on the initial allele frequencies.
00:13:19.20 So in this particular case,
00:13:21.05 we've now changed the starting frequency:
00:13:23.20 it's not 50%, it's now 10%.
00:13:27.06 And you can see that there's much more
00:13:29.28 probability of loss of the allele in this case,
00:13:34.11 and here we have just one of the alleles reaching fixation.
00:13:42.08 So again, in this particular case,
00:13:44.05 about 1 out of 10 will eventually become fixed,
00:13:47.14 or reach 100% frequency.
00:13:51.09 Now here's an example from a large population.
00:13:54.01 It'll take longer for this to occur,
00:13:56.02 but the proportion of alleles are gonna be
00:13:58.12 roughly the same,
00:13:59.29 so again roughly 1 out of 10 will go to fixation,
00:14:03.06 it's just gonna take longer.
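The "1 out of 10" figure reflects a general property of neutral alleles: the chance of eventual fixation is roughly equal to the starting frequency, whatever the population size. The sketch below checks this by simulation under simple assumptions (random resampling of a fixed number of gene copies, no selection or mutation). The population sizes, run counts, and names such as `fixation_fraction` are illustrative choices, not values from the talk.

```python
import random


def run_until_absorbed(copies, start_count, rng):
    """Resample gene copies each generation until the allele is lost or fixed.
    Returns True if it ended at 100% frequency."""
    count = start_count
    while 0 < count < copies:
        p = count / copies
        count = sum(1 for _ in range(copies) if rng.random() < p)
    return count == copies


def fixation_fraction(copies, start_count, runs, seed):
    """Fraction of independent runs that end in fixation."""
    rng = random.Random(seed)
    fixed = sum(run_until_absorbed(copies, start_count, rng) for _ in range(runs))
    return fixed / runs


# Both populations start the allele at 10% frequency.
small_pop = fixation_fraction(copies=20, start_count=2, runs=2000, seed=0)
larger_pop = fixation_fraction(copies=60, start_count=6, runs=2000, seed=0)
print(f"small: {small_pop:.3f}, larger: {larger_pop:.3f}")
```

Both fractions should come out close to 0.10; the larger population simply needs more generations per run to get there.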
00:14:05.16 Other important terms in population genetics
00:14:07.26 are bottleneck and founder effects,
00:14:10.08 and this is because genetic drift
00:14:11.23 has a large effect on allele frequencies
00:14:14.10 when a population originates
00:14:16.05 via a small number of people from a larger population.
00:14:19.16 So here we have an example of a bottleneck,
00:14:22.10 and what a bottleneck means is that
00:14:24.01 there's been a decrease in population size
00:14:26.21 at some time in the past.
00:14:28.14 So you can think of it as a population crash.
00:14:31.10 And what happens when the population is very small,
00:14:34.28 you're going to have a higher rate of genetic drift,
00:14:37.12 and we can see here that these alleles,
00:14:39.20 which are represented by the different colors,
00:14:42.00 have shifted from what we're seeing
00:14:44.18 back at this earlier time.
00:14:46.25 Now we go through the bottleneck,
00:14:48.19 and now we're seeing predominantly
00:14:50.07 these white and black alleles.
00:14:53.09 Another example we can look at is a founder event,
00:14:57.20 which is sort of a special case of a bottleneck event.
00:15:00.11 And in this case it's where a population, a small population,
00:15:05.03 breaks off from the larger population,
00:15:07.25 and again there's going to be increased genetic drift
00:15:10.26 in this initially small population
00:15:13.12 and here, by chance,
00:15:15.05 we just happened to see more of these dark blue
00:15:18.12 and light blue alleles.
00:15:21.09 The pattern of variation that we see
00:15:22.23 in the human genome
00:15:24.09 is also dependent on the effective population size,
00:15:27.17 which we distinguish as capital N sub e.
00:15:32.10 And the definition of the effective population size
00:15:35.10 is the number of breeding individuals in a population.
00:15:38.19 So estimates of Ne
00:15:40.17 are most strongly influenced by population sizes
00:15:43.07 when they're at their smallest,
00:15:45.10 and it could take many generations
00:15:47.02 to recover from a bottleneck event.
00:15:49.11 So estimates of Ne in modern populations
00:15:51.21 reflect the size of the population
00:15:53.20 prior to population expansion.
00:15:56.22 Pretty consistently, studies of nuclear sequence diversity in humans
00:16:00.24 have estimated an effective population size
00:16:03.15 of about 10,000.
00:16:05.19 Now, by contrast, if we look at Chimpanzees,
00:16:08.29 the estimate is closer to 35,000.
00:16:12.14 And so what that means is that
00:16:14.01 humans have undergone a bottleneck
00:16:16.18 sometime during their evolutionary history.
00:16:19.22 So the pattern of genomic variation
00:16:21.25 that we see in modern populations today
00:16:24.00 is a reflection of our evolutionary and demographic history.
00:16:27.14 So how much do we differ?
00:16:29.17 Well, identical twins
00:16:31.27 have no differences at the nucleotide level.
00:16:35.06 If we compare unrelated humans,
00:16:36.29 we differ at about 1 out of 1,000 nucleotide sites.
00:16:41.12 And if we compare humans to our closest genetic relative, the Chimpanzee,
00:16:45.02 we differ at about 1 out of 100 sites.
00:16:47.29 So, as a whole, our species is very similar,
00:16:50.27 and that simply reflects our recent common ancestry
00:16:54.05 from Africa within the past 100,000 years.
00:16:57.06 But when you consider that there are
00:16:58.27 over 3 billion DNA bases in the genome,
00:17:02.02 that results in 3 million differences
00:17:04.16 between each pair of genomes,
00:17:06.05 more than enough to generate the diversity
00:17:08.29 that will make each of us unique.
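The arithmetic behind the 3 million figure is simple enough to check directly. The sketch below uses the round numbers quoted in the talk (about 3 billion bases, 1 difference per 1,000 sites between unrelated humans, 1 per 100 between human and chimpanzee); the human-chimpanzee total is derived here from those same rates and is not a figure stated in the talk.

```python
genome_size = 3_000_000_000      # approximate number of bases, as quoted

# Integer division keeps the results exact.
human_pair_differences = genome_size // 1000   # 1 difference per 1,000 sites
human_chimp_differences = genome_size // 100   # 1 difference per 100 sites

print(f"two unrelated humans: about {human_pair_differences:,} differences")
print(f"human vs chimpanzee:  about {human_chimp_differences:,} differences")
```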
00:17:12.02 Now I want to introduce a statistic
00:17:14.13 that we typically use to look at how much variation
00:17:17.06 there is among populations,
00:17:20.01 and this is referred to as an Fst statistic.
00:17:24.00 And it's simply looking at the proportion of genetic variation
00:17:27.03 that is within populations,
00:17:29.06 relative to that which is between populations.
00:17:32.18 Fst can be measured based upon heterozygosity,
00:17:37.20 and heterozygosity is simply a measure of genetic variation,
00:17:41.26 which is very simply calculated as
00:17:44.15 1 minus the sum of the allele frequencies squared.
00:17:49.09 And so once we calculate
00:17:51.26 the heterozygosity for each locus,
00:17:53.29 we can look at the average,
00:17:55.23 and we can look at the average within a subpopulation,
00:17:58.03 or in the total combined population.
00:18:00.29 Now, just as an example,
00:18:03.15 if we were to see here that
00:18:06.22 in the case of Fst = 1,
00:18:09.12 it means that there is no overlap at all in the allele frequencies.
00:18:13.15 So we can see that in population 1 they have all A's,
00:18:16.13 and in population 2 they have all B's.
00:18:19.15 And in the case of Fst = 0,
00:18:22.18 there is complete similarity,
00:18:26.08 so here we see exactly the same number
00:18:28.13 of A alleles and exactly the same number of B alleles.
00:18:32.01 And then here's an intermediate case
00:18:33.29 where we have about 0.11, 11%,
00:18:39.07 showing that there's just a small amount of differentiation
00:18:43.04 between these two populations.
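The two extreme cases can be reproduced with a short calculation. This sketch assumes the common heterozygosity-based form Fst = (H_T - H_S) / H_T, where H_S is the average heterozygosity within subpopulations and H_T is the heterozygosity of the pooled population, with equal weight given to each subpopulation. The transcript does not spell out the formula on the slide, so treat this as one standard definition rather than the speaker's exact method. The allele frequencies in the intermediate case are made up and are not the ones behind the 0.11 value in the talk.

```python
def heterozygosity(freqs):
    """1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def fst(pop_freqs):
    """Fst from per-population allele frequencies, weighting populations equally.
    Undefined (division by zero) if the pooled population has no variation."""
    n_pops = len(pop_freqs)
    n_alleles = len(pop_freqs[0])
    h_s = sum(heterozygosity(f) for f in pop_freqs) / n_pops
    pooled = [sum(f[i] for f in pop_freqs) / n_pops for i in range(n_alleles)]
    h_t = heterozygosity(pooled)
    return (h_t - h_s) / h_t


no_overlap = fst([[1.0, 0.0], [0.0, 1.0]])    # all A in one, all B in the other
identical = fst([[0.5, 0.5], [0.5, 0.5]])     # same frequencies in both
intermediate = fst([[0.7, 0.3], [0.3, 0.7]])  # hypothetical frequencies
print(no_overlap, identical, round(intermediate, 2))
```

The first case gives 1 and the second gives 0, matching the slide; the hypothetical intermediate case lands between them.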
00:18:46.09 So what do we see in humans?
00:18:47.29 Well, the average Fst between human populations
00:18:51.04 is about 15%,
00:18:53.15 and what that means is that the majority of genetic variation
00:18:56.04 is found within a population,
00:18:59.07 and only about 15% of the genetic diversity
00:19:02.08 differs between populations.
00:19:04.23 Again, this is reflecting our recent common ancestry in Africa,
00:19:09.00 within the past 50,000-100,000 years.
00:19:14.13 Now, interestingly,
00:19:16.09 if we were to do this calculation from Chimpanzee populations,
00:19:19.08 we see that the value is around 32%,
00:19:22.15 so there's actually a lot more differentiation
00:19:25.04 among Chimpanzee populations
00:19:27.07 than among human populations,
00:19:29.18 again reflecting our overall close genetic similarity to each other.
00:19:36.19 So I now want to talk about the
00:19:38.04 different sources of DNA that we use
00:19:40.04 to reconstruct human evolutionary history.
00:19:43.01 One source of DNA is
00:19:45.29 that which is present in the nuclear genome
00:19:48.06 that's located in the nucleus of the cell.
00:19:51.03 And there's another type of genome
00:19:53.20 which is present in the mitochondria of the cell,
00:19:56.15 and the mitochondrion is the energy-producing organelle.
00:20:02.13 So what is the difference between these different genomes?
00:20:06.03 Well, the nuclear genome
00:20:08.09 consists of 22 autosomal pairs of chromosomes
00:20:12.26 and then the sex chromosomes,
00:20:14.15 XX for females and XY for males.
00:20:17.27 The nuclear genome is about 3.4 billion bases in size,
00:20:22.02 and it consists of about 20,000 coding genes.
00:20:25.10 It's inherited from both parents,
00:20:27.21 but it also undergoes extensive recombination each generation.
00:20:32.07 But, one of the reasons it's useful is that there's
00:20:34.18 so many different locations where we can study variation,
00:20:38.08 given that there are 3 billion nucleotides,
00:20:41.02 it's just a little bit more difficult to trace them back
00:20:43.29 to a single common ancestor.
00:20:46.20 By contrast, the mitochondrial DNA genome
00:20:50.21 is very small, it's only about 16,000 nucleotides in size,
00:20:55.14 and it's circular,
00:20:57.17 and it's passed on only through the maternal lineage.
00:21:00.19 There's also no recombination
00:21:02.17 and it has a very high mutation rate.
00:21:05.00 All of these features make it very useful
00:21:07.01 for tracing evolutionary history.
00:21:09.27 So let me give you another example of what I'm referring to.
00:21:13.12 The mitochondrial DNA is inherited through the maternal lineage,
00:21:17.05 whereas the nuclear DNA is inherited from both parents.
00:21:22.08 So if we were to trace back from a present day individual,
00:21:25.26 they will have inherited their nuclear genome
00:21:28.20 from their parents,
00:21:30.17 their parents would have inherited from their set of parents,
00:21:33.28 and then their set of parents, and so on.
00:21:36.15 So we can trace it back to a large number of ancestors.
00:21:39.16 But by contrast, if we're tracing back mitochondrial DNA lineages,
00:21:44.00 we can see that they're only passed on
00:21:46.25 through the maternal lineage,
00:21:49.10 so they're essentially inherited from a single lineage.
00:21:52.03 We can trace them back to a single common female ancestor,
00:21:56.01 and that's why they've been very useful
00:21:57.29 for human evolutionary genetics studies.
00:22:00.21 So for example, if we were to consider
00:22:02.26 these dots to be mitochondrial DNA lineages,
00:22:06.20 and let's start at generation 11 at the bottom,
00:22:10.12 shown by the red dots,
00:22:12.06 and imagine those are different mitochondrial DNA sequences
00:22:15.00 from different individuals.
00:22:17.10 At some time in the past, these two individuals, for example,
00:22:22.06 coalesced back to a common ancestor,
00:22:24.26 and then this group coalesces back to a common ancestor here,
00:22:29.29 and ultimately they all coalesce back
00:22:32.20 to a single common ancestor.
00:22:35.03 Now, in the popular literature,
00:22:36.22 the single common ancestor for mitochondrial DNA
00:22:39.04 is often referred to as "mitochondrial Eve",
00:22:42.21 but one thing to remember is that
00:22:45.17 Eve was not alone, she lived within a population,
00:22:49.06 as we can see here by the other colors.
00:22:51.22 But those lineages just never made it
00:22:54.22 down to the present day.
00:22:57.25 So this is a phylogenetic tree
00:23:00.11 constructed by sequencing mitochondrial DNA
00:23:03.10 whole genome lineages
00:23:05.02 from ethnically diverse individuals.
00:23:07.19 So each individual actually represents
00:23:10.29 a branch on this tree,
00:23:13.02 and if two individuals are very closely related to each other,
00:23:16.05 they'll be very close to each other
00:23:19.01 in the tree.
00:23:21.03 So one of the first things you can see
00:23:22.19 using Chimpanzee as an outgroup
00:23:25.01 is that all modern human lineages
00:23:27.25 coalesce at about 170,000 years ago,
00:23:31.12 and so that corresponds very well with the
00:23:33.05 time of origin of anatomically modern humans.
00:23:36.23 So another thing that we can see is that
00:23:39.25 all of the oldest genetic lineages
00:23:42.26 are from African individuals.
00:23:45.22 We can also see that
00:23:48.12 the very oldest lineages
00:23:50.15 are from the San and the Mbuti pygmy hunter-gatherers,
00:23:54.28 and then the more recent lineages
00:23:57.13 are from non-African populations.
00:24:00.01 And that is a pattern that's very consistent
00:24:02.17 with the model of a recent African origin
00:24:05.12 of modern humans.
00:24:07.23 Now, another way that we can compare mitochondrial DNA sequences
00:24:11.21 is to simply count up the number of sites
00:24:14.04 at which they differ when we compare any pair of sequences.
00:24:17.23 And when we do this,
00:24:19.09 we observe that
00:24:22.11 any two African lineages will differ from each other
00:24:25.03 at many more sites than any two non-African lineages.
00:24:29.06 And again, that means that there has been more time
00:24:32.02 for variation to accumulate in Africa,
00:24:34.16 and is consistent with an African origin
00:24:37.08 of modern humans.
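Counting pairwise differences is straightforward to sketch. The example below uses short, made-up aligned sequences purely to show the calculation; they are not real mitochondrial data, and the group labels and function names are hypothetical.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Number of aligned sites at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def mean_pairwise_differences(seqs):
    """Average number of differences over every pair of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(pairwise_differences(a, b) for a, b in pairs) / len(pairs)


# Toy sequences invented for illustration only.
more_diverse_group = ["ACGTACGTAC", "ACGTTCGTAA", "TCGTACGAAC"]
less_diverse_group = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAA"]

print(mean_pairwise_differences(more_diverse_group))
print(mean_pairwise_differences(less_diverse_group))
```

A group whose lineages have had longer to accumulate mutations shows a higher average, which is the pattern the talk describes for African compared with non-African lineages.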
00:24:39.20 When we sequence the mitochondrial DNA lineages,
00:24:42.21 we can classify them as haplotypes,
00:24:45.10 and those haplotypes belong to
00:24:47.16 larger subsets of haplogroups.
00:24:50.01 You can think of a haplotype as simply
00:24:52.14 the arrangement of genetic variants along a chromosome,
00:24:55.19 or in the case of the mitochondrial DNA
00:24:57.22 there's just a single genome,
00:24:59.14 so it's really just the different nucleotide differences
00:25:02.27 amongst different mitochondrial DNA lineages.
00:25:06.24 And one of the first things that you can note is that
00:25:09.26 there are different haplogroups
00:25:11.29 in different regions of the world.
00:25:13.19 So here are some that seem to be pretty specific to Africa,
00:25:16.20 but are also present in some regions
00:25:18.20 where there may have been some gene flow
00:25:20.20 from Africa.
00:25:22.21 Then we have others that may be more common in Europe,
00:25:25.12 or in east Asia,
00:25:28.18 or in the Americas.
00:25:30.19 And for that reason,
00:25:32.11 mitochondrial DNA can be very useful for
00:25:34.11 tracing recent human migration events.
00:25:38.13 Now, by contrast,
00:25:40.02 the Y chromosome is also inherited with no recombination,
00:25:45.14 and so it can also be very useful for tracing back
00:25:48.01 through the male lineages.
00:25:50.16 And here is a phylogeny constructed from Y chromosome variation,
00:25:55.07 and as with the mitochondrial DNA,
00:25:58.08 what we see is that the oldest lineages
00:26:01.19 are specific to Africans,
00:26:04.02 and the more recent lineages
00:26:06.05 are found predominantly in non-Africans,
00:26:08.13 although we do see some in Africans as well.
00:26:11.25 Again, this is consistent with the recent African origin of modern humans.
00:26:18.14 We can also look at Y chromosome haplogroups,
00:26:22.09 and one of the things that's a little bit different
00:26:24.04 is you can see that they're a bit more differentiated
00:26:26.16 between geographic regions.
00:26:29.03 So for example,
00:26:30.24 here we just see haplogroups that are in blue,
00:26:34.04 and we see very distinct haplogroups
00:26:36.20 in the Americas, shown in purple.
00:26:39.26 And one of the reasons for that may have to do with
00:26:43.08 sex-biased migration,
00:26:46.01 that you may have, for example,
00:26:47.16 one male traveling long distances.
00:26:50.06 And it may also have to do with patterns of mating structure.
00:26:54.20 So for example, in some populations or ethnic groups,
00:26:57.23 you may have one male who has many different wives,
00:27:01.05 and because of that the effective population size of the Y chromosome
00:27:07.01 is actually smaller than the mitochondrial DNA,
00:27:09.28 and we tend to get more genetic differentiation
00:27:12.27 around the world.
00:27:15.07 So now I want to talk about analyses of ancient DNA,
00:27:18.27 for example, in this case from Neanderthal,
00:27:22.12 and these are some images of scientists
00:27:25.20 working on a Neanderthal fossil.
00:27:29.10 And this type of analysis is very challenging
00:27:32.01 for a number of reasons.
00:27:33.25 One is that DNA which is that old,
00:27:38.04 on the order of say 30,000 years old
00:27:40.10 to even 100,000 years old,
00:27:42.06 is going to be highly degraded,
00:27:44.24 and if there's any contamination
00:27:46.25 with modern human DNA,
00:27:49.02 that is much more likely to amplify
00:27:51.19 than the old degraded DNA
00:27:54.01 from the archaic species,
00:27:56.21 so one has to be extremely careful when analyzing this DNA.
00:28:01.03 Now, more recently,
00:28:02.24 there was a pinky finger bone
00:28:05.07 identified in a cave in Siberia
00:28:07.22 from a region called Denisova,
00:28:10.11 so it's referred to as the Denisova
00:28:13.21 or Denisovan genome.
00:28:16.11 Here I'm presenting a phylogenetic tree
00:28:18.29 based on mitochondrial DNA variation
00:28:21.24 comparing modern humans, shown in blue here,
00:28:26.09 to Neanderthals shown in red,
00:28:29.01 and the Denisova individual shown in yellow.
00:28:32.23 And what we can see is that the
00:28:34.17 time to most recent common ancestry in humans,
00:28:37.08 as we've already discussed,
00:28:39.00 is about 200,000 years ago.
00:28:41.13 The time to most recent common ancestry
00:28:43.14 between humans and Neanderthals
00:28:46.01 is about 500,000 years ago,
00:28:48.13 for the mitochondrial DNA lineages.
00:28:51.03 And the time to most recent common ancestry
00:28:53.20 with the Denisova mitochondrial lineages
00:28:57.08 is about 1 million years ago.
00:29:00.05 So this is demonstrating a couple of things.
00:29:02.20 From the mitochondrial DNA perspective,
00:29:05.07 there's no evidence of any admixture
00:29:07.13 with anatomically modern humans.
00:29:10.02 The Neanderthal sequences are clearly
00:29:12.18 very distinct from modern humans.
00:29:14.28 It's also showing you that there was another species, Denisova,
00:29:18.15 that appears to be distinct from the Neanderthals,
00:29:21.07 and they diverge even earlier than Neanderthals
00:29:24.09 from modern humans.
00:29:26.21 So if we were to compare pairwise nucleotide diversity,
00:29:31.01 for example,
00:29:33.02 among anatomically modern humans shown in blue,
00:29:35.24 you can see that there's not a lot of diversity,
00:29:38.15 as expected considering that
00:29:40.13 we all have a very recent common ancestry.
00:29:43.04 If you compare the modern human mitochondrial genomes to Neanderthal,
00:29:48.03 you can see that they're more divergent,
00:29:50.07 as expected, given that the mitochondrial DNA lineage
00:29:54.04 diverged about 500,000 years ago.
00:29:57.02 If we compare to the
00:29:59.03 Denisovan mitochondrial DNA lineage,
00:30:01.10 they're even more divergent.
00:30:04.04 And then if we compare to Chimpanzee,
00:30:06.14 of course as expected,
00:30:08.11 given that they diverged at least 5 million years ago,
00:30:11.14 they are the most different in terms of sequence variation.
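Pairwise nucleotide diversity, as used in the comparison above, can be sketched as the average proportion of differing sites over all pairs of aligned sequences. The function name and the toy sequences below are hypothetical. Real analyses also handle gaps, missing data, and ancient DNA damage, which this sketch ignores.

```python
from itertools import combinations


def pairwise_diversity(seqs):
    """Mean proportion of differing sites over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, and equal in length")

    total = 0.0
    n_pairs = 0
    for a, b in combinations(seqs, 2):
        # Count the sites at which the two sequences differ.
        diffs = sum(1 for x, y in zip(a, b) if x != y)
        total += diffs / length
        n_pairs += 1
    return total / n_pairs


# Toy alignment (made-up data, not real mitochondrial sequences).
toy = ["ACGTACGT", "ACGTACGA", "ACGTTCGA"]
diversity = pairwise_diversity(toy)
```

The same function applied to pairs drawn from two different groups, for example one modern human and one Neanderthal sequence, gives a between-group divergence of the kind being compared here.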
00:30:15.13 Now, several years ago
00:30:18.13 there was a draft sequence produced of
00:30:21.20 the Neanderthal genome using next-generation sequencing technology.
00:30:25.25 And this was an absolutely amazing feat,
00:30:28.17 but at the time they had very low coverage,
00:30:31.07 meaning that any particular region of the genome
00:30:33.19 was sequenced only about once or twice.
00:30:36.20 Now, more recently,
00:30:38.07 as the technology has improved,
00:30:40.05 they've gotten much better coverage of the Neanderthal sequence,
00:30:43.04 and quite recently they now have a 30-fold coverage,
00:30:46.22 meaning that on average most sites
00:30:49.03 will have been sequenced 30 times.
00:30:51.22 And so you'll have a much better accuracy
00:30:54.23 when determining nucleotide variation.
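To see why higher coverage improves accuracy, here is a minimal sketch that treats the number of reads covering a site as Poisson distributed around the mean coverage. That model is an assumption made for illustration, not something stated in the talk, and ancient DNA reads are not spread this evenly in practice. The function names and the example coverages are hypothetical.

```python
import math


def mean_coverage(n_reads, read_length, genome_size):
    """Average number of times each site is sequenced."""
    return n_reads * read_length / genome_size


def prob_depth_at_most(k, coverage):
    """Poisson probability that a site is covered k or fewer times."""
    return sum(
        math.exp(-coverage) * coverage ** i / math.factorial(i)
        for i in range(k + 1)
    )


# Fraction of sites expected to have no reads at all (assumed coverages).
missed_low = prob_depth_at_most(0, 1.5)    # low-coverage draft
missed_high = prob_depth_at_most(0, 30.0)  # 30-fold coverage
```

Under this model, a bit more than a fifth of sites get no reads at 1.5-fold coverage, while at 30-fold coverage essentially every site is read many times, so a sequencing error in one read can be outvoted by the others.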
00:31:01.07 So, when the Neanderthal genome
00:31:03.25 was compared to the human genome,
00:31:06.11 what you can do is first
00:31:08.10 look at how much divergence has occurred
00:31:11.02 since modern humans differentiated from Chimpanzees
00:31:15.10 within the past 6.5 million years.
00:31:18.12 And you can look at the divergence
00:31:20.24 that has occurred specifically in the human lineage
00:31:24.06 since they diverged from Neanderthal,
00:31:26.21 and they've only accumulated
00:31:29.07 about 8% of this total divergence.
00:31:34.08 And so the estimate of the time of population divergence
00:31:38.06 between humans and Neanderthals
00:31:40.15 is about 400,000 years ago.
00:31:43.09 Furthermore, it has been estimated that
00:31:45.24 there may have been a small amount of admixture
00:31:48.16 between Neanderthals and anatomically modern humans,
00:31:52.01 as shown by this red arrow here.
00:31:54.18 So the estimate is that about 1-2%
00:32:00.15 of the modern human genome
00:32:02.17 may be of Neanderthal ancestry.
00:32:05.03 But what is of interest is to note that
00:32:07.24 this is only present in Non-Africans.
00:32:10.13 It is not present in African genomes.
00:32:13.05 And so what we can infer from that is
00:32:15.16 that this admixture event probably occurred
00:32:18.25 before modern humans spread across the globe.
00:32:22.01 It may have occurred, for example, in the Middle East,
00:32:24.28 and that's why we're seeing it present in all Non-Africans,
00:32:29.18 and we don't see it at all in Africans.
00:32:32.15 Now, more recently, there has been
00:32:34.22 whole genome sequencing of the Denisovan individual,
00:32:39.20 and what that has shown is that
00:32:42.09 the Denisovan species, or this individual,
00:32:45.15 appears to have diverged from modern day humans
00:32:48.13 around 800,000 years ago,
00:32:51.09 consistent with what we saw from the mitochondrial DNA.
00:32:55.21 They also observed low levels of heterozygosity in Denisova,
00:32:59.21 suggesting that they may have had
00:33:01.19 a small population size.
00:33:04.06 Additionally, when a phylogenetic tree
00:33:07.24 was constructed from the nuclear DNA variation,
00:33:11.13 they could see that the modern humans
00:33:15.11 tend to cluster together,
00:33:17.09 and as we expect they're divergent
00:33:19.01 from the Denisova and the Neanderthals.
00:33:21.29 The Neanderthals tend to cluster together,
00:33:24.06 so they're clearly divergent from Denisova.
00:33:27.03 But what's interesting is if you look at how much
00:33:31.01 variation there is amongst the modern humans,
00:33:34.11 as indicated by the length of these lineages,
00:33:38.06 and then you compare that to Neanderthals,
00:33:40.14 which have very short branches,
00:33:43.06 what that suggests is
00:33:44.28 that there was not a lot of genetic variation
00:33:47.09 amongst the Neanderthals,
00:33:49.23 and therefore they may have undergone a bottleneck,
00:33:52.11 so they might have undergone a population crash
00:33:54.20 at some point in the past.
00:33:57.07 So in summary,
00:33:59.04 what we can see is that
00:34:01.23 Homo erectus left Africa
00:34:04.05 within the past 2 million years,
00:34:06.28 and spread throughout Eurasia,
00:34:09.09 giving rise, possibly,
00:34:11.09 to species like Homo floresiensis,
00:34:14.17 and surviving until quite recently,
00:34:17.12 as recently as around 25,000 years ago.
00:34:20.28 Then we have other species like Neanderthal and Denisovans,
00:34:27.02 who may have originated from a different species,
00:34:30.07 such as heidelbergensis,
00:34:33.10 and they differentiated sometime
00:34:36.12 around 600,000 or 700,000 years ago in the case of Denisova,
00:34:39.29 or in Neanderthals around 400,000 years ago.
00:34:43.05 And then we have the modern human lineage,
00:34:46.11 Homo sapiens,
00:34:49.00 which arose around 200,000 years ago
00:34:51.07 and spread out of Africa.
00:34:53.21 And when they did so,
00:34:55.02 they would have encountered these other species,
00:34:57.09 and there may have then been low levels of gene flow.
00:35:01.20 And in fact for the case of the Denisovan genome,
00:35:03.23 it appears that the gene flow
00:35:05.26 was predominantly with populations from Oceania,
00:35:10.01 implying that this admixture
00:35:12.17 may have occurred in a different location and a different time.
00:35:16.00 Now, we still don't know exactly
00:35:18.05 how much admixture there may have been
00:35:20.12 between archaic species
00:35:22.23 and modern humans in Africa,
00:35:25.01 but there's some preliminary data suggesting that
00:35:27.10 this has occurred there as well.
00:35:29.14 The problem is that the fossils don't preserve as well in Africa,
00:35:32.19 so we don't have any DNA sequences
00:35:34.26 from archaic lineages in Africa at this point.
00:35:40.01 So in conclusion,
00:35:41.18 Africa has the most genetic diversity in the world.
00:35:44.15 Human dispersals out of Africa
00:35:46.11 populated the entire world,
00:35:48.15 and we are the last of a series of hominin dispersal events
00:35:51.14 out of Africa.
Sarah Tishkoff studied anthropology and genetics as an undergraduate at the University of California, Berkeley. She received her PhD in genetics from Yale University and was a post-doctoral fellow at Pennsylvania State University. From 2000-2007, she was a faculty member in the Department of Biology at the University of Maryland. Currently, Dr. Tishkoff is the…
Dr. Neil Shubin is a Professor in the Department of Organismal Biology and Anatomy and the Committee on Evolutionary Biology at the University of Chicago. Shubin’s research focuses on understanding the evolutionary origins of new anatomical features such as limbs. Shubin is well known for his discovery of Tiktaalik roseae, the 375-million-year-old fossil…
After a short stint studying political science in college, Hopi Hoekstra switched her focus to biology. She received her B.A. in Integrative Biology from UC Berkeley, and her Ph.D. in Zoology from the University of Washington. She completed postdoctoral research at the University of Arizona, and, in 2003, she joined the faculty at UC San…