Session 4: How is Evolution Measured

Transcript of Part 4: African Genomics: Human Evolution

00:00:07.17	For the last part of my lecture series,
00:00:10.11	I wanna talk about examples of natural selection in humans,
00:00:14.29	and the two particular examples
00:00:17.01	that I'm going to be talking about
00:00:19.00	are the evolution of lactose tolerance in east Africa,
00:00:22.10	and of Pygmy short stature.
00:00:25.04	So if we're going to be talking about natural selection,
00:00:27.11	we have to first of course
00:00:28.29	acknowledge Charles Darwin,
00:00:31.12	who came up with the theory of natural selection.
00:00:36.10	In fact, to quote from Darwin, he said,
00:00:39.13	"This preservation of favourable variations
00:00:42.03	and the rejection of injurious variations,
00:00:44.21	I call Natural Selection.
00:00:47.06	Variations neither useful nor injurious
00:00:49.27	would not be affected by natural selection,
00:00:52.18	and would be left a fluctuating element,
00:00:55.01	as perhaps we see in the species called polymorphic."
00:00:58.10	And that was from his classic book
00:01:00.06	On the Origin of Species,
00:01:01.24	published in 1859,
00:01:03.25	and you might recognize from our first lecture,
00:01:07.03	that this is really talking about genetic drift,
00:01:09.27	random fluctuations.
00:01:13.13	However, part of the evolutionary change that we see
00:01:18.12	is not just going to be due to random genetic drift,
00:01:21.03	it's also going to be due to natural selection.
00:01:24.18	And so, according to that theory,
00:01:27.10	natural variation exists and is heritable,
00:01:30.02	more organisms are born than can survive,
00:01:32.11	and therefore organisms best suited to the environment
00:01:35.07	survive more often,
00:01:36.25	and slight differences can accumulate in a species over time.
00:01:40.24	So this is the idea of gradual evolution of a species
00:01:43.27	by natural selection.
00:01:45.18	And this is Huxley,
00:01:47.03	who was also known as Darwin's bulldog
00:01:50.11	because he was the big proponent of his theory,
00:01:52.17	and he said,
00:01:54.05	"How extremely stupid not to have thought of that!"
00:01:57.12	So when Darwin first came up with his theory of natural selection,
00:02:01.08	there was really no concept of genetics
00:02:04.12	as we know it today.
00:02:06.02	In fact, it wasn't until the late 1800s
00:02:08.12	that Mendel proposed his theory of genetics.
00:02:13.02	So in the 1930s and 1940s
00:02:15.22	there was sort of a synthesis of natural selection
00:02:19.09	and genetics and mathematics,
00:02:22.16	population genetics,
00:02:23.25	and at that time it was proposed that genetic variation in populations
00:02:27.05	arises by chance through mutation and recombination,
00:02:31.15	that evolution consists primarily of changes in the
00:02:34.12	frequencies of alleles between one generation and another,
00:02:37.25	largely as a result of genetic drift,
00:02:40.24	gene flow,
00:02:42.05	and natural selection.
00:02:43.27	And that speciation occurs gradually when populations
00:02:46.00	are reproductively isolated, for example,
00:02:48.20	by geographic barriers.
00:02:52.14	And so if we look at this timeline,
00:02:54.12	starting with the Origin of Species,
00:02:56.21	and then Mendelian inheritance
00:02:59.04	is actually rediscovered in 1900,
00:03:01.19	it was first proposed in the late 1800s,
00:03:04.01	but very few people knew about it at that time.
00:03:06.27	And then in the early 1900s
00:03:09.09	we have the theoretical foundations of population genetics
00:03:12.21	and then, as I mentioned,
00:03:14.09	the modern synthesis in the 30s.
00:03:16.19	And then in the 70s we have Kimura's theory of neutral evolution,
00:03:21.19	which was proposing that most changes and speciation events
00:03:25.07	are simply due to random genetic drift
00:03:27.22	and to new mutation events.
00:03:29.20	And I think that today we would say
00:03:31.14	it's a combination of all of the above.
00:03:33.19	There's certainly a lot of genetic drift that occurs,
00:03:36.02	but we know that natural selection
00:03:37.29	is having a very important influence
00:03:40.26	on the variation that we see
00:03:43.05	in terms of phenotypic variation and even disease susceptibility.
00:03:48.04	So let's look what happens
00:03:49.08	when a neutral mutation occurs in a population,
00:03:52.07	as indicated by this individual in green.
00:03:55.25	Let's look what happens as we proceed forward in generations,
00:03:59.08	and you can see there's not too many changes
00:04:01.21	in allele frequency.
00:04:03.29	But what happens when we have a beneficial mutation,
00:04:09.05	which means that it increases the fitness of the individual,
00:04:12.20	meaning that they're more likely to produce children,
00:04:16.12	and their children are more likely to produce more children,
00:04:19.00	and so on and so forth.
00:04:21.15	And so we can see that each generation,
00:04:24.04	this beneficial mutation is going to spread,
00:04:27.22	until eventually it may be nearly fixed
00:04:32.06	in the population.
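
The contrast between a neutral and a beneficial mutation can be sketched with a small simulation. This is a minimal haploid Wright-Fisher model, not the animation shown in the talk; the function name, population size, number of generations, and the selection coefficient of 0.1 are all illustrative assumptions.

```python
import random


def simulate_allele_frequency(pop_size, generations, selection_coeff=0.0,
                              start_copies=1, seed=None):
    """Haploid Wright-Fisher sketch: return the mutant allele frequency
    for the starting generation and every generation after it."""
    rng = random.Random(seed)
    copies = start_copies
    trajectory = [copies / pop_size]
    for _ in range(generations):
        freq = copies / pop_size
        # Selection: carriers of the mutation are weighted by (1 + s).
        weighted = freq * (1.0 + selection_coeff)
        prob = weighted / (weighted + (1.0 - freq))
        # Drift: each offspring draws its allele at random from the parents.
        copies = sum(1 for _ in range(pop_size) if rng.random() < prob)
        trajectory.append(copies / pop_size)
    return trajectory


# Hypothetical parameters: 500 individuals, 200 generations, one new mutation.
neutral = simulate_allele_frequency(500, 200, selection_coeff=0.0, seed=1)
beneficial = simulate_allele_frequency(500, 200, selection_coeff=0.1, seed=1)
print(f"neutral final frequency:    {neutral[-1]:.2f}")
print(f"beneficial final frequency: {beneficial[-1]:.2f}")
```

Because drift is random, any single run can lose even a beneficial mutation while it is still rare; repeating the simulation with different seeds shows that a beneficial mutation escapes loss and rises toward fixation far more often than a neutral one.
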
00:04:34.17	So I want to tell you about some of our studies
00:04:37.20	focused in African populations
00:04:39.14	in which we're trying to identify
00:04:41.02	genetic signatures of natural selection,
00:04:43.19	and regions of the genome that are targets of natural selection.
00:04:47.29	And this is important
00:04:49.22	because it's thought that mutations associated with diseases
00:04:52.16	in modern populations,
00:04:54.13	like hypertension,
00:04:56.03	diabetes,
00:04:57.08	obesity,
00:04:58.10	and asthma,
00:04:59.08	may have been selectively advantageous or adaptive
00:05:01.23	in past hunter-gatherer environments.
00:05:04.04	So if we can identify these regions
00:05:06.25	that are targets of selection, or actual variable sites
00:05:09.18	that are targets of selection,
00:05:11.16	those may be functionally important
00:05:13.14	and may give us a clue about disease risk.
00:05:16.11	So here I'm showing you a few of the populations
00:05:18.12	that we've studied in Africa,
00:05:20.23	and we have people who are living at very different climates,
00:05:23.15	high altitude, low altitude,
00:05:26.03	savannah, and tropical environments, for example.
00:05:30.10	We have people who have very different diets,
00:05:32.22	so agriculturalists,
00:05:34.08	hunter-gatherers,
00:05:35.23	or pastoralists.
00:05:37.14	And they have very different infectious disease exposures,
00:05:40.05	so they've likely undergone local adaptation
00:05:42.13	to different environments.
00:05:45.25	And I'm going to, as I mentioned,
00:05:47.16	tell you about two examples today.
00:05:49.15	The first one is the evolution of lactose tolerance
00:05:51.29	in east African pastoralist populations.
00:05:57.07	So, the ability to digest the sugar lactose,
00:06:00.21	which is quite common in milk,
00:06:03.07	is due to an enzyme called lactase-phlorizin hydrolase,
00:06:07.15	or known as lactase for short.
00:06:09.28	And lactase is expressed specifically
00:06:13.01	in the brush border cells of the small intestine,
00:06:16.25	and in individuals who maintain high levels of this enzyme
00:06:20.29	as adults,
00:06:22.23	they're able to break down the complex sugar lactose
00:06:26.16	into glucose and galactose,
00:06:29.14	which is rapidly taken up into the bloodstream.
00:06:32.23	However,
00:06:35.19	most mammals, and most humans,
00:06:38.19	shut down lactase activity
00:06:40.23	shortly after weaning.
00:06:42.28	So, as adults, they do not have an active form of this enzyme.
00:06:46.24	And what's going to happen is
00:06:48.19	they're not going to be able to break down that complex sugar.
00:06:51.26	It's going to go down into the lower gut,
00:06:54.13	it's going to be attacked by bacteria,
00:06:56.25	and you're going to have severe intestinal distress.
00:07:01.00	Now, it has been noted for many years by anthropologists
00:07:04.27	that there is a very strong correlation
00:07:06.27	between the lactose tolerance trait,
00:07:09.19	or you could think of it also as the lactase persistence trait,
00:07:13.26	because there's persistence of the enzyme activity as adults.
00:07:18.15	And they've seen a strong correlation
00:07:20.20	between the prevalence of that trait
00:07:23.14	with populations who traditionally practice cattle domestication
00:07:28.04	and dairying.
00:07:30.05	So for example, this trait is most common in northern Europe,
00:07:33.19	it decreases in frequency as one moves
00:07:36.23	into southern Europe
00:07:38.29	and into the Middle East.
00:07:40.24	It's very uncommon in eastern Asia
00:07:43.26	and in the Americas,
00:07:46.13	and it's uncommon in western Africa,
00:07:48.26	which is one of the reasons that we see high levels
00:07:51.17	of lactose intolerance in African Americans, for example.
00:07:55.24	But in regions of Africa where there's a high prevalence
00:07:59.04	of cattle domestication, pastoralism, and dairying,
00:08:03.14	we see a high prevalence of this trait.
00:08:07.15	So, in 2002,
00:08:10.18	there was an elegant study done
00:08:12.20	by Leena Peltonen's group in Finland,
00:08:14.28	in which they identified a genetic mutation
00:08:17.08	that regulates lactose tolerance in Europeans.
00:08:20.20	And it was located near the...
00:08:23.25	upstream of the lactase gene.
00:08:26.01	When we sequenced that region in east African pastoralists,
00:08:29.04	they didn't have it,
00:08:31.10	so we knew they must have something else.
00:08:33.13	So in order to identify those mutations,
00:08:35.21	we did something that's called a lactose tolerance test.
00:08:38.29	So, basically what we do is
00:08:42.11	we give people the sugar lactose in a powdered form,
00:08:46.20	we add water, and it basically tastes like orange Kool-Aid,
00:08:51.09	and then we have to line people up
00:08:54.17	and have them drink the lactose at the same time.
00:08:57.27	This is a group of Maasai women from Tanzania.
00:09:03.28	This is a group of pastoralists from southern Ethiopia.
00:09:11.23	And then we can use a standard diabetes monitoring kit,
00:09:16.03	and what we can do is to measure the blood glucose,
00:09:19.29	starting at baseline before they drink the lactose,
00:09:23.29	and then every 20 minutes we're gonna measure this,
00:09:27.06	over a period of about an hour.
00:09:30.02	And then we're gonna look at the maximum rise
00:09:32.25	in blood glucose.
00:09:35.15	If individuals have a rise
00:09:37.09	that is greater than 1.7 millimolar (mM)
00:09:39.20	we consider them to be lactose tolerant,
00:09:42.20	or to have the lactase persistence trait,
00:09:45.03	shown in light blue.
00:09:47.07	And if they have a rise that is less than 1.1 mM,
00:09:51.12	they're considered to be intolerant,
00:09:53.25	shown in dark blue.
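
The classification rule just described can be written as a short function. This is only a sketch of the thresholds stated in the talk (a rise above 1.7 mM is tolerant, below 1.1 mM is intolerant); the function name is hypothetical, and the "intermediate" label for rises between the two cutoffs is an assumption, since the talk does not say how those individuals were treated.

```python
def classify_lactose_tolerance(baseline_mm, readings_mm):
    """Classify one person from blood glucose values in millimolar (mM).

    baseline_mm: glucose measured before drinking the lactose.
    readings_mm: glucose values measured afterwards (about every 20 minutes).
    """
    # The trait is scored on the maximum rise above baseline.
    max_rise = max(readings_mm) - baseline_mm
    if max_rise > 1.7:
        return "tolerant"      # lactase persistent
    if max_rise < 1.1:
        return "intolerant"    # lactase non-persistent
    # Assumption: the talk gives no label for rises of 1.1 to 1.7 mM.
    return "intermediate"


# Made-up readings for illustration only.
print(classify_lactose_tolerance(5.0, [5.5, 7.2, 6.8]))
print(classify_lactose_tolerance(5.0, [5.3, 5.6, 5.4]))
```
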
00:09:55.18	So, we measured this trait
00:09:57.12	in nearly 500 individuals
00:09:59.17	from Tanzania, Kenya, and the Sudan,
00:10:02.00	and then we looked for association
00:10:04.10	with genetic variation that we identified
00:10:06.21	by resequencing the region
00:10:08.28	where the European variant had been identified.
00:10:13.13	And in doing so we identified
00:10:15.12	three novel genetic polymorphisms
00:10:18.21	that are associated with the lactose tolerance trait in east Africa,
00:10:22.17	and those are shown here by the boxes.
00:10:26.29	The most common was this one at position 14010,
00:10:31.06	but we also saw those others
00:10:32.24	at positions 13915 and 13907,
00:10:36.03	located roughly 14,000 basepairs
00:10:38.25	upstream of the lactase gene
00:10:41.18	which is located on chromosome 2.
00:10:44.07	Now, one of the really interesting things about this is that,
00:10:48.11	one, these regulatory mutations were pretty far away,
00:10:51.28	about 14,000 basepairs from the gene,
00:10:54.26	and they were located in an intron
00:10:57.25	in a non-coding region of a neighboring gene called MCM6.
00:11:03.00	So this is demonstrating that
00:11:04.25	functionally important variation
00:11:07.13	can actually be located in non-coding regions,
00:11:10.21	and we were able to show,
00:11:13.13	using in vitro cell line studies,
00:11:16.20	that these variants that are derived,
00:11:20.19	shown in the different colors here,
00:11:23.22	that they regulate expression
00:11:26.15	of the lactase gene using the lactase promoter.
00:11:31.10	Now, they're located very close to the mutation
00:11:35.01	associated with lactose tolerance in Europeans,
00:11:38.20	located at position 13910,
00:11:41.14	but they arose independently
00:11:43.17	due to a process called convergent evolution,
00:11:46.13	and probably due to a very strong
00:11:48.27	selective force to be able to drink milk that contains lactose,
00:11:56.15	in these different regions of the world.
00:12:00.27	What's also interesting
00:12:02.19	is that the variants that we identified
00:12:04.16	have a very distinct geographic distribution.
00:12:07.09	So the one that we found that was most common in our study
00:12:10.06	was at position 14010,
00:12:12.08	and we can see that it is pretty localized
00:12:15.01	to east Africa, to Tanzania and Kenya,
00:12:17.26	and that's the most likely site of origin of that mutation.
00:12:21.11	Interestingly, we also see it a bit in southern Africa,
00:12:26.01	probably reflecting migration of pastoralists
00:12:28.29	from east Africa into that region.
00:12:32.16	The variant at position 13915
00:12:35.08	appears to have originated in the Middle East,
00:12:37.21	and we could see that it was introduced into northeast Africa,
00:12:40.17	probably by migration.
00:12:43.10	And then the variant at position 13907
00:12:46.26	likely arose in northeast Africa.
00:12:49.21	But again, one of the important take-home points is that
00:12:53.02	we have a functionally important variant
00:12:55.07	that's occurring at high frequency, sometimes as high as 40%,
00:12:59.12	and it's very geographically restricted,
00:13:02.22	and there are likely to be other mutations like that,
00:13:05.06	some of which may have implications for disease susceptibility,
00:13:09.11	again emphasizing the importance
00:13:11.23	to look amongst ethnically diverse Africans.
00:13:16.29	So the next thing we wanted to do
00:13:19.21	was to look for a signature of positive selection,
00:13:23.17	and this is the method in which we can do that.
00:13:27.21	So imagine, here in red,
00:13:30.25	imagine that this is a new mutation that has occurred, say,
00:13:34.15	one of the mutations associated with lactose tolerance.
00:13:38.00	And it's adaptive,
00:13:39.10	meaning that it increases the fitness of individuals who have it,
00:13:43.25	meaning that they're more likely to have children,
00:13:45.27	and their children are more likely to have children,
00:13:47.22	and so on.
00:13:49.28	And so it's going to increase in frequency
00:13:52.24	in the population,
00:13:54.29	and it's going to drag with it
00:13:57.15	the neighboring variants nearby.
00:14:00.03	So, you can see that when it originated, it had...
00:14:02.29	it was on a chromosome with a green variant
00:14:05.18	and a black variant.
00:14:08.03	And now these got dragged along to high frequency,
00:14:11.04	through a process known as hitchhiking.
00:14:14.07	Now, if this had gone to fixation,
00:14:17.12	meaning that everybody has it,
00:14:19.00	we would have called it a full selective sweep.
00:14:21.20	In this case, it hasn't quite reached a full selective sweep,
00:14:26.15	so we call it a partial sweep.
00:14:29.24	Now, that could just mean that
00:14:31.17	there hasn't been time for it to go to a full sweep,
00:14:33.06	or it could be that for some reason
00:14:35.17	there may be some negative aspects of having it,
00:14:38.02	and there's a reason that both variants are maintained in the population.
00:14:43.17	Now, after the sweep occurs,
00:14:45.19	you're going to have new mutation events
00:14:47.26	and new recombination events
00:14:49.21	shuffling up the variants
00:14:52.04	that are linked to the mutation that's adaptive.
00:14:56.12	And so that will decrease the association
00:15:01.00	observed between the mutation and the flanking variation.
00:15:05.00	And in fact,
00:15:06.15	if we have an estimate of the recombination rate,
00:15:08.27	we can use computational methods
00:15:10.24	to estimate how old this mutation is.
00:15:14.18	And that's exactly what we did here.
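
The logic of dating a mutation from recombination can be illustrated with a deliberately simplified estimator. It is not the computational method used in the study, which the talk does not specify. The assumption here is that a flanking marker at recombination fraction r stays linked to the mutation with probability of about exp(-r * t) after t generations, so t is roughly -ln(P) / r, where P is the fraction of mutation-carrying chromosomes that still carry the original flanking allele. The function names, the example numbers, and the 25-year generation time are all illustrative assumptions.

```python
import math


def estimate_age_generations(fraction_intact, recomb_fraction):
    """Rough age in generations from the decay of linkage.

    fraction_intact: share of mutation-carrying chromosomes that still have
        the original flanking allele (between 0 and 1, exclusive of 0).
    recomb_fraction: recombination fraction per generation between the
        mutation and the flanking marker.
    """
    return -math.log(fraction_intact) / recomb_fraction


def estimate_age_years(fraction_intact, recomb_fraction, generation_years=25):
    """Convert the estimate to years (generation time is an assumption)."""
    return estimate_age_generations(fraction_intact, recomb_fraction) * generation_years


# Made-up example: half of the chromosomes are still intact at 0.5 cM (r = 0.005).
print(round(estimate_age_generations(0.5, 0.005), 1))
print(round(estimate_age_years(0.5, 0.005)))
```

The sketch ignores new mutations at the flanking marker and the effect of selection on the trajectory, both of which a full analysis would need to model.
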
00:15:17.12	So shown on top
00:15:19.28	is an example from the most common mutation
00:15:22.27	that we found associated with lactose tolerance,
00:15:25.03	at position 14010.
00:15:27.18	Individuals who have the C variant
00:15:29.26	are able to digest milk,
00:15:31.22	and individuals who are homozygous are shown as red.
00:15:35.28	And what we did is we genotyped markers
00:15:38.28	going a distance of about 3 million nucleotides,
00:15:42.26	and what we would do is that if someone is homozygous,
00:15:46.20	starting at the lactose tolerance mutation,
00:15:49.02	and then we go to the next mutation.
00:15:51.02	If they're homozygous,
00:15:53.00	then we continue going.
00:15:55.13	If they underwent a recombination,
00:15:57.05	we stop the line.
00:15:59.13	And what we can basically see is that homozygosity
00:16:02.28	extends about 2 million basepairs
00:16:06.01	on chromosomes that have the lactose tolerance mutation.
00:16:09.15	But if we look at chromosomes that have the ancestral variant,
00:16:14.00	they have almost no extended haplotype homozygosity.
00:16:19.07	And so this is a classic signature of a selective sweep.
00:16:22.25	It means that this variant
00:16:24.16	was under very strong positive selection
00:16:28.14	and it rapidly increased in frequency in the population,
00:16:32.04	dragging with it the neighboring variation.
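
The walk along the chromosome described above can be sketched as follows. This is a toy version for a single individual, assuming genotypes are given as pairs of alleles at ordered markers; the function name and the data are hypothetical, and real analyses use phased haplotypes and dedicated statistics rather than this simple scan.

```python
def homozygous_tract(genotypes, positions, core_index):
    """Return (start, end) positions of the homozygous run around the core.

    genotypes: list of (allele_a, allele_b) tuples, one per marker, in order.
    positions: base-pair position of each marker, in the same order.
    core_index: index of the marker carrying the mutation of interest.
    Returns None if the individual is not homozygous at the core marker.
    """
    def is_homozygous(i):
        allele_a, allele_b = genotypes[i]
        return allele_a == allele_b

    if not is_homozygous(core_index):
        return None
    # Walk left and right; stop the line at the first heterozygous marker.
    left = core_index
    while left > 0 and is_homozygous(left - 1):
        left -= 1
    right = core_index
    while right < len(genotypes) - 1 and is_homozygous(right + 1):
        right += 1
    return positions[left], positions[right]


# Made-up individual: homozygous from 0.5 Mb to 2.0 Mb around the core marker.
genotypes = [("A", "G"), ("C", "C"), ("T", "T"), ("C", "C"), ("G", "G"), ("A", "T")]
positions = [0, 500_000, 1_000_000, 1_500_000, 2_000_000, 2_500_000]
print(homozygous_tract(genotypes, positions, core_index=2))
```

Long tracts on chromosomes carrying the derived variant, next to short tracts on chromosomes carrying the ancestral one, are the pattern the plot in the talk displays.
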
00:16:38.16	Now, here I'm showing the European variant,
00:16:41.17	in this case the T variant
00:16:43.20	is associated with lactose tolerance,
00:16:45.27	and it shows a very similar pattern.
00:16:50.22	So using computational approaches,
00:16:52.27	we were able to estimate the age of the African mutation
00:16:57.22	to be somewhere between about 3,000-7,000 years of age.
00:17:01.28	These are the populations
00:17:03.25	that had the oldest age estimates,
00:17:06.08	and they include individuals
00:17:08.07	who speak Cushitic languages.
00:17:10.04	They came from Ethiopia,
00:17:12.10	and they practiced agro-pastoralism.
00:17:15.01	They came into Kenya and Tanzania
00:17:17.16	within the past 5,000 years.
00:17:20.23	And then we saw it at very high prevalence
00:17:23.26	and an old age estimate in Nilo-Saharan-speaking groups,
00:17:27.11	and these would include, for example, the Maasai.
00:17:30.09	Now, they came into the region more recently,
00:17:32.17	from southern Sudan,
00:17:34.08	within the past 3,000 years, so if I were to guess,
00:17:37.05	I would think perhaps this mutation
00:17:39.04	arose in the Cushitic speaking populations.
00:17:42.03	But regardless, it quickly, rapidly spread
00:17:45.07	to all of the populations in the area
00:17:47.21	because it was so selectively advantageous
00:17:51.22	and adaptive to have this mutation.
00:17:55.03	Now, because we see the correlation
00:17:59.18	between the practice of cattle domestication and pastoralism
00:18:04.11	and the rise in these mutations,
00:18:06.16	this is a really excellent example
00:18:08.22	of gene-culture co-evolution.
00:18:12.03	And in fact, what's really interesting is
00:18:15.01	that the date estimates that we came up with correlate really well
00:18:18.25	with the archaeological data,
00:18:20.17	which shows that cattle domestication
00:18:22.14	arose in the Middle East or north Africa
00:18:27.04	somewhere between 8,000-10,000 years ago,
00:18:29.26	and that corresponds with the age estimate for the European mutation,
00:18:33.18	which we inferred to be about 9,000 years old.
00:18:37.25	But cattle domestication was not introduced
00:18:40.25	south of the Sahara Desert
00:18:44.20	until roughly 5,000 or 5,500 years ago,
00:18:48.21	correlating very well with the age estimate
00:18:52.03	for the mutation we found in eastern Africa.
00:18:54.24	And then it was introduced
00:18:56.13	much more recently into southern Africa.
00:19:00.12	But one could argue that perhaps Mendelian traits like lactose tolerance,
00:19:05.04	which are regulated by a single locus or gene of major effect,
00:19:10.23	are in a sense the low hanging fruit;
00:19:12.20	they're the easiest to identify.
00:19:15.04	So one thing that my lab is interested in doing
00:19:17.10	is looking at more complex traits,
00:19:19.23	and perhaps one of the most classic complex traits is height.
00:19:23.28	So, height is highly heritable,
00:19:26.19	genome-wide association studies in tens of thousands of Europeans
00:19:30.12	have identified hundreds of loci,
00:19:33.06	each of very small effect,
00:19:35.06	and explaining only a very small proportion of the variation in height.
00:19:39.20	Now, interestingly, most of these are not part of
00:19:42.22	the growth hormone/IGF1 pathway,
00:19:45.07	which we know plays a very important role in idiopathic short stature,
00:19:49.16	for example.
00:19:53.05	Now, in Africa, we see some of the broadest distributions,
00:19:56.21	or ranges in height,
00:19:59.02	ranging from the very short statured Pygmies in central Africa,
00:20:03.28	and then we see some of the tallest individuals
00:20:07.13	in the Sudan and in eastern Africa.
00:20:10.23	And it's thought that these differences
00:20:12.14	may be partly due to adaptation
00:20:14.29	to different environments.
00:20:16.27	So what I want to tell you today is about
00:20:18.19	our genetic studies of short stature
00:20:22.01	in Pygmy populations from central Africa.
00:20:25.14	And, for you to fully understand and appreciate the work we've done,
00:20:29.17	I think I should first tell you a little bit about
00:20:32.07	how we went about collecting these samples
00:20:34.07	and how challenging it could be.
00:20:35.25	So, this is...
00:20:37.18	to get to one of the groups that we studied in Cameroon,
00:20:39.27	you have to cross this river,
00:20:41.28	and you have a person who has a ferry,
00:20:44.05	he's actually using a hand crank here
00:20:47.13	to get us across.
00:20:50.12	And I guess I'm very fortunate
00:20:52.19	because as a woman, I was able to get shade,
00:20:54.15	but not everybody was that lucky.
00:20:56.28	And here are some other hazards that we run into,
00:20:59.15	but I'm smiling because the head is cut off of this snake.
00:21:03.01	But I actually have to give credit to Dr. Alain Froment,
00:21:06.24	who has been studying the Pygmy populations in Cameroon
00:21:09.16	for greater than 30 years,
00:21:11.20	and he did the majority of the sample collection
00:21:14.01	in this case.
00:21:16.21	So, the genetic basis of short stature in Pygmies
00:21:19.26	is a question that's been of tremendous interest
00:21:22.08	to endocrinologists and human geneticists alike
00:21:25.07	for more than 50 years.
00:21:27.08	The particular populations that we studied
00:21:29.25	are located in Cameroon, three different groups from Cameroon,
00:21:34.27	whose mean male height is 152 cm.
00:21:40.11	And they live in very close connection and interaction
00:21:44.29	with neighboring populations who speak Bantu languages
00:21:47.29	and practice agriculture,
00:21:50.06	and their mean male height is 170 cm,
00:21:54.04	so that's quite a difference between the two.
00:21:58.16	So, the Pygmy short statured phenotype in humans
00:22:01.26	has arisen independently in different global populations.
00:22:05.12	Typically, these are populations
00:22:07.03	that live in tropical environments,
00:22:09.14	so there have been a number of hypotheses
00:22:11.11	about why this trait might be adaptive.
00:22:14.28	And these include thermoregulation,
00:22:19.04	limited food resources,
00:22:21.19	locomotion - that it may be easier to move
00:22:23.28	in a dense tropical environment if you're short,
00:22:26.18	and more recently there's a theory
00:22:30.10	that this is due to a life-history tradeoff,
00:22:32.10	and I'm going to focus on that theory.
00:22:35.08	And that has to do with the fact that
00:22:37.14	Pygmies have a remarkably short lifespan.
00:22:40.11	Their chance of living to age 15
00:22:42.06	is only about 40%,
00:22:44.18	and if they make it to age 15,
00:22:46.23	the expected lifespan is only around 25 years of age.
00:22:50.01	Now, that is due largely to very high infectious disease burden
00:22:54.05	and a very challenging life in dense tropical forests.
00:22:59.24	Now, what the study showed is that
00:23:02.18	Pygmies appear to be reaching reproduction...
00:23:05.25	they appear to be reproducing and reaching puberty
00:23:08.09	at a significantly earlier age
00:23:11.01	than other Africans.
00:23:13.13	And the growth trajectory in Pygmies
00:23:14.28	appears to be similar to other populations until the point of puberty,
00:23:19.21	and then they lack the adolescent growth spurt.
00:23:22.15	So this may be some sort of a tradeoff:
00:23:24.18	there's selection to reproduce earlier
00:23:26.22	because they're dying very young,
00:23:28.22	but that may be a tradeoff,
00:23:30.24	in that they're not undergoing the adolescent growth spurt.
00:23:35.20	Now, there have been only a handful
00:23:37.28	of physiologic and metabolic studies in Pygmies,
00:23:42.00	but nearly all of these are pointing towards
00:23:44.18	disruptions of the growth hormone/IGF1 pathway,
00:23:47.19	so this is in contrast to what we're seeing in European populations.
00:23:52.05	However, there's been quite a bit of dispute of
00:23:54.28	where along this pathway these disruptions are occurring.
00:24:00.04	So, in order to try to address these questions,
00:24:03.06	we genotyped one million single nucleotide polymorphisms
00:24:08.06	in 67 Pygmy individuals
00:24:10.23	and 58 of the neighboring Bantu individuals.
00:24:14.14	And here we can see a plot,
00:24:17.10	similar to what I've shown you before,
00:24:19.09	based on structure analysis.
00:24:21.09	And to remind you,
00:24:23.02	this is composed of a series of lines,
00:24:24.23	and each line represents a person,
00:24:26.16	and they can have ancestry
00:24:28.08	from different ancestral populations,
00:24:31.01	represented by the different colors.
00:24:33.04	So here in orange
00:24:34.29	are individuals who speak Bantu languages
00:24:38.00	and practice agriculture,
00:24:40.08	and in dark green are individuals who self-identify as Pygmies.
00:24:44.19	And what you can see is that there's been
00:24:46.22	a lot of admixture between the Pygmies
00:24:49.29	and the neighboring Bantu people.
00:24:52.03	Now, interestingly, this tends to be unidirectional,
00:24:54.28	and it tends to be gene flow between males
00:24:57.27	from the Bantu population
00:25:00.03	with females of the Pygmy population.
00:25:02.22	This is largely due to socioeconomic factors.
00:25:06.23	Now, when we look at a correlation
00:25:08.26	between ancestry and height,
00:25:11.03	we observed a very strong and significant positive correlation.
00:25:15.04	So, we can see that Pygmies who have more of the Bantu ancestry
00:25:19.23	tend to be taller.
00:25:21.19	And, so this is showing
00:25:22.25	that there's a strong genetic component to this trait.
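
The ancestry-height relationship is a correlation between two per-person measurements, which can be computed as below. This is a generic Pearson correlation sketch, not the study's analysis; the function name is hypothetical and the five data points are invented purely to show the calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented values: fraction of Bantu ancestry and height in cm for five people.
bantu_fraction = [0.05, 0.10, 0.20, 0.35, 0.50]
height_cm = [148, 150, 153, 157, 160]
print(f"r = {pearson_r(bantu_fraction, height_cm):.2f}")
```

A value of r near +1 means that individuals with more Bantu ancestry tend to be taller, which is the direction of the effect reported in the talk.
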
00:25:26.17	We've also worked with collaborators
00:25:28.11	to develop methods
00:25:30.19	to infer tracts of Pygmy and Bantu ancestry
00:25:35.11	across the chromosome.
00:25:36.29	So here, these are the different chromosomes,
00:25:38.18	starting with chromosome 1
00:25:40.05	and going up to chromosome 22,
00:25:42.25	and here I'm showing you an example from chromosome 3.
00:25:46.04	And in blue is showing tracts of the genome
00:25:49.03	that are Pygmy ancestry,
00:25:50.24	and in red are tracts of the genome that are Bantu ancestry,
00:25:54.23	and what we tend to see are very, very short tracts of Bantu ancestry.
00:25:58.28	And that's reflected in the fact that admixture
00:26:01.08	has been occurring over thousands of years.
00:26:06.11	Now, the next question that we wanted to address
00:26:08.17	is how do the genomes of the Pygmy hunter-gatherers
00:26:12.04	differ from the genomes of the Bantu agriculturalists
00:26:17.00	and from other groups, such as the Maasai pastoralists
00:26:20.28	from east Africa.
00:26:22.28	And to do that,
00:26:25.03	we use a number of scans of natural selection
00:26:27.28	across the genome.
00:26:29.29	Without getting into detail about the methods,
00:26:32.26	I'll just point out that you can see by the different colors here
00:26:37.00	across the different chromosomes,
00:26:39.00	here's chromosome 22 and going down to chromosome 1,
00:26:42.04	that we found a number of regions of the genome
00:26:44.22	that are targets of selection.
00:26:47.05	But there was one region in particular,
00:26:49.26	on chromosome 3,
00:26:52.04	where we saw a cluster of targets of natural selection.
00:26:57.01	And this was over about a 15 million basepair region.
00:27:01.14	Now, given our small sample size,
00:27:03.20	we have very little power
00:27:05.15	to detect a genome-wide association.
00:27:09.04	And so what we did is,
00:27:10.26	under the hypothesis that this is an adaptive trait,
00:27:13.17	we just focused on the regions of the genome
00:27:16.07	that are targets of selection, shown here,
00:27:19.11	and then we looked for an association with height.
00:27:22.10	And one of the strongest, most significant associations
00:27:25.09	was exactly in that same 15 million basepair region
00:27:29.19	of chromosome 3.
00:27:31.23	And indeed, it encompassed several genes,
00:27:34.15	one of which is DOCK3,
00:27:36.18	which has been shown to be associated with height
00:27:39.09	in non-African populations,
00:27:41.08	so we replicated that finding.
00:27:43.20	But nearby was another gene called CISH,
00:27:47.09	which is a member of the cytokine signaling family,
00:27:50.10	plays a very important role in regulating
00:27:52.28	IL-2 cytokine signaling pathway,
00:27:56.18	and studies have shown that it's associated
00:27:58.26	with resistance to a number of infectious diseases
00:28:01.18	in Africa.
00:28:04.01	Now, interestingly,
00:28:05.29	CISH also directly inhibits
00:28:07.14	human growth hormone receptor action
00:28:10.06	by blocking the STAT5 phosphorylation pathway.
00:28:13.15	And so we know that studies in mice
00:28:15.17	show that when this gene is overexpressed,
00:28:18.06	the mice are short statured.
00:28:20.23	Now, this led me to the hypothesis that,
00:28:24.14	could it be that there could actually be selection
00:28:26.19	for immune function
00:28:28.11	that is indirectly resulting
00:28:30.05	in short stature in Pygmies,
00:28:32.05	because that gene plays an important role in both.
00:28:35.29	And we need to do further functional studies,
00:28:38.20	and look at differences in gene expression
00:28:40.13	to test this hypothesis.
00:28:44.04	The last study I wanna tell you about is a study
00:28:46.20	in which we sequenced the entire genomes,
00:28:49.15	at high coverage,
00:28:51.07	of 15 African hunter-gatherers,
00:28:53.22	including 5 Pygmies,
00:28:55.28	5 Hadza,
00:28:57.10	and 5 Sandawe.
00:28:59.26	We identified over 13 million variants,
00:29:02.29	3 million of which are completely novel;
00:29:05.29	they have never previously been identified.
00:29:08.13	And that's just from 15 individuals,
00:29:10.14	so you can imagine how much variation is out there.
00:29:13.16	Many of these are novel variants...
00:29:15.27	many of these novel variants are in known regulatory sites.
00:29:21.04	So now, combining the two studies,
00:29:24.08	we wanted to ask the question,
00:29:26.03	which pathways are enriched for genes near targets of selection?
00:29:29.16	And these enriched pathways
00:29:31.25	include genes involved in neuro-endocrine signaling,
00:29:35.01	reproduction,
00:29:36.06	metabolism,
00:29:37.11	and immune function,
00:29:38.22	and interestingly, based on the whole genome sequencing study,
00:29:42.08	we saw an enrichment for genes
00:29:44.06	that play a role in pituitary function in Pygmies,
00:29:47.13	including follicle-stimulating hormone receptor,
00:29:50.13	growth hormone receptor,
00:29:52.11	HESX1, which I'll tell you more about in a moment,
00:29:55.11	and thyrotropin-releasing hormone receptor.
00:29:58.15	In fact, TRHR was one of the biggest hits
00:30:02.13	that we saw in terms of these studies of selection.
00:30:05.17	And what's interesting is that this gene
00:30:08.22	plays an important role in the hypothalamic-pituitary-thyroid axis,
00:30:12.28	influencing a number of traits that could potentially
00:30:15.14	be of adaptive significance in Pygmies.
00:30:18.26	And also of interest was that anthropologists
00:30:21.18	have noted that there is a significant difference
00:30:24.23	in the prevalence of Goiter
00:30:27.00	among Pygmies and neighboring Bantu groups.
00:30:29.24	So the Pygmies have a much lower frequency of Goiter
00:30:33.16	compared to the neighboring Bantu populations,
00:30:36.16	and this could reflect a biological adaptation in Pygmies
00:30:41.20	to a low iodine environment.
00:30:43.24	It's very deleterious to get Goiter
00:30:46.22	because it can also lead to a disease called Cretinism,
00:30:49.27	which of course is going to be very deleterious.
00:30:52.18	So again, here's an example
00:30:54.10	where something like adaptation to diet
00:30:56.13	could indirectly influence growth
00:30:58.28	or other phenotypes in the Pygmy population.
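
Note: the pathway enrichment question raised above (which pathways contain more genes near targets of selection than expected by chance) is commonly framed as a one-sided hypergeometric test. The sketch below is only an illustration under that assumption; the lecture does not say which statistical method was used, and the function name and all gene counts are hypothetical.

```python
from math import comb


def enrichment_p_value(genome_genes, pathway_genes, selected_genes, overlap):
    """One-sided hypergeometric p-value for pathway enrichment.

    Returns the probability of seeing at least `overlap` pathway genes
    among `selected_genes` genes drawn at random, without replacement,
    from `genome_genes` genes, of which `pathway_genes` are in the pathway.
    """
    max_overlap = min(pathway_genes, selected_genes)
    total = comb(genome_genes, selected_genes)
    # Sum the upper tail: k = overlap, overlap + 1, ..., max_overlap.
    tail = sum(
        comb(pathway_genes, k)
        * comb(genome_genes - pathway_genes, selected_genes - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts, chosen only to show the call:
# 20,000 genes in total, 50 in the pathway, 400 near selection targets,
# 6 of which fall in the pathway.
p = enrichment_p_value(20000, 50, 400, 6)
print(f"enrichment p-value: {p:.4g}")
```

A small p-value would suggest that the pathway is over-represented among genes near selection targets. A real analysis would also need to correct for testing many pathways and for gene length and clustering.
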
00:31:04.01	The last thing we wanted to do
00:31:06.01	was to look for regions of the genome,
00:31:08.08	using the whole genome sequencing data,
00:31:10.13	that are specific to Pygmies,
00:31:12.20	and those are shown in green here.
00:31:16.02	Now, we identified 25 clusters in the genome,
00:31:19.23	and the largest cluster
00:31:22.27	was right in that same region of chromosome 3
00:31:25.14	that we had previously identified.
00:31:28.00	But we had missed it in the prior study,
00:31:30.11	and the reason why is because
00:31:32.17	it contains these Pygmy-specific variants,
00:31:35.08	that were not captured by the SNP array that we used,
00:31:39.17	thus demonstrating the great importance
00:31:42.00	of doing resequencing for identifying novel
00:31:44.24	and potentially functionally important variation
00:31:47.15	in ethnically diverse populations.
00:31:50.28	Now, this cluster consisted of
00:31:55.10	44 SNPs in 100% association with each other
00:31:59.16	over 170,000 nucleotides,
00:32:03.06	shown here,
00:32:05.24	and it contained a very interesting candidate gene called HESX1.
00:32:10.10	HESX1 codes for a transcription factor
00:32:13.05	that plays a very important role
00:32:15.04	in regulating the development
00:32:17.15	of the anterior pituitary in the brain,
00:32:20.14	and that's the site of production of growth hormone,
00:32:22.23	as well as other reproductive hormones.
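
Note: "44 SNPs in 100% association with each other" can be read as complete linkage disequilibrium, which is often summarized by the pairwise statistic r² (equal to 1 when two SNPs always occur together on the same chromosomes). The sketch below assumes that reading and assumes phased haplotypes coded as 0/1. The function name and the example data are hypothetical and are not taken from the study.

```python
def r_squared(haplotypes_a, haplotypes_b):
    """Pairwise linkage disequilibrium (r^2) between two biallelic SNPs.

    Each argument is a list with one entry per phased chromosome:
    1 if the chromosome carries the derived allele, 0 otherwise.
    """
    n = len(haplotypes_a)
    if n == 0 or n != len(haplotypes_b):
        raise ValueError("need two non-empty lists of equal length")

    p_a = sum(haplotypes_a) / n  # derived-allele frequency at SNP A
    p_b = sum(haplotypes_b) / n  # derived-allele frequency at SNP B
    # Frequency of chromosomes carrying the derived allele at both SNPs.
    p_ab = sum(
        1 for a, b in zip(haplotypes_a, haplotypes_b) if a == 1 and b == 1
    ) / n

    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


# Hypothetical example: two SNPs whose derived alleles always travel together.
snp1 = [1, 1, 0, 0, 0, 0]
snp2 = [1, 1, 0, 0, 0, 0]
print(r_squared(snp1, snp2))
```
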
00:32:25.11	Now, interestingly,
00:32:27.06	we identified a non-synonymous,
00:32:29.28	so an amino acid change, basically,
00:32:33.23	in this gene
00:32:36.03	that had been previously associated
00:32:38.13	with idiopathic short stature in humans.
00:32:41.26	But it turns out that this variant
00:32:44.01	is present at about a 20% frequency in other Africans.
00:32:47.12	So what we hypothesize is that
00:32:49.13	there's something about this region
00:32:51.22	that may be altering gene expression of HESX1
00:32:55.07	or other genes in that region.
00:32:58.01	Upstream, we found another cluster
00:33:01.18	near this gene POU1F1, also known as Pit-1 in mouse,
00:33:07.13	and again this codes for a transcription factor
00:33:09.18	that plays a critical role in regulating growth hormone expression.
00:33:14.23	So another excellent candidate gene.
00:33:17.28	Now, what is interesting is that
00:33:19.27	both of these clusters, or genes,
00:33:23.18	are amongst the most differentiated regions
00:33:26.27	of the Pygmy genomes,
00:33:28.27	compared to genomes from elsewhere in Africa.
00:33:31.29	So we then picked out some of the SNPs in these regions
00:33:37.13	and genotyped them in a larger set
00:33:39.19	of western and eastern Pygmies,
00:33:41.26	and we showed that they are statistically
00:33:44.02	associated with short stature in Pygmies.
00:33:47.29	So the next step is going to be
00:33:49.24	to try to make transgenic mouse models
00:33:52.01	that express these variants,
00:33:56.06	and see what the phenotype looks like.
00:34:00.19	So that leads us to a number of hypotheses.
00:34:03.19	One, is that alterations in the growth hormone/IGF1 pathway
00:34:07.15	play a role in the short stature trait in Pygmies.
00:34:13.01	Two, is that anterior pituitary hormones
00:34:15.10	may play a central role in the Pygmy phenotype,
00:34:18.09	influencing growth, reproduction,
00:34:20.15	metabolism, and immunity.
00:34:24.00	And thirdly, that short stature
00:34:26.16	could be a byproduct of selection
00:34:28.11	acting on pleiotropic loci.
00:34:31.04	So if we look here,
00:34:32.21	one of the candidate loci that we identified is HESX1.
00:34:36.13	That's going to influence expression and development
00:34:39.20	of the anterior pituitary,
00:34:42.02	site of production of growth hormone.
00:34:44.20	Growth hormone expression is also regulated
00:34:46.23	by this other gene we found, POU1F1.
00:34:50.04	And this CISH regulates growth hormone receptor.
00:34:54.17	Now, if we look at the downstream effects
00:34:56.24	of growth hormone,
00:34:59.07	growth hormone, when it binds to growth hormone receptor,
00:35:02.18	will trigger off expression of IGF1,
00:35:06.12	predominantly from the liver, but from other tissues as well.
00:35:10.06	IGF1 will have an effect on muscle growth
00:35:13.14	and also on bone growth and height,
00:35:16.02	but the other impact, or the other role of growth hormone
00:35:20.12	is that it also influences insulin metabolism,
00:35:24.06	it influences fat metabolism.
00:35:28.01	And then we know that infectious disease
00:35:30.01	alters immune response and cytokine levels,
00:35:33.08	and that these can influence gene expression from CISH,
00:35:36.11	or other genes that are in this pathway.
00:35:40.09	So, when we go back to Africa to study the Pygmies,
00:35:42.28	what we would ultimately like to do next
00:35:45.16	is to measure all of the phenotypes,
00:35:48.01	because if you want to understand something
00:35:50.04	like the evolution of short stature in Pygmies,
00:35:52.19	I think you can't just be looking at stature
00:35:55.09	because the growth hormone pathway
00:35:58.25	plays a role in all of these different traits,
00:36:01.01	so we need to be looking at this as an integrative picture.
00:36:06.01	And in fact, our approach in the future
00:36:08.26	is to use an integrative genomics approach
00:36:11.24	combining whole genome data,
00:36:14.15	data on protein variation from blood,
00:36:17.25	epigenetic variation,
00:36:19.21	which can be influenced by diet and environment,
00:36:22.12	gene expression,
00:36:24.10	we're starting to look at the microbiome,
00:36:27.16	which is the spectrum of bacteria in the gut,
00:36:32.05	because that can not only be influenced by diet,
00:36:35.16	it can also have an influence on the metabolome,
00:36:38.12	or the set of all the metabolites, for example,
00:36:40.27	in blood.
00:36:42.20	And we want to combine that information
00:36:44.25	together with information on diet
00:36:46.22	and other environmental factors,
00:36:48.29	to try to identify genetic and environmental factors
00:36:52.15	that play a role in short stature
00:36:55.05	and in other anthropometric,
00:36:56.25	cardiovascular,
00:36:58.01	and metabolic traits.
00:37:00.20	One of the other approaches we can take
00:37:02.20	to distinguish the role of genetics and environment is, for example,
00:37:06.00	to look at individuals of the same or similar ethnic background,
00:37:10.29	but living in an urban versus a rural environment.
00:37:16.21	We can also take a different...
00:37:18.14	the opposite approach.
00:37:20.00	We can look at individuals who have
00:37:22.06	very different genetic ancestries,
00:37:25.03	but live in similar environments.
00:37:27.13	So for example,
00:37:29.20	this is a girl who is from the Fulani population,
00:37:33.17	and here's a neighboring...
00:37:35.19	an individual from the Tupuri population.
00:37:38.26	So they are genetically very differentiated,
00:37:41.20	but live in a similar environment,
00:37:43.16	yet the Fulani seem to have some innate resistance
00:37:47.06	to malaria infection.
00:37:50.03	By contrast, the San,
00:37:53.09	from southern Africa,
00:37:54.29	are very differentiated from the Bantu,
00:37:57.15	but the San seem to have an innate susceptibility
00:38:01.09	to TB infection.
00:38:03.20	So again, by contrasting populations with different ancestry,
00:38:07.26	and living in different environments,
00:38:09.11	we may identify clues about the genetic basis
00:38:12.10	of differences in phenotypic variation
00:38:14.26	and disease susceptibility.
00:38:17.23	So in conclusion,
00:38:20.20	Africans have the highest levels of genetic diversity
00:38:23.04	within and among populations.
00:38:26.28	The demographic history of Africans
00:38:29.00	and local adaptation to different environments
00:38:31.04	has resulted in population
00:38:33.01	or region specific genetic variation.
00:38:36.25	And we need to be including
00:38:38.21	ethnically diverse Africans in genomic studies
00:38:41.17	to better identify both unique rare, and common variants
00:38:45.28	which may be of functional importance,
00:38:47.28	including those that play a role in disease risk
00:38:50.13	in these populations.
00:38:52.14	And I will just end by thanking
00:38:54.04	the many individuals
00:38:55.25	who contributed to these studies,
00:38:57.29	and my funding agencies,
00:39:00.16	and particular thanks to the Africans
00:39:02.20	who have contributed to these studies.

This material is based upon work supported by the National Science Foundation and the National Institute of General Medical Sciences under Grant No. 2122350 and 1 R25 GM139147. Any opinions, findings, conclusions, or recommendations expressed in these videos are solely those of the speakers and do not necessarily represent the views of the Science Communication Lab/iBiology, the National Science Foundation, the National Institutes of Health, or other Science Communication Lab funders.

© 2023 - 2006 iBiology · All content under CC BY-NC-ND 3.0 license