The Vertebrate Retina, Photoreceptors, and Color Vision

Transcript of Part 2: Photoreceptors and Image Processing II

00:00:09.14 Let's look at the arrangement of photoreceptors within the retina and explore the
00:00:14.20 significance of that arrangement for the functioning of the system.
00:00:18.09 Here's an en face view of a primate retina.
00:00:22.03 We see each of these white spots is a single rod photoreceptor cell.
00:00:26.25 These are the closely packed ones, and the clear spaces with a single white spot
00:00:32.27 within it are the cone photoreceptor cells.
00:00:35.06 You can see that it's essentially a sea of rods with cones interspersed like little pebbles.
00:00:40.06 Now this is what the peripheral retina looks like - the region that's off to the side.
00:00:45.22 The retina overall is not a homogeneous structure
00:00:50.16 with respect to photoreceptor topography.
00:00:53.23 In the primate retina, including the human retina,
00:00:56.09 if we were to map across the width of the retina from one side to the other
00:01:03.00 the density of the different photoreceptor cell types,
00:01:06.16 you would see a very distinctive distribution.
00:01:10.00 So, for example, the rod photoreceptors (that's these dots here)
00:01:16.12 are very abundant in the peripheral retina.
00:01:18.12 The cone photoreceptors down at the bottom are relatively rare in the peripheral retina,
00:01:23.20 as we saw on the previous slide.
00:01:25.24 But, towards the central retina, and especially at the
00:01:29.20 very center of the retina, called the fovea,
00:01:31.29 the cone density rises to a very high peak,
00:01:37.09 and the rod density correspondingly goes down
00:01:40.20 to essentially zero.
00:01:42.10 That's a very striking arrangement. Why is that?
00:01:46.21 It turns out that the retina of primates like ourselves
00:01:53.06 is really a two-stage structure, in the sense that
00:01:57.03 we have a high acuity zone of the retina, this very central zone,
00:02:02.25 which encompasses just a tiny fraction of the retinal surface area... less than 1%.
00:02:08.06 And the rest of the retina really exists simply to alert the central retina
00:02:14.09 that something interesting has happened.
00:02:15.22 Whenever we look at something that we're interested in,
00:02:19.20 what we do is we turn our eyes towards that object.
00:02:22.22 And, we do that because we want the object to be imaged
00:02:26.08 on the central region of the retina.
00:02:28.18 Let's look at one simple consequence of this specialization within the retina.
00:02:33.29 And that is the eye movements that allow different regions of the world in which
00:02:40.21 we're interested to be sequentially imaged on the central retina.
00:02:43.26 And there's no better example than reading text.
00:02:46.00 So, if you're reading the page of, for example, a novel,
00:02:49.26 and you look at the eye movements that correspond to that process,
00:02:55.29 you'll see that as we look here on the horizontal axis at eye position
00:03:00.13 and on the vertical axis at time,
00:03:02.16 if we look at the way the eye moves, we'll see that it's a series of stops and starts
00:03:08.12 where the eye is initially focused on one location,
00:03:12.18 and then there's a rapid movement to a new location,
00:03:15.01 a brief pause at that place and then a rapid movement to yet another location
00:03:19.06 and so on as we read from word to word across the page.
00:03:23.20 And, of course, when we get to the end of the line, then we reset all the way back
00:03:27.18 to the beginning of the next line.
00:03:31.03 Why are we doing that? Why can't we just stare straight ahead and read?
00:03:34.29 And the answer is that, when you're reading,
00:03:37.01 you need to see the words... the letters of interest
00:03:41.02 at high acuity. And the only way to do that
00:03:43.14 is to have them fall on the very central retina.
00:03:45.24 So, your eye must continually be moving to allow
00:03:49.20 the object of interest to fall in that most central region, the fovea.
00:03:54.22 Really, the rest of the retina exists simply to get our attention
00:04:00.09 and tell us about an object that we might want to image on the fovea.
00:04:04.21 For example, if you see a fly in your peripheral vision
00:04:08.01 buzzing around the room, that will grab your attention,
00:04:10.20 and you'll perhaps turn your eye towards the fly,
00:04:13.14 but you won't really get a good look at it until you turn your eye towards it.
00:04:17.20 Now let's look at eye movements in a somewhat more natural context.
00:04:22.12 And this tells us also something not only about the eye, but about the brain...
00:04:25.23 that is, about which regions of a given scene
00:04:28.16 (perhaps a more natural scene than reading of text)
00:04:31.01 grab our attention.
00:04:32.13 And so here, for example, we're looking at eye movements
00:04:36.00 that correspond to looking at faces.
00:04:40.16 So, the top two photographs are the images that were observed,
00:04:46.04 and the bottom data points indicate where the eyes were pointed
00:04:52.16 during viewing of that image.
00:04:54.00 So, when looking at a face... for example, this young lady's face.
00:04:58.07 You can see that the eyes of the observer tend to look at her eyes,
00:05:04.11 and they also look at her mouth, they look at her nose a little bit,
00:05:06.26 maybe around the face.
00:05:07.28 If we look at a face in profile, again, you can see from where the eyes were pointing,
00:05:12.14 that they tend to look at the edges of the image--
00:05:18.02 the nose, the eyes, the chin, the mouth, and so on,
00:05:20.24 and largely neglect the central regions --
00:05:24.04 the cheeks, the neck, and so on that are perhaps of lower interest
00:05:28.11 and where less is going on.
00:05:30.09 This is a highly selective viewing strategy,
00:05:34.03 and so it really means that we're sampling the world in a way that is
00:05:38.26 far from random and far from complete.
00:05:42.00 We're sampling just subsets of any given scene.
00:05:45.17 Now, the fact that the central retina is the only place
00:05:52.08 where the image is seen with high acuity
00:05:55.21 means that, if you're going to examine, say, a non-moving object,
00:06:00.18 and you're doing it in the context of a head and a body that are moving,
00:06:05.17 perhaps, because you're walking or running or just trying to stay still,
00:06:09.11 but you're not entirely still,
00:06:10.18 there must be a mechanism that allows the eye to stabilize itself
00:06:15.11 in the context of this moving holder that is the rest of the body.
00:06:19.21 And, in fact, that mechanism exists, and it's a very powerful one.
00:06:23.22 Here's an example just showing how that sort of movement compensates
00:06:29.05 for head and body movements during walking.
00:06:31.14 Again, on the horizontal axis, we have the angular degree,
00:06:36.26 and we have plotted here, in red, the head movements associated with walking.
00:06:42.13 Of course, there's some swaying back and forth with each step.
00:06:45.00 And, as time goes on, moving up this chart,
00:06:48.28 we see that the head movements in red
00:06:53.05 and the eye movements within the head in green
00:06:56.01 are nearly mirror images of each other.
00:07:00.16 That is, the eye is moving within the socket to almost perfectly compensate
00:07:07.10 for the back and forth movement of the head.
00:07:09.26 It's completely unconscious - we do it without thinking about it.
00:07:12.29 But, the result is that the eye is pointing nearly perfectly straight ahead
00:07:18.29 during the entire walk
00:07:21.23 by virtue of this moment to moment sense of where it is deflected.
00:07:27.05 If it's deflected a little to the left, it will move a little to the right.
00:07:29.24 If it's deflected a little to the right, it will move to the left,
00:07:32.06 and this, of course, keeps the world looking relatively still
00:07:36.05 despite the fact that we are moving around.
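The compensation described above can be sketched numerically. This is a minimal illustration, not the data from the slide: the sway amplitude, the step frequency, and the compensation gain of 0.95 are all assumed values, and the function names are hypothetical.

```python
import math

def head_sway_deg(t, amplitude_deg=3.0, step_hz=2.0):
    """Assumed side-to-side head rotation during walking, in degrees."""
    return amplitude_deg * math.sin(2 * math.pi * step_hz * t)

def compensatory_eye_deg(head_deg, gain=0.95):
    """Eye-in-head rotation that opposes the head; gain < 1 means imperfect."""
    return -gain * head_deg

times = [i / 100 for i in range(200)]            # two seconds of walking
heads = [head_sway_deg(t) for t in times]
eyes = [compensatory_eye_deg(h) for h in heads]  # near mirror image of heads
gazes = [h + e for h, e in zip(heads, eyes)]     # where the eye points in space

print(max(abs(h) for h in heads), max(abs(g) for g in gazes))
```

With a gain close to 1, the gaze trace stays within a small fraction of the head excursion, which is the "nearly mirror images" relationship between the red and green traces.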
00:07:39.27 Now let me say one final thing about the arrangement of photoreceptors
00:07:45.15 within the retina... this non-random arrangement.
00:07:49.02 And that now relates to the distribution of the different cone photoreceptors.
00:07:53.26 We haven't talked about the different kinds of cones.
00:07:56.07 We're going to talk in detail about that in the second and third lectures.
00:07:59.16 But, in the human retina, let's just for the moment note
00:08:03.19 that there are three different cone classes.
00:08:05.29 One most sensitive to longer wavelengths,
00:08:08.23 one most sensitive to medium wavelengths of light,
00:08:10.24 and one most sensitive to shorter wavelengths of light.
00:08:12.24 The medium and longer wave receptors (and I'll call these M and L),
00:08:17.06 are, in fact, rather similar to one another, with largely overlapping absorbances,
00:08:23.08 whereas the shorter wave (which I'll call S),
00:08:26.08 is really quite different. It absorbs at wavelengths quite down
00:08:30.18 towards the short end of the visible spectrum.
00:08:33.13 And, as a consequence, any given image which contains
00:08:38.16 wavelengths throughout the visible spectrum,
00:08:41.12 and which therefore would be exciting all the different photoreceptor cells,
00:08:45.26 will, when it arrives at the retina, not be perfectly focused.
00:08:50.13 Let's think about why that is.
00:08:51.24 It's because any lens, including our own lens,
00:08:55.17 has some degree of what's called chromatic aberration.
00:08:58.16 That is, the image being focused for a given wavelength,
00:09:03.05 is not going to be focused perfectly for some other wavelength.
00:09:07.24 And so, for example, as shown here, in this upper image,
00:09:12.25 if this word red, in red letters, is focused perfectly on the retina,
00:09:18.02 so that it's sharply focused and a point in the outer world
00:09:21.19 comes to a point on the retina,
00:09:23.01 then blue light, coming from the same region of space,
00:09:28.07 would be focused actually ahead of the retina,
00:09:30.18 so that the word blue here, as shown on the right,
00:09:33.09 would be a bit blurry - a bit out of focus.
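To put rough numbers on this, here is a deliberately crude sketch that treats the eye as a single thin lens in air. The optical power of 60 diopters, the extra 1.5 diopters for blue light, and the 3 mm pupil are illustrative assumptions, not values from the lecture.

```python
POWER_RED_D = 60.0         # assumed power of the eye for red light, diopters
EXTRA_POWER_BLUE_D = 1.5   # assumed extra power for blue light, diopters
PUPIL_MM = 3.0             # assumed pupil diameter

def focal_distance_mm(power_diopters):
    """Distance behind a thin lens at which a distant point comes to focus."""
    return 1000.0 / power_diopters

def blur_diameter_mm(pupil_mm, focus_mm, retina_mm):
    """Blur circle on the retina, from similar triangles of the light cone."""
    return pupil_mm * abs(retina_mm - focus_mm) / focus_mm

retina_mm = focal_distance_mm(POWER_RED_D)  # retina placed at the red focus
focus_blue_mm = focal_distance_mm(POWER_RED_D + EXTRA_POWER_BLUE_D)

blur_red = blur_diameter_mm(PUPIL_MM, retina_mm, retina_mm)
blur_blue = blur_diameter_mm(PUPIL_MM, focus_blue_mm, retina_mm)
print(focus_blue_mm < retina_mm, blur_red, blur_blue)
```

Under these assumptions blue light comes to a focus a fraction of a millimeter in front of the retina, so a blue point spreads into a small disc while a red point stays a point.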
00:09:35.23 How does the retina deal with this problem?
00:09:38.11 It turns out that in the very central fovea, the S cones have been excluded.
00:09:43.19 Only M and L cones populate that region.
00:09:47.02 That's the region with the very highest acuity.
00:09:50.00 And, what the retina has done... what the eye has done,
00:09:52.24 is simply eliminate the short wave end of the spectrum
00:09:59.00 for that most high acuity zone of the retina.
00:10:03.12 So, by taking a narrow cut of wavelengths,
00:10:07.05 we're effectively getting a sharper image,
00:10:09.01 of course at the price of having a less colorful image.
00:10:12.25 Now let's talk about how the image is processed in the retina.
00:10:19.07 We mentioned at the very beginning of the lecture,
00:10:20.27 that the retina is really different from a film in a camera,
00:10:24.16 in the sense that it is not just detecting the image,
00:10:29.03 and sending that detected image to the brain,
00:10:32.10 but it's processing it and extracting from it those aspects
00:10:35.28 that are of greatest interest to the organism.
00:10:37.24 Let's look at one of those feature extractors here.
00:10:41.19 This can be illustrated with this classic illusion, the so-called Hering illusion.
00:10:45.10 I hope you can see that at the intersections, where these white streets come together,
00:10:50.21 there are illusory black dots.
00:10:53.21 Actually, not on the one that you might focus on,
00:10:56.10 but on the ones that strike your peripheral retina.
00:10:59.01 I'll leave it as a homework exercise to figure out why that should be so.
00:11:03.22 But, if you can't see these as clearly as I can,
00:11:07.29 I invite you to produce an illusion of just this sort on a home computer.
00:11:12.04 You can do it either with black squares and white streets in between
00:11:15.20 or the reverse, if you do it with white squares and black streets,
00:11:19.02 you'll see little white dots at the intersections of the streets.
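One way to produce such a grid on a home computer is sketched below. It builds the pattern as a list of pixel rows and serializes it as a plain-text PGM image, a simple grayscale format that many image viewers open. The sizes and the function names are arbitrary choices for illustration.

```python
def make_grid(n_squares=5, square=40, street=10, white_streets=True):
    """Return rows of 0/255 pixels: squares separated by streets."""
    street_value, square_value = (255, 0) if white_streets else (0, 255)
    period = square + street
    size = n_squares * period + street
    return [
        [
            street_value if (x % period) < street or (y % period) < street
            else square_value
            for x in range(size)
        ]
        for y in range(size)
    ]

def to_pgm(pixels):
    """Serialize pixel rows as an ASCII (P2) PGM image."""
    height, width = len(pixels), len(pixels[0])
    rows = [" ".join(str(v) for v in row) for row in pixels]
    return "P2\n{} {}\n255\n{}\n".format(width, height, "\n".join(rows))

pixels = make_grid()
pgm_text = to_pgm(pixels)
# To view it, write pgm_text to a file such as grid.pgm and open that file.
```

Passing white_streets=False gives the reversed version, with white squares and black streets.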
00:11:22.04 And this illusion has been known for well over a century,
00:11:26.08 and it turns out to be fully explained by a peculiar type of spatial organization
00:11:34.06 of the receptive fields of ganglion cells.
00:11:37.11 Now, let me define receptive field.
00:11:38.27 That refers to that zone of primary photoreceptor input
00:11:45.12 which influences the output of the retinal ganglion cells.
00:11:48.07 The classic retinal ganglion cell receptive field is as shown here.
00:11:53.07 There's a little zone of retina (So, we're looking now en face at the retina...)
00:11:56.22 There's a little zone of retina where illumination excites the ganglion cell
00:12:01.12 and would increase, for instance, the firing of action potentials.
00:12:04.15 And then there is an annulus around it (a donut)
00:12:07.29 where light activation of photoreceptors actually inhibits the ganglion cell.
00:12:15.08 This is essentially a contrast detector.
00:12:18.18 The ganglion cell is looking for spatial differences in illumination in the retina,
00:12:23.26 and let's see how this explains that illusion.
00:12:28.17 Here I've shown, superimposed on a little region of that Hering illusion
00:12:34.28 one of the receptive fields of a ganglion cell,
00:12:39.00 where, in the center, the ganglion cell is looking at the light
00:12:44.15 that falls at one of these intersections
00:12:46.05 between white streets, and I think we can see that for this particular cell,
00:12:50.17 not only is the center being illuminated, which would tend to excite the cell,
00:12:53.16 but a rather substantial chunk of the surround on the right and left sides
00:12:59.02 above and below is also being illuminated, so the result would be
00:13:02.25 that this cell would be substantially suppressed
00:13:06.00 by that illumination of the inhibitory surround.
00:13:09.20 What happens if we look at a ganglion cell that's a little off to the right?
00:13:14.15 Its center is also fully illuminated, but now I think you can appreciate
00:13:18.16 a bit less of the surround is illuminated than was the case for the first cell.
00:13:23.18 But now, if we look at a cell that is even further off to the right,
00:13:28.15 where we see the center is illuminated,
00:13:31.01 but now the surround is only minimally illuminated...
00:13:34.07 just this horizontal region here is receiving light,
00:13:38.10 I think we can appreciate that this cell would be substantially more active,
00:13:41.18 because the central region being fully illuminated is countered by
00:13:46.14 only a modest amount of inhibitory illumination just from the left and right sides.
00:13:51.04 So, the brain would perceive, or the retina would perceive,
00:13:56.28 this zone of white street as brighter, relative to the zone over here,
00:14:04.08 because the ganglion cells, seeing the street that's off to the right
00:14:08.09 or off to the left side,
00:14:10.10 would have greater overall activity than would the ganglion cell
00:14:15.11 that was centered over this intersection here.
00:14:20.12 And that is the source of that illusion of blackness...
00:14:24.02 that sort of fuzzy blackness at the intersections between the streets.
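The comparison just described can be checked with a toy calculation. In this sketch the ganglion cell's output is simply the average light in a circular center minus the average light in the surrounding annulus. That subtraction, the grid dimensions, and the field radii are simplifying assumptions for illustration, not measured properties of real cells.

```python
import math

def grid_pixel(x, y, square=20, street=6):
    """1.0 on a white street, 0.0 inside a black square."""
    period = square + street
    return 1.0 if (x % period) < street or (y % period) < street else 0.0

def center_surround_response(cx, cy, r_center=3.0, r_surround=9.0):
    """Mean light in the excitatory center minus mean light in the surround."""
    center, surround = [], []
    reach = int(r_surround) + 1
    for y in range(int(cy) - reach, int(cy) + reach + 1):
        for x in range(int(cx) - reach, int(cx) + reach + 1):
            distance = math.hypot(x - cx, y - cy)
            if distance <= r_center:
                center.append(grid_pixel(x, y))
            elif distance <= r_surround:
                surround.append(grid_pixel(x, y))
    return sum(center) / len(center) - sum(surround) / len(surround)

# One cell centered on an intersection, one further right along the street.
at_intersection = center_surround_response(28.5, 28.5)
along_street = center_surround_response(41.5, 28.5)
print(at_intersection, along_street)
```

Both centers are fully lit, but four arms of street fall on the surround at the intersection and only two along the street, so the cell at the intersection gives the weaker response.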
00:14:27.16 I should just say that this idea of center-surround ganglion cells was really predicted
00:14:34.09 by this illusion many years before those cells
00:14:36.28 were actually identified electrophysiologically.
00:14:39.21 Now this leads us to a larger question related to image processing in the visual system,
00:14:52.06 and here we're going to move a little bit beyond the retina, into the brain.
00:14:55.00 But, if what the retina and also the brain are looking for,
00:15:00.01 among other things, are differences
00:15:02.11 between one region of the image and another region,
00:15:06.07 as illustrated by those center-surround ganglion cells,
00:15:09.01 then it stands to reason that schematics of the sort that bring out those differences
00:15:17.22 might resonate in some way - might have a special meaning
00:15:21.02 in terms of their information content for our visual system.
00:15:25.03 And that is exactly the case.
00:15:26.12 So, for example, we see here on the left, a photograph of a hand
00:15:30.29 with fingers about to snap, and on the right,
00:15:34.02 we see a schematic here, just a line drawing of the same kind of hand
00:15:40.08 with the fingers about to do like this (snap).
00:15:43.11 And, we immediately recognize in this simple line drawing what's going on.
00:15:48.06 But that's actually kind of surprising that we should recognize
00:15:52.21 this image of a hand so readily,
00:15:55.19 after all, on a pixel by pixel basis, this little schematic
00:15:59.14 bears very little resemblance to the photograph.
00:16:02.03 The schematic shows black lines that outline each object -
00:16:08.04 each finger and the palm of the hand,
00:16:09.27 but the real world isn't like that at all.
00:16:12.09 In fact, the real world is made of shapes with shadows
00:16:15.22 and complex changes in illumination--
00:16:18.13 as shown by that photograph--very different on a pixel by pixel basis
00:16:22.04 from this schematic.
00:16:23.17 Why is it that the schematic is so immediately recognizable?
00:16:27.28 I think almost certainly it's because the schematic is essentially the processed version
00:16:33.17 that our retina and our brain have created from the real image.
00:16:37.29 We've fed the visual system by giving it this schematic
00:16:41.20 a version which is already part way up the visual chain of command.
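A one-dimensional toy example shows how a contrast detector turns shaded regions into something like an outline. Each position's output here is its own luminance minus the average of its two neighbors, which is an assumed stand-in for a center-surround cell. The luminance values are made up.

```python
def center_minus_neighbors(luminance):
    """Center value minus the mean of its two neighbors (edges repeated)."""
    last = len(luminance) - 1
    response = []
    for i, value in enumerate(luminance):
        left = luminance[max(i - 1, 0)]
        right = luminance[min(i + 1, last)]
        response.append(value - (left + right) / 2)
    return response

# A dark region next to a bright region, like a finger against a background.
row = [2, 2, 2, 2, 8, 8, 8, 8]
outline = center_minus_neighbors(row)
print(outline)
```

The output is zero everywhere inside the two uniform regions and nonzero only at the boundary between them, which is the information a line drawing keeps.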
00:16:46.03 Let's see another example of this.
00:16:48.04 Here, in this self-portrait by Pablo Picasso,
00:16:51.27 we see the remarkable effectiveness of simple portrait sketches.
00:16:56.21 Why is this so immediately recognizable as a face?
00:16:59.15 And, in fact, it's recognizable as Picasso's face.
00:17:01.21 Again, even though, on a pixel-by-pixel basis, it's very different
00:17:06.01 from a real human face, it has abstracted the information
00:17:10.16 in just the way that our brain abstracts it as well.
00:17:13.10 Now there are other things that the visual system is interested in,
00:17:17.17 besides changes in intensity over space.
00:17:22.08 It's also interested in changes over time and the combination of time and space.
00:17:28.11 Let's look at an example of time changes.
00:17:31.16 Here are the responses of a retinal ganglion cell in the rabbit retina,
00:17:37.06 to an object moving in various different directions.
00:17:40.17 The directions of motion across the receptive field
00:17:42.28 are indicated by the arrows in the center here,
00:17:45.23 and the responses that go with each of those directions of motion
00:17:49.14 are indicated by the cluster of action potentials
00:17:54.04 recorded from that cell that are indicated adjacent to the corresponding arrow.
00:17:59.07 Now, I think you can appreciate that motion in the upward direction
00:18:05.00 and directions that are close to it are especially effective,
00:18:08.04 but motion in the downward direction gives essentially no response.
00:18:12.13 So, this is a cell that is interested in movement,
00:18:16.05 and in particular, in movement in a given direction.
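The lecture does not say how the cell computes direction. As a purely illustrative sketch, the code below uses a delay-and-compare scheme in the spirit of classic correlation-type motion detectors. That scheme is swapped in here as an assumption and is not presented as the rabbit retina's actual circuit. The preferred direction is taken to be left-to-right for convenience.

```python
def direction_selective_response(left, right):
    """Rectified delay-and-compare output for two neighboring inputs.

    left and right are equal-length lists of light intensity over time at
    two adjacent points on the retina.
    """
    # Light at the left point one step ago, then at the right point now.
    preferred = sum(left[t - 1] * right[t] for t in range(1, len(left)))
    # The opposite order signals motion in the null direction.
    null = sum(right[t - 1] * left[t] for t in range(1, len(left)))
    return max(0, preferred - null)  # firing rates cannot go below zero

spot_moving_right = ([1, 0, 0], [0, 1, 0])  # hits left point, then right
spot_moving_left = ([0, 1, 0], [1, 0, 0])   # hits right point, then left

print(direction_selective_response(*spot_moving_right))
print(direction_selective_response(*spot_moving_left))
```

The same spot crossing the same two points gives a response in one direction and silence in the other, which is the behavior seen in the recordings.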
00:18:19.14 Having cells of this kind makes perfect sense in a visual system.
00:18:24.29 Because of course, things that are moving in the world are of interest to the organism.
00:18:29.02 If you're a rabbit, you're interested in things like hawks that might be moving,
00:18:33.14 and so having cells that respond selectively to motion
00:18:37.21 as opposed to simply the world that is unchanging,
00:18:41.11 has obvious selective value.
00:18:43.04 And I think we can all appreciate that in our own, everyday experience.
00:18:46.24 For example, if a fly or some other object is moving in the peripheral visual field,
00:18:52.06 that immediately catches our attention and it does so in a way
00:18:56.21 that non-moving objects do not.
00:18:58.17 Now this brings us to a potential paradox.
00:19:02.07 As we saw a few minutes ago, the head and body are generally moving
00:19:07.15 around in space as we walk or go around our daily business,
00:19:12.09 and even though there are compensatory eye movements
00:19:14.13 which work quite well to keep the eye largely oriented in space,
00:19:20.00 despite those movements, the compensatory eye movements are not
00:19:23.19 perfect, and so the eye is constantly wiggling around back and forth in space,
00:19:28.19 and therefore the image which falls upon the retina
00:19:31.20 is constantly wiggling back and forth in the reverse direction.
00:19:36.06 So, if our retinas were composed of direction-selective cells
00:19:43.10 of the sort we've seen here,
00:19:44.21 and these cells are the ones that are supposed to tell us
00:19:48.06 what's moving out there in the world,
00:19:49.28 we have a problem that these cells would be constantly stimulated simply by
00:19:54.23 the motion that's referable to head, body, and eye motion, that is self motion.
00:20:00.18 What's the solution to that?
00:20:02.16 It turns out that the solution is a yet cleverer set of cells.
00:20:07.12 Again, motion-selective cells, but they respond only
00:20:11.28 to local motion and not to global motion.
00:20:14.26 And so, for example, if we deliver to one of these cells, within its receptive field,
00:20:22.11 a grating of alternating black and white stripes which are moving from left to right,
00:20:28.09 so they are just flowing across the visual field,
00:20:32.20 the cell gives a very brisk response if that's all that is being presented to the retina.
00:20:39.14 That is, the surrounding region of retina is being presented simply
00:20:43.03 with a non-moving grey background.
00:20:45.16 The cell loves this stimulus and responds vigorously.
00:20:49.09 But what if we simply unmask the rest of the retina
00:20:54.29 and show it the same set of bars moving from left to right.
00:20:58.20 Now, the receptive field of the cell--this central circular region--
00:21:04.10 hasn't changed at all in terms of the stimulus being delivered to it.
00:21:07.26 Completely unchanged.
00:21:10.14 The only thing that's different is that now the surround is doing exactly the same thing.
00:21:14.12 What happens? The cell immediately falls silent.
00:21:18.03 That is, this is a cell which wants to see a little region of the world moving
00:21:26.07 against a background that is not moving.
00:21:28.25 And if everything moves at the same time, that's not an interesting stimulus.
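The logic of such a cell can be written down as a toy rule: motion in the center drives the cell, and the surround cancels that drive to the extent that it is moving the same way. The function below is a hypothetical illustration of that rule and not a model of the real circuitry. Velocities are signed numbers in arbitrary units.

```python
def local_motion_response(center_velocity, surround_velocity):
    """Response to center motion, suppressed by matching surround motion."""
    drive = abs(center_velocity)
    same_direction = center_velocity * surround_velocity > 0
    if same_direction:
        shared = min(abs(center_velocity), abs(surround_velocity))
    else:
        shared = 0.0
    return max(0.0, drive - shared)

grating_in_center_only = local_motion_response(1.0, 0.0)
grating_everywhere = local_motion_response(1.0, 1.0)
print(grating_in_center_only, grating_everywhere)
```

The stimulus in the center is identical in the two cases. Only the surround differs, and that is enough to silence the cell.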
00:21:33.07 This is perfect because this is just the kind of cell that would tell a rabbit,
00:21:38.13 for example, if a hawk were flying across the sky,
00:21:42.10 but not if the rabbit had turned its eye a little bit,
00:21:46.01 and all the trees and bushes and everything else whose images
00:21:50.06 are impinging on the retina
00:21:51.08 had moved across the retinal surface.
00:21:54.11 And, of course, that's just what the rabbit wants to know.
00:21:56.21 There's a very beautiful illusion -- the so-called Ouchi illusion,
00:22:00.26 named after the Japanese artist who initially derived it,
00:22:06.02 which nicely illustrates the effect of this kind of motion-sensitive ganglion cell
00:22:11.21 on our visual system.
00:22:12.25 In the Ouchi illusion, and I've shown them here at two different magnifications,
00:22:18.01 because depending on how you're viewing this lecture,
00:22:20.21 one or the other might be more effective.
00:22:22.27 In the Ouchi illusion, there are a series of horizontal black and white bricks
00:22:28.21 which encompass the surrounding zone and are most of the image,
00:22:33.10 and then in the circular center is essentially the same thing,
00:22:37.01 except it's been turned so that the bricks are vertical.
00:22:38.27 And I think you can appreciate that the central region seems to float around
00:22:42.28 and bounce back and forth in a way somewhat different from the surround,
00:22:46.19 as if it's almost autonomous from the surround.
00:22:49.08 The origin of that motion is the spontaneous eye movements which we have all the time.
00:22:56.17 Our eyes are constantly jiggling around in their sockets.
00:22:59.03 And, because of that jiggling, the direction-selective ganglion cells
00:23:06.01 are differentially activated by the center versus the surround of this illusion.
00:23:12.06 So, for example, if the eye were to make a horizontal excursion,
00:23:18.16 these horizontal brick stimuli would be less provocative to those cells on average.
00:23:24.12 Why? Because a small horizontal movement leaves much of the visual field
00:23:29.27 with the same stimulus - a little zone that was looking at black
00:23:33.16 when the eye was at one location will, if the eye moves a tiny bit horizontally,
00:23:38.10 still be looking at a black region at the new location.
00:23:44.08 But the vertically arranged bricks will have a relatively larger effect
00:23:48.29 for a given horizontal eye movement
00:23:50.18 because that little movement will be more likely to shift the gaze
00:23:54.27 from, say a black zone to a white zone or a white zone to a black zone.
00:23:58.10 Now, I think you can appreciate that the reverse is true for vertical eye movements.
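This argument can be checked by counting. The sketch below models the bricks as a checkerboard of elongated rectangles and measures what fraction of image points change color after a one-pixel shift. The brick dimensions (8 by 2 pixels) are arbitrary illustrative choices.

```python
def brick_color(x, y, brick_w, brick_h):
    """0 or 1 for a checkerboard made of brick_w by brick_h rectangles."""
    return (x // brick_w + y // brick_h) % 2

def fraction_changed(brick_w, brick_h, dx, dy, size=16):
    """Fraction of points whose color differs after a shift of (dx, dy)."""
    changed = 0
    for y in range(size):
        for x in range(size):
            before = brick_color(x, y, brick_w, brick_h)
            after = brick_color(x + dx, y + dy, brick_w, brick_h)
            if before != after:
                changed += 1
    return changed / (size * size)

HORIZONTAL_BRICKS = (8, 2)  # long axis horizontal, as in the surround
VERTICAL_BRICKS = (2, 8)    # long axis vertical, as in the center

print(fraction_changed(*HORIZONTAL_BRICKS, dx=1, dy=0))  # 0.125
print(fraction_changed(*VERTICAL_BRICKS, dx=1, dy=0))    # 0.5
print(fraction_changed(*HORIZONTAL_BRICKS, dx=0, dy=1))  # 0.5
print(fraction_changed(*VERTICAL_BRICKS, dx=0, dy=1))    # 0.125
```

For a horizontal shift the vertical bricks change four times as much as the horizontal ones, and for a vertical shift the ratio reverses, so center and surround are always stimulated unequally.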
00:24:01.27 So, that causes the visual system to make this segregation of the central circular zone
00:24:10.18 from the surround and it gives it this sort of bouncing back and forth...
00:24:14.07 this sort of autonomy that grabs our attention and makes this illusion so effective.
00:24:19.01 To continue with our theme of looking at how the visual system
00:24:24.07 extracts information from the scene,
00:24:27.06 we can ask whether having two eyes gives us some special advantage
00:24:31.00 over an organism that might have only one.
00:24:34.23 And the answer is yes, having two eyes does confer a special advantage.
00:24:37.27 And, in particular, it allows us to determine the depth of an object.
00:24:42.11 That is, its distance away from ourselves,
00:24:45.07 using the mechanism called stereoscopic depth perception.
00:24:49.05 Let's see how that works.
00:24:50.24 So, if we have two points in the visual world,
00:24:55.10 one here on the right, and one here on the left,
00:24:58.12 I think you can see that, by simple geometric optics, they
00:25:00.25 would be imaged on the retinas of the two eyes
00:25:04.04 shown down at the bottom.
00:25:06.11 And, of course, the left point is imaged to the right on the eye,
00:25:10.26 because the image is reversed when it's imaged on the retina,
00:25:14.09 and the right point is imaged on the left side.
00:25:16.12 And those two images fall on what would be called corresponding points.
00:25:21.07 There are corresponding points for the right spot,
00:25:24.19 and there are corresponding points for the left spot.
00:25:26.23 So, there's nothing surprising here.
00:25:29.08 But, now let's ask, what if there is a pair of points that differ
00:25:35.21 not by virtue of their position left and right,
00:25:38.29 but by virtue of the fact that one is further and one is nearer to the viewer.
00:25:43.02 Now, if you think about it, what the geometric optics tells us
00:25:48.17 is that the nearer point is imaged further to the periphery
00:25:56.07 on the left retina and also further to the periphery on the right retina.
00:26:01.00 That is, the corresponding points for this closer object
00:26:08.07 lie on opposite sides of the corresponding points for the further object.
00:26:13.19 That's this more distant one here, which is now imaged on points
00:26:19.03 that are closer in - more nasal - on the two retinae.
00:26:23.00 Now that is a bit more confusing, perhaps, for the brain to figure out.
00:26:30.10 What the brain has to do is figure out that the image that it sees for that nearer point
00:26:37.29 on the left retina and on the right retina - those two images
00:26:41.25 are really representing the same object. This is the so-called correspondence problem.
00:26:46.02 And because it falls on these different sides of that further object's image,
00:26:51.29 therefore, it must be nearer.
00:26:56.09 The nearer one can be detected as nearer by virtue of the fact
00:27:01.13 that its corresponding points are lateral,
00:27:04.06 and the further one can be detected as being further
00:27:07.11 by virtue of the fact that its corresponding points are more nasal.
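The geometry can be made concrete with two pinhole eyes looking straight ahead. This is a minimal sketch under stated assumptions: the half-separation of the eyes (3.2 cm) and the distance from pinhole to retina (1.7 cm) are round illustrative numbers, and it sidesteps the correspondence problem entirely by assuming we already know which image goes with which point.

```python
EYE_HALF_SEPARATION_CM = 3.2  # assumed: eyes sit at x = -3.2 and x = +3.2
PINHOLE_TO_RETINA_CM = 1.7    # assumed distance from pinhole to retina

def retinal_offset(eye_x, point_x, point_z):
    """Horizontal image position relative to the eye's axis, in cm.

    The minus sign reflects the reversal of the image on the retina.
    Negative values lie toward the viewer's left, positive toward the right.
    """
    return -PINHOLE_TO_RETINA_CM * (point_x - eye_x) / point_z

def disparity(point_x, point_z):
    """Difference between the right-eye and left-eye image positions."""
    left = retinal_offset(-EYE_HALF_SEPARATION_CM, point_x, point_z)
    right = retinal_offset(EYE_HALF_SEPARATION_CM, point_x, point_z)
    return right - left

def depth_from_disparity(d):
    """Invert the geometry: a larger disparity means a nearer point."""
    return 2 * EYE_HALF_SEPARATION_CM * PINHOLE_TO_RETINA_CM / d

# Two points straight ahead, one near (30 cm) and one far (100 cm).
left_near = retinal_offset(-EYE_HALF_SEPARATION_CM, 0.0, 30.0)
left_far = retinal_offset(-EYE_HALF_SEPARATION_CM, 0.0, 100.0)
right_near = retinal_offset(EYE_HALF_SEPARATION_CM, 0.0, 30.0)
print(left_near, left_far, right_near)
print(depth_from_disparity(disparity(0.0, 30.0)))
```

In the left eye the near point lands further to the left (lateral) than the far point, in the right eye it lands further to the right, and the size of that difference between the eyes is enough to recover the distance.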
00:27:10.21 Now that's a very difficult computational problem, as it turns out.
00:27:15.00 It's not fully understood how it's done.
00:27:17.03 But, it's a very powerful system, and it gives us very accurate depth perception.
00:27:22.16 As we consider further how the brain analyzes the visual scene,
00:27:29.00 we eventually reach a point where that information is so complex,
00:27:33.26 that the analysis requires not just an assumption-less series of steps,
00:27:40.22 but a series of steps that involve imposing some order
00:27:43.19 related to our previous experience of visual information.
00:27:47.25 Let me give you an example of this.
00:27:49.01 Here we see a way in which we analyze depth from shading,
00:27:53.26 and we have included in that analysis a hidden assumption.
00:27:57.26 And the hidden assumption is that this three dimensional object
00:28:01.20 has been illuminated from above.
00:28:03.18 That's a natural assumption because, in general, the sun, the classic illuminant,
00:28:09.11 (the classic source of illumination) is above.
00:28:11.20 And so, when we see an object, generally, shadows are below,
00:28:15.22 and the rounded object can be analyzed in part based on the pattern of its shadows,
00:28:22.15 with that correct assumption that the sun is sitting above it.
00:28:27.05 So here we have, just to illustrate this type of analysis,
00:28:31.12 two images of cuneiform writing.
00:28:34.18 In fact, these two images are the same image. The one on the right
00:28:40.08 differs from the one on the left only by virtue of having been rotated 180 degrees.
00:28:45.22 And yet, I think, we can appreciate that the one on the left looks
00:28:50.12 as if the cuneiform is coming out at us from the plane,
00:28:54.15 and the one on the right looks like the cuneiform is receding into the plane.
00:28:58.20 And, of course, the reason is that when we see the shadows on the left,
00:29:03.02 we immediately assume that the sunlight is from above, and therefore,
00:29:07.04 since the shadows are below, the objects are coming out,
00:29:10.08 and on the right side, the reverse -- the shadows,
00:29:14.15 if we assume the illumination is from the top,
00:29:17.11 would tell us that the objects have receded into the plane.
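The role of the hidden assumption can be shown with a toy rule. The sketch below labels a small patch as coming out or receding purely from whether its top or bottom half is brighter, which builds in the assumption that light comes from above. The rule, the function names, and the pixel values are all hypothetical.

```python
def rotate_180(image):
    """Rotate a list-of-rows image by 180 degrees."""
    return [row[::-1] for row in image[::-1]]

def perceived_relief(image):
    """Guess relief from shading, assuming illumination from above."""
    half = len(image) // 2
    top = sum(sum(row) for row in image[:half])
    bottom = sum(sum(row) for row in image[-half:])
    if top > bottom:
        return "coming out"   # lit on top, shadow below
    if top < bottom:
        return "receding"     # shadow on top, lit below
    return "flat"

# A patch that is bright along its top and dark along its bottom.
patch = [
    [9, 9, 9],
    [5, 5, 5],
    [1, 1, 1],
]
print(perceived_relief(patch))
print(perceived_relief(rotate_180(patch)))
```

The pixel values are the same in both cases. Only the orientation changes, and under a fixed light-from-above assumption that is enough to flip the interpretation.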
00:29:20.22 OK, that's obviously a learned assumption...
00:29:23.29 it's really an unconsciously learned assumption.
00:29:25.20 And, it's generally a correct one.
00:29:28.21 Let's look at another way in which learned responses
00:29:34.02 affect the way we analyze a scene.
00:29:36.28 Of course, we've moved around in a world of three dimensions,
00:29:40.02 and we're very tuned into the depth of 3D scenes.
00:29:44.18 We see some objects as closer, some as further away.
00:29:47.15 In this little cartoon, we certainly get a strong sense
00:29:51.13 that the walls form a zigzag pattern,
00:29:54.12 coming out and receding. Of course, the gentlemen who are pictured here
00:29:59.01 by virtue of their size reinforce that sense.
00:30:03.20 And now, if we look at these heavy black bars -
00:30:07.00 the one that's here and the one over here to the left,
00:30:10.10 we get a strong sense, because we have assumed their different depths,
00:30:16.29 that the bar that is to the right is a smaller object than the bar that is to the left.
00:30:22.23 And, of course, in the context of this three-dimensional scene, that is perfectly right.
00:30:27.22 But, in fact, if you measure the size of these two bars in this particular image...
00:30:33.16 you simply put a ruler next to this bar and a ruler next to that bar,
00:30:37.28 they're exactly the same size.
00:30:39.19 Now, what's most striking is that once one knows that,
00:30:43.12 even if you hold two rulers next to them,
00:30:46.05 and you prove to yourself that that is the case,
00:30:49.18 you have a very difficult time convincing your visual system that that's true.
00:30:54.05 That differential in size is a very strong and persistent sense,
00:31:00.25 and again, it's a learned sense,
00:31:04.09 and it is learned in the context of depth perception in the real world.
00:31:08.20 Now, let's look at one particularly famous example of this sort
00:31:14.00 of imposing of order upon a scene.
00:31:16.28 This is a commemorative vase that we're seeing here.
00:31:20.17 It was made to honor Prince Philip and Queen Elizabeth,
00:31:23.22 and I think you can appreciate that, not only is it a beautiful white vase,
00:31:28.11 but on one side is the profile of Queen Elizabeth -
00:31:31.29 that is the shape of the vase makes her profile,
00:31:34.23 and on the other side is the profile of Prince Philip.
00:31:37.23 That's the nose right here, here's a nose right here.
00:31:40.16 And, the striking thing about the vase, as you view it and look for those profiles,
00:31:48.22 is that the sense that it is a white vase or the sense
00:31:54.00 that it is two faces looking at each other,
00:31:57.01 seem to be mutually exclusive. That is, you can see one or you can see the other,
00:32:02.18 but it's hard to see them both or see them fully
00:32:06.06 and see them in a dominant way at the same time.
00:32:09.01 And this is just an example, again, of the visual system
00:32:11.29 imposing some order on the scene.
00:32:14.10 It's trying to choose which of those two somewhat exclusive ways
00:32:19.01 of looking at the scene is the correct one.
00:32:20.27 Let's look at one final example.
00:32:23.23 And this is a series of four sketches that Pablo Picasso drew of a bull
00:32:29.02 (one of his favorite subjects).
00:32:30.16 I think you can appreciate that going from the upper left to the lower right,
00:32:35.11 we're seeing an increasingly abstract version of a bull.
00:32:39.00 Yet the striking aspect of this set of drawings is that, despite the abstraction,
00:32:45.02 we immediately recognize it for what it is.
00:32:47.27 Now, like those simple line drawings, I think this raises the question of why that should be so...
00:32:53.26 why the visual brain is so good at recognizing these abstract images as representing a bull.
00:33:02.18 And, like those line drawings, I think our conclusion would be that
00:33:06.08 what Picasso has hit upon is that these almost child-like drawings
00:33:10.27 are really the internal representation
00:33:13.12 of the bull. And that we have created a simplified, abstracted version in our heads,
00:33:18.11 our memory trace of what a bull is that really is this kind of drawing
00:33:24.13 and therefore resonates immediately with it.