Depth Perception

In this video I describe the many cues that we use to perceive depth and experience a 3D world based on the 2D information from our retinas. These include monocular cues (linear perspective, relative size, texture gradient, interposition, and shading), motion-based cues (motion parallax and optic flow) and binocular cues (disparity and convergence).

Don’t forget to subscribe to the channel to see future videos! Have questions or topics you’d like to see covered in a future video? Let me know by commenting or sending me an email!

Need more explanation? Check out my full psychology guide: Master Introductory Psychology: http://amzn.to/2eTqm5s

For more explanation of 3D movies, check out this blog post.
Video Transcript:

Hi, I’m Michael Corayer and this is Psych Exam Review. In this video I want to explain how we’re able to perceive depth. So we see a world that appears to be three dimensional and yet our retinas are flat surfaces that light is projected onto.

So they can really only see in 2D. So how is it that we’re able to feel like the world is in 3D when we look around? There are a number of different cues that our brain uses to figure out how much depth there is in particular situations and how far away things are from us. The first group of cues that we’ll talk about are called monocular cues and that just means “one eye” so these are cues that work even if you only use one eye. We’ll see later there are other cues that involve the use of both eyes.

So what are these monocular cues? The first we have is linear perspective and this is the idea that parallel lines converge as they travel away from us into the distance. So if you look at the train tracks they appear to join together in the distance even though of course they don’t actually join together, they stay parallel.
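As a rough illustration of why parallel lines appear to converge, here is a minimal sketch that assumes a simple pinhole model of the eye. The function names, the focal length of 1.0, and the 1.435 m rail spacing are assumptions chosen for the example, not something from the video.

```python
def project(x, z, focal_length=1.0):
    """Pinhole projection: image position of a point at lateral offset x and distance z."""
    return focal_length * x / z


def rail_gap_on_image(distance, gauge=1.435, focal_length=1.0):
    """Projected separation between two parallel rails at a given distance (meters)."""
    left = project(-gauge / 2, distance, focal_length)
    right = project(gauge / 2, distance, focal_length)
    return right - left


# The rails stay the same distance apart in the world,
# but their projected gap shrinks toward zero with distance.
for d in (2, 10, 50, 250):
    print(f"{d:>4} m away: projected gap = {rail_gap_on_image(d):.4f}")
```

The projected gap is proportional to one over the distance, so it never quite reaches zero, but it gets small enough that the tracks look like they meet at the horizon.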

Next we have relative size. This is the idea that things that are closer to us appear to be bigger. So if we look down the sidewalk and we see two people, and the image of one person on the retina is much larger than the image of the other, we assume the people are probably roughly the same size but that one person is much closer to us than the other one.
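The relative size idea can be sketched with a bit of geometry. This assumes that the size of the retinal image tracks the visual angle an object covers, and that the viewer guesses a typical height for a person (1.7 m here). The function names and numbers are hypothetical, chosen only for illustration.

```python
import math


def visual_angle_deg(height_m, distance_m):
    """Visual angle (degrees) covered by an object of a given height at a given distance."""
    return math.degrees(2 * math.atan(height_m / (2 * distance_m)))


def distance_from_angle(assumed_height_m, angle_deg):
    """Distance implied by a visual angle, if we assume we know the object's real height."""
    return assumed_height_m / (2 * math.tan(math.radians(angle_deg) / 2))


# Two people of the same assumed height, one near and one far.
for d in (5, 50):
    angle = visual_angle_deg(1.7, d)
    print(f"person {d} m away covers {angle:.2f} degrees")
```

Going the other way, `distance_from_angle` is roughly the inference described above: if you assume both people are about the same height, the smaller image has to belong to the person who is farther away.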

Next we have texture gradient. This is the idea that we can see things more clearly when they’re close to us. We can see textures and so if you’re looking out and you can see lots of details on something, very clearly and crisply, that tells you that it’s probably close to you. It’s not really far off in the distance because if it were, you wouldn’t be able to see it that clearly.

Next we have interposition. This is just the idea that if something blocks the view of another object, then it’s probably closer to you. If I have my hand here and you can’t see my face anymore, that tells you that my hand is closer to you than my face. Otherwise it wouldn’t block it.

Lastly we have the idea of shading. This is the idea that we use the way that shadows fall in order to tell us about how close things are to us. You use this every time you walk up a flight of stairs or even just as you walk and you see the slight shadow cast by a sidewalk. You can sort of automatically calculate without even thinking about how big of a step you have to take in order to get up that curb.

OK so I thought I would draw a picture here and try to use each of these different cues and see how they would work in an actual picture. Now this obviously is a flat 2D picture but hopefully it will have some sense of depth. I’m not a great artist so it probably won’t be amazing but we’ll get an idea of each of the cues in practice. Let’s start with linear perspective.

We’ll start with this simple idea that I mentioned here. So here’s a horizon here and then we’ll just have some train tracks which sort of converge off in the distance. This immediately gives us a sense of depth to our picture. We feel like this part of the picture is farther away than this part.

Now we’ll expand on that by drawing some people here. Here’s one person here and here’s another person here. Now using this idea of relative size we see that this person feels closer to us. It could be the case that this is a tiny person floating in the air here next to this person. But that’s probably not how you’re going to perceive it. You’re going to assume that this person is about the same size as this person, just farther away, and therefore the image on your retina is much smaller.

Now let’s add some texture gradient. Let’s imagine that we’re looking at some little tree here or something. Actually let’s rather than a tree let’s use a little plant here, it will be a little easier for me to draw. If we imagine some plant here close to us, we can see all of the details, we can see individual leaves on there maybe, we can see the veins on the leaves, I’m not able to draw this in a great amount of detail.

Whereas the plant back here is just a little bit less detailed and the one over here is just like a green spot. We can’t really see much detail at all. Now I’ve also added some relative size here but we can imagine that even if this one was a much larger plant, right, so it’s similar in size to the other one but we can’t see as much detail and that tells us that it’s farther away. Whereas this one we can see all of the fine detail.

So that’s texture gradient. Now we’ll add some interposition here. Let’s add some sign here or something in front of this guy’s face. OK so when we look at this we’re going to assume that this sign is closer to us because it’s blocking our view of this guy here and so that tells us something about the depth.

Finally we’ll add some shading. So I’ll put a little shading into this picture and this is just to give an idea where the light is coming from. Now let’s draw two similar objects here.

Now I’m going to shade these differently. OK so what I’m trying to express here, and this is probably the hardest thing to try to draw in this, is if you’re looking at these objects, you’d hopefully get the sense that this one is a hole. This is an indentation, the way the shadow is falling on this part of it here. We don’t usually see the shadows on top of things so that tells you that there’s some depth going down. Whereas this one where the shadow is on the bottom here tells you that this is a rock that you could trip over. That this is something that’s sort of sticking out in the world whereas this is something that’s sticking in.

So that’s one way that we use shading. As I said, you use this type of thing when you look at a flight of stairs. The way that the shadows fall on the steps tells you which part is a vertical part and which part is a horizontal part you can step on. You do that unconsciously, without thinking about it, you just see the stairs and you immediately know how to walk up them. So those are all of the monocular cues.

There’s a couple other cues, actually these next two are also monocular but these involve motion. So these are motion-based cues. So there’s two main ways that we use motion to tell about depth. The first of these is called motion parallax.

The idea of motion parallax is that things appear to move at different speeds based on how far away from us they are. So a great way to demonstrate this is just to imagine that you’re looking out a train window here.

Let’s go ahead and give ourselves this same white background here. So I’m looking out some train window and as I am looking out the window here I’ll see that the tree that’s really close to the train tracks here, the trees are going to whip by the window really quickly. They’re going to travel a large distance in a short amount of time. Whoosh, they’re whipping by but things that are farther away, now in the background here we’ll draw some mountains or something so the mountains here in the background, they’re moving by the window but they move much more slowly.

In the same amount of time that that tree whips by, the mountain barely moves. Then the sun in the sky up here, barely even perceptible movement over that same amount of time. So the sun is making its way across the window over the course of an hour or two or something, whereas the trees are whipping by every second and the mountains are there for maybe half an hour and eventually you pass by them.
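To put rough numbers on the train window example, here is a minimal sketch. It assumes the train moves in a straight line at a constant 30 m/s and looks at each object at the moment it is directly out the side window, where its angular speed is simply the train's speed divided by the object's distance. The function name and all of the distances are assumptions for illustration.

```python
import math


def angular_speed_deg_per_s(observer_speed_m_s, distance_m):
    """Angular speed (degrees/second) of a stationary object as it passes directly to the side."""
    return math.degrees(observer_speed_m_s / distance_m)


TRAIN_SPEED = 30.0  # meters per second, an assumed value

# Assumed distances: a trackside tree, distant mountains, and the sun.
for name, distance in (("tree", 5.0), ("mountain", 5000.0), ("sun", 1.496e11)):
    speed = angular_speed_deg_per_s(TRAIN_SPEED, distance)
    print(f"{name:>8}: {speed:.3g} degrees per second")
```

The same train speed produces very different apparent speeds, and the only thing that changes between the three objects is how far away they are.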

So that’s the idea of motion parallax; we can judge the distance from us based on how quickly something appears to move past our field of vision. The second motion-based cue that we have is called optic flow.

The idea of optic flow is that the way that things move on our retina tells us about their distance from us. So if I throw a ball to you, the ball is initially small, so this is going to incorporate relative size as well, but as it moves toward you it’s expanding in all directions equally.

It’s growing at a uniform rate in all directions as it’s moving closer to you and it’s the rate of this growth, the way that it moves on your retina, the way that it expands on your retina tells you about how quickly it’s approaching you and how far away it is. When it’s very small and far away it’s not growing as much, as it gets closer, as it’s about to hit you in the face, it’s getting very large very quickly and that’s going to tell you about the depth.
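The way the ball's image grows faster as it gets closer can be sketched numerically. This assumes a 0.22 m ball coming straight at you at a constant 10 m/s, and it estimates the growth rate of the image with a small time step. The function names and values are assumptions for illustration only.

```python
import math


def angular_size_deg(diameter_m, distance_m):
    """Visual angle (degrees) covered by a ball of a given diameter at a given distance."""
    return math.degrees(2 * math.atan(diameter_m / (2 * distance_m)))


def expansion_rate_deg_per_s(diameter_m, distance_m, speed_m_s, dt=1e-3):
    """Approximate rate at which the image grows, using a small finite time step."""
    later_distance = distance_m - speed_m_s * dt  # the ball is slightly closer after dt
    growth = angular_size_deg(diameter_m, later_distance) - angular_size_deg(diameter_m, distance_m)
    return growth / dt


# Same ball, same speed, different distances from the viewer.
for d in (20, 10, 5, 1):
    rate = expansion_rate_deg_per_s(0.22, d, 10.0)
    print(f"{d:>2} m away: image growing at {rate:.2f} degrees per second")
```

Even though the ball's speed never changes in this sketch, the image expands slowly when the ball is far away and very quickly just before it arrives.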

So those are our motion-based cues. The last cues that we have are the ones that involve the fact that we have two eyes. So we say these are binocular cues, right? Two eyes, binocular. So there’s two main binocular cues that we use. The first of these is disparity. This is called binocular disparity or it can also be called retinal disparity. This is the idea that we see two versions of the world.

Then we combine them into one, right? We have two eyes, so we’re seeing two different views of the world, slightly different angles and the way that these two views differ helps us to judge depth.

How can you demonstrate this? If you hold an object really close to your face and you view it with just your left eye then just your right eye, you see that it sort of jumps back and forth. It’s on the far right side of your visual field for your left eye but for your right eye it’s on the far left side of the visual field. It’s in very different places and that tells you that it’s very close to you.

Whereas if you hold it at arm’s length and do the same thing, it moves a little bit when you switch eyes, but not that much. If you put it on the other side of the room, you look at something that’s very far away and you switch eyes, the object doesn’t really move at all. That tells you that it’s farther away. The more different these views are, the closer the objects are to us.
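The eye-switching demonstration can be sketched with a simple pinhole model of each eye. This assumes both eyes point straight ahead, sit about 0.063 m apart, and look at a point directly in front of the nose. The function name, the eye separation, and the focal length of 1.0 are all assumptions for the example.

```python
def image_shift(distance_m, eye_separation_m=0.063, focal_length=1.0):
    """Difference in projected horizontal position between the left and right eye views."""
    # The point is to the right of the left eye and to the left of the right eye.
    left_eye_view = focal_length * (eye_separation_m / 2) / distance_m
    right_eye_view = focal_length * (-eye_separation_m / 2) / distance_m
    return left_eye_view - right_eye_view


# Close to the face, at arm's length, and across the room.
for d in (0.1, 0.6, 5.0):
    print(f"{d:>4} m away: shift between the two views = {image_shift(d):.4f}")
```

In this model the shift is inversely proportional to distance, which matches the demonstration: a big jump when the object is near your face and almost none when it is across the room.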

This is how 3D movies work; they present two versions of the film to us, one to each eye. Then they manipulate how different certain objects are in those two views. That gives them this appearance of depth. In the old days they did this with red and blue colored lenses. The red would filter out one version of the movie, the blue would filter out another version. The screen would be showing both and that’s why if you looked at it without glasses, it looks weird and the colors are kinda off and everything’s a little fuzzy.

That’s because there’s two overlapped versions of the movie on the screen. The glasses separate them so each eye only sees one version. Then your brain combines those into an apparently 3D version of the film. This is the same way that modern 3D movies work as well. They don’t use colored lenses anymore, they use polarized lenses now to separate the two versions of the film. But it’s the same principle, you have to send 2 different versions of the movie, one to each eye. So if you only have one eye, there’s no way that you can enjoy those kinds of 3D movies.

OK and the last cue that we have is called convergence. This is the idea that we have two eyes and that we have to move them differently depending on whether something is close to us or far away from us. So what do I mean by that? Let’s imagine these are my two eyes here and if I’m looking at something that’s really close to me I have to point my eyes at this very narrow angle. I have to actually move the muscles around my eyes and point them in like this to look at my finger.

That tells me that the thing I’m looking at, if I have to angle my eyes that way, it’s very close to me. Whereas if I’m looking at the same object but my eyes are pointing mostly straight ahead, the angle is different. The point where my lines of sight converge is farther out, and that tells me that the object I’m looking at is farther away from me.
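Here is a minimal sketch of how the angle between the two eyes changes with distance. It assumes the eyes are about 0.063 m apart and both aim at a point straight ahead. The function name and the numbers are hypothetical, and this is only the geometry, not a claim about how the brain actually reads the eye muscles.

```python
import math


def convergence_angle_deg(distance_m, eye_separation_m=0.063):
    """Angle (degrees) between the two lines of sight when both eyes aim at the same point."""
    return math.degrees(2 * math.atan(eye_separation_m / (2 * distance_m)))


# A finger near the nose, an object at arm's length, and something across the street.
for d in (0.1, 0.6, 20.0):
    print(f"{d:>4} m away: eyes converge by {convergence_angle_deg(d):.2f} degrees")
```

The angle changes a lot at close range and hardly at all for distant objects, which suggests why this cue would be most useful for things that are near you.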

So that’s a final cue that we use to help judge depth. OK I hope you found this helpful, if so, please like the video and subscribe to the channel for more.

Thanks for watching!
