Psych 129 - Sensory Processes
Depth
The nature of depth
- We live in a 3D world, but the images formed on our retinae are only 2D. Thus, information about depth - i.e., how far away things are from you - must be inferred from other information present in the image or in the visual system.
- The way the visual system reconstructs depth information is analogous to the way the auditory system localizes sound from an intrinsically non-spatial detector.
Depth cues
- There are three main classes of depth cues: oculomotor cues, visual binocular cues, and visual monocular cues.
- Oculomotor cues consist of accommodation and vergence. Accommodation is the process by which the lens changes shape in order to bring an object into focus on the retina. Far away objects require low convexity of the lens, whereas near objects require high convexity of the lens in order to become focused on the retina. Thus, the degree to which the lens must be reshaped to bring an object into focus provides a cue to depth. Vergence is the process by which the eyes rotate in equal and opposite directions in order to fixate an object. Near objects require both eyes to turn inwards in order to foveate the object, whereas far objects require both eyes to be oriented along nearly parallel lines of sight.
- Visual binocular cues consist of the disparity present between the left and right eye images. The process by which the brain infers depth from disparity is known as stereopsis.
- Visual monocular cues consist of occlusion, size, perspective, and parallax.
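To make the vergence cue concrete, the following is a minimal sketch of the geometry, assuming symmetric fixation on a point straight ahead and an interocular separation of 0.063 m (an illustrative value, not one given in these notes). The function names are hypothetical.

```python
import math

# Assumed interocular separation in metres (illustrative value only).
IPD_M = 0.063

def vergence_angle_deg(distance_m, ipd_m=IPD_M):
    """Angle between the two lines of sight when both eyes fixate a
    point straight ahead at distance_m (symmetric fixation assumed)."""
    return math.degrees(2.0 * math.atan((ipd_m / 2.0) / distance_m))

def distance_from_vergence(angle_deg, ipd_m=IPD_M):
    """Invert the relation: fixation distance implied by a vergence angle."""
    return (ipd_m / 2.0) / math.tan(math.radians(angle_deg) / 2.0)

if __name__ == "__main__":
    for d in (0.25, 0.5, 1.0, 2.0, 10.0):
        print(f"{d:5.2f} m -> {vergence_angle_deg(d):6.3f} deg")
```

The angle falls off quickly with distance, which is one way to see why vergence is most informative for near objects: for far objects the lines of sight are already almost parallel.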
Stereopsis
- Each eye gets a slightly different view of the world. Disparity refers to the pixel-by-pixel differences between the images produced in the left and right eye. The brain uses the disparity between left and right images to infer depth information from a scene.
- Many people are disparity-blind, meaning they cannot tap into this cue for depth. Yet they can get around in the world perfectly well because they rely on the other cues to depth.
- Computation of disparity by the visual system requires comparison of left and right images to establish correspondence (much like ITD in the auditory system).
- Random-dot stereograms demonstrate that matches between left and right eye images are made on the basis of relatively low-level visual information (i.e., individual pixels).
- Disparity-selective cells in the visual cortex may underlie the perception of depth.
- When the left and right eye images don't match at all, you get binocular rivalry. The two images compete, oscillating back and forth with a period of several seconds (try it!). What's going on neurally? (big question)
- Small mismatches in the left and right images - i.e., where a feature in the image of one eye has no corresponding feature in the other eye - can also be used as a depth cue. This form of binocular depth cue is called Da Vinci stereopsis (since Leonardo first described it).
- Abnormalities of stereopsis include strabismus (due to misalignment of the eyes that goes uncorrected early in life), and amblyopia (where one eye has poor vision).
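The correspondence problem described above can be illustrated with a toy random-dot stereogram. The sketch below is only an illustration under stated assumptions: it is not a model of how the visual system solves the problem, the function names are hypothetical, and the sign convention (a positive disparity means the patch sits further left in the right eye's image) is chosen for convenience. It builds a pair of random binary images in which a central square is shifted horizontally, then recovers the shift by finding the horizontal offset that minimizes pixel mismatches.

```python
import random

def make_stereogram(width=60, height=40, shift=3, seed=0):
    """Return (left, right) binary images as lists of rows. The right image
    is a copy of the left, except that a central square is displaced
    `shift` pixels to the left."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    right = [row[:] for row in left]
    y0, y1 = height // 4, 3 * height // 4
    x0, x1 = width // 4, 3 * width // 4
    for y in range(y0, y1):
        for x in range(x0, x1):
            right[y][x - shift] = left[y][x]
        # Fill the strip uncovered by the shift with fresh random dots.
        for x in range(x1 - shift, x1):
            right[y][x] = rng.randint(0, 1)
    return left, right

def best_disparity(left, right, y, x, half=4, max_d=6):
    """Disparity d that minimizes the number of mismatched pixels between
    a left patch centred at (y, x) and a right patch centred at (y, x - d).
    The caller must keep the patch and search range inside the image."""
    def cost(d):
        return sum(
            left[y + dy][x + dx] != right[y + dy][x + dx - d]
            for dy in range(-half, half + 1)
            for dx in range(-half, half + 1)
        )
    return min(range(-max_d, max_d + 1), key=cost)

if __name__ == "__main__":
    left, right = make_stereogram()
    print("centre of square:", best_disparity(left, right, 20, 30))
    print("background:      ", best_disparity(left, right, 5, 30))
```

Neither image on its own contains any visible square; the square exists only in the relationship between the two images, which is the point of the random-dot demonstration.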
Occlusion
- When one object is in front of another (with respect to the viewer), it overwrites the other object in the image.
- The brain understands this rule of the image formation process, and fills in image structure when edges are not there - e.g., Kanizsa figures, illusory contours.
- Amodal completion refers to the completion of an object behind the occluder. Removing the occluder alone can result in a failure of amodal completion - e.g., Bregman Bs.
Size
- Size constancy refers to the fact that familiar objects appear the same size irrespective of distance, even though they project to different sizes on the retina.
- The Ames room demonstrates that context can override size constancy. The moon illusion illustrates another breakdown in size constancy due to context.
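The relationship between physical size, distance, and retinal size can be sketched with simple visual-angle geometry. This is a minimal sketch, assuming that perceived size is obtained by scaling the retinal angle by an assumed distance. That scaling rule is a common textbook idealization rather than something stated in these notes, and the function names are hypothetical.

```python
import math

def visual_angle_deg(size_m, distance_m):
    """Angle subtended at the eye by an object of a given size and distance."""
    return math.degrees(2.0 * math.atan(size_m / (2.0 * distance_m)))

def inferred_size_m(angle_deg, assumed_distance_m):
    """Physical size implied by a visual angle if the object is assumed to
    lie at assumed_distance_m (idealized size-distance scaling)."""
    return 2.0 * assumed_distance_m * math.tan(math.radians(angle_deg) / 2.0)

if __name__ == "__main__":
    person = 1.8  # metres, illustrative
    for d in (2.0, 4.0, 8.0):
        a = visual_angle_deg(person, d)
        # Correct distance -> same inferred size at every distance.
        print(f"{d:4.1f} m: {a:6.2f} deg -> {inferred_size_m(a, d):.2f} m")
    # Wrong assumed distance -> wrong inferred size for the same image.
    a = visual_angle_deg(person, 4.0)
    print("assumed 2 m instead of 4 m:", round(inferred_size_m(a, 2.0), 2), "m")
```

When the assumed distance is correct, the inferred size is constant even though the visual angle shrinks with distance. When context supplies a wrong distance, as the Ames room is built to do, the same retinal image implies a different size.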
Perspective
- Lines that are parallel in the world (e.g., railroad tracks) do not remain parallel when projected onto the retina. Rather, they converge toward a vanishing point as they recede into the distance. This convergence of parallel lines provides an important cue to depth, and its systematic use was an important innovation by artists of the Renaissance. To infer depth from this cue, though, the brain must assume that the lines are actually parallel in the external world.
- In a surface containing textured elements of a constant size, the elements sitting on parts of the surface far from an observer will become smaller when projected onto the retina. Such a gradient in texture provides an important cue to depth. Again, though, the brain must assume that the elements of the texture actually all have the same size in the external world.
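Both the convergence of parallel lines and the texture gradient follow from the same projection rule. The sketch below assumes an idealized pinhole projection with focal length `f` (the eye is not literally a pinhole camera, and the names and numbers are illustrative only): a point at lateral position x and distance z lands at image position f * x / z.

```python
def project(x, y, z, f=1.0):
    """Pinhole projection of a 3D point; z is distance along the line of sight."""
    return f * x / z, f * y / z

def image_width(width_m, z, f=1.0):
    """Projected width of something width_m wide, centred on the line of
    sight, at distance z. Applies equally to the gap between two rails
    and to a single texture element."""
    left_edge, _ = project(-width_m / 2.0, 0.0, z, f)
    right_edge, _ = project(width_m / 2.0, 0.0, z, f)
    return right_edge - left_edge

if __name__ == "__main__":
    rail_gap = 1.5  # metres between the rails, illustrative
    for z in (2.0, 4.0, 8.0, 16.0, 1000.0):
        print(f"z = {z:7.1f} m -> projected gap = {image_width(rail_gap, z):.4f}")
```

The projected gap halves each time the distance doubles and approaches zero for very distant points, which is the convergence of the rails in the image. Reading depth back out of the projected gap works only if the true gap is assumed constant, matching the assumption noted above.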
Parallax
- Parallax refers to the relative motion of objects across the retina as the observer moves. Near objects move faster across the retina than far objects, and so relative motion provides an important cue to depth. Parallax may be seen as a form of disparity over time and is used by the locust (which wobbles its head back and forth) to infer depth.
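The dependence of parallax on distance can be sketched as follows, assuming a stationary object that starts straight ahead and an eye that translates sideways by a known amount without rotating. The function names and numbers are illustrative assumptions.

```python
import math

def angular_shift_deg(distance_m, head_shift_m):
    """Change in the direction of a stationary point that was straight
    ahead, after the eye translates sideways by head_shift_m."""
    return math.degrees(math.atan(head_shift_m / distance_m))

def distance_from_parallax(shift_deg, head_shift_m):
    """Invert the relation: distance implied by an observed angular shift
    for a known sideways translation."""
    return head_shift_m / math.tan(math.radians(shift_deg))

if __name__ == "__main__":
    head_shift = 0.05  # metres, illustrative
    for d in (0.5, 1.0, 5.0, 20.0):
        print(f"{d:5.1f} m -> {angular_shift_deg(d, head_shift):6.3f} deg")
```

For the same head movement, near objects sweep through a larger angle than far ones, so the size of the shift can in principle be inverted to give distance. The analogy with disparity is direct: the two head positions play the role of the two eyes.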