
Perception of Motion, Depth, and Form


IN VISION, AS IN OTHER mental operations, we experience the world as a whole. Independent attributes (motion, depth, form, and color) are coordinated into a single visual image. In the two previous chapters we began to consider how two parallel pathways, the magnocellular and parvocellular pathways that extend from the retina through the lateral geniculate nucleus of the thalamus to the primary visual (striate) cortex, might produce a coherent visual image. In this chapter we examine how the information from these two pathways feeds into multiple higher-order centers of visual processing in the extrastriate cortex. How do these pathways contribute to our perception of motion, depth, form, and color?
The magnocellular (M) and parvocellular (P) pathways feed into two extrastriate cortical pathways: a dorsal pathway and a ventral pathway. In this chapter we examine, in cell-biological terms, the information processing in each of these pathways.
We shall first consider the perception of motion and depth, mediated in large part by the dorsal pathway to the posterior parietal cortex. We then consider the perception of contrast and contours, mediated largely by the ventral pathway extending to the inferior temporal cortex. This pathway is also concerned with the assessment of color, which we will consider in Chapter 29. Finally, we shall consider the binding problem in the visual system: how information conveyed in parallel but separate pathways is brought together into a coherent perception.


Figure 28-1 Organization of V1 and V2.
A. Subregions in V1 (area 17) and V2 (area 18). This section from the occipital lobe of a squirrel monkey at the border of areas 17 and 18 was reacted with cytochrome oxidase. The cytochrome oxidase stains the blobs in V1 and the thick and thin stripes in V2. (Courtesy of M. Livingstone.)
B. Connections between V1 and V2. The blobs in V1 connect primarily to the thin stripes in V2, while the interblobs in V1 connect to interstripes in V2. Layer 4B projects to the thick stripes in V2 and to the middle temporal area (MT). Both thin stripes and interstripes project to V4. Thick stripes in V2 also project to MT.

The Parvocellular and Magnocellular Pathways Feed Into Two Processing Pathways in Extrastriate Cortex
In Chapter 27 we saw that the parallel parvocellular and magnocellular pathways remain segregated even in the striate cortex. What happens to these P and M pathways beyond the striate cortex? Early research on these pathways indicated that the P pathway continues in the ventral cortical pathway that extends to the inferior temporal cortex, and that the M pathway becomes the dorsal pathway that extends to the posterior parietal cortex. However, the actual relationships are probably not so exclusive.
The evidence for separation of function of the dorsal and ventral pathways begins in the primary visual, or striate, cortex (V1). Staining for the mitochondrial enzyme cytochrome oxidase reveals a precise and repeating pattern of dark, peg-like regions about 0.2 mm in diameter called blobs. The blobs are especially prominent in the superficial layers 2 and 3, where they are separated by intervening regions that stain lighter, the interblob regions. In the secondary visual cortex, or V2, the same stain reveals alternating thick and thin stripes separated by interstripes of little activity (Figure 28-1).


Figure 28-2 The magnocellular (M) and parvocellular (P) pathways from the retina project through the lateral geniculate nucleus (LGN) to V1. Separate pathways to the temporal and parietal cortices course through the extrastriate cortex beginning in V2. The connections shown in the figure are based on established anatomical connections, but only selected connections are shown and many cortical areas are omitted (compare Figure 25-9 ). Note the cross connections between the two pathways in several cortical areas. The parietal pathway receives input from the M pathway but only the temporal pathway receives input from both the M and P pathways. (Abbreviations: AIT = anterior inferior temporal area; CIT = central inferior temporal area; LIP = lateral intraparietal area; Magno = magnocellular layers of the lateral geniculate nucleus; MST = medial superior temporal area; MT = middle temporal area; Parvo = parvocellular layers of the lateral geniculate nucleus; PIT = posterior inferior temporal area; VIP = ventral intraparietal area.) (Based on Merigan and Maunsell 1993 .)

Margaret Livingstone and David Hubel identified the anatomical connections between labeled regions in V1 and V2 ( Figure 28-1B ). They found that the P and M pathways remain partially segregated through V2. The M pathway projects from the magnocellular layers of the lateral geniculate nucleus to the striate cortex, first to layer 4C and then to layer 4B. Cells in layer 4B project directly to the middle temporal area (MT) and also to the thick stripes in V2, from which cells also project to MT. Thus, a clear anatomical pathway exists from the magnocellular layers in the lateral geniculate nucleus to MT and from there to the posterior parietal cortex ( Figure 28-2 ).
Cells in the parvocellular layers of the lateral geniculate nucleus project to layer 4C in the striate cortex, from which cells project to the blobs and interblobs of V1. The blobs send a strong projection to the thin stripes in V2, whereas interblobs send a strong projection to the interstripes in V2. The thin stripe and interstripe areas of V2 may in turn project to discrete subregions of V4, thus maintaining this separation in the P pathway into V4 and possibly on into the inferior temporal cortex. A pathway from the P cells in the lateral geniculate nucleus to the inferior temporal cortex can therefore also be identified ( Figure 28-2 ).


Figure 28-3 Motion in the visual field can be perceived in two ways.
A. When the eyes are held still, the image of a moving object traverses the retina. Information about movement depends upon sequential firing of receptors in the retina.
B. When the eyes follow an object, the image of the moving object falls on one place on the retina and the information is conveyed by movement of the eyes or the head.

But are these pathways exclusive of each other? Several anatomical observations suggest that they are not. In V1 both the magnocellular and parvocellular pathways have inputs in the blobs, and local neurons make extensive connections between the blob and interblob compartments. In V2 cross connections exist between the stripe compartments. Thus, the separation is not absolute, but whether there is an intermixing of the M and P contributions or whether the cross connections allow one cortical pathway to modulate activity in the other is not clear.
Results of experiments that selectively inactivate the P and M pathways as they pass through the lateral geniculate nucleus (described in Chapter 27) also erode the notion of strict segregation between the pathways in V1. Blocking of either pathway affects the responses of most neurons in V1, which indicates that most V1 neurons receive physiologically effective inputs from both pathways. Further work has shown that the responses of neurons both within and outside of the blobs in the superficial layers of V1 are altered by blocking only the M pathway. Both observations suggest that there is incomplete segregation of the M and P pathways in V1.
This selective blocking of the P and M pathways also reveals the relative contributions of the pathways to the parietal and inferior temporal cortices. Blocking the magnocellular layers of the lateral geniculate nucleus eliminates the responses of many cells in MT and always reduces the responses of the remaining cells; blocking the parvocellular layers produces a much weaker effect on cells in MT. In contrast, blocking the activity of either the parvocellular or magnocellular layers in the lateral geniculate nucleus reduces the activity of neurons in V4. Thus, the dorsal pathway to MT seems primarily to include input from the M pathway, whereas the ventral pathway to the inferior temporal cortex appears to include input from both the M and P pathways. We can now see that there is substantial segregation of the P and M pathways up to V1, probably separation into V2, a likely predominance of the M input to the dorsal pathway to MT and the parietal cortex, and a mixture of P and M input into the pathway leading to the inferior temporal lobe (as indicated by the lines crossing between the pathways in Figure 28-2 ).


Figure 28-4 The illusion of apparent motion is evidence that the visual system analyzes motion in a separate pathway.
A. Actual motion is experienced as a sequence of visual sensations, each resulting from the image falling on a different position in the retina.
B. Apparent motion may actually be more convincing than actual motion, and is the perceptual basis for motion pictures. Thus, when two lights at positions 1 and 2 are turned on and off at suitable intervals, we perceive a single light moving between the two points. This perceptual illusion cannot be explained by processing of information based on different retinal positions and is therefore evidence for the existence of a special visual system for the detection of motion. (From Hochberg 1978 .)

Figure 28-5 Separate human brain areas are activated by motion and color. Motion studies. Six subjects viewed a black and white random-dot pattern that moved in one of eight directions or remained stationary. The figure shows the effect of motion because the PET scans taken while the pattern was stationary were subtracted from those taken while the pattern was moving. The white and red areas show the high end of activity (increased blood flow). The areas are located on the convexity of the prestriate cortex at the junction of Brodmann's areas 19 and 37.
Color studies. The subjects viewed a collage of 15 squares and rectangles of different colors or, alternatively, the same patterns in gray shades only. The figure shows the difference in blood flow while viewing the color and gray patterns. The area showing increased flow, subserving the perception of color, is located inferiorly and medially in the occipital cortex. (From Zeki et al. 1991 ).

What should we conclude about the organization of visual processing throughout the multiple areas of the visual cortex? First, we know that there are specific serial pathways through the multiple visual areas, not just a random assortment of equally connected areas. There is substantial evidence for two major processing pathways, a dorsal one to the posterior parietal cortex and a ventral one to the inferior temporal cortex, but other pathways may also exist. Second, there is strong evidence that the processing in these two cortical pathways is hierarchical. Each level has strong projections to the next level (and projections back), and the type of visual processing changes systematically from one level to the next. Third, the functions of cortical areas in the two cortical pathways are strikingly different, as judged both by the anatomical connections and the cellular activity considered in this chapter and by the behavioral and brain imaging evidence discussed in Chapter 25 .
Our examination of the functional organization within these vast regions of extrastriate visual cortex begins with the dorsal cortical pathway and the most intensively studied visual attribute, motion. We then examine the processing of depth information in the dorsal pathway. Finally, we turn to the ventral cortical pathway and consider the processing of information related to form. Color vision is the subject of the next chapter.
Motion Is Analyzed Primarily in the Dorsal Pathway to the Parietal Cortex
We usually think of motion as an object moving in the visual field, a car or a tennis ball, and we easily distinguish these moving objects from the stationary background. However, we often see objects in motion not because they move on our retina, but because we track them with eye movements; the image remains stationary on the retina but we perceive movement because our eyes move ( Figure 28-3 ).
Motion in the visual field is detected by comparing the position of images recorded at different times. Since most cells in the visual system are exquisitely sensitive to retinal position and can resolve events separated in time by 10 to 20 milliseconds, most cells in the visual system should, in principle, be able to extract information about motion from the position of the image on the retina by comparing the previous location of an object with its current location. What then is the evidence for a special neural subsystem specialized for motion?
The initial evidence for a special mechanism designed to detect motion independent of retinal position came from psychophysical observations on apparent motion, an illusion of motion that occurs when lights separated in space are turned on and off at appropriate intervals ( Figure 28-4 ). The perception of motion of objects that in fact have not changed position suggests that position and motion are signaled by separate pathways.
Box 28-1 Optic Flow
Optic flow refers to the perceived motion of the visual field that results from an individual's own movement through the environment. With optic flow the entire visual field moves, in contrast to the local motion of objects. Optic flow provides two types of cues: information about the organization of the environment (near objects will move faster than more distant objects) and information about the control of posture (side-to-side patterns induce body sway). Particularly influential in the development of ideas about optic flow was the demonstration by the experimental psychologist James J. Gibson that optic flow is critical for indicating the direction of observer movement (“heading”). For example, when an individual moves forward with eyes and head directed straight ahead, optic flow expands outward from a point straight ahead in the visual field, a pattern that is frequently used in movies to show space ship flight.
Where is optic flow represented in the brain? Neurons in one region of the medial superior temporal area of the parietal cortex in monkeys respond in ways that would make these cells ideal candidates to analyze optic flow. These neurons respond selectively to motion, have receptive fields that cover large parts of the visual field, and respond preferentially to large-field motion in the visual field. Additionally, the neurons are sensitive to shifts in the origin of full-field motion and to differences in speed between the center and periphery of the field. The neurons also receive input related to eye movement, which is particularly significant because forward movement is typically accompanied by eye and head movement. Finally, electrical stimulation of this area alters the ability of the monkey to locate the point of origin of field motion, providing further evidence that the medial superior temporal area of the parietal cortex is important for optic flow.
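The geometry of the expanding flow pattern described in this box can be illustrated with a minimal sketch. It assumes an idealized pinhole eye with unit focal length and an observer who translates without rotating the eyes or head; the function and parameter names are hypothetical, and the code describes the stimulus geometry only, not how neurons compute it.

```python
def flow_vector(x, y, depth, tx, ty, tz):
    """Image velocity (u, v) at image point (x, y) during pure translation.

    Assumptions (illustrative only): pinhole projection with focal
    length 1, no eye or head rotation, and a scene point lying at
    distance `depth` along the line of sight. (tx, ty, tz) is the
    observer's velocity; tz > 0 means moving forward.
    """
    u = (x * tz - tx) / depth
    v = (y * tz - ty) / depth
    return u, v


def focus_of_expansion(tx, ty, tz):
    """Image point with zero flow, which marks the heading direction."""
    return tx / tz, ty / tz


# Moving straight ahead: flow points away from the center of the image,
# and a near point (depth 2) moves faster than a far point (depth 4).
near = flow_vector(0.2, 0.0, 2.0, 0.0, 0.0, 1.0)
far = flow_vector(0.2, 0.0, 4.0, 0.0, 0.0, 1.0)
```

In this sketch the flow at every image point is directed away from the focus of expansion, and its speed scales inversely with depth, which matches the two cues named above: heading and the layout of near and far objects.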
Motion Is Represented in the Middle Temporal Area
Experiments on monkeys show that neurons in the retina and lateral geniculate nucleus, as well as many areas in the striate and extrastriate cortex, respond very well to a spot of light moving across their receptive fields. In area V1, however, cells respond to motion in one direction, while motion in the opposite direction has little or no effect on them. This directional selectivity is prominent among cells in layer 4B of the striate cortex. Thus, cells in the M pathway provide input to cells in 4B, but these input cells themselves do not show directional selectivity. They simply provide the raw input for the directionally selective cortical cells.
In monkeys one area at the edge of the parietal cortex, the middle temporal area (MT), appears to be devoted to motion processing because almost all of the cells are directionally selective and the activity of only a small fraction of these cells is substantially altered by the shape or the color of the moving stimulus. Like V1, MT has a retinotopic map of the contralateral visual field, but the receptive fields of cells within this map are about 10 times wider than those of cells in the striate cortex. Cells with similar directional specificity are organized into vertical columns running from the surface of the cortex to the white matter. Each part of the visual field is represented by a set of columns in which cells respond to different directions of motion in that part of the visual field. This columnar organization is similar to that seen in V1.
Cells in MT respond to motion of spots or bars of light by detecting contrasts in luminance. Some cells in MT also respond to moving forms that are not defined by differences in luminance but by differences only in texture or color. While these cells are not selective for color itself, they nonetheless detect motion by responding to an edge defined by color. Thus, even though MT and the dorsal pathway to the parietal cortex may be devoted to the analysis of motion, the cells are sensitive to stimuli (color) that were thought to be analyzed primarily by cells in the ventral pathway. Stimulus information on motion, form, and color therefore is not processed exclusively in separate functional pathways.
This description of motion processing is based on research on the MT area in monkeys. In the human brain an area devoted to motion has been identified at the junction of the parietal, temporal, and occipital cortices. Figure 28-5 shows changes in blood flow in this area in PET scans made while the subject viewed a pattern of dots in motion.
A cortical area adjacent to MT, the medial superior temporal area (MST), also has neurons that are responsive to visual motion and these neurons may process a type of global motion in the visual field called optic flow, which is important for a person's own movements through an environment ( Box 28-1 ).
Cells in MT Solve the Aperture Problem
We have considered the response of MT neurons to the motion of simple stimuli like an edge or a line. However, in the everyday world complex two- and three-dimensional patterns often give rise to ambiguous or illusory perception. Consider the example in Figure 28-6A, which shows three gratings moving in three directions. When viewed through a small circular aperture, all three gratings appear to move in the same direction. The observer reports only the component of motion that is perpendicular to the orientation of the bars in the gratings. This phenomenon, known as the aperture problem, applies to the study of neurons as well as perception. Since most neurons in V1 and MT have relatively small receptive fields, they confront the aperture problem when an object larger than their receptive field moves across the visual field.


Figure 28-6 The aperture problem.
A. Three patterns moving in three different directions produce the same physical stimulus if only part of the pattern is within view, and thus all three patterns are perceived as moving in the same direction. Three patterns are shown moving in three directions. When seen through a small aperture the three gratings appear to move in the same direction, downward and to the right. This failure to accurately detect the true direction of motion is called the aperture problem. (Adapted from Movshon et al. 1985 .)
B. A formal solution to the aperture problem. When motion in different directions, downward (1) or rightward (2), is seen through a small aperture, the motion of the edge seen through the aperture does not indicate the true direction of the entire pattern. Assume now that the aperture represents the receptive field of a neuron, and that there are two apertures rather than one (3). This represents the situation in which two or more cells that respond to specific directions perpendicular to their axis of orientation are activated by different edges moving in different directions. A higher-order cell that integrates the signals from the lower-order cells could encode the motion of the entire object. (Adapted from Movshon 1990 .)

To solve the aperture problem, neurons may extract information about motion in the visual field in two stages. In the initial stage neurons that respond to a specific axis of orientation signal motion of components perpendicular to their axis of orientation. The second stage is concerned with establishing the direction of motion of the entire pattern. In this stage higher-order neurons integrate the local components of motion analyzed by neurons in the initial stage.
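The two-stage idea can be made concrete with a small geometric sketch. It assumes that each first-stage unit reports only the speed of an edge along the direction perpendicular to that edge, and that a second stage combines two such measurements by solving for the one velocity consistent with both. This is one formal way to combine component signals, offered as an illustration; the function names are hypothetical and nothing here is a claim about how MT neurons actually perform the computation.

```python
import math


def component_speed(pattern_velocity, normal_deg):
    """First stage: speed seen through an aperture.

    Only the projection of the true velocity onto the direction
    perpendicular to the edge (`normal_deg`, in degrees) is visible.
    """
    vx, vy = pattern_velocity
    a = math.radians(normal_deg)
    return vx * math.cos(a) + vy * math.sin(a)


def pattern_velocity_from_components(normal1_deg, speed1, normal2_deg, speed2):
    """Second stage: recover the velocity consistent with two components.

    Solves the 2x2 system  v . n1 = speed1,  v . n2 = speed2,
    where n1 and n2 are unit vectors perpendicular to the two edges.
    """
    a1, a2 = math.radians(normal1_deg), math.radians(normal2_deg)
    n1x, n1y = math.cos(a1), math.sin(a1)
    n2x, n2y = math.cos(a2), math.sin(a2)
    det = n1x * n2y - n1y * n2x
    if abs(det) < 1e-12:
        # Parallel edges give the same constraint twice.
        raise ValueError("parallel edges leave the motion ambiguous")
    vx = (speed1 * n2y - speed2 * n1y) / det
    vy = (n1x * speed2 - n2x * speed1) / det
    return vx, vy


# A pattern moving rightward, seen through two edges whose normals
# point 45 degrees above and below the horizontal.
true_velocity = (1.0, 0.0)
s1 = component_speed(true_velocity, 45.0)
s2 = component_speed(true_velocity, -45.0)
recovered = pattern_velocity_from_components(45.0, s1, -45.0, s2)
```

Each component measurement alone is compatible with many pattern velocities; only the combination of two non-parallel edges pins down a single direction, which is why a single small receptive field cannot resolve the ambiguity.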
The hypothesis that motion information in the visual system is processed in two stages was tested by Tony Movshon and his colleagues, who recorded the responses of cells in V1 and MT to a moving plaid pattern. The neurons of V1 as well as the majority of neurons in MT responded only to the components of the plaid. Each cell responded best when the lines in the plaid moved in the direction preferred by the cell. Cells did not respond to the direction of motion of the entire plaid. Movshon therefore called these neurons component direction-selective neurons. In contrast, about 20% of the neurons in MT responded only to motion of the plaid pattern. These cells, called pattern direction-sensitive neurons, receive input from the component direction-selective cells (Figure 28-7). Thus, as suggested by the two-stage hypothesis, the global motion of an object is computed by pattern direction-sensitive neurons in MT based on the inputs of the component direction-selective neurons in V1 and MT.


Figure 28-7 Neurons in the middle temporal area of cortex in monkeys are sensitive to the motion of an entire object in the visual field.
A. Stimuli used to activate cells in V1 and MT. Pairs of gratings are oriented at a 90° angle (left) or 135° angle (right) to each other. When each pair is superimposed during movement, the resulting plaid pattern appears to move directly to the right. The motion of each component grating is perpendicular to the orientation of its bars. The movement of either component alone should stimulate first-stage neurons that prefer the direction of motion of the one grating. When the two gratings are superimposed to form a moving plaid, other (second-stage) neurons should be activated.
B. Polar plots illustrate the motion signaled by first-stage neurons in V1. The plots show the response of a neuron to the direction (0 to 360°) of motion of individual gratings (1) and the plaid (2). The response of the neuron to motion in each direction is indicated by the distance of the point from the center of the plot. The circle at the center indicates the neuron's activity when no stimulus is presented.
1. This neuron responds best when the motion of a grating is downward and to the right (blue arrow).
2. When presented with the moving plaid, the neuron responds to the motion of each component grating (solid color) rather than to the rightward motion of the plaid. The response of the cell to the grating components is expected to have the two-lobed configuration indicated by the dashed lines. Neurons that respond only to the motion of the components of the plaid are referred to as component direction-selective neurons.
C. These polar plots illustrate the motion signaled by a higher-order neuron in the middle temporal area (MT).
1. As with the lower-order cell in V1, this cell in MT responds to motion downward and to the right.
2. When presented with the plaid, the neuron responds to the direction of motion of the plaid (solid color), not to the directions of the component gratings (dashed line). This indicates that the neuron has processed the component signals of V1 into a more accurate perception of the movement of the object, and the neuron is referred to as a pattern direction-sensitive neuron. (Modified from Movshon et al. 1985 .)

Figure 28-8 Cortical lesions in monkeys and humans produce similar deficits in smooth-pursuit eye movement.
A. Smooth-pursuit eye movements of a monkey before a lesion in the foveal region of the medial superior temporal area (MST) in the right hemisphere (prelesion) and 24 h after the lesion (postlesion). The monkey's task is to keep the moving target on the fovea by making smooth-pursuit eye movements. The dotted line shows the position of the target over time. The stimulus is turned off where the monkey is fixating and then turned on again as it moves smoothly at 15° per second to the right. The solid lines show superimposed eye movements as the monkey pursues the target on 10 separate trials. Note that before the lesion the monkey nearly matches the target motion with smooth-pursuit eye movement, but after the lesion it fails to do so. Instead it makes frequent saccadic eye movements (the series of small steps in the eye movement record) to catch up with the target. (From Dursteler et al. 1987.)
B. Smooth-pursuit eye movements in a human subject with a right occipital-parietal lesion. The patient attempts to follow a target moving 20° per second to the right, but the eye movements do not keep up with the target motion. As with the monkey, the subject uses a series of catch-up saccades to compensate for the slow pursuit. In both humans and monkeys the pursuit deficit is most prominent when the target is moving toward the side of the brain containing the lesion (in these cases, right brain lesions and deficits with rightward pursuit). The human subject, with a large lesion that must include multiple brain areas, has a deficit in smooth-pursuit eye movements very similar to the deficit seen in the monkey with a lesion limited to small and identified visual areas. (From Morrow and Sharpe 1993 .)

Control of Movement Is Selectively Impaired by Lesions of MT
These correlations of neuronal activity and visual perception raise the question, Is the activity of direction-selective cells in MT causally related to the visual perception of motion and the control of motion-dependent movement? The question whether direction-selective cells in MT directly affect the control of movement was first addressed in an experiment that examined the relationship of these cells to smooth-pursuit eye movements, the movements that keep a moving target in the fovea (see Figure 28-3 ). When discrete focal chemical lesions were made within different regions of the retinotopic map in MT of a monkey, the speed of the moving target could no longer be estimated correctly in the region of the visual field monitored by the damaged MT area. In contrast, the lesions did not affect pursuit of targets in other regions of the visual field nor did they affect eye movements to stationary targets. Thus, visual processing in MT is selective for motion of the visual stimulus; lesions produce a blind spot, or a scotoma, for motion.
Human patients with lesions of parietal cortex also sometimes have these deficits in smooth-pursuit eye movements, but the most frequent behavioral deficit is quite different from that seen after lesions of MT. The neurologist Gordon Holmes originally reported that these patients were unable to follow a target when it was moving toward the side of the brain that had the lesion. For example, a patient with a lesioned right hemisphere has difficulty pursuing a target moving toward the right ( Figure 28-8B ). Later experiments on monkeys showed that lesions centered on the medial superior temporal area (MST), the next level of processing for visual motion, produced just such a deficit ( Figure 28-8A ).
Perception of Motion Is Altered by Lesions and Microstimulation of MT
The question whether MT cells contribute to the perception of visual motion was addressed in an experiment in which monkeys were trained to report the direction of motion in a display of moving dots. The experimenter varied the proportion of dots that moved in the same direction. At zero correlation the motion of all dots was random and at 100% correlation the motion of all dots was in one direction ( Figure 28-9A ). While normal monkeys could perform the task with less than 10% of the dots moving in the same direction, monkeys with a lesion in MT required nearly 100% coherence to perform as well ( Figure 28-9B ). A human patient with bilateral brain damage also lost the perception of motion when tested on the same task ( Figure 28-9B ). In both the monkeys and the human subject, visual acuity for stationary stimuli was not affected by the brain damage.
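The logic of the correlated-dot display can be sketched in a few lines. The sketch below assigns a direction to each dot for a single frame, with a chosen fraction of dots sharing one direction and the rest moving at random. It is an illustration of the stimulus principle only: the function names are hypothetical, and details of the displays used in the experiments (dot lifetimes, displacement sizes, timing) are not modeled.

```python
import math
import random


def dot_directions(n_dots, correlation, signal_direction_deg, rng=None):
    """Directions (degrees) for one frame of a correlated-dot display.

    A fraction `correlation` (0.0 to 1.0) of the dots moves in the
    signal direction; the remaining dots move in random directions.
    """
    rng = rng or random.Random()
    n_signal = round(n_dots * correlation)
    directions = [signal_direction_deg] * n_signal
    directions += [rng.uniform(0.0, 360.0) for _ in range(n_dots - n_signal)]
    rng.shuffle(directions)
    return directions


def net_motion(directions):
    """Length of the mean unit vector of all dot directions.

    Close to 0 when the directions are random, 1 when all dots agree.
    """
    n = len(directions)
    x = sum(math.cos(math.radians(d)) for d in directions) / n
    y = sum(math.sin(math.radians(d)) for d in directions) / n
    return math.hypot(x, y)


rng = random.Random(0)  # fixed seed so the example is repeatable
noise_only = dot_directions(200, 0.0, 225.0, rng)
half = dot_directions(200, 0.5, 225.0, rng)
full = dot_directions(200, 1.0, 225.0, rng)
```

Lowering the correlation weakens the net motion signal relative to the noise, so the threshold correlation at which direction can still be reported serves as a measure of motion sensitivity.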


Figure 28-9 A monkey with an MT lesion and a human patient with damage to extrastriate visual cortex have similar deficits in motion perception.
A. Displays used to study the perception of motion. In the display on the left there is no correlation between the directions of movement of several dots, and thus no net motion in the display. In the display on the right all the dots move in the same direction (100% correlation). An intermediate case is in the center; 50% of the dots move in the same direction while the other 50% move in random directions (essentially noise added to the signal). (From Newsome and Pare 1988 .)
B. The performance of a monkey before and after an MT lesion (left). The performance of a human subject with bilateral brain damage is compared to two normal subjects (right). The ordinate of the graph shows the percent correlation in the directions of all moving dots (as in part A) required for the monkey to pick out the one common direction. The abscissa indicates the size of the displacement of the dot and thus the degree of apparent motion. Note the general similarity between the performance of the humans and that of the monkey and the devastation to this performance after the cortical lesions. (From Newsome and Pare 1988 , Baker et al. 1991 .)

Thus damage to MT reduces the ability of monkeys to detect motion in the visual field, as indicated by disruptions in the pursuit of moving objects and perception of the direction of motion. However, monkeys with MT lesions quickly recover these functions. Directionally selective cells in other areas of cerebral cortex, such as MST, apparently can take over the function performed by MT. Recovery of function is greatly slowed when the lesion affects not only MT but also MST and other extrastriate areas.


Figure 28-10 Alteration of perceived direction of motion by stimulation of MT neurons. A monkey was shown a display of moving dots with a relatively low correlation of 25.6% (see Figure 28-9A ) and instructed to indicate in which of eight directions the dots appeared to be moving. The open circles show the proportion of decisions made for each direction of motion—about equal choice for all directions. Electric current was passed through a microelectrode positioned among cells that responded best to stimulus motion in one direction, 225° on the polar plot. The microstimulation was applied for 1 s, beginning and ending with the onset and offset of the visual stimulus. Filled circles show the response of the monkey when the MT cells were stimulated at the same time the visual stimulus was presented. Stimulation increased the likelihood that the monkey would indicate seeing motion in the direction preferred by the stimulated MT cells (225°). (Adapted from Salzman and Newsome 1994 .)

If cells in MT are directly involved in the analysis of motion, the firing patterns of these neurons should affect perceptual judgments about motion. How well does the firing pattern of these neurons actually correlate with behavior? To address this question, William Newsome and Movshon recorded the activity of direction-selective neurons in MT while the monkeys reported the direction of motion in a random-dot display. Firing of the neurons correlated extremely well with performance. Thus the directional information encoded by MT neurons is sufficient to account for the monkey's judgment of motion.
If this inference is correct, then modifying the firing rates of the MT neurons should alter the monkey's perception of motion. In fact, Newsome found that stimulating clusters of neurons in a single column of cells sensitive to one direction of motion biases the monkey's judgment toward that direction of motion. The electrical stimulation acts as if a constant visual motion signal were added to the signal conveyed by the whole population of MT neurons ( Figure 28-10 ). Thus, the firing of a relatively small population of motion-sensitive neurons in MT directly contributes to perception.
Depth Vision Depends on Monocular Cues and Binocular Disparity
One of the major tasks of the visual system is to convert a two-dimensional retinal image into three dimensions. How is this transformation achieved? How do we tell how far one thing is from another? How do we estimate the relative depth of a three-dimensional object in the visual field? Psychophysical studies indicate that the shift from two to three dimensions relies on two types of cues: monocular cues for depth and stereoscopic cues based on binocular disparity.
Monocular Cues Create Far-Field Depth Perception
At distances greater than about 100 feet the retinal images seen by each eye are almost identical, so that looking at a distance we are essentially one-eyed. Nevertheless we can perceive depth with one eye by relying on a variety of tricks called monocular depth cues. Several of these monocular cues were appreciated by the artists of antiquity, rediscovered during the Renaissance, and codified early in the sixteenth century by Leonardo da Vinci.

  • Familiar size. If we know from experience something about the size of a person, we can judge the person's distance ( Figure 28-11A ).
  • Occlusion. If one person is partly hiding another person, we assume the person in front is closer ( Figure 28-11A ).
  • Linear perspective. Parallel lines, such as those of a railroad track, appear to converge with distance. The greater the convergence of lines, the greater is the impression of distance. The visual system interprets the convergence as depth by assuming that parallel lines remain parallel ( Figure 28-11A ).
  • Size perspective. If two similar objects appear different in size, the smaller is assumed to be more distant ( Figure 28-11A ).
  • Distribution of shadows and illumination. Patterns of light and dark can give the impression of depth. For example, brighter shades of colors tend to be seen as nearer. In painting this distribution of light and shadow is called chiaroscuro.
  • Motion (or monocular movement) parallax. Perhaps the most important of the monocular cues, this is not a static pictorial cue and therefore does not come to us from the study of painting. As we move our heads or bodies from side to side, the images projected by an object in the visual field move across the retina. Objects closer than the object we are looking at seem to move quickly and in the direction opposite to our own movement, whereas more distant objects move more slowly and in the same direction as our movement ( Figure 28-11B ).
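The direction and speed relations in motion parallax follow from simple geometry. As a rough sketch, assume an observer moving sideways at speed v while keeping the eyes on a fixation point at distance f. An object near the line of sight at distance d then moves across the retina at approximately v/d - v/f radians per second. The small-angle approximation, the sign convention, and the function name below are illustrative assumptions.

```python
def parallax_rate(object_m, fixation_m, observer_speed_m_s):
    """Approximate retinal angular velocity (rad/s) of an object near the line of sight.

    Positive values mean motion opposite to the observer's movement;
    negative values mean motion in the same direction as the observer.
    """
    return observer_speed_m_s * (1.0 / object_m - 1.0 / fixation_m)

# Observer moves sideways at 1 m/s while fixating a tree 10 m away.
for distance in (2.0, 5.0, 10.0, 20.0, 100.0):
    print(distance, parallax_rate(distance, 10.0, 1.0))
```

Objects nearer than the fixation point give positive values that grow rapidly as the object approaches, while objects beyond the fixation point give negative values whose magnitude never exceeds v/f.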

Figure 28-11 Monocular depth cues provide information on the relative distance of objects and have been used by painters since antiquity.
A. The upper drawing shows the side view of a scene. When the scene is traced on a plane of glass held between the eye and the scene (lower drawing) the resulting two-dimensional tracing reveals the cues needed to perceive depth. Occlusion: The fact that rectangle 4 interrupts the outline of 5 indicates which object is in front, but not how much distance there is between them. Linear perspective: Although lines 6-7 and 8-9 are parallel in reality, they converge in the picture plane. Size perspective: Because the two boys are similar figures, the smaller boy (2) is assumed to be more distant than the larger boy (1) in the picture plane. Familiar size: The man (3) and the nearest boy are drawn to nearly the same size in the picture. If we know that the man is taller than the boy, we deduce on the basis of their sizes in the picture that the man is more distant than the boy. This type of cue is weaker than the others. (Adapted from Hochberg 1968.)
B. Motion of the observer or sideways movement of head and eyes produces depth cues. If the observer moves to the left while looking at the tree, objects closer than the tree move to the right; those farther away move to the left. The full-field motion that results from the observer's own movement is referred to as optic flow (see Box 28-1 ). (Adapted from Busettini et al. 1996 .)

Stereoscopic Cues Create Near-Field Depth Perception
The perception of depth at distances less than 100 feet also depends on monocular cues but in addition is mediated by stereoscopic vision. Stereoscopic vision is possible because the two eyes are horizontally separated (by about 6 cm in humans) so that each eye views the world from a slightly different position. Thus, objects at different distances produce slightly different images on the two retinas. This can be clearly demonstrated by closing each eye in turn. As vision is switched from one to the other eye, any near object will appear to shift sideways.
Understanding stereopsis begins with an understanding of the simple geometry of the images falling on the retina. When we fixate on a point, the image of this point falls upon corresponding points on the center of the retina in each eye ( Figure 28-12 ). The point of focus is called the fixation point; the parallel (vertical) plane of points on which it lies is called the fixation plane. The distance of an image from the center of the two eyes allows the visual system to calculate the distance of the object relative to the fixation point. Any point on the object that is nearer or farther than the fixation point will project an image at some distance from the center of the retina. Parts of the object that are closer to us will be farther apart on the two retinas in a horizontal direction. Parts of the object that are farther from us will project closer together on the two retinas.
Clearly, the difference in position, called binocular disparity, depends on the distance of the object from the fixation plane. Thus points on a three-dimensional object just outside the fixation plane stimulate different points on each eye, and the multiple disparities provide cues for stereopsis, the perception of solid objects.
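The dependence of disparity on distance can be made concrete with a small calculation. The sketch below is illustrative only: it assumes points lying straight ahead on the midline, an interocular separation of 6.3 cm, and it takes the difference in vergence angle between an object and the fixation point as the measure of disparity. The function names and the sign convention are hypothetical.

```python
import math

def vergence_angle_deg(distance_m, interocular_m=0.063):
    """Angle (degrees) between the two lines of sight to a midline point."""
    return math.degrees(2.0 * math.atan(interocular_m / (2.0 * distance_m)))

def disparity_deg(object_m, fixation_m, interocular_m=0.063):
    """Disparity of an object relative to the fixation point, in degrees.

    Positive for objects nearer than the fixation point,
    negative for objects farther away, zero at the fixation distance.
    """
    return (vergence_angle_deg(object_m, interocular_m)
            - vergence_angle_deg(fixation_m, interocular_m))

# An object 10% nearer than the fixation point, at reading distance and far away.
print(disparity_deg(0.45, 0.5))    # fixation at 0.5 m
print(disparity_deg(27.0, 30.0))   # fixation at 30 m (about 100 feet)
```

With these assumptions the same proportional difference in depth yields a far smaller disparity at 30 m than at 0.5 m, consistent with the observation that stereoscopic cues contribute mainly to near-field depth perception.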
Surprisingly, not one of the great early students of optics (Euclid, Archimedes, Leonardo da Vinci, Newton, or Goethe) understood stereopsis, although each could readily have discovered it with the methods available to them. Stereoscopic vision was not discovered until 1838, when the physicist Charles Wheatstone invented the stereoscope. Two photographs of a scene, taken from positions 60-65 mm apart to match the position of each eye, are mounted into a binocular-like device such that the right eye sees only the picture taken from one position and the left eye sees only the other picture. Remarkably, this presentation produces a three-dimensional scene.


Figure 28-12 When we fix our eyes on a point the convergence of the eyes causes that point (the fixation point) to fall on identical portions of each retina. Cues for depth are provided by points just proximal or distal to the fixation point. These points produce binocular disparity by stimulating slightly different parts of the retina of each eye. When the lack of correspondence is in the horizontal direction only and is not greater than about 0.6 mm or 2° of arc, the disparity is perceived as a single, solid (three-dimensional) spot.

Information From the Two Eyes Is First Combined in the Primary Visual Cortex
How is stereopsis accomplished? Clearly the brain must somehow calculate the disparity between the images seen by the two eyes and then estimate distance based on simple geometric relations. However, this cannot occur before information from the two eyes comes together, and cells in the primary visual cortex (V1) are the first in the visual system to receive input from the two eyes ( Chapter 27 ). Stereopsis, however, requires that the inputs from the two eyes be slightly different: there must be a horizontal disparity in the two retinal images ( Figure 28-13 ). The important finding that certain neurons in V1 are actually selective for horizontal disparity was made in 1968 by Horace Barlow, Colin Blakemore, Peter Bishop, and Jack Pettigrew. They found that a neuron that prefers an oriented bar of light at one place in the visual field responds better when that stimulus appears in front of the screen (referred to as a near stimulus) or when the stimulus is beyond the screen (a far stimulus). There is thus an additional level of organization of information in the ocular dominance columns in V1.


Figure 28-13 Neuronal basis of stereoscopic vision. (Adapted from Ohzawa et al. 1996 .)
A. When an observer looks at point P the image P falls on corresponding points on the retina of each eye. These images completely overlap and therefore have zero binocular disparity. When looking at a point to the left and closer, point Q, the image Q in the left eye falls on the same point as P, but the image in the right eye is laterally displaced. These images have binocular disparity.
B. A cortical neuron receiving binocular inputs is maximally activated when the inputs from the two eyes have zero disparity as at P.
C. Another cortical neuron receiving binocular inputs responds best when the inputs from the two eyes are spatially disparate on the two retinas (Q); it is most sensitive to near stimuli.

Cells sensitive to binocular disparity are found in several cortical visual areas. In addition to V1, some cells in the extrastriate areas V2 and V3 respond to disparity, and many direction-selective cells in MT respond best to stimuli at specific distances, either at the plane of fixation or nearer or farther than the plane. Some cells in MST, the next step in the parietal pathway, fire in response to combinations of disparity and direction of motion. That is, the direction of motion preferred by the cell varies with the disparity of the stimulus. For example, a cell that responds to leftward-moving far stimuli might also respond to rightward-moving near stimuli. These cells can convey information not only about the direction of motion but about the direction of motion at different depths within the visual field (as in Figure 28-11B ).
Cells in the striate and extrastriate cortex that respond selectively to binocular disparity fall into several broad categories. Among these, tuned cells respond best to stimuli at a specific disparity, frequently on the plane of fixation. Other cells respond best to stimuli at a range of disparities either in front of the fixation plane (“near cells”) or beyond the plane (“far cells”) ( Figure 28-14 ).
Just as the motion information processed in MT is used both for the visual guidance of movement and for visual perception, disparity-sensitive cells in different regions of visual cortex may use disparity information for different purposes. One use is the perception of depth, which we have already considered. Another is in aligning the eyes to focus at a particular depth in the field. The eyes rotate toward each other (convergence) to focus on near objects and rotate apart (divergence) to focus on more distant objects. The ability to align the eyes develops in the first few months of life and disparity information may play a key role in establishing this alignment.
Random Dot Stereograms Separate Stereopsis From Object Vision
Must the brain recognize an object before it can match the corresponding points of the object in the two eyes? Until 1960 this was generally thought to be so, and stereopsis therefore was thought to be a late stage in visual processing. In 1960 Bela Julesz proved that this idea was wrong when he found that stereoscopic fusion and depth perception do not require monocular identification of form. The only cue necessary for stereopsis is retinal disparity.
To demonstrate this remarkable fact, Julesz created a pattern of randomly distributed dots in the middle of which is a square area of dots. He made two copies of the pattern, but in one copy the inner square of dots is positioned slightly differently from the other copy. The inner square of dots is visible only when the two copies of the pattern are viewed in a stereoscope. If one inner square is displaced so the two squares are closer together, in binocular view the square appears to lie in front of the pattern. If one inner square is shifted so the two squares are farther apart, the perceived square appears to lie behind the surrounding dots ( Figure 28-15 ). By itself, each random-dot pattern will not produce any depth cues. Only with stereoscopic vision can one see the square within the pattern. With this method, Julesz demonstrated that humans can detect form based strictly on binocular disparity.
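The construction of such a stereogram can be sketched in a few lines. The example below is a minimal illustration, not Julesz's original procedure: the image size, square size, shift, and function name are assumptions chosen for the example. Both images share one random background. In the right image the central square is displaced horizontally, and the strip it uncovers is refilled with fresh random dots so that neither image alone contains an outline.

```python
import random

def random_dot_stereogram(size=100, square=40, shift=4, seed=0):
    """Return (left, right) images as lists of rows of 0/1 dots.

    Positive `shift` moves the central square leftward in the right image,
    bringing the two squares closer together; negative moves it rightward.
    Assumes abs(shift) is smaller than the margin around the square.
    """
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(size)] for _ in range(size)]
    right = [row[:] for row in left]          # start from an identical copy
    start = (size - square) // 2              # top-left corner of the square
    for r in range(start, start + square):
        for c in range(start, start + square):
            right[r][c - shift] = left[r][c]  # displaced copy of the square
        if shift > 0:
            uncovered = range(start + square - shift, start + square)
        else:
            uncovered = range(start, start - shift)
        for c in uncovered:
            right[r][c] = rng.randint(0, 1)   # fresh dots hide the gap
    return left, right

left, right = random_dot_stereogram()
```

According to the description above, a positive shift (squares closer together) should make the square appear in front of the background when the pair is fused, and a negative shift should make it appear behind.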


Figure 28-14 Different disparity profiles are found in neurons in cortical visual areas of the monkey. The curves show the responses of six different neurons to bright bars of optimal orientation moving in the preferred direction across the receptive fields at a series of horizontal disparities. These different disparity profiles have been observed in many areas of the monkey visual cortex. The tuned cells are more common in areas V1 and V2, especially in the region of foveal representation, and the “near” and “far” cells are more common in MST. (After Poggio 1995 .)

Are there, among the disparity-sensitive neurons in the visual cortex, individual neurons that respond to a stereogram that contains no depth cues except retinal disparity? To answer this question, Gian Poggio first located responsive cells using a bar of light as a stimulus. He then replaced the bar with a random-dot pattern stereogram. Many of the neurons that responded to the solid figure also responded to the random-dot stereogram.
Object Vision Depends on the Ventral Pathway to the Inferior Temporal Lobe
The ventral cortical pathway extends from V1 through V2 to V4 and then to the inferior temporal cortex. We have already noted that V2 has subregions referred to as thick stripes, thin stripes, and interstripes and that the thin and interstripe regions project to V4. As we have indicated, the ventral pathway appears to be concerned with analysis of form and color. Here we will concentrate on the processing of form in V2, V4, and the inferior temporal cortex.
Cells in V2 Respond to Both Illusory and Actual Contours
As in V1, cells in V2 are sensitive to the orientation of stimuli, to their color, and to their horizontal disparity, and they continue the analysis of contour begun by cells in V1. Their response to contours was explored in experiments in which cells were tested for their sensitivity to certain illusory contours of the sort we considered in Chapter 25 .
Many cells in V2 responded to the illusory contours just as they responded to edges ( Figure 28-16 ). In contrast, few cells in V1 responded to the same illusory contours (although other experiments have shown responses of V1 cells to more limited illusory contours). These results suggest that V2 carries out an analysis of contours at a level beyond that of V1, and they are further evidence of the progressive abstraction that occurs in each of the two pathways of the visual system.


Figure 28-15 Stereopsis does not depend on perception of form.
A. A square form inside these identical random-dot displays cannot be seen by looking at either display alone. It can be seen only when the two identical images are viewed in a stereoscope, or by training the eyes to focus outside the image plane.
B. The square areas in the two random-dot patterns have different positions. The square becomes visible only because of the ocular disparity of the two dot patterns, not because either eye recognizes the form of the square.
C. In the stereoscope the random-dot images are placed behind a rectangular opening. If one inner square of dots is displaced so the left and right inner squares are closer together (1), the square is perceived in front of the larger pattern. If the inner squares are shifted so that the two squares are further apart (2), the square is perceived behind the larger pattern. (Adapted from Julesz 1971 .)

Cells in V4 Respond to Form
Initial observations on cells in V4 indicated that the cells were selective for color, and it was thought that they were devoted exclusively to color vision. However, many of these same cells are also sensitive to the orientation of bars of light and are more responsive to finer-grained than to coarse-grained stimuli. Thus, some V4 cells are responsive to combinations of color and form.
Does removal of V4 alter a monkey's responses to color more than to form? Experiments show that ablation of V4 impairs a monkey's ability to discriminate patterns and shapes but only minimally affects its ability to distinguish colors with different hues and saturation. In other experiments ablation of V4 altered only subtle color discriminations, such as the ability to identify colors under different illumination conditions (color constancy).
We have noted that some humans lose color vision (achromatopsia) after localized damage to the ventral occipital cortex. PET scans of normal human subjects reveal an increase in activity in the lingual and fusiform gyri when colored stimuli are presented (see Figure 28-5 ). The deficits in patients with achromatopsia differ from those in monkeys with lesions of V4. The human patients cannot discriminate hues but can discriminate shape and texture, whereas the monkeys' ability to differentiate shapes is markedly diminished while hue discrimination is only minimally affected. It therefore seems unlikely that the area identified in the human brain is directly comparable to the V4 region in the monkey, but instead includes more extended regions, including the inferior temporal cortex, the area we consider next.


Figure 28-16 Illusions of edges used to study the higher level information processing in V2 cells of the monkey.
A. Examples of illusory contours. 1. A white triangle is clearly seen, although it is not defined in the picture by a continuous border. 2. A vertical bar is seen, although again there is no continuous border. 3. Slight alterations obliterate the perception of the bar seen in 2. 4. The curved contour is not represented by any edges or lines. (From Von der Heydt et al. 1984 .)
B. A neuron in V2 responds to illusory contours. The cell's receptive field is represented by an ellipse in the drawings on the left. 1. A cell responds to a bar of light moving across its receptive field. Each dot in the record on the right indicates a cell discharge and successive lines indicate the cell's response to successive movements of the bar. 2. The neuron also responds when an illusory contour passes over its receptive field. 3, 4. When only half of the stimulus moves across the cell's receptive field, the response resembles spontaneous activity (5). (Adapted from Von der Heydt et al. 1984 .)

Recognition of Faces and Other Complex Forms Depends Upon the Inferior Temporal Cortex
We are capable of recognizing and remembering an almost infinite variety of shapes independent of their size or position on the retina. Clinical work in humans and experimental studies in monkeys suggest that form recognition is closely related to processes that occur in the inferior temporal cortex.
The response properties of cells in the inferior temporal cortex are those we might expect from an area involved in a later stage of pattern recognition. For example, the receptive field of virtually every cell includes the foveal region, where fine discriminations are made. Unlike cells in the striate cortex and many other extrastriate visual areas, the cells in the inferior temporal area do not have a clear retinotopic organization, and the receptive fields are very large and occasionally may include the entire visual field (both visual hemifields). Such large fields may be related to position invariance, the ability to recognize the same feature anywhere in the visual field. For example, even a small eye movement can easily move an edge stimulus from the receptive field of one V1 neuron to another. In contrast, such a movement would simply move the edge within the receptive field of one inferior temporal neuron. The larger receptive fields of neurons in many extrastriate regions, including the inferior temporal cortex, may be important in the ability to recognize the same object regardless of its location.
The most prominent visual input to the inferior temporal cortex is from V4, so it would not be surprising to see a continuation of the visual processing observed in V4. Inferior temporal cortex appears to have functional subregions and, like V4, may have separate pathways to these regions. Also like V4, inferior temporal cells are sensitive to both shape and color. Many cells in inferior temporal cortex respond to a variety of shapes and colors, although the strength of the response varies for different combinations of shape and color ( Figure 28-17 ). Other cells are selective only for shape or color.
Most interesting is the finding that some inferotemporal cells respond only to specific types of complex stimuli, such as the hand or face. For cells that respond to a hand, the individual fingers are a particularly critical visual feature; these cells do not respond when there are no spaces separating the fingers. However, all orientations of the hand elicit similar responses. Among neurons selective for faces, the frontal view of the face is the most effective stimulus for some, while for others it is the side view. Moreover, whereas some neurons respond preferentially to faces, others respond preferentially to specific facial expressions. Although the proportion of cells in the inferior temporal cortex responsive to hands or faces is small, their existence, together with the fact that lesions of this region lead to specific deficits in face recognition ( Chapter 25 ), indicates that the inferior temporal cortex is responsible for face recognition.


Figure 28-17 Many inferior temporal neurons respond both to form and color.
A. Average responses for a single neuron to stimuli with different shapes. The height of each bar indicates the average discharge rate during presentation of the stimulus. The dashed line indicates the background discharge rate.
B. Responses of the same neuron to colored stimuli. Discharge rates are indicated by the size of each circle. The open circle represents a discharge rate of 30 spikes/s. The responses are plotted on a color map with the relative location of colors, red, green, and blue given for reference. The axes are relative amounts of primary colors. (Adapted from Komatsu and Ideura 1993 .)

One of the major issues in understanding the brain's analysis of complex objects is the degree to which individual cells respond to the simpler components of these objects. Certain critical elements of faces are sufficient to activate some inferior temporal neurons. For example, instead of a face, two dots and a line appropriately positioned might activate the cell ( Figure 28-18 ). Other experiments suggest that some cells respond to facial dimensions (distance between the eyes) and others to the familiarity of the face. There is also evidence that cells responding to similar features are organized in columns.
Visual Attention Facilitates Coordination Between Separate Visual Pathways
The limited capacity of the visual system means that at any given time only a fraction of the information available from the visual scene falling on the two retinas can be processed. Thus some information is used to produce perception and movement while other information is lost or discarded. This selective filtering of visual information is achieved by visual attention. As may be appreciated from the evidence presented in this and earlier chapters, understanding the neuronal mechanisms of attention and conscious awareness is one of the great unresolved problems in perception. Can we resolve these mechanisms and understand their contribution to behavior? How does attention alter the processing of visual information?
Investigation of spatial attention at the neuronal level began in the 1970s with exploration of the cellular basis of visual attention in the superior colliculus, the striate cortex (V1), and the posterior parietal cortex of awake primates (see Figure 20-15 ). Michael Goldberg and Robert Wurtz examined the response of cells to a spot of light under two conditions: (1) when the monkey looked elsewhere and did not attend to the location of the spot, and (2) when the animal was required to fix its gaze on the spot of light by making rapid or saccadic eye movements to the spot. When the animal attended to the spot, cells in the superior colliculus responded more intensely, while the response of cells in V1 showed little modulation. However, the enhanced response of the cells in the superior colliculus did not result from selective attention per se but was dependent upon the initiation of eye movement. In similar tests of the responsiveness of cells in the posterior parietal cortex, a region known from clinical studies to be involved in attention ( Chapter 20 ), the cells' responses were enhanced whether the monkey made an eye movement to the visual target or reached for it ( Box 28-2 ).
The effects of attention on cells in V4 and inferior temporal cortex were next determined by Robert Desimone and his colleagues by presenting two stimuli, both falling in the receptive field of one cell. The experimenters found that they could turn a neuron on or off depending on whether they required the monkey to attend to one of the stimuli. The stimulus remained the same between trials; only the monkey's attention shifted.


Figure 28-18 Response of a neuron in the inferior temporal cortex to complex stimuli. The cell responds strongly to the face of a toy monkey (A). The critical features producing the response are revealed in a configuration of two black spots and one horizontal black bar arranged on a gray disk (B). The bar, spots, and circular outline together were essential, as can be seen by the cell's responses to images missing one or more of these features (C, D, E, F). The contrast between the inside and outside of the circular contour was not critical (G). However, the spots and bar had to be darker than the background within the outline (H). (i = spikes.) (Modified from Kobatake and Tanaka 1994 .)

Since attention is the selection of one stimulus from among many, it would be reasonable to expect that the effect of attention on cell responsiveness would increase with the number of stimuli presented. In one experiment a monkey was required to focus attention on one stimulus in a group of six to eight identical stimuli. The responses of one-third of the neurons in V4 were altered when the monkey's attention shifted to one stimulus. Activity in most of these same neurons was not altered as much when the monkey had to choose from only two identical stimuli. Furthermore, increasing the number of stimuli also enhanced the responses of neurons in earlier stages of the visual pathways, in V2 and V1. As the demand for selection among visual targets increases, so does the relative effect of attention.
Changes in cellular activity also occur when the focus of attention is a specific object rather than a location. In one set of experiments a monkey was cued to select an object, or the color or shape of an object, and then required to select a similar object from among a set of objects presented either simultaneously or in series. Remarkably, presentation of the matching object can have a greater effect on a neuron's response than the sample stimulus that is present. In one of these experiments the cells in V4 responded more vigorously when the color of the matching object was the same as the cue ( Figure 28-19 ). During the search for matching stimuli the activity of neurons in the ventral pathway and inferior temporal cortex is modified. In the dorsal pathway the activity of cells in area 7A, MST, and MT is also modified, particularly when multiple stimuli fall within the receptive field of a cell.
The Binding Problem in the Visual System
We have seen that information about motion, depth, form, and color is processed in many different visual areas and organized into at least two cortical pathways. How can such distributed processing lead to cohesive perceptions? When we see a red ball we combine into one perception the sensations of color (red), form (round), and solidity (ball). We can equally well combine red with a square box, a pot, or a shirt. The possible combination of elements is so great that the existence of an individual feature-detecting cell for each set of combinations is improbable.
Instead, as we have seen in this chapter, the evidence is strongly in favor of a constructive process by which complex visual images are built up at successively higher processing centers. Is there a “final common pathway” where all the elements of a complex percept are brought together? Or do the distributed afferent pathways interact in some continuous fashion to produce coherent percepts? There is as yet no satisfactory solution to the binding problem, the problem of how consciousness of an ongoing, coherent experience emerges from the information processing being conducted independently in different cortical areas.
As described in Chapter 25 , Anne Treisman and Bela Julesz independently showed that the associative process by which multiple features of one object are brought together in a coherent percept requires attention. They suggested that different properties are encoded in different feature maps during a preattentive stage of perception and that attention selects specific features in these different maps and ties them together (as illustrated in Figure 25-15 ).
Box 28-2 Parietal Cortex and Movement
The dorsal visual path extends to the posterior parietal cortex, which, based on clinical observations of patients with parietal damage, is known to be involved in the representation of the visual world and the planning of movement. Recent studies of neurons in the parietal cortex of monkeys have revealed several functionally distinct areas, which may account for the varied deficits following damage to the parietal cortex. The activity in most of these areas is related to the transition from sensory processing to the generation of movement.
Neurons in one of these subregions, the lateral intraparietal area, fire in connection with saccadic eye movements ( Chapter 39 ). These neurons fire in response to a visual target, before a saccade to the target, and increase their activity just before the beginning of the saccade, indicating that the activity in these neurons is related both to the visual input and to the motor output of the brain. Between the sensory and motor events, continuing activity in these neurons depends on the condition under which the saccade is made, such as whether the saccade is made to a visual stimulus or the location of a remembered stimulus. Thus, although activity in these neurons is closely associated with the transition from sensory perception to motor movement, it is not exclusively related to one or the other.
These neurons also clearly receive information more complex than either purely sensory or purely motor information. Many neurons respond differently to the same visual stimulus, depending upon where in space the eyes and head are oriented, indicating that they receive input about eye position as well as the visual stimulus. Such neurons might be involved in shifting the frame of reference in which sensory information is processed (from eye to head to body; see Box 25-1 ), a shift that is necessary to control movements such as reaching. We therefore have strong evidence that parietal neurons are involved in putting visual information in the service of the motor systems and in compensating for the disruption to vision that results from such movement.
A related view of the effect of attention on the binding problem recently was advanced by John Reynolds and Robert Desimone. They based their interpretation on two observations already described in this chapter: neurons have larger and larger receptive fields at higher levels in the cortical visual pathways and attention to one of several stimuli falling in one of these large receptive fields increases the response to that stimulus. They assume that attention acts to increase the competitive advantage of the attended stimulus so that the effect of attention is to shrink the effective size of the field around the attended stimulus. Now instead of many stimuli with different characteristics such as color and form, only the one stimulus is functionally present in the receptive field. Because the effective receptive field now just includes that one stimulus, all the characteristics of the stimulus are effectively bound together.
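This competitive account can be sketched numerically. The example below assumes, purely for illustration, that a neuron's response to two stimuli in its receptive field is a weighted average of its responses to each stimulus alone, and that attention multiplies the weight of the attended stimulus. The weighting rule, the gain value, and the firing rates are assumptions chosen for the example, not values taken from the experiments.

```python
def pair_response(rate_a, rate_b, attend=None, gain=5.0):
    """Response (spikes/s) to stimuli A and B shown together in one receptive field.

    rate_a, rate_b: responses to each stimulus presented alone.
    attend: None, "a", or "b" (which stimulus, if any, is attended).
    gain: hypothetical factor by which attention boosts a stimulus's weight.
    """
    weight_a = gain if attend == "a" else 1.0
    weight_b = gain if attend == "b" else 1.0
    return (weight_a * rate_a + weight_b * rate_b) / (weight_a + weight_b)

preferred, poor = 40.0, 10.0
print(pair_response(preferred, poor))              # 25.0, midway between the two
print(pair_response(preferred, poor, attend="a"))  # 35.0, pulled toward A alone
print(pair_response(preferred, poor, attend="b"))  # 15.0, pulled toward B alone
```

As the gain grows, the response to the pair approaches the response to the attended stimulus alone, which is the sense in which the effective receptive field shrinks around that stimulus.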
Another approach to the binding problem has been emphasized by Charles Gray, Wolfgang Singer, Reinhold Eckhorn, and their colleagues. They found that when an object activates a population of neurons in the visual cortex, the neurons tend to oscillate and fire in unison. They suggest that these oscillations are indicative of a synchrony among cells and that this synchrony of firing would bind together the activity of cells responding to different features of the same object. To combine the visual features (color, form, motion) of the same object, the synchrony between neurons would, according to this view, extend across neurons in different cortical areas.
Quite a different solution to the binding problem was proposed by Lance Optican, Barry Richmond, and their colleagues. They found that neurons at stages from the lateral geniculate nucleus to the inferior temporal cortex convey more information if the temporal pattern of their discharge is considered. Instead of measuring the total number of spikes in a time period, they measured the distribution of the spikes in that time period and found that different stimulus features (eg, form, contrast, color) tended to be represented by different response patterns of the same cell. They propose that the pattern of discharge in each cell carries information about different features so that the problem of binding across cells, each representing a different feature, is eliminated. Cells in different areas would all convey some information about a number of stimulus features, but different cells would carry comparatively more or less about each feature.
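The difference between counting spikes and examining their distribution in time can be made concrete with a small sketch. It is not the authors' analysis, only a simplified illustration: the spike times are invented, and the window and bin widths are arbitrary assumptions.

```python
# Toy comparison of a spike-count code with a temporal-pattern code.
# Assumption: spike times (ms after stimulus onset) are invented examples.

def spike_count(spike_times_ms):
    """Total number of spikes, ignoring when they occurred."""
    return len(spike_times_ms)

def binned_pattern(spike_times_ms, window_ms=300, bin_ms=100):
    """Number of spikes in each successive time bin of the response window."""
    n_bins = window_ms // bin_ms
    counts = [0] * n_bins
    for t in spike_times_ms:
        if 0 <= t < window_ms:
            counts[int(t // bin_ms)] += 1
    return counts

# Two hypothetical responses of the same cell to two different stimuli.
early_burst = [10, 20, 30, 40, 250, 280]
late_burst = [15, 120, 210, 230, 250, 270]

same_count = spike_count(early_burst) == spike_count(late_burst)
different_pattern = binned_pattern(early_burst) != binned_pattern(late_burst)
```

Here the two responses are indistinguishable by spike count but differ in their temporal pattern, which is the sense in which the temporal distribution of spikes can carry additional information about the stimulus.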
Thus, while several solutions to the binding problem have been proposed, it still remains one of the central unsolved puzzles in our understanding of the neurobiological bases of perception.


Figure 28-19 The response of V4 neurons to an effective visual stimulus is modified by selective attention.
A. A monkey was trained to shift its attention to one set of stimuli as opposed to another. At the start of each trial (initial fixation) the monkey was trained to look at a fixation point on a screen (the dot in the square). At this point in the experiment the receptive field of a V4 cell has been located (dotted circle). On any given trial the fixation point was either red (upper row) or green (bottom row). Then six other stimuli came on, one of which fell into the receptive field of the cell (stimulus presentation). The monkey knew from prior training that it would be required to discriminate only those stimuli that were the same color as the fixation point: the three red stimuli in the top row or the three green in the bottom row. The assumption in the experiment is that this requires the monkey to attend to the appropriate three stimuli. If one of those stimuli is in the receptive field of the cell, as in the match trials (top row), the monkey presumably is paying attention to that stimulus in the receptive field. If those stimuli with the same color as the fixation point lie outside the receptive field, as in the nonmatch trials (bottom row), the monkey presumably is attending to the stimuli outside the receptive field. The responses of the V4 neuron to these match and nonmatch trials were compared. Note that the stimulus falling on the receptive field of the cell is the same in the match and nonmatch trials; only its significance for the monkey has changed. In the last phase of a trial (discrimination) only two stimuli remain on, and the monkey, in order to obtain a reward, must indicate whether the matched stimulus is tilted to the right or to the left.
B. Increased response to the same visual stimulus falling on the receptive field of a V4 cell during the match (upper record) as opposed to the nonmatch (lower record) trials. Each line represents a successive trial, and each dot indicates the discharge of the neuron. The vertical tick marks indicate the monkey's behavioral response. While the match and nonmatch trials are shown separately, they were interleaved during the experiment. (After Motter 1994 .)

An Overall View
Much like the somatic sensory system, the visual system consists of several parallel pathways, not a single serial pathway. The M and P pathways pass from the retina, through the magnocellular and parvocellular layers of the lateral geniculate nucleus, to layer 4C of the primary visual cortex (V1), where they feed into parallel pathways extending through the cerebral cortex. A dorsal pathway extends from V1, through areas MT and MST, to the posterior parietal cortex. A ventral pathway extends from V1, through V4, to the inferior temporal cortex. The parietal pathway appears to be dominated by the M input, but the inferior temporal pathway depends upon both the P and the M input.
Several factors lead to the conclusion that these pathways serve different functions much as do the submodalities for somaesthesis: The anatomical connections along these two pathways, differences in neuronal activity, the behavioral deficits occurring after damage to the terminal areas of the pathways in both humans and monkeys, and the activity detected in the human brain during tasks that should differentially activate the two pathways. One view is that the dorsal or posterior parietal pathway is concerned with determining where an object is, whereas the ventral or inferior temporal pathway is involved in recognizing what the object is. Another view is that the dorsal pathway leads to action, the ventral to perception. But all agree that the function of the two pathways is different.
We have concentrated on the neuronal mechanisms mediating motion and depth information in the dorsal posterior parietal pathway and on form perception in the ventral inferior temporal pathway. Both pathways represent hierarchies for visual processing that lead to greater abstraction at successive levels. Neurons in MT respond to the motion of a patterned stimulus, whereas cells in V1 respond only to motion of the elements of a pattern. Neurons in inferior temporal cortex respond to a given shape at any position in large areas of the visual field, whereas simple cells in V1 respond only when an edge is positioned at one location in the field. In addition, cellular responses along the pathways tend to become increasingly dependent on the stimulus characteristic selected for attention. In a visual search, the remembered object can have more effect on the response of neurons in V4 and the inferior temporal cortex than the stimulus that is present.
While we have considered separately the visual processing for motion, depth, and form, these parallel pathways may not be mutually exclusive. Some processing combines the activity of the pathways. For example, form can be seen when the only cue is the coherent motion of components of the scene (which is regarded as the purview of the parietal pathway). Likewise, some MT cells respond to the motion of an edge defined only by color (a property that should be conveyed by the inferior temporal pathway). Thus, at both a perceptual and a physiological level, cross talk between the posterior parietal and inferior temporal pathways must occur, as is also indicated by the anatomical evidence for cross connections.
We know the outline of the steps the brain takes in constructing complex visual images from the pattern of light and dark falling on the retina—the early processing along the M and P pathways and the later, more abstract processing in the dorsal posterior parietal and ventral inferior temporal pathways. But these steps remain only an outline. Many cortical areas must still be explored, and critical details of visual processing are only beginning to be understood.
Selected Readings
Andersen RA, Snyder LH, Bradley DC, Xing J. 1997. Multiple representations of space in the posterior parietal cortex and its use in planning movement. Annu Rev Neurosci 20:303–330.
Felleman DJ, Van Essen DC. 1991. Distributed hierarchical processing in primate cerebral cortex. Cereb Cortex 1:1–47.
Ferrera VP, Nealey TA, Maunsell JH. 1994. Responses in macaque visual area V4 following inactivation of the parvocellular and magnocellular LGN pathways. J Neurosci 14:2080–2088.
Hubel DH. 1988. Eye, Brain, and Vision. New York: Scientific American Library.
Julesz B. 1971. Foundations of Cyclopean Perception. Chicago: University of Chicago Press.
Livingstone MS, Hubel DH. 1987. Psychophysical evidence for separate channels for the perception of form, color, movement, and depth. J Neurosci 7:3416–3468.
Maunsell JH, Newsome WT. 1987. Visual processing in monkey extrastriate cortex. Annu Rev Neurosci 10:363–401.
Merigan WH, Maunsell JH. 1993. How parallel are the primate visual pathways? Annu Rev Neurosci 16:369–402.
Miyashita Y. 1993. Inferior temporal cortex: where visual perception meets memory. Annu Rev Neurosci 16:245–263.
Poggio GF. 1995. Mechanisms of stereopsis in monkey visual cortex. Cereb Cortex 5:193–204.
Salzman CD, Britten KH, Newsome WT. 1990. Cortical microstimulation influences perceptual judgements of motion direction. Nature 346:174–177.
Singer W, Gray CM. 1995. Visual feature integration and the temporal correlation hypothesis. Annu Rev Neurosci 18:555–586.
Stoner GR, Albright TD. 1993. Image segmentation cues in motion processing: implications for modularity in vision. J Cogn Neurosci 5:129–149.
Tanaka K. 1996. Inferotemporal cortex and object vision. Annu Rev Neurosci 19:109–139.
References
Albright TD, Desimone R, Gross CG. 1984. Columnar organization of directionally selective cells in visual area MT of the macaque. J Neurophysiol 51:16–31.
Baizer JS, Ungerleider LG, Desimone R. 1991. Organization of visual inputs to the inferior temporal and posterior parietal cortex in macaques. J Neurosci 11:168–190.
Baker CL, Hess RF, Zihl J. 1991. Residual motion perception in a “motion-blind” patient, assessed with limited-lifetime random dot stimuli. J Neurosci 11:454–461.
Barlow HB, Blakemore C, Pettigrew JD. 1967. The neural mechanism of binocular depth discrimination. J Physiol (Lond) 193:327–342.
Bishop PO, Pettigrew JD. 1986. Neural mechanisms of binocular vision. Vision Res 26:1587–1600.
Brewster D. 1856. The Stereoscope, Its History, Theory and Construction. London: John Murray.
Britten KH, Shadlen MN, Newsome WT, Movshon JA. 1992. The analysis of visual motion: a comparison of neuronal and psychophysical performance. J Neurosci 12:4745–4765.
Busettini C, Masson GS, Miles FA. 1996. A role for stereoscopic depth cues in the rapid visual stabilization of the eyes. Nature 380:342–345.
Desimone R, Wessinger M, Thomas L, Schneider W. 1990. Attentional control of visual perception: cortical and subcortical mechanisms. Cold Spring Harbor Symp Quant Biol 55:963–971.
DeYoe EA, Felleman DJ, Van Essen DC, McClendon E. 1994. Multiple processing streams in occipitotemporal visual cortex. Nature 371:151–154.
Duffy CJ, Wurtz RH. 1995. Response of monkey MST neurons to optic flow stimuli with shifted centers of motion. J Neurosci 15:5192–5208.
Duhamel J-R, Colby CL, Goldberg ME. 1992. The updating of the representation of visual space in parietal cortex by intended eye movements. Science 255:90–92.
Dürsteler MR, Wurtz RH, Newsome WT. 1987. Directional pursuit deficit following lesions of the foveal representation within the superior temporal sulcus of the macaque monkey. J Neurophysiol 57:1262–1287.
Eckhorn R, Bauer R, Jordan W, Brosch M, Kruse W, Munk M, Reitboeck HJ. 1988. Coherent oscillations: a mechanism for feature linking in the visual cortex. Biol Cybern 60:121–130.
Escher MC. 1971. The Graphic Work of M. C. Escher. New rev. and exp. ed. New York: Ballantine Books.
Fox JC, Holmes G. 1926. Optic nystagmus and its value in the localization of cerebral lesions. Brain 49:333–371.
Gibson JJ. 1950. The Perception of the Visual World. Boston: Houghton Mifflin.
Graziano M, Andersen R, Snowden R. 1994. Tuning of MST neurons to spiral motions. J Neurosci 14:54–56.
Haenny PE, Maunsell JH, Schiller PH. 1988. State dependent activity in monkey visual cortex II. Retinal and extraretinal factors in V4. Exp Brain Res 69:245–259.
Hasselmo ME, Rolls ET, Baylis GC. 1989. The role of expression and identity in face-selective response of neurons in the temporal visual cortex of the monkey. Behav Brain Res 32:203–218.
Heywood CA, Cowey A, Newcombe F. 1994. On the role of parvocellular (P) and magnocellular (M) pathways in cerebral achromatopsia. Brain 117:245–254.
Heywood CA, Gadotti A, Cowey A. 1992. Cortical area V4 and its role in the perception of color. J Neurosci 12:4056–4065.
Hochberg JE. 1978. Perception, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall.
Horton JC. 1984. Cytochrome oxidase patches: a new cytoarchitectonic feature of monkey visual cortex. Philos Trans R Soc Lond B 304:199–253.
Julesz B. 1986. Stereoscopic vision. Vision Res 26:1601–1612.
Kobatake E, Tanaka K. 1994. Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex. J Neurophysiol 71:856–867.
Komatsu H, Ideura Y. 1993. Relationships between color, shape, and pattern selectivities of neurons in the inferior temporal cortex of the monkey. J Neurophysiol 70:677–694.
Lueschow A, Miller EK, Desimone R. 1994. Inferior temporal mechanisms for invariant object recognition. Cereb Cortex 4:523–531.
Malpeli JG, Schiller PH, Colby CL. 1981. Response properties of single cells in monkey striate cortex during reversible inactivation of individual lateral geniculate laminae. J Neurophysiol 46:1102–1119.
Maunsell JH, Nealey TA, DePriest DD. 1990. Magnocellular and parvocellular contributions to responses in the middle temporal visual area (MT) of the macaque monkey. J Neurosci 10:3323–3334.
Maunsell JH, Sclar G, Nealey TA, DePriest DD. 1991. Extraretinal representations in area V4 in the macaque monkey. Vis Neurosci 7:561–573.
McClurkin JW, Zarbock JA, Optican LM. 1994. Temporal codes for colors, patterns, and memories. In: A Peters, KS Rockland (eds). Cerebral Cortex. Vol. 10, Primary Visual Cortex in Primates, pp. 443–467. New York: Plenum.
Moran J, Desimone R. 1985. Selective attention gates visual processing in the extrastriate cortex. Science 229:782–784.
Morrow MJ, Sharpe JA. 1993. Retinotopic and directional deficits of smooth pursuit initiation after posterior cerebral hemispheric lesions. Neurology 43:595–603.
Motter BC. 1993. Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. J Neurophysiol 70:909–919.
Motter BC. 1994. Neural correlates of attentive selection for color or luminance in extrastriate area V4. J Neurosci 14:2178–2189.
Movshon JA. 1990. Visual processing of moving images. In: H Barlow, C Blakemore, M Weston-Smith (eds). Images and Understanding: Thoughts About Images; Ideas About Understanding, pp. 122–137. New York: Cambridge Univ. Press.
Movshon JA, Adelson EH, Gizzi MS, Newsome WT. 1985. The analysis of moving visual patterns. In: C Chagas, R Gattass, C Gross (eds). Pattern Recognition Mechanisms, pp. 117–151. New York: Springer-Verlag.
Nealey TA, Maunsell JH. 1994. Magnocellular and parvocellular contributions to the responses of neurons in macaque striate cortex. J Neurosci 14:2069–2079.
Newsome WT, Pare EB. 1988. A selective impairment of motion perception following lesions of the middle temporal visual area (MT). J Neurosci 8:2201–2211.
Ohzawa I, DeAngelis GC, Freeman RD. 1996. Encoding of binocular disparity by simple cells in the cat's visual cortex. J Neurophysiol 75:1779–1805.
Optican LM, Richmond BJ. 1987. Temporal encoding of two-dimensional patterns by single units in primate inferior temporal cortex. III. Information theoretic analysis. J Neurophysiol 57:162–178.
Perrett DI, Mistlin AJ, Chitty AJ. 1987. Visual neurones responsive to faces. Trends Neurosci 10:358–364.
Poggio GF. 1989. Neural responses serving stereopsis in the visual cortex of the alert macaque monkey: position-disparity and image-correlation. In: JS Lund (ed). Sensory Processing in the Mammalian Brain: Neural Substrates and Experimental Strategies, pp. 226–241. New York: Oxford Univ. Press.
Reynolds JH, Desimone R. 1999. The role of neural mechanisms of attention in solving the binding problem. Neuron: In press.
Roy J-P, Komatsu H, Wurtz RH. 1992. Disparity sensitivity of neurons in monkey extrastriate area MST. J Neurosci 12:2478–2492.
Salzman CD, Murasugi CM, Britten KH, Newsome WT. 1992. Microstimulation in visual area MT: effects on direction discrimination performance. J Neurosci 12:2331–2355.
Salzman CD, Newsome WT. 1994. Neural mechanisms for forming a perceptual decision. Science 264:231–237.
Tootell RB, Hamilton SL. 1989. Functional anatomy of the second visual area (V2) in the macaque. J Neurosci 9:2620–2644.
Treisman A. 1986. Features and objects in visual processing. Sci Am 255(5):114B–125.
Treue S, Maunsell JH. 1996. Attentional modulation of visual motion processing in cortical areas MT and MST. Nature 382:539–541.
Ullman S. 1986. Artificial intelligence and the brain: computational studies of the visual system. Annu Rev Neurosci 9:1–26.
Von der Heydt R, Peterhans E. 1989. Mechanisms of contour perception in monkey visual cortex. I. Lines of pattern discontinuity. J Neurosci 9:1731–1748.
Von der Heydt R, Peterhans E, Baumgartner G. 1984. Illusory contours and cortical neuron responses. Science 224:1260–1262.
Wong-Riley MTT, Carrol EW. 1984. Quantitative light and electron microscopic analysis of cytochrome oxidase-rich zones in VII prestriate cortex of the squirrel monkey. J Comp Neurol 222:18–37.
Wurtz RH, Goldberg ME, Robinson DL. 1982. Brain mechanisms of visual attention. Sci Am 246(6):124–135.
Yoshioka AT, Levitt JB, Lund JS. 1994. Independence and merger of thalamocortical channels within macaque monkey primary visual cortex: anatomy of interlaminar projections. Vis Neurosci 11:467–489.
Zeki SM. 1976. The functional organization of projections from striate to prestriate visual cortex in the rhesus monkey. Cold Spring Harbor Symp Quant Biol 40:591–600.
Zeki S, Shipp S. 1988. The functional logic of cortical connections. Nature 335:311–317.
Zeki S, Watson JD, Lueck CJ, Friston KJ, Kennard C, Frackowiak RS. 1991. A direct demonstration of functional specialization in human visual cortex. J Neurosci 11:641–649.
Zihl J, von Cramon D, Mai N, Schmid C. 1991. Disturbance of movement vision after bilateral posterior brain damage. Further evidence and follow-up observations. Brain 114:2235–2252.