Do different ‘magnocellular tasks’ probe the same neural substrate?

Patrick T. Goodbourn, Jenny M. Bosten, Ruth E. Hogg, Gary Bargary, Adam J. Lawrance-Owen, J. D. Mollon


The sensory abnormalities associated with disorders such as dyslexia, autism and schizophrenia have often been attributed to a generalized deficit in the visual magnocellular–dorsal stream and its auditory homologue. To probe magnocellular function, various psychophysical tasks are often employed that require the processing of rapidly changing stimuli. But is performance on these several tasks supported by a common substrate? To answer this question, we tested a cohort of 1060 individuals on four ‘magnocellular tasks’: detection of low-spatial-frequency gratings reversing in contrast at a high temporal frequency (so-called frequency-doubled gratings); detection of pulsed low-spatial-frequency gratings on a steady luminance pedestal; detection of coherent motion; and auditory discrimination of temporal order. Although all tasks showed test–retest reliability, only one pair shared more than 4 per cent of variance. Correlations within the set of ‘magnocellular tasks’ were similar to the correlations between those tasks and a ‘non-magnocellular task’, and there was little consistency between ‘magnocellular deficit’ groups comprising individuals with the lowest sensitivity for each task. Our results suggest that different ‘magnocellular tasks’ reflect different sources of variance, and thus are not general measures of ‘magnocellular function’.

1. Introduction

Multiple parallel pathways within the mammalian visual system relay information from the retina to higher cortical areas. Among the most celebrated is the magnocellular–dorsal pathway, which draws input from the retinal parasol cells that project to the ventral two magnocellular layers of the lateral geniculate nucleus (LGN). From the LGN, it continues to layers 4Cα and 4B of the primary visual cortex. Efferent projections from layer 4B proceed via the cytochrome oxidase thick stripes of area V2 to the middle temporal and related areas, and extend to the posterior parietal cortex [1]. Despite evidence of interaction between streams at subcortical and cortical levels [2,3], the classical view of a single magnocellular–dorsal pathway remains pervasive.

In an influential publication, Schiller et al. [4] reported that lesions to the magnocellular layers of the LGN led to a characteristic profile of psychophysical deficits in macaques. The monkeys were impaired in the detection of motion and of luminance flicker, whereas the processing of colour, texture and pattern was spared. Lesions to the parvocellular layers produced the opposite profile: colour, texture and pattern discrimination were disrupted, while motion and flicker detection were spared. These findings, suggesting a critical and dissociable role for the magnocellular pathway in processing transient stimuli, were consistent with known physiological properties of magnocellular neurons—in particular their selectivity for high temporal frequencies [5].

In the two decades since Schiller et al. published their findings, it has often been assumed that psychophysical tasks requiring fine temporal processing, like those revealing deficits following a magnocellular lesion, provide a useful measure of ‘magnocellular function’. This assumption has been particularly common in the clinical literature, and various ‘magnocellular tasks’ have been used to assess the functional integrity of the magnocellular–dorsal pathway in several psychological conditions. Most prominent in this field are the controversial magnocellular deficit theory of dyslexia [6,7] and the related dorsal stream vulnerability hypothesis of developmental disorders [8,9]. Psychophysical attempts to assess magnocellular–dorsal function have been made in relation to dyslexia [10–14], dyspraxia [15], dyscalculia [16], autism spectrum disorder [17], Williams syndrome [18], fragile X syndrome [19], schizophrenia [20], Parkinson's disease [21], migraine [22] and glaucoma [23]. However, some authors have questioned whether the same brain functions are probed by the different tasks: for example, Dakin & Frith [24] and Pellicano & Gibson [25] distinguish sensitivity to flicker contrast and sensitivity to coherent motion as assessing lower subcortical and higher cortical levels of the dorsal pathway, respectively.

In the present study, we examined whether different putative measures of magnocellular function are consistent in their ranking of a cohort of 1060 individuals, and thus whether performance on the tasks is determined by common factors. We selected three representative visual tasks, all of which have been claimed to assess magnocellular–dorsal function. The frequency-doubled grating task [11,13,21] (so called because the dominant spatial frequency of the suprathreshold percept is double the actual spatial frequency) measures luminance contrast threshold for detecting a grating of low spatial frequency that reverses in contrast at a high temporal frequency (figure 1a). The steady-pedestal grating task [20,23,26,27] measures luminance contrast threshold for detecting a grating of low spatial frequency, presented in a brief pulse on a steady luminance pedestal (figure 1b). The third visual task, the coherent motion task [10,12,14–19,28], measures coherence threshold for detecting the primary direction of motion in a random dot kinematogram (figure 1c).

Figure 1.

Stimuli used in the study. (a–c) Single frames of stimuli used in the frequency-doubled grating task, the steady-pedestal grating task and the coherent motion task, respectively; (d) normalized power spectrogram of a tone sequence used in the auditory temporal order task; and (e) stimulus used in the ‘non-magnocellular’ short-wave cone task.

Functional subdivisions exist in sensory modalities other than vision, and psychophysical tasks have been devised to target the auditory homologue of the visual magnocellular system. It has been suggested that a common factor could govern the development of large neurons throughout the brain; thus, a generalized magnocellular deficit should also affect the processing of transient auditory stimuli [6,7,29]. Accordingly, we included in the present study an auditory temporal order task [30,31], which measures threshold for discriminating the order of two auditory tones embedded in a rapidly presented sequence (figure 1d). Finally, we included a control task that is not linked to magnocellular–dorsal function: the short-wave cone task, which measures threshold for detecting targets defined by a spatial decrement in short-wave cone excitation (figure 1e). We minimized procedural differences between tasks by employing a common forced choice paradigm, with stimulus level dictated by two interleaved adaptive staircases.

2. Methods

(a) Participants

Participants (n = 1060, 647 female) were of European descent and aged from 16 to 40 years (mean = 22 years, s.d. = 4 years). All were inexperienced in psychophysical observation and naive to the particular aims of the experiments. They were paid £25 to complete a battery of psychophysical tasks lasting about 2.5 h. All gave informed consent. Participants were refracted to their best corrected visual acuity (all less than or equal to 0.00 logMAR). A randomly selected subsample (n = 105, 66 female) returned for a second identical session at least one week later.

(b) Apparatus

Experiments were conducted in a darkened room. All stimuli were generated using Matlab R2007b software with PsychToolbox-3 [32,33] or CRS Toolbox for Matlab (Cambridge Research Systems). Responses were collected using a two- or four-button hand-held box. Visual stimuli were displayed via a specialized video processor (BITS++ or VSG 2/3; Cambridge Research Systems) on a gamma-corrected Sony Trinitron monitor operating at 100 Hz. Observers viewed stimuli monocularly using their preferred eye, or—if the difference in visual acuity between eyes was 0.10 logMAR or greater—using the eye with better acuity. They used a headrest to maintain a viewing distance of 0.5 m (1.0 m for the steady-pedestal grating and short-wave cone tasks). Auditory stimuli were played binaurally via an M-Audio Fast Track USB sound card at a 48 kHz sample rate through Sennheiser HD205 circumaural stereo headphones.

(c) Stimuli

(i) Frequency-doubled grating task

Stimuli were vertically oriented, low-spatial-frequency Gabors (fS = 0.5 c deg⁻¹, σ = 2.0°, ϕ randomized) reversing in contrast according to a 25 Hz square wave (figure 1a). They were centred 9.0° to either the left or right of fixation, and displayed for 500 ms with contrast ramped on and off over 120 ms. Mean luminance was 30 cd m⁻². The participant's task was to identify the location (left or right) of the stimulus. Luminance contrast was varied adaptively between trials.
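As a rough illustration of the stimulus just described, the following Python sketch computes the luminance of a counterphase Gabor at a given position and time. The spatial frequency, envelope width, reversal rate, duration, ramp time and mean luminance are taken from the text; the linear shape of the contrast ramp, the coordinate convention (degrees relative to the Gabor centre) and all function names are assumptions made for the example, not details of the original Matlab implementation.

```python
import math

# Values taken from the text
MEAN_LUMINANCE = 30.0   # cd/m^2
SPATIAL_FREQ = 0.5      # cycles per degree
SIGMA = 2.0             # degrees, s.d. of the Gaussian envelope
REVERSAL_HZ = 25.0      # square-wave contrast reversal
DURATION = 0.5          # seconds
RAMP = 0.12             # seconds for contrast to ramp on and off


def temporal_envelope(t):
    """Contrast envelope at time t (s). The linear ramp shape is an assumption."""
    if t < 0.0 or t > DURATION:
        return 0.0
    return min(1.0, t / RAMP, (DURATION - t) / RAMP)


def reversal_sign(t):
    """+1 in the first half of each 40 ms cycle, -1 in the second half."""
    cycle_fraction = (t * REVERSAL_HZ) % 1.0
    return 1.0 if cycle_fraction < 0.5 else -1.0


def gabor_luminance(x, y, t, contrast, phase=0.0):
    """Luminance (cd/m^2) at (x, y) degrees from the Gabor centre at time t."""
    carrier = math.cos(2.0 * math.pi * SPATIAL_FREQ * x + phase)  # vertical grating
    window = math.exp(-(x * x + y * y) / (2.0 * SIGMA ** 2))
    modulation = contrast * temporal_envelope(t) * reversal_sign(t)
    return MEAN_LUMINANCE * (1.0 + modulation * carrier * window)
```

At a 100 Hz frame rate, a 25 Hz square wave corresponds to two frames of each contrast polarity per cycle.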

(ii) Steady-pedestal grating task

Stimuli consisted of a sixth spatial derivative of a Gaussian in the horizontal dimension (σx = 2.75°, peak fS = 0.2 c deg⁻¹) windowed by a Gaussian in the vertical dimension (σy = 1.85°; figure 1b). They were centred 2.8° either above or below fixation, and displayed in a single pulse of three monitor frames (approx. 30 ms). The luminance of the pedestal was 16 cd m⁻². The participant's task was to identify the location (top or bottom) of the stimulus. Luminance contrast was varied adaptively between trials.

(iii) Coherent motion task

Stimuli comprised 0.15° white dots (20% density) within an annulus of inner radius 1.0° and outer radius 10.0° (figure 1c). The fixation marker was positioned in the centre of the annulus. Each dot moved at 4.0° s⁻¹ for its 500 ms lifetime: a proportion of signal dots, selected randomly on each frame, moved in the target direction; the remainder moved in a random direction. The stimulus was presented for 1.0 s, with contrast ramped on and off over 250 ms. Background luminance was 30 cd m⁻², and dot luminance was 60 cd m⁻². The participant's task was to identify the primary direction of motion (left or right) in the stimulus. The proportion of signal dots was varied adaptively between trials.
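The per-frame update of such a kinematogram can be sketched as follows. Dot speed, lifetime, annulus radii and the 100 Hz frame rate come from the text; the treatment of expired dots and of dots leaving the annulus (replotting at a random position) is an assumption, as are the function names and the dictionary representation of a dot.

```python
import math
import random

SPEED = 4.0            # degrees per second (from the text)
FRAME_RATE = 100.0     # Hz (from the text)
LIFETIME_FRAMES = 50   # 500 ms at 100 Hz
R_IN, R_OUT = 1.0, 10.0  # annulus radii in degrees


def random_position(rng):
    """Position drawn uniformly by area within the annulus."""
    r = math.sqrt(rng.uniform(R_IN ** 2, R_OUT ** 2))
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return r * math.cos(theta), r * math.sin(theta)


def step_dots(dots, coherence, target_angle, rng):
    """Advance every dot by one frame; returns the number of signal dots.

    Signal dots are reselected at random on each frame, as in the text.
    Replotting of expired or escaped dots is an assumption of this sketch.
    """
    step = SPEED / FRAME_RATE
    n_signal = round(coherence * len(dots))
    signal_ids = set(rng.sample(range(len(dots)), n_signal))
    for i, dot in enumerate(dots):
        if i in signal_ids:
            angle = target_angle
        else:
            angle = rng.uniform(0.0, 2.0 * math.pi)
        dot["x"] += step * math.cos(angle)
        dot["y"] += step * math.sin(angle)
        dot["age"] += 1
        radius = math.hypot(dot["x"], dot["y"])
        if dot["age"] >= LIFETIME_FRAMES or not (R_IN <= radius <= R_OUT):
            dot["x"], dot["y"] = random_position(rng)
            dot["age"] = 0
    return n_signal
```

Here `target_angle` is 0 for rightward and `math.pi` for leftward motion, and `coherence` is the quantity the staircase would adjust.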

(iv) Auditory temporal order task

Stimuli were three consecutive tone groups (peak intensity 65 dB SPL) with onsets spaced 1.0 s apart: a reference group (TR) and two test groups (T1 and T2). TR was a sequence of pure sinusoidal tones separated by 10 ms silence. The first and last tones (440 Hz) were always of 125 ms duration. The two inner tones (392 and 494 Hz) were of variable duration. On each trial, one of T1 or T2 matched TR exactly, while the order of the inner tones was reversed in the other (figure 1d). The participant's task was to identify which of T1 or T2 was different from TR. The duration of the middle tones was varied adaptively between trials.
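A tone group of this kind might be assembled as in the sketch below. The tone frequencies, the 125 ms outer tones, the 10 ms gaps and the 48 kHz sample rate are from the text; the order of the inner tones in the reference group (392 Hz before 494 Hz), the absence of onset and offset ramps, and the function names are assumptions.

```python
import math

GAP = 0.010          # seconds of silence between tones (from the text)
OUTER_FREQ = 440.0   # Hz, first and last tones
OUTER_DUR = 0.125    # seconds
INNER_FREQS = (392.0, 494.0)  # Hz; this reference order is an assumption


def tone_group(inner_duration, reverse_inner=False):
    """Return a list of (frequency or None, duration) segments; None is silence."""
    inner = list(INNER_FREQS)
    if reverse_inner:
        inner.reverse()
    freqs = [OUTER_FREQ] + inner + [OUTER_FREQ]
    durations = [OUTER_DUR, inner_duration, inner_duration, OUTER_DUR]
    segments = []
    for index, (freq, dur) in enumerate(zip(freqs, durations)):
        if index > 0:
            segments.append((None, GAP))
        segments.append((freq, dur))
    return segments


def synthesize(segments, sample_rate=48000):
    """Render segments to a list of samples in the range [-1, 1]."""
    samples = []
    for freq, dur in segments:
        n = round(dur * sample_rate)
        if freq is None:
            samples.extend([0.0] * n)
        else:
            samples.extend(
                math.sin(2.0 * math.pi * freq * i / sample_rate) for i in range(n)
            )
    return samples
```

On a given trial, one test group would be built with `reverse_inner=False` and the other with `reverse_inner=True`, with `inner_duration` set by the staircase.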

(v) Short-wave cone task

Stimuli were similar to those used in the commercially available Cambridge Colour Test [34]. They comprised circular patches of varied diameters (ranging from 0.04° to 0.59°) presented on a black background for 3.0 s, or until a response was given (figure 1e). The target was defined by a subset of patches enclosed by an annulus of inner radius 2.2° and outer radius 6.1°. Patches comprising the surround were metameric with equal-energy white; patches comprising the target differed chromatically from the surround by a decrement in short-wave cone excitation. Chromaticities were constructed using the cone fundamentals of Smith & Pokorny [35]. The luminance of each individual patch was randomized within a range from 15.7 to 38.3 cd m⁻² in order to mask any difference in average luminance between the target and surround. The participant's task was to identify the location (top, right, left or bottom) of a 2.1° gap in the target. Colour contrast was varied adaptively along the S/(L + M) axis between trials.

(d) Procedure

All tasks were two-alternative forced-choice (except the ‘non-magnocellular’ short-wave cone task, which was four-alternative forced-choice). After reading instructions presented on the screen, participants completed a set of practice trials to ensure they understood the task. For experimental trials, test intensity was determined according to two independent ZEST adaptive staircases [36,37]: blocked for the short-wave cone task and randomly interleaved for all other tasks. Staircases terminated after 75 trials (coherent motion), 31 trials (short-wave cone) or 30 trials (frequency-doubled grating, steady-pedestal grating and auditory temporal order). Feedback was provided throughout: auditory tones for visual tasks, and coloured lights for the auditory task.
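The general logic of a ZEST procedure (a Bayesian staircase that keeps a probability distribution over candidate thresholds, tests at the mean of that distribution, and updates it after every response) can be sketched as below. The text does not report the prior, the assumed psychometric function or its slope, so the flat prior, the Weibull form, the slope of 3.5, the lapse rate of 0.02 and the log-contrast grid are all placeholder assumptions, and the simulated observer is purely illustrative.

```python
import math
import random


def p_correct(level, threshold, slope=3.5, guess=0.5, lapse=0.02):
    """Assumed Weibull psychometric function; level and threshold in log10 units."""
    ratio = 10.0 ** (level - threshold)
    return guess + (1.0 - guess - lapse) * (1.0 - math.exp(-(ratio ** slope)))


class Zest:
    """Minimal ZEST-style staircase over a grid of candidate log thresholds."""

    def __init__(self, candidates):
        self.candidates = list(candidates)
        n = len(self.candidates)
        self.pdf = [1.0 / n] * n  # flat prior (assumption)

    def next_level(self):
        """Test level for the next trial: mean of the current distribution."""
        return sum(c * p for c, p in zip(self.candidates, self.pdf))

    def update(self, level, correct):
        """Multiply the distribution by the likelihood of the response."""
        posterior = []
        for c, p in zip(self.candidates, self.pdf):
            pc = p_correct(level, c)
            posterior.append(p * (pc if correct else 1.0 - pc))
        total = sum(posterior)
        self.pdf = [v / total for v in posterior]


def run_interleaved(true_threshold, n_trials=30, seed=0):
    """Simulate two randomly interleaved staircases of n_trials each."""
    rng = random.Random(seed)
    grid = [-3.0 + 0.05 * i for i in range(61)]  # log10 contrast from -3 to 0
    staircases = [Zest(grid), Zest(grid)]
    remaining = [n_trials, n_trials]
    while any(remaining):
        k = rng.choice([i for i in (0, 1) if remaining[i] > 0])
        level = staircases[k].next_level()
        correct = rng.random() < p_correct(level, true_threshold)
        staircases[k].update(level, correct)
        remaining[k] -= 1
    return [s.next_level() for s in staircases]
```

A correct response lowers the estimated threshold and an incorrect response raises it, so the test level converges on the region where responses are most informative.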

3. Results

(a) Preliminary analysis

For each task, threshold was calculated as the 82 per cent correct point of a cumulative Weibull psychometric function fitted to the pooled data from the two staircases. For the frequency-doubled and steady-pedestal grating tasks, this signified the Michelson contrast at detection threshold; for coherent motion, proportion coherence at direction discrimination threshold; and for auditory temporal order, tone duration in seconds at order discrimination threshold. Sensitivity was defined as the inverse of threshold.
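Once a Weibull function has been fitted, the 82 per cent point and the corresponding sensitivity can be read off by inverting the function, as in this sketch. The exact parameterization used in the study is not given in the text, so the form below (scale `alpha`, slope `beta`, guess rate and lapse rate) is an assumption, and the fitting step itself is omitted.

```python
import math


def weibull_p_correct(x, alpha, beta, guess, lapse=0.0):
    """Proportion correct at stimulus level x under an assumed Weibull form."""
    return guess + (1.0 - guess - lapse) * (1.0 - math.exp(-((x / alpha) ** beta)))


def weibull_level_for(p, alpha, beta, guess, lapse=0.0):
    """Stimulus level at which the fitted function reaches proportion correct p."""
    scaled = (p - guess) / (1.0 - guess - lapse)
    if not 0.0 < scaled < 1.0:
        raise ValueError("p must lie strictly between the guess rate and the ceiling")
    return alpha * (-math.log(1.0 - scaled)) ** (1.0 / beta)


def sensitivity(threshold):
    """Sensitivity is defined in the text as the inverse of threshold."""
    return 1.0 / threshold
```

Under this form, with a guess rate of 0.5 and no lapses, the scale parameter `alpha` itself corresponds to about 81.6 per cent correct, so the 82 per cent point lies very close to `alpha` in a two-alternative task. For the four-alternative short-wave cone task the guess rate would be 0.25 and the two no longer coincide.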

(b) Sensitivity distributions and test–retest reliabilities

Table 1 gives properties of the sensitivity distribution and the test–retest reliability of each measure. Reliabilities were based on a subset of 105 randomly selected participants who repeated the tasks in a second session at least one week after their initial session. All distributions were approximately normal, and reliabilities were moderate to high. Each distribution contained individuals with very low sensitivity.

Table 1.

Descriptive statistics and test–retest reliability for all tasks. *p ≪ 0.001.

(c) Correlation between measures

The scatter plots of figure 2 show the relationships between the four measures. Correlations were assessed by Spearman's rank-order coefficient (ρS). Only the correlation between the two most similar tasks—the frequency-doubled and steady-pedestal grating tasks—was of a notable magnitude, ρS(1059) = 0.39, p ≪ 0.001. Owing to the large size of our sample, most other correlations were significant; but effect sizes were poor to modest, ranging from 0.07 (between the frequency-doubled grating and auditory temporal order tasks) to 0.20 (between the steady-pedestal grating and auditory temporal order tasks). With the exception of the two grating tasks, which shared 15 per cent of variance, no pair of measures shared more than 4 per cent of variance.
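The link between a rank correlation and the proportion of shared variance quoted above is simply the square of the coefficient: 0.39² ≈ 0.15 and 0.20² = 0.04. The sketch below computes Spearman's coefficient as the Pearson correlation of average ranks and then squares it; the function names are illustrative and the code is not the authors' analysis script.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values given the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank-order coefficient."""
    return pearson(average_ranks(x), average_ranks(y))


def shared_variance(rho):
    """Proportion of variance shared by two measures with correlation rho."""
    return rho ** 2
```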

Figure 2.

Scatter plots showing the relationship of z-transformed sensitivities for each possible pair of tasks. Correlations are Spearman's rank-order coefficient (ρS). Asterisks denote p ≪ 0.001; dagger denotes p < 0.05.

(d) Inter-task reliabilities

It is possible that the imperfect reliability of the measures reduced the observed correlations from their true values. Thus, a valuable comparison is that between our observed reliabilities (the correlation between scores on the same measure in two separate sessions) and what we will refer to as inter-task reliabilities. By this term, we denote the correlations between different measures in two separate sessions. Inter-task reliabilities of this kind have seldom been reported, but they offer an attractive measure: whereas the correlation between tasks on a single session may be inflated by time-varying factors that are common to that individual session (e.g. mood or tiredness), these factors will have a reduced influence when the comparison is between (i) the correlation of task A with itself across sessions and (ii) the correlation of task A with task B across sessions. To the extent that performance on the different measures is affected by common sources of variance, reliabilities and inter-task reliabilities should be of a similar magnitude.

The set of reliabilities and inter-task reliabilities of the four measures is presented in figure 3. The cells along the upper-left to lower-right diagonal represent correlations between the first- and second-session scores on the same measure (test–retest reliabilities); these were of a high magnitude and were highly significant. The other cells represent correlations between first- and second-session scores on different measures; only the inter-task reliabilities between the two grating tasks were significant. All other inter-task reliabilities were near zero and were not statistically significant. We place most weight on this last result.
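The structure of such a matrix can be made concrete with a short sketch: each cell is the rank correlation between first-session scores on one task and second-session scores on another, so the diagonal holds test–retest reliabilities and the off-diagonal cells hold inter-task reliabilities. The data layout (one list of scores per task, with participants in the same order in both sessions) and the function names are assumptions made for the example.

```python
import math


def _ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2.0 + 1.0
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's coefficient as the Pearson correlation of ranks."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def reliability_matrix(session1, session2):
    """matrix[a][b] = correlation of task a (session 1) with task b (session 2).

    Diagonal cells (a == b) are test-retest reliabilities; all other cells
    are inter-task reliabilities in the sense used in the text.
    """
    return {
        a: {b: spearman(session1[a], session2[b]) for b in session2}
        for a in session1
    }
```

If two tasks tapped a common source of variance, the off-diagonal cells would approach the diagonal ones in size.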

Figure 3.

Reliabilities and inter-task reliabilities for all pairs of tasks when testing was repeated after an interval of at least one week. Correlations are Spearman's rank-order coefficient (ρS). For each pair of tasks with a significant relationship (p < 0.05), a dashed line shows the orthogonal linear regression to the data. Notice that strong relationships are seen only for panels on the central diagonal running from upper-left to lower-right, where performance on a given task is correlated with performance on the same task in a later session. Much weaker relationships are seen in the other panels, where performance on a given task is correlated with performance on a different ‘magnocellular task’ in a later session. Asterisks denote p ≪ 0.001; daggers denote p < 0.05.

(e) Correlation with a non-magnocellular task

Despite their low magnitude, correlations within a set of ‘magnocellular tasks’ may nevertheless be stronger than those between magnocellular and putative ‘non-magnocellular tasks’. We tested this by examining the correlation of each magnocellular task with a fifth task that is not linked to magnocellular–dorsal function. For this, we chose the short-wave cone task, a variant of the Cambridge Colour Test [34] that measures sensitivity to stimuli defined by a spatial short-wave cone decrement relative to the background. Short-wave cones are thought to provide negligible input to the parasol ganglion cells [38].

Correlations between the short-wave cone task and magnocellular tasks were all highly significant (p ≪ 0.001) and of a similar magnitude to correlations within the set of magnocellular tasks: with frequency-doubled gratings, ρS(1055) = 0.26; with steady-pedestal gratings, ρS(1058) = 0.28; with coherent motion, ρS(1053) = 0.17; and with auditory temporal order, ρS(1047) = 0.15. The mean of these correlations (0.21, s.d. = 0.06) was almost identical to the mean of the correlations within the set of magnocellular tasks (mean = 0.20, s.d. = 0.11).

(f) Comparison of low-sensitivity sets

While the four tasks do not correlate to any notable extent across the full sample, they still may be consistent in identifying individuals with low sensitivity, or putative magnocellular deficits. To investigate this possibility, we asked whether those participants with low sensitivity on each of the four tasks formed a common set. We defined four low-sensitivity sets, comprising the 50 individuals with lowest sensitivity on each task. The union of these four sets (the low-sensitivity cohort) comprised 163 unique individuals.

Figure 4a shows a Venn diagram of the low-sensitivity cohort, arranged according to membership of the four task-based low-sensitivity sets; and figure 4b shows the proportion of the cohort that belongs to one, two, three or four different sets. A vast majority of individuals in the cohort showed deficits on a single task only (n = 134; 82.2%), and none showed deficits on all four tasks.
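The set construction behind this analysis can be sketched as follows: take the k lowest-sensitivity participants on each task, form the union, and count how many sets each member of the union belongs to. The data layout (a mapping from task name to per-participant sensitivities), the arbitrary handling of ties at the cut-off and the function names are assumptions of the sketch.

```python
from collections import Counter


def low_sensitivity_sets(scores, k):
    """scores: task -> {participant: sensitivity}. Returns task -> set of the
    k participants with the lowest sensitivity (ties at the cut-off are
    broken arbitrarily in this sketch)."""
    return {
        task: set(sorted(by_person, key=by_person.get)[:k])
        for task, by_person in scores.items()
    }


def membership_counts(sets):
    """Returns a Counter mapping 'number of sets a person belongs to' to
    'number of such people' within the union of all sets."""
    per_person = Counter()
    for members in sets.values():
        per_person.update(members)
    return Counter(per_person.values())


def cohort_size(sets):
    """Number of unique individuals in the union of all low-sensitivity sets."""
    return len(set().union(*sets.values()))
```

With k = 50 and four tasks, the union could contain anywhere from 50 individuals (identical sets) to 200 (disjoint sets); the reported cohort of 163, with 134 of them (134/163 ≈ 82.2%) in one set only, lies much nearer the disjoint end.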

Figure 4.

Comparison of low-sensitivity sets. (a) Venn diagram of participants in the low-sensitivity cohort according to membership of task-based low-sensitivity sets. For each task, the low-sensitivity set is defined as the 50 individuals with the lowest sensitivity. The low-sensitivity cohort is the union of all low-sensitivity sets (n = 163). Darker regions denote the intersection of more low-sensitivity sets. (b) Proportion of participants in the low-sensitivity cohort with membership of one, two, three or four low-sensitivity sets. (c) Inter-task agreement (Krippendorff's α) as a function of the proportion of participants defined as comprising a low-sensitivity set. Black points denote p < 0.05; light grey points denote p ≥ 0.05. The horizontal lines at αK = 0.667 and αK = 0.800 show Krippendorff's criteria for tentative and reliable agreement, respectively [39].

The agreement of low-sensitivity categorizations across the four tasks can be assessed using a measure of inter-rater reliability, such as Krippendorff's alpha (αK) [39]. By Krippendorff's recommendations, data for which αK < 0.67 should be discarded. For a set size of 50, inter-set agreement falls well short of this criterion (αK = 0.11). In fact, regardless of the proportion of individuals comprising each low-sensitivity set, inter-set agreement never approaches an acceptable level (figure 4c).
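For readers unfamiliar with the statistic, the sketch below computes Krippendorff's alpha for nominal data with no missing values, treating each task as a 'rater' that classifies every participant as inside (1) or outside (0) its low-sensitivity set. Which participants entered the computation in the study is not stated in the text, so the data layout here is an assumption, and the function name is illustrative.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal data without missing values.

    units: one tuple per participant, holding one category per task
    (for example 1 if the participant is in that task's low-sensitivity
    set, else 0). All tuples must have the same length (at least 2).
    Raises ZeroDivisionError if only one category ever occurs.
    """
    # Coincidence matrix: each ordered pair of values from different
    # raters within a unit contributes 1 / (m - 1).
    coincidences = Counter()
    for values in units:
        m = len(values)
        for a, b in permutations(range(m), 2):
            coincidences[(values[a], values[b])] += 1.0 / (m - 1)

    totals = Counter()
    for (c, _k), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())

    observed = sum(w for (c, k), w in coincidences.items() if c != k) / n
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    ) / (n * (n - 1.0))
    return 1.0 - observed / expected
```

Alpha equals 1 when all tasks classify every participant identically and is near 0 when agreement is no better than expected by chance.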

4. Discussion

In a cohort of 1060 individuals, we found poor mutual agreement between four putative measures of ‘magnocellular function’. Two similar grating detection tasks were moderately correlated, sharing about 15 per cent of variance; but no other pair of tasks shared more than 4 per cent of variance. To account for the effects of temporal fluctuations in sensitivity, we compared within-task reliabilities with inter-task reliabilities based on a subcohort of 105: reliabilities were good and highly significant, but with the exception of the two grating tasks, inter-task reliabilities were statistically indistinguishable from zero. Furthermore, correlations between pairs of magnocellular tasks were of a similar magnitude to correlations between those tasks and the non-magnocellular task of detecting short-wave cone colour contrast. Finally, supposed ‘magnocellular deficit’ groups comprising individuals with low psychophysical sensitivity were not consistent between tasks.

Such dissociations imply that individual differences in sensitivity on common magnocellular tasks do not reflect individual differences in the function of a single neural mechanism [40]. Of course, the tasks rely to some extent on common substrates: there is clearly some necessary degree of overlap—at the photoreceptor level (among the visual tasks), or in motor response systems, for example. But sensitivity on each of the tasks clearly reflects variation in the efficiency of shared systems and pathways only to a very limited extent. To subsume such tasks under a single banner of magnocellular function is thus misleading.

(a) Comparison with previous results

Although several studies have used more than one magnocellular task, only a few have directly examined the relationship between tasks. In a study of 17 dyslexic adults and 18 adult controls, Witton et al. [41] found a significant correlation between thresholds for detecting coherent motion and thresholds for detecting 2 Hz frequency modulation of a 500 Hz tone, but the relationship was confined to the dyslexic subgroup (their fig. 3). Using the same two tasks and a larger sample (22 children with auditory processing disorder, 19 with developmental dyslexia and 98 controls), Dawes et al. [42] found a correlation of only 0.18 (Pearson's r) between the auditory and visual thresholds.

Our present results appear inconsistent with the study of Kéri & Benedek [28] who reported, in a sample of 100 male adults, a significant correlation (r = 0.47) between thresholds for coherent motion and contrast thresholds for gratings of low spatial frequency that were modulated at 10 Hz; their tasks were different from ours in several details, such as the presence of high-spatial-frequency cues at the perimeter of the grating. Pellicano & Gibson [25] measured thresholds for detecting a flickering Gaussian blob and thresholds for detecting coherent motion. They found significant correlations between the two tasks in a sample of 39 dyslexic children (r = 0.37), as well as in a group of 59 controls (r = 0.36), but not in an autistic group.

(b) Mental abilities, g and the differentiation hypothesis

A parallel with the notion of magnocellular function may be found in the notion of g, or general intelligence, which is inferred from the degree of positive correlation across tasks assessing different mental abilities. Spearman's differentiation hypothesis or law of diminishing returns [43] predicts the correlation between cognitive tasks to be higher for low ability levels than for high ability levels. The general notion is that a low g will impose a similar limitation across all cognitive processes, manifesting in a high correlation between specific cognitive tasks. In contrast, when g is high, no such limit exists and performance on specific tasks is driven primarily by the efficiency of other, uncorrelated, task-specific processes. The hypothesis has received a degree of empirical support: for example, the average correlation between subtests of the Wechsler Adult Intelligence Scale (WAIS) may be almost twice as large within a low-ability than within a high-ability group [44]. If the same relationship holds for tests of magnocellular function, one might argue that the low correlations observed in the present study are the result of a sample that is biased against individuals with low sensitivity.

We believe that such an argument is invalid, for several reasons. First, each of our distributions of sensitivity encompassed one or two orders of magnitude, and contained individuals with reliable near-zero sensitivity. Second, our sample included at least 21 individuals who reported a formal diagnosis of dyslexia; six reported dyspraxia; and four reported autism spectrum disorder. All of these conditions have been associated with magnocellular deficits. The rates observed in our sample, based on a follow-up survey with 553 respondents, accord with the prevalence of diagnoses in the general population [45–48], suggesting that the present study had little or no bias against selection from relevant clinical groups. Third, examining our data within five groups of participants graded according to the level of magnocellular function, following the methods of Detterman & Daniel [44], does not reveal any trend towards higher correlations with decreasing sensitivity. Mean correlations between tasks (±s.e.m.) within groups ordered by decreasing sensitivity were ρS = 0.13 (±0.02), 0.09 (±0.02), 0.11 (±0.02), 0.10 (±0.02) and 0.15 (±0.03). Finally, and more generally, there is an issue of circularity: if magnocellular tasks measure magnocellular function only for a subgroup of individuals, how are these individuals to be identified as suitable candidates for testing?

(c) Changing views of sensory pathways

The notion of a singular magnocellular–dorsal pathway—aside from its assessment by psychophysical means—is problematic in the light of advances in anatomy and physiology, which take us far from the simple classical picture of parasol and midget ganglion cells projecting, respectively, to the magnocellular and parvocellular layers of the LGN. The retina alone is known to contain at least 80 different cell populations, with separate visual pathways projecting from each of at least 20 anatomically distinct classes of the ganglion cell [49]. The physiological properties of many cell classes remain unknown, but it now seems highly unlikely that parasol cells alone determine sensitivity to the stimuli used in magnocellular tasks. For example, it has recently been suggested that frequency-doubled gratings might effectively stimulate the large smooth monostratified ganglion cells [50], which co-stratify with parasol cells in the retinal inner plexiform layer. Moreover, the functional properties of magnocellular and parvocellular neurons show substantial overlap [51]. Further downstream, the direct link between the subcortical magnocellular system and the cortical dorsal pathway is also compromised by evidence of non-magnocellular input to dorsal divisions of the primary visual cortex and higher visual areas [3]. Indeed, some have argued that the linear pathway model of cortical visual processing is fundamentally unsound [2].

(d) Final remarks

If performance on different ‘magnocellular tasks’ is determined by different sources of variance, it is hardly surprising that the literature examining psychophysically diagnosed ‘magnocellular deficits’ is characterized by conflict and contradiction. Valuable information may be lost when performance is compressed into a single dimension of ‘magnocellular function’. There is preliminary evidence, for instance, that certain developmental disorders have a specific performance profile across different dimensions of fast temporal sensory processing [9]. We propose that relinquishing the notion of general ‘magnocellular function’ in favour of a multidimensional view will resolve much of the contradiction in past findings and provide a more robust framework for future investigation.


The research was approved by the Psychology Research Ethics Committee of the University of Cambridge.

This work was supported by the Gatsby Charitable Foundation. P.T.G. was supported by a scholarship from the Cambridge Commonwealth and Overseas Trusts, and by an Overseas Research Studentship. J.M.B. was supported by a research fellowship from Gonville and Caius College, Cambridge.

  • Received June 21, 2012.
  • Accepted July 25, 2012.

