Predictive Coding: A Fresh View of Inhibition in the Retina

M. V. Srinivasan, S. B. Laughlin, A. Dubs

Abstract

Interneurons exhibiting centre-surround antagonism within their receptive fields are commonly found in peripheral visual pathways. We propose that this organization enables the visual system to encode spatial detail in a manner that minimizes the deleterious effects of intrinsic noise, by exploiting the spatial correlation that exists within natural scenes. The antagonistic surround takes a weighted mean of the signals in neighbouring receptors to generate a statistical prediction of the signal at the centre. The predicted value is subtracted from the actual centre signal, thus minimizing the range of outputs transmitted by the centre. In this way the entire dynamic range of the interneuron can be devoted to encoding a small range of intensities, thus rendering fine detail detectable against intrinsic noise injected at later stages in processing. This predictive encoding scheme also reduces spatial redundancy, thereby enabling the array of interneurons to transmit a larger number of distinguishable images, taking into account the expected structure of the visual world. The profile of the required inhibitory field is derived from statistical estimation theory. This profile depends strongly upon the signal-to-noise ratio and weakly upon the extent of lateral spatial correlation. The receptive fields that are quantitatively predicted by the theory resemble those of X-type retinal ganglion cells and show that the inhibitory surround should become weaker and more diffuse at low intensities. The latter property is unequivocally demonstrated in the first-order interneurons of the fly's compound eye. The theory is extended to the time domain to account for the phasic responses of fly interneurons. These comparisons suggest that, in the early stages of processing, the visual system is concerned primarily with coding the visual image to protect against subsequent intrinsic noise, rather than with reconstructing the scene or extracting specific features from it.
The treatment emphasizes that a neuron's dynamic range should be matched to both its receptive field and the statistical properties of the visual pattern expected within this field. Finally, the analysis is synthetic because it is an extension of the background suppression hypothesis (Barlow & Levick 1976), satisfies the redundancy reduction hypothesis (Barlow 1961a, b) and is equivalent to deblurring under certain conditions (Ratliff 1965).
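
The prediction step described in the abstract can be illustrated with a minimal numerical sketch. It is not the derivation used in the paper: it assumes a one-dimensional row of receptors, a unit-variance signal whose autocorrelation decays as exp(-|d| / corr_length), and independent receptor noise of variance 1 / snr**2. The names surround_weights, surround_summary, corr_length and half_width are hypothetical and chosen only for this illustration. Under these assumptions the surround weights are the least-squares solution of the normal equations of linear estimation.

```python
import numpy as np


def surround_weights(snr, corr_length=10.0, half_width=8):
    """Least-squares weights for predicting the centre from noisy neighbours.

    Assumptions (illustrative only): a 1-D receptor row, unit-variance signal
    with autocorrelation exp(-|d| / corr_length), and independent receptor
    noise of variance 1 / snr**2.
    """
    # Receptor positions relative to the centre, excluding the centre itself.
    offsets = np.array([k for k in range(-half_width, half_width + 1) if k != 0])

    # Covariance among the noisy neighbour signals: signal term plus noise term.
    separation = np.abs(offsets[:, None] - offsets[None, :])
    neighbour_cov = np.exp(-separation / corr_length) + np.eye(offsets.size) / snr**2

    # Covariance between each neighbour and the centre (noise is uncorrelated).
    cross_cov = np.exp(-np.abs(offsets) / corr_length)

    # Normal equations of linear estimation: neighbour_cov @ w = cross_cov.
    weights = np.linalg.solve(neighbour_cov, cross_cov)
    return offsets, weights


def surround_summary(snr):
    """Return (total surround weight, share carried by the two nearest neighbours)."""
    offsets, weights = surround_weights(snr)
    total = weights.sum()
    nearest_share = np.abs(weights[np.abs(offsets) == 1]).sum() / np.abs(weights).sum()
    return total, nearest_share


for snr in (10.0, 1.0, 0.1):
    total, nearest_share = surround_summary(snr)
    print(f"snr={snr:5.1f}  total weight={total:.3f}  nearest-neighbour share={nearest_share:.3f}")
```

In this toy model the total surround weight is larger at a signal-to-noise ratio of 10 than at 0.1, and a larger share of it sits on the two nearest neighbours. This matches the qualitative trend stated in the abstract, that the surround becomes weaker and more diffuse as the signal-to-noise ratio falls. The output of the encoder would then be the centre signal minus the weighted sum of the neighbouring signals.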