NIPS96 workshop:

The structure of natural images and efficient image coding


Organizers: D. Ruderman and B. Olshausen

This one-day workshop, held as part of the NIPS96 workshops in Snowmass, Colorado, Dec. 6-7, 1996, discussed recent work on natural image statistics and their relation to visual system design. We heard from a number of speakers, listed below, and there was some lively debate on topics such as sparseness, scaling in images (origins of 1/f), statistical independence, and limitations of linear image models. For background information on this general topic, see the Noise and Natural Scene Statistics web page. We hope to stage another gathering in a year or so - stay tuned!

Participants:

Roland Baddeley, Oxford University
Tony Bell, Salk Institute
Dawei Dong, Caltech
David Field, Cornell University
Jack Gallant, U.C. Berkeley
Hans van Hateren, University of Groningen
David Mumford, Harvard University
Penio Penev, Rockefeller University
Pam Reinagel, Caltech
Harel Shouval, Brown University
Michael Webster, University of Nevada, Reno
Tony Zador, Salk Institute


Abstracts


Roland Baddeley (with Larry Abbott, Mike Booth, Frank Sengpiel, Toby Freeman, Ned Wakeman, and Edmund Rolls)

"The distribution of firing rates in cells in cat V1 and macaque IT when exposed to natural scenes"

Why is the representation used by neurons in V1 of the form observed? Recently a number of researchers have proposed that the neuronal code optimises some criterion. The proposals include the ideas that the code is maximising "sparsity", attempting to remove higher order correlations, or attempting to minimise metabolic cost. All these proposals make strong predictions as to the probability distribution of output states of neurons when recorded in a natural environment. We therefore recorded from neurons in anaesthetised cat V1 whilst presenting a number of different video sequences representative of a cat's natural environment, and for comparison, recorded neurons from unanaesthetised macaque IT. Over a large range of time scales, and for both preparations, we found the firing rates to be exponentially distributed. The relevance of this result to proposals for neuronal coding will be discussed. In particular, the fact that the spike count entropy was on average within 95% of the maximum possible for a neuron with the same firing rate is argued to be compatible with, amongst other things, the codes attempting to minimise metabolic cost.
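
The comparison between a measured spike count entropy and the maximum possible at the same mean rate can be sketched as follows. This is only an illustration of the standard maximum-entropy result (among distributions over counts 0, 1, 2, ... with a fixed mean, the geometric distribution, the discrete analogue of the exponential, has the largest entropy). It is not the analysis used in the talk; the function names, the mean count of 5 and the Poisson comparison are all assumptions made for the example.

```python
import math

def entropy_bits(pmf):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def max_entropy_bits(mean_count):
    """Entropy of the geometric distribution with the given mean:
    the maximum over all distributions on 0, 1, 2, ... with that mean."""
    m = mean_count
    return (m + 1) * math.log2(m + 1) - m * math.log2(m)

def geometric_pmf(mean_count, n_max):
    """P(k) = (1 - q) q^k with q chosen so that the mean is mean_count."""
    q = mean_count / (mean_count + 1.0)
    return [(1 - q) * q ** k for k in range(n_max + 1)]

def poisson_pmf(mean_count, n_max):
    """Poisson probabilities, built iteratively to avoid large factorials."""
    pmf = [math.exp(-mean_count)]
    for k in range(1, n_max + 1):
        pmf.append(pmf[-1] * mean_count / k)
    return pmf

def entropy_fraction(pmf, mean_count):
    """Entropy of pmf as a fraction of the maximum at the same mean."""
    return entropy_bits(pmf) / max_entropy_bits(mean_count)

# Hypothetical mean spike count per counting window.
mean_count = 5.0
print(entropy_fraction(geometric_pmf(mean_count, 400), mean_count))
print(entropy_fraction(poisson_pmf(mean_count, 400), mean_count))
```

The geometric case reaches the bound, while a Poisson count distribution with the same mean falls short of it, which is the sense in which an exponential-like distribution uses a fixed mean firing rate most efficiently.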


Tony Bell

"Independent Components Analysis and images."

(See bell.blind.ps for description of ICA.)


Dawei Dong

"Spatiotemporal coupling and scaling of natural images and visual responses"

I will discuss some recent findings on the statistics of natural time-varying images, including measurements of high order pairwise correlations, and, based on those measurements, make some quantitative comparisons between theoretical predictions and psychophysical and physiological data.


David Field

"Sphering, decorrelation and the relative sensitivity of visual neurons"

In this talk we look at the concepts of 'sphering' and decorrelation in relation to the spatial sensitivity of visual neurons. As has been noted previously, the amplitude spectra of natural scenes fall as approximately 1/f. To cope with this correlational structure, it has been proposed that the sensitivity of neurons is well adapted to 'whiten' or 'sphere' the data. However, two theories have been proposed as to why one might want to do this. First, the goal might be to decorrelate the signal. It is often assumed that decorrelating the signal in this way will decrease the statistical dependencies. It will be shown that this assumption is not necessarily correct. Complete decorrelation is not necessarily advantageous and can actually increase the statistical dependencies between units. The second theory suggests that one of the principal advantages of sphering is to set the relative activities of different neurons to have roughly equal response magnitudes on average. In this case, the activities of individual neurons may remain correlated with each other. In this talk we will look at the implications of this second approach to whitening. It will be suggested that the relative sensitivities of neurons are well adapted to the amplitude spectra of natural scenes and that neurons increase in sensitivity in the human visual system out to approximately 20 cycles/deg. It is argued that this approach to relative sensitivity can be used to account for why the visual system shows a peak threshold sensitivity to sinusoidal gratings near 4 cycles/deg.
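
The distinction between decorrelation and statistical independence can be illustrated with a textbook toy case. The sketch below is not the natural-image analysis of the talk: the variables x and y and the helper correlation() are assumptions chosen only to show that two signals can have zero correlation while one is completely determined by the other.

```python
import math

def correlation(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b)) / n
    var_a = sum((u - mean_a) ** 2 for u in a) / n
    var_b = sum((v - mean_b) ** 2 for v in b) / n
    return cov / math.sqrt(var_a * var_b)

# Hypothetical "unit responses": y is a deterministic function of x.
x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [v * v for v in x]

# Second-order statistics see no relationship between x and y ...
print(correlation(x, y))
# ... but a higher-order statistic reveals the full dependence.
print(correlation([v * v for v in x], y))
```

The first value is zero and the second is one, so a pair of outputs can be perfectly decorrelated and still be strongly dependent.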


Jack Gallant

"Visual Cortical Responses to Natural Scenes"

Although natural scenes have become an important research topic in the computational vision community, there is very little physiological data on the responses of visual cortical cells to natural scenes. I have recorded from single cells in areas V1, V2 and V4 during free viewing of natural scenes. My talk will focus on two aspects of these data. (1) Do existing models of processing in area V1 (e.g. divisive normalization) account for responses during free viewing of natural scenes? Although V1 neurons are quasi-linear within a restricted stimulus range, they also incorporate several nonlinear mechanisms that may emerge when natural scenes are used as stimuli. To the extent that these suppressive nonlinear mechanisms are active during free viewing they may decorrelate activity in neighboring cells and make responses more sparse. (2) Is there any evidence of fine temporal coding in single-cell spike trains during free viewing? The evidence supporting temporal coding in higher vision is mixed at best. Controlled experiments that mimic free viewing provide an efficient method for addressing temporal coding issues under natural viewing conditions.


Hans van Hateren

"Temporal processing of natural intensities by the early visual system"

Measurements of the optical environment show a very large dynamic range of luminance levels, even within a single scene. As an eye scans a visual scene, these intensities impinge on the photoreceptors in rapid succession. Special strategies are needed for efficiently coping with these time series of intensities, because of the very limited dynamic range of the biological hardware. In my talk, I will discuss recent measurements of time series of intensities, measurements of the processing of these intensities in visual systems, and some theoretical implications of the nonlinear (adaptive) processes apparently implemented in the early visual system.


David Mumford

"Scale-invariant random field models for images with varying degrees of clutter"

Experiments on real images suggest that their filter statistics are infinitely divisible distributions. It seems that this is due to the presence of a "clutter" parameter, which is a function on scale space. We propose some doubly stochastic models for images which reproduce many of these statistics.


Penio Penev

"Local Feature Analysis: A General Statistical Theory for Object Representation."

Low-dimensional representations of sensory signals are key to solving many of the computational problems encountered in high-level vision. Principal Component Analysis has been used in the past to derive practically useful compact representations for different classes of objects. One major objection to the applicability of PCA is that it invariably leads to global, nontopographic representations that are not amenable to further processing and are not biologically plausible. In this paper we present a new mathematical construction---Local Feature Analysis (LFA)---for deriving local topographic representations for any class of objects. The LFA representations are sparse-distributed and, hence, are effectively low-dimensional and retain all the advantages of the compact representations of the PCA. But unlike the global eigenmodes, they give a description of objects in terms of statistically derived local features and their positions. We illustrate the theory by using it to extract local features for three ensembles---2D images of faces without background, 3D surfaces of human heads, and finally 2D faces on a background. The resulting local representations have powerful applications in head segmentation and face recognition.

(Full paper available via http://venezia.rockefeller.edu/group/papers/full/LFA/)


Pam Reinagel and Tony Zador

"Where You Look Determines What You See"

An organism actively determines which part of the world it samples with its sensory organs. A well known example is that animals orient their bodies, turn their heads, or move their eyes in order to "look at" specific things in their environment. How does this affect the structure of the signals typically reaching the visual system? Given where you look, what does your visual system see? Put another way, what is special about the parts of real-world images that you choose to look at? To address these questions we recorded the eye positions of human subjects as they looked at pictures of real-world scenes. Our main finding agrees with basic intuition: we look at the places which are "interesting". We have formalized this intuitive notion using "compressibility" within a wavelet representation. We are currently exploring the implications of this finding for visual processing.


Harel Shouval

"Synaptic Plasticity in a Natural Image Environment"

Some of the talks in this workshop argue that the statistical properties of cortical cells reflect statistical properties of the natural environment (e.g., work by David Field). The visual environment has a strong influence on receptive field properties and their organization in the visual cortex. Therefore we argue that plasticity, at least in part, is responsible for adapting receptive field properties to the environment. We will first demonstrate that a network of BCM neurons trained in a natural image environment captures many of the components of cortical receptive fields and their organization in visual cortex. Furthermore, it forms maps displaying both ocular dominance bands and varying orientation selectivity. We will also show that the response of neurons in this network is sparse; however, this does not stem from including an explicit sparsity term but is the outcome of BCM learning and its underlying computational principles. We will explain the underlying computational principles of BCM, which are high selectivity fixed points and dependence on the higher order statistics of the visual environment. Finally, we will discuss how these computational issues relate to the basic question of neuronal coding: Efficient for What?
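
The idea of a high selectivity fixed point can be sketched with a single BCM (Bienenstock-Cooper-Munro) neuron in a toy environment. The code assumes one common form of the rule, in which the weight change is proportional to x * y * (y - theta) and the sliding threshold theta is the mean of y^2 over the input ensemble. The two orthogonal input patterns, the learning rate and the initial weights are hypothetical choices for illustration; this is not the natural-image network described in the talk.

```python
def bcm_train(patterns, weights, rate=0.05, steps=2000):
    """Batch BCM learning for one linear neuron, y = w . x.

    Assumed rule: dw = rate * mean over patterns of x * y * (y - theta),
    with theta = mean of y^2 over the (equally likely) patterns.
    """
    w = list(weights)
    n = len(patterns)
    for _ in range(steps):
        # Responses to every pattern with the current weights.
        responses = [sum(wi * xi for wi, xi in zip(w, x)) for x in patterns]
        # Sliding modification threshold: ensemble average of y^2.
        theta = sum(y * y for y in responses) / n
        # Potentiate when y > theta, depress when y < theta.
        for x, y in zip(patterns, responses):
            phi = y * (y - theta)
            for i, xi in enumerate(x):
                w[i] += rate * phi * xi / n
    return w

# Two hypothetical, equally likely input patterns and small initial weights.
patterns = [(1.0, 0.0), (0.0, 1.0)]
final_weights = bcm_train(patterns, (0.10, 0.15))
print(final_weights)
```

In this toy setting the neuron ends up responding to one pattern (response near 2, which equals the threshold) and not at all to the other, so its response across the ensemble is sparse even though no sparsity term appears in the rule.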

(Some related publications can be found in my home page and in Nathan Intrator's home page.)


Michael Webster

"Adaptation to the color and spatial structure of natural images"

Adaptation profoundly influences perception by adjusting sensitivity to the prevailing pattern of stimulation. We asked how the state of adaptation might depend on the patterns of spatial and color contrast typical of the natural visual environment. In one set of experiments, we examined whether adaptation to the characteristic amplitude spectra of natural images (which tend to decrease with frequency as 1/f) induces characteristic changes in spatial contrast sensitivity. Contrast thresholds and suprathreshold contrast and frequency matches were measured after adaptation to random samples from an ensemble of images of outdoor scenes, or synthetic images formed by filtering the amplitude spectra of noise over a range of slopes. Adaptation selectively reduced sensitivity at low to medium frequencies, biasing contrast sensitivity toward higher frequencies. The pattern of after-effects was similar for different natural image ensembles (e.g. forest vs. textures) but varied with (large) changes in the slope of the noise spectra. In a second set of experiments, we examined how adaptation to natural color distributions alters color perception. Color distributions in outdoor scenes were measured by recording each scene with a digital camera through 31 interference filters, or by sampling an array of locations in each scene with a spectroradiometer. Chromatic contrasts varied principally along bluish to yellowish green axes, with many scenes exhibiting high correlations in the signals along the luminance and chromatic axes thought to underlie postreceptoral color vision. Adaptation to random samples from the distributions strongly biases color appearance by selectively reducing sensitivity to the dominant axes of each color distribution. Our results suggest that adaptation to the color and spatial structure in natural scenes may exert strong and selective influences on perception that are important in characterizing the normal operating states of the visual system.
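
Synthetic stimuli whose amplitude spectra fall with a chosen slope can be generated by assigning power-law amplitudes to random phases and inverting the transform. The following is a minimal one-dimensional sketch of that general technique, assuming an even signal length and a slow but dependency-free discrete Fourier transform. The experiments described above used two-dimensional images, and the function names and parameter values here are illustrative assumptions rather than the authors' procedure.

```python
import cmath
import math
import random

def power_law_noise(n, alpha, seed=0):
    """Real signal of even length n with amplitude spectrum k**(-alpha)."""
    rng = random.Random(seed)
    spectrum = [0j] * n  # DC and Nyquist terms are left at zero
    for k in range(1, n // 2):
        phase = rng.uniform(0.0, 2.0 * math.pi)
        coeff = cmath.rect(k ** -alpha, phase)
        spectrum[k] = coeff
        spectrum[n - k] = coeff.conjugate()  # Hermitian symmetry => real signal
    # Naive inverse DFT (O(n^2)); adequate for a short illustration.
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)).real / n
        for t in range(n)
    ]

def amplitude_spectrum(signal):
    """Amplitudes at frequencies 1 .. n/2 - 1 from a naive forward DFT."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n)))
        for k in range(1, n // 2)
    ]

def loglog_slope(amplitudes):
    """Least-squares slope of log(amplitude) against log(frequency)."""
    xs = [math.log(k) for k in range(1, len(amplitudes) + 1)]
    ys = [math.log(a) for a in amplitudes]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

signal = power_law_noise(64, alpha=1.0)
print(loglog_slope(amplitude_spectrum(signal)))
```

Changing alpha changes the spectral slope of the synthetic signal, which corresponds to the "range of slopes" manipulation; alpha = 1 gives the 1/f fall-off typical of natural images.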