The structure of natural images and efficient image coding
"The distribution of firing rates in cells in cat V1 and macaque IT when exposed to natural scenes"
Why is the representation used by neurons in V1 of the form observed? Recently a number of researchers have proposed that the neuronal code optimises some criterion. The proposals include the ideas that the code is maximising "sparsity", attempting to remove higher order correlations, or attempting to minimise metabolic cost. All these proposals make strong predictions as to the probability distribution of output states of neurons when recorded in a natural environment. We therefore recorded from neurons in anaesthetised cat V1 whilst presenting a number of different video sequences representative of a cat's natural environment and, for comparison, recorded from neurons in unanaesthetised macaque IT. Over a large range of time scales, and for both preparations, we found the firing rates to be exponentially distributed. The relevance of this result to proposals for neuronal coding will be discussed. In particular, the fact that the spike count entropy was on average within 95% of the maximum possible for a neuron with the same firing rate is argued to be compatible with, amongst other things, the codes attempting to minimise metabolic cost.
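The comparison of spike count entropy with its maximum for a given firing rate can be made concrete. The sketch below is not the analysis used in the study. It assumes integer spike counts in fixed windows, a plug-in histogram entropy estimate, and the standard result that the geometric distribution (the discrete analogue of the exponential) has maximum entropy among distributions on the non-negative integers with a fixed mean. The function names and the simulated counts are illustrative only.

```python
import numpy as np

def spike_count_entropy(counts):
    """Plug-in (histogram) entropy, in bits, of integer spike counts."""
    counts = np.asarray(counts, dtype=int)
    p = np.bincount(counts) / counts.size
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def max_entropy_for_mean(mean_count):
    """Entropy, in bits, of the geometric distribution with this mean.

    This is the largest entropy any distribution on 0, 1, 2, ... can have
    for the given mean count.
    """
    m = float(mean_count)
    if m == 0.0:
        return 0.0
    return float((1.0 + m) * np.log2(1.0 + m) - m * np.log2(m))

def entropy_ratio(counts):
    """Observed entropy as a fraction of the maximum for the same mean."""
    counts = np.asarray(counts)
    return spike_count_entropy(counts) / max_entropy_for_mean(counts.mean())

# Simulated data (assumption: mean of 5 spikes per counting window).
rng = np.random.default_rng(0)
mean_count = 5.0
# Generator.geometric returns values >= 1, so shift to start at 0.
geometric_counts = rng.geometric(1.0 / (1.0 + mean_count), size=20_000) - 1
poisson_counts = rng.poisson(mean_count, size=20_000)

ratio_geometric = entropy_ratio(geometric_counts)
ratio_poisson = entropy_ratio(poisson_counts)
print(f"geometric counts: {ratio_geometric:.3f} of maximum")
print(f"Poisson counts:   {ratio_poisson:.3f} of maximum")
```

In this toy setting, counts drawn from a geometric distribution sit close to the maximum, while Poisson counts with the same mean fall noticeably below it.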
"Independent Components Analysis and images."
(See bell.blind.ps for a description of ICA.)
"Spatiotemporal coupling and scaling of natural images and visual responses"
I will discuss some recent findings on the statistics of natural time-varying images, including measurements of high order pairwise correlations, and, based on those measurements, make some quantitative comparisons between the theoretical predictions and psychophysical and physiological data.
"Sphering, decorrelation and the relative sensitivity of visual neurons"
In this talk we look at the concept of 'sphering' and decorrelation in relation to the spatial sensitivity of visual neurons. As has been noted previously, the amplitude spectra of natural scenes fall as approximately 1/f. To cope with this correlational structure, it has been proposed that the sensitivity of neurons is well adapted to 'whiten' or 'sphere' the data. However, two theories have been proposed as to why one might want to do this. First, the goal might be to decorrelate the signal. It is often assumed that decorrelating the signal in this way will decrease the statistical dependencies. It will be shown that this assumption is not necessarily correct. Complete decorrelation is not necessarily advantageous and can actually increase the statistical dependencies between units. The second theory suggests that one of the principal advantages of sphering is to set the relative activities of different neurons to have roughly equal response magnitudes on average. In this case, the activities of individual neurons may remain correlated with each other. In this talk we will look at the implications of this second approach to whitening. It will be suggested that the relative sensitivities of neurons are well adapted to the amplitude spectra of natural scenes and that neurons increase in sensitivity in the human visual system out to approximately 20 cycles/deg. It is argued that this approach to relative sensitivity can be used to account for why the visual system shows a peak threshold sensitivity to sinusoidal gratings near 4 cycles/deg.
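The difference between the two goals can be seen on toy data. The sketch below is an illustration under stated assumptions, not the analysis from the talk: it uses three correlated Gaussian "units" with unequal variances (the mixing matrix is arbitrary), applies symmetric (ZCA) sphering for the first goal, and applies a per-unit gain for the second. Because the toy data are Gaussian, it does not demonstrate the claim that decorrelation can increase statistical dependencies.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy responses: 3 correlated units with very different variances.
mixing = np.array([[4.0, 0.0, 0.00],
                   [2.0, 1.0, 0.00],
                   [1.0, 0.5, 0.25]])
x = rng.standard_normal((50_000, 3)) @ mixing.T
x -= x.mean(axis=0)
cov = np.cov(x, rowvar=False)

# Goal 1: sphering (full decorrelation). The symmetric whitening matrix
# cov^(-1/2) maps the covariance to the identity.
evals, evecs = np.linalg.eigh(cov)
sphere = evecs @ np.diag(evals ** -0.5) @ evecs.T
cov_sphered = np.cov(x @ sphere, rowvar=False)

# Goal 2: response equalization. Each unit gets its own gain so that all
# units have unit variance; correlations between units are left intact.
gains = 1.0 / np.sqrt(np.diag(cov))
cov_equalized = np.cov(x * gains, rowvar=False)

np.set_printoptions(precision=3, suppress=True)
print("sphered covariance:\n", cov_sphered)
print("equalized covariance:\n", cov_equalized)
```

After sphering, the covariance is the identity. After equalization, the diagonal is one but the off-diagonal entries are the original correlation coefficients.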
"Visual Cortical Responses to Natural Scenes"
Although natural scenes have become an important research topic
in the computational vision community, there is very little
physiological data on the responses of visual cortical cells
to natural scenes. I have recorded from single cells in areas
V1, V2 and V4 during free viewing of natural scenes. My talk
will focus on two aspects of these data. (1) Do existing models
of processing in area V1 (e.g. divisive normalization) account
for responses during free viewing of natural scenes? Although
V1 neurons are quasi-linear within a restricted stimulus range,
they also incorporate several nonlinear mechanisms that may
emerge when natural scenes are used as stimuli. To the extent
that these suppressive nonlinear mechanisms are active during
free viewing they may decorrelate activity in neighboring cells
and make responses more sparse. (2) Is there any evidence of
fine temporal coding in single-cell spike trains during free
viewing? The evidence supporting temporal coding in higher
vision is mixed at best. Controlled experiments that mimic
free viewing provide an efficient method for addressing temporal
coding issues under natural viewing conditions.
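For readers unfamiliar with the divisive normalization model mentioned in point (1), a minimal sketch follows. It assumes one common textbook form (half-rectified, squared linear responses divided by a semi-saturation constant plus the pooled activity of all units); the abstract does not specify the form or parameters tested, so the function name, sigma and the exponent are illustrative.

```python
import numpy as np

def divisive_normalization(linear_responses, sigma=1.0, exponent=2.0):
    """Normalize each unit's drive by the pooled drive of all units.

    linear_responses: outputs of linear receptive fields (last axis = units).
    sigma: semi-saturation constant (assumed value, for illustration).
    """
    drive = np.maximum(np.asarray(linear_responses, dtype=float), 0.0) ** exponent
    pool = drive.sum(axis=-1, keepdims=True)
    return drive / (sigma ** exponent + pool)

# The same stimulus pattern at low and high contrast.
linear = np.array([1.0, 2.0, 3.0])
low = divisive_normalization(0.1 * linear)
high = divisive_normalization(10.0 * linear)
print("low contrast: ", low, "sum =", low.sum())
print("high contrast:", high, "sum =", high.sum())
```

In this form the summed output saturates below 1 as contrast grows, while the ratios between units, and hence the relative tuning, stay fixed.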
"Temporal processing of natural intensities by the early visual system"
Measurements of the optical environment show a very large dynamic
range of luminance levels, even within a single scene. As an eye scans
a visual scene, these intensities impinge on the photoreceptors in
rapid succession. Special strategies are needed for efficiently coping
with these time series of intensities, because of the very limited
dynamic range of the biological hardware. In my talk, I will discuss
recent measurements of time series of intensities, measurements of the
processing of these intensities in visual systems, and some
theoretical implications of the nonlinear (adaptive) processes
apparently implemented in the early visual system.
"Scale-invariant random field models for images with varying degrees of
clutter"
Experiments on real images suggest that their filter statistics are
infinitely divisible distributions. It seems that this is due to the
presence of a "clutter" parameter, which is a function on scale
space. We propose some doubly stochastic models for images which
reproduce many of these statistics.
"Local Feature Analysis: A General Statistical Theory for Object Representation."
Low-dimensional representations of sensory signals are key to solving
many of the computational problems encountered in high-level
vision. Principal Component Analysis has been used in the past to
derive practically useful compact representations for different
classes of objects. One major objection to the applicability of PCA is
that it invariably leads to global, nontopographic representations
that are not amenable to further processing and are not biologically
plausible. In this paper we present a new mathematical
construction---Local Feature Analysis (LFA)---for deriving local
topographic representations for any class of objects. The LFA
representations are sparse-distributed and, hence, are effectively
low-dimensional and retain all the advantages of the compact
representations of the PCA. But unlike the global eigenmodes, they
give a description of objects in terms of statistically derived local
features and their positions. We illustrate the theory by using it to
extract local features for three ensembles---2D images of faces
without background, 3D surfaces of human heads, and finally 2D faces
on a background. The resulting local representations have powerful
applications in head segmentation and face recognition.
(Full paper available via http://venezia.rockefeller.edu/group/papers/full/LFA/)
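The contrast between global PCA coefficients and local, topographic LFA outputs can be sketched numerically. The abstract does not spell out the construction, so the code below makes an assumption: it takes the LFA kernel to be K = sum_r psi_r lambda_r^(-1/2) psi_r^T over the N leading PCA modes, so that the outputs are indexed by grid position and have residual correlation P = sum_r psi_r psi_r^T. The toy ensemble (smoothed 1-D noise) and all variable names are illustrative, and the sparsification step is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy ensemble: 2000 smooth random 1-D "objects" on a 32-point grid
# (white noise blurred with a Gaussian window; purely illustrative data).
n_samples, n_points, n_modes = 2000, 32, 8
window = np.exp(-0.5 * (np.arange(-6, 7) / 2.0) ** 2)
noise = rng.standard_normal((n_samples, n_points + window.size - 1))
data = np.array([np.convolve(row, window, mode="valid") for row in noise])
data -= data.mean(axis=0)

# PCA: eigenmodes psi_r and eigenvalues lambda_r of the ensemble covariance.
cov = data.T @ data / n_samples
evals, evecs = np.linalg.eigh(cov)          # eigh returns ascending order
top = np.argsort(evals)[::-1][:n_modes]     # indices of the leading modes
lam, psi = evals[top], evecs[:, top]

# Global representation: one coefficient per eigenmode.
pca_coeffs = data @ psi                     # shape (n_samples, n_modes)

# Assumed LFA kernel K and projector P onto the retained subspace.
K = psi @ np.diag(lam ** -0.5) @ psi.T
P = psi @ psi.T

# Topographic representation: one output per grid position.
# K is symmetric, so data @ K applies the kernel to every sample.
lfa_outputs = data @ K                      # shape (n_samples, n_points)
output_cov = lfa_outputs.T @ lfa_outputs / n_samples

print("PCA coefficients per object:", pca_coeffs.shape[1])
print("LFA outputs per object:     ", lfa_outputs.shape[1])
print("max |output_cov - P|:", np.abs(output_cov - P).max())
```

Under this assumption the outputs carry the same N-dimensional information as the PCA coefficients, since K has rank N, but they are laid out on the input grid rather than by mode index.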
"Where You Look Determines What You See"
An organism actively determines which part of the world it samples
with its sensory organs. A well known example is that animals orient
their bodies, turn their heads, or move their eyes in order to "look
at" specific things in their environment. How does this affect the
structure of the signals typically reaching the visual system? Given
where you look, what does your visual system see? Put another way,
what is special about the parts of real-world images that you choose
to look at? To address these questions we recorded the eye positions
of human subjects as they looked at pictures of real-world scenes. Our
main finding agrees with basic intuition: we look at the places
which are "interesting". We have formalized this intuitive notion
using "compressibility" within a wavelet representation. We are
currently exploring the implications of this finding for visual
processing.
"Synaptic Plasticity in a Natural Image Environment"
Some of the talks in this workshop argue that the statistical
properties of cortical cells reflect statistical properties of the
natural environment (e.g. work by David Field). The visual environment
has a strong influence on receptive field properties and their
organization in the visual cortex. Therefore we argue that
plasticity, at least in part, is responsible for adapting receptive
field properties to the environment. We will first demonstrate that a
network of BCM neurons trained in a natural image environment captures
many of the components of cortical receptive fields and their
organization in visual cortex. Furthermore, it forms maps displaying
both ocular dominance bands and varying orientation selectivity.
We will also show that the response of neurons in this network is
sparse; however, this does not stem from including an explicit sparsity
term but is the outcome of BCM learning and its underlying
computational principles. We will explain these principles, namely
high-selectivity fixed points and dependence on the higher-order
statistics of the visual environment. Finally, we will discuss how
these computational issues relate to the basic question of neuronal
coding: Efficient for What?
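The idea of high-selectivity fixed points can be illustrated with a toy BCM neuron. The sketch below is not the natural-image network described above. It assumes a single linear neuron, two fixed input patterns presented with equal probability, the quadratic modification function phi = y (y - theta) with sliding threshold theta = E[y^2], and an ensemble-averaged (deterministic) update; the patterns, initial weights and learning rate are arbitrary.

```python
import numpy as np

# Two toy input patterns (rows), presented with equal probability.
patterns = np.array([[1.0, 0.2],
                     [0.2, 1.0]])
probs = np.array([0.5, 0.5])

w = np.array([0.6, 0.4])   # initial weights, slightly favouring pattern 0
eta = 0.05                 # learning rate (assumed value)

for _ in range(5000):
    y = patterns @ w                      # response to each pattern
    theta = probs @ y ** 2                # sliding threshold: E[y^2]
    phi = y * (y - theta)                 # BCM modification function
    w += eta * (probs * phi) @ patterns   # ensemble-averaged weight update

y_final = patterns @ w
print("final responses:", np.round(y_final, 4))
```

Starting from a weakly biased neuron, the responses move to a selective state: a strong response to one pattern and essentially none to the other, with no explicit sparsity term in the rule.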
(Some related publications can be found on my home page and on Nathan
Intrator's home page.)
Jack Gallant
Hans van Hateren