VS298 (Fall 06): Suggested projects
This is not an exhaustive listing, just some suggestions to get you started thinking.
- NETtalk. Train a multi-layer perceptron to convert text to speech. You can get Sejnowski & Rosenberg's original paper and the data they used here. (You will need a DECtalk speech synthesizer to play the phonemes - you can probably pick up a used one online.)
- Recognition of handwritten digits. Train an MLP to classify handwritten digits 0-9. You can get some training data here. You may wish to follow the convolutional network methodology of Yann LeCun (try the simpler, earlier model), or invent your own method.
- Sparse coding and decorrelation. Implement Peter Foldiak's network and train it on the handwritten digits above to learn the features of this data. You may wish to then try supervised learning on the learned features to see if it has simplified the classification problem.
- Cortical maps. The elastic net model of Durbin and Mitchison is typical of many cortical map models in that it learns directly on a parameterized feature space. But the cortex simply gets a bunch of inputs from the LGN, and so it needs to learn features such as orientation at the same time as it organizes them into a feature map. How would you go about learning a feature map for orientation and position directly from simulated LGN inputs? (You may wish to consult the book of Risto Miikkulainen for recent efforts in this area.)
- The 'magic TV'. Let's say you wake up one day to find someone rewired your optic nerve (or you have been implanted with a prosthetic retina). The signals from retina to brain are still intact, but the wires are all mixed up and end in the wrong places. Since neighboring pixels in natural images are correlated, it should be possible to learn a remapping that "descrambles" the image by exploiting these correlations. See if you can train a Kohonen-style network to learn the proper topographic mapping of an image based on the statistics of natural images. (Kohonen dubbed this problem 'the Magic TV'.)
- Feedforward vs. recurrent weights. As we discussed in class, one can implement a given input-output mapping in a neural network using just feedforward weights: <math> y = W x</math>, or using just recurrent weights: <math> \tau dy/dt + y = x + M y</math>, or both: <math>\tau dy/dt + y = W x + M y</math>. Probably there is a trade-off here between overall wiring length and settling time - i.e., feedforward networks are fast but require lots of synapses, while recurrent networks are slower but can implement more complex functions with local connections. Explore these trade-offs for a particular problem - e.g., implementing an array of Gabor filters in a model of V1.
- Sparse codes and associative memory. The advantages of storing and recalling patterns using an associative memory as opposed to a conventional computer memory are 1) parallel search, and 2) denoising (recall of an uncorrupted pattern from partial or degraded input). However, associative memory models do not work well when applied directly to natural data such as images or sound. Rather, they are best suited (have highest capacity) for sparse patterns (i.e., patterns with many zeros). Recent work (to be discussed in class) has shown how it is possible to convert natural images and sounds into a sparse format, and there is some evidence for this happening in the brain. See if you can link these ideas in order to store natural images or sounds in an associative memory.
- Entropy of natural images. Because of the strong statistical dependencies that exist among pixels in natural images, the true entropy of an image patch is far less than N x (entropy/pixel), where N is the number of pixels. But calculating the true entropy is impossible in practice because it requires collecting the full joint pdf over an image patch, and for any reasonably sized patch this is intractable. Thus the true entropy of natural images is currently unknown. But there are other ways to estimate the entropy of data beyond direct calculation from the pdf. See for example the recent work of David Field and colleagues, or methods based on the Hausdorff dimension. Implement one of these methods, or see if you can come up with your own method, for estimating the true entropy of images.
- ICA applied to EEG. ICA has been successfully applied to EEG signals, to separate noise artifacts from brain signals, and also to find separate signal sources within the brain. See if you can get some EEG data from one of the labs on campus (e.g., Klein or Knight labs) and use ICA to reveal the hidden structure in this data. To get started, look at Makeig S, Bell AJ, Jung T-P, Sejnowski TJ (1996) "Independent Component Analysis of Electroencephalographic Data." In: Advances in Neural Information Processing Systems 8, 145-151.
- Hierarchical restricted Boltzmann machines. Geoff Hinton has recently developed a hierarchical auto-encoder network, based on the restricted Boltzmann machine, for modelling structure in data. Implement this network and train it on the handwritten digits data or some other dataset of your choosing.
- Integrate-and-fire model neurons. The integrate-and-fire model is a simple model for capturing the temporal integration and spiking aspects of real neurons. Using such a simplified model it is possible to begin exploring some interesting questions about sensory coding in neurons. For example, how is it possible to encode a continuous, time-varying signal using a population of spiking neurons? (See "Spikes" by Rieke et al., or "Principles of Neural Engineering" by Eliasmith and Anderson for an extended discussion of this issue.) You may also wish to explore the effect of adding more realistic biophysical properties, such as the dependence of threshold on membrane potential (see work by Gray and Azouz).
- Oscillations. Oscillations in neural activity are pervasive throughout the brain. What kinds of neural circuits are capable of eliciting oscillating behavior in spiking neurons? How could oscillations be coordinated across large regions of cortex? (Ask Fritz Sommer.) What role might they play in the processing of information? John Hopfield has suggested that spike timing relative to the phase of an ongoing oscillation could code information. What factors would need to be considered in order to make this idea viable?
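
As a possible starting point for the feedforward vs. recurrent project in the list above, here is a minimal Python sketch of the equivalence between the two formulations. At steady state the recurrent equation gives y = (I - M)^{-1} x, so a recurrent network with weights M computes the same mapping as a feedforward network with W = (I - M)^{-1}. The nearest-neighbor choice of M, the network size, the test input, and the Euler step size are all illustrative assumptions, not part of the course material.

```python
import numpy as np

def settle(W, M, x, tau=1.0, dt=0.01, steps=5000):
    """Euler-integrate tau dy/dt + y = W x + M y, starting from y = 0."""
    y = np.zeros(M.shape[0])
    drive = W @ x
    for _ in range(steps):
        y = y + (dt / tau) * (-y + drive + M @ y)
    return y

n = 8  # number of units (assumed)

# Assumed local recurrent weights: each unit excites its two neighbors.
M_local = 0.25 * (np.eye(n, k=1) + np.eye(n, k=-1))

# Feedforward weights that give the same steady-state mapping.
W_equiv = np.linalg.inv(np.eye(n) - M_local)

x = np.sin(np.arange(n))               # arbitrary test input
y_ff = W_equiv @ x                     # one-shot feedforward answer
y_rec = settle(np.eye(n), M_local, x)  # recurrent answer after settling

print("max difference:", np.max(np.abs(y_ff - y_rec)))
print("recurrent synapses:  ", np.count_nonzero(M_local))
print("feedforward synapses:", np.count_nonzero(W_equiv))
```

In this toy case the recurrent network needs only nearest-neighbor connections, while the equivalent feedforward matrix is dense, but the recurrent answer is only available after the dynamics have settled. Counting the steps needed to reach a given tolerance would be one way to put a number on the settling-time side of the trade-off.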
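
For the integrate-and-fire project in the list above, the following is a minimal sketch of a single leaky integrate-and-fire neuron driven by a time-varying input, written in Python. The membrane equation used here (tau dv/dt = -v + input, with a hard threshold and reset), the dimensionless units, and all parameter values are assumptions chosen for illustration. There is no refractory period and no noise.

```python
import numpy as np

def lif_spike_times(current, dt=1e-4, tau=0.02, v_thresh=1.0, v_reset=0.0):
    """Simulate tau dv/dt = -v + current(t); spike and reset at threshold."""
    v = v_reset
    spikes = []
    for i, drive in enumerate(current):
        v += (dt / tau) * (drive - v)   # forward-Euler update
        if v >= v_thresh:
            spikes.append(i * dt)       # record spike time in seconds
            v = v_reset
    return np.array(spikes)

dt = 1e-4
t = np.arange(0.0, 1.0, dt)

# Assumed test signal: a slow sinusoid riding on a suprathreshold offset.
signal = 1.5 + 0.4 * np.sin(2 * np.pi * 3 * t)
spikes = lif_spike_times(signal, dt=dt)

# Constant inputs for comparison: one above and one below threshold.
spikes_const = lif_spike_times(np.full_like(t, 1.5), dt=dt)
spikes_sub = lif_spike_times(np.full_like(t, 0.9), dt=dt)

print("spikes (modulated input): ", len(spikes))
print("spikes (constant 1.5):    ", len(spikes_const))
print("spikes (subthreshold 0.9):", len(spikes_sub))
```

With the modulated input, the interspike intervals shorten and lengthen as the signal rises and falls, which is the simplest sense in which a spike train can carry a continuous signal. A population version would give each neuron its own gain, offset, or threshold and then ask how well the signal can be reconstructed from the pooled spikes.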