VS265: Class project Fall 2010
Project presentations will take place on project presentation day, TBD (typically just before or during finals week). You may present your project either as an oral (15 min.) or poster presentation.
Think of the class project as an extended lab assignment. In most of the labs we just scratched the surface of various network models. This is a chance to explore these ideas in more depth. In addition to giving an oral or poster presentation on your project at the end of the term, you should turn in a short writeup that describes the problem you investigated, why it is interesting, your approach, results, and conclusions. There is no length requirement but about 5-10 pages would be in the right ballpark.
This is not an exhaustive listing, just some suggestions to get you started thinking.
- NETtalk. Train a multi-layer perceptron to convert text to speech. You can get Sejnowski & Rosenberg's original paper and the data they used here. (You will need a DECtalk speech synthesizer to play the phonemes - you may be able to pick up a used one online.)
- Recognition of handwritten digits. Train an MLP to classify handwritten digits 0-9. You can get some training data here. You may wish to follow the convolutional network methodology of Yann LeCun (try the simpler, earlier model), or invent your own method.
- Sparse coding and decorrelation. Implement Peter Foldiak's network and train it on the handwritten digits above to learn the features of this data. How do the learned features change as a function of the number of units in the network? You may wish to then try supervised learning on the learned features to see if it has simplified the classification problem.
- Cortical maps. The elastic net model of Durbin and Mitchison is typical of many cortical map models in that they learn directly on a parameterized feature space. But the cortex simply gets a bunch of inputs from the LGN, and so it needs to learn features such as orientation at the same time as it organizes them into a feature map. How would you go about learning a feature map for orientation and position directly from simulated LGN inputs? (You may wish to consult the book of Risto Miikkulainen for recent efforts in this area.)
- The 'magic TV'. Let's say you wake up one day to find someone rewired your optic nerve (or you have been implanted with a prosthetic retina). The signals from retina to brain are still intact, but the wires are all mixed up and end up in the wrong places. Since neighboring pixels in natural images are correlated, it should be possible to learn a remapping that "descrambles" the image by exploiting these correlations. See if you can train a Kohonen-style network to learn the proper topographic mapping of an image based on the statistics of natural images. (Kohonen dubbed this problem 'the Magic TV'.)
- Feedforward vs. recurrent weights. As we discussed in class, one can implement a given input-output mapping in a neural network using just feedforward weights, just recurrent weights, or both. Probably there is a tradeoff here in terms of minimizing overall wiring length and settling time - i.e., feedforward networks are fast but require lots of synapses, while recurrent networks are slower but can implement more complex functions with local connections. Explore these tradeoffs for a particular problem - e.g., implementing an array of Gabor filters in a model of V1.
- Sparse codes and associative memory. The advantages of storing and recalling patterns using an associative memory as opposed to conventional template matching are 1) parallel search, 2) distributed storage, and 3) denoising (recall of an uncorrupted pattern from partial or degraded input). However, associative memory models do not work well with natural data such as images or sound directly. Rather, they are best suited (have highest capacity) for sparse patterns (i.e., patterns with many zeros). Recent work (discussed in class) has shown how it is possible to convert natural images and sounds into a sparse format, and there is some evidence for this happening in the brain. See if you can link these ideas in order to store natural images or sounds in an associative memory.
- Hardware implementation of associative memory. The analog Hopfield model has a direct physical implementation as an electrical circuit of resistors, capacitors, and op-amps. Try building a scaled-down version of this model in hardware. What issues arise in the implementation of this model? How long does it take to converge to a local minimum?
- Analysis of Hopfield dynamics. The Hopfield dynamics may be seen as performing a modified form of gradient descent on the energy function - i.e., the state of each unit follows a monotonically increasing function of the gradient rather than the gradient itself. The same is true of the LCA (locally competitive algorithm) used for sparse coding. Is the resulting trajectory more efficient for reaching the energy minimum than what you would get from doing steepest descent?
- Bump circuits. Implement Kechen Zhang's bump circuit model discussed in class. How robust is the model to perturbations of the weights? How might such a circuit self-correct for any imperfections in the weights?
- Map-seeking circuits. David Arathorn has described a neural circuit for doing invariant object recognition which utilizes three-way interactions among units - see "Map-seeking circuits in Visual Cognition," Stanford University Press, 2002. Try implementing this model. One issue to explore is the memory matching scheme, which basically uses template matching. Can this be improved using a distributed representation such as in a Hopfield network?
- Entropy of natural images. We live in a highly structured environment, and thus the images that fall upon our retinae exhibit strong statistical dependencies among local pixel values. To get a feel for this, consider how long it would take to find an image resembling a natural scene if you viewed a series of images composed from random pixel values. This tells us that the true entropy of an image patch is far less than N x (entropy/pixel), where N is the number of pixels. But calculating the true entropy is impossible in practice because it requires collecting the full joint pdf over an image patch, and for any reasonably sized patch this is intractable. Thus the true entropy is currently unknown. But there are other ways to estimate the entropy of data beyond direct calculation from the pdf. See for example the recent work of David Field and colleagues, or methods based on the Hausdorff dimension. Implement one of these methods, or see if you can come up with your own method, for estimating the true entropy of images.
- ICA applied to EEG. ICA has been successfully applied to EEG signals, to separate noise artifacts from brain signals, and also to find separate signal sources within the brain. See if you can get some EEG data from one of the labs on campus (e.g., Klein or Knight labs) and use ICA to reveal the hidden structure in this data. To get started, look at this page.
- Hierarchical restricted Boltzmann machines. Geoff Hinton has developed a hierarchical auto-encoder network, based on the restricted Boltzmann machine, for modelling structure in data. Implement this network and train on the handwritten digits data or some other dataset of your choosing.
- Modeling dynamical systems with recurrent neural networks. As David Zipser showed in class, it is possible to train recurrent neural networks to learn limit cycles, chaos, and other forms of dynamics. Try using these learning rules to train a recurrent network to learn to recognize or emulate the structure of time-series data. (David will make his software available for this purpose to save you some time.) One form of dynamics you might try to model is "biological motion."
- Echo-state networks. Herbert Jaeger and others have shown that it is possible to utilize echo-state networks to learn rich forms of dynamics without having to train specific connections within the network itself, but rather by learning to "read out" the network in the right way. Try experimenting with this approach to see what sorts of dynamics it is capable of learning.
- Integrate-and-fire model neurons. The integrate-and-fire model is a simple model for capturing the temporal integration and spiking aspects of real neurons. Using such a simplified model it is possible to begin exploring some interesting questions about sensory coding in neurons. For example, how is it possible to encode a continuous, time-varying signal using a population of spiking neurons? (See "Spikes" by Rieke et al., or "Neural Engineering" by Eliasmith and Anderson for an extended discussion of this issue.) You may also wish to explore the effect of adding more realistic biophysical properties, such as the dependence of threshold on membrane potential (see work by Gray and Azouz).
- Oscillations. Oscillations in neural activity are pervasive throughout the brain. What kinds of neural circuits are capable of eliciting oscillating behavior in spiking neurons? How could oscillations be coordinated across large regions of cortex? (Ask Fritz Sommer.) What role might they play in the processing of information? John Hopfield has suggested that spike timing relative to the phase of an ongoing oscillation could code information. What factors would need to be considered in order to make this idea viable?
- Memristors. The memristor was originally proposed by Leon Chua at UC Berkeley back in the 1970s as a hypothetical "4th circuit element." Recently, a group at HP Labs has discovered a memristive-like element (see also this article). What is intriguing about the memristor is its synapse-like properties, and according to Chua it even leads to a more straightforward model of action potential dynamics! Might memristive elements exist in neural circuits in the brain?
- Learning in sensorimotor loops. An important goal of research in both robotics and neuroscience is to understand the principles of adaptive behavior in embodied systems. Currently there are few theoretical frameworks for conceptualizing the problems in this landscape. One approach is based on learning "sensorimotor contingencies." Another is based on minimizing both predictive and postdictive error.
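For the integrate-and-fire idea in the list above, a possible starting point is sketched here in Python with NumPy. It is only one way to set the model up, not code from the class: the function name `simulate_lif`, the forward-Euler update, the dimensionless units (input resistance folded into the current), and all parameter values (20 ms time constant, threshold of 1, reset to 0) are assumptions chosen for illustration.

```python
import numpy as np


def simulate_lif(current, dt=1e-4, tau=0.02, v_rest=0.0, v_thresh=1.0, v_reset=0.0):
    """Leaky integrate-and-fire neuron, integrated with forward Euler.

    current : 1-D array of input drive, one value per time step
              (dimensionless; assumed to already include the input resistance).
    Returns the membrane trace and a list of spike times in seconds.
    """
    v = v_rest
    voltages = np.empty(len(current))
    spike_times = []
    for i, i_in in enumerate(current):
        # Leak pulls v toward v_rest; the input pushes it up.
        v += dt / tau * (-(v - v_rest) + i_in)
        if v >= v_thresh:
            # Threshold crossing: record a spike and reset the membrane.
            spike_times.append(i * dt)
            v = v_reset
        voltages[i] = v
    return voltages, spike_times


# One second of constant drive at three levels (hypothetical values).
n_steps = 10_000
_, spikes_sub = simulate_lif(np.full(n_steps, 0.5))   # below threshold
_, spikes_mid = simulate_lif(np.full(n_steps, 1.5))
_, spikes_high = simulate_lif(np.full(n_steps, 3.0))
print(len(spikes_sub), len(spikes_mid), len(spikes_high))
```

With constant drive the spike count depends only on the input level, which gives a simple rate code. Replacing the constant array with a time-varying signal, and running several neurons with different thresholds or gains, is one way to start on the population-coding question.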
There are many sources of data on the web that you can use for these projects. Here are a few to get you started:
- Sam Roweis - Many datasets in Matlab format: MNIST and USPS handwritten digits, faces, text, speech.
- Hans van Hateren - Natural stimuli collection: still images, intensity time series, video.
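Several of the project ideas (the 'magic TV' and the entropy of natural images, for example) rest on the fact that nearby pixels are correlated. Before downloading a dataset you can check your analysis code on a stand-in image. The sketch below is a rough illustration in Python with NumPy, not part of any of the datasets above: `make_surrogate_image` builds a hypothetical surrogate by blurring white noise with a box filter, and `pixel_correlation` measures the correlation between pixels a given horizontal distance apart. Real natural images have different statistics, so treat the surrogate purely as a test input.

```python
import numpy as np


def make_surrogate_image(size=128, smooth=9, seed=0):
    """White noise blurred with a separable box filter.

    A stand-in for a natural image: it has short-range pixel correlations,
    but it is NOT a model of real natural-image statistics.
    """
    rng = np.random.default_rng(seed)
    img = rng.standard_normal((size, size))
    kernel = np.ones(smooth) / smooth
    # Blur along rows, then along columns.
    img = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, img)
    img = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, img)
    return img


def pixel_correlation(img, offset):
    """Correlation coefficient between pixels `offset` columns apart."""
    a = img[:, :-offset].ravel()
    b = img[:, offset:].ravel()
    return float(np.corrcoef(a, b)[0, 1])


img = make_surrogate_image()
for offset in (1, 2, 4, 8, 16):
    print(offset, round(pixel_correlation(img, offset), 3))
```

Swapping `img` for a patch loaded from one of the collections above lets you see how quickly the correlation falls off with distance in real data, which is the structure a descrambling network would have to exploit.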