VS265: Class project: Difference between revisions

* [http://www.cs.toronto.edu/~roweis/data.html '''Sam Roweis'''] - Many datasets in Matlab format: MNIST and USPS handwritten digits, faces, text,  speech.


* [http://hlab.phys.rug.nl/archive.html '''Hans van Hateren'''] - Natural stimuli collection: still images, intensity time series, video.  
* '''Hans van Hateren''' - Natural stimuli collection: still images, intensity time series, video.  
''Note:'' Hans van Hateren's website is down, there are two mirrors for the still images collection: [http://www.kyb.mpg.de/bethge/vanhateren/index.php mirror 2](Germany, detailed description), [http://pirsquared.org/research/#van-hateren-database mirror 2] (US,CA - faster)
<!-- [http://www.kyb.mpg.de/bethge/vanhateren/index.php mirror 2](Germany, detailed description), -->[http://pirsquared.org/research/#van-hateren-database mirror]


* [http://www.robjhyndman.com/TSDL/ '''Time Series Data Library''']


* [http://www.face-rec.org/databases/ '''Face Recognition Databases''']

Revision as of 04:07, 24 October 2014

Project presentations will take place on project presentation day (TBD). You may present your project either as an oral (15 min.) or poster presentation.

Think of the class project as an extended lab assignment. In most of the labs we just scratched the surface of various network models. This is a chance to explore these ideas in more depth. In addition to giving an oral or poster presentation on your project at the end of the term, you should turn in a short writeup (5 pages) that describes the problem you investigated, why it is interesting, your approach, results, and conclusions.

Project Suggestions

This is not an exhaustive listing, just some suggestions to get you started thinking.

  • Dendritic nonlinearities. Most of the models we have discussed are based upon the Perceptron model - a linear weighted sum of inputs followed by a single output nonlinearity (threshold or sigmoid). However, as we saw in the first week of class, even the passive properties of dendrites are such that inputs combine nonlinearly. There are also active processes (e.g., dendritic spikes) that are highly nonlinear. This would seem to imply that the concept of linear separability, which is central to machine learning, is perhaps not a fundamental limitation for neural systems. How might you exploit the nonlinear properties of dendrites for pattern discrimination and learning? What are the consequences of connecting such elements together in recurrent circuits? Do they still exhibit attractor dynamics? (See the work of Bartlett Mel to get started here, especially papers from the early 1990s and early 2000s.)
  • NETtalk. Train a multi-layer perceptron to convert text to speech. You can get Sejnowski & Rosenberg's original paper and the data they used here. (You will need a DECtalk speech synthesizer to play the phonemes - you may be able to pick up a used one online.)
  • Recognition of handwritten digits. Train an MLP to classify handwritten digits 0-9. You can get some training data here. You may wish to follow the convolutional network methodology of Yann LeCun (try the simpler, earlier model), or invent your own method.
  • Sparse coding and decorrelation. Implement Peter Foldiak's network and train it on the handwritten digits above to learn the features of this data. How do the learned features change as a function of the number of units in the network? You may wish to then try supervised learning on the learned features to see if it has simplified the classification problem.
  • Sparse codes and spikes. In a previous class project, Joel Zylberberg developed a variation of Foldiak's model that works with graded (analog) inputs such as images. It is composed of integrate-and-fire units that emit spikes similar to neurons. However, it was applied only to static images. How would you extend this model to encode time-varying images in a sparse and efficient manner?
  • Cortical maps. The elastic net model of Durbin and Mitchison is typical of many cortical map models in that they learn directly on a parameterized feature space. But the cortex simply gets a bunch of inputs from the thalamus, and so it needs to learn stimulus features at the same time it organizes them into a feature map. Sparse coding models provide one account for how features may be learned, but how would you go about extending this model to learn a feature map as well? (You may wish to consult the book of Risto Miikkulainen for recent efforts in this area.)
  • The 'magic TV'. Suppose you woke up one day to find that someone had rewired your optic nerve (or that you had been implanted with a prosthetic retina). The signals from retina to brain are intact, but the wires are all mixed up and end in the wrong places. Since neighboring pixels in natural images are correlated, it should be possible to learn a remapping that "descrambles" the image by exploiting these correlations. See if you can train a Kohonen-style network to learn the proper topographic mapping of an image based on the statistics of natural images. (Kohonen dubbed this problem 'the Magic TV'.)
  • Feedforward vs. recurrent weights. As we discussed in class, one can implement a given input-output mapping in a neural network using just feedforward weights: y = W x, or using just recurrent weights: dy/dt + y = x + M y, or both: dy/dt + y = W x + M y. Probably there is a trade-off here in terms of minimizing overall wiring length and settling time - i.e., feedforward networks are fast but require lots of synapses, while recurrent networks are slower but can implement more complex functions with local connections. Explore these tradeoffs for a particular problem - e.g., implementing an array of Gabor filters in a model of V1.
  • Sparse codes and associative memory. The advantages of storing and recalling patterns using an associative memory as opposed to conventional template matching are 1) parallel search, 2) distributed storage, and 3) denoising (recall of an uncorrupted pattern from partial or degraded input). However, associative memory models do not work well with natural data such as images or sound directly. Rather, they are best suited (have highest capacity) for sparse patterns (i.e., patterns with many zeros). Recent work (discussed in class) has shown how it is possible to convert natural images and sounds into a sparse format, and there is some evidence for this happening in the brain. See if you can link these ideas in order to store natural images or sounds in an associative memory. (You can read the work of Rehn and Sommer for one approach.)
  • Bump circuits. Implement Kechen Zhang's bump circuit model discussed in class. How robust is the model to perturbations of the weights? How might such a circuit be made robust and self-correct for any imperfections in the weights?
  • Map-seeking circuits. David Arathorn has described a neural circuit for doing invariant object recognition which utilizes three-way interactions among units - see "Map-seeking circuits in Visual Cognition," Stanford University Press, 2002 (I can loan you my copy). Try implementing this model. One issue to explore is the memory matching scheme, which basically uses template matching. Can this be improved using a distributed representation such as in a Hopfield network?
  • Entropy of natural images. We live in a highly structured environment, and thus the images that fall upon our retinae exhibit strong statistical dependencies among local pixel values. To get a feel for this, consider how long it would take to find an image resembling a natural scene if you viewed a series of images composed from random pixel values. This tells us that the true entropy of an image patch is far less than N x (entropy/pixel), where N is the number of pixels. But calculating the true entropy is impossible in practice because it requires collecting the full joint pdf over an image patch, and for any reasonably sized patch this is intractable. Thus it is currently unknown. But there are other ways to estimate the entropy of data beyond direct calculation from the pdf. See for example the recent work of David Field and colleagues, or methods based on the Hausdorff dimension. Implement one of these methods, or see if you can come up with your own method, for estimating the true entropy of images.
  • Manifold models of natural images. It has been shown/suggested that the structure of 3x3 pixel image patches extracted from natural scenes is a Klein bottle (i.e., a 2D manifold embedded in 9 dimensions) (Carlsson et al. 2008). See if you can recover this same topology from LLE (Locally Linear Embedding). What about larger image patches such as 5x5 or 10x10?
  • ICA applied to EEG. ICA has been successfully applied to EEG signals, to separate noise artifacts from brain signals, and also to find separate signal sources within the brain. See if you can get some EEG data from one of the labs on campus (e.g., Klein or Knight labs) and use ICA to reveal the hidden structure in this data. To get started, look at this page.
  • Hierarchical restricted Boltzmann machines. Geoff Hinton has developed a hierarchical auto-encoder network, based on the restricted Boltzmann machine, for modelling structure in data. Implement this network and train on the handwritten digits data or some other dataset of your choosing.
  • Modeling dynamical systems with recurrent neural networks. As David Zipser showed in class, it is possible to train recurrent neural networks to learn limit cycles, chaos, and other forms of dynamics. Try using these learning rules to train a recurrent network to learn to recognize or emulate the structure of time-series data. (Read David's manual for his BPTT simulator to get started.) One form of dynamics you might try to model is "biological motion."
  • Echo-state networks. Herbert Jaeger and others have shown that it is possible to utilize echo-state networks to learn rich forms of dynamics without having to train specific connections within the network itself, but rather by learning to "read out" the network in the right way. Others have proposed this as a way to think of what neural circuits in the cortex are doing (Buonomano & Maass, 2009; Sussillo & Abbott, 2009). Explore the merits of this approach for modeling dynamics of natural signals such as images or sound, or for modeling the dynamics of neural activity.
  • Integrate-and-fire model neurons. The integrate-and-fire model is a simple model for capturing the temporal integration and spiking aspects of real neurons. Using such a simplified model it is possible to begin exploring some interesting questions about sensory coding in neurons. For example, how is it possible to encode a continuous, time-varying signal using a population of spiking neurons? (See "Spikes" by Rieke et al., or "Principles of Neural Engineering" by Eliasmith and Anderson for an extended discussion of this issue.) You may also wish to explore the effect of adding more realistic biophysical properties, such as the dependence of threshold on membrane potential (see work by Gray and Azouz).
  • Oscillations. Oscillations in neural activity are pervasive throughout the brain. What kinds of neural circuits are capable of eliciting oscillating behavior in spiking neurons? How could it be coordinated across large regions of cortex? (ask Fritz Sommer) What role might it play in the processing of information? John Hopfield has suggested that spike timing relative to the phase of an ongoing oscillation could code information. See also Koepsell et al. (2010). What factors would need to be considered in order to make this idea viable?
  • Learning in sensorimotor loops. A goal of research in both robotics and neuroscience is to understand the principles of adaptive behavior in embodied systems. However, currently there are few theories for guiding work in this area. One approach advanced by O'Regan and colleagues is based on learning "sensorimotor contingencies" (ref1, ref2). Another by Ralf Der and colleagues is based on minimizing both predictive and 'postdictive' error (ref). Both propose simple algorithmic examples that you can implement in computer simulation, or you may want to try implementing on a simple robotic platform.
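
As a starting point for the "Feedforward vs. recurrent weights" suggestion above: when the recurrent dynamics dy/dt + y = W x + M y are stable, they settle to the fixed point y = (I - M)^-1 W x, which is itself a purely feedforward mapping. The sketch below checks this numerically in plain Python. The 2-unit network, the weight values, and the helper names (matvec, settle, inverse_2x2) are all made up for illustration, and forward-Euler integration is just one simple choice of solver.

```python
def matvec(A, v):
    """Multiply matrix A (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]

def settle(W, M, x, dt=0.1, steps=500):
    """Forward-Euler integration of dy/dt = -y + W x + M y, starting at y = 0."""
    drive = matvec(W, x)              # constant feedforward input W x
    y = [0.0] * len(drive)
    for _ in range(steps):
        rec = matvec(M, y)            # recurrent input M y
        y = [yi + dt * (-yi + di + ri) for yi, di, ri in zip(y, drive, rec)]
    return y

def inverse_2x2(A):
    """Closed-form inverse of a 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical weights: identity feedforward, symmetric recurrent coupling.
W = [[1.0, 0.0], [0.0, 1.0]]
M = [[0.0, 0.5], [0.5, 0.0]]
x = [1.0, 0.0]

# Recurrent network: let the dynamics settle.
y_recurrent = settle(W, M, x)

# Equivalent feedforward network: y = (I - M)^-1 W x in a single step.
I_minus_M = [[1.0 - M[0][0], -M[0][1]], [-M[1][0], 1.0 - M[1][1]]]
y_feedforward = matvec(inverse_2x2(I_minus_M), matvec(W, x))

print(y_recurrent, y_feedforward)
```

The number of Euler steps needed to converge gives a crude measure of settling time, while the number of nonzero entries in (I - M)^-1 W versus W and M gives a crude measure of synapse count, so the same setup can be extended to explore the tradeoff for larger filter arrays.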

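For the "Integrate-and-fire model neurons" suggestion above, a minimal sketch of a leaky integrate-and-fire unit may help in getting started. It assumes the dimensionless form tau dV/dt = -V + I(t), with a spike and reset whenever V reaches threshold. The function name simulate_lif and all parameter values are illustrative assumptions, not taken from the course labs.

```python
def simulate_lif(current, tau=0.02, threshold=1.0, v_reset=0.0,
                 dt=0.0001, duration=1.0):
    """Return spike times (in seconds) for tau * dV/dt = -V + current(t).

    `current` is a function of time; forward Euler is used for integration.
    """
    v = v_reset
    spikes = []
    n_steps = int(round(duration / dt))
    for step in range(n_steps):
        t = step * dt
        v += dt / tau * (-v + current(t))   # leaky integration of the input
        if v >= threshold:                  # threshold crossing: emit a spike
            spikes.append(t)
            v = v_reset                     # reset the membrane potential
    return spikes

# Constant inputs of increasing strength (hypothetical values).
weak = simulate_lif(lambda t: 0.9)      # stays below threshold
strong = simulate_lif(lambda t: 1.5)
stronger = simulate_lif(lambda t: 3.0)

print(len(weak), len(strong), len(stronger))
```

With a constant input below threshold the unit never fires, and above threshold the firing rate grows with the input, so a single unit already gives a simple rate code. Passing a time-varying function as `current`, and giving a population of units different thresholds or gains, is one way to begin exploring how a continuous signal could be encoded.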

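For the "Entropy of natural images" suggestion above, the gap between the true entropy and the N x (entropy/pixel) bound can be seen on a toy example. The sketch below uses a plug-in (histogram) entropy estimate on a made-up 1-D "image" in which each pixel is identical to its neighbor within a pair. It is not one of the estimation methods mentioned above, and the data and the name entropy_bits are purely illustrative.

```python
from collections import Counter
from math import log2

def entropy_bits(samples):
    """Plug-in entropy estimate (in bits) from empirical frequencies."""
    counts = Counter(samples)
    total = sum(counts.values())
    return -sum(c / total * log2(c / total) for c in counts.values())

# Toy 1-D "image": every value appears twice in a row, so the two pixels
# in each non-overlapping pair are perfectly correlated.
row = [0, 0, 1, 1, 2, 2, 3, 3, 0, 0, 1, 1, 2, 2, 3, 3]
pairs = [(row[i], row[i + 1]) for i in range(0, len(row), 2)]

h_pixel = entropy_bits(row)     # entropy of a single pixel
h_pair = entropy_bits(pairs)    # joint entropy of a 2-pixel patch
bound = 2 * h_pixel             # N x (entropy/pixel) with N = 2

print(h_pixel, h_pair, bound)
```

Here the joint entropy of a 2-pixel patch is only half the independent-pixel bound. The same histogram approach breaks down for realistic patches, because the number of possible patch values grows exponentially with N and the plug-in estimate becomes badly biased when most values are never observed, which is why indirect estimators are needed.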
Data

There are many sources of data on the web that you can use for these projects. Here are a few to get you started:

  • Sam Roweis - Many datasets in Matlab format: MNIST and USPS handwritten digits, faces, text, speech.
  • Hans van Hateren - Natural stimuli collection: still images, intensity time series, video. (mirror)