Jascha Sohl-Dickstein
I am a graduate student in the Redwood Center for Theoretical Neuroscience at the University of California, Berkeley. I am a member of Bruno Olshausen's lab and the Biophysics Graduate Group. My email address is jascha@berkeley.edu.
I am interested in how we learn to perceive the world. There is evidence that much of our representation of the world is learned during development rather than being genetically hardwired - everything from the way light intensity is correlated on adjacent patches of the retina, all the way up to the behavior and existence of objects. We seem to infer most of human-scale physics from examples of sensory input. How this unsupervised learning problem is solved - how we learn the structure inherent in the world just by experiencing examples of it - is not well understood. This is the problem I am interested in tackling.
Practically, I spend most of my effort developing techniques to train highly flexible but intractable probabilistic models, using ideas from statistical mechanics and dynamical systems.
Code
- HAIS - This repository contains Matlab code to perform partition function estimation, log likelihood estimation, and importance weight estimation in models with intractable partition functions and continuous state spaces, using Hamiltonian Annealed Importance Sampling. It can also be used for standard Hamiltonian Monte Carlo sampling (single step, with partial momentum refreshment). (A toy annealed importance sampling sketch appears just after this list.)
- MPF - This repository contains Matlab code implementing Minimum Probability Flow learning for several cases, specifically:
- MPF_ising/ - parameter estimation in the Ising model (a toy sketch of the corresponding MPF objective appears just after this list)
- MPF_RBM_compare_log_likelihood/ - parameter estimation in Restricted Boltzmann Machines. This directory also includes code comparing the log likelihood of small RBMs trained via pseudolikelihood and Contrastive Divergence to ones trained via MPF.
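
For a flavor of what the HAIS code estimates, here is a minimal Matlab sketch of plain annealed importance sampling on a toy Gaussian target whose log partition function is known exactly. It is not the released code: a random-walk Metropolis step stands in for the Hamiltonian transitions, and every constant below is an arbitrary choice for the example.

    % Toy annealed importance sampling on a Gaussian target with a known
    % log partition function.  A random-walk Metropolis step stands in for
    % the Hamiltonian transitions used by the HAIS repository.
    d = 3;                                         % dimensionality
    A = [2 .5 0; .5 1 .2; 0 .2 3];                 % toy target precision matrix
    logf_base   = @(x) -0.5 * sum(x.^2, 1);        % unnormalized log base density (standard normal)
    logf_target = @(x) -0.5 * sum(x .* (A*x), 1);  % unnormalized log target density
    logZ_base   = 0.5 * d * log(2*pi);             % base log partition function, known

    K = 1000;  N = 500;                            % annealing steps, number of particles
    beta = linspace(0, 1, K+1);                    % annealing schedule
    x    = randn(d, N);                            % exact samples from the base distribution
    logw = zeros(1, N);                            % log importance weights

    for k = 2:K+1
        % accumulate the AIS weight for moving from beta(k-1) to beta(k)
        logw = logw + (beta(k) - beta(k-1)) * (logf_target(x) - logf_base(x));
        % one Metropolis step leaving the k-th intermediate distribution invariant
        logf_k = @(y) (1 - beta(k)) * logf_base(y) + beta(k) * logf_target(y);
        xprop  = x + 0.3 * randn(d, N);
        accept = log(rand(1, N)) < logf_k(xprop) - logf_k(x);
        x(:, accept) = xprop(:, accept);
    end

    % mean(w) estimates Z_target / Z_base
    logZ_est  = logZ_base + max(logw) + log(mean(exp(logw - max(logw))));
    logZ_true = 0.5 * d * log(2*pi) - 0.5 * log(det(A));
    fprintf('estimated log Z = %.3f,  true log Z = %.3f\n', logZ_est, logZ_true);

The repository replaces the Metropolis step here with single-step Hamiltonian dynamics and partial momentum refreshment.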
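
And here is a toy sketch of the objective minimized in the Ising case, assuming binary {0,1} units, energy E(x) = x'*J*x with J symmetric (diagonal entries acting as bias terms), and single-bit-flip connectivity; the released MPF_ising code may use different conventions, and the random "data" below is only a placeholder.

    % Toy MPF objective for an Ising model with binary {0,1} units,
    % energy E(x) = x'*J*x (J symmetric, diagonal terms acting as biases),
    % and single-bit-flip connectivity.
    d = 5;  n = 100;                      % number of units, number of samples
    X = double(rand(d, n) > 0.5);         % d x n binary "data" (placeholder)
    J = 0.1 * randn(d);  J = (J + J')/2;  % symmetric coupling matrix

    % Flipping bit i of sample x changes the energy by
    %   E(x_flip_i) - E(x) = 2*(1 - 2*x(i))*(J(i,:)*x) + J(i,i)
    % so the matrix of E(x) - E(x_flip_i), over all bits i and samples x, is
    dE = -2 * (1 - 2*X) .* (J * X) - repmat(diag(J), 1, n);

    % MPF objective (up to an overall constant):
    %   K = (1/n) * sum over samples and bits of exp( dE / 2 )
    % Parameter estimation = minimizing K in J with any gradient-based optimizer.
    K = sum(exp(dE(:) / 2)) / n;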
Projects
- Minimum Probability Flow Learning - A collaboration with Peter Battaglino and Michael R. DeWeese. MPF is a technique for parameter estimation in un-normalized probabilistic models. It proves to be an order of magnitude faster than competing techniques for the Ising model, and an effective tool for learning parameters in any model whose normalization constant is intractable. See the tech report, and released code.
- Hamiltonian Annealed Importance Sampling - A collaboration with Jack Culpepper. Allows the estimation of importance weights - and thus partition functions and log likelihoods - for intractable probabilistic models. See the tech report, and the released code.
- Lie group models for transformations in natural video - A collaboration with Jimmy Wang and Bruno Olshausen. We train first-order differential operators on inter-frame differences in natural video, in order to learn a set of natural transformations. We further explore the use of these transformations in video compression. See the tech report, and the DCC paper. (A toy illustration of the first-order model appears just after this list.)
- Hessian Aware Online Optimization - By rewriting the inverse Hessian in terms of its Taylor expansion, and then accumulating terms in the expansion in an online fashion, neat things can be done... (A quick numerical check of the series expansion itself appears just after this list.)
- Bilinear Generative Models for Natural Images - A collaboration with Jack Culpepper and Bruno Olshausen.
- Expectation Maximization and Hamiltonian Dynamics - A collaboration with Jack Culpepper and Bruno Olshausen.
- A Device for Human Echolocation - A collaboration with Nicol Harper and Chris Rodgers.
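
As a toy illustration of the first-order model in the Lie group project above (far simpler than the actual algorithm, and with a made-up rotation generator standing in for transformations learned from video), the following Matlab snippet fits a single operator A to inter-frame differences of a synthetic two-dimensional trajectory, under the linearization x(t+1) ≈ x(t) + A*x(t):

    % Recover a single generator from "inter-frame differences" dx = A*x of a
    % synthetic rotating 2-D signal.  The generator G, time step, and trajectory
    % length are all invented for the example.
    G  = [0 -1; 1 0];                    % true generator (in-plane rotation)
    dt = 0.05;  T = 200;
    X  = zeros(2, T);  X(:,1) = randn(2, 1);
    for t = 1:T-1
        X(:,t+1) = expm(G * dt) * X(:,t);   % frames produced by the Lie group action
    end

    dX = X(:, 2:end) - X(:, 1:end-1);    % inter-frame differences
    A  = dX * pinv(X(:, 1:end-1));       % least-squares fit of dX = A*X
    disp(A / dt)                         % approximately recovers G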
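
The Hessian-aware project above starts from a standard identity: when the eigenvalues of H lie in (0, 2), the inverse Hessian can be written as the Taylor (Neumann) series inv(H) = sum over k of (I - H)^k, whose terms can be accumulated one at a time. The Matlab snippet below only checks that identity numerically; it is not the project's algorithm, and the matrix is an arbitrary example.

    % Check:  inv(H) = sum_{k>=0} (I - H)^k,  valid when eig(H) lies in (0, 2).
    d = 4;
    [Q, ~] = qr(randn(d));                        % random orthogonal matrix
    H = Q * diag(0.3 + 1.4 * rand(d, 1)) * Q';    % eigenvalues in (0.3, 1.7)

    Hinv = zeros(d);  term = eye(d);
    for k = 0:100
        Hinv = Hinv + term;                       % accumulate (I - H)^k
        term = term * (eye(d) - H);
    end
    disp(norm(Hinv - inv(H)))                     % should be near zero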
Notes
- Sampling the Connectivity Pattern in Minimum Probability Flow Learning - Describes how the connectivity pattern between states in MPF can be specified using a proposal distribution, rather than a deterministic rule.
- Entropy of Generic Distributions - Calculates the entropy that can be expected for a distribution drawn at random from the simplex of all possible distributions. (John Schulman points out that ET Jaynes deals with similar questions in chapter 11 of "Probability Theory: The Logic of Science".)
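
As a quick numerical companion to the entropy note above: drawing a distribution uniformly from the simplex is equivalent to drawing from a Dirichlet(1,...,1) distribution, whose expected entropy has the standard closed form psi(n+1) - psi(2) = sum_{k=2}^{n} 1/k nats. The Matlab snippet below checks this by Monte Carlo; the alphabet size and number of draws are arbitrary.

    % Expected entropy of a distribution drawn uniformly from the simplex,
    % estimated by Monte Carlo and compared against sum_{k=2}^{n} 1/k nats.
    n = 10;  N = 100000;                  % alphabet size, number of random distributions
    R = -log(rand(n, N));                 % i.i.d. exponential variates
    P = R ./ repmat(sum(R, 1), n, 1);     % normalized: uniform draws from the simplex
    H = -sum(P .* log(P), 1);             % entropy of each draw, in nats
    fprintf('Monte Carlo: %.4f   closed form: %.4f\n', mean(H), sum(1 ./ (2:n)));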
The following are titles for informal notes I intend to write, but haven't gotten to/finished yet. If any of the following sound interesting to you, pester me and they will appear more quickly.
- Natural gradients explained via an analogy to signal whitening
- A log bound on the growth of intelligence with system size
- The Field of Experts model learns Gabor-like receptive fields when trained via minimum probability flow or score matching
- For small time bins, generalized linear models and causal Boltzmann machines become equivalent
- How to construct phase space volume preserving recurrent networks
- Maximum likelihood learning as constraint satisfaction
- A spatial derivation of score matching
Publications
J Sohl-Dickstein, BJ Culpepper. Hamiltonian annealed importance sampling for partition function estimation. Redwood Technical Report (2011) http://marswatch.astro.cornell.edu/jascha/pdfs/HAIS.pdf
A Hayes, J Grotzinger, L Edgar, SW Squyres, W Watters, J Sohl-Dickstein. Reconstruction of Eolian Bed Forms and Paleocurrents from Cross-Bedded Strata at Victoria Crater, Meridiani Planum, Mars. Journal of Geophysical Research (2011)
CM Wang, J Sohl-Dickstein, I Tosik. Lie Group Transformation Models for Predictive Video Coding. Proceedings of the Data Compression Conference (2011) http://marswatch.astro.cornell.edu/jascha/pdfs/PID1615931.pdf
BJ Culpepper, J Sohl-Dickstein, B Olshausen. Learning higher-order features of natural images via factorization. Under review (2011)
J Sohl-Dickstein, CM Wang, BA Olshausen. An Unsupervised Algorithm For Learning Lie Group Transformations. Redwood Technical Report (2009) http://arxiv.org/abs/1001.1027
J Sohl-Dickstein, P Battaglino, M DeWeese. Minimum probability flow learning. Redwood Technical Report (2009) http://arxiv.org/abs/0906.4779
C Abbey, J Sohl-Dickstein, BA Olshausen. Higher-order scene statistics of breast images. Proceedings of SPIE (2009) http://link.aip.org/link/?PSISDG/7263/726317/1
K Kinch, J Sohl-Dickstein, J Bell III, JR Johnson, W Goetz, GA Landis. Dust deposition on the Mars Exploration Rover Panoramic Camera (Pancam) calibration targets. Journal of Geophysical Research-Planets (2007) http://www.agu.org/pubs/crossref/2007/2006JE002807.shtml
POSTER - J Sohl-Dickstein, BA Olshausen. Learning in energy based models via score matching. Cosyne (2007) - this (dense!) poster introduces a spatial derivation of score matching, applies it to learning in a Field of Experts model, and then extends Field of Experts to work with heterogeneous experts (to form a "tapestry of experts"). I'm including it as it hasn't been written up elsewhere. download poster
JR Johnson, J Sohl-Dickstein, WM Grundy, RE Arvidson, J Bell III, P Christensen, T Graff, EA Guinness, K Kinch, R Morris, MK Shepard. Radiative transfer modeling of dust-coated Pancam calibration target materials: Laboratory visible/near-infrared spectrogoniometry. Journal of Geophysical Research (2006) http://www.agu.org/pubs/crossref/2006/2005JE002658.shtml
J Bell III, J Joseph, J Sohl-Dickstein, H Arneson, M Johnson, M Lemmon, D Savransky. In-flight calibration and performance of the Mars Exploration Rover Panoramic Camera (Pancam) instruments. Journal of Geophysical Research (2006) http://www.agu.org/pubs/crossref/2006/2005JE002444.shtml
Parker et al. Stratigraphy and sedimentology of a dry to wet eolian depositional system, Burns formation, Meridiani Planum, Mars. Earth and Planetary Science Letters (2005)
Soderblom et al. Pancam multispectral imaging results from the Opportunity rover at Meridiani Planum. Science (2004) http://www.sciencemag.org/content/306/5702/1703
Soderblom et al. Pancam multispectral imaging results from the Spirit rover at Gusev crater. Science (2004) http://www.sciencemag.org/content/305/5685/800
Smith et al. Athena microscopic imager investigation. Journal of Geophysical Research-Planets (2003)
Bell et al. Hubble Space Telescope Imaging and Spectroscopy of Mars During 2001. American Geophysical Union (2001)