Interpretable neural network models of visual cortex - A scattering transform approach
Donald Shi Pui Li, Michael F. Bonner, Johns Hopkins University, United States
Posters 2 Poster
Pacific Ballroom H-O
Fri, 26 Aug, 19:30 - 21:30 Pacific Time (UTC -7)
Deep neural networks excel at predicting image-evoked cortical responses in the human brain. However, their performance relies on deep learning with large datasets, resulting in black-box models whose internal representations are challenging to interpret. Here we consider an alternative to standard deep neural networks known as the scattering transform. Like a convolutional neural network, the scattering transform has a hierarchical structure with multiple convolutional and non-linear activation layers. However, instead of having learnable convolutions, these models have pre-defined Gabor-like wavelets in every convolutional layer, which means that no learning is required. The higher-order representations of this model can be interpreted as contour co-occurrence statistics. We examined fMRI datasets of human object and scene perception and found that a relatively simple and compact scattering transform model can provide surprisingly good predictions of image-evoked responses in high-level visual cortex. Our findings show that the scattering transform approaches the performance of Places365-pretrained AlexNet and may provide a more interpretable and parsimonious modeling framework for the computational neuroscience of vision.
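The architecture described in the abstract (fixed Gabor-like wavelets, a non-linearity, and a hierarchy of such layers with no learning) can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' model. It assumes the complex modulus as the non-linearity, global spatial averaging in place of a local low-pass filter, and a square image with an even side length. The function names `gabor_bank` and `scattering`, and the values chosen for `sigma` and `xi`, are hypothetical choices made for the example.

```python
import numpy as np

def gabor_bank(size, n_scales=3, n_orients=4):
    """Fixed (non-learned) zero-mean Gabor filters, returned in the Fourier domain."""
    half = size // 2                       # assumes an even side length
    ys, xs = np.mgrid[-half:half, -half:half]
    bank = []
    for j in range(n_scales):
        sigma = 0.8 * 2 ** j               # envelope width (illustrative value)
        xi = 0.75 * np.pi / 2 ** j         # carrier frequency (illustrative value)
        envelope = np.exp(-(xs ** 2 + ys ** 2) / (2 * sigma ** 2))
        for k in range(n_orients):
            theta = np.pi * k / n_orients
            carrier = np.exp(1j * xi * (xs * np.cos(theta) + ys * np.sin(theta)))
            # Subtract the DC component so the filter has zero mean.
            dc = (envelope * carrier).sum() / envelope.sum()
            psi = envelope * (carrier - dc)
            bank.append((j, np.fft.fft2(np.fft.ifftshift(psi))))
    return bank

def scattering(image, n_scales=3, n_orients=4):
    """Zeroth-, first- and second-order scattering coefficients of a 2D image."""
    bank = gabor_bank(image.shape[0], n_scales, n_orients)
    x_hat = np.fft.fft2(image)
    coeffs = [image.mean()]                              # zeroth order
    for j1, psi1 in bank:
        u1 = np.abs(np.fft.ifft2(x_hat * psi1))          # wavelet conv + modulus
        coeffs.append(u1.mean())                         # first order
        u1_hat = np.fft.fft2(u1)
        for j2, psi2 in bank:
            if j2 > j1:                                  # only coarser second-layer scales
                u2 = np.abs(np.fft.ifft2(u1_hat * psi2))
                coeffs.append(u2.mean())                 # second order
    return np.array(coeffs)

rng = np.random.default_rng(0)
features = scattering(rng.random((32, 32)))
print(features.shape)
```

Each second-order coefficient combines one first-layer filter with one coarser second-layer filter, which is one way to read the "contour co-occurrence" interpretation: it measures how the response envelope of one oriented filter varies at a second scale and orientation. Because the convolutions here are circular and the pooling is global, the resulting features are unchanged by circular shifts of the input.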