Do We Need Deep Learning? Towards High-Performance Encoding Models of Visual Cortex Using Modules of Canonical Computations
Atlas Kazemian, Eric Elmoznino, Michael F. Bonner, Johns Hopkins University, United States
Session:
Posters 2 (Poster)
Location:
Pacific Ballroom H-O
Presentation Time:
Fri, 26 Aug, 19:30 - 21:30 Pacific Time (UTC -8)
Abstract:
The field of computational neuroscience has witnessed a surge of interest in convolutional neural networks (CNNs) trained through deep learning, following the finding that they can recapitulate representations of visual information along the ventral stream. This has led to the routine use of CNNs as standard encoding models of visual cortex, despite limitations such as a heavy dependence on training data and limited interpretability. Here, we propose an alternative approach that addresses these limitations without sacrificing performance. We introduce a family of hand-engineered models based on a module of convolution operations combined with a set of canonical neural computations, resulting in a high-performance model that requires little to no training. We present one such architecture and compare its encoding performance to a standard CNN by linearly mapping each model’s features to fMRI responses. We show that, with no learning involved, the hand-engineered model competes with the trained CNN in predicting object-evoked and scene-evoked fMRI responses in visual cortex. Our approach opens the possibility of designing high-performance encoding models without relying on deep learning, and it holds promise for revealing critical inductive biases and computational efficiencies of visual cortex.
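To make the two ingredients of the abstract concrete, the sketch below illustrates (1) an untrained module built from convolution plus canonical computations and (2) a linear mapping from model features to voxel responses. The specific computations shown (rectification, divisive normalization, spatial pooling), the layer sizes, and the use of ridge regression for the linear mapping are illustrative assumptions, not the authors' released implementation; the abstract names only "canonical neural computations" and a "linear mapping" in general terms.

```python
# Minimal sketch, assuming a rectification / divisive-normalization / pooling
# module with fixed random filters (no learning), and a ridge-regularized
# linear mapping from features to synthetic stand-in fMRI responses.
# All sizes and the ridge penalty are illustrative choices.

import torch
import torch.nn.functional as F

torch.manual_seed(0)


def canonical_module(images, n_filters=64, kernel_size=7, pool_size=8, eps=1e-6):
    """One untrained module: convolution -> rectification ->
    divisive normalization -> spatial pooling."""
    # Fixed random filters: no learning involved.
    weights = torch.randn(n_filters, images.shape[1], kernel_size, kernel_size)
    responses = F.conv2d(images, weights, padding=kernel_size // 2)
    responses = F.relu(responses)                      # rectification
    # Divisive normalization: divide each channel by the pooled activity
    # across channels at the same spatial location.
    norm = responses.pow(2).mean(dim=1, keepdim=True).sqrt()
    responses = responses / (norm + eps)
    responses = F.avg_pool2d(responses, pool_size)     # spatial pooling
    return responses.flatten(start_dim=1)              # one feature vector per image


def fit_ridge(features, voxels, alpha=1.0):
    """Closed-form ridge regression mapping features -> voxel responses."""
    x = features - features.mean(dim=0)
    y = voxels - voxels.mean(dim=0)
    gram = x.T @ x + alpha * torch.eye(x.shape[1])
    return torch.linalg.solve(gram, x.T @ y)           # weights: features x voxels


# Toy usage with synthetic data standing in for stimuli and fMRI voxels.
images = torch.randn(50, 3, 32, 32)     # 50 RGB "stimuli"
voxels = torch.randn(50, 200)           # responses for 200 "voxels"

features = canonical_module(images)                     # (50, n_features)
weights = fit_ridge(features, voxels, alpha=10.0)
predicted = (features - features.mean(dim=0)) @ weights + voxels.mean(dim=0)
print(features.shape, predicted.shape)
```

In an actual encoding analysis, the ridge weights would be fit on held-out training stimuli and evaluated by correlating predicted and measured responses on test stimuli; the toy data here only demonstrate the shapes and the flow from module features to voxel predictions.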