Many but not All Deep Neural Network Audio Models Predict Auditory Cortex Responses and Exhibit Hierarchical Layer-Region Correspondence
Greta Tuckute, Jenelle Feather, Dana Boebinger, Josh H. McDermott, Massachusetts Institute of Technology, United States
Posters 2 Poster
Pacific Ballroom H-O
Fri, 26 Aug, 19:30 - 21:30 Pacific Time (UTC -7)
Deep neural networks are commonly used as models of the visual system, but have been less explored in audition. We evaluated brain-model correspondence for publicly available audio neural network models along with in-house models trained on four different tasks. Most tested models out-predicted previous filter-bank models of auditory cortex and exhibited systematic layer-region correspondence: middle layers best predicted primary auditory cortex, while deep layers best predicted non-primary cortex. However, some of the publicly available models trained for engineering purposes produced substantially worse brain predictions. The results support the hypothesis that hierarchical models optimized for auditory tasks often learn representational transformations that coarsely resemble those in auditory cortex, but indicate that state-of-the-art engineering models can deviate substantially from biological systems.
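
The abstract does not spell out how brain-model correspondence was measured. One common approach, assumed here purely for illustration, is to fit a regularized linear regression from each layer's activations to a measured response and compare held-out prediction accuracy across layers. The minimal sketch below uses synthetic data; the function names, the layer names, the ridge penalty, and the train/test split are all hypothetical and are not taken from the study.

```python
import numpy as np


def ridge_fit_predict(X_train, y_train, X_test, alpha=1.0):
    """Fit ridge regression on centered data and predict held-out responses."""
    x_mean = X_train.mean(axis=0)
    y_mean = y_train.mean()
    Xc = X_train - x_mean
    yc = y_train - y_mean
    n_features = Xc.shape[1]
    # Closed-form ridge solution: (X^T X + alpha * I)^{-1} X^T y
    weights = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(n_features), Xc.T @ yc)
    return (X_test - x_mean) @ weights + y_mean


def heldout_correlation(X, y, n_train, alpha=1.0):
    """Pearson correlation between predicted and actual held-out responses."""
    pred = ridge_fit_predict(X[:n_train], y[:n_train], X[n_train:], alpha)
    return float(np.corrcoef(pred, y[n_train:])[0, 1])


def best_layer(layer_activations, response, n_train, alpha=1.0):
    """Return the best-predicting layer name and the per-layer scores."""
    scores = {
        name: heldout_correlation(acts, response, n_train, alpha)
        for name, acts in layer_activations.items()
    }
    return max(scores, key=scores.get), scores


# Synthetic stand-ins (assumption): 120 sounds, 10 features per layer.
rng = np.random.default_rng(0)
n_sounds, n_features, n_train = 120, 10, 80
layers = {
    name: rng.standard_normal((n_sounds, n_features))
    for name in ("early", "middle", "deep")
}

# Simulated responses: the "primary" response is built from the middle layer
# and the "non-primary" response from the deep layer, plus a little noise.
primary = layers["middle"] @ rng.standard_normal(n_features)
primary = primary + 0.1 * rng.standard_normal(n_sounds)
nonprimary = layers["deep"] @ rng.standard_normal(n_features)
nonprimary = nonprimary + 0.1 * rng.standard_normal(n_sounds)

best_primary, primary_scores = best_layer(layers, primary, n_train)
best_nonprimary, nonprimary_scores = best_layer(layers, nonprimary, n_train)
print("primary:", best_primary)
print("non-primary:", best_nonprimary)
```

Because the simulated responses are constructed from specific layers, the selected layers here reflect only how the synthetic data were generated, not any empirical finding. With real data, the penalty would typically be chosen by cross-validation and scores averaged over several splits.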