Kavli Affiliate: Edward Chang
| Authors: Yuanning Li, Gopala K Anumanchipalli, Abdelrahman Mohamed, Junfeng Lu, Jinsong Wu and Edward F Chang
| Summary:
The human auditory system extracts rich linguistic abstractions from the speech signal. Traditional approaches to understand this complex process have used classical linear feature encoding models, with limited success. Artificial neural networks have recently achieved remarkable speech recognition performance and offer potential alternative computational models of speech processing. We used the speech representations learned by state-of-the-art deep neural network (DNN) models to investigate neural coding across the ascending auditory pathway from the peripheral auditory nerve to auditory speech cortex. We found that representations in hierarchical layers of the DNN correlated well to neural activity throughout the ascending auditory system. Unsupervised speech models achieve the optimal neural correlations among all models evaluated. Deeper DNN layers with context-dependent computations were essential for populations of high order auditory cortex encoding, and the computations were aligned to phonemic and syllabic context structures in speech. Accordingly, DNN models trained on a specific language (English or Mandarin) predicted cortical responses in native speakers of each language. These results reveal convergence between representations learned in DNN models and the biological auditory pathway and provide new approaches to modeling neural coding in the auditory cortex.