Deep Learning Improves Parameter Estimation in Reinforcement Learning Models

Kavli Affiliate: Marcelo Mattar

| Authors: Hua-Dong Xiong, Li Ji-An, Marcelo G Mattar and Robert C Wilson

| Summary:

Cognitive models are widely used in psychology and neuroscience to formulate and test hypotheses about cognitive processes. These processes are characterized by model parameters, which are then used for scientific inference. The reliability of scientific conclusions from cognitive modeling depends critically on the reliability of parameter estimation, yet estimating parameters remains a universal challenge, particularly when data are too limited to constrain them. In such cases, multiple sets of parameters may explain the experimental data equally well within the same model, raising the question of which parameters are scientifically meaningful. We refer to this problem as parameter ambiguity. In this paper, we investigate parameter ambiguity in reinforcement learning models under two optimization methods: the de facto standard Nelder-Mead method (fminsearch) and a neural network trained to estimate parameters using a modern deep learning pipeline, an approach that has seen limited application in cognitive modeling. Across ten decision-making datasets, we consistently find that the two methods produce substantially different parameter estimates despite achieving nearly identical fitting performance. To address this ambiguity, we introduce a systematic evaluation framework that goes beyond predictive accuracy to assess generalizability, robustness, identifiability, and test-retest reliability, thereby offering principled guidance on which parameter estimates should inform scientific inference. Applying this framework reveals that the neural network with a deep learning pipeline outperforms the Nelder-Mead method across these metrics. Our study establishes parameter ambiguity as an underappreciated challenge with significant implications for scientific replicability, highlighting that the choice of optimization method is a critical factor shaping scientific conclusions.
We advocate for our multi-faceted evaluation approach to ensure reliable scientific inference and for broader integration of modern deep learning pipelines into cognitive modeling.
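
To make the Nelder-Mead baseline concrete, here is a minimal sketch of maximum-likelihood fitting for a simple reinforcement learning model. It is an illustration under stated assumptions, not the authors' pipeline: the two-armed bandit task, the reward probabilities, the "true" parameter values, and the function names `simulate` and `neg_log_lik` are all hypothetical, and SciPy's `minimize(..., method="Nelder-Mead")` stands in for MATLAB's `fminsearch`, which uses the same simplex algorithm. Running the fit from several starting points and comparing the results is one simple way to see whether different parameter estimates reach similar likelihoods.

```python
import numpy as np
from scipy.optimize import minimize


def simulate(alpha, beta, n_trials=200, p_reward=(0.8, 0.2), seed=0):
    """Simulate a Q-learning agent on a hypothetical two-armed bandit."""
    rng = np.random.default_rng(seed)
    q = np.zeros(2)
    choices = np.empty(n_trials, dtype=int)
    rewards = np.empty(n_trials)
    for t in range(n_trials):
        # Softmax over two options reduces to a logistic in the value difference.
        p_choose_1 = 1.0 / (1.0 + np.exp(-beta * (q[1] - q[0])))
        c = int(rng.random() < p_choose_1)
        r = float(rng.random() < p_reward[c])
        q[c] += alpha * (r - q[c])  # delta-rule update of the chosen option
        choices[t], rewards[t] = c, r
    return choices, rewards


def neg_log_lik(theta, choices, rewards):
    """Negative log-likelihood; theta is unconstrained (logit alpha, log beta)."""
    alpha = 1.0 / (1.0 + np.exp(-theta[0]))  # learning rate in (0, 1)
    beta = np.exp(theta[1])                  # inverse temperature > 0
    q = np.zeros(2)
    nll = 0.0
    for c, r in zip(choices, rewards):
        logits = beta * q
        nll -= logits[c] - np.logaddexp(logits[0], logits[1])
        q[c] += alpha * (r - q[c])
    return nll


# Assumed ground-truth parameters, used only to generate example data.
choices, rewards = simulate(alpha=0.3, beta=3.0)

# Fit from several starting points and record (alpha, beta, NLL) for each.
starts = [np.array([0.0, 0.0]), np.array([-2.0, 2.0]), np.array([2.0, -1.0])]
fits = []
for x0 in starts:
    res = minimize(neg_log_lik, x0, args=(choices, rewards), method="Nelder-Mead")
    alpha_hat = 1.0 / (1.0 + np.exp(-res.x[0]))
    beta_hat = np.exp(res.x[1])
    fits.append((alpha_hat, beta_hat, res.fun))
    print(f"alpha={alpha_hat:.3f}  beta={beta_hat:.3f}  NLL={res.fun:.2f}")
```

Optimizing in an unconstrained space (logit of the learning rate, log of the inverse temperature) is a design choice made here so that plain Nelder-Mead can be used without bound handling; other parameterizations are equally plausible.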