Accumulation of neural state transitions in dorsomedial striatum predicts patch foraging decisions

Kavli Affiliate: Marshall Hussain Shuler

| Authors: Elissa Sutlief, Shichen Zhang, Kate Forsberg and Marshall G Hussain Shuler

| Summary:

Activities with temporally distributed returns require deciding not only what to do, but when to stop. Although optimal foraging theory predicts that patch departure should depend on the total time invested in a patch, behavior across species often reflects sensitivity to recent rewards. The dorsomedial striatum (DMS) mediates both goal-directed decision-making and interval timing, two functions that converge during patch foraging, where animals must decide when to exit patches to maximize reward rates. We recorded extracellular activity from neurons in DMS while freely moving mice performed a patch-foraging task. Mice employed a reward-reset strategy, primarily basing their exit decisions on the time since the last reward, with systematic adjustments for patch residence time and environmental reward-rate context. Individual neurons in DMS underwent discrete firing rate transitions at characteristic delays following each reward. These transition times tiled the post-reward interval across the population, producing an accumulation-to-threshold signal that reset with each reward and whose rate was sensitive to the cost of time, increasing with higher environmental reward rates and as rewards occurred later in the patch. This accumulation reached a consistent threshold at the moment of patch exit, encoding the animal’s intended stopping time. Together, these results identify a time-cost-variable, reward-reset, accumulation-to-threshold computation in DMS that integrates environmental reward rate, elapsed time from pursuit engagement, and elapsed time from reward occurrence to determine when to abandon a depleting pursuit.
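The reward-reset, accumulation-to-threshold idea described above can be illustrated with a toy simulation. This is only a minimal sketch under stated assumptions, not the authors' model or analysis. The function names, the evenly spaced transition delays, the `threshold` of 0.8, and the `rate_gain` parameter (a stand-in for the cost of time) are all hypothetical choices made for illustration. Each simulated unit switches state once its characteristic delay since the last reward has elapsed; the fraction of switched units serves as the accumulated signal, which resets at every reward and triggers patch exit when it reaches the threshold.

```python
import numpy as np


def transitioned_fraction(t_since_reward, delays, rate_gain=1.0):
    """Fraction of units whose (gain-scaled) transition delay has elapsed.

    rate_gain > 1 compresses every delay, so the signal accumulates faster.
    This is an assumed stand-in for a higher cost of time.
    """
    return float(np.mean(np.asarray(delays) / rate_gain <= t_since_reward))


def simulate_patch(reward_times, delays, threshold=0.8, rate_gain=1.0,
                   dt=0.05, t_max=60.0):
    """Return the first time the accumulated fraction reaches threshold.

    The accumulation clock starts at patch entry (t = 0, an assumption)
    and resets to zero at each reward. Returns None if the threshold is
    never reached before t_max.
    """
    rewards = sorted(reward_times)
    last_reward = 0.0
    next_idx = 0
    n_steps = int(round(t_max / dt))
    for step in range(n_steps + 1):
        t = step * dt
        # Reset the clock for every reward delivered up to time t.
        while next_idx < len(rewards) and rewards[next_idx] <= t:
            last_reward = rewards[next_idx]
            next_idx += 1
        if transitioned_fraction(t - last_reward, delays, rate_gain) >= threshold:
            return t
    return None


# Hypothetical population: 20 units whose delays tile 0.5 s to 10 s.
delays = np.linspace(0.5, 10.0, 20)

exit_no_reward = simulate_patch([], delays)
exit_with_reward = simulate_patch([3.0], delays)
exit_high_cost = simulate_patch([], delays, rate_gain=2.0)
print(exit_no_reward, exit_with_reward, exit_high_cost)
```

In this sketch, a reward delivered partway through the patch postpones the exit because the accumulation restarts, whereas a larger `rate_gain` brings the exit earlier because the same threshold is reached sooner. How the real population rate depends on environmental reward rate and on reward timing within the patch is an empirical result of the study and is not captured by this single scalar gain.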