D2 dopamine receptor expression, reactivity to rewards, and reinforcement learning in a complex value-based decision-making task.

Abstract:

Different dopamine (DA) receptor subtypes have opposing postsynaptic dynamics, with the ratio of D1 to D2 receptors determining the relative sensitivity to gains and losses, respectively, during value-based learning. This effective sensitivity to different reward feedback interacts with phasic DA levels to determine the effectiveness of learning, particularly in dynamic feedback situations where the frequency and magnitude of rewards need to be integrated over time to make optimal decisions. We modeled this effect in simulations of the underlying basal ganglia pathways and then tested the predictions in individuals carrying a variant of the human dopamine receptor D2 gene (DRD2; -141C Ins/Del and Del/Del) that is associated with lower levels of D2 receptor expression (N = 119), comparing their performance in the Iowa Gambling Task with that of noncarrier controls (N = 319). Ventral striatal (VS) reactivity to rewards was measured with fMRI during the Cards task. DRD2 variant carriers made less effective decisions than noncarriers but, contrary to the hypothesis from our model, this effect was not moderated by VS reward reactivity. These results suggest that the interaction between DA receptor subtypes and reactivity to rewards during learning may be more complex than originally thought.
