Keywords: chemotaxis; machine learning; navigation; reinforcement learning; sensing

Source: DOI:10.1093/pnasnexus/pgae235; PDF (PubMed)

Abstract:
We investigate the boundary between chemotaxis driven by spatial estimation of gradients and chemotaxis driven by temporal estimation. While it is well known that spatial chemotaxis becomes disadvantageous for small organisms at high noise levels, it is unclear whether the optimal strategy switches discontinuously or changes through a continuous transition. Here, we employ deep reinforcement learning to study the possible integration of spatial and temporal information in an a priori unconstrained manner. We parameterize such a combined chemotactic policy by a recurrent neural network and evaluate it using a minimal theoretical model of a chemotactic cell. By comparing with constrained variants of the policy, we show that it converges to purely temporal and purely spatial strategies at small and large cell sizes, respectively. We find that the transition between the regimes is continuous, and that in the transition region the combined strategy outperforms both the constrained variants and models that explicitly integrate spatial and temporal information. Finally, by utilizing the attribution method of integrated gradients, we show that the policy relies on a nontrivial combination of spatially and temporally derived gradient information in a ratio that varies dynamically during the chemotactic trajectories.
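To make the attribution step concrete, the following is a minimal sketch of integrated gradients on a toy function. It is not the paper's recurrent network: the scalar `policy_score`, the weight vector `w`, and the two-component input (one "spatial" and one "temporal" gradient estimate) are hypothetical stand-ins chosen only so that the gradient is available in closed form. The method attributes the output to each input by averaging the gradient along a straight path from a baseline to the input and scaling by the input difference.

```python
import numpy as np

def policy_score(x, w):
    """Toy differentiable stand-in for a policy output: tanh(w . x)."""
    return np.tanh(w @ x)

def policy_grad(x, w):
    """Analytic gradient of policy_score with respect to the input x."""
    return (1.0 - np.tanh(w @ x) ** 2) * w

def integrated_gradients(x, baseline, w, steps=200):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    # Interpolation coefficients at the midpoints of `steps` equal intervals.
    alphas = (np.arange(steps) + 0.5) / steps
    # Points on the straight path from baseline to x, shape (steps, n).
    path = baseline + alphas[:, None] * (x - baseline)
    grads = np.array([policy_grad(p, w) for p in path])
    # Average gradient along the path, scaled by the input difference.
    return (x - baseline) * grads.mean(axis=0)

# Hypothetical input: [spatial gradient estimate, temporal gradient estimate].
w = np.array([0.8, -0.5])          # assumed weights, for illustration only
x = np.array([1.2, 0.7])
baseline = np.zeros_like(x)        # all-zero baseline is an assumption

attributions = integrated_gradients(x, baseline, w)
print("attributions:", attributions)
print("sum:", attributions.sum(),
      "output difference:", policy_score(x, w) - policy_score(baseline, w))
```

A useful sanity check is the completeness property: the attributions should sum, up to discretization error, to the difference between the output at the input and at the baseline. The relative size of the two attributions then plays the role of the spatial-to-temporal ratio described above.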