Keywords: TensorFlow; artificial neural networks; nonlinear control systems; speech acoustics; voice production

Source: DOI:10.3390/app14020769

Abstract:
A computational neuromuscular control system was developed that generates lung pressure and three intrinsic laryngeal muscle activations (cricothyroid, thyroarytenoid, and lateral cricoarytenoid) to control the vocal source. In the current study, LeTalker, a biophysical computational model of the vocal system, was used as the physical plant. In LeTalker, a three-mass vocal fold model simulates self-sustained vocal fold oscillation. A constant /ə/ vowel was used for the vocal tract shape, and the trachea was modeled after MRI measurements. The neuromuscular control system generates control parameters to achieve four acoustic targets (fundamental frequency, sound pressure level, normalized spectral centroid, and signal-to-noise ratio) and four somatosensory targets (vocal fold length and the longitudinal fiber stress in each of the three vocal fold layers). The deep-learning-based control system comprises one acoustic feedforward controller and two feedback controllers (acoustic and somatosensory). Fifty thousand steady speech signals were generated using LeTalker to train the control system. The results demonstrated that the control system was able to generate the lung pressure and the three muscle activations such that the four acoustic and four somatosensory targets were reached with high accuracy. After training, the motor command corrections from the feedback controllers were minimal compared to the output of the feedforward controller, except for thyroarytenoid muscle activation.
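The controller arrangement described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the two feedback corrections are simply added to the feedforward command, and that each feedback controller receives the target-minus-measured error, neither of which the abstract states explicitly. The names `control_step`, `toy_feedforward`, and `toy_feedback` are hypothetical, and the toy functions are fixed placeholders standing in for the trained networks.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Controller = Callable[[Vector], List[float]]


def control_step(
    acoustic_target: Vector,
    acoustic_measured: Vector,
    somato_target: Vector,
    somato_measured: Vector,
    feedforward: Controller,
    acoustic_feedback: Controller,
    somato_feedback: Controller,
) -> List[float]:
    """Return four motor commands: lung pressure, CT, TA, and LCA activation.

    Assumption: the command is the feedforward output plus the two
    feedback corrections, added element-wise.
    """
    # Errors between the targets and what the plant actually produced.
    acoustic_error = [t - m for t, m in zip(acoustic_target, acoustic_measured)]
    somato_error = [t - m for t, m in zip(somato_target, somato_measured)]

    base = feedforward(acoustic_target)          # acoustic feedforward command
    corr_a = acoustic_feedback(acoustic_error)   # acoustic feedback correction
    corr_s = somato_feedback(somato_error)       # somatosensory feedback correction
    return [b + a + s for b, a, s in zip(base, corr_a, corr_s)]


# Hypothetical stand-ins for the trained networks (illustration only).
def toy_feedforward(target: Vector) -> List[float]:
    # Ignores the target and returns a fixed command vector.
    return [0.5, 0.2, 0.3, 0.4]


def toy_feedback(error: Vector) -> List[float]:
    # Proportional correction with an arbitrary gain of 0.1.
    return [0.1 * e for e in error]


command = control_step(
    acoustic_target=[1.0, 2.0, 3.0, 4.0],
    acoustic_measured=[0.0, 2.0, 3.0, 4.0],  # first acoustic target is missed
    somato_target=[5.0, 6.0, 7.0, 8.0],
    somato_measured=[5.0, 6.0, 7.0, 8.0],    # somatosensory targets are met
    feedforward=toy_feedforward,
    acoustic_feedback=toy_feedback,
    somato_feedback=toy_feedback,
)
print(command)
```

When the measured values match the targets, both corrections are zero and the command reduces to the feedforward output, which mirrors the abstract's observation that the feedback corrections were small after training for most commands.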