Sound

  • Article type: Journal Article
    Sound recognition is effortless for humans but poses a significant challenge for artificial hearing systems. Deep neural networks (DNNs), especially convolutional neural networks (CNNs), have recently surpassed traditional machine learning in sound classification. However, current DNNs map sounds to labels using binary categorical variables, neglecting the semantic relations between labels. Cognitive neuroscience research suggests that human listeners exploit such semantic information in addition to acoustic cues. Hence, our hypothesis is that incorporating semantic information improves DNNs' sound recognition performance, emulating human behaviour. In our approach, sound recognition is framed as a regression problem, with CNNs trained to map spectrograms to continuous semantic representations from NLP models (Word2Vec, BERT, and the CLAP text encoder). Two DNN types were trained: semDNN with continuous embeddings and catDNN with categorical labels, both with a dataset extracted from a collection of 388,211 sounds enriched with semantic descriptions. Evaluations across four external datasets confirmed the superiority of semantic labeling from semDNN compared to catDNN, preserving higher-level relations. Importantly, an analysis of human similarity ratings for natural sounds showed that semDNN approximated human listener behaviour better than catDNN, other DNNs, and NLP models. Our work contributes to understanding the role of semantics in sound recognition, bridging the gap between artificial systems and human auditory perception.
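    To make the contrast between categorical and semantic targets concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the three-dimensional "embeddings" are made-up numbers (not Word2Vec, BERT, or CLAP vectors), the cosine-based loss is one plausible regression loss and is not stated in the abstract, and the function names are hypothetical.

```python
import math

# Toy 3-dimensional "semantic embeddings" (made-up values, NOT real
# Word2Vec/BERT/CLAP vectors), chosen so related labels lie close together.
EMBEDDINGS = {
    "dog bark": [0.9, 0.1, 0.0],
    "cat meow": [0.8, 0.3, 0.0],
    "car horn": [0.0, 0.1, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_loss(predicted, label):
    """Regression-style loss: 1 - cosine similarity to the label's embedding."""
    return 1.0 - cosine_similarity(predicted, EMBEDDINGS[label])


def categorical_loss(predicted_label, label):
    """0/1 loss on categorical labels: every wrong label is equally wrong."""
    return 0.0 if predicted_label == label else 1.0


def decode(predicted):
    """Map a predicted embedding back to the label with the nearest embedding."""
    return max(EMBEDDINGS, key=lambda name: cosine_similarity(predicted, EMBEDDINGS[name]))


# Hypothetical network output for some sound clip.
prediction = [0.85, 0.25, 0.05]

print("decoded label:", decode(prediction))
for label in EMBEDDINGS:
    print(label, "semantic loss:", round(semantic_loss(prediction, label), 3))
```

    With categorical targets, predicting "cat meow" for a dog bark costs exactly as much as predicting "car horn". With embedding targets, the loss is graded by semantic distance, which is the kind of higher-level relation the abstract reports semDNN preserving.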

  • Article type: Journal Article
    Restoration of vulnerable marine habitats is becoming increasingly popular to cope with widespread habitat loss and the resulting decline in biodiversity and ecosystem services. Lately, restoration strategies have been employed to enhance the recovery of degraded meadows of the Mediterranean endemic seagrass Posidonia oceanica. Typically, habitat restoration success is evaluated by the persistence of foundation species after transplantation (e.g., plant survival and growth) in the short and long term, although successful plant responses do not necessarily reflect the recovery of ecosystem biodiversity and functions. Recently, soundscape (the spatial, temporal and frequency attributes of ambient sound and the types of sound sources characterizing it) has been related to different habitat conditions and community structures. Thus, a successful restoration action should lead to acoustic restoration, and soundscape ecology could represent an important component of restoration monitoring, helping to assess successful habitat and community restoration. Here, we evaluated the acoustic community and metrics in a restored P. oceanica meadow and tested whether the plant transplant effectiveness after one year was accompanied by a restored soundscape. With this goal, acoustic recordings from degraded, transplanted and reference meadows were collected in Sardinia (Italy) using passive acoustic monitoring devices. The soundscape at each meadow type was examined using both spectral analysis and classification of fish calls based on a catalogue of fish sounds from the Mediterranean Sea. Seven different fish sounds were recorded: most of them were present in the reference and transplanted meadows and were associated with Sciaena umbra and Scorpaena spp. Sound Pressure Level (SPL, in dB re: 1 μPa-rms) and Acoustic Complexity Index (ACI) were influenced by the meadow type. Particularly high values were associated with the transplanted meadow. SPL and ACI calculated in the 200-2000 Hz frequency band were also related to a high abundance of fish sounds (chorus). These results showed that meadow restoration may lead to the recovery of the soundscape and the associated community, suggesting that short-term acoustic monitoring can provide complementary information to evaluate seagrass restoration success.
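    As an illustration of the SPL unit quoted above (dB re 1 μPa rms), the following minimal Python sketch computes a broadband SPL from pressure samples. The function names and the toy sample values are assumptions made for this example; the abstract does not describe the authors' processing chain, and neither the band-limited (200-2000 Hz) SPL nor the ACI is reproduced here.

```python
import math

P_REF_UPA = 1.0  # underwater reference pressure: 1 micropascal


def rms(samples):
    """Root-mean-square of a sequence of pressure samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def spl_db(samples_upa, p_ref=P_REF_UPA):
    """SPL in dB re p_ref for pressure samples expressed in micropascals."""
    return 20.0 * math.log10(rms(samples_upa) / p_ref)


# Toy signal (assumed values): a square wave of +/-1000 uPa has an RMS of 1000 uPa.
toy_signal = [1000.0, -1000.0, 1000.0, -1000.0]
print("SPL:", spl_db(toy_signal), "dB re 1 uPa")

# Doubling the pressure amplitude raises the level by 20*log10(2), about 6 dB.
louder = [2.0 * s for s in toy_signal]
print("increase:", spl_db(louder) - spl_db(toy_signal), "dB")
```

    Because the scale is logarithmic, differences between meadow types of a few dB correspond to sizeable ratios in RMS pressure.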

  • Article type: Journal Article
    BACKGROUND: Diabetic retinopathy (DR), a chronic and progressive microvascular complication of diabetes mellitus, substantially threatens vision and is a leading cause of blindness among working-age individuals worldwide. Traditional diagnostic methods, such as ophthalmoscopy and fluorescein angiography, are nonquantitative, invasive, and time-consuming. Analysis of protein biomarkers in tear fluid offers noninvasive insights into ocular and systemic health, aiding in early DR detection. This study introduces a surface acoustic wave (SAW) microchip that rapidly enhances fluorescence in bead-based immunoassays for sensitive and noninvasive DR detection from human tear samples.
    RESULTS: The device facilitated particle mixing for immunoassay formation and particle concentration in the droplet, resulting in an enhanced immunofluorescence signal. This detachable SAW microchip allows the disposal of the cover glass after every use, thereby improving the reusability of the interdigital transducer and minimizing potential cross-contamination. A preliminary clinical test was conducted on a cohort of 10 volunteers, including DR patients and healthy individuals. The results demonstrated strong agreement with ELISA studies, validating the high accuracy rate of the SAW microchip.
    CONCLUSIONS: This comprehensive study offers significant insights into the potential application of a novel SAW microchip for the early detection of DR in individuals with diabetes. By utilizing protein biomarkers found in tear fluid, the device facilitates noninvasive, rapid, and sensitive detection, potentially revolutionizing DR diagnostics and improving patient outcomes through timely intervention and management of this vision-threatening condition.

  • Article type: Journal Article
    Audible very-high-frequency sound (VHFS) and ultrasound (US) have been rated more unpleasant than lower-frequency sounds when presented to listeners at similar sensation levels (SLs). In this study, 17 participants rated the sensory unpleasantness of 14-, 16-, and 18-kHz tones and a 1-kHz reference tone. Tones were presented at equal subjective loudness levels for each individual, corresponding to levels of 10, 20, and 30 dB SL measured at 1 kHz. Participants were categorized as either "symptomatic" or "asymptomatic" based on self-reported previous symptoms that they attributed to exposure to VHFS/US. In both groups, subjective loudness increased more rapidly with sound pressure level for VHFS/US than for the 1-kHz reference tone, which is consistent with a reduced dynamic range at the higher frequencies. For loudness-matched tones, participants rated VHFS/US as more unpleasant than the 1-kHz reference. These results suggest that increased sensory unpleasantness and reduced dynamic range at high frequencies should be considered when designing or deploying equipment that emits VHFS/US that could be audible to exposed people.

  • Article type: Journal Article
    The study of perceived affective qualities (PAQs) in soundscape assessments has increased in recent years, with methods varying from in-situ to laboratory. Through technological advances, virtual reality (VR) has facilitated evaluations of multiple locations in the same experiment. In this paper, VR reproductions of different urban sites were presented in an online and a laboratory environment, testing three locations in Greater Manchester ('Park', 'Plaza', and pedestrian 'Street') at two population densities (empty and busy) using the ISO/TS 12913-2 (2018) soundscape PAQs. Audio and video recordings of the studied areas were prepared for 360 video and binaural audio VR reproductions. The aims were to observe population density effects within locations (Wilcoxon test) and variations between locations (Mann-Whitney U test) within methods. Population density and comparisons among locations demonstrated a significant effect on most PAQs. Results also suggested that big cities can present homogeneous sounds, composing a 'blended' urban soundscape, independently of functionality. These findings can support urban design in a low-cost approach, where urban planners can test different scenarios and interventions.

  • Article type: Journal Article
    Turbulence and sound are important cues for oyster reef larval recruitment. Numerous studies have found a relationship between turbulence intensity and swimming behaviors of marine larvae, while others have documented the importance of sounds in enhancing larval recruitment to oyster reefs. However, the relationship between turbulence and the reef soundscape is not well understood. In this study we made side-by-side acoustic Doppler velocimeter turbulence measurements and hydrophone soundscape recordings over 2 intertidal oyster reefs (1 natural and 1 restored) and 1 adjacent bare mudflat as a reference. Sound pressure levels (SPL) were similar across all three sites, although SPL > 2000 Hz was highest at the restored reef, likely due to its larger area that contained a greater number of sound-producing organisms. Flow noise (FN), defined as the mean of pressure fluctuations recorded by the hydrophone at f < 100 Hz, was significantly related to mean flow speed, turbulent kinetic energy, and turbulence dissipation rate (ε), agreeing with theoretical calculations for turbulence. Our results also show a similar relationship between ε and FN to what has been previously reported for ε vs. downward larval swimming velocity (w_b), with both FN and w_b demonstrating rapid growth at ε > 0.1 cm² s⁻³. These results suggest that reef turbulence and sounds may attract oyster larvae in complementary and synergistic ways.

  • Article type: Journal Article
    This study examined the influence of stimulus properties on sound externalization when listening with hearing aids. Normally hearing listeners were presented with broadband "tokens" (environmental sounds and speech) from loudspeakers, and rated externalization using a continuous scale. In separate blocks, they listened unaided or while wearing behind-the-ear hearing aids with closed domes and low gain (linear or compressive). There was a significant influence of token on ratings, even for unaided listening, and the effect of hearing aids depended on token. An acoustic analysis indicated that hearing aids were more likely to disrupt externalization for peakier sounds with a low-frequency emphasis.

  • Article type: Journal Article
    The focus of this perspective paper is on relationships between sound-producing body motion and corresponding perceived sound features, guided by the idea of shapes as the common denominator of these two domains. The term shape is used to denote graphical-pictorial renderings of phenomena that we perceive or imagine; such renderings may have physical manifestations as tracings on paper or on screen, or as gesticulations, or just as imagined tracings in our minds. Shapes give us intermittent snapshots of unfolding motion and sound fragments, and the point of shapes is to make ephemeral sound and motion features tractable as more permanent objects. Shapes of perceived sound include dynamic, spectral, textural, pitch-related, harmonic, and other features rendered as shapes, whereas shapes of sound-producing motion include both motion trajectories and postures of sound-producing effectors, i.e., of fingers, hands, arms, etc., or mouth, lips, and tongue.

  • Article type: Journal Article
    The rapid advancement of AI and machine learning has significantly enhanced sound and acoustic recognition technologies, moving beyond traditional models to more sophisticated neural network-based methods. Among these, Spiking Neural Networks (SNNs) are particularly noteworthy. SNNs mimic biological neurons and operate on principles similar to the human brain, using analog computing mechanisms. This capability allows for efficient sound processing with low power consumption and minimal latency, ideal for real-time applications in embedded systems. This paper reviews recent developments in SNNs for sound recognition, underscoring their potential to overcome the limitations of digital computing and suggesting directions for future research. The unique attributes of SNNs could lead to breakthroughs in mimicking human auditory processing more closely.
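    For readers unfamiliar with spiking models, the sketch below simulates a single leaky integrate-and-fire (LIF) neuron in plain Python. The LIF model is a common textbook choice and is used here as an assumption; the abstract does not say which neuron models the review covers, and all parameter values and names are illustrative.

```python
def simulate_lif(input_current, dt=1.0, tau=10.0, v_rest=0.0,
                 v_threshold=1.0, v_reset=0.0):
    """Simulate a leaky integrate-and-fire neuron with forward Euler steps.

    Returns the list of time-step indices at which the neuron spiked.
    All parameters are in arbitrary units chosen for illustration.
    """
    v = v_rest
    spikes = []
    for step, current in enumerate(input_current):
        # The membrane potential leaks toward rest and is driven by the input.
        v += dt / tau * (-(v - v_rest) + current)
        if v >= v_threshold:
            spikes.append(step)  # emit a spike ...
            v = v_reset          # ... and reset the membrane potential
    return spikes


# A weak input never reaches threshold; stronger inputs spike more often.
weak = simulate_lif([0.5] * 100)
medium = simulate_lif([2.0] * 100)
strong = simulate_lif([4.0] * 100)
print(len(weak), len(medium), len(strong))
```

    The neuron communicates only through sparse spike events, and its firing rate grows with input strength; this event-driven behaviour is what the low-power, low-latency claims for SNNs rest on.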

  • Article type: Journal Article
    Unpredictable deviations from an otherwise regular auditory sequence, as well as rare sounds following a period of silence, are detected automatically. Recent evidence suggests that the latter also elicit quick involuntary modulations of ongoing motor activity emerging as early as 100 ms following sound onset, which was attributed to supramodal processing. We explored such force modulations for both rare and deviant sounds. Participants (N = 29) pinched a force-sensitive device and maintained a force of 1-2 N for periods of 1 min. Task-irrelevant tones were presented under two conditions. In the Rare condition, 4000 Hz tones were presented every 8 to 16 s. In the Roving condition, 4000 Hz and 2996 Hz tones were presented at a rate of one per second, with infrequent (p = 1/12) frequency changes. In the Rare condition, transient force modulations were observed with a significant increase at ~234 ms and a decrease at ~350 ms. In the Roving condition with low-frequency deviant tones, an increase in force was observed at ~277 ms followed by a decrease at ~413 ms. No significant modulations were observed during perception of high-frequency deviants. These results suggest that both rare silence-breaking sounds and low-pitched deviants evoke automatic fluctuations of motor responses, which opens up the possibility that these force modulations are triggered by stimulus-specific change-detection processes.
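    The Roving condition can be pictured with a short Python sketch that generates such a tone sequence. This is only an illustration under stated assumptions: the abstract gives the two frequencies and the change probability (p = 1/12) but not the randomization constraints, so the independent per-tone draw, the starting frequency, the seed, and the function name are all hypothetical.

```python
import random


def roving_sequence(n_tones, freqs=(4000, 2996), p_change=1 / 12, seed=0):
    """Generate (frequency_hz, is_deviant) pairs for a roving tone sequence.

    The frequency repeats from tone to tone and switches to the other value
    with probability p_change; the first tone after a switch is the deviant.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    current = 0                # index into freqs; starting value is arbitrary
    sequence = []
    for i in range(n_tones):
        # The very first tone has no predecessor, so it cannot be a deviant.
        changed = i > 0 and rng.random() < p_change
        if changed:
            current = 1 - current
        sequence.append((freqs[current], changed))
    return sequence


tones = roving_sequence(60)  # one tone per second gives a 1-minute block
deviants = [i for i, (_, is_deviant) in enumerate(tones) if is_deviant]
print("deviant positions:", deviants)
```

    In this scheme a switch from 4000 Hz to 2996 Hz yields a low-frequency deviant and the reverse switch a high-frequency deviant, matching the two deviant types the abstract distinguishes.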