Word representation

  • Article type: Journal Article
    Words, unlike images, are symbolic representations. The associative details inherent within a word's meaning and the visual imagery it generates are inextricably connected to the way words are processed and represented. It is well recognised that the hippocampus associatively binds components of a memory to form a lasting representation, and here we show that the hippocampus is especially sensitive to abstract word processing. Using fMRI during recognition, we found that increased abstractness of words produced increased hippocampal activation regardless of memory outcome. Interestingly, word recollection produced hippocampal activation regardless of word content, while the parahippocampal cortex was sensitive to the concreteness of word representations, regardless of memory outcome. We reason that the hippocampus has assumed a critical role in the representation of uncontextualized abstract word meaning, as its information-binding ability allows the retrieval of the semantic and visual associates that, when bound together, generate the abstract concept represented by word symbols. These insights have implications for research on word representation, memory, and hippocampal function, perhaps shedding light on how the human brain has adapted to encode and represent abstract concepts.

  • Article type: Journal Article
    The notion of a single localized store of word representations has become increasingly less plausible as evidence has accumulated for the widely distributed neural representation of wordform grounded in motor, perceptual, and conceptual processes. Here, we attempt to combine machine learning methods and neurobiological frameworks to propose a computational model of brain systems potentially responsible for wordform representation. We tested the hypothesis that the functional specialization of word representation in the brain is driven partly by computational optimization. This hypothesis directly addresses the unique problem of mapping sound and articulation vs. mapping sound and meaning.
    We found that artificial neural networks trained on the mapping between sound and articulation performed poorly in recognizing the mapping between sound and meaning and vice versa. Moreover, a network trained on both tasks simultaneously could not discover the features required for efficient mapping between sound and higher-level cognitive states compared to the other two models. Furthermore, these networks developed internal representations reflecting specialized task-optimized functions without explicit training.
    Together, these findings demonstrate that different task-directed representations lead to more focused responses and better performance of a machine or algorithm and, hypothetically, the brain. Thus, we suggest that the functional specialization of word representation mirrors a computational optimization strategy given the nature of the tasks that the human brain faces.

  • Article type: Journal Article
    This study aims to shed light on whether familiar allophonic variation is encoded in word representations. Both Italian speakers born in Trentino and speakers born in the Central-Southern regions of Italy took part in the experiment. We tested the MMN elicited by the same word encompassing two different allophones, one of which was more familiar to one group of participants than to the other, depending on their regional variety of Italian. The Trentino group showed an enhanced MMN for the word embedding the familiar variant, while Central-Southern speakers showed no difference. The amplitude of the MMN for the unfamiliar word variant in Trentino speakers showed an inverse correlation with passive exposure to the Trentino dialect. We conclude that words embedding familiar and unfamiliar allophones are represented differently in the brains of native speakers of a regional language, and that the degree of differentiation is modulated by individual experience.

  • Article type: Journal Article
    Variability is pervasive in spoken language, in particular if one is exposed to two varieties of the same language (e.g., the standard variety and a dialect). Unlike in bilingual settings, standard and dialectal forms are often phonologically related, increasing the variability in word forms (e.g., German Fuß "foot" is produced as [fus] in Standard German and as [fs] in the Alemannic dialect). We investigate whether dialectal variability in children's input affects their ability to recognize words in Standard German, testing non-dialectal vs. dialectal children. Non-dialectal children, who typically grow up in urban areas, mostly hear Standard German forms, and hence encounter little segmental variability in their input. Dialectal children in turn, who typically grow up in rural areas, hear both Standard German and dialectal forms, and are hence exposed to a large amount of variability in their input. We employ the familiar word paradigm for German children aged 12-18 months. Since dialectal children from rural areas are hard to recruit for laboratory studies, we programmed an app that allows all parents to test their children at home. Looking times to familiar vs. non-familiar words were analyzed using a semi-automatic procedure based on neural networks. Our results replicate the familiarity preference for non-dialectal German 12-18-month-old children (longer looking times to familiar words than to non-familiar words). Dialectal children in the same age range, on the other hand, showed a novelty preference. One explanation for the novelty preference in dialectal children may be more mature linguistic processing, caused by more variability of word forms in the input. This linguistic maturation hypothesis is addressed in Experiment 2, in which we tested older children (18-24-month-olds). These children, who are not exposed to dialectal forms, also showed a novelty preference.
    Taken together, our findings show that both dialectal and non-dialectal German children recognized the familiar Standard German word forms, but their looking pattern differed as a function of the variability in the input. Frequent exposure to both dialectal and Standard German word forms may hence have affected the nature of (prelexical and/or) lexical representations, leading to more mature processing capacities.

  • Article type: Journal Article
    Feature-listing tasks are an invaluable resource for exploring how words, categories, and metaphors are represented. However, manually coding the generated features is time-consuming and expensive, and involves subjective judgments from the experimenter. The purpose of this paper is to introduce the "RK processor", a program that was developed in our lab to analyse metaphor feature data but which can also be applied to other feature-listing data. After detailing the steps of processing, we demonstrate that the processed feature data align with previous findings in which metaphor features were processed manually and that the processed features predict dimensions of metaphor judgments pertaining to comprehensibility and metaphor goodness. Lastly, we present several other applications for research on word similarity, compound words, categories and concepts, semantic ambiguity, incongruity resolution and computational modelling. The RK processor offers researchers a valuable tool to save time and resources and to maintain consistency in processing.

  • Article type: Journal Article
    Most deep language understanding models depend only on word representations, which are mainly based on language modelling derived from a large amount of raw text. These models encode distributional knowledge without considering syntactic structural information, although several studies have shown benefits of including such information. Therefore, we propose new syntactically-informed word representations (SIWRs), which allow us to enrich the pre-trained word representations with syntactic information without training language models from scratch. To obtain SIWRs, a graph-based neural model is built on top of either static or contextualised word representations such as GloVe, ELMo and BERT. The model is first pre-trained with only a relatively modest amount of task-independent data that are automatically annotated using existing syntactic tools. SIWRs are then obtained by applying the model to downstream task data and extracting the intermediate word representations. We finally replace word representations in downstream models with SIWRs for applications. We evaluate SIWRs on three information extraction tasks, namely nested named entity recognition (NER), binary and n-ary relation extractions (REs). The results demonstrate that our SIWRs yield performance gains over the base representations in these NLP tasks with 3-9% relative error reduction. Our SIWRs also perform better than fine-tuning BERT in binary RE. We also conduct extensive experiments to analyse the proposed method.
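
    The idea of enriching pre-trained word vectors with syntactic structure can be illustrated with a toy sketch. This is not the SIWR model from the abstract: the function name `syntactic_mix`, the fixed `self_weight`, and the simple neighbour averaging are all assumptions chosen for illustration, standing in for a trained graph-based neural layer. The sketch blends each word vector with the mean of the vectors it is linked to by dependency edges.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def syntactic_mix(
    vectors: List[Vector],
    edges: List[Tuple[int, int]],
    self_weight: float = 0.5,
) -> List[Vector]:
    """Blend each word vector with the mean of its dependency neighbours.

    `vectors[i]` is the base representation of word i (hypothetical values),
    and `edges` holds (head, dependent) index pairs from a dependency parse.
    Edges are treated as undirected; words with no edges are left unchanged.
    """
    neighbours: Dict[int, List[int]] = {i: [] for i in range(len(vectors))}
    for head, dependent in edges:
        neighbours[head].append(dependent)
        neighbours[dependent].append(head)

    mixed: List[Vector] = []
    for i, vec in enumerate(vectors):
        linked = neighbours[i]
        if not linked:
            mixed.append(list(vec))
            continue
        dim = len(vec)
        # Mean of the neighbouring word vectors, one dimension at a time.
        mean = [sum(vectors[j][d] for j in linked) / len(linked) for d in range(dim)]
        mixed.append(
            [self_weight * vec[d] + (1.0 - self_weight) * mean[d] for d in range(dim)]
        )
    return mixed


# Toy sentence "dogs chase cats": "chase" (index 1) heads both nouns.
toy_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_edges = [(1, 0), (1, 2)]
toy_mixed = syntactic_mix(toy_vectors, toy_edges)
```

    In a real system the base vectors would come from a model such as GloVe, ELMo or BERT, the edges from a syntactic parser, and the mixing weights would be learned rather than fixed.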

  • Article type: Journal Article
    Word learning is basic to foreign language acquisition, but it is time-consuming and not always successful. Empirical studies have shown that traditional (visual) word learning can be enhanced by gestures. The gesture benefit has been attributed to depth of encoding. Gestures can lead to depth of encoding because they trigger semantic processing and sensorimotor enrichment of the novel word. However, the neural underpinning of depth of encoding is still unclear. Here, we combined an fMRI and a behavioral study to investigate word encoding online. In the scanner, participants encoded 30 novel words of an artificial language created for experimental purposes and their translation into the subjects' native language. Participants encoded the words three times: visually, audiovisually, and by additionally observing semantically related gestures performed by an actress. Hemodynamic activity during word encoding revealed the recruitment of cortical areas involved in stimulus processing. In this study, depth of encoding can be spelt out in terms of sensorimotor brain networks that grow larger the more sensory modalities are linked to the novel word. Word retention outside the scanner documented a positive effect of gestures in a free recall test in the short term.

  • Article type: Journal Article
    BACKGROUND: Biomedical event extraction is a crucial task in biomedical text mining. As the primary forum for international evaluation of different biomedical event extraction technologies, the BioNLP Shared Task represents a trend in biomedical text mining toward fine-grained information extraction (IE). The fourth edition of the BioNLP Shared Task in 2016 (BioNLP-ST'16) proposed three tasks, among which the Bacteria Biotope event extraction (BB) task had already been put forward in earlier BioNLP-ST editions. Deep learning methods provide an effective way to automatically extract more complex features and achieve notable results in various natural language processing tasks.
    RESULTS: The experimental results show that the presented approach can achieve an F-score of 57.42% on the test set, which outperforms previous state-of-the-art official submissions to BioNLP-ST 2016.
    CONCLUSIONS: In this paper, we propose a novel Gated Recurrent Unit Networks framework integrating an attention mechanism for extracting biomedical events between biotope and bacteria from biomedical literature, utilizing the corpus from the Bacteria Biotope task of the BioNLP'16 Shared Task. The experimental results demonstrate the potential and effectiveness of the proposed framework.
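
    As a rough illustration of what an attention mechanism over recurrent hidden states does, the sketch below pools a sequence of hidden-state vectors into one context vector using softmax weights. The abstract does not specify the scoring function, so the dot product with a `query` vector is an assumption, as are the function name `attention_pool` and the hard-coded hidden states, which in the described framework would come from a GRU.

```python
import math
from typing import List, Tuple


def attention_pool(
    hidden_states: List[List[float]], query: List[float]
) -> Tuple[List[float], List[float]]:
    """Return (attention weights, context vector) for a sequence of states.

    Each state is scored by its dot product with `query` (an assumed scoring
    function), the scores are normalised with a softmax, and the context
    vector is the weighted sum of the states.
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Hypothetical hidden states for a three-token sentence.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_pool(states, query=[2.0, 2.0])
```

    With this query, the third state scores highest and therefore receives the largest weight; a trained model would learn the query (or a small scoring network) from data.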

  • Article type: Journal Article
    This work aims to estimate the degree of adverse drug reactions (ADR) for psychiatric medications from social media, including Twitter, Reddit, and LiveJournal. Advances in lightning-fast cluster computing were employed to process large-scale data, consisting of 6.4 terabytes of data containing 3.8 billion records from all the media. Rates of ADR were quantified using the SIDER database of drugs and side-effects, and an estimated ADR rate was based on the prevalence of discussion in the social media corpora. Agreement between these measures for a sample of ten popular psychiatric drugs was evaluated using the Pearson correlation coefficient, r, with values between 0.08 and 0.50. Word2vec, a novel neural learning framework, was utilized to improve the coverage of variants of ADR terms in the unstructured text by identifying syntactically or semantically similar terms. Improved correlation coefficients, between 0.29 and 0.59, demonstrate the capability of advanced techniques in machine learning to aid in the discovery of meaningful patterns from medical data, and social media data, at scale.
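
    The agreement measure named in the abstract, the Pearson correlation coefficient r, can be computed as in the minimal sketch below. The two rate lists are invented placeholder numbers, not data from the study, and the names `pearson_r`, `reference_rates` and `social_media_rates` are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when either sample is constant")
    return cov / math.sqrt(var_x * var_y)


# Placeholder values only: one ADR rate per side-effect from a reference
# database, and the matching rate estimated from social media mentions.
reference_rates = [0.10, 0.25, 0.05, 0.40, 0.15]
social_media_rates = [0.08, 0.30, 0.07, 0.22, 0.20]
r = pearson_r(reference_rates, social_media_rates)
```

    On Python 3.10 or later, `statistics.correlation` from the standard library computes the same quantity.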

  • Article type: Journal Article
    Fast real-time processing of external information by the brain is vital to survival in a highly dynamic environment. A ubiquitous information medium used by humans is spoken language, but the neural dynamics of its comprehension are still poorly understood. Here, we scrutinized the earliest electrophysiological activity elicited in the human brain by spoken words and matched meaningless word-like stimuli using a lexical auditory oddball paradigm, an established technique for investigating cortical activation patterns underlying early automatic stages of language processing. We show that the earliest cortical reflection of word comprehension takes place during the electrophysiological P1 evoked response, at about 30 ms following the word disambiguation point, and takes the form of an enhanced brain activation for familiar meaningful words, even when they are presented outside the focus of attention. This previously unknown ultra-early lexicality effect is underpinned by left temporo-frontal cortical circuits and likely reflects a first-pass automatic lexical access that precedes later stages of lexical and semantic processing described in previous literature. The results suggest that the brain operates with maximum speed and efficiency to extract meaningful (including linguistic) information from the sensory input, which is a neurobiological capacity essential for timely and appropriate reactions to external events.