Object recognition

  • Article type: Journal Article
    Unmanned aerial vehicles (UAVs) are widely used in computer vision applications, especially intelligent traffic monitoring, because they are agile and simplify operations while boosting efficiency. However, automating these procedures remains a significant challenge because foreground (vehicle) information is difficult to extract from complex traffic scenes.
    This paper presents a method for autonomous vehicle surveillance that uses FCM to segment aerial images. YOLOv8, which is known for its ability to detect tiny objects, is then used to detect vehicles. Additionally, a system based on ORB features supports vehicle recognition, assignment, and recovery across image frames. Vehicle tracking is performed with DeepSORT, which combines Kalman filtering with deep learning to achieve precise results.
    For vehicle detection, the proposed model achieves precision of 0.86 and 0.84 on the VEDAI and SRTID datasets, respectively.
    For vehicle tracking, the model achieves accuracies of 0.89 and 0.85 on the VEDAI and SRTID datasets, respectively.
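    As a rough illustration of the segmentation step, the sketch below assumes that FCM refers to fuzzy c-means clustering and applies it to a flat list of grayscale intensities. The function names (`fuzzy_c_means`, `hard_labels`), the fuzziness value m = 2, the evenly spaced initialization, and the toy pixel values are illustrative assumptions, not details taken from the paper.

```python
def fuzzy_c_means(values, n_clusters=2, m=2.0, max_iter=100, tol=1e-6):
    """Cluster 1-D values (e.g. grayscale intensities) with fuzzy c-means.

    Returns (centers, memberships). memberships[k][i] is the degree to which
    values[k] belongs to cluster i, computed against the centers of the final
    iteration before their last update; each row sums to 1.
    """
    if n_clusters < 2 or m <= 1.0:
        raise ValueError("need n_clusters >= 2 and fuzziness m > 1")
    lo, hi = min(values), max(values)
    # Deterministic initialization (an assumption): centers evenly spaced
    # over the data range.
    centers = [lo + (hi - lo) * i / (n_clusters - 1) for i in range(n_clusters)]
    exponent = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(max_iter):
        # Membership update: u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1)).
        memberships = []
        for x in values:
            dists = [abs(x - c) for c in centers]
            if min(dists) == 0.0:
                # The point sits exactly on one or more centers: share full
                # membership among those centers only.
                zeros = [d == 0.0 for d in dists]
                row = [1.0 / sum(zeros) if z else 0.0 for z in zeros]
            else:
                row = [1.0 / sum((d_i / d_j) ** exponent for d_j in dists)
                       for d_i in dists]
            memberships.append(row)
        # Center update: mean of the values weighted by membership^m.
        new_centers = []
        for i in range(n_clusters):
            weights = [row[i] ** m for row in memberships]
            total = sum(weights)
            if total == 0.0:
                new_centers.append(centers[i])  # empty cluster: keep old center
            else:
                new_centers.append(
                    sum(w * x for w, x in zip(weights, values)) / total)
        shift = max(abs(a - b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, memberships


def hard_labels(memberships):
    """Assign each value to the cluster with the highest membership."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


if __name__ == "__main__":
    # Toy intensities: three dark (background) and three bright (foreground).
    pixels = [10, 12, 11, 200, 205, 198]
    centers, memberships = fuzzy_c_means(pixels, n_clusters=2)
    print(centers, hard_labels(memberships))
```

    In a real aerial image the input would be the flattened pixel array, and the hard labels would be reshaped back to the image dimensions to obtain a foreground mask.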

  • Article type: Journal Article
    To achieve Level 4 and above autonomous driving, a robust and stable autonomous driving system is essential to adapt to various environmental changes. This paper aims to make vehicle pose estimation, a crucial element of autonomous driving systems, more general and robust. The prevalent method for vehicle pose estimation in autonomous driving systems relies on Real-Time Kinematic (RTK) sensor data, which ensures accurate location acquisition. However, due to the characteristics of RTK sensors, precise positioning is challenging or impossible in indoor spaces or areas with signal interference, leading to inaccurate pose estimation and hindering autonomous driving in such scenarios. This paper proposes a method to overcome these challenges by leveraging objects registered in a high-precision map. The proposed approach involves creating a semantic high-definition (HD) map with added objects, forming object-centric features, recognizing locations using these features, and accurately estimating the vehicle's pose from the recognized location. The method improves the precision of vehicle pose estimation in environments where acquiring RTK sensor data is challenging, enabling more robust and stable autonomous driving. The paper demonstrates the method's effectiveness through simulation and real-world experiments, showing that it yields more precise pose estimates.

  • Article type: Journal Article
    Infusion of 17β-estradiol (E2) into the dorsal hippocampus (DH) of ovariectomized (OVX) mice enhances memory consolidation, an effect that depends on rapid phosphorylation of extracellular signal-regulated kinase (ERK) and Akt. Astrocytic glutamate transporter 1 (GLT-1) modulates neurotransmission via glutamate uptake from the synaptic cleft. However, little is known about the contribution of DH astrocytes, and astrocytic glutamate transport, to the memory-enhancing effects of E2. This study was designed to test whether DH astrocytes contribute to estrogenic modulation of memory consolidation by determining the extent to which DH GLT-1 is necessary for E2 to enhance memory in object recognition and object placement tasks and trigger rapid phosphorylation events in DH astrocytes. OVX female mice were bilaterally cannulated into the DH or the DH and dorsal third ventricle (ICV). Post-training DH infusion of the GLT-1 inhibitor dihydrokainic acid (DHK) dose-dependently impaired memory consolidation in both tasks. Moreover, the memory-enhancing effects of ICV-infused E2 in each task were blocked by DH DHK infusion. E2 increased p42 ERK and Akt phosphorylation in DH astrocytes, and these effects were blocked by DHK. Results suggest the necessity of DH GLT-1 activity for object and spatial memory consolidation, and for E2 to enhance consolidation of these memories and to rapidly activate cell signaling in DH astrocytes. Findings indicate that astrocytic function in the DH of OVX females is necessary for memory formation and is regulated by E2, and suggest an essential role for DH astrocytic GLT-1 activity in the memory-enhancing effects of E2.

  • Article type: Journal Article
    Looking leads gaze to objects; seeing recognizes them. Visual crowding makes seeing difficult or impossible before looking brings objects to the fovea. Looking before seeing can be guided by saliency mechanisms in the primary visual cortex (V1). We have proposed that looking and seeing are mainly supported by peripheral and central vision, respectively. This proposal is tested in an observer with central vision loss due to macular degeneration, using a visual search task that can be accomplished solely through looking, but is actually impeded through seeing. The search target is a uniquely oriented, salient bar among identically shaped bars. Each bar, including the target, is part of an "X" shape. The target's "X" is identical to, although rotated from, the other "X" shapes in the image, which normally causes confusion. However, this observer exhibits no such confusion, presumably because she cannot see the shape of the "X", but can look towards the target. This result demonstrates a critical dichotomy between central and peripheral vision.

  • Article type: Journal Article
    Acetylation of histone proteins by histone acetyltransferases (HATs), and the resultant change in gene expression, is a well-established mechanism necessary for long-term memory (LTM) consolidation, which is not required for short-term memory (STM). However, we previously demonstrated that the HAT p300/CBP-associated factor (PCAF) also influences hippocampus (HPC)-dependent STM in male rats. In addition to their epigenetic activity, HATs acetylate nonhistone proteins involved in nongenomic cellular processes, such as estrogen receptors (ERs). Given that ERs have rapid, nongenomic effects on HPC-dependent STM, we investigated the potential interaction between ERs and PCAF for STM mediated by the dorsal hippocampus (dHPC). Using a series of pharmacological agents administered directly into the dHPC, we reveal a functional interaction between PCAF and ERα in the facilitation of short-term object-in-place memory in male but not female rats. This interaction was specific to ERα, while ERβ agonism did not enhance STM. It was further specific to dHPC STM, as the effect was not present in the dHPC for LTM or in the perirhinal cortex. Further, while STM required local (i.e., dHPC) estrogen synthesis, the facilitatory interaction effect appeared independent of estrogens. Finally, western blot analyses demonstrated that PCAF activation in the dHPC rapidly (5 min) activated downstream estrogen-related cell signaling kinases (c-Jun N-terminal kinase and extracellular signal-related kinase). Collectively, these findings indicate that PCAF, which is typically implicated in LTM through epigenetic processes, also influences STM in the dHPC, possibly via nongenomic ER activity. Critically, this novel PCAF-ER interaction might exist as a male-specific mechanism supporting STM.
    Significance Statement: Owing to their ability to regulate gene expression, epigenetic mechanisms are necessary for long-term, but not short-term, memory. Recently, the histone acetyltransferase PCAF was shown to be necessary for short-term object memory in the hippocampus, but its mechanism of action remained unclear. Here, we provide evidence that a novel functional interaction between PCAF and estrogen receptors (ERs) in the hippocampus of gonadally intact male, but not female, rats enhances short-term object memory. Further, enhancing PCAF led to rapid activation of downstream ER-related signaling cascades, suggesting that PCAF may activate ERs in a manner similar to estrogens. Indeed, experiments indicated that the PCAF-ER interaction may be estrogen-independent. These results demonstrate a novel mechanism mediating short-term memory enhancement that appears to be male-specific.

  • Article type: Journal Article
    No abstract available.

  • Article type: Journal Article
    Alzheimer's disease (AD) is a neurodegenerative disease characterized by chronic and irreversible destruction of neurons. This study aimed to evaluate the effects of different extracts (aqueous, hydroalcoholic, hexane, and ethyl acetate) and the manna of Echinops cephalotes (EC) on scopolamine-induced cognitive impairment in mice. EC has been shown to have anti-cholinesterase-butyrylcholinesterase activities.
    In this study, aqueous and hydroalcoholic extracts and hexane and ethyl acetate fractions of EC (25, 50, 100 mg/kg, i.p.), and the manna (25, 50, 100 mg/kg, gavage), were administered for 14 days alongside scopolamine (0.7 mg/kg, i.p.). Rivastigmine (reference drug) was administered i.p. for 2 weeks. Memory function in the mice was tested using two behavioral models, the object recognition test (ORT) and the passive avoidance test (PAT).
    Administration of scopolamine significantly impaired memory function in both behavioral models. In the PAT model, all extracts at 50 and 100 mg/kg significantly reversed the memory impairment caused by scopolamine. At the lower dose of 25 mg/kg, however, none of the extracts significantly changed the step-through latency. In the ORT model, administration of all extracts at 50 and 100 mg/kg significantly increased the recognition index. Only the manna and the aqueous extract at 25 mg/kg were able to reverse scopolamine-induced memory impairment.
    These results suggest that all forms of EC extract improve scopolamine-induced memory impairment to a degree comparable to rivastigmine. Whether the effects are sustained over a longer period remains to be tested in future work.

  • Article type: Journal Article
    Previous studies showed that elongation and symmetry (two ubiquitous aspects of natural stimuli) are important attributes in object perception and recognition, which in turn suggests that these geometrical factors may contribute to the selection of perceptual reference frames. However, whether and how these attributes guide the selection of reference frames is still poorly understood. The goal of this study was to examine systematically the roles of elongation and symmetry, as well as their combination, in the selection of reference axes and how these axes are developed for unfamiliar objects. We designed our experiments to eliminate two potential confounding factors: (i) extraneous environmental cues, such as the edges of the screen (by using VR), and (ii) pre-learned cues for familiar objects and shapes (by using reinforcement learning of novel shapes). As stimuli, we used algorithmically generated textures with different orientations and specified levels of symmetry and elongation. In each trial, we presented only one stimulus and asked observers to report whether the stimulus was in its original form or a flipped (mirror-image) one. Feedback was provided at the end of each trial. Based on previous studies on mental rotation, we hypothesized that the selection of a reference frame defined by symmetry and/or elongation would be revealed by a linear relationship between reaction time and the angular deviation from either the most symmetrical or the most elongated orientation. Our results are consistent with this hypothesis. We found that subjects performed mental rotation to transform images to their reference axes and used the most symmetrical or most elongated orientation as the reference axis when only one factor was present; when both factors were present, they used a "winner-take-all" strategy, with elongation being more dominant than symmetry. We discuss theoretical implications of these findings, in particular in the context of "canonical sensorimotor theory."

  • Article type: Journal Article
    People differ in how well they search. What are the factors that might contribute to this variability? We tested the contribution of two cognitive abilities: visual working memory (VWM) capacity and object recognition ability. Participants completed three tasks: a difficult, inefficient visual search task, where they searched for a target letter T among skewed L distractors; a VWM task, where they memorized a color array and then identified whether a probed color belonged to the previous array; and the Novel Object Memory Test (NOMT), where they learned complex novel objects and then identified them among objects that closely resembled them. Exploratory and confirmatory factor analyses revealed two latent factors that explain the shared variance among these three tasks: a factor indicative of the level of caution participants exercised during the challenging visual search task, and a factor representing their visual cognitive abilities. People who score high on the search cautiousness factor tend to perform a more accurate but slower search. People who score high on the visual cognitive ability factor tend to have a higher VWM capacity, better object recognition ability, and a faster search speed. The results reflect two points: (1) Visual search tasks share components with visual working memory and object recognition tasks. (2) Search performance is influenced not only by the search display's properties but also by individual predispositions such as caution and general visual abilities. This study introduces new factors for consideration when interpreting variations in visual search behaviors.

  • Article type: Journal Article
    Object classification has been proposed as a principal objective of the primate ventral visual stream and has been used as an optimization target for deep neural network models (DNNs) of the visual system. However, visual brain areas represent many different types of information, and optimizing for classification of object identity alone does not constrain how other information may be encoded in visual representations. Information about different scene parameters may be discarded altogether ('invariance'), represented in non-interfering subspaces of population activity ('factorization') or encoded in an entangled fashion. In this work, we provide evidence that factorization is a normative principle of biological visual representations. In the monkey ventral visual hierarchy, we found that factorization of object pose and background information from object identity increased in higher-level regions and strongly contributed to improving object identity decoding performance. We then conducted a large-scale analysis of factorization of individual scene parameters - lighting, background, camera viewpoint, and object pose - in a diverse library of DNN models of the visual system. Models which best matched neural, fMRI, and behavioral data from both monkeys and humans across 12 datasets tended to be those which factorized scene parameters most strongly. Notably, invariance to these parameters was not as consistently associated with matches to neural and behavioral data, suggesting that maintaining non-class information in factorized activity subspaces is often preferred to dropping it altogether. Thus, we propose that factorization of visual scene information is a widely used strategy in brains and DNN models thereof.
    When looking at a picture, we can quickly identify a recognizable object, such as an apple, applying a single word label to it. Although extensive neuroscience research has focused on how human and monkey brains achieve this recognition, our understanding of how the brain and brain-like computer models interpret other complex aspects of a visual scene – such as object position and environmental context – remains incomplete. In particular, it was not clear to what extent object recognition comes at the expense of other important scene details. For example, various aspects of the scene might be processed simultaneously. On the other hand, general object recognition may interfere with processing of such details. To investigate this, Lindsey and Issa analyzed 12 monkey and human brain datasets, as well as numerous computer models, to explore how different aspects of a scene are encoded in neurons and how these aspects are represented by computational models. The analysis revealed that preventing effective separation and retention of information about object pose and environmental context worsened object identification in monkey cortex neurons. In addition, the computer models that were the most brain-like could independently preserve the other scene details without interfering with object identification. The findings suggest that human and monkey high level ventral visual processing systems are capable of representing the environment in a more complex way than previously appreciated. In the future, studying more brain activity data could help to identify how rich the encoded information is and how it might support other functions like spatial navigation. This knowledge could help to build computational models that process the information in the same way, potentially improving their understanding of real-world scenes.
