Shape perception

  • Article type: Journal Article
    Although the integration of information across multiple senses can enhance object representations in memory, how multisensory information affects the formation of categories is uncertain. In particular, it is unclear to what extent categories formed from multisensory information benefit object recognition over unisensory inputs. Two experiments investigated the categorisation of novel auditory and visual objects, with categories defined by spatial similarity, and tested generalisation to novel exemplars. Participants learned to categorise exemplars based on visual-only (geometric shape), auditory-only (spatially defined soundscape) or audio-visual spatial cues. Categorisation to learned as well as novel exemplars was then tested under the same sensory learning conditions. For all learning modalities, categorisation generalised to novel exemplars. However, there was no evidence of enhanced categorisation performance for learned multisensory exemplars. At best, bimodal performance approximated that of the most accurate unimodal condition, although this was observed only for a subset of exemplars within a category. These findings provide insight into the perceptual processes involved in the formation of categories and have relevance for understanding the sensory nature of object representations underpinning these categories.






  • Article type: Case Reports
    Human visual experience of objects comprises a combination of visual features, such as color, position, and shape. Spatial attention is thought to play a role in creating a coherent perceptual experience, integrating visual information coming from a given location, but the mechanisms underlying this process are not fully understood. Deficits of spatial attention in which this integration process does not occur normally, such as neglect, can provide insights regarding the mechanisms of spatial attention in visual object recognition. In this study, we describe a series of experiments conducted with an individual with neglect, DH. DH presents a characteristic lack of awareness of the left side of individual objects, evidenced by poor object and face recognition, and impaired word reading. However, he exhibits intact recognition of color within the boundaries of the same objects he fails to recognize. Furthermore, he can also report the orientation and location of a colored region on the neglected left side despite lack of awareness of the shape of the region. Overall, DH shows a selective lack of awareness of shape despite intact processing of basic visual features in the same spatial location. DH's performance raises intriguing questions and challenges about the role of spatial attention in the formation of coherent object percepts and visual awareness.






  • Article type: Journal Article
    Visual crowding refers to the phenomenon where a target object that is easily identifiable in isolation becomes difficult to recognize when surrounded by other stimuli (distractors). Many psychophysical studies have investigated this phenomenon and proposed alternative models for the underlying mechanisms. One prominent hypothesis, albeit with mixed psychophysical support, posits that crowding arises from the loss of information due to pooled encoding of features from target and distractor stimuli in the early stages of cortical visual processing. However, neurophysiological studies have not rigorously tested this hypothesis. We studied the responses of single neurons in macaque (one male, one female) area V4, an intermediate stage of the object-processing pathway, to parametrically designed crowded displays and texture statistics-matched metameric counterparts. Our investigations reveal striking parallels between how crowding parameters (number, distance, and position of distractors) influence human psychophysical performance and V4 shape selectivity. Importantly, we also found that enhancing the salience of a target stimulus could alleviate crowding effects in highly cluttered scenes, and this could be temporally protracted, reflecting a dynamical process. Thus, a pooled encoding of nearby stimuli cannot explain the observed responses, and we propose an alternative model where V4 neurons preferentially encode salient stimuli in crowded displays. Overall, we conclude that the magnitude of crowding effects is determined not just by the number of distractors and target-distractor separation but also by the relative salience of targets versus distractors based on their feature attributes: the similarity of distractors and the contrast between target and distractor stimuli.






  • Article type: Journal Article
    Categorization is an essential cognitive and perceptual process, which happens spontaneously. However, earlier research often neglected the spontaneous nature of this process by mainly adopting explicit tasks in behavioral or neuroimaging paradigms. Here, we use frequency-tagging (FT) during electroencephalography (EEG) in 22 healthy human participants (both male and female) as a direct approach to pinpoint spontaneous visual categorical processing. Starting from schematic natural visual stimuli, we created morph sequences comprising 11 equal steps. Mirroring a behavioral categorical perception discrimination paradigm, we administered a FT-EEG oddball paradigm, assessing neural sensitivity for equally sized differences within and between stimulus categories. Likewise, mirroring a behavioral category classification paradigm, we administered a sweep FT-EEG oddball paradigm, sweeping from one end of the morph sequence to the other, thereby allowing us to objectively pinpoint the neural category boundary. We found that FT-EEG can implicitly measure categorical processing and discrimination. More specifically, we could derive an objective neural index of the required level to differentiate between the two categories, and this neural index showed the typical marker of categorical perception (i.e., stronger discrimination across as compared with within categories). The neural findings of the implicit paradigms were also validated using an explicit behavioral task. These results provide evidence that FT-EEG can be used as an objective tool to measure discrimination and categorization and that the human brain inherently and spontaneously (without any conscious or decisional processes) uses higher-level meaningful categorization information to interpret ambiguous (morph) shapes.
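As a rough illustration of how a frequency-tagged response can be quantified, the sketch below simulates a single noisy channel and computes a signal-to-noise ratio at the oddball frequency from the FFT amplitude spectrum. The sampling rate, the 6 Hz base and 1.2 Hz oddball frequencies, the neighbouring-bin baseline, and the function names are all assumptions chosen for illustration; the abstract does not report these values or the analysis pipeline actually used.

```python
import numpy as np

def amplitude_spectrum(signal, fs):
    """Return (freqs, amplitudes) of the one-sided FFT amplitude spectrum."""
    n = len(signal)
    amps = np.abs(np.fft.rfft(signal)) * 2.0 / n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, amps

def snr_at(freqs, amps, target_hz, n_neighbours=10, skip=1):
    """Amplitude at the target bin divided by the mean of nearby bins.

    `skip` bins on each side of the target are excluded from the baseline.
    """
    idx = int(np.argmin(np.abs(freqs - target_hz)))
    lo = np.arange(idx - skip - n_neighbours, idx - skip)
    hi = np.arange(idx + skip + 1, idx + skip + 1 + n_neighbours)
    neighbours = np.concatenate([lo, hi])
    neighbours = neighbours[(neighbours >= 0) & (neighbours < len(amps))]
    return amps[idx] / amps[neighbours].mean()

# Hypothetical recording parameters (not taken from the study).
fs, duration = 250.0, 40.0
base_hz, oddball_hz = 6.0, 1.2

rng = np.random.default_rng(0)
t = np.arange(int(fs * duration)) / fs
signal = (
    1.0 * np.sin(2 * np.pi * base_hz * t)       # response to every stimulus
    + 0.5 * np.sin(2 * np.pi * oddball_hz * t)  # response to category change
    + rng.normal(0.0, 1.0, t.size)              # broadband noise
)

freqs, amps = amplitude_spectrum(signal, fs)
snr_oddball = snr_at(freqs, amps, oddball_hz)
snr_control = snr_at(freqs, amps, 1.7)  # a frequency with no tagged response
```

In this toy setup a clear peak at the oddball frequency, absent at the control frequency, plays the role of the implicit discrimination index; in a sweep design the same measure would be tracked as the stimulus moves along the morph sequence.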






  • Article type: Journal Article
    To investigate whether local elements are grouped into global shapes in the absence of awareness, we introduced two different masked priming designs (e.g., the classic dissociation paradigm and a trial-wise probe and prime discrimination task) and collected both objective (i.e., performance based) and subjective (using the perceptual awareness scale [PAS]) awareness measures. Prime visibility was manipulated using three different prime-mask stimulus onset asynchronies (SOAs) and an unmasked condition. Our results showed that assessing prime visibility trial-wise heavily interfered with masked priming preventing any prime facilitation effect. The implementation of Bayesian regression models, which predict priming effects for participants whose awareness levels are at chance level, provided strong evidence in favor of the hypothesis that local elements group into global shape in the absence of awareness for SOAs longer than 50 ms, suggesting that prime-mask SOA is a crucial factor in the processing of the global shape without awareness.






  • Article type: Journal Article
    Objective. Although convolutional neural networks (CNNs) and Transformers have performed well in many medical image segmentation tasks, they rely on large amounts of labeled data for training. The annotation of medical image data is expensive and time-consuming, so it is common to use semi-supervised learning methods that combine a small amount of labeled data with a large amount of unlabeled data to improve the performance of medical image segmentation. Approach. This work aims to enhance the segmentation performance of medical images using triple-teacher cross-learning semi-supervised medical image segmentation with shape perception and multi-scale consistency regularization. To effectively leverage the information from unlabeled data, we design a multi-scale semi-supervised method for three-teacher cross-learning based on shape perception, called Semi-TMS. The three teacher models engage in cross-learning with each other, where Teacher A and Teacher C utilize a CNN architecture, while Teacher B employs a Transformer model. The cross-learning module consisting of Teacher A and Teacher C captures local and global information, generates pseudo-labels, and performs cross-learning using prediction results. Multi-scale consistency regularization is applied separately to the CNN and Transformer to improve accuracy. Furthermore, the low-uncertainty output probabilities from Teacher A or Teacher C are utilized as input to Teacher B, enhancing the utilization of prior knowledge and overall segmentation robustness. Main results. Experimental evaluations on two public datasets demonstrate that the proposed method outperforms some existing semi-supervised segmentation models, implicitly capturing shape information and effectively improving the utilization and accuracy of unlabeled data through multi-scale consistency. Significance. With the widespread utilization of medical imaging in clinical diagnosis, our method is expected to be a potential auxiliary tool, assisting clinicians and medical researchers in their diagnoses.
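The general idea of cross-learning with pseudo-labels can be sketched in a few lines: each model is trained on the hard labels predicted by the other, and uncertain predictions are filtered out. The code below is a minimal NumPy illustration of that generic idea, not the Semi-TMS implementation. The function names, the entropy threshold, and the use of entropy as a mask on pseudo-labels are assumptions made here; in the abstract, low-uncertainty probabilities are instead passed as input to Teacher B.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def entropy(probs, axis=-1, eps=1e-12):
    """Shannon entropy (in nats) of each probability vector."""
    return -(probs * np.log(probs + eps)).sum(axis=axis)

def cross_pseudo_label_loss(logits_a, logits_b, max_entropy=0.5, eps=1e-12):
    """Symmetric cross-entropy between two models on unlabeled pixels.

    Each model is supervised by the other's argmax pseudo-labels, using
    only pixels where the labelling model's entropy is below `max_entropy`
    (a hypothetical threshold chosen for this sketch).
    """
    p_a, p_b = softmax(logits_a), softmax(logits_b)

    def one_way(p_student, p_teacher):
        labels = p_teacher.argmax(axis=-1)
        mask = entropy(p_teacher) < max_entropy
        if not mask.any():
            return 0.0  # no confident pseudo-labels, so no training signal
        picked = np.take_along_axis(p_student, labels[..., None], axis=-1)[..., 0]
        return float(-np.log(picked[mask] + eps).mean())

    return one_way(p_a, p_b) + one_way(p_b, p_a)

# Toy logits with shape (num_pixels, num_classes).
agree = np.array([[5.0, 0.0], [0.0, 5.0]])
flipped = np.array([[0.0, 5.0], [5.0, 0.0]])
unsure = np.zeros((2, 2))

loss_agree = cross_pseudo_label_loss(agree, agree)
loss_disagree = cross_pseudo_label_loss(agree, flipped)
loss_unsure = cross_pseudo_label_loss(unsure, unsure)
```

The loss is small when two confident models agree, large when they disagree, and zero when neither model is confident enough to provide pseudo-labels.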






  • Article type: Journal Article
    Many objects and materials in our environment are subject to transformations that alter their shape. For example, branches bend in the wind, ice melts, and paper crumples. Still, we recognize objects and materials across these changes, suggesting we can distinguish an object's original features from those caused by the transformations ("shape scission"). Yet, if we truly understand transformations, we should not only be able to identify their signatures but also actively apply the transformations to new objects (i.e., through imagination or mental simulation). Here, we investigated this ability using a drawing task. On a tablet computer, participants viewed a sample contour and its transformed version, and were asked to apply the same transformation to a test contour by drawing what the transformed test shape should look like. Thus, they had to (i) infer the transformation from the shape differences, (ii) envisage its application to the test shape, and (iii) draw the result. Our findings show that drawings were more similar to the ground truth transformed test shape than to the original test shape, demonstrating the inference and reproduction of transformations from observation. However, this was only observed for relatively simple shapes. The ability was also modulated by transformation type and magnitude but not by the similarity between sample and test shapes. Together, our findings suggest that we can distinguish between representations of original object shapes and their transformations, and can use visual imagery to mentally apply nonrigid transformations to observed objects, showing how we not only perceive but also 'understand' shape.






    We present a comprehensive review of the rare syndrome visual form agnosia (VFA). We begin by documenting its history, including the origins of the term, and the first case study labelled as VFA. The defining characteristics of the syndrome, as others have previously defined it, are then described. The impairments, preserved aspects of visual perception, and areas of brain damage in 21 patients who meet these defining characteristics are described in detail, including which tests were used to verify the presence or absence of key symptoms. From this, we note important similarities along with notable areas of divergence between patients. Damage to the occipital lobe (20/21), an inability to recognise line drawings (19/21), preserved colour vision (14/21), and visual field defects (16/21) were areas of consistency across most cases. We found it useful to distinguish between shape and form as distinct constructs when examining perceptual abilities in VFA patients. Our observations suggest that these patients often exhibit difficulties in processing simplified versions of form. Deficits in processing orientation and size were uncommon. Motion perception and visual imagery were not widely tested for despite being typically cited as defining features of the syndrome - although in the sample described, motion perception was never found to be a deficit. Moreover, problems with vision (e.g., poor visual acuity and the presence of hemianopias/scotomas in the visual fields) are more common than we would have thought and may also contribute to perceptual impairments in patients with VFA. We conclude that VFA is a perceptual disorder where the visual system has a reduced ability to synthesise lines together for the purposes of making sense of what images represent holistically.






  • Article type: Journal Article
    Deep convolutional neural networks (DCNNs) have attracted considerable interest as useful devices and as possible windows into understanding perception and cognition in biological systems. In earlier work, we showed that DCNNs differ dramatically from human perceivers in that they have no sensitivity to global object shape. Here, we investigated whether those findings are symptomatic of broader limitations of DCNNs regarding the use of relations. We tested learning and generalization of DCNNs (AlexNet and ResNet-50) for several relations involving objects. One involved classifying two shapes in an otherwise empty field as same or different. Another involved enclosure. Every display contained a closed figure among contour noise fragments and one dot; correct responding depended on whether the dot was inside or outside the figure. The third relation we tested involved a classification that depended on which of two polygons had more sides. One polygon always contained a dot, and correct classification of each display depended on whether the polygon with the dot had a greater number of sides. We used DCNNs that had been trained on the ImageNet database, and we used both restricted and unrestricted transfer learning (connection weights at all layers could change with training). For the same-different experiment, there was little restricted transfer learning (82.2%). Generalization tests showed near chance performance for new shapes. Results for enclosure were at chance for restricted transfer learning and somewhat better for unrestricted (74%). Generalization with two new kinds of shapes showed reduced but above-chance performance (≈66%). Follow-up studies indicated that the networks did not access the enclosure relation in their responses. For the relation of more or fewer sides of polygons, DCNNs showed successful learning with polygons having 3-5 sides under unrestricted transfer learning, but showed chance performance in generalization tests with polygons having 6-10 sides. Experiments with human observers showed learning from relatively few examples of all of the relations tested and complete generalization of relational learning to new stimuli. These results using several different relations suggest that DCNNs have crucial limitations that derive from their lack of computations involving abstraction and relational processing of the sort that are fundamental in human perception.
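For contrast with the learned classifiers, the enclosure relation itself has a compact explicit definition when the closed figure is given as a polygon. The sketch below uses the standard ray-casting (even-odd) test. It only illustrates the relation the networks were asked to learn; it is not how the study generated or labelled its displays, and the function name and vertex-list representation are assumptions.

```python
def point_in_polygon(point, vertices):
    """Even-odd ray-casting test for a point against a simple polygon.

    `vertices` is a list of (x, y) pairs in order around the boundary.
    A horizontal ray is cast from the point towards +x; an odd number of
    edge crossings means the point is enclosed. Points lying exactly on
    the boundary are not handled specially.
    """
    x, y = point
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Only edges that straddle the ray's height can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
l_shape = [(0, 0), (4, 0), (4, 1), (1, 1), (1, 4), (0, 4)]  # concave figure

dot_in_square = point_in_polygon((2, 2), square)
dot_outside_square = point_in_polygon((5, 2), square)
dot_in_notch = point_in_polygon((3, 3), l_shape)    # inside the bounding box only
dot_in_l_arm = point_in_polygon((0.5, 3), l_shape)
```

The concave example shows why the relation is abstract: a dot can lie within a figure's bounding box, close to its contour, and still be outside it, so local image features alone do not settle the answer.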






  • Article type: Journal Article
    It is well known that observers can use so-called summary statistics of visual ensembles to simplify perceptual processing. The assumption has been that instead of representing feature distributions in detail the visual system extracts the mean and variance of visual ensembles. But recent evidence from implicit testing using a method called feature distribution learning showed that far more detail of the distributions is retained than the summary statistic literature indicates. Observers also encode higher-order statistics such as the kurtosis of feature distributions of orientation and color. But this sort of learning has not been shown for more intricate aspects of visual information. Here we tested the learning of distractor ensembles for shape, using the feature distribution learning method. Using a linearized circular shape space, we found that learning of detailed distributions of shape does not occur for this shape space while observers were able to learn the mean and range of the distributions. Previous demonstrations of feature distribution learning involved simpler feature dimensions than the more complex shape space tested here, and our findings may therefore reveal important boundary conditions of feature distribution learning.
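The difference between summary statistics and the detailed shape of a distribution can be made concrete with a small numerical sketch. Below, two simulated feature distributions are matched in mean and variance but differ in kurtosis (uniform versus Gaussian). The sample size, the one-dimensional linear feature axis, and the function name are illustrative assumptions and do not reproduce the stimuli or analysis of the study.

```python
import numpy as np

def distribution_summary(samples):
    """Mean, range, variance and (non-excess) kurtosis of a 1-D sample."""
    x = np.asarray(samples, dtype=float)
    mean = x.mean()
    centred = x - mean
    variance = (centred ** 2).mean()
    kurtosis = (centred ** 4).mean() / variance ** 2
    return {
        "mean": mean,
        "range": x.max() - x.min(),
        "variance": variance,
        "kurtosis": kurtosis,
    }

rng = np.random.default_rng(0)
n = 100_000

# Uniform distractor distribution on [-1, 1]: variance is 1/3.
uniform = rng.uniform(-1.0, 1.0, n)
# Gaussian distribution with the same mean and variance.
gaussian = rng.normal(0.0, np.sqrt(1.0 / 3.0), n)

u = distribution_summary(uniform)
g = distribution_summary(gaussian)
```

An observer who encodes only the mean and variance could not tell these two ensembles apart, whereas the kurtosis (about 1.8 for the uniform and 3 for the Gaussian) separates them; note that the range also differs here, which is one of the quantities observers did learn for shape.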





