Conditional GAN

  • Article type: Journal Article
    BACKGROUND: Image-based crop growth modeling can substantially contribute to precision agriculture by revealing spatial crop development over time, which allows an early and location-specific estimation of relevant future plant traits, such as leaf area or biomass. A prerequisite for realistic and sharp crop image generation is the integration of multiple growth-influencing conditions in a model, such as an image of an initial growth stage, the associated growth time, and further information about the field treatment. While image-based models provide more flexibility for crop growth modeling than process-based models, there is still a significant research gap in the comprehensive integration of various growth-influencing conditions. Further exploration and investigation are needed to address this gap.
    METHODS: We present a two-stage framework consisting first of an image generation model and second of a growth estimation model, independently trained. The image generation model is a conditional Wasserstein generative adversarial network (CWGAN). In the generator of this model, conditional batch normalization (CBN) is used to integrate conditions of different types along with the input image. This allows the model to generate time-varying artificial images dependent on multiple influencing factors. These images are used by the second part of the framework for plant phenotyping by deriving plant-specific traits and comparing them with those of non-artificial (real) reference images. In addition, image quality is evaluated using multi-scale structural similarity (MS-SSIM), learned perceptual image patch similarity (LPIPS), and Fréchet inception distance (FID). During inference, the framework allows image generation for any combination of conditions used in training; we call this generation data-driven crop growth simulation.
    RESULTS: Experiments are performed on three datasets of different complexity. These datasets include the laboratory plant Arabidopsis thaliana (Arabidopsis) and crops grown under real field conditions, namely cauliflower (GrowliFlower) and crop mixtures consisting of faba bean and spring wheat (MixedCrop). In all cases, the framework allows realistic, sharp image generations with a slight loss of quality from short-term to long-term predictions. For MixedCrop grown under varying treatments (different cultivars, sowing densities), the results show that adding this treatment information increases the generation quality and phenotyping accuracy measured by the estimated biomass. Simulations of varying growth-influencing conditions performed with the trained framework provide valuable insights into how such factors relate to crop appearances, which is particularly useful in complex, less explored crop mixture systems. Further results show that adding process-based simulated biomass as a condition increases the accuracy of the derived phenotypic traits from the predicted images. This demonstrates the potential of our framework to serve as an interface between a data-driven and a process-based crop growth model.
    CONCLUSIONS: The realistic generation and simulation of future plant appearances is feasible with a multi-conditional CWGAN. The presented framework complements process-based models and overcomes their limitations, such as the reliance on assumptions and the low specificity in exact field localization, by providing realistic visualizations of the spatial crop development that directly lead to high explainability of the model predictions.
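
The METHODS paragraph above relies on conditional batch normalization (CBN) to feed several conditions into the generator. The following is a minimal NumPy sketch of the general idea, not the authors' implementation: feature maps are normalised per channel, and the scale and shift are then predicted from a condition vector. The function name `conditional_batch_norm`, the linear projections `w_gamma` and `w_beta`, and the two example conditions are all assumptions made for illustration.

```python
import numpy as np

def conditional_batch_norm(x, cond, w_gamma, w_beta, eps=1e-5):
    """Normalise feature maps, then scale and shift them per sample.

    x:       (N, C, H, W) feature maps
    cond:    (N, D) condition vectors (e.g. growth time, treatment code)
    w_gamma: (D, C) hypothetical projection giving the per-channel scale offset
    w_beta:  (D, C) hypothetical projection giving the per-channel shift
    """
    # Standard batch-norm statistics: one mean and variance per channel.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)

    # The condition decides scale and shift; zero weights give plain batch norm.
    gamma = 1.0 + cond @ w_gamma  # shape (N, C)
    beta = cond @ w_beta          # shape (N, C)
    return gamma[:, :, None, None] * x_hat + beta[:, :, None, None]

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3, 8, 8))
cond = rng.normal(size=(4, 2))  # two made-up conditions per image
plain = conditional_batch_norm(x, cond, np.zeros((2, 3)), np.zeros((2, 3)))
shifted = conditional_batch_norm(x, cond, np.zeros((2, 3)), np.ones((2, 3)))
```

With zero projection weights the layer reduces to ordinary batch normalisation; with non-zero weights each image in the batch receives its own scale and shift, which is how a condition such as growth time could steer the generated image.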






  • Article type: Journal Article
    BACKGROUND: Cardiac positron emission tomography (PET) can visualize and quantify the molecular and physiological pathways of cardiac function. However, cardiac and respiratory motion can introduce blurring that reduces PET image quality and quantitative accuracy. Dual cardiac- and respiratory-gated PET reconstruction can mitigate motion artifacts but increases noise as only a subset of data are used for each time frame of the cardiac cycle.
    OBJECTIVE: The objective of this study is to create a zero-shot image denoising framework using a conditional generative adversarial network (cGAN) for improving image quality and quantitative accuracy in non-gated and dual-gated cardiac PET images.
    METHODS: Our study included retrospective list-mode data from 40 patients who underwent an 18F-fluorodeoxyglucose (18F-FDG) cardiac PET study. We initially trained and evaluated a 3D cGAN, known as Pix2Pix, on simulated non-gated low-count PET data paired with corresponding full-count target data, and then deployed the model on an unseen test set acquired on the same PET/CT system including both non-gated and dual-gated PET data.
    RESULTS: Quantitative analysis demonstrated that the 3D Pix2Pix network architecture achieved significantly (p value<0.05) enhanced image quality and accuracy in both non-gated and gated cardiac PET images. At 5%, 10%, and 15% preserved count statistics, the model increased peak signal-to-noise ratio (PSNR) by 33.7%, 21.2%, and 15.5%, structural similarity index (SSIM) by 7.1%, 3.3%, and 2.2%, and reduced mean absolute error (MAE) by 61.4%, 54.3%, and 49.7%, respectively. When tested on dual-gated PET data, the model consistently reduced noise, irrespective of cardiac/respiratory motion phases, while maintaining image resolution and accuracy. Significant improvements were observed across all gates, including a 34.7% increase in PSNR, a 7.8% improvement in SSIM, and a 60.3% reduction in MAE.
    CONCLUSIONS: The findings of this study indicate that dual-gated cardiac PET images, which often have post-reconstruction artifacts potentially affecting diagnostic performance, can be effectively improved using a generative pre-trained denoising network.
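
The RESULTS above are reported as percentage changes in PSNR, SSIM, and MAE. As a rough illustration of how two of these metrics and their relative change can be computed, here is a small NumPy sketch. It assumes that PSNR uses the maximum of the reference volume as the peak value and that the reported percentages are relative changes; the abstract states neither, and the volumes and the toy "denoiser" below are synthetic stand-ins, not PET data.

```python
import numpy as np

def mae(pred, target):
    """Mean absolute error between a prediction and its reference."""
    return float(np.mean(np.abs(pred - target)))

def psnr(pred, target):
    """Peak signal-to-noise ratio in dB, with the reference maximum as peak (assumed convention)."""
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")
    peak = float(target.max())
    return float(10.0 * np.log10(peak ** 2 / mse))

def relative_change(after, before):
    """Change from `before` to `after`, in percent of `before`."""
    return 100.0 * (after - before) / before

rng = np.random.default_rng(0)
full_count = rng.uniform(0.5, 1.0, size=(16, 16, 16))  # synthetic reference volume
noise = rng.normal(scale=0.1, size=full_count.shape)
low_count = full_count + noise          # stand-in for a low-count image
denoised = full_count + 0.5 * noise     # toy "denoiser" that halves the noise

psnr_gain = relative_change(psnr(denoised, full_count), psnr(low_count, full_count))
mae_drop = -relative_change(mae(denoised, full_count), mae(low_count, full_count))
```

Halving the noise amplitude halves the MAE and raises the PSNR by about 6 dB, so `mae_drop` comes out at roughly 50 percent while `psnr_gain` depends on the starting PSNR.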






  • Article type: Journal Article
    Over the past decade, deep-learning (DL) algorithms have become a promising tool to aid clinicians in identifying fetal head standard planes (FHSPs) during ultrasound (US) examination. However, the adoption of these algorithms in clinical settings is still hindered by the lack of large annotated datasets. To overcome this barrier, we introduce FetalBrainAwareNet, a framework designed to synthesize anatomically accurate images of FHSPs. FetalBrainAwareNet uses class activation maps as a prior in its conditional adversarial training process. This approach fosters the presence of the specific anatomical landmarks in the synthesized images. Additionally, we investigate specialized regularization terms within the adversarial training loss function to control the morphology of the fetal skull and foster the differentiation between the standard planes, ensuring that the synthetic images faithfully represent real US scans in both structure and overall appearance. The versatility of our FetalBrainAwareNet framework is highlighted by its ability to generate high-quality images of three predominant FHSPs using a single, integrated framework. Quantitative (Fréchet inception distance of 88.52) and qualitative (t-SNE) results suggest that our framework generates US images with greater variability compared to state-of-the-art methods. By using the synthetic images generated with our framework, we increase the accuracy of FHSP classifiers by 3.2% compared to training the same classifiers solely with real acquisitions. These achievements suggest that using our synthetic images to enlarge the training set could enhance the performance of DL algorithms for FHSP classification that could be integrated in real clinical scenarios.
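
For readers unfamiliar with the Fréchet inception distance quoted above, its usual definition is given below. It assumes the standard formulation, in which real and generated images are passed through an Inception network and the two sets of features are summarised by their means and covariances; the symbols are introduced here for illustration and do not appear in the abstract.

```latex
% \mu_r, \Sigma_r: mean and covariance of Inception features of real images
% \mu_g, \Sigma_g: mean and covariance of Inception features of generated images
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one.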






  • Article type: Journal Article
    The performance of three-dimensional (3D) point cloud reconstruction is affected by dynamic features such as vegetation. Vegetation can be detected by near-infrared (NIR)-based indices; however, the sensors providing multispectral data are resource intensive. To address this issue, this study proposes a two-stage framework to firstly improve the performance of the 3D point cloud generation of buildings with a two-view SfM algorithm, and secondly, reduce noise caused by vegetation. The proposed framework can also overcome the lack of near-infrared data when identifying vegetation areas for reducing interferences in the SfM process. The first stage includes cross-sensor training, model selection and the evaluation of image-to-image RGB to color infrared (CIR) translation with Generative Adversarial Networks (GANs). The second stage includes feature detection with multiple feature detector operators, feature removal with respect to the NDVI-based vegetation classification, masking, matching, pose estimation and triangulation to generate sparse 3D point clouds. The materials utilized in both stages are a publicly available RGB-NIR dataset, and satellite and UAV imagery. The experimental results indicate that the cross-sensor and category-wise validation achieves an accuracy of 0.9466 and 0.9024, with a kappa coefficient of 0.8932 and 0.9110, respectively. The histogram-based evaluation demonstrates that the predicted NIR band is consistent with the original NIR data of the satellite test dataset. Finally, the test on the UAV RGB and artificially generated NIR with a segmentation-driven two-view SfM proves that the proposed framework can effectively translate RGB to CIR for NDVI calculation. Further, the artificially generated NDVI is able to segment and classify vegetation. As a result, the generated point cloud is less noisy, and the 3D model is enhanced.
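
The second stage described above removes detected features that fall on vegetation, using an NDVI-based classification. A minimal NumPy sketch of that step follows. The NDVI formula, (NIR - Red) / (NIR + Red), is standard; the threshold of 0.3, the tiny 2x2 bands, and the helper name `drop_vegetation_keypoints` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def ndvi(nir, red, eps=1e-8):
    """Normalised difference vegetation index, computed per pixel."""
    # Convert to float first: subtracting uint8 arrays would wrap around.
    nir = nir.astype(np.float64)
    red = red.astype(np.float64)
    return (nir - red) / (nir + red + eps)

def drop_vegetation_keypoints(keypoints, veg_mask):
    """Keep only keypoints (row, col) that do not lie on vegetation pixels."""
    on_vegetation = veg_mask[keypoints[:, 0], keypoints[:, 1]]
    return keypoints[~on_vegetation]

# Tiny made-up bands: the left column is bright in NIR, as vegetation would be.
nir = np.array([[200, 50], [180, 60]], dtype=np.uint8)
red = np.array([[50, 60], [40, 70]], dtype=np.uint8)

index = ndvi(nir, red)
veg_mask = index > 0.3  # assumed threshold, not from the paper

keypoints = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
kept = drop_vegetation_keypoints(keypoints, veg_mask)
```

In the framework above the NIR band would come from the GAN's RGB-to-CIR translation rather than from a sensor; the masking step itself is the same in either case.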






  • Article type: Journal Article
    Accurate brain tumour segmentation is critical for tasks such as surgical planning, diagnosis, and analysis, with magnetic resonance imaging (MRI) being the preferred modality due to its excellent visualisation of brain tissues. However, the wide intensity range of voxel values in MR scans often results in significant overlap between the density distributions of different tumour tissues, leading to reduced contrast and segmentation accuracy. This paper introduces a novel framework based on conditional generative adversarial networks (cGANs) aimed at enhancing the contrast of tumour subregions for both voxel-wise and region-wise segmentation approaches. We present two models: Enhancement and Segmentation GAN (ESGAN), which combines classifier loss with adversarial loss to predict central labels of input patches, and Enhancement GAN (EnhGAN), which generates high-contrast synthetic images with reduced inter-class overlap. These synthetic images are then fused with corresponding modalities to emphasise meaningful tissues while suppressing weaker ones. We also introduce a novel generator that adaptively calibrates voxel values within input patches, leveraging fully convolutional networks. Both models employ a multi-scale Markovian network as a GAN discriminator to capture local patch statistics and estimate the distribution of MR images in complex contexts. Experimental results on publicly available MR brain tumour datasets demonstrate the competitive accuracy of our models compared to current brain tumour segmentation techniques.






  • Article type: Journal Article
    OBJECTIVE: Prostate cancer is one of the most common diseases affecting men. The main diagnostic and prognostic reference tool is the Gleason scoring system. An expert pathologist assigns a Gleason grade to a sample of prostate tissue. As this process is very time-consuming, some artificial intelligence applications were developed to automatize it. The training process is often confronted with insufficient and unbalanced databases which affect the generalisability of the models. Therefore, the aim of this work is to develop a generative deep learning model capable of synthesising patches of any selected Gleason grade to perform data augmentation on unbalanced data and test the improvement of classification models.
    METHODS: The methodology proposed in this work consists of a conditional Progressive Growing GAN (ProGleason-GAN) capable of synthesising prostate histopathological tissue patches by selecting the desired Gleason Grade cancer pattern in the synthetic sample. The conditional Gleason Grade information is introduced into the model through the embedding layers, so there is no need to add a term to the Wasserstein loss function. We used minibatch standard deviation and pixel normalisation to improve the performance and stability of the training process.
    RESULTS: The realism of the synthetic samples was assessed with the Fréchet Inception Distance (FID). We obtained an FID of 88.85 for non-cancerous patterns, 81.86 for GG3, 49.32 for GG4 and 108.69 for GG5 after post-processing stain normalisation. In addition, a group of expert pathologists was selected to perform an external validation of the proposed framework. Finally, the application of our proposed framework improved the classification results on the SICAPv2 dataset, proving its effectiveness as a data augmentation method.
    CONCLUSIONS: The ProGleason-GAN approach combined with stain normalisation post-processing provides state-of-the-art results regarding the Fréchet Inception Distance. This model can synthesise samples of non-cancerous patterns, GG3, GG4 or GG5. The inclusion of conditional information about the Gleason grade during the training process allows the model to select the cancerous pattern in a synthetic sample. The proposed framework can be used as a data augmentation method.
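
The METHODS paragraph of this entry mentions pixel normalisation and minibatch standard deviation as stabilising techniques. The NumPy sketch below shows both in the simplified form commonly described for progressive-growing GANs; whether ProGleason-GAN uses exactly these variants is an assumption, and the function names and array sizes are made up for illustration.

```python
import numpy as np

def pixel_norm(x, eps=1e-8):
    """Scale each pixel's feature vector to unit root-mean-square over channels.

    x: (N, C, H, W) feature maps.
    """
    return x / np.sqrt(np.mean(x ** 2, axis=1, keepdims=True) + eps)

def minibatch_stddev(x):
    """Append one feature map holding the average standard deviation over the batch.

    A batch of near-identical samples yields a value near zero, which gives
    the discriminator a direct cue that the generator lacks variety.
    """
    n, _, h, w = x.shape
    stat = x.std(axis=0).mean()          # single scalar for the whole batch
    extra = np.full((n, 1, h, w), stat)  # replicate it as an extra channel
    return np.concatenate([x, extra], axis=1)

rng = np.random.default_rng(0)
features = rng.normal(size=(8, 16, 4, 4))

normed = pixel_norm(features)
augmented = minibatch_stddev(features)
# A batch made of eight copies of one sample has no variation at all.
collapsed = minibatch_stddev(np.repeat(features[:1], 8, axis=0))
```

Pixel normalisation is typically applied in the generator and the minibatch statistic in the discriminator, so the two functions act on different networks even though they are shown together here.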






  • Article type: Journal Article
    BACKGROUND: Ultra-Wide-Field (UWF) fundus imaging is an essential diagnostic tool for identifying ophthalmologic diseases, as it captures detailed retinal structures within a wider field of view (FOV). However, the presence of eyelashes along the edge of the eyelids can cast shadows and obscure the view of fundus imaging, which hinders reliable interpretation and subsequent screening of fundus diseases. Despite this limitation, there are currently no effective methods or datasets available for removing eyelash artifacts from UWF fundus images. This research aims to develop an effective approach for eyelash artifact removal and thus improve the visual quality of UWF fundus images for accurate analysis and diagnosis.
    METHODS: To address this issue, we first constructed two UWF fundus datasets: the paired synthetic eyelashes (PSE) dataset and the unpaired real eyelashes (uPRE) dataset. Then we proposed a deep learning architecture called Joint Conditional Generative Adversarial Networks (JcGAN) to remove eyelash artifacts from UWF fundus images. JcGAN employs a shared generator with two discriminators for joint learning of both real and synthetic eyelash artifacts. Furthermore, we designed a background refinement module that refines background information and is trained with the generator in an end-to-end manner.
    RESULTS: Experimental results on both PSE and uPRE datasets demonstrate the superiority of the proposed JcGAN over several state-of-the-art deep learning approaches. Compared with the best existing method, JcGAN improves PSNR and SSIM by 4.82% and 0.23%, respectively. In addition, we verified that eyelash artifact removal via JcGAN could significantly improve vessel segmentation performance in UWF fundus images. Assessment via vessel segmentation illustrates that the sensitivity, Dice coefficient and area under curve (AUC) of ResU-Net increased by 3.64%, 1.54%, and 1.43%, respectively, after eyelash artifact removal using JcGAN.
    CONCLUSIONS: The proposed JcGAN effectively removes eyelash artifacts in UWF images, resulting in improved visibility of retinal vessels. Our method can facilitate better processing and analysis of retinal vessels and has the potential to improve diagnostic outcomes.






  • Article type: Journal Article
    Acquiring biomedical data and recruiting a sufficient number of participants are difficult processes that consume time and effort. Moreover, the success rates of computer-aided diagnosis (CAD) algorithms depend on the sample size and the feature space. In this paper, conditional generative adversarial network (CGAN)-based enhanced feature generation is proposed to synthesize large-sample datasets with higher class separability. Twenty-five percent of five medical datasets are used to train the CGAN, and synthetic datasets of arbitrary sample size are evaluated and compared with the originals. Thus, new datasets can be generated with the help of the CGAN model from fewer collected samples. This helps physicians reduce sample collection effort, and it increases the accuracy of CAD systems by using generated data with enhanced feature vectors. The synthesized datasets are classified using nearest neighbor, radial basis function support vector machine, and artificial neural network classifiers to analyze the effectiveness of the proposed CGAN model.






  • Article type: Journal Article
    Automatic detection of retinal diseases based on deep learning technology and Ultra-widefield (UWF) images has played an important role in clinical practice in recent years. However, due to small lesions and limited data samples, it is not easy to train an accurate detection model with strong generalization ability. In this paper, we propose a lesion attention conditional generative adversarial network (LAC-GAN) to synthesize retinal images with realistic lesion details to improve the training of the disease detection model. Specifically, the generator takes the vessel mask and class label as the conditional inputs, and processes the random Gaussian noise with a series of residual blocks to generate the synthetic images. To focus on pathological information, we propose a lesion feature attention mechanism based on the random forest (RF) method, which constructs its reverse activation network to activate the lesion features. For the discriminator, a weight-sharing multi-discriminator is designed to improve the performance of the model by affine transformations. Experimental results on multi-center UWF image datasets demonstrate that the proposed method can generate retinal images with reasonable details, which helps to enhance the performance of the disease detection model.






  • Article type: Journal Article
    OBJECTIVE: We propose a method to model families of distributions of particles exiting a phantom with a conditional generative adversarial network (condGAN) during Monte Carlo simulation of single photon emission computed tomography imaging devices.
    APPROACH: The proposed condGAN is trained on a low statistics dataset containing the energy, the time, the position and the direction of exiting particles. In addition, it also contains a vector of conditions composed of four dimensions: the initial energy and the position of emitted particles within the phantom (a total of 12 dimensions). The information related to the gammas absorbed within the phantom is also added in the dataset. At the end of the training process, one component of the condGAN, the generator (G), is obtained.
    MAIN RESULTS: Particles with specific energies and positions of emission within the phantom can then be generated with G to replace the tracking of particles within the phantom, allowing reduced computation time compared to conventional Monte Carlo simulation.
    SIGNIFICANCE: The condGAN generator is trained only once for a given phantom but can generate particles from various activity source distributions.





