
  • Article type: Journal Article
    BACKGROUND: Residual image noise is substantial in positron emission tomography (PET) and is one of the factors limiting lesion detection, quantification, and overall image quality. Improving noise reduction therefore remains of considerable interest, especially for respiratory-gated PET investigations. The only broadly used approach for noise reduction in PET imaging has been the application of low-pass filters, usually Gaussians, which, however, leads to loss of spatial resolution and increased partial volume effects, affecting the detectability of small lesions and quantitative data evaluation. The bilateral filter (BF), a locally adaptive image filter, reduces image noise while preserving well-defined object edges, but manual optimization of the filter parameters for a given PET scan can be tedious and time-consuming, hampering its clinical use. In this work we investigated to what extent a deep-learning-based approach can resolve this issue by training a network to reproduce the results of manually adjusted, case-specific bilateral filtering.
    METHODS: Altogether, 69 respiratory-gated clinical PET/CT scans with three different tracers ([18F]FDG, [18F]L-DOPA, [68Ga]DOTATATE) were used for the present investigation. Prior to data processing, the gated data sets were split, resulting in a total of 552 single-gate image volumes. For each of these image volumes, four 3D ROIs were delineated: one ROI for image noise assessment and three ROIs for focal uptake (e.g. tumor lesions) measurements at different target/background contrast levels. An automated procedure performed a brute-force search of the two-dimensional BF parameter space for each data set to identify the "optimal" filter parameters and thereby generate user-approved ground truth input data consisting of pairs of original and optimally BF-filtered images. To reproduce the optimal BF filtering, we employed a modified 3D U-Net CNN incorporating the residual learning principle. Network training and evaluation were performed using a 5-fold cross-validation scheme. The influence of filtering on lesion SUV quantification and image noise level was assessed by calculating absolute and fractional differences between the CNN, manual BF, and original (STD) data sets in the previously defined ROIs.
    RESULTS: The automated procedure used for filter parameter determination chose adequate filter parameters for the majority of the data sets, with only 19 patient data sets requiring manual tuning. Evaluation of the focal uptake ROIs revealed that both CNN- and BF-based filtering essentially maintain the focal SUVmax values of the unfiltered images, with low mean ± SD differences of $\delta\mathrm{SUV}_{\max}^{\mathrm{CNN,STD}} = (-3.9 \pm 5.2)\%$ and $\delta\mathrm{SUV}_{\max}^{\mathrm{BF,STD}} = (-4.4 \pm 5.3)\%$. Regarding the relative performance of CNN versus BF, both methods lead to very similar SUVmax values in the vast majority of cases, with an overall average difference of $\delta\mathrm{SUV}_{\max}^{\mathrm{CNN,BF}} = (0.5 \pm 4.8)\%$. Evaluation of the noise properties showed that CNN filtering mostly reproduces the noise level and characteristics of BF satisfactorily, with $\delta\mathrm{Noise}^{\mathrm{CNN,BF}} = (5.6 \pm 10.5)\%$. No significant tracer-dependent differences between CNN and BF were observed.
    CONCLUSIONS: Our results show that neural-network-based denoising can reproduce the results of a case-by-case optimized BF in a fully automated way. Apart from rare cases, it led to images of practically identical quality regarding noise level, edge preservation, and signal recovery. We believe such a network might prove especially useful in the context of improved motion correction of respiratory-gated PET studies, but it could also help to establish BF-equivalent edge-preserving CNN filtering in clinical PET, since it obviates time-consuming manual BF parameter tuning.
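To make the two-parameter nature of the bilateral filter concrete, the following is a minimal 1D sketch in plain Python, not the authors' 3D implementation. The function names, the parameter names `sigma_space` and `sigma_range`, the 3-sigma window, and the definition of the fractional difference as `(value - reference) / reference` are all assumptions made for illustration; the abstract does not specify them.

```python
import math


def bilateral_filter_1d(signal, sigma_space, sigma_range):
    """Illustrative 1D bilateral filter (hypothetical helper, not from the paper).

    Each output sample is a weighted mean of its neighbours. The weight is the
    product of a spatial Gaussian (distance in samples) and a range Gaussian
    (difference in intensity), so samples across a strong edge contribute
    almost nothing and the edge is preserved.
    """
    radius = max(1, int(math.ceil(3 * sigma_space)))  # assumed 3-sigma window
    n = len(signal)
    filtered = []
    for i in range(n):
        weighted_sum = 0.0
        weight_total = 0.0
        for j in range(max(0, i - radius), min(n, i + radius + 1)):
            w_space = math.exp(-((i - j) ** 2) / (2.0 * sigma_space ** 2))
            w_range = math.exp(
                -((signal[i] - signal[j]) ** 2) / (2.0 * sigma_range ** 2)
            )
            weight = w_space * w_range
            weighted_sum += weight * signal[j]
            weight_total += weight
        filtered.append(weighted_sum / weight_total)
    return filtered


def fractional_difference_percent(value, reference):
    """Assumed definition of a fractional difference, in percent."""
    return 100.0 * (value - reference) / reference


# A noisy step edge: low background on the left, high uptake on the right.
noisy = [1.0, 1.2, 0.8, 1.1, 0.9, 10.0, 10.2, 9.8, 10.1, 9.9]
smoothed = bilateral_filter_1d(noisy, sigma_space=2.0, sigma_range=1.0)

# Compare the maximum before and after filtering, in the spirit of an
# SUVmax comparison between filtered and unfiltered data.
delta_max = fractional_difference_percent(max(smoothed), max(noisy))
```

In this toy signal the noise on each plateau is averaged down while the step between the plateaus stays sharp, which is the behaviour a Gaussian low-pass filter cannot provide. A brute-force search as described in METHODS would loop over a grid of the two sigma values and score each result; the scoring criterion is not given in the abstract and is therefore omitted here.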






  • Article type: Journal Article
    Efficient video coding technology is more important than ever now that video applications are increasing worldwide and Internet of Things (IoT) devices are becoming widespread. In this context, the recently completed MPEG-5 Essential Video Coding (EVC) standard deserves careful review, because the EVC Baseline profile is tailored to the low-complexity requirements of processing IoT video data. Nevertheless, the EVC Baseline profile has a notable disadvantage: since it is a codec composed only of simple tools developed more than 20 years ago, it tends to exhibit numerous coding artifacts. In particular, blocking artifacts at block boundaries are regarded as a critical issue that must be addressed. To this end, this paper proposes a post-filter using a Convolutional Neural Network (CNN) based on block partitioning information. In the experiments, the proposed method improves PSNR by approximately 0.57 dB in the All-Intra (AI) configuration and 0.37 dB in the Low-Delay (LD) configuration compared with the video before post-filtering, and the enhanced PSNR results in an overall bitrate reduction of 11.62% for AI and 10.91% for LD in the luma and chroma components. Owing to this improvement in PSNR, the proposed method also significantly improves subjective visual quality, particularly with respect to blocking artifacts at coding block boundaries.
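The PSNR gains quoted above can be read against the standard definition of PSNR. The sketch below computes it for flat lists of 8-bit sample values; the function name, the toy sample values, and the choice of a peak value of 255 are assumptions for illustration, and the paper's actual evaluation setup (per-component, per-frame averaging) is not described in the abstract.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if len(reference) != len(distorted):
        raise ValueError("inputs must have the same number of samples")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical luma samples: original, decoded, and decoded + post-filter.
original = [100, 110, 120, 130]
decoded = [104, 106, 124, 126]       # every sample is off by 4
post_filtered = [102, 108, 122, 128]  # every sample is off by 2

gain_db = psnr(original, post_filtered) - psnr(original, decoded)
```

Halving the error on every sample reduces the mean squared error by a factor of four, so `gain_db` is `10 * log10(4)`, about 6.02 dB. The 0.57 dB and 0.37 dB gains reported in the abstract correspond to much smaller reductions in mean squared error.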






  • Article type: Journal Article
    The paper proposes a novel post-filtering method based on convolutional neural networks (CNNs) for quality enhancement of RGB/grayscale images and video sequences. The lossy images are encoded using common image codecs, such as JPEG and JPEG2000. The video sequences are encoded using previous and ongoing video coding standards, high-efficiency video coding (HEVC) and versatile video coding (VVC), respectively. A novel deep neural network architecture is proposed to estimate fine refinement details for full-, half-, and quarter-patch resolutions. The proposed architecture is built using a set of efficient processing blocks designed based on the following concepts: (i) the multi-head attention mechanism for refining the feature maps, (ii) the weight sharing concept for reducing the network complexity, and (iii) novel block designs of layer structures for multiresolution feature fusion. The proposed method provides substantial performance improvements compared with both common image codecs and video coding standards. Experimental results on high-resolution images and standard video sequences show that the proposed post-filtering method provides average BD-rate savings of 31.44% over JPEG and 54.61% over HEVC (x265) for RGB images, Y-BD-rate savings of 26.21% over JPEG and 15.28% over VVC (VTM) for grayscale images, and 15.47% over HEVC and 14.66% over VVC for video sequences.






  • Article type: Journal Article
    To reduce the complexity and cost of the microphone array, this paper proposes a dual-microphone sound localization and speech enhancement algorithm. Starting from the time delay estimation of the signals received by the two microphones, the method combines energy difference estimation and controllable beam response power to compute the 3D coordinates of the acoustic source and thus achieve dual-microphone sound localization. Based on the azimuth angle of the acoustic source and an analysis of the independent quantity of the speech signal, the speaker signal is separated from the acoustic source. On this basis, Wiener post-filtering is used to amplify and suppress the voice signal of the speaker, which helps to achieve speech enhancement. Experimental results show that the proposed dual-microphone sound localization algorithm can accurately identify the sound location, and that the speech enhancement algorithm is more robust and adaptable than the original algorithm.
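The time delay estimation step mentioned above can be sketched with a plain cross-correlation search followed by the far-field relation between delay and azimuth. This covers only that first step and is not the paper's algorithm, which additionally combines energy difference estimation and beam response power to obtain 3D coordinates. The function names, the far-field assumption, the sample rate, the microphone spacing, and the speed of sound of 343 m/s are all illustrative assumptions.

```python
import math


def estimate_delay_samples(x, y, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation.

    A positive result means that y is a delayed copy of x.
    """
    best_lag = 0
    best_score = -math.inf
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(len(x)):
            j = i + lag
            if 0 <= j < len(y):
                score += x[i] * y[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def azimuth_from_delay(delay_samples, sample_rate_hz, mic_distance_m,
                       speed_of_sound=343.0):
    """Far-field azimuth in degrees, relative to the broadside direction."""
    tau = delay_samples / sample_rate_hz
    # Clamp to [-1, 1] so that noise cannot push asin out of its domain.
    ratio = max(-1.0, min(1.0, speed_of_sound * tau / mic_distance_m))
    return math.degrees(math.asin(ratio))


# Toy signals: microphone 2 receives the same pulse 3 samples later.
mic1 = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0, 0, 0]
mic2 = [0, 0, 0, 0, 0, 1, 3, 1, 0, 0, 0, 0]

lag = estimate_delay_samples(mic1, mic2, max_lag=5)
angle_deg = azimuth_from_delay(lag, sample_rate_hz=16000, mic_distance_m=0.2)
```

With two microphones, the delay alone constrains the source to a cone around the microphone axis, which is presumably why the paper adds energy difference and beam response power cues to resolve a 3D position.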






  • Article type: Journal Article
    Several researchers have explored deep-learning-based post-filters to increase the quality of statistical parametric speech synthesis. These post-filters map synthetic speech to natural speech, treating the different parameters separately and trying to reduce the gap between them. Long Short-Term Memory (LSTM) neural networks have been applied successfully for this purpose, but many aspects of the results and of the process itself can still be improved. In this paper, we introduce a new pre-training approach for the LSTM, with the objective of enhancing the quality of the synthesized speech, particularly in the spectrum, in a more efficient manner. Our approach begins with an auto-associative training of one LSTM network, which is then used as an initialization for the post-filters. We show the advantages of this initialization for enhancing the Mel-Frequency Cepstral parameters of synthetic speech. Results show that, in most cases, the initialization achieves better enhancement of the statistical parametric speech spectrum than the common random initialization of the networks.







  • Article type: Journal Article
    BACKGROUND: Positron emission tomography (PET) imaging has wide applicability in oncology, cardiology and neurology. However, a major drawback when imaging very active regions such as the bladder is the spill-in effect, which leads to inaccurate quantification and obscured visualisation of nearby lesions. Therefore, this study aims to investigate and correct for the spill-in effect from high-activity regions to the surroundings as a function of activity in the hot region, lesion size and location, system resolution and application of post-filtering, using a recently proposed background correction technique. This study involves analytical simulations for the digital XCAT2 phantom and validation using NEMA phantom and patient data acquired with the GE Signa PET/MR scanner. Reconstructions were done using the ordered subset expectation maximisation (OSEM) algorithm. A dedicated point-spread function (OSEM+PSF) and a recently proposed background correction (OSEM+PSF+BC) were incorporated into the reconstruction for spill-in correction. The standardised uptake values (SUV) were compared for all reconstruction algorithms.
    RESULTS: The simulation study revealed that lesions within 15-20 mm from the hot region were predominantly affected by the spill-in effect, leading to an increased bias and impaired lesion visualisation within the region. For OSEM, lesion SUVmax converged to the true value at low bladder activity, but as activity increased, there was an overestimation of as much as 19% for proximal lesions (distance around 15-20 mm from the bladder edge) and 2-4% for distant lesions (distance larger than 20 mm from the bladder edge). As bladder SUV increases, the % SUV change for proximal lesions is about 31% and 6% for SUVmax and SUVmean, respectively, showing that the spill-in effect is more evident for SUVmax than for SUVmean. Also, the application of post-filtering resulted in an increase of up to 65% in the spill-in effect around the bladder edges. For proximal lesions, PSF offers no major improvement over OSEM because of the spill-in effect, coupled with the blurring effect of post-filtering. Within two voxels around the bladder, the spill-in effect in OSEM is 42% (32%), while for OSEM+PSF, it is 31% (19%), with (and without) post-filtering, respectively. With OSEM+PSF+BC, however, the spill-in contribution from the bladder was relatively low (below 5%, either with or without post-filtering). These results were further validated using the NEMA phantom and patient data, for which OSEM+PSF+BC showed about 70-80% spill-in reduction around the bladder edges and an increase in contrast-to-noise ratio of up to 36% compared to OSEM and OSEM+PSF reconstructions without post-filtering.
    CONCLUSIONS: The spill-in effect is dependent on the activity in the hot region, lesion size and location, as well as post-filtering; and this is more evident in SUVmax than SUVmean. However, the recently proposed background correction method facilitates stability in quantification and enhances the contrast in lesions with low uptake.







  • Article type: Journal Article
    Accurately predicting protein-protein interaction sites (PPIs) is currently a hot topic because it has been demonstrated to be very useful for understanding disease mechanisms and designing drugs. Machine-learning-based computational approaches have been broadly utilized and demonstrated to be useful for PPI prediction. However, directly applying traditional machine learning algorithms, which often assume that samples in different classes are balanced, often leads to poor performance because of the severe class imbalance that exists in the PPI prediction problem. In this study, we propose a novel method for improving PPI prediction performance by relieving the severity of class imbalance using a data-cleaning procedure and reducing predicted false positives with a post-filtering procedure: First, a machine-learning-based data-cleaning procedure is applied to remove those marginal targets, which may potentially have a negative effect on training a model with a clear classification boundary, from the majority samples to relieve the severity of class imbalance in the original training dataset; then, a prediction model is trained on the cleaned dataset; finally, an effective post-filtering procedure is further used to reduce potential false positive predictions. Stringent cross-validation and independent validation tests on benchmark datasets demonstrated the efficacy of the proposed method, which exhibits highly competitive performance compared with existing state-of-the-art sequence-based PPIs predictors and should supplement existing PPI prediction methods.





