Datasets as Topic

  • Article type: Journal Article
    OBJECTIVE: Convolutional Neural Networks (CNNs) have emerged as transformative tools in the field of radiation oncology, significantly advancing the precision of contouring practices. However, the adaptability of these algorithms across diverse scanners, institutions, and imaging protocols remains a considerable obstacle. This study aims to investigate the effects of incorporating institution-specific datasets into the training regimen of CNNs to assess their generalization ability in real-world clinical environments. Focusing on a data-centric analysis, the study examines how different multi-center and single-center training approaches influence algorithm performance.
    METHODS: nnU-Net is trained using a dataset comprising 161 18F-PSMA-1007 PET images collected from four distinct institutions (Freiburg: n = 96, Munich: n = 19, Cyprus: n = 32, Dresden: n = 14). The dataset is partitioned such that data from each center are systematically excluded from training and used solely for testing to assess the model's generalizability and adaptability to data from unfamiliar sources. Performance is compared through 5-fold cross-validation, providing a detailed comparison between models trained on datasets from single centers and those trained on aggregated multi-center datasets. The Dice similarity coefficient (DSC), Hausdorff distance and volumetric analysis are used as primary evaluation metrics.
    RESULTS: The mixed training approach yielded a median DSC of 0.76 (IQR: 0.64-0.84) in a five-fold cross-validation, showing no significant differences (p = 0.18) compared to models trained with data exclusion from each center, which performed with a median DSC of 0.74 (IQR: 0.56-0.86). Significant performance improvements regarding multi-center training were observed for the Dresden cohort (multi-center median DSC 0.71, IQR: 0.58-0.80 vs. single-center 0.68, IQR: 0.50-0.80, p < 0.001) and Cyprus cohort (multi-center 0.74, IQR: 0.62-0.83 vs. single-center 0.72, IQR: 0.54-0.82, p < 0.01). While Munich and Freiburg also showed performance improvements with multi-center training, results showed no statistical significance (Munich: multi-center DSC 0.74, IQR: 0.60-0.80 vs. single-center 0.72, IQR: 0.59-0.82, p > 0.05; Freiburg: multi-center 0.78, IQR: 0.53-0.87 vs. single-center 0.71, IQR: 0.53-0.83, p = 0.23).
    CONCLUSIONS: CNNs trained for auto contouring intraprostatic GTV in 18F-PSMA-1007 PET on a diverse dataset from multiple centers mostly generalize well to unseen data from other centers. Training on a multicentric dataset can improve performance compared to training exclusively with a single-center dataset regarding intraprostatic 18F-PSMA-1007 PET GTV segmentation. The segmentation performance of the same CNN can vary depending on the dataset employed for training and testing.
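    The evaluation design described in this abstract (hold out one center at a time, score with DSC) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the flattened-mask representation, and the placeholder case identifiers are assumptions; only the per-center case counts come from the abstract.

```python
from typing import Dict, Iterator, List, Sequence, Tuple

# Case counts per center as reported in the abstract (161 images in total).
CENTER_SIZES: Dict[str, int] = {
    "Freiburg": 96,
    "Munich": 19,
    "Cyprus": 32,
    "Dresden": 14,
}


def dice_score(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity coefficient for two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def leave_one_center_out(
    cases_by_center: Dict[str, List[str]],
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (held_out_center, train_cases, test_cases) for each center."""
    for held_out in cases_by_center:
        train = [
            case
            for center, cases in cases_by_center.items()
            if center != held_out
            for case in cases
        ]
        test = list(cases_by_center[held_out])
        yield held_out, train, test


# Hypothetical case identifiers, used only to show the split sizes.
cases = {
    center: [f"{center}_{i:03d}" for i in range(n)]
    for center, n in CENTER_SIZES.items()
}
splits = {held_out: (train, test) for held_out, train, test in leave_one_center_out(cases)}
```

    With this split, holding out Dresden leaves 147 training cases and 14 test cases, so the test data come entirely from a center the model has not seen.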

  • Article type: Journal Article
    Advancements in optical coherence control1-5 have unlocked many cutting-edge applications, including long-haul communication, light detection and ranging (LiDAR) and optical coherence tomography6-8. Prevailing wisdom suggests that using more coherent light sources leads to enhanced system performance and device functionalities9-11. Our study introduces a photonic convolutional processing system that takes advantage of partially coherent light to boost computing parallelism without substantially sacrificing accuracy, potentially enabling larger-size photonic tensor cores. The reduction of the degree of coherence optimizes bandwidth use in the photonic convolutional processing system. This breakthrough challenges the traditional belief that coherence is essential or even advantageous in integrated photonic accelerators, thereby enabling the use of light sources with less rigorous feedback control and thermal-management requirements for high-throughput photonic computing. Here we demonstrate such a system in two photonic platforms for computing applications: a photonic tensor core using phase-change-material photonic memories that delivers parallel convolution operations to classify the gaits of ten patients with Parkinson's disease with 92.2% accuracy (92.7% theoretically) and a silicon photonic tensor core with embedded electro-absorption modulators (EAMs) to facilitate 0.108 tera operations per second (TOPS) convolutional processing for classifying the Modified National Institute of Standards and Technology (MNIST) handwritten digits dataset with 92.4% accuracy (95.0% theoretically).

  • Article type: Journal Article
    When a disaster occurs, the authorities must prioritise two things: first, the search and rescue of lives, and second, the identification and management of deceased individuals. However, with thousands of dead bodies to be individually identified in mass disasters, forensic teams face challenges such as long working hours, which delay the identification process, and public health concerns caused by the decomposition of bodies. Using dental panoramic imaging, teeth have been used in forensics as a physical marker to estimate the age of an individual. Traditionally, dental age estimation has been performed manually by experts. Although the procedure is fairly simple, the large number of victims and the limited amount of time available to complete the assessment during large-scale disasters make forensic work even more challenging. The emergence of artificial intelligence (AI) in the fields of medicine and dentistry has led to the suggestion of automating the current process as an alternative to the conventional method. This study aims to test the accuracy and performance of the developed deep convolutional neural network system for age estimation on a large, out-of-sample dataset of Malaysian children using digital dental panoramic imaging. Forensic Dental Estimation Lab (F-DentEst Lab) is a computer application developed to perform dental age estimation digitally. The system was introduced to improve on the conventional method of age estimation and to significantly increase the efficiency of the process through an AI-based approach. A total of 1,892 digital dental panoramic images were retrospectively collected to test F-DentEst Lab. Training, validation, and testing were conducted in the early stage of the development of F-DentEst Lab, with 80% of the data allocated to training and the remaining 20% to testing. The methodology comprised four major steps: image preprocessing in line with the inclusion criteria for panoramic dental imaging; segmentation of mandibular premolars using the Dynamic Programming-Active Contour (DP-AC) method; classification using a Deep Convolutional Neural Network (DCNN); and statistical analysis. The suggested DCNN approach underestimated chronological age with a small ME of 0.03 and 0.05 for females and males, respectively.
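    The error figure quoted at the end of this abstract can be illustrated with a short sketch, assuming that ME denotes the mean error between estimated and chronological age. The sign convention (estimated minus chronological, so negative values indicate underestimation), the function names, and the sample ages are assumptions made for illustration; the abstract does not state them.

```python
from typing import Sequence


def mean_error(estimated: Sequence[float], chronological: Sequence[float]) -> float:
    """Mean signed error; negative values mean age is underestimated on average."""
    if len(estimated) != len(chronological) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(e - c for e, c in zip(estimated, chronological)) / len(estimated)


def mean_absolute_error(estimated: Sequence[float], chronological: Sequence[float]) -> float:
    """Mean unsigned error; unlike the signed mean, errors cannot cancel out."""
    if len(estimated) != len(chronological) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(e - c) for e, c in zip(estimated, chronological)) / len(estimated)


# Hypothetical ages, not data from the study.
estimated_ages = [10.0, 13.0]
chronological_ages = [10.5, 12.5]
signed = mean_error(estimated_ages, chronological_ages)
unsigned = mean_absolute_error(estimated_ages, chronological_ages)
```

    In this toy example the signed error is zero while the absolute error is 0.5, which is why a small ME on its own does not show how far individual estimates deviate.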

  • Article type: News
    No abstract available.

  • Article type: Journal Article
    Meiotic errors of relatively small chromosomes in oocytes result in egg aneuploidies that cause miscarriages and congenital diseases. Unlike somatic cells, which preferentially mis-segregate larger chromosomes, aged oocytes preferentially mis-segregate smaller chromosomes through unclear processes. Here, we provide a comprehensive three-dimensional chromosome identifying-and-tracking dataset throughout meiosis I in live mouse oocytes. This analysis reveals a prometaphase pathway that actively moves smaller chromosomes to the inner region of the metaphase plate. In the inner region, chromosomes are pulled by stronger bipolar microtubule forces, which facilitates premature chromosome separation, a major cause of segregation errors in aged oocytes. This study reveals a spatial pathway that facilitates aneuploidy of small chromosomes preferentially in aged eggs and implicates the role of the M phase in creating a chromosome size-based spatial arrangement.

  • Article type: Letter
    No abstract available.

  • Article type: Journal Article
    Genetic variation that influences gene expression and splicing is a key source of phenotypic diversity1-5. Although invaluable, studies investigating these links in humans have been strongly biased towards participants of European ancestries, which constrains generalizability and hinders evolutionary research. Here, to address these limitations, we developed MAGE, an open-access RNA sequencing dataset of lymphoblastoid cell lines from 731 individuals from the 1000 Genomes Project6, spread across 5 continental groups and 26 populations. Most variation in gene expression (92%) and splicing (95%) was distributed within versus between populations, which mirrored the variation in DNA sequence. We mapped associations between genetic variants and expression and splicing of nearby genes (cis-expression quantitative trait loci (eQTLs) and cis-splicing QTLs (sQTLs), respectively). We identified more than 15,000 putatively causal eQTLs and more than 16,000 putatively causal sQTLs that are enriched for relevant epigenomic signatures. These include 1,310 eQTLs and 1,657 sQTLs that are largely private to underrepresented populations. Our data further indicate that the magnitude and direction of causal eQTL effects are highly consistent across populations. Moreover, the apparent 'population-specific' effects observed in previous studies were largely driven by low resolution or additional independent eQTLs of the same genes that were not detected. Together, our study expands our understanding of human gene expression diversity and provides an inclusive resource for studying the evolution and function of human genomes.

  • Article type: Journal Article
    Gene expression in Arabidopsis is regulated by more than 1,900 transcription factors (TFs), which have been identified genome-wide by the presence of well-conserved DNA-binding domains. Activator TFs contain activation domains (ADs) that recruit coactivator complexes; however, for nearly all Arabidopsis TFs, we lack knowledge about the presence, location and transcriptional strength of their ADs1. To address this gap, here we use a yeast library approach to experimentally identify Arabidopsis ADs on a proteome-wide scale, and find that more than half of the Arabidopsis TFs contain an AD. We annotate 1,553 ADs, the vast majority of which are, to our knowledge, previously unknown. Using the dataset generated, we develop a neural network to accurately predict ADs and to identify sequence features that are necessary to recruit coactivator complexes. We uncover six distinct combinations of sequence features that result in activation activity, providing a framework to interrogate the subfunctionalization of ADs. Furthermore, we identify ADs in the ancient AUXIN RESPONSE FACTOR family of TFs, revealing that AD positioning is conserved in distinct clades. Our findings provide a deep resource for understanding transcriptional activation, a framework for examining function in intrinsically disordered regions and a predictive model of ADs.

  • Article type: Journal Article
    BACKGROUND: Global longitudinal strain (GLS) is reported to be more reproducible and prognostic than ejection fraction. Automated, transparent methods may increase trust and uptake.
    OBJECTIVE: The authors developed an open machine-learning-based GLS methodology and validated it using multiexpert consensus from the Unity UK Echocardiography AI Collaborative.
    METHODS: We trained a multi-image neural network (Unity-GLS) to identify annulus, apex, and endocardial curve on 6,819 apical 4-, 2-, and 3-chamber images. The external validation dataset comprised those 3 views from 100 echocardiograms. End-systolic and -diastolic frames were each labelled by 11 experts to form consensus tracings and points. They also ordered the echocardiograms by visual grading of longitudinal function. One expert calculated global strain using 2 proprietary packages.
    RESULTS: The median GLS, averaged across the 11 individual experts, was -16.1 (IQR: -19.3 to -12.5). Using each case's expert consensus measurement as the reference standard, individual expert measurements had a median absolute error of 2.00 GLS units. In comparison, the errors of the machine methods were: Unity-GLS 1.3, proprietary A 2.5, proprietary B 2.2. The correlations with the expert consensus values were for individual experts 0.85, Unity-GLS 0.91, proprietary A 0.73, proprietary B 0.79. Using the multiexpert visual ranking as the reference, individual expert strain measurements found a median rank correlation of 0.72, Unity-GLS 0.77, proprietary A 0.70, and proprietary B 0.74.
    CONCLUSIONS: Our open-source approach to calculating GLS agrees with experts' consensus as strongly as the individual expert measurements and proprietary machine solutions. The training data, code, and trained networks are freely available online.
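    The two agreement measures reported in this abstract, median absolute error against a consensus value and rank correlation against a visual ordering, can be sketched as below. This is an illustrative sketch only: the abstract does not name the rank-correlation statistic, so Spearman's coefficient is assumed here, and the function names and sample values are hypothetical.

```python
from statistics import mean, median
from typing import List, Sequence


def median_absolute_error(measured: Sequence[float], reference: Sequence[float]) -> float:
    """Median of |measured - reference| over paired cases."""
    if len(measured) != len(reference) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    return median(abs(m - r) for m, r in zip(measured, reference))


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rank correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical GLS values for three cases, not data from the study.
method_gls = [-16.0, -18.5, -12.0]
consensus_gls = [-15.0, -19.0, -14.0]
error = median_absolute_error(method_gls, consensus_gls)
```

    The median is used rather than the mean so that a few cases with large disagreement do not dominate the summary, which matches how the abstract reports its errors.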

  • Article type: Journal Article
    No abstract available.