Computing Methodologies

  • Article Type: Journal Article
    BACKGROUND: Biomarker discovery is a challenging task due to the massive search space. Quantum computing and quantum Artificial Intelligence (quantum AI) can be used to address the computational problem of biomarker discovery from genetic data.
    METHODS: We propose a Quantum Neural Network architecture to discover genetic biomarkers for input activation pathways. The Maximum Relevance-Minimum Redundancy criterion scores biomarker candidate sets. Our proposed model is economical since the neural solution can be delivered on constrained hardware.
    RESULTS: We demonstrate the proof of concept on four activation pathways associated with CTLA4, including (1) stand-alone CTLA4 activation, (2) CTLA4-CD8A-CD8B co-activation, (3) CTLA4-CD2 co-activation, and (4) CTLA4-CD2-CD48-CD53-CD58-CD84 co-activation.
    CONCLUSIONS: The model indicates new genetic biomarkers associated with the mutational activation of CTLA4-associated pathways, including 20 genes: CLIC4, CPE, ETS2, FAM107A, GPR116, HYOU1, LCN2, MACF1, MT1G, NAPA, NDUFS5, PAK1, PFN1, PGAP3, PPM1G, PSMD8, RNF213, SLC25A3, UBA1, and WLS. We open-source the implementation at: https://github.com/namnguyen0510/Biomarker-Discovery-with-Quantum-Neural-Networks.
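    For readers who want to experiment, below is a minimal sketch of a generic Maximum Relevance-Minimum Redundancy (mRMR) score over a candidate gene set, using mutual information. It illustrates the criterion in its common form; the authors' exact variant may differ, and the mrmr_score function, its inputs, and the discretization assumption are ours.

        # Generic mRMR scoring sketch (illustrative; the paper's exact variant
        # may differ). Relevance: mean mutual information between each candidate
        # gene and the pathway-activation label. Redundancy: mean pairwise
        # mutual information within the candidate set.
        from itertools import combinations
        import numpy as np
        from sklearn.metrics import mutual_info_score

        def mrmr_score(X, y, genes):
            """X: (samples, genes) discretized genetic matrix; y: binary
            pathway-activation labels; genes: candidate column indices."""
            relevance = np.mean([mutual_info_score(X[:, g], y) for g in genes])
            if len(genes) < 2:
                return relevance
            redundancy = np.mean([mutual_info_score(X[:, a], X[:, b])
                                  for a, b in combinations(genes, 2)])
            return relevance - redundancy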

  • Article Type: Journal Article
    BACKGROUND: The SARS-CoV-2 (COVID-19) pandemic has exposed the need to understand the risk drivers that contribute to uneven morbidity and mortality in US communities. Addressing the community-specific social determinants of health (SDOH) that correlate with spread of SARS-CoV-2 provides an opportunity for targeted public health intervention to promote greater resilience to viral respiratory infections.
    METHODS: Our work combined publicly available COVID-19 statistics with county-level SDOH information. Machine learning models were trained to predict COVID-19 case growth and to understand the social, physical, and environmental risk factors associated with higher rates of SARS-CoV-2 infection in Tennessee and Georgia counties. Model accuracy was assessed by comparing predicted case counts to actual positive case counts in each county.
    RESULTS: The predictive models achieved a mean R2 of 0.998 in both states with accuracy above 90% for all time points examined. Using these models, we tracked the importance of SDOH data features over time to uncover the specific racial demographic characteristics strongly associated with COVID-19 incidence in Tennessee and Georgia counties. Our results point to dynamic racial trends in both states over time and varying, localized patterns of risk among counties within the same state. For example, we find that African American and Asian racial demographics present comparable, and contrasting, patterns of risk depending on locality.
    CONCLUSIONS: The dichotomy of demographic trends presented here emphasizes the importance of understanding the unique factors that influence COVID-19 incidence. Identifying these specific risk factors tied to COVID-19 case growth can help stakeholders target regional interventions to mitigate the burden of future outbreaks.
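    A minimal sketch of the evaluation described in METHODS: fit a regressor on county-level features and compare predicted case counts to actual counts with R2. The synthetic feature matrix and the choice of a random forest are illustrative assumptions, not the authors' exact pipeline.

        # Fit a regressor on county-level SDOH features, score with R^2, and
        # inspect feature importances (synthetic data; illustrative model choice).
        import numpy as np
        from sklearn.ensemble import RandomForestRegressor
        from sklearn.metrics import r2_score
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(0)
        X = rng.random((159, 12))      # e.g., 159 counties x 12 SDOH features
        y = 100 * X[:, 0] + 20 * X[:, 3] + rng.normal(0, 1, 159)  # synthetic case growth

        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
        model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)
        print("R^2:", r2_score(y_te, model.predict(X_te)))
        print("top importances:", model.feature_importances_[:4])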

  • Article Type: Journal Article
    In machine learning, it is well established that classification performance increases when bootstrap aggregation (bagging) is applied. However, bagging deep neural networks takes tremendous amounts of computational resources and training time. The research question we aimed to answer is whether we could achieve higher task performance scores and accelerate training by dividing a problem into sub-problems.
    The data used in this study consist of free text from electronic cancer pathology reports. We applied bagging and partitioned data training using Multi-Task Convolutional Neural Network (MT-CNN) and Multi-Task Hierarchical Convolutional Attention Network (MT-HCAN) classifiers. We split a big problem into 20 sub-problems, resampled the training cases 2,000 times, and trained a deep learning model for each bootstrap sample and each sub-problem, thus generating up to 40,000 models. We trained the many models concurrently in a high-performance computing environment at Oak Ridge National Laboratory (ORNL).
    We demonstrated that aggregation of the models improves task performance compared with the single-model approach, consistent with other studies, and that the two proposed partitioned bagging methods achieved higher classification accuracy scores on four tasks. Notably, the improvements were significant for the extraction of cancer histology data, a task with more than 500 class labels; these results show that data partitioning may alleviate the complexity of the task. In contrast, the methods did not achieve superior scores for the site and subsite classification tasks. Intrinsically, since data partitioning was based on the primary cancer site, accuracy depended on how the partitions were determined, which needs further investigation and improvement.
    The results of this study demonstrate that (1) the data partitioning and bagging strategy achieved higher performance scores, and (2) training was accelerated by leveraging the high-performance Summit supercomputer at ORNL.
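    A hedged sketch of the partitioned bagging scheme: split the problem into sub-problems, draw bootstrap samples within each partition, train one model per (partition, bootstrap) pair, and aggregate predictions by majority vote. The logistic-regression base learner and the partition_ids input stand in for the MT-CNN/MT-HCAN classifiers and the primary-site partitioning; both are assumptions for illustration.

        # Partitioned bagging: one model per (partition, bootstrap sample) pair,
        # aggregated by majority vote (stand-in base learner; illustrative only).
        import numpy as np
        from sklearn.linear_model import LogisticRegression

        def train_partitioned_bagging(X, y, partition_ids, n_boot, seed=0):
            rng = np.random.default_rng(seed)
            models = {}
            for p in np.unique(partition_ids):   # e.g., partition by primary cancer site
                Xp, yp = X[partition_ids == p], y[partition_ids == p]
                models[p] = []
                for _ in range(n_boot):
                    idx = rng.integers(0, len(Xp), len(Xp))   # bootstrap resample
                    models[p].append(LogisticRegression(max_iter=1000).fit(Xp[idx], yp[idx]))
            return models

        def predict_majority(models_p, X):
            votes = np.stack([m.predict(X) for m in models_p]).astype(int)
            return np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)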

  • Article Type: Journal Article
    Following the development of the internet as an essential tool for communication at home and at work, the concept of online procrastination was introduced to the literature. The present study examined the relationships between online procrastination and two well-established forms of procrastination, namely decisional and general procrastination, as well as the moderating effect of negative affect on these relationships. The sample consisted of 236 computer professionals from Israel who completed self-report questionnaires on procrastination and negative feelings. To examine the relationships between our variables, we used multiple linear regression and moderation analyses. The findings indicated that higher levels of general and decisional procrastination were associated with higher levels of online procrastination. Higher levels of negative affect were also associated with online procrastination. Moreover, negative affect moderated the effect of general and decisional procrastination on online procrastination, such that the effect was stronger for participants with higher levels of negative affect. These findings suggest that both a personality-based tendency to procrastinate and the tendency to delay decision making may affect online behavior, and that negative affect strengthens these tendencies. Future studies will need to explore online procrastination further and examine the personality and situational variables that contribute to it.
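    As an illustration of the statistical approach, the sketch below runs a moderation analysis as an ordinary-least-squares regression with an interaction term; a significant interaction coefficient indicates moderation. The data are synthetic and the variable names (general_proc, neg_affect, online_proc) are hypothetical, not the study's scales.

        # Moderation via an OLS interaction term (synthetic data, illustrative names).
        import numpy as np
        import pandas as pd
        import statsmodels.formula.api as smf

        rng = np.random.default_rng(1)
        n = 236
        df = pd.DataFrame({"general_proc": rng.normal(0, 1, n),
                           "neg_affect": rng.normal(0, 1, n)})
        df["online_proc"] = (0.5 * df.general_proc + 0.3 * df.neg_affect
                             + 0.4 * df.general_proc * df.neg_affect
                             + rng.normal(0, 1, n))

        # "a * b" expands to both main effects plus the a:b interaction.
        fit = smf.ols("online_proc ~ general_proc * neg_affect", data=df).fit()
        print(fit.summary().tables[1])   # the interaction row tests moderation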

  • Article Type: Editorial
    No abstract available.

  • Article Type: Journal Article
    OBJECTIVE: Population-based routine service screening has gained popularity following an era of randomized controlled trials. The evaluation of these service screening programs is subject to study design, data availability, and the precise data analysis for adjusting bias. We developed a computer-aided system that allows the evaluation of population-based service screening to unify these aspects and facilitate and guide the program assessor to efficiently perform an evaluation.
    METHODS: The system supports two experimental designs, the posttest-only non-equivalent design and the one-group pretest-posttest design, and demonstrates the type of data required at both the population and individual levels. Three major analyses were developed: a cumulative mortality analysis, a survival analysis with lead-time adjustment, and a self-selection bias adjustment. We used SAS AF software to develop a graphical interface system with a pull-down menu style.
    RESULTS: We demonstrate the application of this system with data obtained from a Swedish population-based service screening program and a population-based randomized controlled trial for the screening of breast, colorectal, and prostate cancer, and from one service screening program for cervical cancer with Pap smears. The system provided automated descriptive results based on the various sources of available data and cumulative mortality curves corresponding to the study designs. The comparison of cumulative survival between clinically detected and screen-detected cases without a lead-time adjustment is also demonstrated. The intention-to-treat and noncompliance analyses with self-selection bias adjustments are also shown to assess the effectiveness of the population-based service screening program. Model validation consisted of a comparison between our self-selection-bias-adjusted estimates and the empirical results on effectiveness reported in the literature.
    CONCLUSIONS: We demonstrate a computer-aided system allowing the evaluation of population-based service screening programs with an adjustment for self-selection and lead-time bias. This is achieved by providing a tutorial guide from the study design to the data analysis, with bias adjustment.
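    A minimal sketch of the cumulative mortality analysis at the core of such an evaluation: cumulative mortality in an invited (screened) group versus an uninvited group, summarized as a rate ratio over follow-up years. The counts are synthetic, and the system's lead-time and self-selection adjustments are not reproduced here.

        # Cumulative mortality rate ratio, invited vs. uninvited (synthetic counts).
        import numpy as np

        deaths_invited = np.array([2, 5, 9, 14, 20])    # cumulative deaths, years 1..5
        deaths_control = np.array([3, 8, 15, 24, 35])
        n_invited, n_control = 20_000, 20_000

        cum_mort_invited = deaths_invited / n_invited
        cum_mort_control = deaths_control / n_control
        rate_ratio = cum_mort_invited / cum_mort_control
        for year, rr in enumerate(rate_ratio, start=1):
            print(f"year {year}: cumulative mortality RR = {rr:.2f}")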

  • DOI:
    Article Type: Journal Article
    The problem of sharing medical information among different centres has been tackled by many projects. Several of them target the specific problem of sharing DICOM images and structured reports (DICOM-SR), such as the TRENCADIS project. In this paper we propose sharing and organizing DICOM data and DICOM-SR metadata by benefiting from the existing deployed Grid infrastructures compliant with gLite, such as EGEE or the Spanish NGI. These infrastructures contribute a large amount of storage resources for creating knowledge databases and also provide metadata storage resources (such as AMGA) to semantically organize reports in a tree structure. First, we present the extension of the TRENCADIS architecture to use gLite components (LFC, AMGA, SE) for the sake of increasing interoperability. Using the metadata from DICOM-SR, and maintaining its tree structure, enables federating different but compatible diagnostic structures and simplifies the definition of complex queries. This article describes how to do this in AMGA and shows an approach to efficiently code radiology reports to enable the multi-centre federation of data resources.
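    To make the tree-structured organisation concrete, here is a plain-Python stand-in, not the AMGA client API: DICOM-SR metadata attached to nodes of a hierarchy and retrieved by an attribute query. The node names and the birads field are hypothetical.

        # Tree-structured report metadata with an attribute query (illustrative
        # model of the AMGA-style organisation, not the real client API).
        from dataclasses import dataclass, field

        @dataclass
        class Node:
            name: str
            attributes: dict = field(default_factory=dict)   # DICOM-SR metadata fields
            children: list = field(default_factory=list)

        def find(node, predicate, path=""):
            """Yield paths of nodes whose metadata satisfies the query."""
            path = f"{path}/{node.name}"
            if predicate(node.attributes):
                yield path
            for child in node.children:
                yield from find(child, predicate, path)

        # Hypothetical tree: all breast reports with BI-RADS category 4.
        root = Node("radiology", children=[
            Node("breast", children=[Node("report_001", {"birads": 4}),
                                     Node("report_002", {"birads": 2})])])
        print(list(find(root, lambda a: a.get("birads") == 4)))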

  • DOI:
    Article Type: Journal Article
    The interplay of a mobile population can affect the quality of patient outcomes and the economics of health care delivery significantly. Helping patients with limited English proficiency understand the basics of self-care for optimal health will continue to be a challenge in the delivery of the highest quality nursing care. Becoming familiar with high-quality, peer-reviewed, and reliable health education materials and Web sites is the responsibility of every health care provider so that patients receive culturally and linguistically appropriate resources to support healthy lifestyles and choices.

  • Article Type: Comparative Study
    Full-scale Monte Carlo simulations of the cyclotron room of the Buddhist Tzu Chi General Hospital were carried out to improve the original, inadequate maze design. Variance reduction techniques were indispensable in this study to make the simulations of the various shielding-modification configurations tractable. The TORT/MCNP manual coupling approach based on the Consistent Adjoint Driven Importance Sampling (CADIS) methodology was used throughout this study. CADIS applies source and transport biasing in a consistent manner. With this method, the computational efficiency increased by more than two orders of magnitude and the statistical convergence also improved compared with the unbiased Monte Carlo run. This paper describes the shielding problem encountered, the procedure for coupling the TORT and MCNP codes to accelerate the calculations, and the calculation results for the original and improved shielding designs. To verify the calculation results and seek additional acceleration, sensitivity studies on the space-dependent and energy-dependent parameters were also conducted.
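    A one-dimensional caricature of the source-biasing idea underlying CADIS, under stated assumptions: sample from a biased source distribution that favours the important region and carry a statistical weight equal to the true pdf divided by the biased pdf, so the tally mean stays unbiased while its variance drops. This is illustrative only, not TORT/MCNP.

        # Source biasing with statistical weights: same mean, lower variance
        # for a response concentrated near x = 1 (toy 1-D problem).
        import numpy as np

        rng = np.random.default_rng(2)
        n = 100_000

        def response(x):                     # detector response, peaked at x = 1
            return np.exp(-10 * (1 - x))

        # True source: uniform on [0, 1). Biased source: Beta(4, 1), pdf = 4x^3.
        x = rng.beta(4, 1, n)
        weight = 1.0 / (4 * x**3)            # uniform pdf (=1) / biased pdf

        print("biased estimate:", np.mean(weight * response(x)))
        print("analog estimate:", np.mean(response(rng.random(n))))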

  • Article Type: Journal Article
    The rapid advances in high-throughput biotechnologies such as DNA microarrays and mass spectrometry have generated vast amounts of data, ranging from gene expression to proteomics data. The size and complexity involved in analyzing such data demand a significant amount of computing power. High-performance computing (HPC) is an attractive and increasingly affordable approach to help meet this challenge. There is a spectrum of techniques that can be used to achieve computational speedup, with varying degrees of impact in terms of how drastic a change is required to allow the software to run on an HPC platform. This paper describes a high-productivity/low-maintenance (HP/LM) approach to HPC that is based on establishing a collaborative relationship between the bioinformaticist and the HPC expert that respects the former's codes and minimizes the latter's efforts. The goal of this approach is to make it easy for bioinformatics researchers to continue to make iterative refinements to their programs while still being able to take advantage of HPC. The paper describes our experience applying these HP/LM techniques in four bioinformatics case studies: (1) genome-wide sequence comparison using Blast, (2) identification of biomarkers based on statistical analysis of large mass spectrometry data sets, (3) complex genetic analysis involving ordinal phenotypes, and (4) large-scale assessment of the effect of possible errors in analyzing microarray data. The case studies illustrate how the HP/LM approach can be applied to a range of representative bioinformatics applications and how it can lead to significant speedup of computationally intensive applications while requiring only modest modifications to the programs themselves.
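    A sketch of the low-maintenance parallelisation pattern the paper advocates: leave the researcher's serial per-item routine untouched and farm the inputs across workers. score_sequence is a hypothetical stand-in for an existing analysis step such as a per-sequence Blast invocation.

        # Task farming with multiprocessing: the serial routine is unchanged;
        # only the driver loop is parallelised (hypothetical stand-in routine).
        from multiprocessing import Pool

        def score_sequence(seq):
            # stand-in for the untouched serial analysis code (e.g., a Blast call)
            return seq, sum(seq.count(b) for b in "GC") / max(len(seq), 1)

        if __name__ == "__main__":
            sequences = ["ACGTGC", "GGGCCC", "ATATAT"] * 1000
            with Pool() as pool:             # one worker per CPU core by default
                results = pool.map(score_sequence, sequences, chunksize=100)
            print(results[:3])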