Programming Languages

  • Article Type: Systematic Review
    Code clones, referring to code fragments that are either similar or identical and are copied and pasted within software systems, have negative effects on both software quality and maintenance. The objective of this work is to systematically review and analyze recurrent neural network techniques used to detect code clones, to shed light on current techniques and offer valuable knowledge to the research community. After applying the review protocol, we identified 20 primary studies in this field from a total of 2099 studies. An in-depth investigation of these studies reveals that nine recurrent neural network techniques have been utilized for code clone detection, with a notable preference for LSTM techniques. These techniques have demonstrated their efficacy in detecting both syntactic and semantic clones, often utilizing abstract syntax trees for source code representation. Moreover, we observed that most studies applied evaluation metrics such as F-score, precision, and recall. Additionally, these studies frequently utilized datasets extracted from open-source systems coded in the Java and C programming languages. Notably, the Graph-LSTM technique exhibited superior performance. PyTorch and TensorFlow emerged as popular tools for implementing RNN models. To advance code clone detection research, further exploration of techniques like parallel LSTM, sentence-level LSTM, and Tree-Structured GRU is imperative. In addition, more research is needed to investigate the capabilities of recurrent neural network techniques for identifying semantic clones across different programming languages and in binary code. The development of standardized benchmarks for languages like Python, Scratch, and C#, along with cross-language comparisons, is essential. Therefore, the utilization of recurrent neural network techniques for clone identification is a promising area that demands further research.
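    To make the kind of approach the review covers concrete, below is a minimal sketch of a siamese LSTM clone detector in PyTorch (one of the tools the review found popular). The vocabulary size, dimensions, and similarity threshold are illustrative assumptions, not parameters taken from any of the reviewed studies.

```python
# Minimal, illustrative siamese LSTM clone detector in PyTorch.
# All hyper-parameters below are assumptions for demonstration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CloneEncoder(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer-encoded source code tokens
        embedded = self.embed(token_ids)
        _, (hidden, _) = self.lstm(embedded)
        return hidden[-1]                 # final hidden state as the code vector

def clone_score(encoder, frag_a, frag_b):
    """Cosine similarity between two encoded code fragments."""
    return F.cosine_similarity(encoder(frag_a), encoder(frag_b))

encoder = CloneEncoder()
a = torch.randint(0, 5000, (1, 50))       # stand-ins for two tokenized fragments
b = torch.randint(0, 5000, (1, 50))
print(clone_score(encoder, a, b))         # above a chosen threshold => clone pair
```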

  • Article Type: Journal Article
    Teaching introductory programming courses is not an easy task. Instructors of introductory programming courses face many challenges related to the nature of programming, the students' characteristics, and the traditional teaching methods that they are using. Blended learning seems to be a promising approach to address these challenges. Many studies have concluded that blended learning can be more effective than traditional teaching and can improve students' learning experience. However, the current state of knowledge and practice in applying blended learning to introductory programming courses is limited. In an attempt to begin remedying this gap, this review synthesizes the different blended learning approaches that have been applied in introductory programming courses. It classifies them into five models and then discusses the impact of each of these models on the learning experience of novice programmers. It concludes by providing some recommendations for instructors who want to blend their courses, as well as some implications for future research.

  • Article Type: Journal Article
    With progress on both the theoretical and the computational fronts, the use of spline modelling has become an established tool in statistical regression analysis. An important issue in spline modelling is the availability of user-friendly, well-documented software packages. Following the idea of the STRengthening Analytical Thinking for Observational Studies initiative to provide users with guidance documents on the application of statistical methods in observational research, the aim of this article is to provide an overview of the most widely used spline-based techniques and their implementation in R.
    In this work, we focus on the R Language for Statistical Computing, which has become hugely popular statistics software. We identified a set of packages that include functions for spline modelling within a regression framework. Using simulated and real data, we provide an introduction to spline modelling and an overview of the most popular spline functions.
    We present a series of simple scenarios of univariate data, where different basis functions are used to identify the correct functional form of an independent variable. Even with simple data, using routines from different packages can lead to different results.
    This work illustrates the challenges that an analyst faces when working with data. Most differences can be attributed to the choice of hyper-parameters rather than the basis used. In fact, an experienced user will know how to obtain a reasonable outcome, regardless of the type of spline used. However, many analysts do not have sufficient knowledge to use these powerful tools adequately and will need more guidance.
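    The hyper-parameter sensitivity the authors describe is easy to reproduce in miniature. The sketch below uses Python/SciPy as a stand-in for the R routines the article actually reviews; the data and smoothing values are invented for demonstration.

```python
# Same cubic spline basis, two choices of the smoothing hyper-parameter:
# the resulting fits can differ noticeably, echoing the paper's finding that
# hyper-parameters, more than the basis itself, drive differences between fits.
import numpy as np
from scipy.interpolate import UnivariateSpline

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)   # noisy univariate data

wiggly = UnivariateSpline(x, y, k=3, s=1)    # small s: close to interpolation
smooth = UnivariateSpline(x, y, k=3, s=20)   # large s: much stiffer fit

grid = np.linspace(0, 10, 5)
print(wiggly(grid))
print(smooth(grid))   # compare: identical basis, different answers
```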

  • Article Type: Journal Article
    Communicating radiological reports to peers has pedagogical value. Students may be uneasy with the process due to a lack of communication and peer review skills or to their failure to see value in the process. We describe a communication exercise with peer review in an undergraduate veterinary radiology course. The computer code used to manage the course and deliver images online is reported, and we provide links to the executable files. We tested whether undergraduate peer review of radiological reports has validity and describe student impressions of the learning process. Peer review scores for student-generated radiological reports were compared to scores obtained in the summative multiple-choice question (MCQ) examination for the course. Student satisfaction was measured using a bespoke questionnaire. There was a weak positive correlation (Pearson correlation coefficient = 0.32, p < 0.01) between the peer review scores students received and the scores they obtained in the MCQ examination. The difference in peer review scores received by students grouped according to their level of course performance (high vs. low) was statistically significant (p < 0.05). No correlation was found between peer review scores awarded by the students and the scores they obtained in the MCQ examination (Pearson correlation coefficient = 0.17, p = 0.14). In conclusion, we have created a realistic radiology imaging exercise with readily available software. The peer review scores are valid in that, to a limited degree, they reflect students' future performance in an examination. Students valued the process of learning to communicate radiological findings but did not fully appreciate the value of peer review.
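    For readers who want to reproduce the form of these statistics, the sketch below computes a Pearson correlation and a grouped comparison with SciPy. The score arrays are placeholders, not the study's data, and the t-test is an assumption for illustration since the abstract does not name the grouped test used.

```python
# Pearson correlation with p-value, plus a high-vs-low group comparison,
# on hypothetical placeholder scores (not the study's actual data).
import numpy as np
from scipy import stats

peer_scores = np.array([72, 65, 80, 58, 90, 77, 63, 85])   # hypothetical
mcq_scores = np.array([68, 60, 75, 55, 82, 70, 66, 79])    # hypothetical

r, p = stats.pearsonr(peer_scores, mcq_scores)
print(f"Pearson r = {r:.2f}, p = {p:.3f}")

high = mcq_scores >= np.median(mcq_scores)                 # split by performance
t, p_group = stats.ttest_ind(peer_scores[high], peer_scores[~high])
print(f"group difference: t = {t:.2f}, p = {p_group:.3f}")
```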

  • Article Type: Journal Article
    BACKGROUND: In order to further advance research and development on the Clinical Data Interchange Standards Consortium (CDISC) Operational Data Model (ODM) standard, the existing research must be well understood. This paper presents a methodological review of the ODM literature. Specifically, it develops a classification schema to categorize the ODM literature according to how the standard has been applied within the clinical research data lifecycle. This paper suggests areas for future research and development that address ODM's limitations and capitalize on its strengths to support new trends in clinical research informatics.
    METHODS: A systematic scan of the following databases was performed: (1) ABI/Inform, (2) ACM Digital, (3) AIS eLibrary, (4) Europe Central PubMed, (5) Google Scholar, (6) IEEE Xplore, (7) PubMed, and (8) ScienceDirect. A Web of Science citation analysis was also performed. The search term used on all databases was "CDISC ODM." The two primary inclusion criteria were: (1) the research must examine the use of ODM as an information system solution component, or (2) the research must critically evaluate ODM against a stated solution usage scenario. Out of 2686 articles identified, 266 were included in a title-level review, resulting in 183 articles. An abstract review followed, resulting in 121 remaining articles; after a full-text scan, 69 articles met the inclusion criteria.
    RESULTS: As the demand for interoperability has increased, ODM has shown remarkable flexibility and has been extended to cover a broad range of data and metadata requirements that reach well beyond ODM's original use cases. This flexibility has yielded research literature that covers a diverse array of topic areas. A classification schema reflecting the use of ODM within the clinical research data lifecycle was created to provide a categorized and consolidated view of the ODM literature. The elements of the framework include: (1) EDC (Electronic Data Capture) and EHR (Electronic Health Record) infrastructure; (2) planning; (3) data collection; (4) data tabulations and analysis; and (5) study archival. The analysis reviews the strengths and limitations of ODM as a solution component within each section of the classification schema. This paper also identifies opportunities for future ODM research and development, including improved mechanisms for semantic alignment with external terminologies, better representation of the CDISC standards used end-to-end across the clinical research data lifecycle, improved support for real-time data exchange, the use of EHRs for research, and the inclusion of a complete study design.
    CONCLUSIONS: ODM is being used in ways not originally anticipated, and covers a diverse array of use cases across the clinical research data lifecycle. ODM has been used as much as a study metadata standard as for data exchange. A significant portion of the literature addresses integrating EHR and clinical research data. The simplicity and readability of ODM have likely contributed to its success and broad implementation as a data and metadata standard. Keeping the core ODM model focused on the most fundamental use cases, while using extensions to handle edge cases, has kept the standard easy for developers to learn and use.
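    The readability the conclusions credit is visible in the XML itself. Below is a simplified, schema-incomplete sketch of an ODM-style document parsed with Python's standard library; the element names (Study, GlobalVariables, ClinicalData, SubjectData) follow the general ODM pattern, but the snippet omits the namespace and required attributes, so it is an illustration rather than a valid CDISC ODM file.

```python
# Parse a simplified ODM-style XML document with the standard library.
# Not schema-valid ODM: namespace and most required attributes omitted.
import xml.etree.ElementTree as ET

odm_xml = """
<ODM FileOID="example.odm" ODMVersion="1.3">
  <Study OID="ST.001">
    <GlobalVariables>
      <StudyName>Example Trial</StudyName>
    </GlobalVariables>
  </Study>
  <ClinicalData StudyOID="ST.001" MetaDataVersionOID="MDV.1">
    <SubjectData SubjectKey="SUBJ-001"/>
    <SubjectData SubjectKey="SUBJ-002"/>
  </ClinicalData>
</ODM>
"""

root = ET.fromstring(odm_xml)
study_name = root.find("./Study/GlobalVariables/StudyName").text
subjects = [s.get("SubjectKey") for s in root.iter("SubjectData")]
print(study_name, subjects)   # Example Trial ['SUBJ-001', 'SUBJ-002']
```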

  • Article Type: Journal Article
    BACKGROUND: The interoperability of the electrocardiogram (ECG) between heterogeneous systems has been facilitated not by one, but by a number of predefined open storage formats. To improve the techniques currently used, it is important to define the similarities and differences between these ECG storage formats.
    METHODS: This paper presents a review of 9 formats used to store the ECG. Three of the predominant formats, namely SCP-ECG, DICOM-ECG, and HL7 aECG, are reviewed in detail, along with a SWOT analysis. The remaining formats are examined to a lesser extent, as they are less predominant in the literature.
    RESULTS: This study suggests that a plethora of open ECG formats, all aiming to promote interoperability, has the opposite effect of adding more complexity. This paper discusses whether a format supporting a variety of diagnostic modalities is more advantageous than a format that supports only the ECG. A general-purpose format such as DICOM solves more interoperability issues; however, no general-purpose format currently exists that fulfils the requirements of all users. As a result, the healthcare industry has been bombarded with custom storage formats, i.e., a format for storing the resting ECG, a format for storing the ambulatory ECG, a format for storing the ECG in clinical trials, a format for storing ECG data on mobile devices, etc. This study then examines which implementation method is better suited to encoding ECG data, i.e., binary or XML. Binary encoding has been used in the past to store the ECG; however, unlike binary, XML files are human-readable, searchable, and provide a better form of semantics. Based on the analysis within this work, it is speculated that XML may overtake binary as the preferred implementation method for encoding ECG data, since it has already made a huge impact in the healthcare industry.
    CONCLUSIONS: It can be concluded that there is a wide range of vastly different techniques used to store the ECG. Although the specifications of these formats are openly available, none has been internationally adopted for use with all ECG machines. Therefore, there remains a lack of global interoperability of ECG information.
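    The binary-versus-XML trade-off discussed in the results can be shown in miniature: the same samples packed as raw binary and serialized as XML. The element names below are invented for illustration and do not follow any of the reviewed formats.

```python
# Encode the same hypothetical ECG samples two ways and compare sizes.
import struct
import xml.etree.ElementTree as ET

samples = [512, 518, 530, 525, 510]          # hypothetical raw ECG samples

# Binary: compact, but opaque without out-of-band documentation
binary = struct.pack(f"<{len(samples)}h", *samples)
print(len(binary), "bytes as binary")        # 10 bytes (5 little-endian int16)

# XML: larger, but human-readable, searchable, and self-describing
ecg = ET.Element("ecg", lead="II", units="uV")
for s in samples:
    ET.SubElement(ecg, "sample").text = str(s)
xml_bytes = ET.tostring(ecg)
print(len(xml_bytes), "bytes as XML")        # several times larger
```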

  • Article Type: Journal Article
    One important aim within systems biology is to integrate disparate pieces of information, leading to the discovery of higher-level knowledge about important functionality within living organisms. This makes standards for the representation of data, and technology for the exchange and integration of data, key points for development within the area. In this article, we focus on recent developments within the field. We compare the recent updates to the three standard representations for data exchange: SBML, PSI MI, and BioPAX. In addition, we give an overview of available tools for these three standards and discuss how these developments support possibilities for data exchange and integration.
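    For readers unfamiliar with these formats, the sketch below peeks at the kind of model data SBML carries, using a simplified SBML-like snippet and the Python standard library. The namespace and most required SBML attributes are omitted for brevity; real files should be handled with a dedicated library such as python-libsbml.

```python
# Read species out of a simplified SBML-like snippet (not schema-valid SBML).
import xml.etree.ElementTree as ET

sbml = """
<sbml level="3" version="2">
  <model id="toy_pathway">
    <listOfSpecies>
      <species id="glucose" compartment="cytosol"/>
      <species id="atp" compartment="cytosol"/>
    </listOfSpecies>
  </model>
</sbml>
"""

root = ET.fromstring(sbml)
model = root.find("model")
species = [s.get("id") for s in model.find("listOfSpecies")]
print(model.get("id"), species)   # toy_pathway ['glucose', 'atp']
```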

  • Article Type: Journal Article
    InterMed is a collaboration among research groups from Stanford, Harvard, and Columbia Universities. The primary goal of InterMed has been to develop a sharable language that could serve as a standard for modeling computer-interpretable guidelines (CIGs). This language, called GuideLine Interchange Format (GLIF), has been developed in a collaborative manner and in an open process that has welcomed input from the larger community. The goals and experiences of the InterMed project and lessons that the authors have learned may contribute to the work of other researchers who are developing medical knowledge-based tools. The lessons described include (1) a work process for multi-institutional research and development that considers different viewpoints, (2) an evolutionary lifecycle process for developing medical knowledge representation formats, (3) the role of cognitive methodology to evaluate and assist in the evolutionary development process, (4) development of an architecture and (5) design principles for sharable medical knowledge representation formats, and (6) a process for standardization of a CIG modeling language.

  • Article Type: Comparative Study
    Representation of clinical practice guidelines in a computer-interpretable format is a critical issue for guideline development, implementation, and evaluation. We studied 11 types of guideline representation models that can be used to encode guidelines in computer-interpretable formats. We have consistently found, in all reviewed models, that primitives for the representation of actions and decisions are necessary components of a guideline representation model. Patient states and execution states are important concepts that closely relate to each other. Scheduling constraints on representation primitives can be modeled as sequences, concurrences, alternatives, and loops in a guideline's application process. Nesting of guidelines provides multiple views of a guideline at different granularities. Integration of guidelines with electronic medical records can be facilitated by the introduction of a formal model for patient data. Data collection, decision, patient state, and intervention constitute the four basic types of primitives in a guideline's logic flow. Decisions clarify our understanding of a patient's clinical state, while interventions lead to the change from one patient state to another.
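    As a concrete reading of the four primitives and the nesting described above, the sketch below encodes them as plain Python data structures. The class and field names are invented for illustration; none of the reviewed models defines this API.

```python
# The four guideline primitives and nested scheduling, as toy data structures.
from dataclasses import dataclass, field
from enum import Enum, auto

class PrimitiveKind(Enum):
    DATA_COLLECTION = auto()   # gather patient data
    DECISION = auto()          # clarify the patient's clinical state
    PATIENT_STATE = auto()     # a point in the patient's clinical course
    INTERVENTION = auto()      # moves the patient from one state to another

@dataclass
class Step:
    kind: PrimitiveKind
    name: str
    children: list["Step"] = field(default_factory=list)  # nesting gives views
                                                          # at different granularities

# A sequence: data collection -> decision among alternative interventions
# -> a new patient state.
guideline = [
    Step(PrimitiveKind.DATA_COLLECTION, "measure blood pressure"),
    Step(PrimitiveKind.DECISION, "hypertensive?", [
        Step(PrimitiveKind.INTERVENTION, "start antihypertensive"),
        Step(PrimitiveKind.INTERVENTION, "lifestyle advice only"),
    ]),
    Step(PrimitiveKind.PATIENT_STATE, "blood pressure controlled"),
]
for step in guideline:
    print(step.kind.name, "->", step.name)
```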

  • Article Type: Journal Article
    In this paper, the use for research purposes of an existing data management system, ADAMO (A Database Management system for Oncology), is described. The aim of this paper is to discuss the experiences obtained with this 'home-made' system and to describe some of the extensions that were recently made. Reasons are presented why the system is still extensively used by clinicians, although a number of commercial database management systems are now available on personal computers. These database systems are more flexible than the system described here. It is concluded that it is precisely this flexibility of current systems that prevents optimal use by busy clinicians. Clinicians need a research system that contains just the functions that they need. These functions have to be available via simple commands, so that no additional programming (even at the high level of a query language) is necessary.