Interobserver agreement

  • Article type: Systematic Review
    OBJECTIVE: In this systematic review, we evaluate the effect of radiographs and 2D and 3D imaging techniques on the interobserver agreement of six commonly used classification systems for tibial plateau fractures.
    METHODS: In accordance with PRISMA guidelines, PubMed, Cochrane, Embase and Web of Science were searched for studies regarding the effect of 2D and 3D imaging techniques on the interobserver agreement of tibial plateau classification systems. Studies validating new classification systems, not providing their own data, or providing information only on the interobserver agreement for radiographs were excluded. Studies were scored based on the ROBINS-I risk of bias tool.
    RESULTS: Our review analysed 14 studies on different classification systems used for tibial plateau fractures in clinical practice, with the Schatzker classification being the most commonly used classification system. The results showed that the addition of 2D CT led to a significant improvement in interobserver agreement in one study. However, other included studies showed varying levels of interobserver agreement, ranging from fair to substantial according to the interpretation by Landis and Koch. The addition of 3D CT resulted in a significant deterioration in one study for the Schatzker classification. Similar to the addition of 2D CT, the interobserver agreement for the Schatzker classification with the addition of 3D CT was heterogeneous, ranging from fair to almost perfect according to the interpretation by Landis and Koch.
    CONCLUSIONS: The use of 2D CT can be recommended for classifying tibial plateau fractures with the Schatzker classification, AO/OTA classification and Hohl classification. The value of 3D CT on the interobserver agreement of commonly used classification systems remains uncertain and unproven. Therefore, we do not recommend the use of 3D CT for the classification of tibial plateau fractures. Overall, the advancement of imaging techniques is not in line with the advancement in interobserver agreement on fracture classification.
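    The Landis and Koch interpretation cited in the results maps a kappa value to a verbal label. A minimal Python sketch of that mapping follows, assuming the conventional cut-offs of their 1977 scale; the function name `landis_koch_label` is hypothetical and does not come from the review.

```python
def landis_koch_label(kappa: float) -> str:
    """Return the Landis and Koch (1977) label for a kappa value.

    Assumed bands: <0 poor, 0.00-0.20 slight, 0.21-0.40 fair,
    0.41-0.60 moderate, 0.61-0.80 substantial, 0.81-1.00 almost perfect.
    """
    if not -1.0 <= kappa <= 1.0:
        raise ValueError("kappa must lie in [-1, 1]")
    if kappa < 0.0:
        return "poor"
    # Upper bounds of each band, checked in ascending order.
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"  # not reached; kept for completeness


if __name__ == "__main__":
    # Illustrative values only, not taken from the included studies.
    for k in (0.35, 0.55, 0.76, 0.85):
        print(k, landis_koch_label(k))
```

    Values falling between two printed bands (for example 0.205) are assigned to the upper band here; that is a design choice of this sketch, since the original scale is stated to two decimals.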

  • Article type: Journal Article
    OBJECTIVE: To present the latest evidence related to interobserver agreement and accuracy; evaluate the strengths, weaknesses, and implications of use; and outline opportunities for improvement and future development of the Prostate Imaging Reporting and Data System version 2.1 (PI-RADS v2.1) for detection of prostate cancer (PCa) on multiparametric magnetic resonance imaging (mpMRI).
    RESULTS: Our review of currently available evidence suggests that recent improvements to the PI-RADS system with PI-RADS v2.1 slightly improved interobserver agreement, with generally high sensitivity and moderate specificity for the detection of clinically significant PCa. Recent evidence additionally demonstrates substantial improvement in diagnostic specificity with PI-RADS v2.1 compared with PI-RADS v2. However, results of studies examining the comparative performance of v2.1 are limited by small sample sizes and retrospective cohorts, potentially introducing selection bias. Some studies suggest a substantial improvement between v2.1 and v2, while others report no statistically significant difference. Additionally, in PI-RADS v2.1, the interpretation and reporting of certain findings remain subjective, particularly for category 2 lesions, and reader experience continues to vary significantly. These factors further contribute to a remaining degree of interobserver variability and findings of improved performance among more experienced readers. PI-RADS v2.1 appears to show at least minimal improvement in interobserver agreement, diagnostic performance, and both sensitivity and specificity, with greater improvements seen among more experienced readers. However, given the decrescent nature of these improvements and the limited power of all studies examined, the clinical impact of this progress may be marginal. Despite improvements in PI-RADS v2.1, practitioner experience in interpreting mpMRI of the prostate remains the most important factor in prostate cancer detection.

  • Article type: Journal Article
    OBJECTIVE: Pathology reviews for upper urinary tract cancer (UTUC) remain scarce in the literature. Here, we report the interobserver variation between review and local pathology for key histologic characteristics of UTUC.
    METHODS: Patients who underwent definitive surgical treatments for UTUC were retrospectively reviewed for eligibility of pathology review. In the Taiwan UTUC Collaboration cohort, 212 cases were reviewed, of which 154 cases were eligible for pathology review. Agreement between original pathology and review pathology was measured by the total percentage of agreement and by simple kappa statistics. The prognostic impact was analyzed by the Cox regression model with the estimation of hazard ratios (HR) and 95% confidence intervals.
    RESULTS: There were 80 women and 74 men enrolled in this study, and the median age at treatment was 71.7 years. Agreement was moderate for surgical margin status (87.7%; κ = 0.61), tumor grade (82.5%; κ = 0.43), tumor invasiveness (76.6%; κ = 0.45), lymphovascular invasion (70.8%; κ = 0.42) and T stage (67.5%; κ = 0.52). Interobserver agreement for perineural invasion and variant histology identification was slight. Kaplan-Meier analysis of disease-free survival showed comparable results for local and review pathology in both localized (Tis, Ta, T1-2) and advanced (T3-4) T stages.
    CONCLUSIONS: Pathology review of UTUC had minimal impact on clinical practice based on currently available treatment guidelines. However, significant interobserver variation was observed in key adverse histopathological characteristics.
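    The two agreement measures named in the methods, total percentage of agreement and simple (Cohen's) kappa, can be sketched as follows. This is an illustrative Python sketch, not the study's analysis code; the function names and the toy labels are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def percent_agreement(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Fraction of cases on which the two raters give the same label."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: (p_o - p_e) / (1 - p_e)."""
    n = len(a)
    p_o = percent_agreement(a, b)  # observed agreement
    count_a, count_b = Counter(a), Counter(b)
    # Agreement expected by chance from each rater's marginal frequencies.
    p_e = sum(count_a[c] * count_b[c] for c in set(count_a) | set(count_b)) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both raters use a single label")
    return (p_o - p_e) / (1.0 - p_e)


if __name__ == "__main__":
    # Hypothetical margin-status calls for ten cases (not study data).
    local = ["pos", "pos", "neg", "neg", "neg", "pos", "neg", "neg", "neg", "neg"]
    review = ["pos", "neg", "neg", "neg", "neg", "pos", "neg", "pos", "neg", "neg"]
    print(f"agreement = {percent_agreement(local, review):.1%}")
    print(f"kappa     = {cohen_kappa(local, review):.2f}")
```

    The toy data illustrate why the two measures are reported together: raw agreement can be high while kappa stays moderate once chance agreement is subtracted.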

  • Article type: Journal Article
    BACKGROUND: Hepatocellular carcinoma is the most common primary liver malignancy. From the results of previous studies, Liver Imaging Reporting and Data System (LI-RADS) on contrast-enhanced ultrasound (CEUS) has shown satisfactory diagnostic value. However, a unified conclusion on the interobserver stability of this innovative ultrasound imaging has not been determined. The present meta-analysis examined the interobserver agreement of CEUS LI-RADS to provide some reference for subsequent related research.
    OBJECTIVE: To evaluate the interobserver agreement of LI-RADS on CEUS and analyze the sources of heterogeneity between studies.
    METHODS: Relevant papers on the subject of interobserver agreement on CEUS LI-RADS published before March 1, 2020 in China and other countries were analyzed. The studies were filtered, and the diagnostic criteria were evaluated. The selected references were analyzed using the "meta" and "metafor" packages of R software version 3.6.2.
    RESULTS: Eight studies were ultimately included in the present analysis. Meta-analysis results revealed that the summary Kappa value of included studies was 0.76 [95% confidence interval, 0.67-0.83], which shows substantial agreement. The Higgins I² statistic also confirmed the substantial heterogeneity (I² = 91.30%, 95% confidence interval, 85.3%-94.9%, P < 0.01). Meta-regression identified the variables, including the method of patient enrollment, method of consistency testing, and patient race, which explained the substantial study heterogeneity.
    CONCLUSIONS: CEUS LI-RADS demonstrated overall substantial interobserver agreement, but heterogeneous results between studies were also obvious. Further clinical investigations should consider a modified recommendation about the experimental design.
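    The Higgins I² statistic reported in the results is conventionally derived from Cochran's Q as I² = max(0, (Q − df) / Q) × 100%, with df = k − 1 for k studies. The Python sketch below assumes that standard definition and is not the authors' R code; the function name is hypothetical, and the Q value in the usage lines is hypothetical too, chosen only so that eight studies give an I² close to the reported 91.30%. The abstract itself does not report Q.

```python
def higgins_i2(q: float, k: int) -> float:
    """Higgins I-squared (in percent) from Cochran's Q and k studies."""
    if k < 2:
        raise ValueError("at least two studies are required")
    if q < 0:
        raise ValueError("Cochran's Q cannot be negative")
    df = k - 1
    if q <= df:
        # Observed dispersion does not exceed what chance predicts.
        return 0.0
    return (q - df) / q * 100.0


if __name__ == "__main__":
    # Hypothetical Q for illustration; k = 8 matches the number of included studies.
    print(f"I^2 = {higgins_i2(80.46, 8):.2f}%")
```

    Truncating at zero is part of the usual definition: when Q falls below its degrees of freedom, the statistic reports no heterogeneity beyond chance.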

  • Article type: Journal Article
    The International Extranodal Lymphoma Study Group (IELSG)-37 is a prospective randomized trial assessing the role of consolidation mediastinal radiotherapy after immunochemotherapy in patients with newly diagnosed primary mediastinal large B-cell lymphoma (PMBCL). It is a positron emission tomography (PET) response-guided study in which patients obtaining a complete metabolic response on an end-of-therapy PET-computed tomography (CT) scan evaluated by a central review are randomized to receive radiotherapy or no further treatment. The aims of this study were to measure agreement between reviewers reporting PET-CT scans for this trial and to determine the effect of training upon concordance rates. The review panel comprised 6 experienced nuclear physicians who read PET-CT scans using the 5-point Deauville scale. Interobserver agreement (IOA) was measured at 4 time points: after a blinded review of a "training set" of 20 patients with PMBCL from the previous IELSG-26 study (phase 1); after the first 10 clinical cases enrolled in the IELSG-37 (phase 2); and after 2 further groups of 50 (phase 3) and 40 clinical cases (phase 4). After feedback from the training set and the first 10 cases, a meeting was held to discuss interpretation, and a detailed set of instructions for the review procedure was agreed and acted upon. Between 2012 and 2014, the first 100 patients were reviewed. Using Deauville score 3 as the cutoff for a complete metabolic response, the overall IOA among the reviewers was good (Krippendorff α = 0.72). The binary concordance between pairs of reviewers (Cohen κ) ranged from 0.60 to 0.78. The IOA, initially moderate, improved progressively from phase 1 to 4 (Krippendorff α from 0.53 to 0.81; Cohen κ from 0.35-0.72 to 0.77-0.87). Our experience indicates that the agreement among "expert" nuclear physicians reporting PMBCL, even using standardized criteria, was only moderate when the study began. However, agreement improved using a harmonization process, which included a training exercise with discussion of points leading to disagreement and compiling practical rules to sit alongside commonly adopted interpretation criteria.

  • Article type: Journal Article
    OBJECTIVE: The objective of the study was to evaluate the intra- and interobserver agreement among obstetric experts in court regarding the retrospective review of abnormal fetal heart rate tracings and obstetrical management of patients with abnormal fetal heart rate during labor.
    METHODS: A total of 22 French obstetric experts in court reviewed 30 cases of term deliveries of singleton pregnancies diagnosed with at least 1 hour of abnormal fetal heart rate, including 10 cases with adverse neonatal outcome. The experts reviewed all cases twice within a 3-month interval, with the first review being blinded to neonatal outcome. For each case reviewed, the experts were provided with the obstetric data and copies of the complete fetal heart rate recording and the partogram. The experts were asked to classify the abnormal fetal heart rate tracing and to express whether they agreed with the obstetrical management performed. When they disagreed, the experts were asked whether they concluded that an error had been made and whether they considered the obstetrical management as the cause of cerebral palsy in children if any.
    RESULTS: Compared with blinded review, the experts were significantly more likely to agree with the obstetric management performed (P < .001) and with the mode of delivery (P < .001) when informed about the neonatal outcome and were less likely to conclude that an error had been made (P < .001) or to establish a link with potential cerebral palsy (P = .003). The experts' intraobserver agreement was mediocre for both the review of abnormal fetal heart rate tracing and the obstetrical management (kappa = 0.46-0.51 and kappa = 0.48-0.53, respectively). The interobserver agreement for the review of abnormal fetal heart rate tracing was low and was not improved by knowledge of the neonatal outcome (kappa = 0.11-0.18). The interobserver agreement for the interpretation of obstetrical management was also low (kappa = 0.08-0.19) but appeared to be improved by knowledge of the neonatal outcome (kappa = 0.15-0.32).
    CONCLUSIONS: The intra- and interobserver agreement among obstetric experts in court for the review of abnormal fetal heart rate tracing and the appropriateness of obstetrical care is poor, suggesting a lack of objectivity of obstetrical expertise as currently performed in court.