Generalizability

  • Article type: Journal Article
    To assess the external validity of randomized controlled trials (RCTs) of bariatric surgical treatment for diabetes control.
    Multisite RCTs provide the strongest evidence supporting clinical treatments and have the greatest internal validity. However, the characteristics of trial participants may not be representative of patients receiving treatment in the real world. There is a need to assess how the results of RCTs generalize to all contemporary patient populations undergoing treatment.
    All patients undergoing sleeve gastrectomy at the University of California, Los Angeles (UCLA) between January 8, 2018 and May 19, 2023 had their baseline characteristics, weight change, and diabetes control compared with those of patients enrolled in the Surgical Treatment and Medications Potentially Eradicate Diabetes Efficiently (STAMPEDE) and Diabetes Surgery Study (DSS) RCTs of bariatric surgery's effect on diabetes control. Weight loss and diabetes control were compared between UCLA patients who did and did not fit the entry criteria for these RCTs.
    Only 65 (17%) of 387 patients with diabetes fulfilled the eligibility criteria for STAMPEDE, and 29 (7.5%) fulfilled the criteria for DSS; ineligibility was due to older age, higher body mass index, or lower HbA1c. UCLA patients experienced slightly less weight loss than patients in the RCTs but had similar diabetes control. The 313 (81%) patients not eligible for entry into either RCT had long-term diabetes control similar to that of patients who were eligible for the RCTs.
    Even though only a very small proportion of patients undergoing bariatric surgery met the eligibility criteria for the 2 major RCTs, most patients in this contemporary cohort had similar outcomes. Diabetes outcomes from STAMPEDE and DSS generalize to most patients undergoing bariatric surgery for diabetes control.
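
    The eligibility figures in this abstract can be checked with simple arithmetic. The sketch below recomputes the reported percentages from the stated counts and, assuming the 313 patients are exactly those eligible for neither trial, applies inclusion-exclusion to infer how many patients met both sets of criteria. The variable names are illustrative and do not come from the study.

```python
# Counts as reported in the abstract.
total_diabetes = 387
eligible_stampede = 65
eligible_dss = 29
eligible_neither = 313


def pct(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


share_stampede = pct(eligible_stampede, total_diabetes)
share_dss = pct(eligible_dss, total_diabetes)
share_neither = pct(eligible_neither, total_diabetes)

# Assumption: "not eligible for either RCT" means eligible for neither,
# so everyone else was eligible for at least one trial.
eligible_any = total_diabetes - eligible_neither

# Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|.
eligible_both = eligible_stampede + eligible_dss - eligible_any

print(share_stampede, share_dss, share_neither, eligible_any, eligible_both)
```

    Under that assumption, the counts imply 74 patients eligible for at least one trial and 20 eligible for both.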

  • Article type: Journal Article
    BACKGROUND: Educational neuroscience research, which investigates the neurobiological mechanisms of learning, has historically incorporated samples drawn mostly from white, middle-class, and/or suburban populations. However, sampling without attending to representation can lead to biased interpretations and results that are less generalizable to the intended target population. Prior research revealing differences in neurocognitive outcomes both within and across groups further suggests that such practices may obscure significant effects with practical implications.
    Negative attitudes among historically marginalized communities, stemming from historical mistreatment, biased research outcomes, and implicit or explicit attitudes among research teams, can hinder diverse participation. Qualities of the research process, including language requirements, study locations, and time demands, create additional barriers.
    METHODS: Flexible data collection approaches, community engagement, and transparent reporting could build trust and enhance sampling diversity. Longer-term solutions include prioritizing research questions relevant to marginalized communities, increasing workforce diversity, and reporting sample demographics in detail. Such concerted efforts are essential for robust educational neuroscience research to maximize positive impacts broadly across learners.

  • Article type: Journal Article
    The Adult Changes in Thought (ACT) study is a cohort of Kaiser Permanente Washington members aged 65 and older that began in 1994.
    We wanted to know how well ACT participants represented all older adults in the region, and how well ACT findings on eye disease and its relationship with Alzheimer's disease generalized to all older adults in the Seattle Metropolitan Region.
    We used participation weights derived from pooling ACT and Behavioral Risk Factor Surveillance System (BRFSS) data to estimate the prevalence of common eye diseases and their associations with Alzheimer's disease incidence. Cox proportional hazards models accounted for age, education, smoking, sex, and APOE genotype. Confidence intervals for weighted analyses were bootstrapped to account for error in estimating the weights.
    ACT participants were fairly similar to older adults in the region. The largest differences were more self-reported current cholesterol medication use in BRFSS and higher proportions with low education in ACT. Incorporating the weights had little impact on prevalence estimates for age-related macular degeneration or glaucoma. Weighted estimates were slightly higher for diabetic retinopathy (weighted 5.7% (95% confidence interval 4.3, 7.1); unweighted 4.1% (3.6, 4.6)) and cataract history (weighted 51.8% (49.6, 54.3); unweighted 48.6% (47.3, 49.9)). The weighted hazard ratio for recent diabetic retinopathy diagnosis and Alzheimer's disease was 1.84 (0.34, 4.29), versus 1.32 (0.87, 2.00) in unweighted ACT.
    Most, but not all, associations were similar after participation weighting. Even in community-based cohorts, extending inferences to broader populations may benefit from evaluation with participation weights.
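
    The abstract does not say how the participation weights were estimated. As a minimal sketch of the general idea, the example below uses post-stratification on a single categorical covariate as a stand-in: each cohort member is weighted by the ratio of the reference survey's share of their stratum to the cohort's share. All data, strata, and function names are made up for illustration.

```python
from collections import Counter


def participation_weights(cohort_strata, reference_strata):
    """Weight each cohort member by (reference share) / (cohort share) of its stratum."""
    cohort_counts = Counter(cohort_strata)
    reference_counts = Counter(reference_strata)
    n_cohort = len(cohort_strata)
    n_reference = len(reference_strata)
    weights = []
    for stratum in cohort_strata:
        reference_share = reference_counts[stratum] / n_reference
        cohort_share = cohort_counts[stratum] / n_cohort
        weights.append(reference_share / cohort_share)
    return weights


def weighted_prevalence(has_condition, weights):
    """Weighted share of cohort members who have the condition."""
    positive = sum(w for flag, w in zip(has_condition, weights) if flag)
    return positive / sum(weights)


# Toy data (hypothetical): education stratum and a binary eye-disease flag.
cohort_strata = ["low", "low", "low", "high"]
cohort_disease = [True, True, False, False]
reference_strata = ["low", "high", "high", "high"]

weights = participation_weights(cohort_strata, reference_strata)
unweighted = sum(cohort_disease) / len(cohort_disease)
weighted = weighted_prevalence(cohort_disease, weights)
print(f"unweighted={unweighted:.3f}, weighted={weighted:.3f}")
```

    In this toy cohort the low-education stratum is over-represented relative to the reference sample, so weighting pulls the prevalence estimate down. A model-based approach, such as a regression on several covariates, would replace the stratum shares with fitted participation probabilities.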

  • Article type: Journal Article
    BACKGROUND: The PRIME-NL study prospectively evaluates a new integrated and personalized care model for people with parkinsonism, including Parkinson's disease, in a selected region (PRIME) in the Netherlands. We address the generalizability and sources of selection and confounding bias of the PRIME-NL study by examining baseline and 1-year compliance data.
    METHODS: First, we assessed regional baseline differences between the PRIME and the usual care (UC) region using healthcare claims data of almost all people with Parkinson's disease in the Netherlands (the source population). Second, we compared our questionnaire sample to the source population to determine generalizability. Third, we investigated sources of bias by comparing the PRIME and UC questionnaire samples on baseline characteristics and 1-year compliance.
    RESULTS: Baseline characteristics were similar in the PRIME (n = 1430) and UC (n = 26,250) source populations. The combined questionnaire sample (n = 920) was somewhat younger and had a slightly longer disease duration than the combined source population. Compared to the questionnaire sample in the PRIME region, the UC questionnaire sample was slightly younger, had better cognition, had a longer disease duration, had a higher educational attainment, and consumed more alcohol. One-year compliance of the questionnaire sample was higher in the UC region (96%) than in the PRIME region (92%).
    CONCLUSIONS: The generalizability of the PRIME-NL study seems to be good, yet we found evidence of some selection bias. This selection bias necessitates the use of advanced statistical methods for the final evaluation of PRIME-NL, such as inverse probability weighting or propensity score matching. The PRIME-NL study provides a unique window into the validity of a large-scale care evaluation for people with a chronic disease, in this case parkinsonism.

  • Article type: Journal Article
    BACKGROUND: Deep learning has been increasingly investigated for assisting clinical in vitro fertilization (IVF). The first technical step in many tasks is to visually detect and locate sperm, oocytes, and embryos in images. For clinical deployment of such deep learning models, different clinics use different image acquisition hardware and different sample preprocessing protocols, raising the concern over whether the reported accuracy of a deep learning model by one clinic could be reproduced in another clinic. Here we aim to investigate the effect of each imaging factor on the generalizability of object detection models, using sperm analysis as a pilot example.
    METHODS: Ablation studies were performed using state-of-the-art models for detecting human sperm to quantitatively assess how model precision (false-positive detection) and recall (missed detection) were affected by imaging magnification, imaging mode, and sample preprocessing protocols. The results led to the hypothesis that the richness of image acquisition conditions in a training dataset deterministically affects model generalizability. The hypothesis was tested by first enriching the training dataset with a wide range of imaging conditions, then validated through internal blind tests on new samples and external multi-center clinical validations.
    RESULTS: Ablation experiments revealed that removing subsets of data from the training dataset significantly reduced model precision. Removing raw sample images from the training dataset caused the largest drop in model precision, whereas removing 20x images caused the largest drop in model recall. By incorporating different imaging and sample preprocessing conditions into a rich training dataset, the model achieved an intraclass correlation coefficient (ICC) of 0.97 (95% CI: 0.94-0.99) for precision and an ICC of 0.97 (95% CI: 0.93-0.99) for recall. Multi-center clinical validation showed no significant differences in model precision or recall across different clinics and applications.
    CONCLUSIONS: The results validated the hypothesis that the richness of data in the training dataset is a key factor impacting model generalizability. These findings highlight the importance of diversity in a training dataset for model evaluation and suggest that future deep learning models in andrology and reproductive medicine should incorporate comprehensive feature sets for enhanced generalizability across clinics.
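
    The two metrics this abstract tracks can be computed directly from detection counts. The sketch below shows the standard definitions, with precision penalized by false-positive detections and recall by missed detections. The function name and the counts are hypothetical and are not taken from the study.

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Return (precision, recall) from detection counts.

    precision = TP / (TP + FP): lowered by false-positive detections.
    recall    = TP / (TP + FN): lowered by missed detections.
    """
    detected = true_pos + false_pos
    present = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / present if present else 0.0
    return precision, recall


# Hypothetical counts for one image set (not from the study).
precision, recall = precision_recall(true_pos=90, false_pos=10, false_neg=30)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

    Computing both metrics separately for each imaging condition, such as magnification, imaging mode, or preprocessing protocol, is one way to see which condition a model generalizes to least well.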

  • Article type: Journal Article
    BACKGROUND: Radiomics models trained on data from one center typically show a decline in performance when applied to data from external centers, hindering their introduction into large-scale clinical practice. Current expert recommendations suggest using only reproducible radiomics features isolated by multiscanner test-retest experiments, which might help to overcome the problem of limited generalizability to external data.
    OBJECTIVE: To evaluate the influence of using only a subset of robust radiomics features, defined in a prior in vivo multi-MRI-scanner test-retest study, on the performance and generalizability of radiomics models.
    STUDY TYPE: Retrospective.
    POPULATION: Patients with monoclonal plasma cell disorders. Training set (117 MRIs from center 1); internal test set (42 MRIs from center 1); external test set (143 MRIs from centers 2-8).
    FIELD STRENGTH/SEQUENCE: 1.5T and 3.0T; T1-weighted turbo spin echo.
    ASSESSMENT: The task for the radiomics models was to predict plasma cell infiltration, determined by bone marrow biopsy, noninvasively from MRI. Radiomics machine learning models, including a linear regressor, a support vector regressor (SVR), and a random forest regressor (RFR), were trained on data from center 1, using either all radiomics features or only reproducible radiomics features. Models were tested on an internal (center 1) and a multicentric external data set (centers 2-8).
    STATISTICAL TESTS: Pearson correlation coefficient r and mean absolute error (MAE) between predicted and actual plasma cell infiltration. Fisher's z-transformation, Wilcoxon signed-rank test, Wilcoxon rank-sum test; significance level P < 0.05.
    RESULTS: When using only reproducible features compared with all features, the performance of the SVR on the external test set significantly improved (r = 0.43 vs. r = 0.18 and MAE = 22.6 vs. MAE = 28.2). For the RFR, the performance on the external test set deteriorated when using only reproducible instead of all radiomics features (r = 0.33 vs. r = 0.44, P = 0.29 and MAE = 21.9 vs. MAE = 20.5, P = 0.10).
    CONCLUSIONS: Using only reproducible radiomics features improved the external performance of some, but not all, machine learning models and did not automatically lead to an improvement in the external performance of the overall best radiomics model.
    TECHNICAL EFFICACY: Stage 2.
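
    The evaluation metrics named in this abstract are short to write down. The sketch below computes the Pearson correlation coefficient and the mean absolute error on made-up predictions, then applies Fisher's z-transformation to the two external-test correlations reported for the SVR. It does not reproduce the study's significance test, which would also need to account for both models being evaluated on the same patients. The function names and toy data are assumptions for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def fisher_z(r):
    """Fisher z-transformation, atanh(r), which makes r approximately normal."""
    return math.atanh(r)


# Made-up plasma cell infiltration values in percent (not study data).
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 41.0]

r = pearson_r(actual, predicted)
mae = mean_absolute_error(actual, predicted)

# Correlations reported in the abstract for the SVR on the external test set.
z_reproducible = fisher_z(0.43)
z_all = fisher_z(0.18)
print(f"r={r:.3f}, MAE={mae:.1f}, z difference={z_reproducible - z_all:.3f}")
```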

  • Article type: Journal Article
    Generalizability of registrative clinical trials to real-world clinical practice is influenced by the comparability of patients in the two settings. We compared characteristics of cancer patients in registrative trials with those in real-world clinical practice in Italy.
    Data on age, sex, and performance status (PS) were derived from web-based monitoring registries developed by the Italian Medicines Agency (AIFA) and from the corresponding registrative trials reported in the European Public Assessment Reports (EPAR) of the European Medicines Agency (EMA). Weighted means were calculated in registries and trials, and differences were described. Multivariate analysis was performed using Principal Component Analysis and Cluster Analysis.
    From January 2013 to April 2023, 419,461 unique pairs of patients and therapeutic indications were recorded in 129 AIFA registries. Within 140 related trials, 87,452 patients had been enrolled. Median age and rate of elderly (≥65 years old) patients were higher in monitoring registries than in clinical trials [mean difference of median age 5.3 years, p < 0.001; mean difference of elderly rate 17.17% (95% CI 1.06, 1.48)]. Overall, the rate of female patients was not different between registries and trials [mean difference -0.55% (95% CI -1.06, -0.05)]. The mean rate of patients with deteriorated PS was low both in trials (3.1%) and in registries (4.3%), with a mean difference of 1.27% (95% CI 1.06, 1.48). Two clusters were identified with multivariate analysis: one including more registries (higher median age and elderly rate, lower female rate, higher rate of deteriorated patients), the other more trials (lower median age and elderly rate, higher female rate, lower rate of deteriorated patients).
    This study indicates that cancer patients enrolled in trials only partially represent those treated in clinical practice in Italy. Inclusiveness of registrative trials should be increased to ensure generalizability of results to the real-world population.
    Partially supported by the Italian Ministry of Health.

  • Article type: Comparative Study
    To date, no population-based studies have specifically explored the external validity of pivotal randomized clinical trials (RCTs) of biologics simultaneously across a broad spectrum of immune-mediated inflammatory diseases (IMIDs). The aims of this study were, first, to compare patient characteristics and median treatment duration of biologics approved for IMIDs between RCTs and the real-world (RW) setting and, second, to assess the proportion of biologic users treated for IMIDs in the real-world setting who would not have been eligible for inclusion in the pivotal RCTs for each indication of use. Using the Italian VALORE distributed database (66,639 incident biologic users), adult patients with IMIDs treated with biologics in the Italian real-world setting were substantially older (mean age ± SD: 50 ± 15 years) than those enrolled in pivotal RCTs (45 ± 15 years). In the real-world setting, certolizumab pegol was more commonly used by adult women with psoriasis/ankylosing spondylitis (F/M ratio: 1.8-1.9) than in RCTs (F/M ratio: 0.5-0.6). The median treatment duration (weeks) of incident biologic users in RW was significantly higher than the duration of pivotal RCTs in almost all indications for use and most biologics (4-100 vs. 6-167). Furthermore, almost half (46.4%) of biologic users from RW settings would have been ineligible for inclusion in the respective indication-specific pivotal RCTs. The main reasons were advanced age, recent history of cancer, and the presence of other concomitant IMIDs. These findings suggest that post-marketing surveillance of biologics should be prioritized for those patients.

  • Article type: Journal Article
    While randomized controlled trials (RCTs) are critical for establishing the efficacy of new therapies, there are limitations regarding what comparisons can be made directly from trial data. RCTs are limited to a small number of comparator arms and often compare a new therapeutic to a standard of care which has already proven efficacious. It is sometimes of interest to estimate the efficacy of the new therapy relative to a treatment that was not evaluated in the same trial, such as a placebo or an alternative therapy that was evaluated in a different trial. Such dual-study comparisons are challenging because of potential differences between trial populations that can affect the outcome. In this article, two bridging estimators are considered that allow for comparisons of treatments evaluated in different trials, accounting for measured differences in trial populations. A "multi-span" estimator leverages a shared arm between two trials, while a "single-span" estimator does not require a shared arm. A diagnostic statistic that compares the outcome in the standardized shared arms is provided. The two estimators are compared in simulations, where both estimators demonstrate minimal empirical bias and nominal confidence interval coverage when the identification assumptions are met. The estimators are applied to data from the AIDS Clinical Trials Group 320 and 388 to compare the efficacy of two-drug vs four-drug antiretroviral therapy on CD4 cell counts among persons with advanced HIV. The single-span approach requires weaker identification assumptions and was more efficient in simulations and the application.

  • Article type: Journal Article
    OBJECTIVES: Decentralized clinical trial (DCT) approaches are clinical trials in which some or all trial activities take place in participants' proximity rather than at a traditional investigative site. Data from DCTs may be used for clinical and economic evaluations by health technology assessment (HTA) bodies to support reimbursement decision making. This study aimed to explore the opportunities and challenges for DCT approaches from an HTA perspective by interviewing representatives from European HTA bodies.
    METHODS: We conducted semistructured interviews with 25 European HTA representatives between September 2022 and February 2023, and transcripts were analyzed using thematic analysis.
    RESULTS: Two main themes were identified from the data, relating to (1) DCT approaches in HTA and (2) trial-level acceptance and relevance. Experience with assessing DCTs was limited, and knowledge about DCTs varied. The respondents recognized the opportunity of DCTs to reduce recall bias when participant-reported outcome data can be collected more frequently and conveniently from home. Concerns were expressed about data quality when participants become responsible for data collection. Despite this challenge, the respondents recognized the potential of DCTs to increase the generalizability of results, because data can be collected in a setting reflective of the everyday situation and potentially from a more diverse participant group.
    CONCLUSIONS: DCTs could generate relevant results for HTA decision making when data are collected in a real-world setting from a diverse participant group. Increased awareness of the opportunities and challenges could help HTA assessors in their appraisal of DCT approaches.