functional data analysis

  • Article type: Journal Article
    Bayesian methods provide direct inference in functional data analysis applications without reliance on bootstrap techniques. A major tool in functional data applications is functional principal component analysis, which decomposes the data around a common mean function and identifies leading directions of variation. Bayesian functional principal components analysis (BFPCA) provides uncertainty quantification on the estimated functional model components via the posterior samples obtained. We propose central posterior envelopes (CPEs) for BFPCA based on functional depth as a descriptive visualization tool to summarize variation in the posterior samples of the estimated functional model components, contributing to uncertainty quantification in BFPCA. The proposed BFPCA relies on a latent factor model and targets model parameters within a mixed effects modeling framework using modified multiplicative gamma process shrinkage priors on the variance components. Functional depth provides a center-outward order to a sample of functions. We utilize modified band depth and modified volume depth for ordering of a sample of functions and surfaces, respectively, to derive CPEs of the mean and eigenfunctions within the BFPCA framework. The proposed CPEs are showcased in extensive simulations. Finally, the proposed CPEs are applied to the analysis of a sample of power spectral densities (PSD) from resting state electroencephalography (EEG), where they lead to novel insights on diagnostic group differences among children diagnosed with autism spectrum disorder and their typically developing peers across age.
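The depth-based ordering described in this abstract can be illustrated with a minimal sketch. It assumes curves sampled on a shared grid and uses the usual two-curve-band form of modified band depth; the function names, the toy "posterior draws", and the envelope fraction `alpha` are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations


def modified_band_depth(curves):
    """Modified band depth (bands formed by pairs of curves) on a shared grid."""
    n, m = len(curves), len(curves[0])
    n_pairs = n * (n - 1) // 2
    depths = []
    for x in curves:
        inside = 0
        for a, b in combinations(curves, 2):
            # Count grid points where x lies within the band spanned by a and b.
            inside += sum(min(a[t], b[t]) <= x[t] <= max(a[t], b[t]) for t in range(m))
        depths.append(inside / (n_pairs * m))
    return depths


def central_envelope(curves, alpha=0.5):
    """Pointwise min/max over the deepest alpha fraction of the curves."""
    depths = modified_band_depth(curves)
    order = sorted(range(len(curves)), key=lambda i: depths[i], reverse=True)
    keep = [curves[i] for i in order[: max(1, round(alpha * len(curves)))]]
    m = len(curves[0])
    lower = [min(c[t] for c in keep) for t in range(m)]
    upper = [max(c[t] for c in keep) for t in range(m)]
    return lower, upper


# Toy stand-in for posterior draws: five vertically shifted copies of one curve.
draws = [[s + t * t for t in (0.0, 0.5, 1.0)] for s in (-2, -1, 0, 1, 2)]
depths = modified_band_depth(draws)
lower, upper = central_envelope(draws, alpha=0.6)
```

In this toy sample the middle draw is the deepest and the two extreme draws are the shallowest, so the 60% envelope is spanned by the three central curves.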

  • Article type: Journal Article
    Physical activity (PA) is significantly associated with many health outcomes. The wide usage of wearable accelerometer-based activity trackers in recent years has provided a unique opportunity for in-depth research on PA and its relations with health outcomes and interventions. Past analysis of activity tracker data relies heavily on aggregating minute-level PA records into day-level summary statistics, in which important information on temporal/diurnal PA patterns is lost. In this paper we propose a novel functional data analysis approach based on Riemann manifolds for modeling PA and its longitudinal changes. We model smoothed minute-level PA of a day as one-dimensional Riemann manifolds and longitudinal changes in PA across visits as deformations between manifolds. The variability in changes of PA among a cohort of subjects is characterized via variability in the deformation. Functional principal component analysis is further adopted to model the deformations, and PC scores are used as a proxy in modeling the relation between changes in PA and health outcomes and/or interventions. We conduct comprehensive analyses on data from two clinical trials: Reach for Health (RfH) and Metabolism, Exercise and Nutrition at UCSD (MENU), focusing on the effect of interventions on longitudinal changes in PA patterns and how different modes of changes in PA influence weight loss, respectively. The proposed approach reveals unique modes of changes, including overall enhanced PA, boosted morning PA, and shifts of active hours specific to each study cohort. The results bring new insights into the study of longitudinal changes in PA and health and have the potential to facilitate the design of effective health interventions and guidelines.

  • Article type: Journal Article
    Glucose meal response information collected via Continuous Glucose Monitoring (CGM) is relevant to the assessment of individual metabolic status and the support of personalized diet prescriptions. However, the complexity of the data produced by CGM monitors pushes the limits of existing analytic methods. CGM data often exhibit substantial within-person variability and have a natural multilevel structure. This research is motivated by the analysis of CGM data from individuals without diabetes in the AEGIS study. The dataset includes detailed information on meal timing and nutrition for each individual over different days. The primary focus of this study is to examine CGM glucose responses following patients' meals and explore the time-dependent associations with dietary and patient characteristics. Motivated by this problem, we propose a new analytical framework based on multilevel functional models, including a new functional mixed R-square coefficient. The use of these models illustrates three key points: (i) the importance of analyzing glucose responses across the entire functional domain when making diet recommendations; (ii) the differential metabolic responses between normoglycemic and prediabetic patients, particularly with regard to lipid intake; (iii) the importance of including random, person-level effects when modelling this scientific problem.

  • Article type: Journal Article
    Advances in Internet of Things (IoT) technologies have enabled the implementation of smart and wearable sensors, which can be employed to provide older adults with affordable and accessible continuous biophysiological status monitoring. The quality of such monitoring data, however, is unsatisfactory due to excessive noise induced by various disturbances, such as motion artifacts. Existing methods take advantage of summary statistics, such as mean or median values, for denoising, without taking into account the biophysiological patterns embedded in the data. In this research, a functional data analysis modeling method was proposed to enhance the data quality by learning individual subjects' diurnal heart rate (HR) patterns from historical data, which were further improved by fusing newly collected data. This proposed data-fusion approach was developed based on a Bayesian inference framework. Its effectiveness was demonstrated in an HR analysis from a prospective study involving older adults residing in assisted living or home settings. The results indicate that it is imperative to conduct personalized healthcare by estimating individualized HR patterns. Furthermore, the proposed calibration method provides a more accurate (smaller mean errors) and more precise (smaller error standard deviations) HR estimation than raw HR and conventional methods, such as the mean.
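The idea of fusing a historical pattern with newly collected data can be illustrated with a deliberately simplified sketch: a pointwise normal-normal conjugate update, in which the historical diurnal pattern acts as the prior and each new reading is weighted by its precision. This is an assumed stand-in for illustration only; the abstract does not specify the authors' model, and the function name `fuse` and all numbers below are hypothetical.

```python
def fuse(prior_mean, prior_var, obs, obs_var):
    """Pointwise normal-normal update: returns posterior means and variances."""
    post_mean, post_var = [], []
    for m, v, y, s in zip(prior_mean, prior_var, obs, obs_var):
        precision = 1.0 / v + 1.0 / s  # precisions add under conjugacy
        post_var.append(1.0 / precision)
        post_mean.append((m / v + y / s) / precision)
    return post_mean, post_var


# Hypothetical HR pattern (beats per minute) at two times of day.
prior_mean = [60.0, 70.0]   # learned from historical data
prior_var = [4.0, 4.0]
obs = [64.0, 90.0]          # new readings; the second is assumed to be noisy
obs_var = [4.0, 36.0]       # larger variance for the suspected motion artifact
post_mean, post_var = fuse(prior_mean, prior_var, obs, obs_var)
```

With these toy numbers the first estimate moves halfway toward the new reading (62), while the noisy second reading shifts its estimate only slightly (72), which is the shrinkage behaviour a fusion approach of this kind relies on.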

  • Article type: Journal Article
    Given the traffic safety and occupational injury prevention implications associated with cannabis impairment, there is a need for objective and validated measures of recent cannabis use. Pupillary light response may offer an approach for detection.
    Eighty-four participants (mean age: 32, 42% female) with daily, occasional, and no-use cannabis use histories participated in pupillary light response tests before and after smoking cannabis ad libitum or relaxing for 15 min (no use). The impact of recent cannabis consumption on trajectories of the pupillary light response was modeled using functional data analysis tools. Logistic regression models for detecting recent cannabis use were compared, and average pupil trajectories across cannabis use groups and times since light test administration were estimated.
    Models revealed small, significant differences in pupil response to light after cannabis use comparing the occasional use group to the no-use control group, and similar statistically significant differences in pupil response patterns comparing the daily use group to the no-use comparison group. Trajectories of pupillary light response estimated using functional data analysis found that acute cannabis smoking was associated with less initial and sustained pupil constriction compared to no cannabis smoking.
    These analyses show the promise of pairing pupillary light response and functional data analysis methods to assess recent cannabis use.

  • Article type: Journal Article
    Common dysglycemia measurements including fasting plasma glucose (FPG), oral glucose tolerance test (OGTT)-derived 2 h plasma glucose, and hemoglobin A1c (HbA1c) have limitations for children. Dynamic OGTT glucose and insulin responses may better reflect underlying physiology. This analysis assessed glucose and insulin curve shapes utilizing classifications (biphasic, monophasic, or monotonically increasing) and functional principal components (FPCs) to predict future dysglycemia. The prospective cohort included 671 participants with no previous diabetes diagnosis (BMI percentile ≥ 85th, 8-18 years old); 193 returned for follow-up (median 14.5 months). Blood was collected every 30 min during the 2 h OGTT. Functional data analysis was performed on curves summarizing glucose and insulin responses. FPCs described variation in curve height (FPC1), time of peak (FPC2), and oscillation (FPC3). At baseline, both glucose and insulin FPC1 were significantly correlated with BMI percentile (Spearman correlation r = 0.22 and 0.48), triglycerides (r = 0.30 and 0.39), and HbA1c (r = 0.25 and 0.17). In longitudinal logistic regression analyses, glucose and insulin FPCs predicted future dysglycemia (AUC = 0.80) better than shape classifications (AUC = 0.69), HbA1c (AUC = 0.72), or FPG (AUC = 0.50). Further research should evaluate the utility of FPCs to predict metabolic diseases.
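As a rough illustration of how a leading functional principal component and its scores can be obtained from curves sampled every 30 min, the sketch below centers the curves, forms the sample covariance between grid points, and extracts the top eigenvector by power iteration. It is a simplification under stated assumptions: a full FPCA would typically smooth the curves and account for grid spacing, and the glucose values are synthetic, not study data. The name `leading_fpc` is hypothetical.

```python
import math


def leading_fpc(curves, n_iter=200):
    """Leading principal component of curves on a common grid (power iteration)."""
    n, m = len(curves), len(curves[0])
    mean = [sum(c[t] for c in curves) / n for t in range(m)]
    centered = [[c[t] - mean[t] for t in range(m)] for c in curves]
    # Sample covariance between grid points.
    cov = [[sum(r[s] * r[t] for r in centered) / (n - 1) for t in range(m)]
           for s in range(m)]
    v = [1.0 + 0.1 * t for t in range(m)]  # arbitrary non-zero start vector
    for _ in range(n_iter):
        w = [sum(cov[s][t] * v[t] for t in range(m)) for s in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[s] * sum(cov[s][t] * v[t] for t in range(m)) for s in range(m))
    scores = [sum(r[t] * v[t] for t in range(m)) for r in centered]
    return mean, v, eigenvalue, scores


# Synthetic glucose curves (mg/dL) at 0, 30, 60, 90, 120 min that differ only in height.
base = [90.0, 140.0, 130.0, 115.0, 100.0]
curves = [[b + shift for b in base] for shift in (-10.0, 0.0, 10.0)]
mean_curve, fpc1, eigenvalue, scores = leading_fpc(curves)
```

Because the synthetic curves differ only by a vertical shift, the leading component is constant over time, matching the "curve height" interpretation of FPC1, and each subject's score is proportional to that subject's shift.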

  • Article type: Journal Article
    Considering the context of functional data analysis, we developed and applied a new Bayesian approach via the Gibbs sampler to select basis functions for a finite representation of functional data. The proposed methodology uses Bernoulli latent variables to assign zero to some of the basis function coefficients with a positive probability. This procedure allows for an adaptive basis selection since it can determine the number of bases and which ones should be selected to represent functional data. Moreover, the proposed procedure measures the uncertainty of the selection process and can be applied to multiple curves simultaneously. The methodology developed can deal with observed curves that may differ due to experimental error and random individual differences between subjects, which one can observe in a real dataset application involving daily numbers of COVID-19 cases in Brazil. Simulation studies show the main properties of the proposed method, such as its accuracy in estimating the coefficients and the strength of the procedure to find the true set of basis functions. Despite having been developed in the context of functional data analysis, we also compared the proposed model via simulation with the well-established LASSO and Bayesian LASSO, which are methods developed for non-functional data.

  • Article type: Journal Article
    Environmental epidemiologic studies routinely utilize aggregate health outcomes to estimate effects of short-term (e.g., daily) exposures that are available at increasingly fine spatial resolutions. However, areal averages are typically used to derive population-level exposure, which cannot capture the spatial variation and individual heterogeneity in exposures that may occur within the spatial and temporal unit of interest (e.g., within a day or ZIP code). We propose a general modeling approach to incorporate within-unit exposure heterogeneity in health analyses via exposure quantile functions. Furthermore, by viewing the exposure quantile function as a functional covariate, our approach provides additional flexibility in characterizing associations at different quantile levels. We apply the proposed approach to an analysis of air pollution and emergency department (ED) visits in Atlanta over 4 years. The analysis utilizes daily ZIP code-level distributions of personal exposures to 4 traffic-related ambient air pollutants simulated from the Stochastic Human Exposure and Dose Simulator. Our analyses find that effects of carbon monoxide on respiratory and cardiovascular disease ED visits are more pronounced with changes in lower quantiles of the population's exposure. Software for implementation is provided in the R package nbRegQF.
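To make the notion of an exposure quantile function as a functional covariate concrete, the sketch below computes an empirical quantile function Q(τ) from within-unit exposures and approximates a linear functional term, the integral of β(τ)·Q(τ) over (0, 1), by the midpoint rule. The integral form, the coefficient functions, and the helper names are assumptions chosen for illustration; this does not reproduce the nbRegQF package or its interface.

```python
def empirical_quantile(values, tau):
    """Empirical quantile with linear interpolation between order statistics."""
    xs = sorted(values)
    pos = tau * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def functional_effect(values, beta, n_grid=100):
    """Midpoint-rule approximation of the integral of beta(tau) * Q(tau) on (0, 1)."""
    taus = [(k + 0.5) / n_grid for k in range(n_grid)]
    return sum(beta(t) * empirical_quantile(values, t) for t in taus) / n_grid


# Hypothetical personal exposures within one ZIP code on one day.
exposures = [1.0, 2.0, 3.0, 4.0, 5.0]

# A constant coefficient recovers the mean exposure (the usual areal-average covariate).
mean_effect = functional_effect(exposures, lambda tau: 1.0)

# A coefficient that is non-zero only for lower quantiles weights low exposures alone.
lower_effect = functional_effect(exposures, lambda tau: 1.0 if tau < 0.5 else 0.0)
```

The constant-coefficient case shows why the areal average is a special case of this formulation, while a quantile-varying coefficient lets the association differ between the lower and upper parts of the exposure distribution.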

  • Article type: Journal Article
    The relationship between the early-age activity of Mediterranean fruit flies (medflies) or other fruit flies and their lifespan has received little study, in contrast to the connections between lifespan and diet, sexual signaling, and reproduction. The objective of this study is to assess intra-day and day-to-day activity profiles of female Mediterranean fruit flies and their role as biomarkers of longevity, as well as to explore the relationships between these activity profiles, diet, and age-at-death throughout the lifespan. We use advanced statistical methods from functional data analysis (FDA). Three distinct patterns of activity variations in early-age activity profiles can be distinguished. A low-caloric diet is associated with a delayed activity peak, while a high-caloric diet is linked with an earlier activity peak. We find that age-at-death of individual medflies is connected to their activity profiles in early life. An increased risk of mortality is associated with increased activity in early age, as well as with a higher contrast between daytime and nighttime activity. Conversely, medflies are more likely to have a longer lifespan when they are fed a medium-caloric diet and when their daily activity is more evenly distributed across the early-age span and between daytime and nighttime. The before-death activity profile of medflies displays two characteristic patterns: one is characterized by slowly declining daily activity and the other by a sudden decline in activity that is followed by death.

  • Article type: Journal Article
    BACKGROUND: Life course epidemiology examines associations between repeated measures of risk and health outcomes across different phases of life. Empirical research, however, is often based on discrete-time models that assume that sporadic measurement occasions fully capture underlying long-term continuous processes of risk.
    METHODS: We propose (i) the functional relevant life course model (fRLM), which treats repeated, discrete measures of risk as unobserved continuous processes, and (ii) a testing procedure to assign probabilities that the data correspond to conceptual models of life course epidemiology (critical period, sensitive period and accumulation models). The performance of the fRLM is evaluated with simulations, and the approach is illustrated with empirical applications relating body mass index (BMI) to mRNA-seq signatures of chronic kidney disease, inflammation and breast cancer.
    RESULTS: Simulations reveal that fRLM identifies the correct life course model with three to five repeated assessments of risk and 400 subjects. The empirical examples reveal that chronic kidney disease reflects a critical period process and inflammation and breast cancer likely reflect sensitive period mechanisms.
    CONCLUSIONS: The proposed fRLM treats repeated measures of risk as continuous processes and, under realistic data scenarios, the method provides accurate probabilities that the data correspond to commonly studied models of life course epidemiology. fRLM is implemented with publicly available software.