Factor models

  • Article type: Journal Article
    This study develops a model-based index approach called the Generalised Shared Component Model (GSCM) by drawing on the large field of factor models. The proposed fully Bayesian approach accommodates heteroscedastic model error, multiple shared factors and flexible spatial priors. Moreover, unlike previous index approaches, our model provides indices with associated uncertainty. Focusing on unhealthy behaviors that increase the risk of cancer, the proposed GSCM is used to develop the Area Indices of Behaviors Impacting Cancer product, the first area-level cancer risk factor index in Australia. This advancement aids in identifying communities with elevated cancer risk, facilitating targeted health interventions.






  • Article type: Journal Article
    Elucidating the neural mechanisms of general cognitive ability (GCA) is an important mission of cognitive neuroscience. Recent large-sample cohort studies measured GCA through multiple cognitive tasks and explored its neural basis, but they did not investigate how task number, factor models, and neural data type affect the estimation of GCA and its neural correlates. To address these issues, we tested 1,605 Chinese young adults with 19 cognitive tasks and Raven's Advanced Progressive Matrices (RAPM) and collected resting state and n-back task fMRI data from a subsample of 683 individuals. Results showed that GCA could be reliably estimated by multiple tasks. Increasing task number enhances both reliability and validity of GCA estimates and reliably strengthens their correlations with brain data. The Spearman model and hierarchical bifactor model yield similar GCA estimates. The bifactor model has better model fit and stronger correlation with RAPM but explains less variance and shows weaker correlations with brain data than does the Spearman model. Notably, the n-back task-based functional connectivity patterns outperform resting-state fMRI in predicting GCA. These results suggest that GCA derived from a multitude of cognitive tasks serves as a valid measure of general intelligence and that its neural correlates could be better characterized by task fMRI than resting-state fMRI data.






  • Article type: Journal Article
    In modern scientific research, data heterogeneity is commonly observed owing to the abundance of complex data. We propose a factor regression model for data with heterogeneous subpopulations. The proposed model can be represented as a decomposition of heterogeneous and homogeneous terms. The heterogeneous term is driven by latent factors in different subpopulations. The homogeneous term captures common variation in the covariates and shares common regression coefficients across subpopulations. Our proposed model attains a good balance between a global model and a group-specific model. The global model ignores the data heterogeneity, while the group-specific model fits each subgroup separately. We prove the estimation and prediction consistency for our proposed estimators, and show that they have better convergence rates than those of the group-specific and global models. We show that the extra cost of estimating latent factors is asymptotically negligible and the minimax rate is still attainable. We further demonstrate the robustness of our proposed method by studying its prediction error under a mis-specified group-specific model. Finally, we conduct simulation studies and analyze a data set from the Alzheimer's Disease Neuroimaging Initiative and an aggregated microarray data set to further demonstrate the competitiveness and interpretability of our proposed factor regression model.






  • Article type: Journal Article
    Recent technological advances have made it possible to measure multiple types of many features in biomedical studies. However, some data types or features may not be measured for all study subjects because of cost or other constraints. We use a latent variable model to characterize the relationships across and within data types and to infer missing values from observed data. We develop a penalized-likelihood approach for variable selection and parameter estimation and devise an efficient expectation-maximization algorithm to implement our approach. We establish the asymptotic properties of the proposed estimators when the number of features increases at a polynomial rate of the sample size. Finally, we demonstrate the usefulness of the proposed methods using extensive simulation studies and provide an application to a motivating multi-platform genomics study.






  • Article type: Journal Article
    Factor analysis is a widely used tool for unsupervised dimensionality reduction of high-throughput datasets in molecular biology, with recently proposed extensions designed specifically for spatial transcriptomics data. However, these methods expect (count) matrices as data input and are therefore not directly applicable to single molecule resolution data, which are in the form of coordinate lists annotated with genes and provide insight into subcellular spatial expression patterns. To address this, we here propose FISHFactor, a probabilistic factor model that combines the benefits of spatial, non-negative factor analysis with a Poisson point process likelihood to explicitly model and account for the nature of single molecule resolution data. In addition, FISHFactor shares information across a potentially large number of cells in a common weight matrix, allowing consistent interpretation of factors across cells and yielding improved latent variable estimates.
    We compare FISHFactor to existing methods that rely on aggregating information through spatial binning and cannot combine information from multiple cells and show that our method leads to more accurate results on simulated data. We show that our method is scalable and can be readily applied to large datasets. Finally, we demonstrate on a real dataset that FISHFactor is able to identify major subcellular expression patterns and spatial gene clusters in a data-driven manner.
    The model implementation, data simulation and experiment scripts are available under https://www.github.com/bioFAM/FISHFactor.
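    As a rough illustration of the Poisson point process likelihood mentioned above, the sketch below evaluates the standard log-likelihood of an inhomogeneous Poisson process, sum of log intensity at each observed point minus the integral of the intensity over the domain. It is not taken from the FISHFactor code: the function name `poisson_process_loglik`, the unit-square domain and the midpoint-rule grid integration are all assumptions made for this example.

```python
import numpy as np


def poisson_process_loglik(points, intensity, grid_size=200):
    """Log-likelihood of 2D points under an inhomogeneous Poisson process.

    Assumptions (hypothetical, for illustration only): the domain is the
    unit square [0, 1]^2 and `intensity(x, y)` is a vectorised, strictly
    positive function returning an array with the same shape as x.
    """
    points = np.asarray(points, dtype=float)
    # First term: sum of log intensities at the observed molecule positions.
    log_term = np.sum(np.log(intensity(points[:, 0], points[:, 1])))
    # Second term: integral of the intensity, approximated by the midpoint
    # rule on a regular grid. The domain has area 1, so the integral is the
    # mean of the intensity over the grid cells.
    mids = (np.arange(grid_size) + 0.5) / grid_size
    gx, gy = np.meshgrid(mids, mids)
    integral = intensity(gx, gy).mean()
    return log_term - integral


# Usage: three points under a constant intensity of 50 per unit area.
pts = np.array([[0.1, 0.2], [0.5, 0.5], [0.9, 0.3]])
ll = poisson_process_loglik(pts, lambda x, y: np.full_like(x, 50.0))
```

    Working on coordinates rather than binned counts is what lets a model of this kind use single-molecule data directly; in a factor model the intensity would itself be built from latent factors and gene weights, which this sketch leaves out.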






  • Article type: Journal Article
    This study empirically analyzes time series momentum (TSM) in the European equity market between 2000 and 2020. The study produces additional evidence on TSM, where a significant and persistent market price anomaly enables investors to earn abnormal returns. To achieve this goal, the present study implements a pooled autoregressive model to test the power of European equity indices to predict future returns. The results indicate that strategies based on TSM are in line with the discussed literature and enable market agents to earn returns above the market (0.71% per month) by using a six-factor model.
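    A minimal sketch of a time series momentum rule may help make the idea concrete. It is not the study's specification: the 12-month lookback, the sign-based position and the function name `tsm_returns` are assumptions chosen for illustration, and no six-factor adjustment is attempted.

```python
import numpy as np


def tsm_returns(returns, lookback=12):
    """Monthly returns of a simple time series momentum rule.

    `returns` has shape (n_months, n_indices). For each index, the position
    in month t is the sign of its own cumulative return over the previous
    `lookback` months (hypothetical default: 12). The result has shape
    (n_months - lookback, n_indices).
    """
    returns = np.asarray(returns, dtype=float)
    n_months = returns.shape[0]
    out = np.empty((n_months - lookback,) + returns.shape[1:])
    for t in range(lookback, n_months):
        # Cumulative return over the trailing window, excluding month t.
        trailing = np.prod(1.0 + returns[t - lookback:t], axis=0) - 1.0
        # Go long after a positive trailing return, short after a negative one.
        out[t - lookback] = np.sign(trailing) * returns[t]
    return out


# Usage: an index that falls 1% every month is shorted, so the rule earns +1%.
falling = np.full((24, 1), -0.01)
strategy = tsm_returns(falling)
```

    Each index is traded on its own past return only, which is what distinguishes time series momentum from cross-sectional momentum, where assets are ranked against each other.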






  • Article type: Journal Article
    Neural activity is often described in terms of population-level factors extracted from the responses of many neurons. Factors provide a lower-dimensional description with the aim of shedding light on network computations. Yet, mechanistically, computations are performed not by continuously valued factors but by interactions among neurons that spike discretely and variably. Models provide a means of bridging these levels of description. We developed a general method for training model networks of spiking neurons by leveraging factors extracted from either data or firing-rate-based networks. In addition to providing a useful model-building framework, this formalism illustrates how reliable and continuously valued factors can arise from seemingly stochastic spiking. Our framework establishes procedures for embedding this property in network models with different levels of realism. The relationship between spikes and factors in such networks provides a foundation for interpreting (and subtly redefining) commonly used quantities such as firing rates.






  • Article type: Journal Article
    We demonstrate how a linear factor model with latent variables can be used to estimate correlations between the outcomes of clinical trials. These correlations are needed for many policy questions of drug/vaccine development (such as calculating the optimal size of financial incentives) and the literature so far has relied on expert opinions. We apply our methodology to the case of vaccines and show that the estimated correlations are highly significant. We also illustrate how the estimated correlations can be used to find the probability of obtaining a successful vaccine out of a certain number of candidates and to determine optimal investment in vaccine development.
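    To see why the correlations matter for the probability of at least one success, consider the following sketch. It assumes a one-factor Gaussian threshold model with equal success probability `p` and equal pairwise latent correlation `rho` for all candidates; these choices, and the function name `prob_at_least_one_success`, are illustrative assumptions and not the authors' estimated model.

```python
from statistics import NormalDist

import numpy as np


def prob_at_least_one_success(p, rho, n_candidates, n_draws=200_000, seed=0):
    """Monte Carlo estimate of P(at least one of n candidates succeeds).

    Assumed model (hypothetical): candidate i succeeds when the latent
    variable sqrt(rho) * F + sqrt(1 - rho) * e_i falls below a threshold,
    where F is a shared standard normal factor and e_i is independent
    standard normal noise. Each candidate then succeeds with probability p
    and latent variables have pairwise correlation rho.
    """
    rng = np.random.default_rng(seed)
    threshold = NormalDist().inv_cdf(p)
    common = rng.standard_normal((n_draws, 1))            # shared factor F
    idio = rng.standard_normal((n_draws, n_candidates))   # candidate noise e_i
    latent = np.sqrt(rho) * common + np.sqrt(1.0 - rho) * idio
    return (latent < threshold).any(axis=1).mean()


# Usage: five candidates, each with a 20% chance of success.
independent = prob_at_least_one_success(0.2, 0.0, 5)
correlated = prob_at_least_one_success(0.2, 0.5, 5)
```

    Under these assumptions, positively correlated outcomes lower the chance that at least one candidate succeeds relative to the independent case, because failures tend to occur together. This is why treating candidates as independent can overstate the value of funding additional candidates.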







  • Article type: Journal Article
    Current diagnosis of neurological disorders often relies on late-stage clinical symptoms, which poses barriers to developing effective interventions at the premanifest stage. Recent research suggests that biomarkers and subtle changes in clinical markers may occur in a time-ordered fashion and can be used as indicators of early disease. In this article, we tackle the challenges of leveraging multidomain markers to learn early disease progression of neurological disorders. We propose to integrate heterogeneous types of measures from multiple domains (e.g., discrete clinical symptoms, ordinal cognitive markers, continuous neuroimaging, and blood biomarkers) using a hierarchical Multilayer Exponential Family Factor (MEFF) model, where the observations follow exponential family distributions with lower-dimensional latent factors. The latent factors are decomposed into shared factors across multiple domains and domain-specific factors, where the shared factors provide robust information to perform extensive phenotyping and partition patients into clinically meaningful and biologically homogeneous subgroups. Domain-specific factors capture remaining unique variations for each domain. The MEFF model also captures the nonlinear trajectory of disease progression and orders critical events of neurodegeneration measured by each marker. To overcome computational challenges, we fit our model by approximate inference techniques for large-scale data. We apply the developed method to Parkinson's Progression Markers Initiative data to integrate biological, clinical, and cognitive markers arising from heterogeneous distributions. The model learns lower-dimensional representations of Parkinson's disease (PD) and the temporal ordering of the neurodegeneration of PD.






  • Article type: Journal Article
    This paper expands the analysis of the cyclical characteristics of social spending by providing information on its joint behaviour across OECD countries. With this aim we propose the use of dynamic factor analysis and recursive models to estimate synchronization and cyclicality of social policies within a broad perspective. By considering the synchronization of social spending it is possible to assess the short-run characteristics of the joint response to changes in the economic cycle. We find that synchronization of social spending was only possible for advanced economies, achieving the highest countercyclical stabilization effect during the Global Financial Crisis. Emerging market economies are not able to join the synchronized response, maintaining independent and, in most cases, procyclical stances in the behaviour of their social policies.
    The online version contains supplementary material available at 10.1007/s10663-022-09545-w.





