
  • Article type: Journal Article
    Multicollinearity, characterized by significant co-expression patterns among genes, often occurs in high-throughput expression data and can undermine the reliability of predictive models. This study examined multicollinearity among closely related genes, particularly in RNA-Seq data obtained from embryoid bodies (EBs) exposed to 5-fluorouracil perturbation, to identify genes associated with embryotoxicity. Six genes (Dppa5a, Gdf3, Zfp42, Meis1, Hoxa2, and Hoxb1) emerged as candidates based on domain knowledge and were validated using qPCR in EBs perturbed by 39 test substances. We conducted correlation studies and used the variance inflation factor (VIF) to examine multicollinearity among the genes. Recursive feature elimination with cross-validation (RFECV) ranked Zfp42 and Hoxb1 as the top two among the seven features considered, identifying them as potential biomarkers for early embryotoxicity assessment. A t test assessing the statistical significance of this two-feature prediction model yielded a p value of 0.0044, confirming the successful reduction of redundancies and multicollinearity through RFECV. Our study presents a systematic methodology for using machine learning techniques in transcriptomics data analysis, enhancing the discovery of potential reporter gene candidates for embryotoxicity screening research and improving the model's predictive accuracy and feasibility while reducing financial and time constraints.
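
    The VIF diagnostic mentioned above can be sketched as follows. This is a minimal illustration on synthetic data, not the study's pipeline: the features `a`, `b`, and `c` are hypothetical stand-ins for gene expression values, and the helper name `variance_inflation_factors` is an assumption introduced here.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows = samples, columns = features)."""
    corr = np.corrcoef(X, rowvar=False)
    # The j-th diagonal entry of the inverse correlation matrix equals
    # 1 / (1 - R_j^2), where R_j^2 is obtained by regressing feature j
    # on all the other features.
    return np.diag(np.linalg.inv(corr))

# Synthetic example (hypothetical data, not the study's expression values).
rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
b = a + 0.1 * rng.normal(size=n)   # nearly a copy of `a`: strongly collinear
c = rng.normal(size=n)             # independent of both
vif = variance_inflation_factors(np.column_stack([a, b, c]))
print(dict(zip(["a", "b", "c"], np.round(vif, 1))))
```

    Features with a large VIF (here `a` and `b`) carry largely redundant information, which is the situation a feature-elimination step such as RFECV is meant to resolve.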






  • Article type: Journal Article
    The relationship between economic growth and CO2 emissions has been analyzed by testing the environmental Kuznets curve hypothesis, but traditional econometric methods may be flawed. An alternative method based on segmented-sample regressions is proposed and applied to 164 countries (98.34% of world population) over different periods from 1822 to 2018. Results suggest that while the association between GDP per capita and CO2 emissions per capita is weakening over time, it remains positive globally, with only some high-income countries showing a reversed association in recent years. While 49 countries have decoupled emissions from economic growth, 115 have not. Most African, American, and Asian countries have not decoupled, whereas most European and Oceanian countries have. These findings highlight the urgency for effective climate policies because decoupling remains unachieved on a global scale, and we are moving away from, rather than approaching, the Paris Agreement goal of limiting temperature increase to 1.5 °C above preindustrial levels.






  • Article type: Journal Article
    The breeder's equation, $\Delta \bar{z} = G\beta$, allows us to understand how genetics (the genetic covariance matrix, G) and the vector of linear selection gradients β interact to generate evolutionary trajectories. Estimation of β using multiple regression of relative fitness on trait values revolutionized the way we study selection in laboratory and wild populations. However, multicollinearity, or correlation of predictors, can lead to very high variances of and covariances between elements of β, posing a challenge for the interpretation of the parameter estimates. This is particularly relevant in the era of big data, where the number of predictors may approach or exceed the number of observations. A common approach to multicollinear predictors is to discard some of them, thereby losing any information that might be gained from those traits. Using simulations, we show how, on the one hand, multicollinearity can result in inaccurate estimates of selection, and, on the other, how the removal of correlated phenotypes from the analyses can provide a misguided view of the targets of selection. We show that regularized regression, which places data-validated constraints on the magnitudes of individual elements of β, can produce more accurate estimates of the total strength and direction of multivariate selection in the presence of multicollinearity and limited data, and often has little cost when multicollinearity is low. We also compare standard and regularized regression estimates of selection in a reanalysis of three published case studies, showing that regularized regression can improve fitness predictions in independent data. Our results suggest that regularized regression is a valuable complement to traditional least-squares estimates of selection. In some cases, its use can lead to improved predictions of individual fitness and improved estimates of the total strength and direction of multivariate selection.
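
    The contrast between least-squares and regularized estimates of β can be sketched as below. This is an illustration under stated assumptions, not the authors' analysis: ridge regression stands in for "regularized regression", the penalty `lam` is fixed by hand rather than chosen by cross-validation, and the traits, fitness values, and matrix `G` are all synthetic.

```python
import numpy as np

def selection_gradients(Z, w, lam=0.0):
    """Estimate beta by regressing relative fitness w on centered traits Z.

    lam = 0 gives ordinary least squares; lam > 0 gives ridge regression.
    """
    Zc = Z - Z.mean(axis=0)
    wc = w - w.mean()
    p = Zc.shape[1]
    return np.linalg.solve(Zc.T @ Zc + lam * np.eye(p), Zc.T @ wc)

# Synthetic data (hypothetical): two almost collinear traits.
rng = np.random.default_rng(1)
n = 50
z1 = rng.normal(size=n)
z2 = z1 + 0.05 * rng.normal(size=n)
Z = np.column_stack([z1, z2])
true_beta = np.array([0.2, 0.0])          # selection acts on trait 1 only
w = 1.0 + Z @ true_beta + 0.1 * rng.normal(size=n)
w = w / w.mean()                          # relative fitness has mean 1

beta_ols = selection_gradients(Z, w)
beta_ridge = selection_gradients(Z, w, lam=5.0)

# Predicted response to selection from the breeder's equation, delta z = G beta.
G = np.array([[1.0, 0.8], [0.8, 1.0]])    # hypothetical genetic covariance matrix
print("OLS   beta:", beta_ols, "response:", G @ beta_ols)
print("ridge beta:", beta_ridge, "response:", G @ beta_ridge)
```

    The ridge penalty shrinks the coefficient vector toward zero, which is what limits the inflated, offsetting estimates that collinear traits can produce under least squares.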






  • Article type: Journal Article
    The mixture of probabilistic regression models is one of the most common techniques to incorporate the information of covariates into learning of the population heterogeneity. Despite its flexibility, unreliable estimates can occur due to multicollinearity among covariates. In this paper, we develop Liu-type shrinkage methods through an unsupervised learning approach to estimate the model coefficients in the presence of multicollinearity. We evaluate the performance of our proposed methods via classification and stochastic versions of the expectation-maximization algorithm. We show using numerical simulations that the proposed methods outperform their Ridge and maximum likelihood counterparts. Finally, we apply our methods to analyze the bone mineral data of women aged 50 and older.






  • Article type: Journal Article
    In this article, we define a mixed predictor and a stochastic restricted ridge predictor for partially linear mixed measurement error models by taking advantage of kernel approximation. Under the matrix mean square error criterion, we compare the superiority of linear combinations of the newly defined predictors. We then investigate their asymptotic normality and the case in which the covariance matrix of the measurement errors is unknown. Finally, the study concludes with a Monte Carlo simulation and an application to COVID-19 data.






  • Article type: Journal Article
    Aeromagnetic surveys are widely used in geological exploration, mineral resource assessment, environmental monitoring, military reconnaissance, and other areas. It is necessary to perform magnetic compensation for interference in these fields. In recent years, large unmanned aerial vehicles (UAVs) have become more suitable for magnetic detection missions because of the greater loads they can carry. This article proposes some methods for the magnetic compensation of large multi-load UAVs. Because of the interference of the large platform and instrument noise, the standard deviations (stds) of the compensation data used in this paper are larger. At the beginning of this article, using the traditional T-L model, we avoid the shortcomings of the anti-magnetic interference ability of triaxial fluxgate magnetometers. The direction cosine information is obtained by using an inertial navigation system, the global positioning system, and a triaxial fluxgate magnetometer. Then, we increase the amplitude of the maneuvers in the compensation process; this reduces the multicollinearity problems in the compensation matrix to a certain extent, but it also results in greater magnetic field interference. Lastly, we employ the method of Lasso regularization Newton iteration (LRNM). Compared to the traditional methods of least squares (LS) and singular value decomposition (SVD), LRNM provides improvements of 34% and 27%, respectively. In summary, this series of schemes can be used to perform effective compensation for large multi-load UAVs and improve their practical use, making aeromagnetic survey measurements more accurate.






  • Article type: Journal Article
    BACKGROUND: The risk of a woman dying as a result of pregnancy or childbirth during her lifetime is about one in six in the poorest parts of the world.
    OBJECTIVE: The present study aims to determine the prevalence of maternal risk and the influencing variables among ever-married women belonging to the reproductive age group (15-49) of Birbhum district, West Bengal.
    METHODS: A cohort-based retrospective cross-sectional study was carried out among a sample of 229 respondents through a purposive stratified random sampling method and a pre-designed semi-structured questionnaire. The ordinal logistic regression (OLR) model was used as the assessment tool. Before developing the proportional OLR model, we checked for multicollinearity among the predictors and evaluated first-order effect modification. We performed data analysis using SPSS version 26.
    RESULTS: The results show that women who are illiterate (odds ratio [OR] = 2.81, 95% CI, 0.277-1.791), have a lower standard of living (OR = 1.14, 95% CI, -0.845-1.116), or married before the age of 15 years (OR = 21.96, 95% CI, -0.55-6.73) or between the ages of 15 and 18 years (OR = 24.51, 95% CI, -0.45-6.85) are more likely to be affected by a higher concentration of maternal risk. Another important predictor is the time of pregnancy registration. Regarding transport and related en-route causalities, the results show clearly that distance and travel time are significant factors in determining the concentration of maternal risk.
    CONCLUSIONS: Incidences of child marriages should be restricted. Eradicating factors influencing an individual's decision to seek care would be an essential contribution in excluding the dominant maternal risk factors.
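
    Because an odds ratio is always positive, intervals with negative lower bounds cannot be on the OR scale. The reported intervals appear instead to be on the log-odds (coefficient) scale, since each midpoint is close to ln(OR). The arithmetic below checks this reading and, assuming it is correct, exponentiates the bounds; the labels are shorthand introduced here and the back-transformed intervals are not stated in the abstract.

```python
import math

# (label, reported OR, reported interval) copied from the RESULTS paragraph.
reported = [
    ("illiterate", 2.81, (0.277, 1.791)),
    ("lower standard of living", 1.14, (-0.845, 1.116)),
    ("married before 15", 21.96, (-0.55, 6.73)),
    ("married at 15-18", 24.51, (-0.45, 6.85)),
]

rows = []
for label, odds_ratio, (lo, hi) in reported:
    log_or = math.log(odds_ratio)      # coefficient implied by the OR
    midpoint = (lo + hi) / 2           # center of the reported interval
    # Assumption: the interval is for the coefficient, so exponentiating
    # its bounds gives an interval on the OR scale.
    or_lo, or_hi = math.exp(lo), math.exp(hi)
    rows.append((label, log_or, midpoint, or_lo, or_hi))
    print(f"{label}: ln(OR)={log_or:.2f}, midpoint={midpoint:.2f}, "
          f"OR-scale interval=({or_lo:.2f}, {or_hi:.2f})")
```

    Under this reading, an interval on the OR scale contains 1 exactly when the coefficient interval contains 0.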






  • Article type: Journal Article
    An important goal in systems neuroscience is to understand the structure of neuronal interactions, frequently approached by studying functional relations between recorded neuronal signals. Commonly used pairwise measures (e.g., correlation coefficient) offer limited insight, neither addressing the specificity of estimated neuronal interactions nor potential synergistic coupling between neuronal signals. Tripartite measures, such as partial correlation, variance partitioning, and partial information decomposition, address these questions by disentangling functional relations into interpretable information atoms (unique, redundant, and synergistic). Here, we apply these tripartite measures to simulated neuronal recordings to investigate their sensitivity to noise. We find that the considered measures are mostly accurate and specific for signals with noiseless sources but experience significant bias for noisy sources. We show that permutation testing of such measures results in high false positive rates even for small noise fractions and large data sizes. We present a conservative null hypothesis for significance testing of tripartite measures, which significantly decreases the false positive rate at a tolerable expense of increasing the false negative rate. We hope our study raises awareness about the potential pitfalls of significance testing and of interpretation of functional relations, offering both conceptual and practical advice.
    Tripartite functional relation measures enable the study of interesting effects in neural recordings, such as redundancy, functional connection specificity, and synergistic coupling. However, estimators of such relations are commonly validated using noiseless signals, whereas neural recordings typically contain noise. Here we systematically study the performance of tripartite estimators using simulated noisy neural signals. We demonstrate that permutation testing is not a robust procedure for inferring ground truth statistical relations from commonly used tripartite relation estimators. We develop an adjusted conservative testing procedure, reducing false positive rates of the studied estimators when applied to noisy data. Besides addressing significance testing, our results should aid in accurate interpretation of tripartite functional relations and functional connectivity.
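
    The effect of a noisy source on one tripartite measure, partial correlation, can be illustrated with a toy simulation. This is a sketch under assumptions made here (Gaussian signals, a single shared source, shuffling `x` as the permutation scheme); it is not the authors' simulation setup and does not implement their adjusted conservative test.

```python
import numpy as np

def partial_corr(x, y, z):
    """First-order partial correlation of x and y, controlling for z."""
    r = np.corrcoef([x, y, z])
    r_xy, r_xz, r_yz = r[0, 1], r[0, 2], r[1, 2]
    return (r_xy - r_xz * r_yz) / np.sqrt((1 - r_xz**2) * (1 - r_yz**2))

def permutation_pvalue(x, y, z, n_perm=200, seed=0):
    """Two-sided p-value from shuffling x, which breaks its relation to y and z."""
    rng = np.random.default_rng(seed)
    observed = abs(partial_corr(x, y, z))
    null = [abs(partial_corr(rng.permutation(x), y, z)) for _ in range(n_perm)]
    return (1 + sum(v >= observed for v in null)) / (1 + n_perm)

# x and y are related only through a shared source (hypothetical signals).
rng = np.random.default_rng(42)
n = 2000
source = rng.normal(size=n)
x = source + 0.5 * rng.normal(size=n)
y = source + 0.5 * rng.normal(size=n)
z_noisy = source + 0.5 * rng.normal(size=n)   # noisy recording of the source

pc_clean = partial_corr(x, y, source)
pc_noisy = partial_corr(x, y, z_noisy)
p_noisy = permutation_pvalue(x, y, z_noisy)
print("corr(x, y):", np.corrcoef(x, y)[0, 1])
print("partial corr given noiseless source:", pc_clean)
print("partial corr given noisy source:    ", pc_noisy)
print("permutation p-value (noisy source): ", p_noisy)
```

    Conditioning on the noiseless source removes the x-y relation, whereas conditioning on its noisy recording leaves a residual that a permutation test flags as significant, even though x and y share nothing beyond the source.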






  • Article type: Journal Article
    In multilevel models, disaggregating predictors into level-specific parts (typically accomplished via centering) benefits parameter estimates and their interpretations. However, the importance of level-specificity has been sparsely addressed in multilevel literature concerning collinearity. In this study, we develop novel insights into the interactivity of centering and collinearity in multilevel models. After integrating the broad literatures on centering and collinearity, we review level-specific and conflated correlations in multilevel data. Next, by deriving formal relationships between predictor collinearity and multilevel model estimates, we demonstrate how the consequences of collinearity change across different centering specifications and identify data characteristics that may exacerbate or mitigate those consequences. We show that when all or some level-1 predictors are uncentered, slope estimates can be greatly biased by collinearity. Disaggregation of all predictors eliminates the possibility that fixed effect estimates will be biased due to collinearity alone; however, under some data conditions, collinearity is associated with biased standard errors and random effect (co)variance estimates. Finally, we illustrate the importance of disaggregation for diagnosing collinearity in multilevel data and provide recommendations for the use of level-specific collinearity diagnostics. Overall, the necessity of disaggregation for identifying and managing collinearity's consequences in multilevel models is clarified in novel ways.
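
    Disaggregation by cluster-mean centering, and the difference between conflated and level-specific correlations, can be sketched as follows. The data, the cluster structure, and the helper name `disaggregate` are assumptions made for illustration and do not come from the study.

```python
import numpy as np

def disaggregate(x, cluster):
    """Split a level-1 predictor into a between-cluster part (cluster means)
    and a within-cluster part (deviations from the cluster mean)."""
    x = np.asarray(x, dtype=float)
    cluster = np.asarray(cluster)
    between = np.empty_like(x)
    for c in np.unique(cluster):
        mask = cluster == c
        between[mask] = x[mask].mean()
    return between, x - between

# Synthetic two-level data (hypothetical): 20 clusters of 10 observations.
rng = np.random.default_rng(7)
cluster = np.repeat(np.arange(20), 10)
u = rng.normal(size=20)[cluster]                 # shared cluster-level component
x1 = u + rng.normal(size=cluster.size)
x2 = u - 0.5 * (x1 - u) + rng.normal(size=cluster.size)

b1, w1 = disaggregate(x1, cluster)
b2, w2 = disaggregate(x2, cluster)
r_conflated = np.corrcoef(x1, x2)[0, 1]
r_within = np.corrcoef(w1, w2)[0, 1]
r_between = np.corrcoef(b1[::10], b2[::10])[0, 1]   # one mean per cluster
print("conflated:", r_conflated)
print("within:   ", r_within)
print("between:  ", r_between)
```

    In this construction the predictors are positively related between clusters and negatively related within clusters, so the conflated correlation of the raw variables describes neither level well.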






  • Article type: Journal Article
    Family cohesion and parental monitoring promote Latino adolescents' positive adjustment. For Latino immigrant families, these parenting processes tend to be interdependent due to shared roots in cultural values emphasizing family togetherness and parental authority. This covariance poses a significant methodological problem with respect to multicollinearity. The present article uses a novel technique (residual centering) to remove shared variance among family cohesion and parental monitoring constructs and, in turn, to identify how the unique variance of each is associated with Latino adolescent adjustment. Participants include 249 9th and 10th graders in Mexican and Central American immigrant families. We compared findings from structural equation models in which parenting constructs were examined simultaneously with residual-centered models, in which shared variance among parenting constructs was removed for each parenting variable. Findings from residual-centered models revealed that parents' monitoring of youth's daily activities was associated with less alcohol use and fewer youth depressive symptoms, and that parents' monitoring of youth's peer activities outside the home was associated with less marijuana use and more depressive symptoms. Family cohesion was unrelated to Latino youth outcomes in residual-centered models. By isolating specific, "pure" parenting effects, residual centering can clarify the ways in which family cohesion and parental monitoring behaviors matter for Latino adolescents' adjustment.
    Spanish abstract (translated): Family cohesion and parental supervision promote the well-being of Latino adolescents. For immigrant families, these parenting processes are interdependent because the values of unity and authority within the family are both cultural. This covariance is a methodological problem because it causes multicollinearity. This study uses an innovative technique (residual centering) to resolve the problem of covariance between the constructs of family cohesion and parental supervision and, in this way, to identify how the unique variance of each construct is associated with adolescent adjustment. Participants were 249 adolescents in 9th and 10th grade from immigrant families from Mexico and Central America. We compared the results of structural equation models in which the two constructs were examined simultaneously with residual-centering models in which the constructs were examined independently. According to the residual-centering models, supervision of children's daily activities is associated with less alcohol use and fewer depressive symptoms, and supervision of activities outside the home is associated with less marijuana use but more depressive symptoms. However, family cohesion was not associated with adolescent adjustment. By separating the effects of the constructs, the residual-centering technique can clarify the unique impact of family cohesion and parental supervision on adolescents.
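
    The core step of residual centering, replacing a predictor with the residuals from regressing it on the predictor it overlaps with, can be sketched as below. This is a simplified illustration on synthetic observed scores; the study applied the idea within structural equation models, and the variable names, simulated values, and helper `residual_center` are assumptions introduced here.

```python
import numpy as np

def residual_center(target, other):
    """Return the part of `target` that is linearly unrelated to `other`
    (residuals from an OLS regression of target on other, with intercept)."""
    design = np.column_stack([np.ones_like(other), other])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef

# Synthetic scores (hypothetical) for two overlapping parenting constructs.
rng = np.random.default_rng(3)
n = 249
shared = rng.normal(size=n)
cohesion = shared + 0.5 * rng.normal(size=n)
monitoring = shared + 0.5 * rng.normal(size=n)

monitoring_unique = residual_center(monitoring, cohesion)
cohesion_unique = residual_center(cohesion, monitoring)

r_raw = np.corrcoef(cohesion, monitoring)[0, 1]
r_resid = np.corrcoef(cohesion, monitoring_unique)[0, 1]
print("raw correlation:              ", r_raw)
print("after residual centering:     ", r_resid)
```

    The residualized score is uncorrelated with the other construct by construction, so any association it shows with an outcome reflects only that predictor's unique variance.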





