Depression prediction

  • Article type: Journal Article
    In contemporary society, depression has emerged as a prominent mental disorder that exhibits exponential growth and exerts a substantial influence on premature mortality. Although numerous studies have applied machine learning methods to forecast signs of depression, only a limited number have treated the severity level as a multiclass variable. Moreover, data are rarely distributed equally across all classes in practice, so the resulting class imbalance across multiple classes is a substantial challenge in this domain. This research therefore emphasizes the significance of addressing class imbalance in the multiclass setting. We introduce a new approach, Feature Group Partitioning (FGP), in the data preprocessing phase, which effectively reduces the dimensionality of the features to a minimum. The study used synthetic oversampling techniques, specifically the Synthetic Minority Over-sampling Technique (SMOTE) and Adaptive Synthetic sampling (ADASYN), for class balancing. The dataset was collected from university students by administering the Burns Depression Checklist (BDC). For modeling, we implemented heterogeneous ensemble learning (stacking), homogeneous ensemble learning (bagging), and five distinct supervised machine learning algorithms. Overfitting was mitigated by evaluating accuracy on the training, validation, and testing datasets. To assess the effectiveness of the prediction models, balanced accuracy, sensitivity, specificity, precision, and F1-score are used. A comprehensive analysis demonstrates the difference between the Conventional Depression Screening (CDS) and FGP approaches. The results show that the stacking classifier for FGP with SMOTE yields the highest balanced accuracy, 92.81%. The empirical evidence demonstrates that the FGP approach, when combined with SMOTE, produces better performance in predicting the severity of depression. Most importantly, the optimization of training time under the FGP approach for all of the classifiers is a significant achievement of this research.
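The class-balancing step and the balanced-accuracy metric named in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy two-feature data, and the labels are hypothetical, and the oversampler only reproduces the core SMOTE idea (interpolating between a minority sample and one of its nearest same-class neighbours) using the standard library.

```python
import random
from collections import Counter


def smote_like_oversample(X, y, k=3, seed=0):
    """Balance classes by interpolating between a minority sample and one of
    its k nearest same-class neighbours (the core idea behind SMOTE)."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        members = [x for x, lab in zip(X, y) if lab == label]
        if len(members) < 2:
            continue  # nothing to interpolate with
        for _ in range(target - n):
            base = rng.choice(members)
            # k nearest neighbours of `base` inside the same class
            neighbours = sorted(
                (m for m in members if m is not base),
                key=lambda m: sum((a - b) ** 2 for a, b in zip(base, m)),
            )[:k]
            other = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> other
            X_out.append([a + gap * (b - a) for a, b in zip(base, other)])
            y_out.append(label)
    return X_out, y_out


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every severity class counts equally."""
    recalls = []
    for label in set(y_true):
        idx = [i for i, t in enumerate(y_true) if t == label]
        recalls.append(sum(y_pred[i] == label for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical toy data: three severity classes with 6, 3, and 2 samples.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.2, 0.4], [0.4, 0.1],
     [2.0, 2.1], [2.2, 1.9], [1.9, 2.3],
     [4.0, 4.2], [4.3, 3.9]]
y = ["none"] * 6 + ["mild"] * 3 + ["severe"] * 2

X_bal, y_bal = smote_like_oversample(X, y, k=2, seed=1)
print(Counter(y_bal))
```

In practice, oversampling is normally applied to the training split only, so that synthetic samples do not leak into validation or test data.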






  • Article type: Journal Article
    The COVID-19 pandemic has exacerbated mental health challenges, particularly depression among college students. Detecting at-risk students early is crucial but remains challenging, particularly in developing countries. Utilizing data-driven predictive models presents a viable solution to address this pressing need.
    The objectives were: 1) to develop and compare machine learning (ML) models for predicting depression in Argentinean students during the pandemic; 2) to assess the performance of classification and regression models using appropriate metrics; and 3) to identify key features driving depression prediction.
    A longitudinal dataset (N = 1492 college students) captured T1 and T2 measurements during the Argentinean COVID-19 quarantine. ML models, including linear logistic regression classifiers/ridge regression (LogReg/RR), random forest classifiers/regressors, and support vector machines/regressors (SVM/SVR), are employed. Assessed features encompass depression and anxiety scores (at T1), mental disorder/suicidal behavior history, quarantine sub-period information, sex, and age. For classification, models' performance on test data is evaluated using Area Under the Precision-Recall Curve (AUPRC), Area Under the Receiver Operating Characteristic curve, Balanced Accuracy, F1 score, and Brier loss. For regression, R-squared (R2), Mean Absolute Error, and Mean Squared Error are assessed. Univariate analyses are conducted to assess the predictive strength of each individual feature with respect to the target variable. The performance of multi- vs univariate models is compared using the mean AUPRC score for classifiers and the R2 score for regressors.
    The highest performance is achieved by SVM and LogReg (e.g., AUPRC: 0.76, 95% CI: 0.69, 0.81) and SVR and RR models (e.g., R2 for SVR and RR: 0.56, 95% CI: 0.45, 0.64 and 0.45, 0.63, respectively). Univariate models, particularly LogReg and SVM using depression (AUPRC: 0.72, 95% CI: 0.64, 0.79) or anxiety scores (AUPRC: 0.71, 95% CI: 0.64, 0.78) and RR using depression scores (R2: 0.48, 95% CI: 0.39, 0.57) exhibit performance levels close to those of the multivariate models, which include all features.
    These findings highlight the relevance of pre-existing depression and anxiety conditions in predicting depression during quarantine, underscoring their comorbidity. ML models, particularly SVM/SVR and LogReg/RR, demonstrate potential in the timely detection of at-risk students. However, further studies are needed before clinical implementation.
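Two of the classification metrics listed in this abstract, AUPRC and Brier loss, can be computed by hand as in the sketch below. It assumes binary labels (1 = depressed) and distinct predicted probabilities, and it estimates AUPRC with the step-wise "average precision" sum, which is one common estimator rather than necessarily the one the authors used. The function names and the five example predictions are hypothetical.

```python
def brier_loss(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


def average_precision(y_true, y_score):
    """Step-wise area under the precision-recall curve.

    Walks down the ranking by score; every time a positive is found, recall
    rises by 1 / n_pos and the precision at that rank is added with that weight.
    Ties between scores are not handled specially in this sketch.
    """
    order = sorted(range(len(y_score)), key=lambda i: -y_score[i])
    n_pos = sum(y_true)
    true_pos = 0
    area = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_pos += 1
            area += (true_pos / rank) / n_pos
    return area


# Hypothetical predictions for five students.
y_true = [1, 0, 1, 0, 1]
y_prob = [0.9, 0.8, 0.7, 0.3, 0.2]

print(round(average_precision(y_true, y_prob), 4))  # 0.7556
print(round(brier_loss(y_true, y_prob), 4))         # 0.294
```

A useful reference point when reading AUPRC values is that a classifier with no skill scores roughly the positive-class prevalence, not 0.5 as with the ROC curve.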






  • Article type: Journal Article
    Depression has become the prevailing global mental health concern. The accuracy of traditional depression diagnosis methods is affected by diverse factors, making primary identification a complex task. It is therefore imperative to develop a method for depression identification that is both objective and effective. Current research underscores notable disparities in brain activity between individuals with depression and those without. The electroencephalogram (EEG), as a biologically reflective and easily accessible signal, is widely used to diagnose depression. This article introduces a depression prediction strategy that merges time-frequency complexity and electrode spatial topology to aid in depression diagnosis. First, time-frequency complexity and temporal features of the EEG signal are extracted to generate node features for a graph convolutional network. Next, the brain network adjacency matrix is calculated from channel correlation. The final depression classification is achieved by training and validating a graph convolutional network with the graph node features and the correlation-based brain network adjacency matrix. The proposed strategy has been validated using two publicly available EEG datasets, MODMA and PRED+CT, achieving accuracy rates of 98.30% and 96.51%, respectively. These outcomes affirm the reliability and utility of the proposed strategy in predicting depression using EEG signals. Additionally, the findings substantiate the effectiveness of EEG time-frequency complexity characteristics as valuable biomarkers for depression prediction.
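The graph construction described in this abstract can be sketched as follows. The abstract does not specify the correlation measure or the normalization, so the sketch assumes absolute Pearson correlation between channels as edge weights and the widely used symmetric normalization D^(-1/2) A D^(-1/2) for a single, weight-free propagation step. All function names and the three short toy channels are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equally long signals."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def correlation_adjacency(channels):
    """Adjacency matrix whose entry (i, j) is |corr(channel_i, channel_j)|.
    The diagonal is 1, which acts as a self-loop for every electrode."""
    n = len(channels)
    return [[abs(pearson(channels[i], channels[j])) for j in range(n)]
            for i in range(n)]


def gcn_propagate(adj, features):
    """One graph-convolution step without learned weights:
    D^(-1/2) A D^(-1/2) X, mixing each node's features with its neighbours'."""
    n = len(adj)
    deg = [sum(row) for row in adj]
    norm = [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    n_feat = len(features[0])
    return [[sum(norm[i][k] * features[k][f] for k in range(n))
             for f in range(n_feat)] for i in range(n)]


# Hypothetical toy recording: 3 channels, 5 samples each.
channels = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.0, 6.0, 8.0, 10.5],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
# Hypothetical node features (e.g. two complexity measures per channel).
node_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

adj = correlation_adjacency(channels)
hidden = gcn_propagate(adj, node_features)
print(hidden)
```

A trainable graph convolutional layer would additionally multiply the propagated features by a learned weight matrix and apply a nonlinearity.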






  • Article type: Journal Article
    The majority of people in the modern world struggle with depression as a result of the coronavirus pandemic's impact, which has adversely affected mental health without warning. Even though the majority of individuals are still protected, it is crucial to check for post-coronavirus symptoms if someone is feeling a little lethargic. The recommended approach is intended to identify the post-coronavirus symptoms and attacks that are present in the human body. When a harmful virus spreads inside a human body, the post-diagnosis symptoms are considerably more dangerous, and if they are not recognised at an early stage, the risks increase. Additionally, if the post-symptoms are severe and go untreated, they might harm one's mental health. In order to prevent someone from succumbing to depression, the technology of audio prediction is employed to recognise all the symptoms and potentially dangerous signs. Different choral characters are used to combine machine-learning algorithms to determine each person's mental state. Design considerations are made for a separate device that detects audio attribute outputs in order to evaluate the effectiveness of the suggested technique; compared to the previous method, the performance metric is substantially better by roughly 67%.






  • Article type: Journal Article
    Depression is one of the most common mental health illnesses. The biggest obstacle lies in efficient and early detection of the disorder. Self-report questionnaires are the instruments used by medical experts to reach a diagnosis. These questionnaires were designed by analyzing different depressive symptoms. However, factors such as social stigma negatively affect the success of traditional methods. This paper presents a novel approach for automatically estimating the degree of depression in social media users. In this regard, we addressed the eRisk 2020 task "Measuring the Severity of the Signs of Depression", an initiative of the CLEF conference. We aimed to explore neural language models to exploit different aspects of the subject's writings depending on the symptom to capture. We devised two distinct methods based on the symptoms' sensitivity, in terms of users' willingness to comment on them publicly. The first exploits users' general language based on their publications. The second seeks more direct evidence from publications that specifically mention symptom concerns. Both methods automatically estimate the Beck Depression Inventory (BDI-II) total score. To evaluate our proposals, we used benchmark Reddit data for depression severity estimation. Our findings showed that approaches based on neural language models are a feasible alternative for estimating depression rating scales, even when small amounts of training data are available.
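How per-symptom estimates turn into the BDI-II total score mentioned in this abstract can be shown with a small sketch. It relies only on the structure of the questionnaire (21 items, each answered on a 0 to 3 scale, giving a total between 0 and 63). The function name and the example estimates are hypothetical, and rounding plus clipping of model outputs is an assumption, not the authors' documented procedure.

```python
N_ITEMS = 21            # the BDI-II has 21 symptom items
MIN_ANSWER, MAX_ANSWER = 0, 3


def bdi_total(item_estimates):
    """Sum per-item estimates into a total score in the range 0..63.

    Each estimate (for example, a model output for one symptom) is rounded
    to the nearest integer and clipped to the valid answer range first.
    """
    if len(item_estimates) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} item estimates")
    answers = [min(MAX_ANSWER, max(MIN_ANSWER, round(v))) for v in item_estimates]
    return sum(answers)


# Hypothetical model outputs for one user: mostly low, a few elevated items.
estimates = [0.2, 1.1, 0.0, 2.8, 1.4, 0.3, 0.9] * 3
print(bdi_total(estimates))  # 18
```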






  • Article type: Journal Article
    With the impact of the COVID-19 pandemic, the number of patients suffering from depression is rising around the world. It is important to diagnose depression early so that it may be treated as soon as possible. The self-response questionnaire, which has been used to diagnose depression in hospitals, is impractical since it requires active patient engagement. Therefore, it is vital to have a system that predicts depression automatically and recommends treatment. In this paper, we propose a smartphone-based depression prediction system. In addition, we propose depressive features based on multimodal sensor data for predicting depressive mood. The multimodal depressive features were designed based on depression symptoms defined in the Diagnostic and Statistical Manual of Mental Disorders (DSM-5). The proposed system comprises a "Mental Health Protector" application that collects data from smartphones and a big data-based cloud platform that processes large amounts of data. We recruited 106 patients with mental disorders and collected smartphone sensor data and self-reported questionnaires from their smartphones using the proposed system. Finally, we evaluated the performance of the proposed system's prediction of depression. For the test dataset, 27 of the 106 participants were selected randomly. The proposed system achieved an F1-score of 76.92% for the 16 patients with depression, of whom 15 (93.75%) were correctly predicted. Unlike previous studies, the proposed method is highly adaptable in that it uses only smartphones, and it is distinctive in evaluating prediction accuracy against clinical diagnosis.
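The figures reported in this abstract are enough to reconstruct a plausible confusion matrix, which makes the gap between 93.75% recall and a 76.92% F1-score easier to read. The sketch below assumes a binary task with depression as the positive class and that both figures refer to the same 27-participant test set. The false-positive and true-negative counts are inferred here from the reported numbers; they are not stated in the abstract.

```python
n_test = 27          # test participants (reported)
n_depressed = 16     # test participants with depression (reported)
true_pos = 15        # depressed participants predicted correctly (reported)
reported_f1 = 0.7692

false_neg = n_depressed - true_pos        # 1
recall = true_pos / n_depressed           # 0.9375

# F1 = 2*TP / (2*TP + FP + FN)  =>  FP = 2*TP / F1 - 2*TP - FN
false_pos = round(2 * true_pos / reported_f1 - 2 * true_pos - false_neg)
true_neg = n_test - n_depressed - false_pos

precision = true_pos / (true_pos + false_pos)
f1 = 2 * precision * recall / (precision + recall)

print(f"FP={false_pos}, TN={true_neg}")
print(f"recall={recall:.4f}, precision={precision:.4f}, F1={f1:.4f}")
```

Under these assumptions, 8 of the 11 non-depressed test participants would have been flagged as depressed, which explains why precision (about 65%) pulls the F1-score well below the recall.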






  • Article type: Journal Article
    Depression has gradually become the most common mental disorder in the world. The accuracy of its diagnosis may be affected by many factors, and the primary diagnosis is difficult to define. Finding a way to identify depression that is both objective and effective is an urgent issue. In this paper, a strategy for predicting depression based on spatiotemporal features is proposed, which is expected to be used in the auxiliary diagnosis of depression. First, electroencephalogram (EEG) signals were denoised with a filter to obtain the power spectra of three frequency ranges: Theta, Alpha, and Beta. Using orthogonal projection, the spatial positions of the electrodes were mapped to the brain power spectrum, thereby obtaining three brain maps with spatial information. Then, the three brain maps were superimposed into a new brain map with frequency-domain and spatial characteristics. A Convolutional Neural Network (CNN) and a Gated Recurrent Unit (GRU) were applied to extract the sequential features. The proposed strategy was validated with a public EEG dataset, achieving an accuracy of 89.63%, and with a private dataset, achieving an accuracy of 88.56%. The network had low complexity, with only six layers. The results show that our strategy is credible, less complex, and useful in predicting depression using EEG signals.
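The first processing step in this abstract, obtaining Theta, Alpha, and Beta power for a channel, can be sketched with a plain discrete Fourier transform. The band edges used below (4 to 8 Hz, 8 to 13 Hz, 13 to 30 Hz) are common conventions and an assumption here, since the abstract does not give them. The function name, the sampling rate, and the synthetic test signal are also hypothetical. The sketch does not cover the electrode projection or the CNN/GRU stages.

```python
import cmath
import math

# Conventional EEG band edges in Hz (assumed, not taken from the abstract).
BANDS = {"theta": (4.0, 8.0), "alpha": (8.0, 13.0), "beta": (13.0, 30.0)}


def band_powers(signal, fs):
    """Sum the DFT power of one channel inside each frequency band."""
    n = len(signal)
    powers = {name: 0.0 for name in BANDS}
    for k in range(1, n // 2):            # positive frequencies, skipping DC
        freq = k * fs / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(signal))
        power = abs(coeff) ** 2 / n
        for name, (low, high) in BANDS.items():
            if low <= freq < high:
                powers[name] += power
    return powers


# Synthetic one-second channel: strong 10 Hz (alpha) plus weak 20 Hz (beta).
fs = 128
signal = [math.sin(2 * math.pi * 10 * t / fs)
          + 0.3 * math.sin(2 * math.pi * 20 * t / fs)
          for t in range(fs)]

powers = band_powers(signal, fs)
print({name: round(value, 2) for name, value in powers.items()})
```

Repeating this per electrode yields one value per band and channel, which is the kind of quantity that could then be placed at each electrode's projected position to form a per-band brain map.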






  • Article type: Journal Article
    Recent studies have demonstrated that geographic location features collected using smartphones can be a powerful predictor for depression. While location information can be conveniently gathered by GPS, typical datasets suffer from significant periods of missing data due to various factors (e.g., phone power dynamics, limitations of GPS). A common approach is to remove the time periods with significant missing data before data analysis. In this paper, we develop an approach that fuses location data collected on smartphones from two sources, GPS and WiFi association records, and evaluate its performance using a dataset collected from 79 college students. Our evaluation demonstrates that our data fusion approach leads to significantly more complete data. In addition, the features extracted from the more complete data present stronger correlation with self-reported depression scores, and lead to depression prediction with much higher F1 scores (up to 0.76, compared to 0.5 before data fusion). We further investigate the scenario of including an additional data source, i.e., the data collected from a WiFi network infrastructure. Our results show that, while the additional data source leads to even more complete data, the resultant F1 scores are similar to those obtained when using only the location data (i.e., GPS and WiFi association records) from the phones.
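The idea of fusing GPS fixes with WiFi association records to fill gaps can be sketched as below. The abstract does not describe the fusion rule, so the sketch assumes a simple one: time is divided into fixed slots, a GPS fix is used when present, and otherwise the known position of the associated access point stands in. The function names, access-point identifiers, and coordinates are all hypothetical.

```python
def fuse_locations(gps, wifi, ap_coords, n_slots):
    """Build one location per time slot from two sources.

    gps:       {slot: (lat, lon)} GPS fixes
    wifi:      {slot: ap_id} access point the phone was associated with
    ap_coords: {ap_id: (lat, lon)} known access-point positions
    """
    fused = {}
    for slot in range(n_slots):
        if slot in gps:
            fused[slot] = gps[slot]                 # prefer the GPS fix
        elif slot in wifi and wifi[slot] in ap_coords:
            fused[slot] = ap_coords[wifi[slot]]     # fall back to the AP position
    return fused


def completeness(samples, n_slots):
    """Fraction of time slots that have a location."""
    return len(samples) / n_slots


# Hypothetical data for six time slots.
gps = {0: (41.8070, -72.2530), 1: (41.8072, -72.2528), 4: (41.8100, -72.2500)}
wifi = {1: "ap-library", 2: "ap-library", 3: "ap-dorm", 5: "ap-unknown"}
ap_coords = {"ap-library": (41.8066, -72.2519), "ap-dorm": (41.8110, -72.2560)}

fused = fuse_locations(gps, wifi, ap_coords, n_slots=6)
print(completeness(gps, 6), completeness(fused, 6))
```

Slot 5 stays empty because its access point has no known position, which mirrors the point that fusion improves completeness without necessarily reaching 100%.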







  • Article type: Journal Article
    Developing a machine-learning-based depression prediction method that uses information from long-term recordings is important for, and challenging in, the clinical diagnosis of depression.
    We developed a novel two-stage feature selection algorithm applied to the high-dimensional (over thirty thousand) features constructed by a context-aware analysis of the DAIC-WOZ data set, including audio, video, and semantic features. The prediction performance was compared with seven reference models. The preferred topics and the feature categories related to the retained features were also analyzed.
    Parsimonious subsets (tens of features) were selected by the proposed method in each prediction case. We obtained the best performance in depression classification, with an F1-score of 0.96 (0.67), Precision of 1.00 (0.63), and Recall of 0.92 (0.71) on the development set (test set). We also achieved promising results in depression severity estimation, with an RMSE of 4.43 (5.11) and an MAE of 3.22 (3.98), a marginal difference from the best reference model (random forest with 'Selected-Text' features). The five most important topics related to depression were revealed. The audio features were predominant over the other feature categories in depression classification, while the contributions of the three feature categories to severity estimation were almost equal.
    The database we used should be extended with more depression samples. The second stage of feature selection is relatively time-consuming.
    This depression recognition pipeline, as well as the preferred topics and feature categories, is expected to be useful in supporting the diagnosis of psychological distress conditions.






  • Article type: Journal Article
    Depression is a serious mental health problem. Recently, researchers have proposed novel approaches that use sensing data collected passively on smartphones for automatic depression screening. While these studies have explored several types of sensing data (e.g., location, activity, conversation), none of them has leveraged the Internet traffic of smartphones, which can be collected with little energy consumption and is insensitive to phone hardware. In this paper, we explore using coarse-grained meta-data of Internet traffic on smartphones for depression screening. We develop techniques to identify Internet usage sessions (i.e., time periods when a user is online) and extract a novel set of features based on usage sessions from the Internet traffic meta-data. Our results demonstrate that Internet usage features can reflect the different behavioral characteristics of depressed and non-depressed participants, confirming findings in psychological sciences, which have relied on surveys or questionnaires instead of real Internet traffic as in our study. Furthermore, we develop machine-learning-based prediction models that use these features to predict depression. Our evaluation shows that Internet usage features can be used for effective depression prediction, leading to an F1 score as high as 0.80.
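Identifying Internet usage sessions from traffic meta-data can be sketched with a simple gap rule. The abstract does not say how sessions are delimited, so the sketch assumes that consecutive traffic records separated by no more than a fixed idle gap belong to the same session. The 60-second threshold, the function names, and the example timestamps are hypothetical.

```python
def usage_sessions(timestamps, max_gap=60.0):
    """Group traffic timestamps (in seconds) into (start, end) sessions.

    A new session starts whenever the idle time since the previous record
    exceeds `max_gap`.
    """
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][1] <= max_gap:
            sessions[-1][1] = ts          # extend the current session
        else:
            sessions.append([ts, ts])     # open a new session
    return [(start, end) for start, end in sessions]


def session_features(sessions):
    """A few simple per-user features derived from the sessions."""
    durations = [end - start for start, end in sessions]
    return {
        "n_sessions": len(sessions),
        "total_duration": sum(durations),
        "mean_duration": sum(durations) / len(durations) if durations else 0.0,
    }


# Hypothetical flow timestamps for one participant, in seconds.
timestamps = [0, 20, 50, 400, 430, 1000]
sessions = usage_sessions(timestamps)
print(sessions)                    # [(0, 50), (400, 430), (1000, 1000)]
print(session_features(sessions))
```

Features of this kind need only record timestamps, which is consistent with the abstract's emphasis on coarse-grained meta-data rather than traffic content.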






