{Reference Type}: Journal Article {Title}: STaRNet: A spatio-temporal and Riemannian network for high-performance motor imagery decoding. {Author}: Wang X;Yang W;Qi W;Wang Y;Ma X;Wang W; {Journal}: Neural Netw {Volume}: 178 {Issue}: 0 {Year}: 2024 Jun 26 {Factor}: 9.657 {DOI}: 10.1016/j.neunet.2024.106471 {Abstract}: Brain-computer interfaces (BCIs), representing a transformative form of human-computer interaction, empower users to interact directly with external environments through brain signals. In response to the demands for high accuracy, robustness, and end-to-end capabilities within BCIs based on motor imagery (MI), this paper introduces STaRNet, a novel model that integrates multi-scale spatio-temporal convolutional neural networks (CNNs) with Riemannian geometry. First, STaRNet uses a multi-scale spatio-temporal feature extraction module that captures both global and local features, facilitating the construction of Riemannian manifolds from these comprehensive spatio-temporal features. Subsequently, a matrix logarithm operation transforms the manifold-based features into the tangent space, followed by a dense layer for classification. Without preprocessing, STaRNet surpasses state-of-the-art (SOTA) models by achieving an average decoding accuracy of 83.29% and a kappa value of 0.777 on the BCI Competition IV 2a dataset, and 95.45% accuracy with a kappa value of 0.939 on the High Gamma Dataset. Additionally, a comparative analysis between STaRNet and several SOTA models, focusing on the most challenging subjects from both datasets, highlights the exceptional robustness of STaRNet. Finally, visualizations of the learned frequency bands demonstrate that the temporal convolutions have learned MI-related frequency bands, and t-SNE analyses of features across multiple layers of STaRNet indicate strong feature extraction capabilities. We believe that the accurate, robust, and end-to-end capabilities of STaRNet will facilitate the advancement of BCIs.
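
The abstract describes mapping manifold-based features into the tangent space with a matrix logarithm and then classifying with a dense layer, but it gives no implementation details. The following is a minimal NumPy sketch of that idea, not the authors' code. It assumes the manifold features are covariance matrices (symmetric positive definite) of a feature map and uses a log-Euclidean mapping at the identity. The function names, the 8 x 250 feature shape, the regularization constant, and the untrained 4-class dense layer are all hypothetical choices for illustration.

```python
import numpy as np


def covariance(features, eps=1e-6):
    """Sample covariance of a (channels, time) feature map.

    A small multiple of the identity is added so the result is
    strictly positive definite (assumed regularization).
    """
    centered = features - features.mean(axis=1, keepdims=True)
    cov = centered @ centered.T / (features.shape[1] - 1)
    return cov + eps * np.eye(cov.shape[0])


def logm_spd(spd):
    """Matrix logarithm of an SPD matrix via eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(spd)
    # V diag(log(lambda)) V^T
    return (eigvecs * np.log(eigvals)) @ eigvecs.T


def tangent_vector(spd):
    """Vectorize log(spd) using its upper triangle.

    Off-diagonal entries are scaled by sqrt(2) so the vector's
    Euclidean norm equals the Frobenius norm of log(spd).
    """
    log_spd = logm_spd(spd)
    rows, cols = np.triu_indices(log_spd.shape[0])
    weights = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return weights * log_spd[rows, cols]


def softmax(logits):
    """Numerically stable softmax over a 1-D array."""
    shifted = np.exp(logits - logits.max())
    return shifted / shifted.sum()


rng = np.random.default_rng(0)

# Hypothetical feature map: 8 channels, 250 time samples.
features = rng.standard_normal((8, 250))
cov = covariance(features)
vec = tangent_vector(cov)

# Untrained dense layer with random weights (illustration only), 4 classes.
dense_weights = 0.01 * rng.standard_normal((4, vec.size))
dense_bias = np.zeros(4)
probs = softmax(dense_weights @ vec + dense_bias)

print(vec.shape)    # (36,)
print(probs.shape)  # (4,)
```

An 8 x 8 symmetric matrix has 8 * 9 / 2 = 36 independent entries, which is why the tangent vector has length 36. The sqrt(2) scaling is one common convention for keeping distances consistent after flattening; a trained network could equally absorb that scaling into the dense layer's weights.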