3.4.5. Transformer Models

Transformer models are a powerful continuation of attention models. They use a self-attention mechanism to handle sequential data effectively, enabling the model to focus on the most relevant parts of the input without recurrent layers. Recurrent layers, used in RNNs and LSTMs, process a sequence one step at a time, passing information from one step to the next. Transformers, in contrast, view the entire input sequence simultaneously, applying self-attention to assess the importance of different parts of the sequence while preserving sequence order through positional encoding. This parallel processing enables transformers to capture complex patterns and relationships across the whole sequence and to identify long-term dependencies much faster. However, transformers can be demanding on computational resources, especially with larger datasets.
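The two ideas above — sinusoidal positional encoding and scaled dot-product self-attention over the whole sequence at once — can be sketched in a few lines of NumPy. This is a minimal illustration, not an implementation from any of the works cited below; all shapes (e.g. 8 time windows of 16 features, loosely evoking windowed EEG/ECG features) and the random projection weights are illustrative assumptions.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encoding: injects order information so that
    # attention, which is otherwise permutation-invariant, can use position.
    pos = np.arange(seq_len)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles[:, 0::2])  # even dims: sine
    pe[:, 1::2] = np.cos(angles[:, 1::2])  # odd dims: cosine
    return pe

def self_attention(x, Wq, Wk, Wv):
    # Scaled dot-product self-attention: every position attends to every
    # other position in one matrix multiply (no recurrence over time steps).
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])
    # Softmax over key positions gives each query a weighting of the sequence.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
seq_len, d_model = 8, 16  # hypothetical: 8 signal windows, 16 features each
x = rng.normal(size=(seq_len, d_model)) + positional_encoding(seq_len, d_model)
Wq, Wk, Wv = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out, attn = self_attention(x, Wq, Wk, Wv)
print(out.shape, attn.shape)  # (8, 16) (8, 8)
```

Each row of `attn` sums to 1 and shows how strongly one time window attends to every other window, which is how the model picks out the most relevant parts of the input in a single parallel pass.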

Abibullaev et al. provided a comprehensive review of transformers in brain-computer interfaces (BCIs), highlighting their broad potential in tasks ranging from motor-imagery decoding to emotion recognition by capturing the temporal dynamics of EEG signals. Lalzary and Wolf combined a 1D-CNN with a transformer for ECG-based emotion and glucose-level mapping, using self-supervised learning to improve performance with limited labeled data. To tackle inter-subject variability, Sartipi and Cetin integrated transformers with Adversarial Discriminative Domain Adaptation (ADDA) to enhance cross-subject emotion recognition in EEG data, using self-attention to capture spatial features and temporal information.

Vazquez-Rodriguez et al. used self-supervised transformers on ECG signals, achieving state-of-the-art emotion recognition by pre-training on unlabeled data and fine-tuning on the AMIGOS dataset. Sun et al. introduced a dual-branch adaptive transformer network, which combines dynamic graph convolution and feature fusion to capture both temporal and spectral characteristics of EEG for better cross-subject transfer learning.
