Multi-GAT: A Graphical Attention-based Hierarchical Multimodal Representation Learning Approach for Human Activity Recognition
Md Mofijul Islam, Tariq Iqbal
Abstract
Recognizing human activities is one of the crucial capabilities a robot needs in order to be useful around people. Although modern robots are equipped with various types of sensors, human activity recognition (HAR) remains a challenging problem, particularly in the presence of noisy sensor data. In this work, we introduce a multimodal graphical attention-based HAR approach, called Multi-GAT, which hierarchically learns complementary multimodal features. We develop a multimodal mixture-of-experts model to disentangle and extract salient modality-specific features, which enables feature interactions. Additionally, we introduce a novel message-passing-based graphical attention approach to capture cross-modal relations for extracting complementary multimodal features. Experimental results on two multimodal human activity datasets suggest that Multi-GAT outperforms state-of-the-art HAR algorithms across all datasets and metrics tested. Finally, experiments with noisy sensor data indicate that Multi-GAT consistently outperforms all evaluated baselines. This robust performance suggests that Multi-GAT can enable seamless human-robot collaboration in noisy human environments.
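To make the message-passing graphical attention idea concrete, here is a minimal toy sketch of one attention-weighted message-passing step over "modality nodes" (e.g., embeddings from different sensors). All names, dimensions, and the dot-product scoring are illustrative assumptions, not the paper's actual implementation, which learns these components end to end.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attention_aggregate(node, neighbors):
    """One message-passing step: score each neighbor against `node`,
    normalize the scores into attention weights, and mix the neighbors'
    features into an updated representation for `node`."""
    scores = [dot(node, nb) for nb in neighbors]   # toy scoring function
    weights = softmax(scores)                      # attention coefficients
    dim = len(node)
    message = [sum(w * nb[i] for w, nb in zip(weights, neighbors))
               for i in range(dim)]
    # Residual-style update: blend the node's own features with the message.
    return [0.5 * (n + m) for n, m in zip(node, message)]

# Hypothetical modality embeddings (e.g., RGB, depth, inertial sensors).
rgb = [1.0, 0.0]
depth = [0.0, 1.0]
imu = [0.5, 0.5]
updated_rgb = attention_aggregate(rgb, [depth, imu])
```

In the actual model, such attention-weighted aggregation would be stacked hierarchically and combined with the mixture-of-experts features; this sketch only shows the basic cross-modal attention mechanic.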
Benchmarks
| Benchmark | Methodology | F1-Score (Cross-Session) | F1-Score (Cross-Subject) |
|---|---|---|---|
| multimodal-activity-recognition-on-mmact | Multi-GAT | 91.48 | 75.24 |