HomE: A Homogeneous Ensemble Framework for Dynamic Hand Gesture Recognition

Published as an arXiv preprint, 2025

Hand gesture recognition (HGR) plays an essential role in human–computer interaction, enabling natural, touch-free control across domains such as virtual reality, sign language interpretation, and automotive interfaces. Despite notable progress using deep networks and multimodal data fusion, current HGR solutions still face challenges stemming from misclassifications and noise sensitivity. In this paper, we propose HomE, a homogeneous ensemble framework that aims to improve HGR models’ performance and robustness by partitioning gesture classes into smaller, more coherent subsets based on critical features uncovered in parallel by unsupervised clustering and an LLM-driven semantic sampler, and by training a dedicated expert learner for each subset. A separate router learner routes incoming samples to the most relevant expert learner, while the expert routing module fuses the outputs of all expert learners into a final classification. Extensive experiments on the NVGestures, DHG-14, and SHREC’17 datasets show that our method not only enhances accuracy and robustness over single-network baselines but also enables these base models to become more competitive with state-of-the-art approaches—all without altering their underlying architectures. Furthermore, our ablation studies verify that multiple heterogeneous sampling methods provide complementary strengths, ultimately boosting recognition performance. In addition to offering insights on sampling strategies, this work highlights the scalability of HomE for both depth and skeleton-based HGR tasks, suggesting its broader applicability to other domains where class diversity and label ambiguity pose obstacles for single-model approaches.
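The core idea above (a router learner weighting subset-specific experts, whose outputs are fused into one prediction) can be sketched as follows. This is a minimal, hypothetical illustration of the routing-and-fusion step, not the paper's actual implementation; the function names, the soft-vote fusion rule, and the toy subsets are all assumptions for exposition.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def fuse(router_logits, expert_logits, subsets, num_classes):
    """Hypothetical sketch of HomE-style fusion: each expert scores only
    its own class subset; the router weights the experts; the fused output
    is a distribution over all gesture classes (a weighted soft vote)."""
    weights = softmax(router_logits)          # one routing weight per expert
    fused = [0.0] * num_classes
    for w, logits, subset in zip(weights, expert_logits, subsets):
        probs = softmax(logits)               # expert's local class distribution
        for p, cls in zip(probs, subset):
            fused[cls] += w * p               # accumulate weighted probability
    return fused

# Toy example: 4 gesture classes split into two coherent subsets,
# with the router strongly preferring the first expert.
subsets = [[0, 1], [2, 3]]
scores = fuse([2.0, 0.0], [[1.0, 0.0], [0.0, 1.0]], subsets, 4)
```

In this sketch the router's confidence scales each expert's contribution, so a misrouted sample still receives a (down-weighted) opinion from every expert rather than a hard all-or-nothing assignment.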

Paper: https://ieeexplore.ieee.org/abstract/document/11099347


Skills Used

  • Machine Learning
  • Deep Learning
  • Algorithm Design
  • PyTorch
  • Python


Collaborator


Advisor