Pseudo-Temporal 3D CNN Fusion of Gradient and Deep Spatial Features for Hand Gesture Recognition

Authors

  • Keerthi Kumar M, Department of Electronics and Communication Engineering, Nitte Meenakshi Institute of Technology, Nitte (deemed to be a university), Yelahanka, Bengaluru, India. ORCID: https://orcid.org/0009-0006-9539-7697
  • Parameshachari BD, Department of Electronics and Communication Engineering, Nitte Meenakshi Institute of Technology, Nitte (deemed to be a university), Yelahanka, Bengaluru, India. ORCID: https://orcid.org/0000-0002-3997-5070

DOI:

https://doi.org/10.37965/jait.2026.0984

Keywords:

Deep learning, hand gesture, histogram of oriented gradients, integrated model, sign language recognition

Abstract

Communication between people with speech or hearing impairments and those who do not understand sign language is a growing social need and a challenging task. Deep learning (DL) techniques offer a way to bridge this communication gap. This research develops an integrated DL approach that recognizes hand gestures from images to facilitate effective communication. Features are first extracted from the raw images using the histogram of oriented gradients (HOG). HOG computes the magnitude and orientation of the image gradient, which capture the outline of the hand and the direction of its edges. The extracted features are classified using the proposed integrated model, which comprises MobileNetV2 and a three-dimensional convolutional neural network (3D CNN). MobileNetV2 extracts deep spatial features, while the 3D CNN processes the features along three dimensions to improve classification accuracy. Specifically, the proposed model fuses the HOG-based gradient descriptors with the deep spatial features from MobileNetV2 using a pseudo-temporal 3D CNN, which enables static sign language recognition. Experimental analysis shows that the proposed method achieves an accuracy of 99.55%, which is higher than that of the existing techniques used for comparison.
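
The gradient step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the 8-pixel cell size, the 9 orientation bins, the hard bin assignment, and the single global L2 normalization are all assumptions chosen for brevity (standard HOG uses overlapping block normalization and interpolated voting).

```python
import numpy as np


def gradient_magnitude_orientation(image):
    """Per-pixel gradient magnitude and unsigned orientation in degrees."""
    img = np.asarray(image, dtype=np.float64)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    gy, gx = np.gradient(img)
    magnitude = np.hypot(gx, gy)
    # Fold signed angles into the unsigned range [0, 180).
    orientation = np.degrees(np.arctan2(gy, gx)) % 180.0
    return magnitude, orientation


def cell_histogram(magnitude, orientation, n_bins=9):
    """Magnitude-weighted orientation histogram for one cell (hard binning)."""
    bin_width = 180.0 / n_bins
    # Clamp guards against an angle that rounds up to exactly 180 degrees.
    bins = np.minimum((orientation // bin_width).astype(int), n_bins - 1)
    return np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)


def hog_descriptor(image, cell_size=8, n_bins=9, eps=1e-6):
    """Concatenate cell histograms and L2-normalize the result.

    Assumption: one global normalization instead of overlapping blocks.
    """
    magnitude, orientation = gradient_magnitude_orientation(image)
    rows = magnitude.shape[0] // cell_size
    cols = magnitude.shape[1] // cell_size
    histograms = []
    for r in range(rows):
        for c in range(cols):
            window = (
                slice(r * cell_size, (r + 1) * cell_size),
                slice(c * cell_size, (c + 1) * cell_size),
            )
            histograms.append(
                cell_histogram(magnitude[window], orientation[window], n_bins)
            )
    vector = np.concatenate(histograms)
    return vector / (np.linalg.norm(vector) + eps)


# Hypothetical input: a 16x16 image with a single vertical edge.
edge = np.zeros((16, 16))
edge[:, 8:] = 1.0
descriptor = hog_descriptor(edge)
print(descriptor.shape)  # (36,) = 2 x 2 cells x 9 bins
```

For the vertical edge above, the gradient points along the x axis, so all of the histogram energy falls into the first orientation bin of each cell. How the paper combines such descriptors with the MobileNetV2 features inside the pseudo-temporal 3D CNN is not specified on this page and is not sketched here.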

Author Biographies

Keerthi Kumar M, Department of Electronics and Communication Engineering, Nitte Meenakshi Institute of Technology, Nitte (deemed to be a university), Yelahanka, Bengaluru, India

Parameshachari BD, Department of Electronics and Communication Engineering, Nitte Meenakshi Institute of Technology, Nitte (deemed to be a university), Yelahanka, Bengaluru, India

Published

06/10/2026

How to Cite

M, K. K., & BD, P. (2026). Pseudo-Temporal 3D CNN Fusion of Gradient and Deep Spatial Features for Hand Gesture Recognition. Journal of Artificial Intelligence and Technology. https://doi.org/10.37965/jait.2026.0984

Section

Research Articles