HAND POSE CLASSIFICATION USING MEDIAPIPE HANDS AND CNN-LSTM FOR AUGMENTED REALITY BASED INTRAVENOUS INFUSION LEARNING
DOI:
https://doi.org/10.55583/jtisi.v3i2.2343

Keywords:
Gesture Recognition, CNN-LSTM, MediaPipe, Augmented Reality, Medical Learning

Abstract
Intravenous infusion training requires precise hand positioning and coordinated movements; however, conventional training approaches remain subjective and lack consistent real-time feedback. Moreover, existing augmented reality (AR)-based systems are largely limited to visualization and do not provide intelligent, automated skill evaluation. To address this gap, this study proposes an integrated hand pose classification framework that combines MediaPipe-based landmark extraction, CNN-LSTM spatio-temporal modeling, and AR-based feedback for real-time procedural learning. The novelty of this work lies in the seamless integration of lightweight feature representation, hybrid deep learning, and interactive AR feedback within a unified learning system. Experimental results demonstrate that the proposed approach achieves high classification performance, with an accuracy of 94.82% and an AUC of approximately 0.97, indicating strong discriminative capability. The system also operates in real time with low latency, enabling immediate feedback and adaptive learning. This study contributes theoretically to spatio-temporal gesture modeling and practically to the development of intelligent AR-based training systems. The proposed framework offers a scalable and objective solution for improving procedural accuracy, consistency, and accessibility in medical education.
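The front end of the pipeline described above can be sketched as follows. This is a minimal, hypothetical illustration rather than the authors' implementation: each frame's 21 MediaPipe Hands landmarks (x, y, z) are normalized relative to the wrist and flattened into a 63-dimensional feature vector, and a window of consecutive frames is stacked into the sequence a CNN-LSTM classifier would consume. The wrist-relative normalization and the 30-frame window length are assumptions, not details from the paper.

```python
import numpy as np

NUM_LANDMARKS = 21   # MediaPipe Hands returns 21 (x, y, z) landmarks per hand
FEATURE_DIM = NUM_LANDMARKS * 3
WINDOW = 30          # hypothetical sequence length (frames) for the CNN-LSTM

def landmarks_to_features(landmarks: np.ndarray) -> np.ndarray:
    """Turn one frame of (21, 3) landmarks into a 63-dim feature vector.

    Hypothetical preprocessing: translate so the wrist (landmark 0) is the
    origin, scale by the largest wrist-to-landmark distance, then flatten.
    """
    centered = landmarks - landmarks[0]            # wrist-relative coordinates
    scale = np.linalg.norm(centered, axis=1).max()
    if scale > 0:
        centered = centered / scale                # rough scale invariance
    return centered.ravel()                        # shape: (63,)

def frames_to_sequence(frames: list[np.ndarray]) -> np.ndarray:
    """Stack the last WINDOW frames into a (WINDOW, 63) CNN-LSTM input."""
    feats = [landmarks_to_features(f) for f in frames[-WINDOW:]]
    return np.stack(feats)                         # shape: (WINDOW, 63)

# Simulated landmark stream standing in for live MediaPipe Hands output.
rng = np.random.default_rng(0)
frames = [rng.random((NUM_LANDMARKS, 3)) for _ in range(WINDOW)]
sequence = frames_to_sequence(frames)
print(sequence.shape)  # (30, 63)
```

In a real-time setting, a sliding buffer of the most recent WINDOW frames would be re-classified on each new frame, which is what keeps the per-inference feature representation lightweight.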

