SBCP-YOLO-R3D: Student Behavior Recognition and Visualization Framework Using Improved YOLO and R3D for Class Video

Chunyan Yu; Qin Ding; Yuchen Bai

doi:10.37965/jait.2025.0685

Authors

Chunyan Yu, School of Computer and Information Engineering, Chuzhou University, Chuzhou, 239000, Anhui, China & School of Education Science, Nanjing Normal University, Nanjing, 210023, Jiangsu, China, https://orcid.org/0000-0002-1437-2812
Qin Ding, School of Computer Science and Engineering, Anhui University of Science & Technology, Huainan, 232001, Anhui, China, https://orcid.org/0009-0009-0988-8126
Yuchen Bai, School of Computer and Information Engineering, Chuzhou University, Chuzhou, 239000, Anhui, China

Keywords:

learning portraits, occlusion attention, student behavior recognition, YOLO

Abstract

Recognizing and visualizing students’ behaviors in face-to-face classrooms in real time provides pivotal indicators of learning engagement. However, current methods are limited in both real-time performance and accuracy. In addition, few in-depth studies have examined how computer vision techniques can make the evaluation of learning status more convenient. To address these issues, this paper proposes SBCP-YOLO-R3D, a Student Behavior Recognition and Dynamic Class Portraits Construction framework that combines the StB-YOLO and R3D methods to detect student behaviors and construct class portraits. The framework comprises two layers: the StB-YOLO detection layer and the R3D classification layer. In the StB-YOLO detection layer, the Lightweight-SEAM (LW-SEAM) is incorporated into YOLOv5 to improve the recognition of occluded students by capturing contextual information and enhancing occlusion-related features. A Double-SlideLoss function is also devised, which uses adaptive weighting mechanisms to balance simple and challenging samples. In the R3D classification layer, the results generated by StB-YOLO are processed by R3D to produce class portraits. Experiments conducted on the StuAct and SCB-DATASET3-S datasets demonstrate the effectiveness of StB-YOLO. Compared with the baseline model, StB-YOLO increases the mAP by 3.1%.
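The two-layer design described in the abstract can be illustrated with a minimal sketch. This is one possible reading of the data flow, not the authors' implementation. The names `Detection`, `build_class_portrait`, `stub_detect`, and `stub_classify`, the per-student `student_id`, the toy frames, and the behavior labels "listening" and "writing" are all hypothetical. The stubs stand in for StB-YOLO and R3D so that the example runs without any model weights.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]   # (x1, y1, x2, y2) in pixels
Frame = List[List[int]]           # toy grayscale frame: rows of pixel values


@dataclass
class Detection:
    student_id: int  # hypothetical per-student identity; how students are tracked is an assumption
    box: Box


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the region inside `box` out of a frame."""
    x1, y1, x2, y2 = box
    return [row[x1:x2] for row in frame[y1:y2]]


def build_class_portrait(
    frames: Sequence[Frame],
    detect: Callable[[Frame], List[Detection]],
    classify_clip: Callable[[List[Frame]], str],
) -> Counter:
    """Detection layer -> per-student clips -> classification layer -> label counts."""
    clips: Dict[int, List[Frame]] = {}
    for frame in frames:
        for det in detect(frame):                      # stands in for the detection layer
            clips.setdefault(det.student_id, []).append(crop(frame, det.box))
    # One behavior label per student clip (stands in for the classification layer).
    labels = {sid: classify_clip(clip) for sid, clip in clips.items()}
    # The "class portrait" here is simply the count of students per behavior.
    return Counter(labels.values())


def stub_detect(frame: Frame) -> List[Detection]:
    # Hypothetical detector: always reports two students, left and right half.
    return [Detection(0, (0, 0, 2, 2)), Detection(1, (2, 0, 4, 2))]


def stub_classify(clip: List[Frame]) -> str:
    # Hypothetical classifier: any non-zero pixel in the clip means "writing".
    total = sum(value for frame in clip for row in frame for value in row)
    return "writing" if total > 0 else "listening"


# Two identical toy frames: left half dark (0), right half bright (1).
frames = [[[0, 0, 1, 1], [0, 0, 1, 1]] for _ in range(2)]
portrait = build_class_portrait(frames, stub_detect, stub_classify)
```

In a real system the two callables would wrap a trained detector and a trained video classifier, and the portrait would likely be computed over sliding time windows rather than over the whole clip at once.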