A Multi-Scale CNN–Transformer Fusion Framework with Stain Normalization and Focal Loss for High-Accuracy Multi-Stage Gastric Cancer Diagnosis
DOI:
https://doi.org/10.37965/jait.2026.1049
Keywords:
early-stage gastric cancer, hyperparameter tuning, multi-path convolution, transformer attention optimization
Abstract
Early-stage gastric cancer (GC) diagnosis from histopathological images remains challenging due to subtle morphological variations and inter-slide staining variability. This study proposes a deep learning-based multi-stage GC classification framework that integrates convolutional feature extraction with attention-based contextual modeling. Eight pretrained convolutional neural networks (CNNs) are evaluated, among which DenseNet121 and MobileNetV2 achieve the strongest baseline performance (accuracy ≈ 85.8% and 85.9%, respectively). Building on these results, two novel architectures are developed. The first is an enhanced DenseNet121 model that incorporates multi-path convolution, squeeze-and-excitation (SE) channel recalibration, and attention optimization to capture multi-scale morphological patterns. The second is a hybrid DenseNet121–Transformer framework that integrates global self-attention with convolutional representations to improve contextual understanding of tissue structures. The models are trained using standardized preprocessing, Macenko stain normalization, extensive data augmentation, and class balancing on a dataset of 7,010 histopathology images representing Normal, Stage I, and Stage II gastric tissues. The proposed hybrid CNN–Transformer framework achieves 90.2% classification accuracy, a macro F1-score of 91.4%, and an area under the curve (AUC) of 0.985, outperforming baseline CNN architectures in stage-wise discrimination. Attention-based visualization highlights diagnostically relevant tissue regions and improves model interpretability. These findings demonstrate that combining multi-scale convolutional representations with Transformer-based global attention provides a robust and interpretable framework for automated GC histopathology analysis.
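The title names focal loss, but the abstract does not state how it is configured. As a minimal sketch only, the standard focal loss for a single sample can be written as FL = -α_t (1 - p_t)^γ log(p_t), where p_t is the softmax probability of the true class. The values γ = 2.0, the optional per-class weights `alpha`, the helper names, and the example logits below are illustrative assumptions, not settings reported by the authors.

```python
import math

# Class order is an assumption made for this illustration.
CLASSES = ["Normal", "Stage I", "Stage II"]


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Focal loss for one sample.

    gamma: focusing parameter (assumed value; gamma=0 gives cross-entropy).
    alpha: optional list of per-class weights (hypothetical, for class balancing).
    """
    p_t = max(softmax(logits)[target], 1e-12)  # clamp to avoid log(0)
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct "Normal" prediction versus an uncertain "Stage I" one.
easy = focal_loss([4.0, 0.5, 0.2], target=0)
hard = focal_loss([0.2, 0.5, 0.4], target=1)
# With gamma = 0 the modulating factor vanishes and the loss is plain cross-entropy.
ce_easy = focal_loss([4.0, 0.5, 0.2], target=0, gamma=0.0)
```

The factor (1 - p_t)^γ shrinks the contribution of samples the model already classifies confidently, so `easy` is much smaller than `ce_easy`, while uncertain samples such as `hard` keep most of their weight. This is one plausible way such a loss complements class balancing when stage-wise differences are subtle.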
License
Copyright (c) 2026 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.
