Deep Learning Multimodal Sarcasm Detection in Social Media Comments: The Role of Memes and Emojis
DOI: https://doi.org/10.37965/jait.2025.0699

Keywords: emoji, deep learning, meme, sarcasm detection

Abstract
Social media has become a crucial platform for interaction, information exchange, and market analysis. Businesses and researchers rely on it for sentiment and emotion analysis, yet sarcasm detection remains a major challenge because sarcasm can invert sentiment polarity. Traditional text-based analysis struggles with sarcasm, as text lacks tone and facial expressions. Moreover, crucial indicators of sarcasm—repeated emojis, punctuation, and characters—are often discarded during preprocessing. To address this issue, we proposed a multimodal deep-learning approach that integrated text, emojis, and images to improve sarcasm detection. Rather than removing repeated emojis, punctuation, and characters, this approach preserved them and transformed them into structured features. Images were processed with Optical Character Recognition (OCR) to extract embedded text, maintaining computational efficiency by excluding non-textual visual elements. Word representations were then generated using Word2Vec embeddings and fed into LSTM, GRU, and BiLSTM models. The study highlighted the importance of scenario-specific preprocessing and feature selection in sarcasm detection. Among the 15 models tested, LSTM–composite demonstrated stable accuracy and strong generalization (76% accuracy, 73% precision, and 82% recall); however, its high computational cost made it unsuitable for large-scale deployment. By contrast, Model 9 (i.e., BiLSTM–isRepeatedChar) balanced efficiency and predictive performance (76% accuracy, 74% precision, and 79% recall), making it well suited to resource-limited environments.
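The preprocessing idea described above—keeping repetition cues as structured features instead of stripping them—can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation; the feature name isRepeatedChar echoes the model name in the abstract, but the exact feature definitions and thresholds here are assumptions.

```python
import re

# Assumed repetition cues: any character repeated 3+ times, and
# punctuation (!, ?, .) repeated 2+ times. Thresholds are illustrative.
REPEAT_CHAR = re.compile(r"(.)\1{2,}")
REPEAT_PUNCT = re.compile(r"([!?.])\1+")

def repetition_features(text: str) -> dict:
    """Extract repetition-based sarcasm cues before the text is normalized,
    so the signal survives downstream cleaning steps."""
    return {
        "isRepeatedChar": int(bool(REPEAT_CHAR.search(text))),
        "isRepeatedPunct": int(bool(REPEAT_PUNCT.search(text))),
        # Length of the longest repeated run, 0 if none.
        "maxRepeatLen": max(
            (len(m.group(0)) for m in REPEAT_CHAR.finditer(text)), default=0
        ),
    }

print(repetition_features("Suuuure, that went greeeaat!!!"))
# → {'isRepeatedChar': 1, 'isRepeatedPunct': 1, 'maxRepeatLen': 4}
```

In a full pipeline, these feature vectors would be concatenated with the Word2Vec sequence representations (and OCR-extracted image text) before being passed to the LSTM/GRU/BiLSTM classifiers.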
License
Copyright (c) 2025 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.