AI Translation: Evaluating ChatGPT’s Reliability in Translating Arabic to English
DOI:
https://doi.org/10.37965/jait.2025.0831
Keywords:
academic translation, ChatGPT-4, culture-bound expressions, prompt engineering, translation quality
Abstract
Machine translation has undergone remarkable evolution since its early rule-based systems in the 1950s, progressing through statistical models in the 1990s to neural machine translation (NMT) in the 2010s. The introduction of large language models, such as OpenAI’s GPT series, has marked a new era in translation technology, enabling systems to understand context, tone, and meaning beyond literal word substitution. These developments have reshaped translation research and practice, especially in academic and professional settings. This article explores the effectiveness of ChatGPT as a translation tool in academic contexts, particularly within the fields of humanities and social sciences. Drawing on recent literature, the study reviews advances in prompt engineering, comparative evaluations with traditional machine translation systems, and domain-specific translation challenges. Structured prompts are shown to significantly enhance translation accuracy, with BLEU scores improving as prompt complexity increases. Comparative studies reveal that ChatGPT generally produces more fluent and contextually appropriate translations than tools like Google Translate, especially for high-resource languages and conversational texts. However, its performance declines with specialized terminology, low-resource languages, and culturally embedded expressions. Results show that ChatGPT can be a reliable translation tool that captures the intended meaning rather than offering word-for-word translations, making it a valuable resource in academic and professional settings. Nonetheless, challenges remain, particularly in accurately translating culture-bound expressions, technical jargon, and dialectal variations. Examples from Arabic–English translations underscore these limitations, highlighting instances where ChatGPT succeeds in conveying nuanced meaning and others where it produces awkward or inaccurate renderings.
The study concludes by emphasizing the need for ongoing refinement in prompt design and hybrid human–AI translation approaches to enhance translation quality and cultural sensitivity in academic discourse.
License
Copyright (c) 2025 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.
