Basaran, B. (2026). Evaluating Large Language Models for Educational Measurement: Insights from Automated and Human Scoring of Language Exams. Journal of Artificial Intelligence and Technology. https://doi.org/10.37965/jait.2026.0949