BASARAN, B. Evaluating Large Language Models for Educational Measurement Insights from Automated and Human Scoring of Language Exams. Journal of Artificial Intelligence and Technology, [S. l.], v. 6, p. 349–355, 2026. DOI: 10.37965/jait.2026.0949. Available at: https://ojs.istp-press.com/jait/article/view/949. Accessed: 16 Jul. 2026.