BASARAN, B. Evaluating Large Language Models for Educational Measurement Insights from Automated and Human Scoring of Language Exams. Journal of Artificial Intelligence and Technology, [S. l.], 2026. DOI: 10.37965/jait.2026.0949. Available at: https://ojs.istp-press.com/jait/article/view/949. Accessed: 26 Feb. 2026.