Song, Hao, and Peter Flach. “Efficient and Robust Model Benchmarks With Item Response Theory and Adaptive Testing.” International Journal of Interactive Multimedia and Artificial Intelligence 6, no. 5 (March 1, 2021): 110–118. Accessed August 23, 2025. https://revistas.unir.net/index.php/ijimai/article/view/708.