Evaluating ChatGPT-Generated Linear Algebra Formative Assessments.

Nelly Rigaud Téllez; Patricia Rayón Villela; Roberto Blanco Bautista

doi:10.9781/ijimai.2024.02.004

Authors

Nelly Rigaud Téllez, Universidad Nacional Autónoma de México
Patricia Rayón Villela, Universidad Internacional De La Rioja
Roberto Blanco Bautista, Universidad Nacional Autónoma de México

DOI:

https://doi.org/10.9781/ijimai.2024.02.004

Keywords:

Formative Assessment, ChatGPT, Linear Algebra, Math Word Problems, Polya’s Strategy, Prompt Generator

Supporting Agencies

This paper was made possible by support from the National Autonomous University of Mexico, DGAPA, Project PAPIME PE112723.

Abstract

This research explored potential uses of Large Language Models (LLMs) in formative assessment of the mathematical problem-solving process. The study provides a conceptual analysis of feedback and of how these models relate to it in the context of formative assessment for Linear Algebra problems. In particular, ChatGPT, a popular model, falls short on mathematical problems in areas such as reasoning, proofs, and model construction. Formative assessment is a process used by teachers and students during instruction that provides feedback to adjust ongoing teaching and learning and so improve students' achievement of intended instructional outcomes. The study analyzed and evaluated feedback on engineering students' solutions, from both instructors and ChatGPT, against the fine-grained criteria of a formative feedback model that includes affective aspects. Based on preliminary outputs, and to improve the feedback from both agents (instructors and ChatGPT), we developed a framework for formative assessment in mathematical problem-solving using an LLM. We designed a framework to generate prompts, informed by common Linear Algebra mistakes within the context of concept development and problem-solving strategies. In this framework, the instructor acts as an agent who verifies tasks in a math problem assigned to students, establishing a virtuous learning cycle of queries supported by ChatGPT. Results revealed both potential and challenges in improving feedback on graduate-level math problems, through which educators and students alike adapt their teaching and learning strategies.
