Blending Language Models and Domain-Specific Languages in Computer Science Education. A Case Study on API RESTFul

Authors

  • Francisco Jurado, Universidad Autónoma de Madrid
  • Francy D. Rodríguez, Universidad Politécnica de Madrid
  • Enrique Chavarriaga, UGround Global S.L.
  • Luis Rojas, San Sebastián University

DOI:

https://doi.org/10.9781/ijimai.2025.09.005

Keywords:

Case Study, Computer Science Education, Domain-Specific Language, Language Model, Student Assessment

Abstract

Since Computer Science students are used to applying both General Purpose Programming Languages (GPPLs) and Domain-Specific Languages (DSLs), Generative Artificial Intelligence based on Language Models (LMs) can help them with automatable tasks, allowing them to focus on more creative work and higher-order skills. However, teaching and evaluating technical tasks in Computer Science can be inefficient and error-prone. The main objective of this article is therefore to explore the performance of LMs compared with that of undergraduate Computer Science students in a specific case study: designing and implementing DSLs for RESTful APIs. This research aims to determine whether LMs can enhance the efficiency and accuracy of these processes. Our case study involved 39 students and 5 different LMs, all of which had to use the two DSLs we designed for the task assignment. To evaluate performance, we applied uniform criteria to the student-generated and LM-generated solutions, enabling a comparative analysis of accuracy and effectiveness. Through this comparison, the article assesses to what extent LMs can carry out software development tasks involving new DSLs designed for highly specific settings as well as well-qualified Computer Science students can. The results underscore the importance of well-defined DSLs and an effective prompting process for optimal LM performance. Specifically, LMs showed high variability in task execution, with two GPT-based LMs achieving grades similar to those of the best students for every task: 0.78 and 0.92 on a normalized [0, 1] scale, with standard deviations of 0.23 and 0.14 for ChatGPT-4 and ChatGPT-4o, respectively. From this experience, we conclude that a well-defined DSL and a proper prompting process, providing the LM with metadata, persistent prompts, and a good knowledge base, are crucial for good LM performance. When LMs receive the right prompts, both large and small LMs can achieve excellent results, depending on the task.
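
To give a rough idea of the kind of artifact the case study revolves around, the sketch below shows a hypothetical JSON-based specification for a small RESTful API and a minimal TypeScript interpreter that mounts it on an Express application. The DSL shape (the "resource", "routes", "method", "path", and "reply" fields), the /students resource, and the mount() helper are assumptions made purely for illustration; they are not the two DSLs defined and evaluated in the article.

    // Illustrative sketch only: a hypothetical JSON-based DSL fragment for a
    // small RESTful API, plus a minimal interpreter for an Express app.
    import express, { Request, Response } from "express";

    type HttpMethod = "get" | "post" | "put" | "delete";

    interface RouteSpec {
      method: HttpMethod;   // HTTP verb declared in the DSL
      path: string;         // path relative to the resource prefix
      reply: unknown;       // canned JSON payload to return
    }

    interface ApiSpec {
      resource: string;     // URL prefix for the resource
      routes: RouteSpec[];  // endpoints exposed for that resource
    }

    // Hypothetical DSL document: one resource with two endpoints.
    const spec: ApiSpec = {
      resource: "/students",
      routes: [
        { method: "get", path: "/", reply: { students: [] } },
        { method: "post", path: "/", reply: { created: true } },
      ],
    };

    // Interpret the DSL: register every declared route under the resource prefix.
    function mount(app: express.Express, api: ApiSpec): void {
      const router = express.Router();
      for (const r of api.routes) {
        router[r.method](r.path, (_req: Request, res: Response) => {
          res.json(r.reply); // respond with the payload declared in the DSL
        });
      }
      app.use(api.resource, router);
    }

    const app = express();
    app.use(express.json());
    mount(app, spec);
    app.listen(3000, () => console.log("DSL-driven API listening on port 3000"));

Assuming Express and its type definitions are installed, running this sketch would serve GET /students and POST /students with the declared payloads; the actual DSLs, task assignment, grading rubric, and prompting procedure are detailed in the full article.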

Author Biographies

Francisco Jurado, Universidad Autónoma de Madrid

He received the Ph.D. degree with honours in Computer Science from the University of Castilla-La Mancha in 2010. He is an Associate Professor in the Computer Engineering Department, Universidad Autónoma de Madrid, Spain. He has (co)authored more than 70 research articles in journals and conferences, participated in more than 15 competitive research projects, and has served as a reviewer for indexed journals and international conferences. His main research areas include Intelligent Tutoring Systems, Heterogeneous Distributed eLearning systems, and Natural Language Processing.

Francy D. Rodríguez, Universidad Politécnica de Madrid

She received her PhD degree from the Universidad Politécnica de Madrid (UPM) in 2015. She is currently a full-time Assistant Professor in the Computer Engineering Department at UPM, Spain. Her research interests include software development, design and programming patterns, software usability, and applications of artificial intelligence.

Enrique Chavarriaga, UGround Global S.L.

He received his PhD in Computer Science and Telecommunications Engineering from the Autonomous University of Madrid in 2017. He currently works as a research engineer in the R&D&I Department at UGROUND GLOBAL S.L., Spain. His research interests include domain-specific visual and textual languages, low-code development environments, and artificial intelligence.

Luis Rojas, San Sebastián University

He received his M.S. and Ph.D. in Computer Science from Universidad Autónoma de Madrid in 2017. He is currently a Lecturer at Universidad San Sebastián, Santiago, Chile. He was a software engineer at the Chilean Nuclear Energy Commission for 10 years. He has also served as a reviewer and chair at different conferences on Software Engineering and Human-Computer Interaction. He is the author of several research articles and book chapters, and has also served as a book editor. His research interests include artificial intelligence, human-computer interaction, software engineering, and learning analytics.



Published

2025-10-03

How to Cite

Jurado, F., Rodríguez, F. D., Chavarriaga, E., and Rojas, L. (2025). Blending Language Models and Domain-Specific Languages in Computer Science Education. A Case Study on API RESTFul. International Journal of Interactive Multimedia and Artificial Intelligence, 1–19. https://doi.org/10.9781/ijimai.2025.09.005

Section

Articles