NLP/AI Based Techniques for Programming Exercises Generation

Authors: Tiago Carvalho Freitas, Alvaro Costa Neto, Maria João Varanda Pereira, Pedro Rangel Henriques

Author Details

Tiago Carvalho Freitas
  • ALGORITMI Research Centre/LASI, University of Minho, Braga, Portugal
Alvaro Costa Neto
  • Instituto Federal de Educação, Ciência e Tecnologia de São Paulo, Barretos, Brazil
Maria João Varanda Pereira
  • Research Centre in Digitalization and Intelligent Robotics, Polytechnic Institute of Bragança, Portugal
Pedro Rangel Henriques
  • ALGORITMI Research Centre/LASI, University of Minho, Braga, Portugal

Cite As

Tiago Carvalho Freitas, Alvaro Costa Neto, Maria João Varanda Pereira, and Pedro Rangel Henriques. NLP/AI Based Techniques for Programming Exercises Generation. In 4th International Computer Programming Education Conference (ICPEC 2023). Open Access Series in Informatics (OASIcs), Volume 112, pp. 9:1-9:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Abstract

This paper focuses on improving the generation of computer programming exercises, to the benefit of both students and teachers. By exploring Natural Language Processing (NLP) and Machine Learning (ML) methods for the automatic generation of text and source code, it is possible to construct programming exercises semi-automatically, helping teachers reduce redundant work and apply active learning methodologies more easily. This would allow teachers to keep a leading role in the teaching-learning process while giving students a better and more interactive learning experience. If embedded in a widely accessible website, an exercise generator built on these Artificial Intelligence (AI) methods could be used directly by students to obtain randomised lists of exercises for their own study, in their own time. The emergence of new and increasingly powerful technologies, such as those used by ChatGPT, raises the question of their use for exercise generation. Although these technologies are highly capable, their monetary and computational costs, as well as the possibility of incorrect results, remain obstacles to wider adoption. This paper describes the characteristics and behaviour of several ML models trained and applied for text and code generation, and their use to generate computer programming exercises. Finally, it presents an analysis of the correctness and coherence of the generated exercise statements and accompanying source code, and discusses the role that this type of technology can play in a system for the automatic generation of programming exercises.

Subject Classification

ACM Subject Classification
  • Social and professional topics → Computer science education
  • Software and its engineering → Imperative languages
  • Computing methodologies → Machine learning
  • Software and its engineering → Parsers

Keywords
  • Natural Language Processing
  • Computer Programming Education
  • Exercises Generation
  • Text Generation
  • Code Generation


