Generating and Ranking Distractors for Multiple-Choice Questions in Portuguese

Authors: Hugo Gonçalo Oliveira, Igor Caetano, Renato Matos, Hugo Amaro

Author Details

Hugo Gonçalo Oliveira
  • Center of Informatics and Systems, University of Coimbra, Portugal
  • Department of Informatics Engineering, University of Coimbra, Portugal
Igor Caetano
  • Instituto Pedro Nunes, Coimbra, Portugal
  • Department of Informatics Engineering, University of Coimbra, Portugal
Renato Matos
  • Center of Informatics and Systems, University of Coimbra, Portugal
  • Department of Informatics Engineering, University of Coimbra, Portugal
Hugo Amaro
  • Instituto Pedro Nunes, LIS, Coimbra, Portugal

Cite As

Hugo Gonçalo Oliveira, Igor Caetano, Renato Matos, and Hugo Amaro. Generating and Ranking Distractors for Multiple-Choice Questions in Portuguese. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 4:1-4:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)


Abstract

When generating multiple-choice questions, several methods are often considered for distractor acquisition, in an attempt to cover as many questions as possible. Some of them, however, yield many candidate distractors of variable quality, when only three or four are needed. We implement several distractor generation methods for Portuguese and propose combining their candidates and ranking them with language models. Experimental results confirm that this increases both the coverage and the suitability of the selected distractors.
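
The combine-then-rank idea can be illustrated with a minimal sketch. It is not the authors' implementation: the function `combine_and_rank`, the method labels in `candidates`, and the `toy_score` lookup table are all hypothetical. In particular, `toy_score` only stands in for a language-model score, and the numbers in it are invented for the example.

```python
from typing import Callable, Dict, Iterable, List


def combine_and_rank(
    answer: str,
    candidates_by_method: Dict[str, Iterable[str]],
    score: Callable[[str], float],
    k: int = 3,
) -> List[str]:
    """Pool candidates from several generation methods, drop duplicates and
    the correct answer, and keep the k candidates with the highest score."""
    seen = {answer.casefold()}  # the correct answer must never be a distractor
    pool: List[str] = []
    for candidates in candidates_by_method.values():
        for cand in candidates:
            key = cand.casefold()  # case-insensitive duplicate detection
            if key not in seen:
                seen.add(key)
                pool.append(cand)
    # sorted() is stable, so ties keep the order in which candidates were pooled.
    return sorted(pool, key=score, reverse=True)[:k]


# Hypothetical stand-in for a language-model score (values invented here).
TOY_SCORES = {"Porto": 0.9, "Coimbra": 0.8, "Braga": 0.7, "Madrid": 0.4, "azul": 0.01}


def toy_score(candidate: str) -> float:
    return TOY_SCORES.get(candidate, 0.0)


# Placeholder method names; each method proposes its own candidate list.
candidates = {
    "lexical_resource": ["Porto", "Madrid", "Lisboa"],
    "word_embeddings": ["Coimbra", "porto", "azul"],
    "knowledge_base": ["Braga"],
}

top = combine_and_rank("Lisboa", candidates, toy_score, k=3)
print(top)
```

Pooling candidates from every method is what raises coverage, since a question is covered as soon as any one method proposes something. The ranking step then trims the pool to the few distractors actually needed. In a real system, `score` would be replaced by a model-based estimate of how well each candidate fits the question.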

Subject Classification

ACM Subject Classification
  • Computing methodologies → Natural language processing

Keywords
  • Multiple-Choice Questions
  • Distractor Generation
  • Language Models



