Assessing Factoid Question-Answer Generation for Portuguese (Short Paper)

Authors João Ferreira, Ricardo Rodrigues, Hugo Gonçalo Oliveira



Author Details

João Ferreira
  • Centre for Informatics and Systems of the University of Coimbra, Portugal
Ricardo Rodrigues
  • Centre for Informatics and Systems of the University of Coimbra, Portugal
  • Polytechnic Institute of Coimbra, College of Higher Education of Coimbra, Portugal
Hugo Gonçalo Oliveira
  • Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Portugal

Cite As

João Ferreira, Ricardo Rodrigues, and Hugo Gonçalo Oliveira. Assessing Factoid Question-Answer Generation for Portuguese (Short Paper). In 9th Symposium on Languages, Applications and Technologies (SLATE 2020). Open Access Series in Informatics (OASIcs), Volume 83, pp. 16:1-16:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


We present work on the automatic generation of question-answer pairs in Portuguese, which is useful, for instance, for populating the knowledge base of question-answering systems. This work includes: (i) a new corpus of close to 600 factoid sentences, manually created from an existing corpus of questions and answers and used as our benchmark; (ii) two approaches for the automatic generation of question-answer pairs, which can be seen as baselines; and (iii) the results of those approaches on the corpus.

Subject Classification

ACM Subject Classification
  • Computing methodologies → Natural language processing
Keywords
  • Question-Answer Generation
  • Corpus
  • NLP
  • Portuguese



