Assessing Factoid Question-Answer Generation for Portuguese (Short Paper)

Ferreira, João; Rodrigues, Ricardo; Gonçalo Oliveira, Hugo

doi:10.4230/OASIcs.SLATE.2020.16

Abstract

We present work on the automatic generation of question-answer pairs in Portuguese, which is useful, for instance, for populating the knowledge base of question-answering systems. This includes: (i) a new corpus of close to 600 factoid sentences, manually created from an existing corpus of questions and answers, used as our benchmark; (ii) two approaches for the automatic generation of question-answer pairs, which can be seen as baselines; (iii) results of those approaches on the corpus.

Cite As

João Ferreira, Ricardo Rodrigues, and Hugo Gonçalo Oliveira. Assessing Factoid Question-Answer Generation for Portuguese (Short Paper). In 9th Symposium on Languages, Applications and Technologies (SLATE 2020). Open Access Series in Informatics (OASIcs), Volume 83, pp. 16:1-16:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020) https://doi.org/10.4230/OASIcs.SLATE.2020.16

Author Details

João Ferreira

Centre for Informatics and Systems of the University of Coimbra, Portugal

Ricardo Rodrigues

Centre for Informatics and Systems of the University of Coimbra, Portugal
Polytechnic Institute of Coimbra, College of Higher Education of Coimbra, Portugal

Hugo Gonçalo Oliveira

Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Portugal

Funding

This work was supported by FCT’s INCoDe 2030 initiative, in the scope of the demonstration project AIA, "Apoio Inteligente a Empreendedores (chatbots)".

