Question Answering over Linked Data with GPT-3

Authors: Bruno Faria, Dylan Perdigão, Hugo Gonçalo Oliveira




File

OASIcs.SLATE.2023.1.pdf
  • Filesize: 1.59 MB
  • 15 pages

Author Details

Bruno Faria
  • Department of Informatics Engineering, University of Coimbra, Portugal
  • Centre for Informatics and Systems of the University of Coimbra, Portugal
Dylan Perdigão
  • Department of Informatics Engineering, University of Coimbra, Portugal
  • Centre for Informatics and Systems of the University of Coimbra, Portugal
Hugo Gonçalo Oliveira
  • Department of Informatics Engineering, University of Coimbra, Portugal
  • Centre for Informatics and Systems of the University of Coimbra, Portugal

Cite As

Bruno Faria, Dylan Perdigão, and Hugo Gonçalo Oliveira. Question Answering over Linked Data with GPT-3. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 1:1-1:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/OASIcs.SLATE.2023.1

Abstract

This paper explores GPT-3 for answering natural language questions over Linked Data. Different engines of the model and different approaches are adopted for answering questions in the QALD-9 dataset, namely zero-shot and few-shot SPARQL generation, as well as fine-tuning on the training portion of the dataset. Answers retrieved by the generated queries are also compared with answers generated directly by the model. Results are generally poor, but several insights are provided on using GPT-3 for the proposed task.

Subject Classification

ACM Subject Classification
  • Computing methodologies → Natural language processing
Keywords
  • SPARQL Generation
  • Prompt Engineering
  • Few-Shot Learning
  • Question Answering
  • GPT-3

