
Large Language Models and Knowledge Graphs: Opportunities and Challenges

Authors: Jeff Z. Pan, Simon Razniewski, Jan-Christoph Kalo, Sneha Singhania, Jiaoyan Chen, Stefan Dietze, Hajira Jabeen, Janna Omeliyanenko, Wen Zhang, Matteo Lissandrini, Russa Biswas, Gerard de Melo, Angela Bonifati, Edlira Vakaj, Mauro Dragoni, Damien Graux






Author Details

Jeff Z. Pan
  • The University of Edinburgh, United Kingdom
Simon Razniewski
  • Bosch Center for AI, Renningen, Germany
Jan-Christoph Kalo
  • University of Amsterdam, The Netherlands
Sneha Singhania
  • Max Planck Institute for Informatics, Saarbrücken, Germany
Jiaoyan Chen
  • The University of Manchester, United Kingdom
  • University of Oxford, United Kingdom
Stefan Dietze
  • GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany
  • Heinrich-Heine-Universität Düsseldorf, Germany
Hajira Jabeen
  • GESIS - Leibniz Institute for the Social Sciences, Cologne, Germany
Janna Omeliyanenko
  • University of Würzburg, Germany
Wen Zhang
  • Zhejiang University, China
Matteo Lissandrini
  • Aalborg University, Denmark
Russa Biswas
  • Hasso-Plattner Institute, Potsdam, Germany
Gerard de Melo
  • Hasso-Plattner Institute, Potsdam, Germany
Angela Bonifati
  • Lyon 1 University, CNRS, IUF, France
Edlira Vakaj
  • Birmingham City University, United Kingdom
Mauro Dragoni
  • Fondazione Bruno Kessler, Trento, Italy
Damien Graux
  • Edinburgh Research Centre, CSI, Huawei Technologies UK, United Kingdom

Acknowledgements

We would like to thank Xiaoqi Han for helpful discussions and support when finalising the camera-ready version of the paper.

Cite As

Jeff Z. Pan, Simon Razniewski, Jan-Christoph Kalo, Sneha Singhania, Jiaoyan Chen, Stefan Dietze, Hajira Jabeen, Janna Omeliyanenko, Wen Zhang, Matteo Lissandrini, Russa Biswas, Gerard de Melo, Angela Bonifati, Edlira Vakaj, Mauro Dragoni, and Damien Graux. Large Language Models and Knowledge Graphs: Opportunities and Challenges. In Special Issue on Trends in Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 1, Issue 1, pp. 2:1-2:38, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)
https://doi.org/10.4230/TGDK.1.1.2

Abstract

Large Language Models (LLMs) have taken Knowledge Representation - and the world - by storm. This inflection point marks a shift from explicit knowledge representation to a renewed focus on the hybrid representation of both explicit knowledge and parametric knowledge. In this position paper, we will discuss some of the common debate points within the community on LLMs (parametric knowledge) and Knowledge Graphs (explicit knowledge) and speculate on opportunities and visions that the renewed focus brings, as well as related research topics and challenges.

Subject Classification

ACM Subject Classification
  • General and reference → Surveys and overviews
  • Computing methodologies → Knowledge representation and reasoning
  • Computing methodologies → Natural language processing
Keywords
  • Large Language Models
  • Pre-trained Language Models
  • Knowledge Graphs
  • Ontology
  • Retrieval Augmented Language Models


References

  1. Oshin Agarwal, Heming Ge, Siamak Shakeri, et al. Knowledge graph based synthetic corpus generation for knowledge-enhanced language model pre-training. In NAACL, pages 3554-3565, jun 2021. URL: https://doi.org/10.18653/V1/2021.NAACL-MAIN.278.
  2. Naser Ahmadi, Viet-Phi Huynh, Vamsi Meduri, Stefano Ortona, and Paolo Papotti. Mining expressive rules in knowledge graphs. J. Data and Information Quality, 12(2), 2020. URL: https://doi.org/10.1145/3371315.
  3. Mirza Mohtashim Alam, Md Rashad Al Hasan Rony, Mojtaba Nayyeri, Karishma Mohiuddin, MST Mahfuja Akter, Sahar Vahdati, and Jens Lehmann. Language model guided knowledge graph embeddings. IEEE Access, 10:76008-76020, 2022. URL: https://doi.org/10.1109/ACCESS.2022.3191666.
  4. Dimitrios Alivanistos, Selene Báez Santamaría, Michael Cochez, Jan-Christoph Kalo, Emile van Krieken, and Thiviyan Thanapalasingam. Prompting as probing: Using language models for knowledge base construction, 2022. URL: https://doi.org/10.48550/ARXIV.2208.11057.
  5. Badr AlKhamissi, Millicent Li, Asli Celikyilmaz, Mona Diab, and Marjan Ghazvininejad. A review on language models as knowledge bases. arXiv, 2022. URL: https://doi.org/10.48550/ARXIV.2204.06031.
  6. Mona Alshahrani, Mohammad Asif Khan, Omar Maddouri, Akira R Kinjo, Núria Queralt-Rosinach, and Robert Hoehndorf. Neuro-symbolic representation learning on biological knowledge graphs. Bioinformatics, 33(17):2723-2730, apr 2017. URL: https://doi.org/10.1093/BIOINFORMATICS/BTX275.
  7. Sihem Amer-Yahia, Angela Bonifati, Lei Chen, Guoliang Li, Kyuseok Shim, Jianliang Xu, and Xiaochun Yang. From large language models to databases and back: A discussion on research and education. DASFAA, abs/2306.01388, 2023. URL: https://doi.org/10.48550/ARXIV.2306.01388.
  8. Alejandro Barredo Arrieta, Natalia Díaz Rodríguez, Javier Del Ser, Adrien Bennetot, Siham Tabik, A. Barbado, Salvador García, Sergio Gil-Lopez, Daniel Molina, Richard Benjamins, Raja Chatila, and Francisco Herrera. Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible ai. Information Fusion, 2020. URL: https://doi.org/10.1016/J.INFFUS.2019.12.012.
  9. Sören Auer, Christian Bizer, Georgi Kobilarov, Jens Lehmann, Richard Cyganiak, and Zachary Ives. Dbpedia: A nucleus for a web of open data. In The semantic web, pages 722-735, 2007. URL: https://doi.org/10.1007/978-3-540-76298-0_52.
  10. Stephen H Bach, Victor Sanh, Zheng-Xin Yong, Albert Webson, Colin Raffel, Nihal V Nayak, Abheesht Sharma, Taewoon Kim, M Saiful Bari, Thibault Fevry, et al. Promptsource: An integrated development environment and repository for natural language prompts. ACL, 2022. https://arxiv.org/abs/2202.01279, URL: https://doi.org/10.48550/arXiv.2202.01279.
  11. Roy Bar-Haim, Lilach Eden, Roni Friedman, Yoav Kantor, Dan Lahav, and Noam Slonim. From arguments to key points: Towards automatic argument summarization. In ACL, pages 4029-4039, 2020. URL: https://doi.org/10.18653/V1/2020.ACL-MAIN.371.
  12. Parishad BehnamGhader, Santiago Miret, and Siva Reddy. Can retriever-augmented language models reason? the blame game between the retriever and the language model. In arXiv, 2022. URL: https://doi.org/10.48550/ARXIV.2212.09146.
  13. Emily M Bender, Timnit Gebru, Angelina McMillan-Major, et al. On the dangers of stochastic parrots: Can language models be too big? In FAT, pages 610-623, 2021. URL: https://doi.org/10.1145/3442188.3445922.
  14. Emily M. Bender and Alexander Koller. Climbing towards NLU: On meaning, form, and understanding in the age of data. In ACL, 2020. URL: https://doi.org/10.18653/V1/2020.ACL-MAIN.463.
  15. Russa Biswas, Harald Sack, and Mehwish Alam. Madlink: Attentive multihop and entity descriptions for link prediction in knowledge graphs. SWJ, pages 1-24, 2022. URL: https://doi.org/10.3233/SW-222960.
  16. Russa Biswas, Radina Sofronova, Mehwish Alam, and Harald Sack. Contextual language models for knowledge graph completion. In MLSMKG, 2021. URL: https://doi.org/10.34657/7668.
  17. Su Lin Blodgett, Solon Barocas, Hal Daum'e, and Hanna M. Wallach. Language (technology) is power: A critical survey of “bias” in nlp. ACL, 2020. URL: https://doi.org/10.18653/V1/2020.ACL-MAIN.485.
  18. Bernd Bohnet, Vinh Q Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, et al. Attributed question answering: Evaluation and modeling for attributed large language models. arXiv preprint arXiv:2212.08037, 2022. URL: https://doi.org/10.48550/ARXIV.2212.08037.
  19. Angela Bonifati, Wim Martens, and Thomas Timm. Navigating the maze of wikidata query logs. In WWW, pages 127-138, 2019. URL: https://doi.org/10.1145/3308558.3313472.
  20. Angela Bonifati, Wim Martens, and Thomas Timm. An analytical study of large SPARQL query logs. VLDB J., 29(2-3):655-679, 2020. URL: https://doi.org/10.1007/S00778-019-00558-9.
  21. Antoine Bosselut, Hannah Rashkin, Maarten Sap, Chaitanya Malaviya, Asli Celikyilmaz, and Yejin Choi. COMET: Commonsense transformers for automatic knowledge graph construction. In ACL, pages 4762-4779, 2019. URL: https://doi.org/10.18653/V1/P19-1470.
  22. Ruben Branco, António Branco, João António Rodrigues, et al. Shortcutted commonsense: Data spuriousness in deep learning of commonsense reasoning. In EMNLP, pages 1504-1521, nov 2021. URL: https://doi.org/10.18653/V1/2021.EMNLP-MAIN.113.
  23. Ryan Brate, Minh Hoang Dang, Fabian Hoppe, Yuan He, Albert Meroño-Peñuela, and Vijay Sadashivaiah. Improving language model predictions via prompts enriched with knowledge graphs. In Proceedings of the Workshop on Deep Learning for Knowledge Graphs (DL4KG 2022) co-located with the 21th International Semantic Web Conference (ISWC 2022), Virtual Conference, online, October 24, 2022, volume 3342 of CEUR Workshop Proceedings. CEUR-WS.org, 2022. URL: https://ceur-ws.org/Vol-3342/paper-3.pdf.
  24. Tom Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared D Kaplan, Prafulla Dhariwal, Arvind Neelakantan, Pranav Shyam, Girish Sastry, Amanda Askell, et al. Language models are few-shot learners. neurIPS, 33:1877-1901, 2020. URL: https://proceedings.neurips.cc/paper/2020/hash/1457c0d6bfcb4967418bfb8ac142f64a-Abstract.html.
  25. Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, and Yi Zhang. Sparks of artificial general intelligence: Early experiments with gpt-4, 2023. URL: https://doi.org/10.48550/ARXIV.2303.12712.
  26. Boxi Cao, Hongyu Lin, Xianpei Han, Le Sun, Lingyong Yan, Meng Liao, Tong Xue, and Jin Xu. Knowledgeable or educated guess? revisiting language models as knowledge bases. In ACL, pages 1860-1874, aug 2021. URL: https://doi.org/10.18653/V1/2021.ACL-LONG.146.
  27. J Harry Caufield, Harshad Hegde, Vincent Emonet, Nomi L Harris, Marcin P Joachimiak, Nicolas Matentzoglu, HyeongSik Kim, Sierra AT Moxon, Justin T Reese, Melissa A Haendel, et al. Structured prompt interrogation and recursive extraction of semantics (spires): A method for populating knowledge bases using zero-shot learning. arXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2304.02711.
  28. Sejla Cebiric, François Goasdoué, Haridimos Kondylakis, Dimitris Kotzinos, Ioana Manolescu, Georgia Troullinou, and Mussab Zneika. Summarizing semantic graphs: a survey. VLDB J., 28(3):295-327, 2019. URL: https://doi.org/10.1007/S00778-018-0528-3.
  29. Jiaoyan Chen, Yuan He, Yuxia Geng, Ernesto Jiménez-Ruiz, Hang Dong, and Ian Horrocks. Contextual semantic embeddings for ontology subsumption prediction. WWW, pages 1-23, 2023. URL: https://doi.org/10.1007/S11280-023-01169-9.
  30. Jiaoyan Chen, Ernesto Jiménez-Ruiz, Ian Horrocks, Denvar Antonyrajah, Ali Hadian, and Jaehun Lee. Augmenting ontology alignment by semantic embedding and distant supervision. In ESWC, pages 392-408, 2021. URL: https://doi.org/10.1007/978-3-030-77385-4_23.
  31. Jiaoyan Chen, Freddy Lecue, Jeff Z. Pan, Ian Horrocks, and Huajun Chen. Knowledge-based Transfer Learning Explanation. In KR, pages 349-358, 2018. URL: https://aaai.org/ocs/index.php/KR/KR18/paper/view/18054.
  32. Jiaqiang Chen, Niket Tandon, Charles Darwis Hariman, and Gerard de Melo. WebBrain: Joint neural learning of large-scale commonsense knowledge. In ISWC, pages 102-118, 2016. URL: http://gerard.demelo.org/webbrain/, URL: https://doi.org/10.1007/978-3-319-46523-4_7.
  33. Mingyang Chen, Wen Zhang, Yuxia Geng, Zezhong Xu, Jeff Z. Pan, and Huajun Chen. Generalizing to unseen elements: A survey on knowledge extrapolation for knowledge graphs. In Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, IJCAI 2023, 19th-25th August 2023, Macao, SAR, China, pages 6574-6582. ijcai.org, 2023. URL: https://doi.org/10.24963/IJCAI.2023/737.
  34. Wenhu Chen, Yu Su, Xifeng Yan, and William Yang Wang. KGPT: Knowledge-grounded pre-training for data-to-text generation. In EMNLP, 2020. URL: https://doi.org/10.18653/V1/2020.EMNLP-MAIN.697.
  35. Xiang Chen, Ningyu Zhang, Xin Xie, Shumin Deng, Yunzhi Yao, Chuanqi Tan, Fei Huang, Luo Si, and Huajun Chen. Knowprompt: Knowledge-aware prompt-tuning with synergistic optimization for relation extraction. In Frédérique Laforest, Raphaël Troncy, Elena Simperl, Deepak Agarwal, Aristides Gionis, Ivan Herman, and Lionel Médini, editors, WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25 - 29, 2022, pages 2778-2788. ACM, 2022. URL: https://doi.org/10.1145/3485447.3511998.
  36. Zhuo Chen, Jiaoyan Chen, Wen Zhang, Lingbing Guo, Yin Fang, Yufeng Huang, Yuxia Geng, Jeff Z Pan, Wenting Song, and Huajun Chen. Meaformer: Multi-modal entity alignment transformer for meta modality hybrid. In ACM Multimedia, 2023. URL: https://doi.org/10.1145/3581783.3611786.
  37. Wei-Lin Chiang, Zhuohan Li, Zi Lin, Ying Sheng, Zhanghao Wu, Hao Zhang, Lianmin Zheng, Siyuan Zhuang, Yonghao Zhuang, Joseph E. Gonzalez, Ion Stoica, and Eric P. Xing. Vicuna: An open-source chatbot impressing gpt-4 with 90%* chatgpt quality, mar 2023. URL: https://lmsys.org/blog/2023-03-30-vicuna/.
  38. Bonggeun Choi, Daesik Jang, and Youngjoong Ko. Mem-kgc: Masked entity model for knowledge graph completion with pre-trained language model. IEEE Access, 9, 2021. URL: https://doi.org/10.1109/ACCESS.2021.3113329.
  39. Bonggeun Choi and Youngjoong Ko. Knowledge graph extension with a pre-trained language model via unified learning method. Knowledge-Based Systems, page 110245, 2023. URL: https://doi.org/10.1016/J.KNOSYS.2022.110245.
  40. Nurendra Choudhary and Chandan K. Reddy. Complex logical reasoning over knowledge graphs using large language models. CoRR, abs/2305.01157, 2023. URL: https://doi.org/10.48550/ARXIV.2305.01157.
  41. Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, and Noah Fiedel. Palm: Scaling language modeling with pathways, 2023. URL: http://jmlr.org/papers/v24/22-1144.html.
  42. Aida Mostafazadeh Davani, Mark Díaz, and Vinodkumar Prabhakaran. Dealing with disagreements: Looking beyond the majority vote in subjective annotations. TACL, 10:92-110, 2022. URL: https://doi.org/10.1162/TACL_A_00449.
  43. Daniel Daza, Michael Cochez, and Paul Groth. Inductive Entity Representations from Text via Link Prediction. In WWW, pages 798-808, 2021. URL: https://doi.org/10.1145/3442381.3450141.
  44. N De Cao, G Izacard, S Riedel, and F Petroni. Autoregressive entity retrieval. In ICLR 2021-9th International Conference on Learning Representations, volume 2021. ICLR, 2020. URL: https://openreview.net/forum?id=5k8F6UU39V.
  45. Nicola De Cao, Wilker Aziz, and Ivan Titov. Editing factual knowledge in language models. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pages 6491-6506, Online and Punta Cana, Dominican Republic, nov 2021. Association for Computational Linguistics. URL: https://doi.org/10.18653/V1/2021.EMNLP-MAIN.522.
  46. Xiang Deng, Huan Sun, Alyssa Lees, You Wu, and Cong Yu. Turl: Table understanding through representation learning. ACM SIGMOD Record, 51(1):33-40, 2022. URL: https://doi.org/10.1145/3542700.3542709.
  47. Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. NAACL, 2019. URL: https://doi.org/10.18653/V1/N19-1423.
  48. Xin Luna Dong, Evgeniy Gabrilovich, Geremy Heitz, Wilko Horn, Kevin Murphy, Shaohua Sun, and Wei Zhang. From data fusion to knowledge fusion. VLDB, 7(10):881-892, jun 2014. URL: https://doi.org/10.14778/2732951.2732962.
  49. Cícero Nogueira dos Santos, Zhe Dong, Daniel Matthew Cer, John Nham, Siamak Shakeri, Jianmo Ni, and Yun-Hsuan Sung. Knowledge prompts: Injecting world knowledge into language models through soft prompts. ArXiv, 2022. URL: https://doi.org/10.48550/ARXIV.2210.04726.
  50. Mengnan Du, Fengxiang He, Na Zou, et al. Shortcut learning of large language models in natural language understanding: A survey. arXiv, 2022. URL: https://doi.org/10.48550/ARXIV.2208.11857.
  51. Yupei Du, Qi Zheng, Yuanbin Wu, Man Lan, Yan Yang, and Meirong Ma. Understanding gender bias in knowledge base embeddings. In ACL, 2022. URL: https://doi.org/10.18653/V1/2022.ACL-LONG.98.
  52. N Dziri, H Rashkin, T Linzen, and D Reitter. Evaluating attribution in dialogue systems: The BEGIN benchmark. TACL, 2022. URL: https://transacl.org/ojs/index.php/tacl/article/view/3977.
  53. Yanai Elazar, Nora Kassner, Shauli Ravfogel, Amir Feder, Abhilasha Ravichander, Marius Mosbach, Yonatan Belinkov, Hinrich Schütze, and Yoav Goldberg. Measuring causal effects of data statistics on language model’s `factual' predictions. arXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2207.14251.
  54. Yanai Elazar, Nora Kassner, Shauli Ravfogel, Abhilasha Ravichander, Eduard Hovy, Hinrich Schütze, and Yoav Goldberg. Measuring and improving consistency in pretrained language models. TACL, 9, 2021. URL: https://doi.org/10.1162/TACL_A_00410.
  55. Ahmed K. Elmagarmid, Panagiotis G. Ipeirotis, and Vassilios S. Verykios. Duplicate record detection: A survey. TKDE, 19(1):1-16, 2007. URL: https://doi.org/10.1109/TKDE.2007.250581.
  56. Grace Fan, Jin Wang, Yuliang Li, Dan Zhang, and Renée Miller. Semantics-aware dataset discovery from data lakes with contextualized column-based representation learning. VLDB, 2023. URL: https://doi.org/10.14778/3587136.3587146.
  57. Wenfei Fan, Chunming Hu, Xueli Liu, and Ping Lu. Discovering graph functional dependencies. ACM Trans. Database Syst., 45(3), sep 2020. URL: https://doi.org/10.1145/3397198.
  58. Wenfei Fan, Ping Lu, Chao Tian, and Jingren Zhou. Deducing certain fixes to graphs. Proc. VLDB Endow., 12(7):752-765, mar 2019. URL: https://doi.org/10.14778/3317315.3317318.
  59. I. P. Fellegi and A. B. Sunter. A theory for record linkage. Journal of the American Statistical Association, 64:1183-1210, 1969. URL: https://doi.org/10.1080/01621459.1969.10501049.
  60. Besnik Fetahu, Ujwal Gadiraju, and Stefan Dietze. Improving entity retrieval on structured data. In ISWC, pages 474-491, 2015. URL: https://doi.org/10.1007/978-3-319-25007-6_28.
  61. Luis Antonio Galárraga, Christina Teflioudi, Katja Hose, and Fabian Suchanek. Amie: Association rule mining under incomplete evidence in ontological knowledge bases. In WWW, WWW '13, pages 413-422, 2013. URL: https://doi.org/10.1145/2488388.2488425.
  62. Daniel Gao, Yantao Jia, Lei Li, Chengzhen Fu, Zhicheng Dou, Hao Jiang, Xinyu Zhang, Lei Chen, and Zhao Cao. Kmir: A benchmark for evaluating knowledge memorization, identification and reasoning abilities of language models, 2022. URL: https://doi.org/10.48550/arXiv.2202.13529.
  63. Luyu Gao, Zhuyun Dai, Panupong Pasupat, Anthony Chen, Arun Tejasvi Chaganty, Yicheng Fan, Vincent Zhao, Ni Lao, Hongrae Lee, Da-Cheng Juan, and Kelvin Guu. RARR: Researching and revising what language models say, using language models. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 16477-16508, Toronto, Canada, jul 2023. Association for Computational Linguistics. URL: https://doi.org/10.18653/V1/2023.ACL-LONG.910.
  64. Luyu Gao, Zhuyun Dai, Panupong Pasupat, Anthony Chen, Arun Tejasvi Chaganty, Yicheng Fan, Vincent Zhao, Ni Lao, Hongrae Lee, Da-Cheng Juan, and Kelvin Guu. Rarr: Researching and revising what language models say, using language models. In ACL2023, 2023. URL: https://doi.org/10.18653/V1/2023.ACL-LONG.910.
  65. Genet Asefa Gesese, Russa Biswas, Mehwish Alam, and Harald Sack. A survey on knowledge graph embeddings with literals: Which model links better literal-ly? Semantic Web, 12(4):617-647, 2021. URL: https://doi.org/10.3233/SW-200404.
  66. Agneta Ghose, Matteo Lissandrini, Emil Riis Hansen, and Bo Pedersen Weidema. A core ontology for modeling life cycle sustainability assessment on the semantic web. Journal of Industrial Ecology, 26(3):731-747, 2022. URL: https://doi.org/10.1111/jiec.13220.
  67. Bernardo Cuenca Grau, Ian Horrocks, Boris Motik, Bijan Parsia, Peter F. Patel-Schneider, and Ulrike Sattler. OWL 2: The next step for OWL. J. Web Semant, 6(4):309-322, 2008. URL: https://doi.org/10.1016/J.WEBSEM.2008.05.001.
  68. Paul Groth, Elena Paslaru Bontas Simperl, Marieke van Erp, and Denny Vrandecic. Knowledge graphs and their role in the knowledge engineering of the 21st century (dagstuhl seminar 22372). Dagstuhl Reports, 12:60-120, 2022. URL: https://doi.org/10.4230/DAGREP.12.9.60.
  69. OWL Working Group. OWL 2 Web Ontology Language Document Overview: W3C Recommendation, 2012. URL: https://www.w3.org/TR/owl2-overview/.
  70. Kelvin Guu, Kenton Lee, Zora Tung, Panupong Pasupat, and Mingwei Chang. Retrieval augmented language model pre-training. In Hal Daumé III and Aarti Singh, editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 3929-3938. PMLR, 13-18 July 2020. URL: http://proceedings.mlr.press/v119/guu20a.html.
  71. Xiaoqi Han, Ru Li, Xiaoli Li, and Jeff Z. Pan. A Divide and Conquer Framework for Knowledge Editing. Knowledge Based Systems, 279:110826, 2023. URL: https://doi.org/10.1016/j.knosys.2023.110826.
  72. Yuan He, Jiaoyan Chen, Denvar Antonyrajah, and Ian Horrocks. Bertmap: a bert-based ontology alignment system. In AAAI, volume 36, pages 5684-5691, 2022. URL: https://doi.org/10.1609/AAAI.V36I5.20510.
  73. Yuan He, Jiaoyan Chen, Hang Dong, Ian Horrocks, Carlo Allocca, Taehun Kim, and Brahmananda Sapkota. DeepOnto: A Python package for ontology engineering with deep learning. arXiv preprint arXiv:2307.03067, 2023. URL: https://doi.org/10.48550/ARXIV.2307.03067.
  74. Yuan He, Jiaoyan Chen, Hang Dong, Ernesto Jiménez-Ruiz, Ali Hadian, and Ian Horrocks. Machine learning-friendly biomedical datasets for equivalence and subsumption ontology matching. In ISWC, pages 575-591, 2022. URL: https://doi.org/10.1007/978-3-031-19433-7_33.
  75. Yuan He, Jiaoyan Chen, Ernesto Jimenez-Ruiz, Hang Dong, and Ian Horrocks. Language model analysis for ontology subsumption inference. In Findings of the Association for Computational Linguistics: ACL 2023, pages 3439-3453, Toronto, Canada, jul 2023. Association for Computational Linguistics. URL: https://doi.org/10.18653/V1/2023.FINDINGS-ACL.213.
  76. Benjamin Heinzerling and Kentaro Inui. Language models as knowledge bases: On entity representations, storage capacity, and paraphrased queries. In EACL, pages 1772-1791, 2021. URL: https://doi.org/10.18653/V1/2021.EACL-MAIN.153.
  77. Dan Hendrycks, Collin Burns, Anya Chen, and Spencer Ball. CUAD: An expert-annotated NLP dataset for legal contract review. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (Round 1), 2021. URL: https://openreview.net/forum?id=7l1Ygs3Bamw.
  78. Or Honovich, Uri Shaham, Samuel R. Bowman, and Omer Levy. Instruction induction: From few examples to natural language task descriptions. In Anna Rogers, Jordan L. Boyd-Graber, and Naoaki Okazaki, editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, pages 1935-1952. Association for Computational Linguistics, 2023. URL: https://doi.org/10.18653/V1/2023.ACL-LONG.108.
  79. Edward J. Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, and Weizhu Chen. Lora: Low-rank adaptation of large language models. In The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022. OpenReview.net, 2022. URL: https://openreview.net/forum?id=nZeVKeeFYf9.
  80. Nan Hu, Yike Wu, Guilin Qi, Dehai Min, Jiaoyan Chen, Jeff Z. Pan, and Zafar Ali. An empirical study of pre-trained language models in simple knowledge graph question answering. World Wide Web (WWW), 26(5):2855-2886, 2023. URL: https://doi.org/10.1007/S11280-023-01166-Y.
  81. Ziniu Hu, Yichong Xu, Wenhao Yu, Shuohang Wang, Ziyi Yang, Chenguang Zhu, Kai-Wei Chang, and Yizhou Sun. Empowering language models with knowledge graph reasoning for open-domain question answering. In Yoav Goldberg, Zornitsa Kozareva, and Yue Zhang, editors, Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022, Abu Dhabi, United Arab Emirates, December 7-11, 2022, pages 9562-9581. Association for Computational Linguistics, 2022. URL: https://doi.org/10.18653/V1/2022.EMNLP-MAIN.650.
  82. Jian Huang, Jianfeng Gao, Jiangbo Miao, Xiaolong Li, Kuansan Wang, Fritz Behr, and C. Lee Giles. Exploring web scale language models for search query processing. In WWW, pages 451-460, 2010. URL: https://doi.org/10.1145/1772690.1772737.
  83. Jie Huang and Kevin Chen-Chuan Chang. Towards reasoning in large language models: A survey. Findings of ACL, 2023. URL: https://doi.org/10.48550/ARXIV.2212.10403.
  84. Ningyuan Huang, Yash R Deshpande, Yibo Liu, Houda Alberts, Kyunghyun Cho, Clara Vania, and Iacer Calixto. Endowing language models with multimodal knowledge graph representations. arXiv, 2022. URL: https://doi.org/10.48550/ARXIV.2206.13163.
  85. Wenyu Huang, Mirella Lapata, Pavlos Vougiouklis, Nikos Papasarantopoulos, and Jeff Z. Pan. Retrieval Augmented Generation with Rich Answer Encoding. In Proc. of IJCNLP-AACL 2023, 2023. Google Scholar
  86. Jena D. Hwang, Chandra Bhagavatula, Ronan Le Bras, Jeff Da, Keisuke Sakaguchi, Antoine Bosselut, and Yejin Choi. Comet-atomic 2020: On symbolic and neural commonsense knowledge graphs. In AAAI, 2021. URL: https://doi.org/10.1609/AAAI.V35I7.16792.
  87. Gautier Izacard and Edouard Grave. Leveraging passage retrieval with generative models for open domain question answering. In Paola Merlo, Jörg Tiedemann, and Reut Tsarfaty, editors, Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, EACL 2021, Online, April 19 - 23, 2021, pages 874-880. Association for Computational Linguistics, 2021. URL: https://doi.org/10.18653/V1/2021.EACL-MAIN.74.
  88. Sarthak Jain and Byron C. Wallace. Attention is not explanation. In Jill Burstein, Christy Doran, and Thamar Solorio, editors, Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2019, Minneapolis, MN, USA, June 2-7, 2019, Volume 1 (Long and Short Papers), pages 3543-3556. Association for Computational Linguistics, 2019. URL: https://doi.org/10.18653/V1/N19-1357.
  89. Krzysztof Janowicz, Bo Yan, Blake Regalia, Rui Zhu, and Gengchen Mai. Debiasing knowledge graphs: Why female presidents are not like female popes. In Marieke van Erp, Medha Atre, Vanessa López, Kavitha Srinivas, and Carolina Fortuna, editors, Proceedings of the ISWC 2018 Posters & Demonstrations, Industry and Blue Sky Ideas Tracks co-located with 17th International Semantic Web Conference (ISWC 2018), Monterey, USA, October 8th - 12th, 2018, volume 2180 of CEUR Workshop Proceedings. CEUR-WS.org, 2018. URL: https://ceur-ws.org/Vol-2180/ISWC_2018_Outrageous_Ideas_paper_17.pdf.
  90. Zhengbao Jiang, Frank F Xu, Jun Araki, et al. How can we know what language models know? TACL, 8:423-438, 2020. URL: https://doi.org/10.1162/TACL_A_00324.
  91. Ernesto Jiménez-Ruiz and Bernardo Cuenca Grau. Logmap: Logic-based and scalable ontology matching. In ISWC, pages 273-288, 2011. URL: https://doi.org/10.1007/978-3-642-25073-6_18.
  92. Ernesto Jiménez-Ruiz, Oktie Hassanzadeh, Vasilis Efthymiou, Jiaoyan Chen, and Kavitha Srinivas. Semtab 2019: Resources to benchmark tabular data to knowledge graph matching systems. In ESWC, pages 514-530, 2020. URL: https://doi.org/10.1007/978-3-030-49461-2_30.
  93. Martin Josifoski, Nicola De Cao, Maxime Peyrard, Fabio Petroni, and Robert West. Genie: Generative information extraction. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 4626-4643, 2022. URL: https://doi.org/10.18653/V1/2022.NAACL-MAIN.342.
  94. Martin Josifoski, Marija Sakota, Maxime Peyrard, and Robert West. Exploiting asymmetry for synthetic training data generation: Synthie and the case of information extraction. ArXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2303.04132.
  95. Jan-Christoph Kalo and Leandra Fichtel. Kamel: Knowledge analysis with multitoken entities in language models. In AKBC, 2022. URL: https://www.akbc.ws/2022/assets/pdfs/15_kamel_knowledge_analysis_with_.pdf.
  96. Jan-Christoph Kalo, Simon Razniewski, Sneha Singhania, and Jeff Z. Pan. LM-KBC: Knowledge base construction from pre-trained language models. ISWC Challenges, 2023. Google Scholar
  97. Nikhil Kandpal, Haikang Deng, Adam Roberts, Eric Wallace, and Colin Raffel. Large language models struggle to learn long-tail knowledge, 2023. URL: https://doi.org/10.48550/ARXIV.2211.08411.
  98. Ehud Karpas, Omri Abend, Yonatan Belinkov, Barak Lenz, Opher Lieber, Nir Ratner, Yoav Shoham, Hofit Bata, Yoav Levine, Kevin Leyton-Brown, Dor Muhlgay, Noam Rozen, Erez Schwartz, Gal Shachaf, Shai Shalev-Shwartz, Amnon Shashua, and Moshe Tennenholtz. MRKL systems: A modular, neuro-symbolic architecture that combines large language models, external knowledge sources and discrete reasoning. CoRR, abs/2205.00445, 2022. URL: https://doi.org/10.48550/ARXIV.2205.00445.
  99. Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, and Wen tau Yih. Dense passage retrieval for open-domain question answering. In EMNLP, pages 6769-6781, 2020. URL: https://doi.org/10.18653/V1/2020.EMNLP-MAIN.550.
  100. Nora Kassner and Hinrich Schütze. Negated and misprimed probes for pretrained language models: Birds can talk, but cannot fly. In ACL, 2020. URL: https://doi.org/10.18653/V1/2020.ACL-MAIN.698.
  101. Bosung Kim, Taesuk Hong, Youngjoong Ko, and Jungyun Seo. Multi-task learning for knowledge graph completion with pre-trained language models. In COLING, pages 1737-1743, 2020. URL: https://doi.org/10.18653/V1/2020.COLING-MAIN.153.
  102. Holger Knublauch and Dimitris Kontokostas. Shapes constraint language (SHACL). Technical report, W3C, jul 2017. URL: https://www.w3.org/TR/shacl/.
  103. Keti Korini and Christian Bizer. Column type annotation using chatgpt. arXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2306.00745.
  104. Angelie Kraft and Ricardo Usbeck. The lifecycle of “facts”: A survey of social bias in knowledge graphs. In AACL, 2022. URL: https://aclanthology.org/2022.aacl-main.49.
  105. Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, Joshua Kravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A. Shamma, Michael S. Bernstein, and Li Fei-Fei. Visual genome: Connecting language and vision using crowdsourced dense image annotations. Int. J. Comput. Vis., 123(1):32-73, 2017. URL: https://doi.org/10.1007/S11263-016-0981-7.
  106. Viet Dac Lai, Nghia Trung Ngo, Amir Pouran Ben Veyseh, Hieu Man, Franck Dernoncourt, Trung Bui, and Thien Huu Nguyen. Chatgpt beyond english: Towards a comprehensive evaluation of large language models in multilingual learning. ArXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2304.05613.
  107. Yoonjoo Lee, John Joon Young Chung, Tae Soo Kim, Jean Y. Song, and Juho Kim. Promptiverse: Scalable generation of scaffolding prompts through human-ai hybrid knowledge graph annotation. In CHI, 2022. URL: https://doi.org/10.1145/3491102.3502087.
  108. Alina Leidinger and Richard Rogers. Which stereotypes are moderated and under-moderated in search engine autocompletion? In Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, FAccT 2023, Chicago, IL, USA, June 12-15, 2023, pages 1049-1061. ACM, 2023. URL: https://doi.org/10.1145/3593013.3594062.
  109. Patrick Lewis, Ethan Perez, Aleksandra Piktus, Fabio Petroni, Vladimir Karpukhin, Naman Goyal, Heinrich Küttler, Mike Lewis, Wen-tau Yih, Tim Rocktäschel, Sebastian Riedel, and Douwe Kiela. Retrieval-augmented generation for knowledge-intensive NLP tasks. In NeurIPS, volume 33, 2020. URL: https://proceedings.neurips.cc/paper/2020/hash/6b493230205f780e1bc26945df7481e5-Abstract.html.
  110. Belinda Z. Li, Sewon Min, Srinivasan Iyer, Yashar Mehdad, and Wen-tau Yih. Efficient one-pass end-to-end entity linking for questions. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020, pages 6433-6441, 2020. URL: https://doi.org/10.18653/V1/2020.EMNLP-MAIN.522.
  111. Bo Li, Gexiang Fang, Yang Yang, Quansen Wang, Wei Ye, Wen Zhao, and Shikun Zhang. Evaluating chatgpt’s information extraction capabilities: An assessment of performance, explainability, calibration, and faithfulness. ArXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2304.11633.
  112. Da Li, Ming Yi, and Yukai He. Lp-bert: Multi-task pre-training knowledge graph bert for link prediction. arXiv, 2022. URL: https://arxiv.org/abs/2201.04843.
  113. Junnan Li, Dongxu Li, Silvio Savarese, and Steven Hoi. BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. In Andreas Krause, Emma Brunskill, Kyunghyun Cho, Barbara Engelhardt, Sivan Sabato, and Jonathan Scarlett, editors, Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 19730-19742. PMLR, 23-29 July 2023. URL: https://proceedings.mlr.press/v202/li23q.html.
  114. Tianyi Li, Mohammad Javad Hosseini, Sabine Weber, and Mark Steedman. Language models are poor learners of directional inference. In Findings of the Association for Computational Linguistics: EMNLP 2022, pages 903-921, Abu Dhabi, United Arab Emirates, dec 2022. Association for Computational Linguistics. URL: https://doi.org/10.18653/V1/2022.FINDINGS-EMNLP.64.
  115. Tianyi Li, Wenyu Huang, Nikos Papasarantopoulos, Pavlos Vougiouklis, and Jeff Z. Pan. Task-specific pre-training and prompt decomposition for knowledge graph population with language models. In LM-KBC, 2022. URL: https://doi.org/10.48550/ARXIV.2208.12539.
  116. Stephan Linzbach, Tim Tressel, Laura Kallmeyer, Stefan Dietze, and Hajira Jabeen. Decoding prompt syntax: Analysing its impact on knowledge retrieval in large language models. In NLP4KGC, 2023. URL: https://doi.org/10.1145/3543873.3587655.
  117. Matteo Lissandrini, Davide Mottin, Themis Palpanas, Dimitra Papadimitriou, and Yannis Velegrakis. Unleashing the power of information graphs. SIGMOD Rec., 43(4):21-26, 2014. URL: https://doi.org/10.1145/2737817.2737822.
  118. Fangyu Liu, Ivan Vulic, Anna Korhonen, and Nigel Collier. Learning domain-specialised representations for cross-lingual biomedical entity linking. In ACL, 2021. URL: https://doi.org/10.18653/V1/2021.ACL-SHORT.72.
  119. Hao Liu, Yehoshua Perl, and James Geller. Concept placement using bert trained by transforming and summarizing biomedical ontology structure. Journal of Biomedical Informatics, 112:103607, 2020. URL: https://doi.org/10.1016/J.JBI.2020.103607.
  120. Hugo Liu and Push Singh. Commonsense reasoning in and over natural language. In Knowledge-Based Intelligent Information and Engineering Systems, pages 293-306, 2004. URL: https://doi.org/10.1007/978-3-540-30134-9_40.
  121. Jixiong Liu, Yoan Chabot, Raphaël Troncy, Viet-Phi Huynh, Thomas Labbé, and Pierre Monnin. From tabular data to knowledge graphs: A survey of semantic table interpretation tasks and methods. J. Web Semant., 2022. URL: https://doi.org/10.1016/J.WEBSEM.2022.100761.
  122. Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer, and Veselin Stoyanov. Roberta: A robustly optimized bert pretraining approach. arXiv, 2019. URL: https://arxiv.org/abs/1907.11692.
  123. Fengyuan Lu, Peijin Cong, and Xinli Huang. Utilizing textual information in knowledge graph embedding: A survey of methods and applications. IEEE Access, 8:92072-92088, 2020. URL: https://doi.org/10.1109/ACCESS.2020.2995074.
  124. Chaitanya Malaviya, Chandra Bhagavatula, Antoine Bosselut, and Yejin Choi. Commonsense knowledge base completion with structural and semantic context. In AAAI, volume 34, pages 2925-2933, 2020. URL: https://doi.org/10.1609/AAAI.V34I03.5684.
  125. Alex Mallen, Akari Asai, Victor Zhong, Rajarshi Das, Daniel Khashabi, and Hannaneh Hajishirzi. When not to trust language models: Investigating effectiveness of parametric and non-parametric memories. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 9802-9822, Toronto, Canada, jul 2023. Association for Computational Linguistics. URL: https://doi.org/10.18653/V1/2023.ACL-LONG.546.
  126. Rui Mao, Qian Liu, Kai He, et al. The biases of pre-trained language models: An empirical study on prompt-based sentiment analysis and emotion detection. IEEE Transactions on Affective Computing, pages 1-11, 2022. URL: https://doi.org/10.1109/TAFFC.2022.3204972.
  127. Ninareh Mehrabi, Thamme Gowda, Fred Morstatter, Nanyun Peng, and Aram Galstyan. Man is to person as woman is to location: Measuring gender bias in named entity recognition. In Ujwal Gadiraju, editor, HT '20: 31st ACM Conference on Hypertext and Social Media, Virtual Event, USA, July 13-15, 2020, pages 231-232. ACM, 2020. URL: https://doi.org/10.1145/3372923.3404804.
  128. Chris Mellish and Jeff Z. Pan. Natural Language Directed Inference from Ontologies. In Artificial Intelligence Journal, 2008. URL: https://doi.org/10.1016/J.ARTINT.2008.01.003.
  129. Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D Manning, and Chelsea Finn. Memory-based model editing at scale. In International Conference on Machine Learning, pages 15817-15831. PMLR, 2022. URL: https://proceedings.mlr.press/v162/mitchell22a.html.
  130. Fedor Moiseev, Zhe Dong, Enrique Alfonseca, and Martin Jaggi. Skill: Structured knowledge infusion for large language models. In NAACL, 2022. URL: https://doi.org/10.18653/V1/2022.NAACL-MAIN.113.
  131. Moin Nadeem, Anna Bethke, and Siva Reddy. Stereoset: Measuring stereotypical bias in pretrained language models. In Chengqing Zong, Fei Xia, Wenjie Li, and Roberto Navigli, editors, Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL/IJCNLP 2021, (Volume 1: Long Papers), Virtual Event, August 1-6, 2021, pages 5356-5371. Association for Computational Linguistics, 2021. URL: https://doi.org/10.18653/V1/2021.ACL-LONG.416.
  132. Mojtaba Nayyeri, Zihao Wang, Mst Akter, Mirza Mohtashim Alam, Md Rashad Al Hasan Rony, Jens Lehmann, Steffen Staab, et al. Integrating knowledge graph embedding and pretrained language models in hypercomplex spaces. arXiv, 2022. URL: https://doi.org/10.48550/ARXIV.2208.02743.
  133. Sophie Neutel and Maaike HT de Boer. Towards automatic ontology alignment using bert. In AAAI Spring Symposium: Combining Machine Learning with Knowledge Engineering, 2021. URL: https://ceur-ws.org/Vol-2846/paper28.pdf.
  134. Tuan-Phong Nguyen, Simon Razniewski, Aparna Varde, and Gerhard Weikum. Extracting cultural commonsense knowledge at scale. In WWW, pages 1907-1917, 2023. URL: https://doi.org/10.1145/3543507.3583535.
  135. Natalya Fridman Noy, Yuqing Gao, Anshu Jain, Anant Narayanan, Alan Patterson, and Jamie Taylor. Industry-scale knowledge graphs: lessons and challenges. Commun. ACM, 62(8):36-43, 2019. URL: https://doi.org/10.1145/3331166.
  136. Barlas Oguz, Xilun Chen, Vladimir Karpukhin, Stan Peshterliev, Dmytro Okhonko, Michael Sejr Schlichtkrull, Sonal Gupta, Yashar Mehdad, and Scott Yih. Unik-qa: Unified representations of structured and unstructured knowledge for open-domain question answering. In Marine Carpuat, Marie-Catherine de Marneffe, and Iván Vladimir Meza Ruíz, editors, Findings of the Association for Computational Linguistics: NAACL 2022, Seattle, WA, United States, July 10-15, 2022, pages 1535-1546. Association for Computational Linguistics, 2022. URL: https://doi.org/10.18653/V1/2022.FINDINGS-NAACL.115.
  137. OpenAI. GPT-4 technical report, 2023. URL: https://doi.org/10.48550/ARXIV.2303.08774.
  138. Long Ouyang, Jeff Wu, Xu Jiang, et al. Training language models to follow instructions with human feedback. NeurIPS, 2022. URL: http://papers.nips.cc/paper_files/paper/2022/hash/b1efde53be364a73914f58805a001731-Abstract-Conference.html.
  139. Jeff Z. Pan. Resource Description Framework. In Handbook on Ontologies. IOS Press, 2009. URL: https://doi.org/10.1007/978-3-540-92673-3_3.
  140. Jeff Z. Pan and Ian Horrocks. Web Ontology Reasoning with Datatype Groups. In ISWC, pages 47-63, 2003. URL: https://doi.org/10.1007/978-3-540-39718-2_4.
  141. Jeff Z. Pan, Guido Vetere, José Manuél Gómez-Pérez, and Honghan Wu. Exploiting linked data and knowledge graphs in large organisations. Springer International Publishing, 2017. URL: https://doi.org/10.1007/978-3-319-45654-6.
  142. Shirui Pan, Linhao Luo, Yufei Wang, Chen Chen, Jiapu Wang, and Xindong Wu. Unifying large language models and knowledge graphs: A roadmap. arXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2306.08302.
  143. Lalchand Pandia and Allyson Ettinger. Sorting through the noise: Testing robustness of information processing in pre-trained language models. EMNLP, 2021. URL: https://doi.org/10.48550/arXiv.2109.12393.
  144. George Papadakis, Ekaterini Ioannou, Emanouil Thanos, and Themis Palpanas. The four generations of entity resolution. In Synthesis Lectures on Data Management, 2021. URL: https://doi.org/10.2200/S01067ED1V01Y202012DTM064.
  145. Jae Sung Park, Chandra Bhagavatula, Roozbeh Mottaghi, Ali Farhadi, and Yejin Choi. Visualcomet: Reasoning about the dynamic context of a still image. In Computer Vision - ECCV 2020 - 16th European Conference, Glasgow, UK, August 23-28, 2020, Proceedings, Part V, volume 12350 of Lecture Notes in Computer Science, pages 508-524. Springer, 2020. URL: https://doi.org/10.1007/978-3-030-58558-7_30.
  146. Heiko Paulheim. Knowledge graph refinement: A survey of approaches and evaluation methods. Semantic Web, 8(3):489-508, 2017. URL: https://doi.org/10.3233/SW-160218.
  147. Ralph Peeters, Reng Chiz Der, and Christian Bizer. WDC products: A multi-dimensional entity matching benchmark. In Proceedings 27th International Conference on Extending Database Technology, EDBT 2024, Paestum, Italy, March 25 - March 28, pages 22-33, 2024. URL: https://doi.org/10.48786/EDBT.2024.03.
  148. Shichao Pei, Lu Yu, Guoxian Yu, and Xiangliang Zhang. Rea: Robust cross-lingual entity alignment between knowledge graphs. In KDD, pages 2175-2184, 2020. URL: https://doi.org/10.1145/3394486.3403268.
  149. Guilherme Penedo, Quentin Malartic, Daniel Hesslow, Ruxandra Cojocaru, Alessandro Cappelli, Hamza Alobeidli, Baptiste Pannier, Ebtesam Almazrouei, and Julien Launay. The refinedweb dataset for falcon llm: Outperforming curated corpora with web data, and web data only, 2023. URL: https://doi.org/10.48550/ARXIV.2306.01116.
  150. Matthew E Peters, Mark Neumann, Robert L Logan IV, Roy Schwartz, Vidur Joshi, Sameer Singh, and Noah A Smith. Knowledge enhanced contextual word representations. EMNLP, 2019. URL: https://arxiv.org/abs/1909.04164.
  151. Fabio Petroni, Patrick Lewis, Aleksandra Piktus, et al. How context affects language models' factual predictions. AKBC, 2020. URL: https://doi.org/10.24432/C5201W.
  152. Fabio Petroni, Tim Rocktäschel, Sebastian Riedel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, and Alexander Miller. Language models as knowledge bases? In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 2463-2473, Hong Kong, China, nov 2019. Association for Computational Linguistics. URL: https://doi.org/10.18653/V1/D19-1250.
  153. Gabriele Picco, Marcos Martinez Galindo, Alberto Purpura, Leopold Fuchs, Vanessa Lopez, and Thanh Lam Hoang. Zshot: An open-source framework for zero-shot named entity recognition and relation extraction. In Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 3: System Demonstrations), pages 357-368, 2023. URL: https://doi.org/10.18653/V1/2023.ACL-DEMO.34.
  154. Barbara Plank. The “problem” of human label variation: On ground truth in data, modeling and evaluation. EMNLP, abs/2211.02570, 2022. URL: https://doi.org/10.18653/V1/2022.EMNLP-MAIN.731.
  155. Eric Prud'hommeaux, José Emilio Labra Gayo, and Harold R. Solbrig. Shape expressions: an RDF validation and transformation language. In Harald Sack, Agata Filipowska, Jens Lehmann, and Sebastian Hellmann, editors, Proceedings of the 10th International Conference on Semantic Systems, SEMANTiCS 2014, Leipzig, Germany, September 4-5, 2014, pages 32-40. ACM, 2014. URL: https://doi.org/10.1145/2660517.2660523.
  156. Yujia Qin, Shihao Liang, Yining Ye, Kunlun Zhu, Lan Yan, Yaxi Lu, Yankai Lin, Xin Cong, Xiangru Tang, Bill Qian, Sihan Zhao, Runchu Tian, Ruobing Xie, Jie Zhou, Mark Gerstein, Dahai Li, Zhiyuan Liu, and Maosong Sun. Toolllm: Facilitating large language models to master 16000+ real-world apis, 2023. URL: https://doi.org/10.48550/ARXIV.2307.16789.
  157. Kashif Rabbani, Matteo Lissandrini, and Katja Hose. SHACL and shex in the wild: A community survey on validating shapes generation and adoption. In Frédérique Laforest, Raphaël Troncy, Elena Simperl, Deepak Agarwal, Aristides Gionis, Ivan Herman, and Lionel Médini, editors, Companion of The Web Conference 2022, Virtual Event / Lyon, France, April 25 - 29, 2022, pages 260-263. ACM, 2022. URL: https://doi.org/10.1145/3487553.3524253.
  158. Kashif Rabbani, Matteo Lissandrini, and Katja Hose. Extraction of validating shapes from very large knowledge graphs. Proc. VLDB Endow., 16(5):1023-1032, jan 2023. URL: https://doi.org/10.14778/3579075.3579078.
  159. Kashif Rabbani, Matteo Lissandrini, and Katja Hose. SHACTOR: improving the quality of large-scale knowledge graphs with validating shapes. In Sudipto Das, Ippokratis Pandis, K. Selçuk Candan, and Sihem Amer-Yahia, editors, Companion of the 2023 International Conference on Management of Data, SIGMOD/PODS 2023, Seattle, WA, USA, June 18-23, 2023, pages 151-154. ACM, 2023. URL: https://doi.org/10.1145/3555041.3589723.
  160. Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In Marina Meila and Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 8748-8763. PMLR, 18-24 July 2021. URL: https://proceedings.mlr.press/v139/radford21a.html.
  161. Leonardo Ranaldi, Elena Sofia Ruzzetti, David A. Venditti, Dario Onorati, and Fabio Massimo Zanzotto. A trip towards fairness: Bias and de-biasing in large language models. ArXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2305.13862.
  162. Abhilasha Ravichander, Eduard Hovy, Kaheer Suleman, Adam Trischler, and Jackie Chi Kit Cheung. On the systematicity of probing contextualized word representations: The case of hypernymy in BERT. In Joint Conference on Lexical and Computational Semantics, pages 88-102, 2020. URL: https://aclanthology.org/2020.starsem-1.10/.
  163. Simon Razniewski, Andrew Yates, Nora Kassner, and Gerhard Weikum. Language models as or for knowledge bases. CoRR, abs/2110.04888, 2021. URL: https://arxiv.org/abs/2110.04888.
  164. Leonardo F. R. Ribeiro, Martin Schmitt, Hinrich Schütze, and Iryna Gurevych. Investigating pretrained language models for graph-to-text generation. Workshop on Natural Language Processing for Conversational AI, abs/2007.08426, 2021. URL: https://arxiv.org/abs/2007.08426.
  165. Petar Ristoski, Jessica Rosati, Tommaso Di Noia, Renato De Leone, and Heiko Paulheim. Rdf2vec: RDF graph embeddings and their applications. Semantic Web, 10(4):721-752, 2019. URL: https://doi.org/10.3233/SW-180317.
  166. Devendra Sachan, Yuhao Zhang, Peng Qi, et al. Do syntax trees help pre-trained transformers extract information? In EACL, pages 2647-2661, apr 2021. URL: https://doi.org/10.18653/V1/2021.EACL-MAIN.228.
  167. Maarten Sap, Dallas Card, Saadia Gabriel, Yejin Choi, and Noah A. Smith. The risk of racial bias in hate speech detection. In ACL, 2019. URL: https://doi.org/10.18653/V1/P19-1163.
  168. Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, and Thomas Scialom. Toolformer: Language models can teach themselves to use tools. In arXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2302.04761.
  169. Patrick Schramowski, Cigdem Turan, Nico Andersen, et al. Large pre-trained language models contain human-like biases of what is right and wrong to do. Nature Machine Intelligence, 4(3):258-268, 2022. URL: https://doi.org/10.1038/S42256-022-00458-8.
  170. Jingyu Shao, Qing Wang, Asiri Wijesinghe, and Erhard Rahm. Ergan: Generative adversarial networks for entity resolution. In ICDM, pages 1250-1255, 2020. URL: https://doi.org/10.1109/ICDM50108.2020.00158.
  171. Jingchuan Shi, Jiaoyan Chen, Hang Dong, Ishita Khan, Lizzie Liang, Qunzhi Zhou, Zhe Wu, and Ian Horrocks. Subsumption prediction for e-commerce taxonomies. In ESWC, pages 244-261, 2023. URL: https://doi.org/10.1007/978-3-031-33455-9_15.
  172. Peng Shi and Jimmy J. Lin. Simple BERT models for relation extraction and semantic role labeling. ArXiv, abs/1904.05255, 2019. URL: https://arxiv.org/abs/1904.05255.
  173. Jaeho Shin, Sen Wu, Feiran Wang, Christopher De Sa, Ce Zhang, and Christopher Ré. Incremental knowledge base construction using deepdive. In VLDB, volume 8, page 1310, 2015. URL: https://doi.org/10.14778/2809974.2809991.
  174. Taylor Shin, Yasaman Razeghi, Robert L. Logan IV, Eric Wallace, and Sameer Singh. AutoPrompt: Eliciting Knowledge from Language Models with Automatically Generated Prompts. In EMNLP, pages 4222-4235, 2020. URL: https://doi.org/10.18653/V1/2020.EMNLP-MAIN.346.
  175. Sneha Singhania, Tuan-Phong Nguyen, and Simon Razniewski. LM-KBC: Knowledge base construction from pre-trained language models. Semantic Web challenge, 2022. URL: https://ceur-ws.org/Vol-3274/paper1.pdf.
  176. Sneha Singhania, Simon Razniewski, and Gerhard Weikum. Predicting Document Coverage for Relation Extraction. Transactions of the Association for Computational Linguistics, 10:207-223, mar 2022. URL: https://doi.org/10.1162/TACL_A_00456.
  177. Xiaoqi Han, Ru Li, Hongye Tan, Yuanlong Wang, Qinghua Chai, and Jeff Z. Pan. Improving Sequential Model Editing with Fact Retrieval. In Findings of EMNLP, 2023.
  178. Ran Song, Shizhu He, Shengxiang Gao, Li Cai, Kang Liu, Zhengtao Yu, and Jun Zhao. Multilingual knowledge graph completion from pretrained language models with knowledge constraints. In Findings of the Association for Computational Linguistics: ACL 2023, pages 7709-7721, 2023. URL: https://doi.org/10.18653/V1/2023.FINDINGS-ACL.488.
  179. Aarohi Srivastava et al. Beyond the imitation game: Quantifying and extrapolating the capabilities of language models. Transactions on Machine Learning Research, 2023. URL: https://openreview.net/forum?id=uyTL5Bvosj.
  180. Fabian M. Suchanek, Gjergji Kasneci, and Gerhard Weikum. Yago: A core of semantic knowledge. In Proceedings of the 16th International Conference on World Wide Web, WWW '07, pages 697-706, New York, NY, USA, 2007. Association for Computing Machinery. URL: https://doi.org/10.1145/1242572.1242667.
  181. Yoshihiko Suhara, Jinfeng Li, Yuliang Li, Dan Zhang, Çağatay Demiralp, Chen Chen, and Wang-Chiew Tan. Annotating columns with pre-trained language models. In SIGMOD, pages 1493-1503, 2022. URL: https://doi.org/10.1145/3514221.3517906.
  182. Zequn Sun, Qingheng Zhang, Wei Hu, Chengming Wang, Muhao Chen, Farahnaz Akrami, and Chengkai Li. A benchmarking study of embedding-based entity alignment for knowledge graphs. VLDB, 13(11):2326-2340, 2020. URL: http://www.vldb.org/pvldb/vol13/p2326-sun.pdf.
  183. Alexandre Tamborrino, Nicola Pellicano, Baptiste Pannier, et al. Pre-training is (almost) all you need: An application to commonsense reasoning. ACL, 2020. URL: https://doi.org/10.48550/arXiv.2004.14074.
  184. Wang-Chiew Tan, Yuliang Li, Pedro Rodriguez, Richard James, Xi Victoria Lin, Alon Halevy, and Scott Yih. Reimagining retrieval augmented language models for answering queries. In Findings of ACL, pages 6131-6146, 2023. URL: https://doi.org/10.18653/V1/2023.FINDINGS-ACL.382.
  185. Niket Tandon and Gerard de Melo. Information extraction from web-scale n-gram data. In Web N-gram Workshop, volume 5803, pages 8-15, 2010. URL: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.365.2318.
  186. Niket Tandon, Gerard de Melo, Fabian M. Suchanek, and Gerhard Weikum. Webchild: harvesting and organizing commonsense knowledge from the web. Proceedings of the 7th ACM international conference on Web search and data mining, 2014. URL: https://doi.org/10.1145/2556195.2556245.
  187. Niket Tandon, Gerard de Melo, and Gerhard Weikum. Deriving a Web-scale common sense fact database. In AAAI, pages 152-157, 2011. URL: https://doi.org/10.1609/AAAI.V25I1.7841.
  188. Nan Tang, Ju Fan, Fangyi Li, Jianhong Tu, Xiaoyong Du, Guoliang Li, Sam Madden, and Mourad Ouzzani. Rpt: relational pre-trained transformer is almost all you need towards democratizing data preparation. VLDB, 2021. URL: https://doi.org/10.14778/3457390.3457391.
  189. Ruixiang Tang, Xiaotian Han, Xiaoqian Jiang, and Xia Hu. Does synthetic data generation of LLMs help clinical text mining? ArXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2303.04360.
  190. Nicolas Tempelmeier, Elena Demidova, and Stefan Dietze. Inferring missing categorical information in noisy and sparse web markup. In WWW, 2018. URL: https://doi.org/10.1145/3178876.3186028.
  191. Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. Llama: Open and efficient foundation language models. arXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2302.13971.
  192. Pat Verga, Haitian Sun, Livio Baldini Soares, et al. Facts as experts: Adaptable and interpretable neural memory over symbolic knowledge. NAACL, 2021. URL: https://arxiv.org/abs/2007.00849.
  193. Blerta Veseli, Sneha Singhania, Simon Razniewski, and Gerhard Weikum. Evaluating language models for knowledge base completion. In The Semantic Web - 20th International Conference, ESWC 2023, Hersonissos, Crete, Greece, May 28 - June 1, 2023, Proceedings, volume 13870 of Lecture Notes in Computer Science, pages 227-243. Springer, 2023. URL: https://doi.org/10.1007/978-3-031-33455-9_14.
  194. Liane Vogel, Benjamin Hilprecht, and Carsten Binnig. Towards foundation models for relational databases [vision paper]. TRL@NeurIPS2022, 2023. URL: https://doi.org/10.48550/ARXIV.2305.15321.
  195. Pavlos Vougiouklis, Nikos Papasarantopoulos, Danna Zheng, David Tuckey, Chenxin Diao, Zhili Shen, and Jeff Z. Pan. FastRAT: Fast and Efficient Cross-lingual Text-to-SQL Semantic Parsing. In Proc. of IJCNLP-AACL 2023, 2023.
  196. Denny Vrandečić and Markus Krötzsch. Wikidata: A free collaborative knowledgebase. Commun. ACM, 57:78-85, sep 2014. URL: https://doi.org/10.1145/2629489.
  197. David Wadden, Ulme Wennberg, Yi Luan, and Hannaneh Hajishirzi. Entity, relation, and event extraction with contextualized span representations. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5784-5789, Hong Kong, China, nov 2019. Association for Computational Linguistics. URL: https://doi.org/10.18653/V1/D19-1585.
  198. Bin Wang, Guangtao Wang, Jing Huang, Jiaxuan You, Jure Leskovec, and C.-C. Jay Kuo. Inductive learning on commonsense knowledge graph completion. In Joint Conference on Neural Networks, pages 1-8, 2021. URL: https://doi.org/10.1109/IJCNN52387.2021.9534355.
  199. Bo Wang, Tao Shen, Guodong Long, Tianyi Zhou, Ying Wang, and Yi Chang. Structure-augmented text representation learning for efficient knowledge graph completion. In WWW, pages 1737-1748, 2021. URL: https://doi.org/10.1145/3442381.3450043.
  200. Liang Wang, Wei Zhao, Zhuoyu Wei, and Jingming Liu. SimKGC: Simple Contrastive Knowledge Graph Completion with Pre-trained Language Models. In ACL, pages 4281-4294, may 2022. URL: https://doi.org/10.18653/V1/2022.ACL-LONG.295.
  201. Xiao Wang, Wei Zhou, Can Zu, Han Xia, Tianze Chen, Yuan Zhang, Rui Zheng, Junjie Ye, Qi Zhang, Tao Gui, Jihua Kang, J. Yang, Siyuan Li, and Chunsai Du. Instructuie: Multi-task instruction tuning for unified information extraction. ArXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2304.08085.
  202. Xiaozhi Wang, Tianyu Gao, Zhaocheng Zhu, Zhengyan Zhang, Zhiyuan Liu, Juanzi Li, and Jian Tang. Kepler: A unified model for knowledge embedding and pre-trained language representation. TACL, 9:176-194, 2021. URL: https://doi.org/10.1162/TACL_A_00360.
  203. Xiyu Wang and Nora El-Gohary. Deep learning-based relation extraction and knowledge graph-based representation of construction safety requirements. Automation in Construction, 147:104696, 2023. URL: https://doi.org/10.1016/j.autcon.2022.104696.
  204. Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Ed Huai-hsin Chi, F. Xia, Quoc Le, and Denny Zhou. Chain of thought prompting elicits reasoning in large language models. ArXiv, abs/2201.11903, 2022. URL: https://arxiv.org/abs/2201.11903.
  205. Xiang Wei, Xingyu Cui, Ning Cheng, Xiaobin Wang, Xin Zhang, Shen Huang, Pengjun Xie, Jinan Xu, Yufeng Chen, Meishan Zhang, Yong Jiang, and Wenjuan Han. Zero-shot information extraction via chatting with chatgpt. ArXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2302.10205.
  206. Gerhard Weikum, Luna Dong, Simon Razniewski, and Fabian M. Suchanek. Machine knowledge: Creation and curation of comprehensive knowledge bases. CoRR, abs/2009.11564, 2020. URL: https://arxiv.org/abs/2009.11564.
  207. Joseph Weizenbaum. Eliza—a computer program for the study of natural language communication between man and machine. Communications of the ACM, 1966. URL: https://doi.org/10.1145/365153.365168.
  208. Peter West, Chandra Bhagavatula, Jack Hessel, Jena D Hwang, Liwei Jiang, Ronan Le Bras, Ximing Lu, Sean Welleck, and Yejin Choi. Symbolic knowledge distillation: from general language models to commonsense models. arXiv preprint arXiv:2110.07178, 2021. URL: https://doi.org/10.48550/arXiv.2110.07178.
  209. Shijie Wu, Ozan Irsoy, Steven Lu, Vadim Dabravolski, Mark Dredze, Sebastian Gehrmann, Prabhanjan Kambadur, David S. Rosenberg, and Gideon Mann. Bloomberggpt: A large language model for finance. CoRR, abs/2303.17564, 2023. URL: https://doi.org/10.48550/ARXIV.2303.17564.
  210. Guohui Xiao, Diego Calvanese, Roman Kontchakov, Domenico Lembo, Antonella Poggi, Riccardo Rosati, and Michael Zakharyaschev. Ontology-based data access: A survey. In IJCAI, pages 5511-5519, 2018. URL: https://doi.org/10.24963/IJCAI.2018/777.
  211. Wenjie Xu, Ben Liu, Miao Peng, Xu Jia, and Min Peng. Pre-trained language model with prompts for temporal knowledge graph completion. In Findings of ACL 2023, 2023. URL: https://doi.org/10.18653/V1/2023.FINDINGS-ACL.493.
  212. Yan Xu, Mahdi Namazifar, Devamanyu Hazarika, Aishwarya Padmakumar, Yang Liu, and Dilek Hakkani-Tür. KILM: knowledge injection into encoder-decoder language models. In Anna Rogers, Jordan L. Boyd-Graber, and Naoaki Okazaki, editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023, pages 5013-5035. Association for Computational Linguistics, 2023. URL: https://doi.org/10.18653/V1/2023.ACL-LONG.275.
  213. Linyao Yang, Hongyang Chen, Zhao Li, Xiao Ding, and Xindong Wu. Chatgpt is not enough: Enhancing large language models with knowledge graphs for fact-aware language modeling. arXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2306.11489.
  214. Nan Yang, Tao Ge, Liang Wang, Binxing Jiao, Daxin Jiang, Linjun Yang, Rangan Majumder, and Furu Wei. Inference with reference: Lossless acceleration of large language models. In arXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2304.04487.
  215. Xi Yang, Aokun Chen, Nima PourNejatian, Hoo Chang Shin, Kaleb E. Smith, Christopher Parisien, Colin Compas, Cheryl Martin, Anthony B. Costa, Mona G. Flores, Ying Zhang, Tanja Magoc, Christopher A. Harle, Gloria Lipori, Duane A. Mitchell, William R. Hogan, Elizabeth A. Shenkman, Jiang Bian, and Yonghui Wu. A large language model for electronic health records. npj Digital Medicine, 5(1):194, 2022. URL: https://doi.org/10.1038/S41746-022-00742-2.
  216. Liang Yao, Chengsheng Mao, and Yuan Luo. Kg-bert: Bert for knowledge graph completion. arXiv, 2019. URL: https://doi.org/10.48550/arXiv.1909.03193.
  217. Michihiro Yasunaga, Hongyu Ren, Antoine Bosselut, Percy Liang, and Jure Leskovec. Qa-gnn: Reasoning with language models and knowledge graphs for question answering. NAACL, 2021. URL: https://doi.org/10.18653/V1/2021.NAACL-MAIN.45.
  218. Wen-tau Yih, Ming-Wei Chang, Xiaodong He, and Jianfeng Gao. Semantic parsing via staged query graph generation: Question answering with knowledge base. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, ACL 2015, July 26-31, 2015, Beijing, China, Volume 1: Long Papers, pages 1321-1331, 2015. URL: https://doi.org/10.3115/V1/P15-1128.
  219. Donghan Yu, Chenguang Zhu, Yuwei Fang, Wenhao Yu, Shuohang Wang, Yichong Xu, Xiang Ren, Yiming Yang, and Michael Zeng. Kg-fid: Infusing knowledge graph in fusion-in-decoder for open-domain question answering. In ACL, 2022. URL: https://doi.org/10.18653/V1/2022.ACL-LONG.340.
  220. Ran Yu, Ujwal Gadiraju, Besnik Fetahu, Oliver Lehmberg, Dominique Ritze, and Stefan Dietze. Knowmore - knowledge base augmentation with structured web markup. Semantic Web, 10(1):159-180, 2019. URL: https://doi.org/10.3233/SW-180304.
  221. Qingkai Zeng, Jinfeng Lin, Wenhao Yu, Jane Cleland-Huang, and Meng Jiang. Enhancing taxonomy completion with concept generation via fusing relational representations. In KDD, pages 2104-2113, 2021. URL: https://doi.org/10.1145/3447548.3467308.
  222. Hanwen Zha, Zhiyu Chen, and Xifeng Yan. Inductive relation prediction by BERT. In AAAI, volume 36, pages 5923-5931, 2022. URL: https://doi.org/10.1609/AAAI.V36I5.20537.
  223. Chaoning Zhang, Chenshuang Zhang, Sheng Zheng, Yu Qiao, Chenghao Li, Mengchun Zhang, Sumit Kumar Dam, Chu Myaet Thwal, Ye Lin Tun, Le Luang Huy, Donguk Kim, Sung-Ho Bae, Lik-Hang Lee, Yang Yang, Heng Tao Shen, In-So Kweon, and Choong-Seon Hong. A complete survey on generative AI (AIGC): Is ChatGPT from GPT-4 to GPT-5 all you need? arXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2303.11717.
  224. Meiru Zhang, Yixuan Su, Zaiqiao Meng, Zihao Fu, and Nigel Collier. COFFEE: a contrastive oracle-free framework for event extraction. arXiv, abs/2303.14452, 2023. URL: https://doi.org/10.48550/ARXIV.2303.14452.
  225. Rui Zhang, Bayu Distiawan Trisedya, Miao Li, Yong Jiang, and Jianzhong Qi. A benchmark and comprehensive survey on knowledge graph entity alignment via representation learning. VLDB J., 31(5):1143-1168, 2022. URL: https://doi.org/10.1007/S00778-022-00747-Z.
  226. Ruichuan Zhang and Nora El-Gohary. Transformer-based approach for automated context-aware IFC-regulation semantic information alignment. Automation in Construction, 145, 2023. URL: https://doi.org/10.1016/j.autcon.2022.104540.
  227. Zhiyuan Zhang, Xiaoqian Liu, Yi Zhang, Qi Su, Xu Sun, and Bin He. Pretrain-KGE: Learning knowledge representation from pretrained language models. In EMNLP Findings, 2020. URL: https://doi.org/10.18653/V1/2020.FINDINGS-EMNLP.25.
  228. Ziheng Zhang, Hualuo Liu, Jiaoyan Chen, Xi Chen, Bo Liu, Yuejia Xiang, and Yefeng Zheng. An industry evaluation of embedding-based entity alignment. In COLING, pages 179-189, 2020. URL: https://doi.org/10.18653/V1/2020.COLING-INDUSTRY.17.
  229. Ruochen Zhao, Xingxuan Li, Shafiq Joty, Chengwei Qin, and Lidong Bing. Verify-and-edit: A knowledge-enhanced chain-of-thought framework. In Anna Rogers, Jordan Boyd-Graber, and Naoaki Okazaki, editors, Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5823-5840, Toronto, Canada, July 2023. Association for Computational Linguistics. URL: https://doi.org/10.18653/V1/2023.ACL-LONG.320.
  230. Zexuan Zhong and Danqi Chen. A frustratingly easy approach for entity and relation extraction. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 50-61, Online, June 2021. Association for Computational Linguistics. URL: https://doi.org/10.18653/V1/2021.NAACL-MAIN.5.
  231. Zexuan Zhong, Dan Friedman, and Danqi Chen. Factual probing is [MASK]: Learning vs. learning to recall. In NAACL, 2021. URL: https://doi.org/10.18653/V1/2021.NAACL-MAIN.398.
  232. Wenxuan Zhou, Fangyu Liu, Ivan Vulic, Nigel Collier, and Muhao Chen. Prix-LM: Pretraining for multilingual knowledge base construction. In ACL, 2021. URL: https://doi.org/10.48550/arXiv.2110.08443.
  233. Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, and Jimmy Ba. Large language models are human-level prompt engineers. arXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2211.01910.
  234. Qi Zhu, Hao Wei, Bunyamin Sisman, Da Zheng, Christos Faloutsos, Xin Luna Dong, and Jiawei Han. Collective multi-type entity alignment between knowledge graphs. In WWW, pages 2241-2252, 2020. URL: https://doi.org/10.1145/3366423.3380289.
  235. Yuqi Zhu, Xiaohan Wang, Jing Chen, Shuofei Qiao, Yixin Ou, Yunzhi Yao, Shumin Deng, Huajun Chen, and Ningyu Zhang. LLMs for knowledge graph construction and reasoning: Recent capabilities and future opportunities. arXiv, 2023. URL: https://doi.org/10.48550/ARXIV.2305.13168.