Multilingual Knowledge Graphs and Low-Resource Languages: A Review

Kaffee, Lucie-Aimée; Biswas, Russa; Keet, C. Maria; Vakaj, Edlira Kalemi; de Melo, Gerard

doi:10.4230/TGDK.1.1.10

Abstract

Multilingual data to support applications is lacking for a large number of languages, especially low-resource languages. Knowledge graphs (KGs) could help close this gap in language support by providing easily accessible, machine-readable, multilingual linked data that can be reused across applications. In this paper, we provide an overview of work in the domain of multilingual KGs with a focus on low-resource languages. We review the current state of multilingual KGs along with the different aspects that are crucial for creating KGs with language coverage in mind. Special consideration is given to challenges particular to low-resource languages in KGs. We further provide an overview of applications that yield multilingual KG information as well as downstream applications that reuse such multilingual data. Finally, we explore open problems regarding multilingual KGs with a focus on low-resource languages.

