The dblp Knowledge Graph and SPARQL Endpoint

Authors Marcel R. Ackermann, Hannah Bast, Benedikt Maria Beckermann, Johannes Kalmbach, Patrick Neises, Stefan Ollinger




Author Details

Marcel R. Ackermann
  • Schloss Dagstuhl - Leibniz Center for Informatics, dblp computer science bibliography, Trier, Germany
Hannah Bast
  • University of Freiburg, Department of Computer Science, Freiburg, Germany
Benedikt Maria Beckermann
  • Schloss Dagstuhl - Leibniz Center for Informatics, dblp computer science bibliography, Trier, Germany
Johannes Kalmbach
  • University of Freiburg, Department of Computer Science, Freiburg, Germany
Patrick Neises
  • Schloss Dagstuhl - Leibniz Center for Informatics, dblp computer science bibliography, Trier, Germany
Stefan Ollinger
  • Schloss Dagstuhl - Leibniz Center for Informatics, dblp computer science bibliography, Trier, Germany

Acknowledgements

The dblp team would like to thank Silvio Peroni, Ralf Schenkel, and Tobias Zeimetz for the many fruitful discussions and practical help with the specification of the dblp RDF schema. We would also like to thank the many members of the dblp community who sent us their comments, thoughts, criticisms, and suggestions from working with the early versions of the dblp RDF data. Many thanks to Michael Wagner and Michael Didas from the Dagstuhl Publishing team for their support in creating a sustainable workflow for publishing and preserving persistent dblp RDF snapshot releases.

Cite As

Marcel R. Ackermann, Hannah Bast, Benedikt Maria Beckermann, Johannes Kalmbach, Patrick Neises, and Stefan Ollinger. The dblp Knowledge Graph and SPARQL Endpoint. In Special Issue on Resources for Graph Data and Knowledge. Transactions on Graph Data and Knowledge (TGDK), Volume 2, Issue 2, pp. 3:1-3:23, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024) https://doi.org/10.4230/TGDK.2.2.3

Abstract

For more than 30 years, the dblp computer science bibliography has provided quality-checked and curated bibliographic metadata on major computer science journals, proceedings, and monographs. Its semantic content has been published as RDF or similar graph data by third parties in the past, but most of these resources have now disappeared from the web or are no longer actively synchronized with the latest dblp data. In this article, we introduce the dblp Knowledge Graph (dblp KG), the first semantic representation of the dblp data that is designed and maintained by the dblp team. The dataset is augmented by citation data from the OpenCitations corpus. Open and FAIR access to the data is provided via daily updated RDF dumps, persistently archived monthly releases, a new public SPARQL endpoint with a powerful user interface, and a linked open data API. We also make it easy to self-host a replica of our SPARQL endpoint. We provide an introduction on how to work with the dblp KG and the added citation data using our SPARQL endpoint, with several example queries. Finally, we present the results of a small performance evaluation.
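As a rough illustration of what querying the SPARQL endpoint mentioned in the abstract might look like, the following minimal sketch builds a standard SPARQL-protocol GET request and flattens a SPARQL JSON result. The endpoint URL, the schema namespace, and the `dblp:title` predicate are assumptions not stated on this page and should be checked against the dblp documentation; the helper names `build_request`, `rows`, and `run` are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Assumption: public endpoint URL; not given on this page, verify before use.
ENDPOINT = "https://sparql.dblp.org/sparql"

# Assumption: schema namespace and dblp:title predicate; verify against the dblp RDF schema.
QUERY = """
PREFIX dblp: <https://dblp.org/rdf/schema#>
SELECT ?paper ?title WHERE {
  ?paper dblp:title ?title .
}
LIMIT 5
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL-protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def rows(result):
    """Flatten a SPARQL JSON result into plain {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in result["results"]["bindings"]
    ]


def run(query):
    """Send the query over the network; only runs when called explicitly."""
    with urllib.request.urlopen(build_request(query), timeout=30) as resp:
        return rows(json.load(resp))
```

Calling `run(QUERY)` would perform the actual network request. Separating `build_request` and `rows` from `run` keeps the request construction and result parsing checkable without contacting the server.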

Subject Classification

ACM Subject Classification
  • Information systems → Digital libraries and archives
  • Information systems → Graph-based database models
  • Computing methodologies → Knowledge representation and reasoning
Keywords
  • dblp
  • Scholarly Knowledge Graph
  • Resource
  • RDF
  • SPARQL

