Interlinking SciGraph and DBpedia Datasets Using Link Discovery and Named Entity Recognition Techniques

Authors: Beyza Yaman, Michele Pasin, Markus Freudenberg

Author Details

Beyza Yaman
  • Institute of Applied Informatics, Leipzig, Germany
Michele Pasin
  • Springer Nature, London, UK
Markus Freudenberg
  • Leipzig University, Leipzig, Germany

Cite As

Beyza Yaman, Michele Pasin, and Markus Freudenberg. Interlinking SciGraph and DBpedia Datasets Using Link Discovery and Named Entity Recognition Techniques. In 2nd Conference on Language, Data and Knowledge (LDK 2019). Open Access Series in Informatics (OASIcs), Volume 70, pp. 15:1-15:8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


In recent years, a growing number of Linked Open Data (LOD) compliant datasets have become available on the web, giving data consumers more opportunities to build smarter applications that integrate data from disparate sources. However, this integration is often hard to achieve because it requires discovering and expressing associations across heterogeneous datasets. The goal of this work is to increase the discoverability and reusability of scholarly data by integrating it with highly interlinked datasets in the LOD cloud. To do so, we applied techniques that a) improve identity resolution between Springer Nature (SN) SciGraph and DBpedia using Link Discovery on the structured data (i.e., by annotating SN SciGraph entities with links to DBpedia entities), and b) enrich SN SciGraph unstructured text content (document abstracts) with links to DBpedia entities using Named Entity Recognition (NER). We published the results of this work using standard vocabularies and provided an interactive exploration tool that presents the discovered links with respect to the breadth and depth of the DBpedia classes.
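
As a rough illustration of the Link Discovery step in (a), the sketch below matches entities from two sources by label similarity and emits owl:sameAs links as N-Triples. It is not the authors' pipeline: the entity URIs and labels are made-up toy data, the function names are hypothetical, and the string measure (Python's standard `difflib`) and the 0.9 threshold are assumptions chosen only to keep the example self-contained.

```python
from difflib import SequenceMatcher

# Hypothetical toy data: these URIs and labels are illustrative only
# and are not taken from the paper or from either dataset.
scigraph_entities = {
    "http://example.org/scigraph/conference/iswc": "International Semantic Web Conference",
    "http://example.org/scigraph/journal/jods": "Journal on Data Semantics",
}
dbpedia_entities = {
    "http://dbpedia.org/resource/International_Semantic_Web_Conference": "International Semantic Web Conference",
    "http://dbpedia.org/resource/Leipzig_University": "Leipzig University",
}

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def label_similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] between two labels."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def discover_links(source, target, threshold=0.9):
    """For each source entity, keep its best-scoring target if it reaches the threshold."""
    links = []
    for s_uri, s_label in source.items():
        best_uri, best_score = None, 0.0
        for t_uri, t_label in target.items():
            score = label_similarity(s_label, t_label)
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((s_uri, best_uri, best_score))
    return links


def to_ntriples(links):
    """Serialize accepted links as owl:sameAs statements in N-Triples syntax."""
    return [f"<{s}> <{OWL_SAME_AS}> <{t}> ." for s, t, _ in links]


links = discover_links(scigraph_entities, dbpedia_entities)
for line in to_ntriples(links):
    print(line)
```

The nested loop compares every source entity with every target entity, which is only workable for toy inputs; dedicated link discovery frameworks exist largely to avoid that quadratic comparison on large datasets.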

Subject Classification

ACM Subject Classification
  • Information systems → Semantic web description languages
  • Computing methodologies → Natural language processing
  • Information systems → Entity resolution
Keywords
  • Linked Data
  • Named Entity Recognition
  • Link Discovery
  • Interlinking

