Documents authored by Buitelaar, Paul


Document
Automatic Construction of Knowledge Graphs from Text and Structured Data: A Preliminary Literature Review

Authors: Maraim Masoud, Bianca Pereira, John McCrae, and Paul Buitelaar

Published in: OASIcs, Volume 93, 3rd Conference on Language, Data and Knowledge (LDK 2021)


Abstract
Knowledge graphs have been shown to be an important data structure for many applications, including chatbot development, data integration, and semantic search. In the enterprise domain, such graphs need to be constructed based on both structured (e.g. databases) and unstructured (e.g. textual) internal data sources, preferably using automatic approaches due to the costs associated with manual construction of knowledge graphs. However, despite the growing body of research that leverages both structured and textual data sources in the context of automatic knowledge graph construction, the research community has centered on either one type of source or the other. In this paper, we conduct a preliminary literature review to investigate approaches that can be used for the integration of textual and structured data sources in the process of automatic knowledge graph construction. We highlight the solutions currently available for use within enterprises and point out areas that would benefit from further research.

Cite as

Maraim Masoud, Bianca Pereira, John McCrae, and Paul Buitelaar. Automatic Construction of Knowledge Graphs from Text and Structured Data: A Preliminary Literature Review. In 3rd Conference on Language, Data and Knowledge (LDK 2021). Open Access Series in Informatics (OASIcs), Volume 93, pp. 19:1-19:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{masoud_et_al:OASIcs.LDK.2021.19,
  author =	{Masoud, Maraim and Pereira, Bianca and McCrae, John and Buitelaar, Paul},
  title =	{{Automatic Construction of Knowledge Graphs from Text and Structured Data: A Preliminary Literature Review}},
  booktitle =	{3rd Conference on Language, Data and Knowledge (LDK 2021)},
  pages =	{19:1--19:9},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-199-3},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{93},
  editor =	{Gromann, Dagmar and S\'{e}rasset, Gilles and Declerck, Thierry and McCrae, John P. and Gracia, Jorge and Bosque-Gil, Julia and Bobillo, Fernando and Heinisch, Barbara},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.LDK.2021.19},
  URN =		{urn:nbn:de:0030-drops-145556},
  doi =		{10.4230/OASIcs.LDK.2021.19},
  annote =	{Keywords: Knowledge Graph Construction, Enterprise Knowledge Graph}
}
Document
Complete Volume
OASIcs, Volume 70, LDK'19, Complete Volume

Authors: Maria Eskevich, Gerard de Melo, Christian Fäth, John P. McCrae, Paul Buitelaar, Christian Chiarcos, Bettina Klimek, and Milan Dojchinovski

Published in: OASIcs, Volume 70, 2nd Conference on Language, Data and Knowledge (LDK 2019)


Cite as

2nd Conference on Language, Data and Knowledge (LDK 2019). Open Access Series in Informatics (OASIcs), Volume 70, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@Proceedings{eskevich_et_al:OASIcs.LDK.2019,
  title =	{{OASIcs, Volume 70, LDK'19, Complete Volume}},
  booktitle =	{2nd Conference on Language, Data and Knowledge (LDK 2019)},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-105-4},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{70},
  editor =	{Eskevich, Maria and de Melo, Gerard and F\"{a}th, Christian and McCrae, John P. and Buitelaar, Paul and Chiarcos, Christian and Klimek, Bettina and Dojchinovski, Milan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.LDK.2019},
  URN =		{urn:nbn:de:0030-drops-105045},
  doi =		{10.4230/OASIcs.LDK.2019},
  annote =	{Keywords: Computing methodologies, Natural language processing, Knowledge representation and reasoning}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Maria Eskevich, Gerard de Melo, Christian Fäth, John P. McCrae, Paul Buitelaar, Christian Chiarcos, Bettina Klimek, and Milan Dojchinovski

Published in: OASIcs, Volume 70, 2nd Conference on Language, Data and Knowledge (LDK 2019)


Cite as

2nd Conference on Language, Data and Knowledge (LDK 2019). Open Access Series in Informatics (OASIcs), Volume 70, pp. 0:i-0:xvi, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@InProceedings{eskevich_et_al:OASIcs.LDK.2019.0,
  author =	{Eskevich, Maria and de Melo, Gerard and F\"{a}th, Christian and McCrae, John P. and Buitelaar, Paul and Chiarcos, Christian and Klimek, Bettina and Dojchinovski, Milan},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{2nd Conference on Language, Data and Knowledge (LDK 2019)},
  pages =	{0:i--0:xvi},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-105-4},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{70},
  editor =	{Eskevich, Maria and de Melo, Gerard and F\"{a}th, Christian and McCrae, John P. and Buitelaar, Paul and Chiarcos, Christian and Klimek, Bettina and Dojchinovski, Milan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.LDK.2019.0},
  URN =		{urn:nbn:de:0030-drops-103641},
  doi =		{10.4230/OASIcs.LDK.2019.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
Document
Crowd-Sourcing A High-Quality Dataset for Metaphor Identification in Tweets

Authors: Omnia Zayed, John P. McCrae, and Paul Buitelaar

Published in: OASIcs, Volume 70, 2nd Conference on Language, Data and Knowledge (LDK 2019)


Abstract
Metaphor is one of the most important elements of human communication, especially in informal settings such as social media. A number of datasets have been created for metaphor identification; however, this task has proven difficult due to the nebulous nature of metaphoricity. In this paper, we present a crowd-sourcing approach for the creation of a dataset for metaphor identification that is able to rapidly achieve large coverage over the different usages of metaphor in a given corpus while maintaining high accuracy. We validate this methodology by creating a set of 2,500 manually annotated tweets in English, for which we achieve inter-annotator agreement scores over 0.8, which is higher than other reported results that did not limit the task. This methodology is based on the use of an existing classifier for metaphor in order to assist in the identification and the selection of the examples for annotation, in a way that reduces the cognitive load for annotators and enables quick and accurate annotation. We selected a corpus of both general language tweets and political tweets relating to Brexit, and we compare the resulting corpus on these two domains. As a result of this work, we have published the first dataset of tweets annotated for metaphors, which we believe will be invaluable for the development, training and evaluation of approaches for metaphor identification in tweets.

Cite as

Omnia Zayed, John P. McCrae, and Paul Buitelaar. Crowd-Sourcing A High-Quality Dataset for Metaphor Identification in Tweets. In 2nd Conference on Language, Data and Knowledge (LDK 2019). Open Access Series in Informatics (OASIcs), Volume 70, pp. 10:1-10:17, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@InProceedings{zayed_et_al:OASIcs.LDK.2019.10,
  author =	{Zayed, Omnia and McCrae, John P. and Buitelaar, Paul},
  title =	{{Crowd-Sourcing A High-Quality Dataset for Metaphor Identification in Tweets}},
  booktitle =	{2nd Conference on Language, Data and Knowledge (LDK 2019)},
  pages =	{10:1--10:17},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-105-4},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{70},
  editor =	{Eskevich, Maria and de Melo, Gerard and F\"{a}th, Christian and McCrae, John P. and Buitelaar, Paul and Chiarcos, Christian and Klimek, Bettina and Dojchinovski, Milan},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.LDK.2019.10},
  URN =		{urn:nbn:de:0030-drops-103740},
  doi =		{10.4230/OASIcs.LDK.2019.10},
  annote =	{Keywords: metaphor, identification, tweets, dataset, annotation, crowd-sourcing}
}
Document
The Multilingual Semantic Web (Dagstuhl Seminar 12362)

Authors: Paul Buitelaar, Key-Sun Choi, Philipp Cimiano, and Eduard H. Hovy

Published in: Dagstuhl Reports, Volume 2, Issue 9 (2013)


Abstract
This document constitutes a brief report from the Dagstuhl Seminar on the "Multilingual Semantic Web" which took place at Schloss Dagstuhl between September 3rd and 7th, 2012. The document states the motivation for the workshop as well as the main thematic focus. It describes the organization and structure of the seminar and briefly reports on the main topics of discussion and the main outcomes of the workshop.

Cite as

Paul Buitelaar, Key-Sun Choi, Philipp Cimiano, and Eduard H. Hovy. The Multilingual Semantic Web (Dagstuhl Seminar 12362). In Dagstuhl Reports, Volume 2, Issue 9, pp. 15-94, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2013)


@Article{buitelaar_et_al:DagRep.2.9.15,
  author =	{Buitelaar, Paul and Choi, Key-Sun and Cimiano, Philipp and Hovy, Eduard H.},
  title =	{{The Multilingual Semantic Web (Dagstuhl Seminar 12362)}},
  pages =	{15--94},
  journal =	{Dagstuhl Reports},
  ISSN =	{2192-5283},
  year =	{2013},
  volume =	{2},
  number =	{9},
  editor =	{Buitelaar, Paul and Choi, Key-Sun and Cimiano, Philipp and Hovy, Eduard H.},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagRep.2.9.15},
  URN =		{urn:nbn:de:0030-drops-37883},
  doi =		{10.4230/DagRep.2.9.15},
  annote =	{Keywords: Semantic Web, Multilinguality, Natural Language Processing}
}
Document
Ontologies & Text Mining (for Life Sciences)

Authors: Paul Buitelaar

Published in: Dagstuhl Seminar Proceedings, Volume 8131, Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives (2008)


Abstract
The talk will address several issues in the application and development of ontologies: the selection of appropriate ontologies for a task; the population of a selected ontology through information extraction from text; the semi-automatic development or extension of an ontology; the lexicalisation of ontologies for the purpose of ontology-based information extraction from text. Each of these issues will be addressed through a particular application: the OntoSelect ontology library and search engine (http://olp.dfki.de/ontoselect/); the OntoLT Protege PlugIn for ontology learning from text (http://olp.dfki.de/OntoLT/OntoLT.htm); the SOBA system for ontology-based information extraction from text; the LingInfo lexicon model for the integration of lexical/linguistic information in ontologies (http://olp.dfki.de/LingInfo/).

Cite as

Paul Buitelaar. Ontologies & Text Mining (for Life Sciences). In Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives. Dagstuhl Seminar Proceedings, Volume 8131, p. 1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)


@InProceedings{buitelaar:DagSemProc.08131.11,
  author =	{Buitelaar, Paul},
  title =	{{Ontologies \& Text Mining (for Life Sciences)}},
  booktitle =	{Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives},
  pages =	{1--1},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{8131},
  editor =	{Ashburner, Michael and Leser, Ulf and Rebholz-Schuhmann, Dietrich},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08131.11},
  URN =		{urn:nbn:de:0030-drops-15095},
  doi =		{10.4230/DagSemProc.08131.11},
  annote =	{Keywords: Ontology Search; Ontology Population; Ontology Learning; Lexical Enrichment of Ontologies}
}