Search Results

Documents authored by Rademaker, Alexandre


Document
On the Utility of Word Embeddings for Enriching OpenWordNet-PT

Authors: Hugo Gonçalo Oliveira, Fredson Silva de Souza Aguiar, and Alexandre Rademaker

Published in: OASIcs, Volume 93, 3rd Conference on Language, Data and Knowledge (LDK 2021)


Abstract
The maintenance of wordnets and lexical knowledge bases typically relies on time-consuming manual effort. In order to minimise this issue, we propose the exploitation of models of distributional semantics, namely word embeddings learned from corpora, in the automatic identification of relation instances missing in a wordnet. Analogy-solving methods are first used for learning a set of relations from analogy tests focused on each relation. Despite their low accuracy, we noted that a portion of the top-given answers are good suggestions of relation instances that could be included in the wordnet. This procedure is applied to the enrichment of OpenWordNet-PT, a public Portuguese wordnet. Relations are learned from data acquired from this resource, and illustrative examples are provided. Results are promising for accelerating the identification of missing relation instances, as we estimate that about 17% of the potential suggestions are good, a proportion that almost doubles if some are automatically invalidated.
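
To make the idea of analogy solving over word embeddings more concrete, here is a minimal sketch. The abstract does not say which analogy-solving methods the paper uses, so this example assumes the widely used 3CosAdd formulation (rank candidates by cosine similarity to b - a + c). The three-dimensional vectors, the word list, and the function names `cosine` and `suggest` are all hypothetical and invented for illustration; they do not come from OpenWordNet-PT or from any trained model.

```python
import math

# Toy vectors, invented for illustration only (an assumption, not real embeddings).
EMBEDDINGS = {
    "cão":    [1.0, 0.0, 0.0],  # dog
    "animal": [1.0, 0.0, 1.0],  # animal
    "rosa":   [0.0, 1.0, 0.0],  # rose
    "flor":   [0.0, 1.0, 1.0],  # flower
    "carro":  [1.0, 1.0, 0.0],  # car
    "mesa":   [1.0, 0.0, 0.2],  # table
}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def suggest(a, b, c, k=3, embeddings=EMBEDDINGS):
    """Answer 'a is to b as c is to ?' with 3CosAdd.

    Returns the k candidate words most similar to (b - a + c),
    excluding the three query words, as (word, score) pairs.
    """
    va, vb, vc = embeddings[a], embeddings[b], embeddings[c]
    target = [y - x + z for x, y, z in zip(va, vb, vc)]
    scored = [
        (word, cosine(target, vec))
        for word, vec in embeddings.items()
        if word not in (a, b, c)
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # The known pair (cão, animal) stands in for a hypernymy instance; the
    # ranked answers for "rosa" are candidate instances to review.
    for word, score in suggest("cão", "animal", "rosa"):
        print(f"{word}\t{score:.3f}")
```

In a workflow like the one the abstract outlines, the top-ranked answers would be treated as suggestions to validate before inclusion in the wordnet, not as relation instances to add automatically.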

Cite as

Hugo Gonçalo Oliveira, Fredson Silva de Souza Aguiar, and Alexandre Rademaker. On the Utility of Word Embeddings for Enriching OpenWordNet-PT. In 3rd Conference on Language, Data and Knowledge (LDK 2021). Open Access Series in Informatics (OASIcs), Volume 93, pp. 21:1-21:13, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2021)


@InProceedings{goncalooliveira_et_al:OASIcs.LDK.2021.21,
  author =	{Gon\c{c}alo Oliveira, Hugo and Aguiar, Fredson Silva de Souza and Rademaker, Alexandre},
  title =	{{On the Utility of Word Embeddings for Enriching OpenWordNet-PT}},
  booktitle =	{3rd Conference on Language, Data and Knowledge (LDK 2021)},
  pages =	{21:1--21:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-199-3},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{93},
  editor =	{Gromann, Dagmar and S\'{e}rasset, Gilles and Declerck, Thierry and McCrae, John P. and Gracia, Jorge and Bosque-Gil, Julia and Bobillo, Fernando and Heinisch, Barbara},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.LDK.2021.21},
  URN =		{urn:nbn:de:0030-drops-145578},
  doi =		{10.4230/OASIcs.LDK.2021.21},
  annote =	{Keywords: word embeddings, lexical resources, wordnet, analogy tests}
}