License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/OASIcs.LDK.2021.15
URN: urn:nbn:de:0030-drops-145510
URL: https://drops.dagstuhl.de/opus/volltexte/2021/14551/


Samagaio, Álvaro Mendes ; Lopes Cardoso, Henrique ; Ribeiro, David

Enriching Word Embeddings with Food Knowledge for Ingredient Retrieval



Abstract

Smart assistants and recommender systems must handle large amounts of information that comes from different sources and in different formats. This is most common with text data, which is highly variable and complex, and is typical of conversational assistants and chatbots. The problem is particularly evident in the food and nutrition lexicon, where semantics vary widely, notably because of hypernyms and hyponyms. This work describes the creation of a set of word embeddings that incorporate information from a food thesaurus, LanguaL, through retrofitting. The ingredients were classified according to three different facet label groups. Retrofitted embeddings seem to properly encode food-specific knowledge, as shown by an increase in accuracy compared to generic embeddings (+23%, +10% and +31% per group). Moreover, a weighting mechanism based on TF-IDF was applied to embedding creation before retrofitting, also increasing accuracy (+5%, +9% and +5% per group). Finally, the approach was tested with human users in an ingredient retrieval exercise and was evaluated very positively (77.3% of the volunteer testers preferred this method over a string-based matching algorithm).
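
To make the retrofitting step concrete, the following is a minimal sketch of the commonly used iterative update, in which each vector is repeatedly averaged with the vectors of its thesaurus neighbours while staying anchored to its original value. It is an illustration only, not the authors' implementation: the function name `retrofit`, the parameters `alpha` and `iterations`, the toy vectors and the toy neighbour graph are all assumptions, and no LanguaL data is used.

```python
import numpy as np

def retrofit(vectors, neighbours, iterations=10, alpha=1.0):
    """Pull each vector towards its linked neighbours (illustrative sketch).

    vectors:    dict word -> 1-D numpy array (pre-trained embeddings)
    neighbours: dict word -> list of related words (e.g. thesaurus links)
    alpha:      weight keeping a word close to its original vector
    """
    original = {w: np.asarray(v, dtype=float).copy() for w, v in vectors.items()}
    fitted = {w: v.copy() for w, v in original.items()}
    for _ in range(iterations):
        for word, linked in neighbours.items():
            linked = [n for n in linked if n in fitted]
            if word not in fitted or not linked:
                continue  # words without usable links keep their vector
            # Neighbour weights are 1/degree, so they sum to 1.
            neighbour_mean = sum(fitted[n] for n in linked) / len(linked)
            fitted[word] = (alpha * original[word] + neighbour_mean) / (alpha + 1.0)
    return fitted

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy data (hypothetical): two fruits share the hypernym "fruit".
vectors = {
    "apple": np.array([1.0, 0.0]),
    "pear": np.array([0.8, 0.2]),
    "fruit": np.array([0.5, 0.5]),
    "salmon": np.array([0.0, 1.0]),
}
neighbours = {"apple": ["fruit"], "pear": ["fruit"], "fruit": ["apple", "pear"]}

fitted = retrofit(vectors, neighbours)
before = cosine(vectors["apple"], vectors["pear"])
after = cosine(fitted["apple"], fitted["pear"])
print(f"apple/pear cosine before: {before:.3f}, after: {after:.3f}")
```

In this toy graph the two hyponyms move towards their shared hypernym and therefore towards each other, while a word with no links ("salmon") is left unchanged.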

BibTeX - Entry

@InProceedings{samagaio_et_al:OASIcs.LDK.2021.15,
  author =	{Samagaio, \'{A}lvaro Mendes and Lopes Cardoso, Henrique and Ribeiro, David},
  title =	{{Enriching Word Embeddings with Food Knowledge for Ingredient Retrieval}},
  booktitle =	{3rd Conference on Language, Data and Knowledge (LDK 2021)},
  pages =	{15:1--15:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-199-3},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{93},
  editor =	{Gromann, Dagmar and S\'{e}rasset, Gilles and Declerck, Thierry and McCrae, John P. and Gracia, Jorge and Bosque-Gil, Julia and Bobillo, Fernando and Heinisch, Barbara},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/opus/volltexte/2021/14551},
  URN =		{urn:nbn:de:0030-drops-145510},
  doi =		{10.4230/OASIcs.LDK.2021.15},
  annote =	{Keywords: Word embeddings, Retrofitting, LanguaL, Food Embeddings, Knowledge Graph}
}

Keywords: Word embeddings, Retrofitting, LanguaL, Food Embeddings, Knowledge Graph
Collection: 3rd Conference on Language, Data and Knowledge (LDK 2021)
Issue Date: 2021
Date of publication: 30.08.2021

