Enriching Word Embeddings with Food Knowledge for Ingredient Retrieval

Authors: Álvaro Mendes Samagaio, Henrique Lopes Cardoso, David Ribeiro

Author Details

Álvaro Mendes Samagaio
  • Faculty of Engineering, University of Porto, Portugal
  • Fraunhofer Portugal, Porto, Portugal
Henrique Lopes Cardoso
  • Faculty of Engineering, University of Porto, Portugal
  • Artificial Intelligence and Computer Science Laboratory (LIACC), Porto, Portugal
David Ribeiro
  • Fraunhofer Portugal, Porto, Portugal

Cite As

Álvaro Mendes Samagaio, Henrique Lopes Cardoso, and David Ribeiro. Enriching Word Embeddings with Food Knowledge for Ingredient Retrieval. In 3rd Conference on Language, Data and Knowledge (LDK 2021). Open Access Series in Informatics (OASIcs), Volume 93, pp. 15:1-15:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Smart assistants and recommender systems must handle large amounts of information that comes from different sources and in different formats. This is especially true of text data, which is more variable and complex, and which is common input for conversational assistants and chatbots. The issue is particularly evident in the food and nutrition lexicon, whose semantics vary widely, notably because of hypernyms and hyponyms. This work describes the creation of a set of word embeddings that incorporate information from a food thesaurus, LanguaL, through retrofitting. The ingredients were classified according to three different facet label groups. Retrofitted embeddings seem to properly encode food-specific knowledge, as shown by an increase in accuracy compared to generic embeddings (+23%, +10% and +31% per group). Moreover, a weighting mechanism based on TF-IDF was applied to embedding creation before retrofitting, which also increased accuracy (+5%, +9% and +5% per group). Finally, the approach was tested with human users in an ingredient retrieval exercise, with very positive results: 77.3% of the volunteer testers preferred this method over a string-based matching algorithm.
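The two steps named in the abstract, TF-IDF weighting during embedding creation followed by retrofitting, can be illustrated with a minimal sketch. It is not the authors' implementation. The retrofitting update is the commonly used form from Faruqui et al. (2015) with alpha = 1 and beta = 1/degree. The function names, the IDF formula, the choice to treat each ingredient name as a document, and the idea of linking ingredients that share a facet label are all assumptions made for illustration.

```python
import math
from collections import Counter


def idf_weights(names):
    """Inverse document frequency per token (assumption: each name is a document)."""
    n = len(names)
    df = Counter(tok for name in names for tok in set(name.split()))
    return {tok: math.log(n / count) + 1.0 for tok, count in df.items()}


def weighted_embedding(name, token_vectors, idf):
    """TF-IDF-weighted average of the token vectors of a multi-word name."""
    dim = len(next(iter(token_vectors.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, count in Counter(name.split()).items():
        if tok not in token_vectors:
            continue  # skip out-of-vocabulary tokens
        w = count * idf.get(tok, 1.0)
        total = [t + w * x for t, x in zip(total, token_vectors[tok])]
        weight_sum += w
    # Return the zero vector if no token was known.
    return [t / weight_sum for t in total] if weight_sum else total


def retrofit(vectors, neighbours, iterations=10):
    """Pull each vector toward its graph neighbours while staying near the original.

    Update rule (alpha = 1, beta = 1/degree):
        q_i = (sum of neighbour vectors + degree * original_i) / (2 * degree)
    """
    new = {word: list(vec) for word, vec in vectors.items()}
    for _ in range(iterations):
        for word, original in vectors.items():
            linked = [n for n in neighbours.get(word, []) if n in new]
            if not linked:
                continue  # words without links keep their original vector
            deg = len(linked)
            new[word] = [
                (sum(new[n][d] for n in linked) + deg * original[d]) / (2 * deg)
                for d in range(len(original))
            ]
    return new


# Hypothetical toy data: two oils assumed to share a facet label, so they are linked.
names = ["olive oil", "sunflower oil", "olive"]
token_vectors = {"olive": [1.0, 0.0], "sunflower": [0.0, 1.0], "oil": [0.5, 0.5]}
idf = idf_weights(names)
ingredient_vectors = {n: weighted_embedding(n, token_vectors, idf) for n in names}
links = {"olive oil": ["sunflower oil"], "sunflower oil": ["olive oil"]}
retrofitted = retrofit(ingredient_vectors, links)
```

In this sketch, rarer tokens such as "sunflower" contribute more to a name's vector than frequent ones such as "oil", and retrofitting then draws linked ingredients closer together while unlinked ones keep their original vectors.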

Subject Classification

ACM Subject Classification
  • Computing methodologies → Artificial intelligence
  • Computing methodologies → Knowledge representation and reasoning
  • Computing methodologies → Lexical semantics
Keywords
  • Word embeddings
  • Retrofitting
  • LanguaL
  • Food Embeddings
  • Knowledge Graph


