Entity set expansion from the Web via ASP

Authors Weronika T. Adrian, Marco Manna, Nicola Leone, Giovanni Amendola, Marek Adrian

Cite As

Weronika T. Adrian, Marco Manna, Nicola Leone, Giovanni Amendola, and Marek Adrian. Entity set expansion from the Web via ASP. In Technical Communications of the 33rd International Conference on Logic Programming (ICLP 2017). Open Access Series in Informatics (OASIcs), Volume 58, pp. 1:1-1:5, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Knowledge on the Web is largely stored in various semantic resources that formalize, represent, and organize it differently. Combining information from several sources can improve the results of tasks such as recognizing similarities among objects. In this paper, we propose a logic-based method for the problem of entity set expansion (ESE), i.e., extending a list of named entities given a set of seeds. This problem has relevant applications in the Information Extraction domain, specifically in automatic lexicon generation for dictionary-based annotation tools. In contrast to typical approaches in natural language processing, which are based on co-occurrence statistics of words, we determine the common category of the seeds by analyzing the semantic relations of the objects the words represent. To do so, we integrate information from selected Web resources. We introduce the notion of an entity network, which uniformly represents the combined knowledge and allows reasoning over it. We show how to use the network to disambiguate word senses by relying on the concept of an optimal common ancestor, and how to discover similarities between two entities. Finally, we show how to expand a set of entities by using answer set programming with external predicates.
  • answer set programming
  • entity set expansion
  • information extraction
  • natural language processing
  • word sense disambiguation
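
The idea of finding a common category for the seeds and then collecting other members of that category can be illustrated with a minimal Python sketch. It is not the authors' method: the toy `IS_A` network, the function names, and the choice of "optimal" as "smallest total number of is-a steps from the seeds" are all assumptions made for illustration, and the sketch uses neither answer set programming nor external predicates. Seeds are assumed to be a non-empty list.

```python
from collections import deque

# Hypothetical toy entity network: each node maps to its direct "is-a" parents.
IS_A = {
    "Rome": ["capital city"],
    "Paris": ["capital city"],
    "Warsaw": ["capital city"],
    "Milan": ["city"],
    "capital city": ["city"],
    "city": ["place"],
    "Vistula": ["river"],
    "river": ["place"],
}


def ancestor_distances(entity, is_a=IS_A):
    """Breadth-first search upwards; returns {ancestor: fewest is-a steps}."""
    dist = {}
    queue = deque([(entity, 0)])
    while queue:
        node, d = queue.popleft()
        for parent in is_a.get(node, []):
            if parent not in dist:
                dist[parent] = d + 1
                queue.append((parent, d + 1))
    return dist


def closest_common_ancestor(seeds, is_a=IS_A):
    """Pick the shared ancestor with the smallest total distance to the seeds.

    Assumption: this stands in for an "optimal common ancestor"; the paper's
    own definition may differ. Ties are broken alphabetically.
    """
    per_seed = [ancestor_distances(s, is_a) for s in seeds]
    common = set(per_seed[0]).intersection(*per_seed[1:])
    if not common:
        return None
    return min(common, key=lambda a: (sum(d[a] for d in per_seed), a))


def expand(seeds, is_a=IS_A):
    """Return (category, new entities) for a non-empty list of seeds."""
    category = closest_common_ancestor(seeds, is_a)
    if category is None:
        return None, []
    # Treat nodes that are nobody's parent as entities; the rest are categories.
    categories = {p for parents in is_a.values() for p in parents}
    members = [
        e for e in is_a
        if e not in categories and category in ancestor_distances(e, is_a)
    ]
    return category, sorted(set(members) - set(seeds))


if __name__ == "__main__":
    print(expand(["Rome", "Paris"]))
    print(expand(["Rome", "Milan"]))
```

In this toy network, the seeds Rome and Paris share the category "capital city", so only Warsaw is added, while the seeds Rome and Milan only meet at "city", which yields a broader expansion. Choosing the closest shared ancestor keeps the expansion as specific as the seeds allow.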



