Graph-of-Entity: A Model for Combined Data Representation and Retrieval

Authors José Devezas, Carla Lopes, Sérgio Nunes

Author Details

José Devezas
  • INESC TEC, Porto, Portugal
  • Faculty of Engineering, University of Porto, Portugal
Carla Lopes
  • INESC TEC, Porto, Portugal
  • Faculty of Engineering, University of Porto, Portugal
Sérgio Nunes
  • INESC TEC, Porto, Portugal
  • Faculty of Engineering, University of Porto, Portugal

José Devezas, Carla Lopes, and Sérgio Nunes. Graph-of-Entity: A Model for Combined Data Representation and Retrieval. In 8th Symposium on Languages, Applications and Technologies (SLATE 2019). Open Access Series in Informatics (OASIcs), Volume 74, pp. 1:1-1:14, Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2019)


Managing large volumes of digital documents, along with the information they contain or are associated with, can be challenging. As systems become more intelligent, it increasingly makes sense to power retrieval through all available data, where every lead makes it easier to reach relevant documents or entities. Modern search is heavily powered by structured knowledge, but users still query using keywords or, at best, telegraphic natural language. As search becomes increasingly dependent on the integration of text and knowledge, novel approaches for a unified representation of combined data present the opportunity to unlock new ranking strategies. We tackle entity-oriented search using graph-based approaches for representation and retrieval. In particular, we propose the graph-of-entity, a novel approach for indexing combined data, where terms, entities and their relations are jointly represented. We compare the graph-of-entity with the graph-of-word, a text-only model, and find that, overall, it does not yet achieve better performance, despite obtaining higher precision. Our assessment was based on a small subset of the INEX 2009 Wikipedia Collection, created from a sample of 10 topics and their respective judged documents. The offline evaluation presented here complements our participation in the TREC 2017 OpenSearch track, where we assessed the graph-of-entity in an online setting through team-draft interleaving.

Subject Classification

ACM Subject Classification
  • Information systems → Document representation
  • Information systems → Retrieval models and ranking
  • Mathematics of computing → Graph theory
Keywords
  • Entity-oriented search
  • graph-based models
  • collection-based graph


