OASIcs.ICLP.2017.1.pdf
- Filesize: 345 kB
- 5 pages
Much of the knowledge on the Web is stored in various semantic resources that formalize, represent, and organize it differently. Combining information from several sources can improve the results of tasks such as recognizing similarities among objects. In this paper, we propose a logic-based method for the problem of entity set expansion (ESE), i.e., extending a list of named entities given a set of seeds. This problem has relevant applications in the Information Extraction domain, specifically in automatic lexicon generation for dictionary-based annotation tools. In contrast to typical approaches in natural language processing, which are based on co-occurrence statistics of words, we determine the common category of the seeds by analyzing the semantic relations of the objects the words represent. To do so, we integrate information from selected Web resources. We introduce the notion of an entity network, which uniformly represents the combined knowledge and allows reasoning over it. We show how to use the network to disambiguate word senses by relying on the concept of an optimal common ancestor, and how to discover similarities between two entities. Finally, we show how to expand a set of entities by using answer set programming with external predicates.
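To make the idea of expanding a seed set through a shared category concrete, the following is a minimal Python sketch and not the authors' method. The paper itself uses answer set programming with external predicates, and the abstract does not define "optimal common ancestor". The sketch assumes a toy is-a graph (`PARENTS`, entirely hypothetical data) and assumes that a good common ancestor is the one with the smallest total path length to the seeds. The function names `ancestor_distances`, `closest_common_ancestor`, and `expand` are illustrative only.

```python
from collections import deque

# Hypothetical toy entity network: each node maps to its direct categories.
PARENTS = {
    "Warsaw": ["capital city"],
    "Berlin": ["capital city"],
    "Prague": ["capital city"],
    "Munich": ["city"],
    "capital city": ["city"],
    "city": ["place"],
    "Vistula": ["river"],
    "river": ["place"],
    "place": [],
}


def ancestor_distances(entity, parents=PARENTS):
    """Breadth-first search upward; returns {ancestor: shortest distance}."""
    dist = {entity: 0}
    queue = deque([entity])
    while queue:
        node = queue.popleft()
        for parent in parents.get(node, []):
            if parent not in dist:
                dist[parent] = dist[node] + 1
                queue.append(parent)
    del dist[entity]  # an entity is not its own ancestor
    return dist


def closest_common_ancestor(seeds, parents=PARENTS):
    """Assumed criterion: the common ancestor minimizing total distance to the seeds."""
    if not seeds:
        return None
    per_seed = [ancestor_distances(s, parents) for s in seeds]
    common = set(per_seed[0]).intersection(*per_seed[1:])
    if not common:
        return None
    # Ties are broken alphabetically so the result is deterministic.
    return min(common, key=lambda a: (sum(d[a] for d in per_seed), a))


def expand(seeds, parents=PARENTS):
    """Return non-seed leaf entities that fall under the seeds' closest common ancestor."""
    category = closest_common_ancestor(seeds, parents)
    if category is None:
        return []
    # Nodes that appear as someone's parent are treated as categories, not entities.
    categories = {p for ps in parents.values() for p in ps}
    return sorted(
        e for e in parents
        if e not in seeds
        and e not in categories
        and category in ancestor_distances(e, parents)
    )


if __name__ == "__main__":
    print(closest_common_ancestor(["Warsaw", "Berlin"]))
    print(expand(["Warsaw", "Berlin"]))
```

In this toy graph, the seeds Warsaw and Berlin share the ancestors "capital city", "city", and "place"; the assumed criterion selects the nearest one, so the expansion stays within capital cities rather than widening to every place.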