Search Results

Documents authored by Tateisi, Yuka


Document
Coreference Resolution in Biomedical Texts: a Machine Learning Approach

Authors: Jian Su, Xiaofeng Yang, Huaqing Hong, Yuka Tateisi, and Jun'ichi Tsujii

Published in: Dagstuhl Seminar Proceedings, Volume 8131, Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives (2008)


Abstract
Motivation: Coreference resolution, the process of identifying different mentions of an entity, is a very important component in a text-mining system. Compared with the work in news articles, the existing study of coreference resolution in biomedical texts is quite preliminary by only focusing on specific types of anaphors like pronouns or definite noun phrases, using heuristic methods, and running on small data sets. Therefore, there is a need for an in-depth exploration of this task in the biomedical domain. Results: In this article, we presented a learning-based approach to coreference resolution in the biomedical domain. We made three contributions in our study. Firstly, we annotated a large scale coreference corpus, MedCo, which consists of 1,999 medline abstracts in the GENIA data set. Secondly, we proposed a detailed framework for the coreference resolution task, in which we augmented the traditional learning model by incorporating non-anaphors into training. Lastly, we explored various sources of knowledge for coreference resolution, particularly, those that can deal with the complexity of biomedical texts. The evaluation on the MedCo corpus showed promising results. Our coreference resolution system achieved a high precision of 85.2% with a reasonable recall of 65.3%, obtaining an F-measure of 73.9%. The results also suggested that our augmented learning model significantly boosted precision (up to 24.0%) without much loss in recall (less than 5%), and brought a gain of over 8% in F-measure.

Cite as

Jian Su, Xiaofeng Yang, Huaqing Hong, Yuka Tateisi, and Jun'ichi Tsujii. Coreference Resolution in Biomedical Texts: a Machine Learning Approach. In Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives. Dagstuhl Seminar Proceedings, Volume 8131, p. 1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)


Copy BibTex To Clipboard

@InProceedings{su_et_al:DagSemProc.08131.4,
  author =	{Su, Jian and Yang, Xiaofeng and Hong, Huaqing and Tateisi, Yuka and Tsujii, Jun'ichi},
  title =	{{Coreference Resolution in Biomedical Texts: a Machine Learning Approach}},
  booktitle =	{Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives},
  pages =	{1--1},
  series =	{Dagstuhl Seminar Proceedings (DagSemProc)},
  ISSN =	{1862-4405},
  year =	{2008},
  volume =	{8131},
  editor =	{Michael Ashburner and Ulf Leser and Dietrich Rebholz-Schuhmann},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08131.4},
  URN =		{urn:nbn:de:0030-drops-15220},
  doi =		{10.4230/DagSemProc.08131.4},
  annote =	{Keywords: Coreference resolution, biomedical text}
}
Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail