Authors:
Michael Krauthammer and Thaibinh Luong
Published in: Dagstuhl Seminar Proceedings, Volume 8131, Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives (2008)
Abstract
We believe that gene name identification is a modular process involving term recognition, classification, and mapping. This work focuses on gene name mapping, and we assume that names have already been recognized and classified. We combine two methods to map recognized entities to their appropriate gene identifiers (Entrez GeneIDs): the Trigram Method and the Network Method. Both methods require preprocessing, using resources from Entrez Gene, to construct a set of method-specific matrices. We first address lexical variation by transforming gene names into their unique "trigrams" (groups of three alphanumeric characters) and matching those trigrams against the preprocessed gene dictionary. For ambiguous gene names, we additionally perform a contextual analysis of the abstract that contains the recognized entity. We have formalized our method as a sequence of matrix manipulations, allowing for a fast and coherent implementation of the algorithm. In this talk, we also show how gene name identification, and text mining in general, can play a critical role in translational medicine. We demonstrate how term identification is useful for establishing a biobibliometric distance between genes and psychiatric disorders.
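The trigram matching step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the toy dictionary, the identifiers `ID_A` to `ID_C`, the function names, and the scoring rule (count of shared trigrams, computed as a matrix-vector product) are all assumptions made for illustration. The abstract does not specify how matches are scored, and the Network Method is not shown.

```python
import re

def trigrams(name):
    """Return the set of unique 3-character windows over the
    lowercased alphanumeric characters of a name."""
    s = re.sub(r"[^a-z0-9]", "", name.lower())
    return {s[i:i + 3] for i in range(len(s) - 2)}

# Hypothetical toy dictionary (identifier -> known names).
# These are placeholder identifiers, not real Entrez GeneIDs.
GENE_DICT = {
    "ID_A": ["interleukin 2", "IL-2"],
    "ID_B": ["interleukin 12", "IL-12"],
    "ID_C": ["tumor protein p53", "TP53"],
}

def build_matrix(gene_dict):
    """Preprocess the dictionary into a binary gene-by-trigram matrix."""
    vocab = sorted({t for names in gene_dict.values()
                    for n in names for t in trigrams(n)})
    col = {t: j for j, t in enumerate(vocab)}
    ids = sorted(gene_dict)
    matrix = [[0] * len(vocab) for _ in ids]
    for i, gid in enumerate(ids):
        for n in gene_dict[gid]:
            for t in trigrams(n):
                matrix[i][col[t]] = 1
    return ids, col, matrix

def map_name(name, ids, col, matrix):
    """Score every gene as (matrix row) . (query vector) and return
    the identifiers with the highest non-zero score.
    Assumed scoring rule: number of shared trigrams."""
    query = [0] * len(col)
    for t in trigrams(name):
        if t in col:
            query[col[t]] = 1
    scores = [sum(a * b for a, b in zip(row, query)) for row in matrix]
    best = max(scores)
    return [gid for gid, s in zip(ids, scores) if best > 0 and s == best]

IDS, COL, MATRIX = build_matrix(GENE_DICT)
```

In this sketch, `map_name("Interleukin-2", IDS, COL, MATRIX)` resolves to a single identifier despite the spelling variation, while `map_name("interleukin", IDS, COL, MATRIX)` returns two tied candidates. That tie is the kind of ambiguity for which the abstract proposes an additional contextual analysis.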
Cite as
Michael Krauthammer and Thaibinh Luong. Term Mapping Using Matrix Operations. In Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives. Dagstuhl Seminar Proceedings, Volume 8131, p. 1, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2008)
BibTeX
@InProceedings{krauthammer_et_al:DagSemProc.08131.17,
author = {Krauthammer, Michael and Luong, Thaibinh},
title = {{Term Mapping Using Matrix Operations}},
booktitle = {Ontologies and Text Mining for Life Sciences : Current Status and Future Perspectives},
pages = {1--1},
series = {Dagstuhl Seminar Proceedings (DagSemProc)},
ISSN = {1862-4405},
year = {2008},
volume = {8131},
editor = {Michael Ashburner and Ulf Leser and Dietrich Rebholz-Schuhmann},
publisher = {Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/entities/document/10.4230/DagSemProc.08131.17},
URN = {urn:nbn:de:0030-drops-15126},
doi = {10.4230/DagSemProc.08131.17},
annote = {Keywords: Term Identification}
}