Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik GmbH
Rodrigues, Ricardo; Gonçalo Oliveira, Hugo; Gomes, Paulo
http://www.dagstuhl.de/oasics
When quoting this document, please refer to the following:
DOI: 10.4230/OASIcs.SLATE.2014.267
URN: urn:nbn:de:0030-drops-45753
URL: http://drops.dagstuhl.de/opus/volltexte/2014/4575

LemPORT: a High-Accuracy Cross-Platform Lemmatizer for Portuguese

Abstract

Although lemmatization is a common subtask in many natural language processing tasks, there is a lack of truly cross-platform lemmatization tools specifically targeted at Portuguese, particularly ones that can be integrated into projects developed in Java. To address this issue, we have developed a lemmatizer, initially just for our own use, which we have since decided to make publicly available. The lemmatizer, presented in this document, yields an overall accuracy of over 98% when compared against a manually revised corpus.

BibTeX - Entry

@InProceedings{rodrigues_et_al:OASIcs:2014:4575,
  author =	{Ricardo Rodrigues and Hugo Gon{\c{c}}alo Oliveira and Paulo Gomes},
  title =	{{LemPORT: a High-Accuracy Cross-Platform Lemmatizer for Portuguese}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{267--274},
  series =	{OpenAccess Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Maria Jo{\~a}o Varanda Pereira and Jos{\'e} Paulo Leal and Alberto Sim{\~o}es},
  publisher =	{Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{http://drops.dagstuhl.de/opus/volltexte/2014/4575},
  URN =		{urn:nbn:de:0030-drops-45753},
  doi =		{10.4230/OASIcs.SLATE.2014.267},
  annote =	{Keywords: lemmatization, normalization, rules, lexicon}
}

Keywords: lemmatization, normalization, rules, lexicon
Seminar: 3rd Symposium on Languages, Applications and Technologies
Issue date: 2014
Date of publication: 2014

