LemPORT: a High-Accuracy Cross-Platform Lemmatizer for Portuguese

Authors: Ricardo Rodrigues, Hugo Gonçalo Oliveira, Paulo Gomes

Cite As

Ricardo Rodrigues, Hugo Gonçalo Oliveira, and Paulo Gomes. LemPORT: a High-Accuracy Cross-Platform Lemmatizer for Portuguese. In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 267-274, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)


Abstract

Although lemmatization is a common subtask in many natural language processing tasks, there is a lack of truly cross-platform lemmatization tools specifically targeted at Portuguese, particularly for integration in projects developed in Java. To address this issue, we developed a lemmatizer, initially just for our own use, which we have since decided to make publicly available. The lemmatizer, presented in this document, achieves an overall accuracy of over 98% when compared against a manually revised corpus.
Keywords

  • lemmatization
  • normalization
  • rules
  • lexicon

