Documents authored by Gomes, Paulo


Document
NLPPort: A Pipeline for Portuguese NLP (Short Paper)

Authors: Ricardo Rodrigues, Hugo Gonçalo Oliveira, and Paulo Gomes

Published in: OASIcs, Volume 62, 7th Symposium on Languages, Applications and Technologies (SLATE 2018)


Abstract
Although there are tools for some of the most common natural language processing tasks in Portuguese, there is a lack of available cross-platform tools specifically targeted for Portuguese, from end to end, namely for integration in projects developed in Java. To address this issue, we have developed and tweaked, over the last half-dozen years, NLPPort, a set of tools that can be used in a pipelined fashion, which we have made publicly available. In this paper, we present the major features of this set of tools.

Cite as

Ricardo Rodrigues, Hugo Gonçalo Oliveira, and Paulo Gomes. NLPPort: A Pipeline for Portuguese NLP (Short Paper). In 7th Symposium on Languages, Applications and Technologies (SLATE 2018). Open Access Series in Informatics (OASIcs), Volume 62, pp. 18:1-18:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


@InProceedings{rodrigues_et_al:OASIcs.SLATE.2018.18,
  author =	{Rodrigues, Ricardo and Gon\c{c}alo Oliveira, Hugo and Gomes, Paulo},
  title =	{{NLPPort: A Pipeline for Portuguese NLP}},
  booktitle =	{7th Symposium on Languages, Applications and Technologies (SLATE 2018)},
  pages =	{18:1--18:9},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-072-9},
  ISSN =	{2190-6807},
  year =	{2018},
  volume =	{62},
  editor =	{Henriques, Pedro Rangel and Leal, Jos\'{e} Paulo and Leit\~{a}o, Ant\'{o}nio Menezes and Guinovart, Xavier G\'{o}mez},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2018.18},
  URN =		{urn:nbn:de:0030-drops-92768},
  doi =		{10.4230/OASIcs.SLATE.2018.18},
  annote =	{Keywords: natural language processing, tools, Portuguese}
}
Document
Assigning Polarity Automatically to the Synsets of a Wordnet-like Resource

Authors: Hugo Gonçalo Oliveira, António Paulo Santos, and Paulo Gomes

Published in: OASIcs, Volume 38, 3rd Symposium on Languages, Applications and Technologies (2014)


Abstract
This article describes work towards the automatic creation of a conceptual polarity lexicon for Portuguese. For this purpose, we take advantage of a polarity lexicon based on single lemmas to assign polarities to the synsets of a wordnet-like resource. We assume that each synset has the polarity of the majority of its lemmas, given by the initial lexicon. After that, polarity is propagated to other synsets, through different types of semantic relations. The relation types used were selected after manual evaluation. The main result of this work is a lexicon with more than 10,000 synsets with an assigned polarity, with accuracy of 70% or 79%, depending on the human evaluator. For Portuguese, this is the first synset-based polarity lexicon we are aware of. In addition to this contribution, the presented approach can be applied to create similar resources for other languages.
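
The two steps named in the abstract (majority vote over a synset's lemmas, then propagation along selected relation types) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy lexicon, the relation name `similar_to`, the tie handling, and the choice to copy polarity unchanged along a relation are all assumptions made for illustration.

```python
from collections import Counter


def synset_polarity(lemmas, lemma_polarity):
    """Return the majority polarity of the lemmas found in the lemma lexicon.

    Returns None when no lemma is in the lexicon or when the vote is tied
    (the tie rule is an assumption; the abstract does not specify one).
    """
    votes = Counter(lemma_polarity[l] for l in lemmas if l in lemma_polarity)
    if not votes:
        return None
    ranked = votes.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None
    return ranked[0][0]


def propagate(polarities, relations, allowed_types):
    """One propagation pass over (source, relation_type, target) triples.

    An unassigned target takes the polarity of an assigned source when the
    relation type is allowed. Copying the polarity unchanged is an assumption.
    """
    result = dict(polarities)
    for source, rel_type, target in relations:
        if (rel_type in allowed_types
                and polarities.get(source) is not None
                and result.get(target) is None):
            result[target] = polarities[source]
    return result


# Hypothetical toy data: 1 = positive, -1 = negative.
lemma_polarity = {"bom": 1, "agradável": 1, "mau": -1, "ruim": -1}
synsets = {
    "s1": ["bom", "agradável"],
    "s2": ["mau", "ruim", "bom"],
    "s3": ["simpático"],
}
relations = [("s1", "similar_to", "s3")]

initial = {sid: synset_polarity(lemmas, lemma_polarity)
           for sid, lemmas in synsets.items()}
final = propagate(initial, relations, allowed_types={"similar_to"})
```

In this toy run, `s3` has no lemma in the lexicon, so it only receives a polarity in the propagation step, through its relation to `s1`.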

Cite as

Hugo Gonçalo Oliveira, António Paulo Santos, and Paulo Gomes. Assigning Polarity Automatically to the Synsets of a Wordnet-like Resource. In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 169-184, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)


@InProceedings{goncalooliveira_et_al:OASIcs.SLATE.2014.169,
  author =	{Gon\c{c}alo Oliveira, Hugo and Santos, Ant\'{o}nio Paulo and Gomes, Paulo},
  title =	{{Assigning Polarity Automatically to the Synsets of a Wordnet-like Resource}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{169--184},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Pereira, Maria Jo\~{a}o Varanda and Leal, Jos\'{e} Paulo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2014.169},
  URN =		{urn:nbn:de:0030-drops-45689},
  doi =		{10.4230/OASIcs.SLATE.2014.169},
  annote =	{Keywords: sentiment analysis, polarity, lexicon, wordnet, Portuguese}
}
Document
LemPORT: a High-Accuracy Cross-Platform Lemmatizer for Portuguese

Authors: Ricardo Rodrigues, Hugo Gonçalo Oliveira, and Paulo Gomes

Published in: OASIcs, Volume 38, 3rd Symposium on Languages, Applications and Technologies (2014)


Abstract
Although lemmatization is a very common subtask in many natural language processing tasks, there is a lack of available true cross-platform lemmatization tools specifically targeted for Portuguese, namely for integration in projects developed in Java. To address this issue, we have developed a lemmatizer, initially just for our own use, but which we have decided to make publicly available. The lemmatizer, presented in this document, yields an overall accuracy over 98% when compared against a manually revised corpus.

Cite as

Ricardo Rodrigues, Hugo Gonçalo Oliveira, and Paulo Gomes. LemPORT: a High-Accuracy Cross-Platform Lemmatizer for Portuguese. In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 267-274, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)


@InProceedings{rodrigues_et_al:OASIcs.SLATE.2014.267,
  author =	{Rodrigues, Ricardo and Gon\c{c}alo Oliveira, Hugo and Gomes, Paulo},
  title =	{{LemPORT: a High-Accuracy Cross-Platform Lemmatizer for Portuguese}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{267--274},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Pereira, Maria Jo\~{a}o Varanda and Leal, Jos\'{e} Paulo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2014.267},
  URN =		{urn:nbn:de:0030-drops-45753},
  doi =		{10.4230/OASIcs.SLATE.2014.267},
  annote =	{Keywords: lemmatization, normalization, rules, lexicon}
}