NLPPort: A Pipeline for Portuguese NLP (Short Paper)

Rodrigues, Ricardo; Gonçalo Oliveira, Hugo; Gomes, Paulo

doi:10.4230/OASIcs.SLATE.2018.18

Subject Classification

ACM Subject Classification

Computing methodologies → Natural language processing

Keywords

natural language processing
tools
Portuguese

Abstract

Although there are tools for some of the most common natural language processing tasks in Portuguese, there is a lack of available cross-platform tools specifically targeted at Portuguese, from end to end, namely for integration in projects developed in Java. To address this issue, we have developed and tweaked, over the last half-dozen years, NLPPort, a set of tools that can be used in a pipelined fashion, which we have made publicly available. In this paper, we present the major features of this set of tools.

Cite As

Ricardo Rodrigues, Hugo Gonçalo Oliveira, and Paulo Gomes. NLPPort: A Pipeline for Portuguese NLP (Short Paper). In 7th Symposium on Languages, Applications and Technologies (SLATE 2018). Open Access Series in Informatics (OASIcs), Volume 62, pp. 18:1-18:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018) https://doi.org/10.4230/OASIcs.SLATE.2018.18

Author Details

Ricardo Rodrigues

CISUC / ESEC, Polytechnic Institute of Coimbra, Portugal

Hugo Gonçalo Oliveira

CISUC / Department of Informatics Engineering, University of Coimbra, Portugal

Paulo Gomes

CISUC, University of Coimbra, Portugal

References

Cláudia Freitas, Paulo Rocha, and Eckhard Bick. Floresta Sintá(c)tica: Bigger, thicker and easier. In 8th Conference on Computational Processing of the Portuguese Language (PROPOR), pages 216-219, 2008.
Pablo Gamallo. An overview of open information extraction. In 3rd Symposium on Languages, Applications and Technologies (SLATE), pages 13-16, 2014.
Pablo Gamallo and Marcos Garcia. LinguaKit: Uma ferramenta multilingue para a análise linguística e a extracção de informação. Linguamática, 9(1):19-28, 2017.
Hugo Gonçalo Oliveira. Onto.PT: Towards the Automatic Construction of a Lexical Ontology for Portuguese. PhD thesis, University of Coimbra, 2013.
Hugo Gonçalo Oliveira, Diogo Costa, and Alexandre Pinto. Automatic generation of Internet Memes from Portuguese news headlines. In 12th Conference on Computational Processing of the Portuguese Language (PROPOR), volume 9727, pages 340-346, 2016.
Edward Loper and Steven Bird. NLTK: the natural language toolkit. In Workshop on Effective Tools and Methodologies for Teaching Natural Language Processing and Computational Linguistics (ETMTNLP), pages 63-70, 2002.
Christopher D. Manning, Mihai Surdeanu, John Bauer, Jenny Rose Finkel, Steven Bethard, and David McClosky. The Stanford CoreNLP natural language processing toolkit. In 52nd Annual Meeting of the Association for Computational Linguistics, pages 55-60, 2014.
Ana Oliveira Alves, Ricardo Rodrigues, and Hugo Gonçalo Oliveira. ASAPP: Alinhamento semântico automático de palavras aplicado ao Português. Linguamática, 8(2):43-58, 2016.
Lluís Padró, Miquel Collado, Samuel Reese, Marina Lloberes, and Irene Castellón. FreeLing 2.1: five years of open-source language processing tools. In 7th Language Resources and Evaluation Conference (LREC), pages 931-936, 2010.
Ricardo Rodrigues. RAPPort: A Fact-Based Question Answering System for Portuguese. PhD thesis, University of Coimbra, 2017.
Ricardo Rodrigues, Hugo Gonçalo Oliveira, and Paulo Gomes. LemPORT: a high-accuracy cross-platform lemmatizer for Portuguese. In 3rd Symposium on Languages, Applications and Technologies (SLATE), pages 267-274, 2014.
Derry Tanti Wijaya and Tom Mitchell. Mapping verbs in different languages to knowledge base relations using web text as interlingua. In 15th Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL), pages 818-827, 2016.
