NLPPort: A Pipeline for Portuguese NLP (Short Paper)

Authors: Ricardo Rodrigues, Hugo Gonçalo Oliveira, Paulo Gomes




File

OASIcs.SLATE.2018.18.pdf
  • Filesize: 465 kB
  • 9 pages

Author Details

Ricardo Rodrigues
  • CISUC / ESEC, Polytechnic Institute of Coimbra, Portugal
Hugo Gonçalo Oliveira
  • CISUC / Department of Informatics Engineering, University of Coimbra, Portugal
Paulo Gomes
  • CISUC, University of Coimbra, Portugal

Cite As

Ricardo Rodrigues, Hugo Gonçalo Oliveira, and Paulo Gomes. NLPPort: A Pipeline for Portuguese NLP (Short Paper). In 7th Symposium on Languages, Applications and Technologies (SLATE 2018). Open Access Series in Informatics (OASIcs), Volume 62, pp. 18:1-18:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018) https://doi.org/10.4230/OASIcs.SLATE.2018.18

Abstract

Although there are tools for some of the most common natural language processing tasks in Portuguese, there is a lack of available cross-platform tools specifically targeted at Portuguese, from end to end, in particular for integration in projects developed in Java. To address this issue, we have developed and refined, over the last half-dozen years, NLPPort, a set of tools that can be used in a pipelined fashion, which we have made publicly available. In this paper, we present the major features of this set of tools.

Subject Classification

ACM Subject Classification
  • Computing methodologies → Natural language processing
Keywords
  • natural language processing
  • tools
  • Portuguese
