Search Results

Documents authored by Gamallo, Pablo


LeMe-PT: A Medical Package Leaflet Corpus for Portuguese

Authors: Alberto Simões and Pablo Gamallo

Published in: OASIcs, Volume 94, 10th Symposium on Languages, Applications and Technologies (SLATE 2021)


Abstract
The current trend in natural language processing is the use of machine learning, in every field from summarization to machine translation. For these techniques to be applied, resources are needed, namely quality corpora. While there are large quantities of corpora for the Portuguese language, there is a lack of technical and focused corpora. Therefore, in this article we present a new corpus, built from drug package leaflets. We describe its structure and contents, and discuss possible exploration directions.

Cite as

Alberto Simões and Pablo Gamallo. LeMe-PT: A Medical Package Leaflet Corpus for Portuguese. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 10:1-10:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)



@InProceedings{simoes_et_al:OASIcs.SLATE.2021.10,
  author =	{Sim\~{o}es, Alberto and Gamallo, Pablo},
  title =	{{LeMe-PT: A Medical Package Leaflet Corpus for Portuguese}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{10:1--10:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.10},
  URN =		{urn:nbn:de:0030-drops-144277},
  doi =		{10.4230/OASIcs.SLATE.2021.10},
  annote =	{Keywords: drug corpora, information extraction, word embeddings}
}
Identifying Causal Relations in Legal Documents with Dependency Syntactic Analysis

Authors: Pablo Gamallo, Patricia Martín-Rodilla, and Beatriz Calderón

Published in: OASIcs, Volume 74, 8th Symposium on Languages, Applications and Technologies (SLATE 2019)


Abstract
This article describes a method for enriching a dependency-based parser with causal connectors. Our specific objective is to identify causal relationships between elementary discourse units in Spanish legal texts. For this purpose, the approach we follow is to search for specific discourse connectives which are taken as causal dependencies relating an effect event (head) with a verbal or nominal cause (dependent). As a result, we turn a specific syntactic parser into a discourse parser aimed at recognizing causal structures.
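The connective-based approach described in the abstract can be illustrated with a minimal Python sketch. The connective list and example sentence below are illustrative assumptions, not the authors' actual parser or lexicon; the idea is simply that a causal connective splits a sentence into an effect span (head) and a cause span (dependent):

```python
# Hypothetical connective inventory; the authors' lexicon is not reproduced here.
CAUSAL_CONNECTIVES = ["porque", "debido a", "a causa de", "ya que"]

def find_causal_relation(sentence):
    """Split a sentence around the first causal connective found.

    Returns a dict with the effect (head), the connective, and the
    cause (dependent), or None if no connective occurs. This is a
    surface-level sketch: it ignores syntax and sentence-initial
    connectives, which the real dependency-based method handles.
    """
    lowered = sentence.lower()
    for conn in CAUSAL_CONNECTIVES:
        idx = lowered.find(" " + conn + " ")
        if idx != -1:
            effect = sentence[:idx].strip()
            cause = sentence[idx + len(conn) + 2:].strip()
            return {"effect": effect, "connective": conn, "cause": cause}
    return None

rel = find_causal_relation("El contrato quedó anulado porque la firma era inválida.")
```

Here `rel` pairs the effect "El contrato quedó anulado" with the cause "la firma era inválida." via the connective "porque", mimicking a causal dependency edge.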

Cite as

Pablo Gamallo, Patricia Martín-Rodilla, and Beatriz Calderón. Identifying Causal Relations in Legal Documents with Dependency Syntactic Analysis. In 8th Symposium on Languages, Applications and Technologies (SLATE 2019). Open Access Series in Informatics (OASIcs), Volume 74, pp. 20:1-20:6, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)



@InProceedings{gamallo_et_al:OASIcs.SLATE.2019.20,
  author =	{Gamallo, Pablo and Mart{\'\i}n-Rodilla, Patricia and Calder\'{o}n, Beatriz},
  title =	{{Identifying Causal Relations in Legal Documents with Dependency Syntactic Analysis}},
  booktitle =	{8th Symposium on Languages, Applications and Technologies (SLATE 2019)},
  pages =	{20:1--20:6},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-114-6},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{74},
  editor =	{Rodrigues, Ricardo and Janou\v{s}ek, Jan and Ferreira, Lu{\'\i}s and Coheur, Lu{\'\i}sa and Batista, Fernando and Gon\c{c}alo Oliveira, Hugo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2019.20},
  URN =		{urn:nbn:de:0030-drops-108870},
  doi =		{10.4230/OASIcs.SLATE.2019.20},
  annote =	{Keywords: Dependency Analysis, Discourse Analysis, Causal Markers, Legal Documents}
}
Evaluation of Distributional Models with the Outlier Detection Task

Authors: Pablo Gamallo

Published in: OASIcs, Volume 62, 7th Symposium on Languages, Applications and Technologies (SLATE 2018)


Abstract
In this article, we define the outlier detection task and use it to compare neural-based word embeddings with transparent count-based distributional representations. Using the English Wikipedia as the text source to train the models, we observed that embeddings outperform count-based representations when their contexts are made up of bags of words. However, there are no sharp differences between the two models if the word contexts are defined as syntactic dependencies. In general, syntax-based models tend to perform better than those based on bags of words for this specific task. Similar experiments were carried out for Portuguese, with similar results. The test datasets we have created for the outlier detection task in English and Portuguese are released.
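The outlier detection task itself can be sketched in a few lines of Python. The toy 3-dimensional vectors below are invented for illustration (the paper uses embeddings trained on Wikipedia): given a cluster of related words plus one intruder, a good distributional model should rank the intruder lowest by mean pairwise similarity.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Hypothetical toy embeddings: four fruits plus one intruder ("car").
vectors = {
    "apple":  [0.90, 0.10, 0.00],
    "pear":   [0.80, 0.20, 0.10],
    "grape":  [0.85, 0.15, 0.05],
    "banana": [0.75, 0.25, 0.00],
    "car":    [0.05, 0.10, 0.95],
}

def outlier(vectors):
    """Return the word with the lowest mean similarity to the others
    (the compactness score used to rank outlier candidates)."""
    def compactness(word):
        others = [w for w in vectors if w != word]
        return sum(cosine(vectors[word], vectors[o]) for o in others) / len(others)
    return min(vectors, key=compactness)

print(outlier(vectors))  # -> car
```

A model is then evaluated by how often the true intruder receives the lowest compactness score across many such word clusters.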

Cite as

Pablo Gamallo. Evaluation of Distributional Models with the Outlier Detection Task. In 7th Symposium on Languages, Applications and Technologies (SLATE 2018). Open Access Series in Informatics (OASIcs), Volume 62, pp. 13:1-13:8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)



@InProceedings{gamallo:OASIcs.SLATE.2018.13,
  author =	{Gamallo, Pablo},
  title =	{{Evaluation of Distributional Models with the Outlier Detection Task}},
  booktitle =	{7th Symposium on Languages, Applications and Technologies (SLATE 2018)},
  pages =	{13:1--13:8},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-072-9},
  ISSN =	{2190-6807},
  year =	{2018},
  volume =	{62},
  editor =	{Henriques, Pedro Rangel and Leal, Jos\'{e} Paulo and Leit\~{a}o, Ant\'{o}nio Menezes and Guinovart, Xavier G\'{o}mez},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2018.13},
  URN =		{urn:nbn:de:0030-drops-92717},
  doi =		{10.4230/OASIcs.SLATE.2018.13},
  annote =	{Keywords: distributional semantics, dependency analysis, outlier detection, similarity}
}
An Overview of Open Information Extraction (Invited Talk)

Authors: Pablo Gamallo

Published in: OASIcs, Volume 38, 3rd Symposium on Languages, Applications and Technologies (2014)


Abstract
Open Information Extraction (OIE) is a recent unsupervised strategy to extract large amounts of basic propositions (verb-based triples) from massive text corpora, and it scales to Web-size document collections. We will introduce the main properties of this extraction method.

Cite as

Pablo Gamallo. An Overview of Open Information Extraction (Invited Talk). In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 13-16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)



@InProceedings{gamallo:OASIcs.SLATE.2014.13,
  author =	{Gamallo, Pablo},
  title =	{{An Overview of Open Information Extraction}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{13--16},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Pereira, Maria Jo\~{a}o Varanda and Leal, Jos\'{e} Paulo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2014.13},
  URN =		{urn:nbn:de:0030-drops-45559},
  doi =		{10.4230/OASIcs.SLATE.2014.13},
  annote =	{Keywords: information extraction, natural language processing}
}