56 Search Results for "Gonçalo Oliveira, Hugo"


Document
Question Answering over Linked Data with GPT-3

Authors: Bruno Faria, Dylan Perdigão, and Hugo Gonçalo Oliveira

Published in: OASIcs, Volume 113, 12th Symposium on Languages, Applications and Technologies (SLATE 2023)


Abstract
This paper explores GPT-3 for answering natural language questions over Linked Data. Different engines of the model and different approaches are adopted for answering questions in the QALD-9 dataset, namely: zero and few-shot SPARQL generation, as well as fine-tuning in the training portion of the dataset. Answers retrieved by the generated queries and answers generated directly by the model are also compared. Overall results are generally poor, but several insights are provided on using GPT-3 for the proposed task.

Cite as

Bruno Faria, Dylan Perdigão, and Hugo Gonçalo Oliveira. Question Answering over Linked Data with GPT-3. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 1:1-1:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)



@InProceedings{faria_et_al:OASIcs.SLATE.2023.1,
  author =	{Faria, Bruno and Perdig\~{a}o, Dylan and Gon\c{c}alo Oliveira, Hugo},
  title =	{{Question Answering over Linked Data with GPT-3}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{1:1--1:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.1},
  URN =		{urn:nbn:de:0030-drops-185155},
  doi =		{10.4230/OASIcs.SLATE.2023.1},
  annote =	{Keywords: SPARQL Generation, Prompt Engineering, Few-Shot Learning, Question Answering, GPT-3}
}
Document
Generating and Ranking Distractors for Multiple-Choice Questions in Portuguese

Authors: Hugo Gonçalo Oliveira, Igor Caetano, Renato Matos, and Hugo Amaro

Published in: OASIcs, Volume 113, 12th Symposium on Languages, Applications and Technologies (SLATE 2023)


Abstract
In the process of multiple-choice question generation, different methods are often considered for distractor acquisition, as an attempt to cover as many questions as possible. Some, however, result in many candidate distractors of variable quality, while only three or four are necessary. We implement some distractor generation methods for Portuguese and propose their combination and ranking with language models. Experimentation results confirm that this increases both coverage and suitability of the selected distractors.

Cite as

Hugo Gonçalo Oliveira, Igor Caetano, Renato Matos, and Hugo Amaro. Generating and Ranking Distractors for Multiple-Choice Questions in Portuguese. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 4:1-4:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)



@InProceedings{goncalooliveira_et_al:OASIcs.SLATE.2023.4,
  author =	{Gon\c{c}alo Oliveira, Hugo and Caetano, Igor and Matos, Renato and Amaro, Hugo},
  title =	{{Generating and Ranking Distractors for Multiple-Choice Questions in Portuguese}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{4:1--4:9},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.4},
  URN =		{urn:nbn:de:0030-drops-185185},
  doi =		{10.4230/OASIcs.SLATE.2023.4},
  annote =	{Keywords: Multiple-Choice Questions, Distractor Generation, Language Models}
}
Document
Question Answering For Toxicological Information Extraction

Authors: Bruno Carlos Luís Ferreira, Hugo Gonçalo Oliveira, Hugo Amaro, Ângela Laranjeiro, and Catarina Silva

Published in: OASIcs, Volume 104, 11th Symposium on Languages, Applications and Technologies (SLATE 2022)


Abstract
Working with large amounts of text data has become hectic and time-consuming. To reduce human effort and costs, and to make the process more efficient, companies and organizations resort to intelligent algorithms to automate and assist the manual work. This problem is also present in the field of toxicological analysis of chemical substances, where information needs to be searched from multiple documents. That said, we propose an approach that relies on Question Answering for acquiring information from unstructured data, in our case, English PDF documents containing information about physicochemical and toxicological properties of chemical substances. Experimental results confirm that our approach achieves promising results which can be applicable in the business scenario, especially if further revised by humans.

Cite as

Bruno Carlos Luís Ferreira, Hugo Gonçalo Oliveira, Hugo Amaro, Ângela Laranjeiro, and Catarina Silva. Question Answering For Toxicological Information Extraction. In 11th Symposium on Languages, Applications and Technologies (SLATE 2022). Open Access Series in Informatics (OASIcs), Volume 104, pp. 3:1-3:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)



@InProceedings{ferreira_et_al:OASIcs.SLATE.2022.3,
  author =	{Ferreira, Bruno Carlos Lu{\'\i}s and Gon\c{c}alo Oliveira, Hugo and Amaro, Hugo and Laranjeiro, \^{A}ngela and Silva, Catarina},
  title =	{{Question Answering For Toxicological Information Extraction}},
  booktitle =	{11th Symposium on Languages, Applications and Technologies (SLATE 2022)},
  pages =	{3:1--3:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-245-7},
  ISSN =	{2190-6807},
  year =	{2022},
  volume =	{104},
  editor =	{Cordeiro, Jo\~{a}o and Pereira, Maria Jo\~{a}o and Rodrigues, Nuno F. and Pais, Sebasti\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2022.3},
  URN =		{urn:nbn:de:0030-drops-167493},
  doi =		{10.4230/OASIcs.SLATE.2022.3},
  annote =	{Keywords: Information Extraction, Question Answering, Transformers, Toxicological Analysis}
}
Document
Analysing Off-The-Shelf Options for Question Answering with Portuguese FAQs

Authors: Hugo Gonçalo Oliveira, Sara Inácio, and Catarina Silva

Published in: OASIcs, Volume 104, 11th Symposium on Languages, Applications and Technologies (SLATE 2022)


Abstract
Following the current interest in developing automatic question answering systems, we analyse alternative approaches for finding suitable answers from a list of Frequently Asked Questions (FAQs), in Portuguese. These rely on different technologies, some more established and others more recent, and are all easily adaptable to new lists of FAQs, on new domains. We analyse the effort required for their configuration, the accuracy of their answers, and the time they take to get such answers. We conclude that traditional Information Retrieval (IR) can be a solution for smaller lists of FAQs, but approaches based on deep neural networks for sentence encoding are at least as reliable and less dependent on the number and complexity of the FAQs. We also contribute with a small dataset of Portuguese FAQs on the domain of telecommunications, which was used in our experiments.

Cite as

Hugo Gonçalo Oliveira, Sara Inácio, and Catarina Silva. Analysing Off-The-Shelf Options for Question Answering with Portuguese FAQs. In 11th Symposium on Languages, Applications and Technologies (SLATE 2022). Open Access Series in Informatics (OASIcs), Volume 104, pp. 19:1-19:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)



@InProceedings{goncalooliveira_et_al:OASIcs.SLATE.2022.19,
  author =	{Gon\c{c}alo Oliveira, Hugo and In\'{a}cio, Sara and Silva, Catarina},
  title =	{{Analysing Off-The-Shelf Options for Question Answering with Portuguese FAQs}},
  booktitle =	{11th Symposium on Languages, Applications and Technologies (SLATE 2022)},
  pages =	{19:1--19:11},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-245-7},
  ISSN =	{2190-6807},
  year =	{2022},
  volume =	{104},
  editor =	{Cordeiro, Jo\~{a}o and Pereira, Maria Jo\~{a}o and Rodrigues, Nuno F. and Pais, Sebasti\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2022.19},
  URN =		{urn:nbn:de:0030-drops-167652},
  doi =		{10.4230/OASIcs.SLATE.2022.19},
  annote =	{Keywords: Natural Language Processing, Portuguese, Question Answering, FAQs, Information Retrieval, Sentence Encoding, Transformers}
}
Document
On the Utility of Word Embeddings for Enriching OpenWordNet-PT

Authors: Hugo Gonçalo Oliveira, Fredson Silva de Souza Aguiar, and Alexandre Rademaker

Published in: OASIcs, Volume 93, 3rd Conference on Language, Data and Knowledge (LDK 2021)


Abstract
The maintenance of wordnets and lexical knowledge bases typically relies on time-consuming manual effort. In order to minimise this issue, we propose the exploitation of models of distributional semantics, namely word embeddings learned from corpora, in the automatic identification of relation instances missing in a wordnet. Analogy-solving methods are first used for learning a set of relations from analogy tests focused on each relation. Despite their low accuracy, we noted that a portion of the top-given answers are good suggestions of relation instances that could be included in the wordnet. This procedure is applied to the enrichment of OpenWordNet-PT, a public Portuguese wordnet. Relations are learned from data acquired from this resource, and illustrative examples are provided. Results are promising for accelerating the identification of missing relation instances, as we estimate that about 17% of the potential suggestions are good, a proportion that almost doubles if some are automatically invalidated.

Cite as

Hugo Gonçalo Oliveira, Fredson Silva de Souza Aguiar, and Alexandre Rademaker. On the Utility of Word Embeddings for Enriching OpenWordNet-PT. In 3rd Conference on Language, Data and Knowledge (LDK 2021). Open Access Series in Informatics (OASIcs), Volume 93, pp. 21:1-21:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)



@InProceedings{goncalooliveira_et_al:OASIcs.LDK.2021.21,
  author =	{Gon\c{c}alo Oliveira, Hugo and Aguiar, Fredson Silva de Souza and Rademaker, Alexandre},
  title =	{{On the Utility of Word Embeddings for Enriching OpenWordNet-PT}},
  booktitle =	{3rd Conference on Language, Data and Knowledge (LDK 2021)},
  pages =	{21:1--21:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-199-3},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{93},
  editor =	{Gromann, Dagmar and S\'{e}rasset, Gilles and Declerck, Thierry and McCrae, John P. and Gracia, Jorge and Bosque-Gil, Julia and Bobillo, Fernando and Heinisch, Barbara},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.LDK.2021.21},
  URN =		{urn:nbn:de:0030-drops-145578},
  doi =		{10.4230/OASIcs.LDK.2021.21},
  annote =	{Keywords: word embeddings, lexical resources, wordnet, analogy tests}
}
Document
MUAHAH: Taking the Most out of Simple Conversational Agents

Authors: Leonor Llansol, João Santos, Luís Duarte, José Santos, Mariana Gaspar, Ana Alves, Hugo Gonçalo Oliveira, and Luísa Coheur

Published in: OASIcs, Volume 94, 10th Symposium on Languages, Applications and Technologies (SLATE 2021)


Abstract
Dialog engines based on multi-agent architectures usually select a single agent, deemed to be the most suitable for a given scenario or for responding to a specific request, and disregard the answers from all of the other available agents. In this work, we present a multi-agent plug-and-play architecture that: (i) enables the integration of different agents; (ii) includes a decision maker module, responsible for selecting a suitable answer out of the responses of different agents. As usual, a single agent can be chosen to provide the final answer, but the latter can also be obtained from the responses of several agents, according to a voting scheme. We also describe three case studies in which we test several agents and decision making strategies; and show how new agents and a new decision strategy can be easily plugged in and take advantage of this platform in different ways. Experimentation also confirms that considering several agents contributes to better responses.

Cite as

Leonor Llansol, João Santos, Luís Duarte, José Santos, Mariana Gaspar, Ana Alves, Hugo Gonçalo Oliveira, and Luísa Coheur. MUAHAH: Taking the Most out of Simple Conversational Agents. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 7:1-7:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)



@InProceedings{llansol_et_al:OASIcs.SLATE.2021.7,
  author =	{Llansol, Leonor and Santos, Jo\~{a}o and Duarte, Lu{\'\i}s and Santos, Jos\'{e} and Gaspar, Mariana and Alves, Ana and Gon\c{c}alo Oliveira, Hugo and Coheur, Lu{\'\i}sa},
  title =	{{MUAHAH: Taking the Most out of Simple Conversational Agents}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{7:1--7:12},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.7},
  URN =		{urn:nbn:de:0030-drops-144248},
  doi =		{10.4230/OASIcs.SLATE.2021.7},
  annote =	{Keywords: Dialog systems, question answering, information retrieval, multi-agent}
}
Document
Exploring Different Methods for Solving Analogies with Portuguese Word Embeddings

Authors: Tiago Sousa, Hugo Gonçalo Oliveira, and Ana Alves

Published in: OASIcs, Volume 83, 9th Symposium on Languages, Applications and Technologies (SLATE 2020)


Abstract
A common way of assessing static word embeddings is to use them for solving analogies of the kind "what is to king as man is to woman?". For this purpose, the vector offset method (king - man + woman = queen), also known as 3CosAdd, has been effectively used for solving analogies and assessing different models of word embeddings in different languages. However, some researchers pointed out that this method is not the most effective for this purpose. Following this, we tested alternative analogy solving methods (3CosMul, 3CosAvg, LRCos) in Portuguese word embeddings and confirmed the previous statement. Specifically, those methods are used to answer the Portuguese version of the Google Analogy Test, dubbed LX-4WAnalogies, which covers syntactic and semantic analogies of different kinds. We discuss the accuracy of different methods applied to different models of embeddings and draw some conclusions. Indeed, all methods outperform 3CosAdd, and the best performance is consistently achieved with LRCos, in GloVe.
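
The vector offset idea in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the three-dimensional toy vectors, the function name `solve_analogy`, and the smoothing constant `eps` are assumptions made purely for illustration. 3CosAdd is written as a sum of cosines over unit-length vectors, and 3CosMul uses the commonly cited multiplicative form with cosines shifted into [0, 1].

```python
import numpy as np

# Toy embeddings invented for illustration only (not real Portuguese vectors).
# Dimensions loosely stand for: royalty, gender, unrelated.
TOY = {
    "king":  [1.0,  1.0, 0.0],
    "queen": [1.0, -1.0, 0.0],
    "man":   [0.0,  1.0, 0.0],
    "woman": [0.0, -1.0, 0.0],
    "apple": [0.0,  0.0, 1.0],
}


def solve_analogy(emb, a, a_star, b, method="3CosAdd", eps=1e-3):
    """Return the word b* that best completes a : a* :: b : b*."""
    # Normalise every vector so that a dot product equals the cosine.
    vecs = {}
    for word, values in emb.items():
        v = np.asarray(values, dtype=float)
        vecs[word] = v / np.linalg.norm(v)

    best_word, best_score = None, -np.inf
    for word, x in vecs.items():
        if word in (a, a_star, b):
            continue  # the question words are excluded from the candidates
        cos_a = float(x @ vecs[a])
        cos_a_star = float(x @ vecs[a_star])
        cos_b = float(x @ vecs[b])
        if method == "3CosAdd":
            # Additive combination, i.e. the vector offset a* - a + b.
            score = cos_a_star - cos_a + cos_b
        elif method == "3CosMul":
            # Multiplicative combination; cosines are shifted to [0, 1].
            pa, pa_star, pb = ((c + 1) / 2 for c in (cos_a, cos_a_star, cos_b))
            score = pa_star * pb / (pa + eps)
        else:
            raise ValueError(f"unknown method: {method}")
        if score > best_score:
            best_word, best_score = word, score
    return best_word


if __name__ == "__main__":
    for m in ("3CosAdd", "3CosMul"):
        print(m, solve_analogy(TOY, "man", "king", "woman", method=m))
```

On these toy vectors both methods pick the same word, so the sketch only shows how the scores are formed; it says nothing about the accuracy differences reported in the paper.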

Cite as

Tiago Sousa, Hugo Gonçalo Oliveira, and Ana Alves. Exploring Different Methods for Solving Analogies with Portuguese Word Embeddings. In 9th Symposium on Languages, Applications and Technologies (SLATE 2020). Open Access Series in Informatics (OASIcs), Volume 83, pp. 9:1-9:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)



@InProceedings{sousa_et_al:OASIcs.SLATE.2020.9,
  author =	{Sousa, Tiago and Gon\c{c}alo Oliveira, Hugo and Alves, Ana},
  title =	{{Exploring Different Methods for Solving Analogies with Portuguese Word Embeddings}},
  booktitle =	{9th Symposium on Languages, Applications and Technologies (SLATE 2020)},
  pages =	{9:1--9:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-165-8},
  ISSN =	{2190-6807},
  year =	{2020},
  volume =	{83},
  editor =	{Sim\~{o}es, Alberto and Henriques, Pedro Rangel and Queir\'{o}s, Ricardo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2020.9},
  URN =		{urn:nbn:de:0030-drops-130229},
  doi =		{10.4230/OASIcs.SLATE.2020.9},
  annote =	{Keywords: analogies, word embeddings, semantic relations, syntactic relations, Portuguese}
}
Document
Short Paper
Assessing Factoid Question-Answer Generation for Portuguese (Short Paper)

Authors: João Ferreira, Ricardo Rodrigues, and Hugo Gonçalo Oliveira

Published in: OASIcs, Volume 83, 9th Symposium on Languages, Applications and Technologies (SLATE 2020)


Abstract
We present work on the automatic generation of question-answer pairs in Portuguese, useful, for instance, for populating the knowledge-base of question-answering systems. This includes: (i) a new corpus of close to 600 factoid sentences, manually created from an existing corpus of questions and answers, used as our benchmark; (ii) two approaches for the automatic generation of question-answer pairs, which can be seen as baselines; (iii) results of those approaches in the corpus.

Cite as

João Ferreira, Ricardo Rodrigues, and Hugo Gonçalo Oliveira. Assessing Factoid Question-Answer Generation for Portuguese (Short Paper). In 9th Symposium on Languages, Applications and Technologies (SLATE 2020). Open Access Series in Informatics (OASIcs), Volume 83, pp. 16:1-16:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)



@InProceedings{ferreira_et_al:OASIcs.SLATE.2020.16,
  author =	{Ferreira, Jo\~{a}o and Rodrigues, Ricardo and Gon\c{c}alo Oliveira, Hugo},
  title =	{{Assessing Factoid Question-Answer Generation for Portuguese}},
  booktitle =	{9th Symposium on Languages, Applications and Technologies (SLATE 2020)},
  pages =	{16:1--16:9},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-165-8},
  ISSN =	{2190-6807},
  year =	{2020},
  volume =	{83},
  editor =	{Sim\~{o}es, Alberto and Henriques, Pedro Rangel and Queir\'{o}s, Ricardo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2020.16},
  URN =		{urn:nbn:de:0030-drops-130298},
  doi =		{10.4230/OASIcs.SLATE.2020.16},
  annote =	{Keywords: Question-Answer Generation, Corpus, NLP, Portuguese}
}
Document
Complete Volume
OASIcs, Volume 74, SLATE'19, Complete Volume

Authors: Ricardo Rodrigues, Jan Janoušek, Luís Ferreira, Luísa Coheur, Fernando Batista, and Hugo Gonçalo Oliveira

Published in: OASIcs, Volume 74, 8th Symposium on Languages, Applications and Technologies (SLATE 2019)


Abstract
OASIcs, Volume 74, SLATE'19, Complete Volume

Cite as

8th Symposium on Languages, Applications and Technologies (SLATE 2019). Open Access Series in Informatics (OASIcs), Volume 74, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)



@Proceedings{rodrigues_et_al:OASIcs.SLATE.2019,
  title =	{{OASIcs, Volume 74, SLATE'19, Complete Volume}},
  booktitle =	{8th Symposium on Languages, Applications and Technologies (SLATE 2019)},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-114-6},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{74},
  editor =	{Rodrigues, Ricardo and Janou\v{s}ek, Jan and Ferreira, Lu{\'\i}s and Coheur, Lu{\'\i}sa and Batista, Fernando and Gon\c{c}alo Oliveira, Hugo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2019},
  URN =		{urn:nbn:de:0030-drops-109008},
  doi =		{10.4230/OASIcs.SLATE.2019},
  annote =	{Keywords: Computing methodologies, Natural language processing, Software and its engineering, Compilers; Information systems, World Wide Web}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Ricardo Rodrigues, Jan Janoušek, Luís Ferreira, Luísa Coheur, Fernando Batista, and Hugo Gonçalo Oliveira

Published in: OASIcs, Volume 74, 8th Symposium on Languages, Applications and Technologies (SLATE 2019)


Abstract
Front Matter, Table of Contents, Preface, Conference Organization

Cite as

8th Symposium on Languages, Applications and Technologies (SLATE 2019). Open Access Series in Informatics (OASIcs), Volume 74, pp. 0:i-0:xviii, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)



@InProceedings{rodrigues_et_al:OASIcs.SLATE.2019.0,
  author =	{Rodrigues, Ricardo and Janou\v{s}ek, Jan and Ferreira, Lu{\'\i}s and Coheur, Lu{\'\i}sa and Batista, Fernando and Gon\c{c}alo Oliveira, Hugo},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{8th Symposium on Languages, Applications and Technologies (SLATE 2019)},
  pages =	{0:i--0:xviii},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-114-6},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{74},
  editor =	{Rodrigues, Ricardo and Janou\v{s}ek, Jan and Ferreira, Lu{\'\i}s and Coheur, Lu{\'\i}sa and Batista, Fernando and Gon\c{c}alo Oliveira, Hugo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2019.0},
  URN =		{urn:nbn:de:0030-drops-108679},
  doi =		{10.4230/OASIcs.SLATE.2019.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
Document
Graph-of-Entity: A Model for Combined Data Representation and Retrieval

Authors: José Devezas, Carla Lopes, and Sérgio Nunes

Published in: OASIcs, Volume 74, 8th Symposium on Languages, Applications and Technologies (SLATE 2019)


Abstract
Managing large volumes of digital documents along with the information they contain, or are associated with, can be challenging. As systems become more intelligent, it increasingly makes sense to power retrieval through all available data, where every lead makes it easier to reach relevant documents or entities. Modern search is heavily powered by structured knowledge, but users still query using keywords or, at the very best, telegraphic natural language. As search becomes increasingly dependent on the integration of text and knowledge, novel approaches for a unified representation of combined data present the opportunity to unlock new ranking strategies. We tackle entity-oriented search using graph-based approaches for representation and retrieval. In particular, we propose the graph-of-entity, a novel approach for indexing combined data, where terms, entities and their relations are jointly represented. We compare the graph-of-entity with the graph-of-word, a text-only model, verifying that, overall, it does not yet achieve a better performance, despite obtaining a higher precision. Our assessment was based on a small subset of the INEX 2009 Wikipedia Collection, created from a sample of 10 topics and respectively judged documents. The offline evaluation we do here is complementary to its counterpart from TREC 2017 OpenSearch track, where, during our participation, we had assessed graph-of-entity in an online setting, through team-draft interleaving.

Cite as

José Devezas, Carla Lopes, and Sérgio Nunes. Graph-of-Entity: A Model for Combined Data Representation and Retrieval. In 8th Symposium on Languages, Applications and Technologies (SLATE 2019). Open Access Series in Informatics (OASIcs), Volume 74, pp. 1:1-1:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)



@InProceedings{devezas_et_al:OASIcs.SLATE.2019.1,
  author =	{Devezas, Jos\'{e} and Lopes, Carla and Nunes, S\'{e}rgio},
  title =	{{Graph-of-Entity: A Model for Combined Data Representation and Retrieval}},
  booktitle =	{8th Symposium on Languages, Applications and Technologies (SLATE 2019)},
  pages =	{1:1--1:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-114-6},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{74},
  editor =	{Rodrigues, Ricardo and Janou\v{s}ek, Jan and Ferreira, Lu{\'\i}s and Coheur, Lu{\'\i}sa and Batista, Fernando and Gon\c{c}alo Oliveira, Hugo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2019.1},
  URN =		{urn:nbn:de:0030-drops-108686},
  doi =		{10.4230/OASIcs.SLATE.2019.1},
  annote =	{Keywords: Entity-oriented search, graph-based models, collection-based graph}
}
Document
Using Lucene for Developing a Question-Answering Agent in Portuguese

Authors: Hugo Gonçalo Oliveira, Ricardo Filipe, Ricardo Rodrigues, and Ana Alves

Published in: OASIcs, Volume 74, 8th Symposium on Languages, Applications and Technologies (SLATE 2019)


Abstract
Given the limitations of available platforms for creating conversational agents, and that a question-answering agent suffices in many scenarios, we take advantage of the Information Retrieval library Lucene for developing such an agent for Portuguese. The solution described answers natural language questions based on an indexed list of FAQs. Its adaptation to different domains is a matter of changing the underlying list. Different configurations of this solution, mostly on the language analysis level, resulted in different search strategies, which were tested for answering questions about the economic activity in Portugal. In addition to comparing the different search strategies, we concluded that, towards better answers, it is fruitful to combine the results of different strategies with a voting method.

Cite as

Hugo Gonçalo Oliveira, Ricardo Filipe, Ricardo Rodrigues, and Ana Alves. Using Lucene for Developing a Question-Answering Agent in Portuguese. In 8th Symposium on Languages, Applications and Technologies (SLATE 2019). Open Access Series in Informatics (OASIcs), Volume 74, pp. 2:1-2:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)



@InProceedings{goncalooliveira_et_al:OASIcs.SLATE.2019.2,
  author =	{Gon\c{c}alo Oliveira, Hugo and Filipe, Ricardo and Rodrigues, Ricardo and Alves, Ana},
  title =	{{Using Lucene for Developing a Question-Answering Agent in Portuguese}},
  booktitle =	{8th Symposium on Languages, Applications and Technologies (SLATE 2019)},
  pages =	{2:1--2:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-114-6},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{74},
  editor =	{Rodrigues, Ricardo and Janou\v{s}ek, Jan and Ferreira, Lu{\'\i}s and Coheur, Lu{\'\i}sa and Batista, Fernando and Gon\c{c}alo Oliveira, Hugo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2019.2},
  URN =		{urn:nbn:de:0030-drops-108692},
  doi =		{10.4230/OASIcs.SLATE.2019.2},
  annote =	{Keywords: information retrieval, question answering, natural language interface, natural language processing, natural language understanding}
}
Document
Tracing Naming Semantics in Unit Tests of Popular Github Android Projects

Authors: Matej Madeja and Jaroslav Porubän

Published in: OASIcs, Volume 74, 8th Symposium on Languages, Applications and Technologies (SLATE 2019)


Abstract
The tests are so closely linked to the source code that we consider them up-to-date documentation. Developers are aware of recommended naming conventions and other best practices that should be used to write tests. In this paper we focus on how the developers test in practice and what conventions they use. For the analysis, 5 very popular Android projects from Github were selected. The results show that 49 % of tests contain full and 76 % of tests contain a partial unit under test (UUT) method name in their name. Further, it was observed that UUT was only rarely tested by multiple test classes and thus in cases when the tester wanted to distinguish the way he or she worked with the tested object. The analysis of this paper shows that the word "test" in the test title is not a reliable metric for identifying the test. Apart from assertions, the developers use statements like verify, try-catch and throw exception to verify the correctness of UUT functionality. At the same time, it was found that the test titles contained keywords which could lead to the identification of UUT, use case of test or data used for test. It was also found that the words in the test title were very often found in its body and, to a lesser extent, in the UUT body, which indicated the use of similar vocabulary in tests and UUT.

Cite as

Matej Madeja and Jaroslav Porubän. Tracing Naming Semantics in Unit Tests of Popular Github Android Projects. In 8th Symposium on Languages, Applications and Technologies (SLATE 2019). Open Access Series in Informatics (OASIcs), Volume 74, pp. 3:1-3:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)



@InProceedings{madeja_et_al:OASIcs.SLATE.2019.3,
  author =	{Madeja, Matej and Porub\"{a}n, Jaroslav},
  title =	{{Tracing Naming Semantics in Unit Tests of Popular Github Android Projects}},
  booktitle =	{8th Symposium on Languages, Applications and Technologies (SLATE 2019)},
  pages =	{3:1--3:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-114-6},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{74},
  editor =	{Rodrigues, Ricardo and Janou\v{s}ek, Jan and Ferreira, Lu{\'\i}s and Coheur, Lu{\'\i}sa and Batista, Fernando and Gon\c{c}alo Oliveira, Hugo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2019.3},
  URN =		{urn:nbn:de:0030-drops-108705},
  doi =		{10.4230/OASIcs.SLATE.2019.3},
  annote =	{Keywords: unit tests, android, real testing practices, program comprehension}
}
Document
Robust Phoneme Recognition with Little Data

Authors: Christopher Dane Shulby, Martha Dais Ferreira, Rodrigo F. de Mello, and Sandra Maria Aluisio

Published in: OASIcs, Volume 74, 8th Symposium on Languages, Applications and Technologies (SLATE 2019)


Abstract
A common belief in the community is that deep learning requires large datasets to be effective. We show that with careful parameter selection, deep feature extraction can be applied even to small datasets. We also explore exactly how much data is necessary to guarantee learning by convergence analysis and calculating the shattering coefficient for the algorithms used. Another problem is that state-of-the-art results are rarely reproducible because they use proprietary datasets, pretrained networks and/or weight initializations from other larger networks. We present a two-fold novelty for this situation where a carefully designed CNN architecture, together with a knowledge-driven classifier achieves nearly state-of-the-art phoneme recognition results with absolutely no pretraining or external weight initialization. We also beat the best replication study of the state of the art with a 28% FER. More importantly, we are able to achieve transparent, reproducible frame-level accuracy and, additionally, perform a convergence analysis to show the generalization capacity of the model providing statistical evidence that our results are not obtained by chance. Furthermore, we show how algorithms with strong learning guarantees can not only benefit from raw data extraction but contribute with more robust results.

Cite as

Christopher Dane Shulby, Martha Dais Ferreira, Rodrigo F. de Mello, and Sandra Maria Aluisio. Robust Phoneme Recognition with Little Data. In 8th Symposium on Languages, Applications and Technologies (SLATE 2019). Open Access Series in Informatics (OASIcs), Volume 74, pp. 4:1-4:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@InProceedings{shulby_et_al:OASIcs.SLATE.2019.4,
  author =	{Shulby, Christopher Dane and Ferreira, Martha Dais and de Mello, Rodrigo F. and Aluisio, Sandra Maria},
  title =	{{Robust Phoneme Recognition with Little Data}},
  booktitle =	{8th Symposium on Languages, Applications and Technologies (SLATE 2019)},
  pages =	{4:1--4:11},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-114-6},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{74},
  editor =	{Rodrigues, Ricardo and Janou\v{s}ek, Jan and Ferreira, Lu{\'\i}s and Coheur, Lu{\'\i}sa and Batista, Fernando and Gon\c{c}alo Oliveira, Hugo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2019.4},
  URN =		{urn:nbn:de:0030-drops-108715},
  doi =		{10.4230/OASIcs.SLATE.2019.4},
  annote =	{Keywords: feature extraction, acoustic modeling, phoneme recognition, statistical learning theory}
}
Document
Towards European Portuguese Conversational Assistants for Smart Homes

Authors: Maksym Ketsmur, António Teixeira, Nuno Almeida, and Samuel Silva

Published in: OASIcs, Volume 74, 8th Symposium on Languages, Applications and Technologies (SLATE 2019)


Abstract
Nowadays, smart environments, such as Smart Homes, are becoming a reality, due to the access to a wide variety of smart devices at a low cost. These devices are connected to the home network and inhabitants can interact with them using smartphones, tablets and smart assistants, a feature with rising popularity. The diversity of devices, the users' expectations regarding Smart Homes, and assistants' requirements pose several challenges. In this context, a Smart Home Assistant capable of conversation and device integration can be a valuable help to the inhabitants, not only for smart device control, but also to obtain valuable information and have a broader picture of how the house and its devices behave. This paper presents the current stage of development of one such assistant, targeting European Portuguese, not only supporting the control of home devices, but also providing a potentially more natural way to access a variety of information regarding the home and its devices. The development has been carried out within the scope of the Smart Green Homes (SGH) project.

Cite as

Maksym Ketsmur, António Teixeira, Nuno Almeida, and Samuel Silva. Towards European Portuguese Conversational Assistants for Smart Homes. In 8th Symposium on Languages, Applications and Technologies (SLATE 2019). Open Access Series in Informatics (OASIcs), Volume 74, pp. 5:1-5:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2019)


@InProceedings{ketsmur_et_al:OASIcs.SLATE.2019.5,
  author =	{Ketsmur, Maksym and Teixeira, Ant\'{o}nio and Almeida, Nuno and Silva, Samuel},
  title =	{{Towards European Portuguese Conversational Assistants for Smart Homes}},
  booktitle =	{8th Symposium on Languages, Applications and Technologies (SLATE 2019)},
  pages =	{5:1--5:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-114-6},
  ISSN =	{2190-6807},
  year =	{2019},
  volume =	{74},
  editor =	{Rodrigues, Ricardo and Janou\v{s}ek, Jan and Ferreira, Lu{\'\i}s and Coheur, Lu{\'\i}sa and Batista, Fernando and Gon\c{c}alo Oliveira, Hugo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2019.5},
  URN =		{urn:nbn:de:0030-drops-108725},
  doi =		{10.4230/OASIcs.SLATE.2019.5},
  annote =	{Keywords: Smart Homes, Conversational Assistants, Ontology}
}