OASIcs, Volume 94

10th Symposium on Languages, Applications and Technologies (SLATE 2021)




Event

SLATE 2021, July 1-2, 2021, Vila do Conde/Póvoa de Varzim, Portugal

Editors

Ricardo Queirós
  • Escola Superior de Media Artes e Design, Politécnico do Porto, Portugal
Mário Pinto
  • Escola Superior de Media Artes e Design, Politécnico do Porto, Portugal
Alberto Simões
  • Instituto Politécnico do Cávado e do Ave, Portugal
Filipe Portela
  • Universidade do Minho, Portugal
Maria João Pereira
  • Instituto Politécnico de Bragança, Portugal

Publication Details

  • published at: 2021-08-10
  • Publisher: Schloss Dagstuhl – Leibniz-Zentrum für Informatik
  • ISBN: 978-3-95977-202-0
  • DBLP: db/conf/slate/slate2021

Documents

Document
Complete Volume
OASIcs, Volume 94, SLATE 2021, Complete Volume

Authors: Ricardo Queirós, Mário Pinto, Alberto Simões, Filipe Portela, and Maria João Pereira


Abstract
OASIcs, Volume 94, SLATE 2021, Complete Volume

Cite as

10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 1-260, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@Proceedings{queiros_et_al:OASIcs.SLATE.2021,
  title =	{{OASIcs, Volume 94, SLATE 2021, Complete Volume}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{1--260},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021},
  URN =		{urn:nbn:de:0030-drops-144165},
  doi =		{10.4230/OASIcs.SLATE.2021},
  annote =	{Keywords: OASIcs, Volume 94, SLATE 2021, Complete Volume}
}
Document
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Ricardo Queirós, Mário Pinto, Alberto Simões, Filipe Portela, and Maria João Pereira


Abstract
Front Matter, Table of Contents, Preface, Conference Organization

Cite as

10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 0:i-0:xvi, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{queiros_et_al:OASIcs.SLATE.2021.0,
  author =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{0:i--0:xvi},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.0},
  URN =		{urn:nbn:de:0030-drops-144171},
  doi =		{10.4230/OASIcs.SLATE.2021.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
}
Document
Invited Talk
Natural and Artificial Intelligence; Natural and Artificial Language (Invited Talk)

Authors: Diana Santos


Abstract
This text starts by discussing what it means to be intelligent for humans and machines, what the purpose of language is, and how human language is fundamentally different from artificial languages. It presents the issue of values as one inescapable property of human language, and of human categorization in general, after reviewing five distinctive characteristics of natural language. Then it proceeds to discuss static word embeddings, raising two questions: is the wisdom of the crowd an appropriate justification for using the underlying large text collections? And have the differences between languages been taken into account when intrinsically evaluating Portuguese word embeddings?

Cite as

Diana Santos. Natural and Artificial Intelligence; Natural and Artificial Language (Invited Talk). In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 1:1-1:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{santos:OASIcs.SLATE.2021.1,
  author =	{Santos, Diana},
  title =	{{Natural and Artificial Intelligence; Natural and Artificial Language}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{1:1--1:11},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.1},
  URN =		{urn:nbn:de:0030-drops-144181},
  doi =		{10.4230/OASIcs.SLATE.2021.1},
  annote =	{Keywords: Artificial Intelligence, Natural Language Processing}
}
Document
Derzis: A Path Aware Linked Data Crawler

Authors: André Fernandes dos Santos and José Paulo Leal


Abstract
Consuming Semantic Web data presents several challenges, from the number of datasets it is composed of, to the (very) large size of some of those datasets and the uncertain availability of querying endpoints. According to its core principles, accessing linked data can be done simply by dereferencing the IRIs of RDF resources. This is a lightweight alternative for both clients and servers when compared to dataset dumps or SPARQL endpoints. The linked data interface does not support complex querying, but using it recursively may suffice to gather information about RDF resources, or to extract the relevant sub-graph, which can then be processed and queried using other methods. We present Derzis, an open source semantic web crawler capable of traversing the linked data cloud starting from a set of seed resources. Derzis maintains information about the paths followed while crawling, which allows property path-based restrictions on the crawling frontier to be defined.

Cite as

André Fernandes dos Santos and José Paulo Leal. Derzis: A Path Aware Linked Data Crawler. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 2:1-2:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{santos_et_al:OASIcs.SLATE.2021.2,
  author =	{Santos, Andr\'{e} Fernandes dos and Leal, Jos\'{e} Paulo},
  title =	{{Derzis: A Path Aware Linked Data Crawler}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{2:1--2:12},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.2},
  URN =		{urn:nbn:de:0030-drops-144198},
  doi =		{10.4230/OASIcs.SLATE.2021.2},
  annote =	{Keywords: Semantic web, linked open data, RDF, crawler}
}
Document
Major Minors - Ontological Representation of Minorities by Newspapers

Authors: Paulo Jorge Pereira Martins, Leandro José Abreu Dias Costa, and José Carlos Ramalho


Abstract
The stigma associated with certain minorities has changed throughout the years, yet there is no central data repository that enables concrete tracking of this representation. Articles published in renowned newspapers, mainly digital ones, are a way of determining the public perception of this subject, whether through the media representation (text and photo illustrations) or through user comments. The present paper showcases a project that attempts to address that shortage of data by providing a repository in the form of an ontology: RDF triplestores composing a semantic database (W3C standards for Semantic Web). This open-source project aims to be a research tool for mapping and studying the representation of minority groups in a Portuguese journalistic context over the course of two decades.

Cite as

Paulo Jorge Pereira Martins, Leandro José Abreu Dias Costa, and José Carlos Ramalho. Major Minors - Ontological Representation of Minorities by Newspapers. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 3:1-3:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{martins_et_al:OASIcs.SLATE.2021.3,
  author =	{Martins, Paulo Jorge Pereira and Costa, Leandro Jos\'{e} Abreu Dias and Ramalho, Jos\'{e} Carlos},
  title =	{{Major Minors - Ontological Representation of Minorities by Newspapers}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{3:1--3:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.3},
  URN =		{urn:nbn:de:0030-drops-144201},
  doi =		{10.4230/OASIcs.SLATE.2021.3},
  annote =	{Keywords: RDF, OWL, Ontologies, Knowledge Representation, Minorities}
}
Document
Lyntax - A grammar-Based Tool for Linguistics

Authors: Manuel Gouveia Carneiro de Sousa, Maria João Varanda Pereira, and Pedro Rangel Henriques


Abstract
This paper focuses on using the formalism of attribute grammars to create a tool that allows linguistics teachers to automatically construct their own processors, fully adapted to each linguistic exercise. The system developed, named Lyntax, is a compiler for a domain specific language which enables the teacher to specify different kinds of sentence structures and then ask the students to test their own sentences against those structures. The processor Lyntax validates the grammar (DSL program) written by the teacher, generating a processor every time the student defines a new sentence. For that, ANTLR is used in both steps, generating not only the specialized processor but also the visualization of the syntax tree for analysis purposes. An interface that supports the specification of the language was built, also allowing the use of the processor and the generation of the specific grammar, shielding the user from any calculations.

Cite as

Manuel Gouveia Carneiro de Sousa, Maria João Varanda Pereira, and Pedro Rangel Henriques. Lyntax - A grammar-Based Tool for Linguistics. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 4:1-4:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{desousa_et_al:OASIcs.SLATE.2021.4,
  author =	{de Sousa, Manuel Gouveia Carneiro and Pereira, Maria Jo\~{a}o Varanda and Henriques, Pedro Rangel},
  title =	{{Lyntax - A grammar-Based Tool for Linguistics}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{4:1--4:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.4},
  URN =		{urn:nbn:de:0030-drops-144213},
  doi =		{10.4230/OASIcs.SLATE.2021.4},
  annote =	{Keywords: Attribute Grammars, Linguistic Rules, Pedagogical Linguistic Tools}
}
Document
Programming Exercises Interoperability: The Case of a Non-Picky Consumer

Authors: Ricardo Queirós, José Carlos Paiva, and José Paulo Leal


Abstract
Problem-solving is considered one of the most important skills to retain in the coming decades for building a modern and proactive society. In this realm, learning computer programming is vital to enrich those skills. Practicing in this area boils down to solving programming exercises. In order to foster this practice, it is necessary to provide students with best-of-breed automated tools and a good set of exercises in a fair quantity covering the curricula of a typical programming course. Despite the increasing appearance of automated tools such as program evaluators, gamification engines and sophisticated web environments, access to exercises remains problematic. In fact, despite the existence of several code repositories (most of them built to feed computer programming contests), the majority store the exercises in proprietary formats and without any access facilities, hindering their use. This leaves teachers no option but to manually create programming exercises, which is time-consuming and error-prone, or simply to reuse the same exercises from previous years, which is considered a detrimental and limiting approach to developing multi-faceted and creative programmers. The article surveys the current interoperability efforts on programming exercises, more precisely in terms of serialization formats and communication protocols. This study will support the selection of an API to feed a code playground called LearnJS with random programming exercises.

Cite as

Ricardo Queirós, José Carlos Paiva, and José Paulo Leal. Programming Exercises Interoperability: The Case of a Non-Picky Consumer. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 5:1-5:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{queiros_et_al:OASIcs.SLATE.2021.5,
  author =	{Queir\'{o}s, Ricardo and Paiva, Jos\'{e} Carlos and Leal, Jos\'{e} Paulo},
  title =	{{Programming Exercises Interoperability: The Case of a Non-Picky Consumer}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{5:1--5:9},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.5},
  URN =		{urn:nbn:de:0030-drops-144220},
  doi =		{10.4230/OASIcs.SLATE.2021.5},
  annote =	{Keywords: programming exercises format, interoperability, automated assessment, learning programming}
}
Document
DataGen: JSON/XML Dataset Generator

Authors: Filipa Alves dos Santos, Hugo André Coelho Cardoso, João da Cunha e Costa, Válter Ferreira Picas Carvalho, and José Carlos Ramalho


Abstract
In this document we describe the steps towards the implementation of DataGen. DataGen is a versatile and powerful tool that allows for quick prototyping and testing of software applications. Currently, too few solutions offer both the complexity and the scalability necessary to generate datasets adequate to feed a data API or a more complex app, so that those applications can be tested with appropriate data volume and data complexity. The core of DataGen is a Domain Specific Language (DSL) that was created to specify datasets. This language underwent several updates: repeating fields (with no limit), fuzzy fields (statistically generated), lists, higher-order functions over lists, and custom-made transformation functions. The final result is a complex algebra that allows the generation of very complex datasets coping with very complex requirements. Throughout the paper we will give several examples of the possibilities. After generating a dataset, DataGen gives the user the possibility to generate a RESTful data API with that dataset, creating a running prototype. This solution has already been used in real-life cases, described in more detail throughout the paper, in which it was able to create the intended datasets successfully. These allowed the application’s performance to be tested and the right adjustments to be made. The tool is currently being deployed for general use.

Cite as

Filipa Alves dos Santos, Hugo André Coelho Cardoso, João da Cunha e Costa, Válter Ferreira Picas Carvalho, and José Carlos Ramalho. DataGen: JSON/XML Dataset Generator. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 6:1-6:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{santos_et_al:OASIcs.SLATE.2021.6,
  author =	{Santos, Filipa Alves dos and Cardoso, Hugo Andr\'{e} Coelho and da Cunha e Costa, Jo\~{a}o and Carvalho, V\'{a}lter Ferreira Picas and Ramalho, Jos\'{e} Carlos},
  title =	{{DataGen: JSON/XML Dataset Generator}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{6:1--6:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.6},
  URN =		{urn:nbn:de:0030-drops-144239},
  doi =		{10.4230/OASIcs.SLATE.2021.6},
  annote =	{Keywords: JSON, XML, Data Generation, Open Source, REST API, Strapi, JavaScript, Node.js, Vue.js, Scalability, Fault Tolerance, Dataset, DSL, PEG.js, MongoDB}
}
Document
MUAHAH: Taking the Most out of Simple Conversational Agents

Authors: Leonor Llansol, João Santos, Luís Duarte, José Santos, Mariana Gaspar, Ana Alves, Hugo Gonçalo Oliveira, and Luísa Coheur


Abstract
Dialog engines based on multi-agent architectures usually select a single agent, deemed to be the most suitable for a given scenario or for responding to a specific request, and disregard the answers from all of the other available agents. In this work, we present a multi-agent plug-and-play architecture that: (i) enables the integration of different agents; (ii) includes a decision maker module, responsible for selecting a suitable answer out of the responses of different agents. As usual, a single agent can be chosen to provide the final answer, but the latter can also be obtained from the responses of several agents, according to a voting scheme. We also describe three case studies in which we test several agents and decision making strategies; and show how new agents and a new decision strategy can be easily plugged in and take advantage of this platform in different ways. Experimentation also confirms that considering several agents contributes to better responses.
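
The voting idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the MUAHAH implementation: the agent functions, the plurality rule, and the tie-breaking by agent order are all assumptions made for illustration.

```python
from collections import Counter


def plurality_vote(answers):
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        return None
    counts = Counter(answers)
    best = max(counts.values())
    # Iterate in agent order so ties are broken deterministically.
    for answer in answers:
        if counts[answer] == best:
            return answer


# Hypothetical agents: each maps a user request to one candidate answer.
def agent_a(request):
    return "Lisbon"


def agent_b(request):
    return "Porto"


def agent_c(request):
    return "Lisbon"


agents = [agent_a, agent_b, agent_c]
request = "What is the capital of Portugal?"
final_answer = plurality_vote([agent(request) for agent in agents])
```

A new agent is added by appending another callable to `agents`; a different decision strategy would replace `plurality_vote` with another function of the same shape.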

Cite as

Leonor Llansol, João Santos, Luís Duarte, José Santos, Mariana Gaspar, Ana Alves, Hugo Gonçalo Oliveira, and Luísa Coheur. MUAHAH: Taking the Most out of Simple Conversational Agents. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 7:1-7:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{llansol_et_al:OASIcs.SLATE.2021.7,
  author =	{Llansol, Leonor and Santos, Jo\~{a}o and Duarte, Lu{\'\i}s and Santos, Jos\'{e} and Gaspar, Mariana and Alves, Ana and Gon\c{c}alo Oliveira, Hugo and Coheur, Lu{\'\i}sa},
  title =	{{MUAHAH: Taking the Most out of Simple Conversational Agents}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{7:1--7:12},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.7},
  URN =		{urn:nbn:de:0030-drops-144248},
  doi =		{10.4230/OASIcs.SLATE.2021.7},
  annote =	{Keywords: Dialog systems, question answering, information retrieval, multi-agent}
}
Document
NER in Archival Finding Aids

Authors: Luís Filipe Costa Cunha and José Carlos Ramalho


Abstract
At the moment, the vast majority of Portuguese archives with an online presence use a software solution to manage their finding aids, e.g. Digitarq or Archeevo. Most of these finding aids are written in natural language without any annotation that would enable a machine to identify named entities, geographical locations or even some dates. Such annotation would allow the machine to create smart browsing tools on top of those record contents, like entity linking and record linking. In this work we have created a set of datasets to train Machine Learning algorithms to find those named entities and geographical locations. After training several algorithms we tested them on several datasets and recorded their precision and accuracy. These results enabled us to draw some conclusions about what kind of precision we can achieve with this approach in this context and what to do with the results: do we have enough precision and accuracy to create toponymic and anthroponymic indexes for archival finding aids? Is this approach suitable in this context? These are some of the questions we intend to answer throughout this paper.

Cite as

Luís Filipe Costa Cunha and José Carlos Ramalho. NER in Archival Finding Aids. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 8:1-8:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{costacunha_et_al:OASIcs.SLATE.2021.8,
  author =	{Costa Cunha, Lu{\'\i}s Filipe and Ramalho, Jos\'{e} Carlos},
  title =	{{NER in Archival Finding Aids}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{8:1--8:16},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.8},
  URN =		{urn:nbn:de:0030-drops-144257},
  doi =		{10.4230/OASIcs.SLATE.2021.8},
  annote =	{Keywords: Named Entity Recognition, Archival Descriptions, Machine Learning, Deep Learning}
}
Document
Short Paper
Mooshak’s Diet Update: Introducing YAPExIL Format to Mooshak (Short Paper)

Authors: José Carlos Paiva, Ricardo Queirós, and José Paulo Leal


Abstract
Practice is pivotal in learning programming. Like many other automated assessment tools for programming assignments, Mooshak has been adopted by numerous educational practitioners to support them in delivering timely and accurate feedback to students during exercise solving. These tools specialize in the delivery and assessment of blank-sheet coding questions. However, the different phases of a student’s learning path may demand distinct types of exercises (e.g., bug fix and block sorting) to foster new competencies such as debugging programs and understanding unknown source code or, otherwise, to break the routine and keep engagement. Recently, a format for describing programming exercises, YAPExIL, which supports different types of activities, has been introduced. Unfortunately, no automated assessment tool yet supports this novel format. This paper describes a JavaScript library to transform YAPExIL packages into Mooshak problem packages (i.e., MEF format), keeping support for all exercise types. Moreover, its integration in an exercise authoring tool is described.

Cite as

José Carlos Paiva, Ricardo Queirós, and José Paulo Leal. Mooshak’s Diet Update: Introducing YAPExIL Format to Mooshak (Short Paper). In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 9:1-9:7, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{paiva_et_al:OASIcs.SLATE.2021.9,
  author =	{Paiva, Jos\'{e} Carlos and Queir\'{o}s, Ricardo and Leal, Jos\'{e} Paulo},
  title =	{{Mooshak’s Diet Update: Introducing YAPExIL Format to Mooshak}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{9:1--9:7},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.9},
  URN =		{urn:nbn:de:0030-drops-144261},
  doi =		{10.4230/OASIcs.SLATE.2021.9},
  annote =	{Keywords: programming exercises format, interoperability, automated assessment, learning programming}
}
Document
LeMe-PT: A Medical Package Leaflet Corpus for Portuguese

Authors: Alberto Simões and Pablo Gamallo


Abstract
The current trend in natural language processing is the use of machine learning. This is being done in every field, from summarization to machine translation. For these techniques to be applied, resources are needed, namely quality corpora. While there are large quantities of corpora for the Portuguese language, there is a lack of technical and focused corpora. Therefore, in this article we present a new corpus, built from drug package leaflets. We describe its structure and contents, and discuss possible exploration directions.

Cite as

Alberto Simões and Pablo Gamallo. LeMe-PT: A Medical Package Leaflet Corpus for Portuguese. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 10:1-10:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{simoes_et_al:OASIcs.SLATE.2021.10,
  author =	{Sim\~{o}es, Alberto and Gamallo, Pablo},
  title =	{{LeMe-PT: A Medical Package Leaflet Corpus for Portuguese}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{10:1--10:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.10},
  URN =		{urn:nbn:de:0030-drops-144277},
  doi =		{10.4230/OASIcs.SLATE.2021.10},
  annote =	{Keywords: drug corpora, information extraction, word embeddings}
}
Document
Towards Automatic Creation of Annotations to Foster Development of Named Entity Recognizers

Authors: Emanuel Matos, Mário Rodrigues, Pedro Miguel, and António Teixeira


Abstract
Named Entity Recognition (NER) is an essential step for many natural language processing tasks, including Information Extraction. Despite recent advances, particularly using deep learning techniques, the creation of accurate named entity recognizers remains a complex task, highly dependent on the availability of annotated data. To foster the existence of NER systems for new domains, it is crucial to obtain the required large volumes of annotated data with little or no manual labor. This paper proposes a system to create the annotated data automatically, by resorting to a set of existing NERs and information sources (DBpedia). The approach was tested with documents from the Tourism domain. Distinct methods were applied for deciding the final named entities and respective tags. The results show that this approach can increase the confidence in annotations and/or augment the number of categories that can be annotated. This paper also presents examples of new NERs that can be rapidly created with the obtained annotated data. The annotated data, combined with the possibility to apply both the ensemble of NER systems and the new Gazetteer-based NERs to large corpora, create the necessary conditions to explore the recent neural deep learning state-of-the-art approaches to NER (e.g., BERT) in domains with scarce or nonexistent data for training.

Cite as

Emanuel Matos, Mário Rodrigues, Pedro Miguel, and António Teixeira. Towards Automatic Creation of Annotations to Foster Development of Named Entity Recognizers. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 11:1-11:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


@InProceedings{matos_et_al:OASIcs.SLATE.2021.11,
  author =	{Matos, Emanuel and Rodrigues, M\'{a}rio and Miguel, Pedro and Teixeira, Ant\'{o}nio},
  title =	{{Towards Automatic Creation of Annotations to Foster Development of Named Entity Recognizers}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{11:1--11:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.11},
  URN =		{urn:nbn:de:0030-drops-144286},
  doi =		{10.4230/OASIcs.SLATE.2021.11},
  annote =	{Keywords: Named Entity Recognition (NER), Automatic Annotation, Gazetteers, Tourism, Portuguese}
}
Document
Semantic Search of Mobile Applications Using Word Embeddings

Authors: João Coelho, António Neto, Miguel Tavares, Carlos Coutinho, Ricardo Ribeiro, and Fernando Batista


Abstract
This paper proposes a set of approaches for the semantic search of mobile applications, based on their name and on the unstructured textual information contained in their description. The proposed approaches make use of word-level, character-level, and contextual word-embeddings that have been trained or fine-tuned using a dataset of about 500 thousand mobile apps, collected in the scope of this work. The proposed approaches have been evaluated using a public dataset that includes information about 43 thousand applications, and 56 manually annotated non-exact queries. Our results show that both character-level embeddings trained on our data, and fine-tuned RoBERTa models surpass the performance of the other existing retrieval strategies reported in the literature.

Cite as

João Coelho, António Neto, Miguel Tavares, Carlos Coutinho, Ricardo Ribeiro, and Fernando Batista. Semantic Search of Mobile Applications Using Word Embeddings. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 12:1-12:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)



@InProceedings{coelho_et_al:OASIcs.SLATE.2021.12,
  author =	{Coelho, Jo\~{a}o and Neto, Ant\'{o}nio and Tavares, Miguel and Coutinho, Carlos and Ribeiro, Ricardo and Batista, Fernando},
  title =	{{Semantic Search of Mobile Applications Using Word Embeddings}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{12:1--12:12},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.12},
  URN =		{urn:nbn:de:0030-drops-144292},
  doi =		{10.4230/OASIcs.SLATE.2021.12},
  annote =	{Keywords: Semantic Search, Word Embeddings, Elasticsearch, Mobile Applications}
}
Document
Command Similarity Measurement Using NLP

Authors: Zafar Hussain, Jukka K. Nurminen, Tommi Mikkonen, and Marcin Kowiel


Abstract
Process invocations happen with almost every activity on a computer. To distinguish user input and potentially malicious activities, we need to better understand program invocations caused by commands. To achieve this, one must understand commands’ objectives, possible parameters, and valid syntax. In this work, we collected commands’ data by scraping commands’ manual pages, including command description, syntax, and parameters. Then, we measured command similarity using two of these - description and parameters - based on commands’ natural language documentation. We used Term Frequency-Inverse Document Frequency (TFIDF) of a word to compare the commands, followed by measuring cosine similarity to find the similarity of the commands’ descriptions. For parameters, after measuring TFIDF and cosine similarity, the Hungarian method is applied to solve the assignment of different parameters’ combinations. Finally, commands are clustered based on their similarity scores. The results show that these methods have efficiently clustered the commands in smaller groups (commands with aliases or close counterparts), and in a bigger group (commands belonging to a larger set of related commands, e.g., bitsadmin for Windows and systemd for Linux). To validate the clustering results, we applied topic modeling on the commands’ data, which confirms that 84% of the Windows commands and 98% of the Linux commands are clustered correctly.
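
The description-similarity step summarized in this abstract (TFIDF weighting followed by cosine similarity) can be illustrated with a minimal sketch. It is not the authors' implementation: the three command descriptions are made-up placeholders, the whitespace tokenizer and the smoothed IDF formula are assumptions, and the Hungarian assignment step for parameters is not shown.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Weight = relative term frequency * smoothed IDF. The smoothing
    formula (log((1 + N) / (1 + df)) + 1) is an assumption of this sketch.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical descriptions, invented for illustration only.
descriptions = {
    "dir": "displays a list of files and subdirectories in a directory",
    "ls": "list information about the files in the current directory",
    "ping": "send echo request packets to a network host",
}

names = list(descriptions)
vectors = dict(zip(names, tfidf_vectors(list(descriptions.values()))))

# Pairwise similarity scores that a clustering step could consume.
similarity = {
    (a, b): cosine(vectors[a], vectors[b])
    for a in names for b in names if a < b
}
```

With these placeholder inputs, the pair `dir`/`ls` scores higher than `dir`/`ping`, which is the kind of signal a subsequent clustering step would group on.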

Cite as

Zafar Hussain, Jukka K. Nurminen, Tommi Mikkonen, and Marcin Kowiel. Command Similarity Measurement Using NLP. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 13:1-13:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)



@InProceedings{hussain_et_al:OASIcs.SLATE.2021.13,
  author =	{Hussain, Zafar and Nurminen, Jukka K. and Mikkonen, Tommi and Kowiel, Marcin},
  title =	{{Command Similarity Measurement Using NLP}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{13:1--13:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.13},
  URN =		{urn:nbn:de:0030-drops-144305},
  doi =		{10.4230/OASIcs.SLATE.2021.13},
  annote =	{Keywords: Natural Language Processing, NLP, Windows Commands, Linux Commands, Textual Similarity, Command Term Frequency, Inverse Document Frequency, TFIDF, Cosine Similarity, Linear Sum Assignment, Command Clustering}
}
Document
Using Machine Learning for Vulnerability Detection and Classification

Authors: Tiago Baptista, Nuno Oliveira, and Pedro Rangel Henriques


Abstract
The work described in this paper aims at developing a machine learning based tool for automatic identification of vulnerabilities in programs (source, high-level code) that uses an abstract syntax tree representation. It is based on FastScan, using the code2seq approach. FastScan is a recently developed system capable of detecting vulnerabilities in source code using machine learning techniques. Nevertheless, FastScan is not able to identify the vulnerability type. In the presented work the main goal is to go further and develop a method to identify specific types of vulnerabilities. As will be shown, the goal will be achieved by optimizing the model’s hyperparameters, changing the method of preprocessing the input data and developing an architecture that brings together multiple models to predict different specific vulnerabilities. The preliminary results obtained from the training stage are very promising. The best F1 metric obtained is 93%, resulting in a precision of 90% and accuracy of 85%, according to the performed tests and regarding a trained model to predict vulnerabilities of the injection type.

Cite as

Tiago Baptista, Nuno Oliveira, and Pedro Rangel Henriques. Using Machine Learning for Vulnerability Detection and Classification. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 14:1-14:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)



@InProceedings{baptista_et_al:OASIcs.SLATE.2021.14,
  author =	{Baptista, Tiago and Oliveira, Nuno and Henriques, Pedro Rangel},
  title =	{{Using Machine Learning for Vulnerability Detection and Classification}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{14:1--14:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.14},
  URN =		{urn:nbn:de:0030-drops-144315},
  doi =		{10.4230/OASIcs.SLATE.2021.14},
  annote =	{Keywords: Vulnerability Detection, Source Code Analysis, Machine Learning}
}
Document
NetLangEd, A Web Editor to Support Online Comment Annotation

Authors: Rui Rodrigues, Cristiana Araújo, and Pedro Rangel Henriques


Abstract
This paper focuses on the scientific areas of Digital Humanities, Social Networks and Inappropriate Social Discourse. The main objective of this research project is the development of an editor that allows researchers in the human and social sciences or psychologists to add their reflections or ideas arising from reading and analyzing posts and comments of an online corpus. In the present context, the editor is being integrated with the analysis tools available in the NetLang platform. NetLangEd, in addition to allowing the three basic operations of adding, editing and removing annotations, will also offer mechanisms to manage, organize, view and locate annotations, all of which will be performed in an easy, fast and user-friendly way.

Cite as

Rui Rodrigues, Cristiana Araújo, and Pedro Rangel Henriques. NetLangEd, A Web Editor to Support Online Comment Annotation. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 15:1-15:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)



@InProceedings{rodrigues_et_al:OASIcs.SLATE.2021.15,
  author =	{Rodrigues, Rui and Ara\'{u}jo, Cristiana and Henriques, Pedro Rangel},
  title =	{{NetLangEd, A Web Editor to Support Online Comment Annotation}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{15:1--15:16},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.15},
  URN =		{urn:nbn:de:0030-drops-144325},
  doi =		{10.4230/OASIcs.SLATE.2021.15},
  annote =	{Keywords: Online Annotation tool, Document Markup System, Text Editor, Discourse Analysis}
}
Document
Intelligent Query Answering with Contextual Knowledge for Relational Databases

Authors: Dietmar Seipel, Daniel Weidner, and Salvador Abreu


Abstract
We are proposing a keyword-based query interface for knowledge bases - including relational or deductive databases - based on contextual background knowledge such as suitable join conditions or synonyms. Join conditions could be extracted from existing referential integrity (foreign key) constraints of the database schema. They could also be learned from other, previous database queries, if the database schema does not contain foreign key constraints. Given a textual representation - a word list - of a query to a relational database, one may parse the list into a structured term. The intelligent and cooperative part of our approach is to hypothesize the semantics of the word list and to find suitable links between the concepts mentioned in the query using contextual knowledge, more precisely join conditions between the database tables. We use a knowledge-based parser based on an extension of Definite Clause Grammars (Dcg) that are interwoven with calls to the database schema to suitably annotate the tokens as table names, table attributes, attribute values or relationships linking tables. Our tool DdQl yields the possible queries in a special domain-specific rule language that extends Datalog, from which the user can choose one.

Cite as

Dietmar Seipel, Daniel Weidner, and Salvador Abreu. Intelligent Query Answering with Contextual Knowledge for Relational Databases. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 16:1-16:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)



@InProceedings{seipel_et_al:OASIcs.SLATE.2021.16,
  author =	{Seipel, Dietmar and Weidner, Daniel and Abreu, Salvador},
  title =	{{Intelligent Query Answering with Contextual Knowledge for Relational Databases}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{16:1--16:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.16},
  URN =		{urn:nbn:de:0030-drops-144330},
  doi =		{10.4230/OASIcs.SLATE.2021.16},
  annote =	{Keywords: Knowledge Bases, Natural Language Interface, Logic Programming, Definite Clause Grammars, Referential Integrity Constraints}
}
Document
Sentiment Analysis of Portuguese Economic News

Authors: Cátia Tavares, Ricardo Ribeiro, and Fernando Batista


Abstract
This paper proposes a rule-based method for automatic polarity detection over economic news texts, which proved suitable for detecting the sentiment in Portuguese economic news. The data used in our experiments consists of 400 manually annotated sentences extracted from economic news, used for evaluation, and about 90 thousand Portuguese economic news articles, extracted from two well-known Portuguese newspapers and covering the period from 2010 to 2020, that have been used for training our systems. In order to perform sentiment analysis of economic news, we have also tested the adaptation of existing pre-trained modules, and performed experiments with a set of Machine Learning approaches and self-training. Experimental results show that our rule-based approach, which uses manually written rules related to the economic context, achieves the best results for automatically detecting the polarity of economic news, largely surpassing the other approaches.

Cite as

Cátia Tavares, Ricardo Ribeiro, and Fernando Batista. Sentiment Analysis of Portuguese Economic News. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 17:1-17:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)



@InProceedings{tavares_et_al:OASIcs.SLATE.2021.17,
  author =	{Tavares, C\'{a}tia and Ribeiro, Ricardo and Batista, Fernando},
  title =	{{Sentiment Analysis of Portuguese Economic News}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{17:1--17:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.17},
  URN =		{urn:nbn:de:0030-drops-144347},
  doi =		{10.4230/OASIcs.SLATE.2021.17},
  annote =	{Keywords: Sentiment Analysis, Economic News, Portuguese Language}
}
Document
Short Paper
Bootstrapping a Data-Set and Model for Question-Answering in Portuguese (Short Paper)

Authors: Nuno Ramos Carvalho, Alberto Simões, and José João Almeida


Abstract
Question answering systems are mainly concerned with fulfilling an information query written in natural language, given a collection of documents with relevant information. They are key elements in many popular application systems such as personal assistants, chat-bots, or even FAQ-based online support systems. This paper describes an exploratory work carried out to come up with a state-of-the-art model for question-answering tasks, for the Portuguese language, based on deep neural networks. We also describe the automatic construction of a data-set for training and testing the model. The final model is not trained on any specific topic or context, and is able to handle generic documents, achieving 50% accuracy in the testing data-set. While the results are not exceptional, this work can support further development in the area, as both the data-set and model are publicly available.

Cite as

Nuno Ramos Carvalho, Alberto Simões, and José João Almeida. Bootstrapping a Data-Set and Model for Question-Answering in Portuguese (Short Paper). In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 18:1-18:5, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)



@InProceedings{carvalho_et_al:OASIcs.SLATE.2021.18,
  author =	{Carvalho, Nuno Ramos and Sim\~{o}es, Alberto and Almeida, Jos\'{e} Jo\~{a}o},
  title =	{{Bootstrapping a Data-Set and Model for Question-Answering in Portuguese}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{18:1--18:5},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.18},
  URN =		{urn:nbn:de:0030-drops-144355},
  doi =		{10.4230/OASIcs.SLATE.2021.18},
  annote =	{Keywords: Portuguese language, question answering, deep learning}
}
Document
Development of Self-Diagnosis Tests System Using a DSL for Creating New Test Suites for Integration in a Cyber-Physical System

Authors: Ricardo B. Pereira, José C. Ramalho, and Miguel A. Brito


Abstract
Testing Cyber-physical systems (CPS) requires highly qualified engineers to design the tests, since their computational part is programmed in low-level languages. This work arises from the need to find a solution that addresses this problem and allows abstracting the current methods so that the tests can be created and executed more efficiently. We intend to do this by creating a self-diagnosis tests system that allows us to automate some of the current processes in the creation and execution of test suites. The work presented here addresses the problem by creating a new self-diagnosis tests system that will guarantee the reliability and integrity of the CPS. In detail, this paper begins by presenting a study on the current state of the art of test automation, Keyword-driven Testing (KDT) methodology and Domain-specific Languages (DSL). A new modular and extensible architecture is proposed for self-diagnosis tests systems based on two main concepts: the creation of a DSL combined with the use of the KDT methodology, as well as a methodology to extend it and integrate it into a CPS. A new self-diagnosis tests system that applies this architecture has been proposed, proving that it is possible to carry out the self-diagnosis of the CPS in real time and allowing the integration of any type of test. To validate the implementation of the system, 28 test cases were carried out to cover all its functionalities. The results show that all test cases passed and, therefore, the system meets all the proposed objectives.

Cite as

Ricardo B. Pereira, José C. Ramalho, and Miguel A. Brito. Development of Self-Diagnosis Tests System Using a DSL for Creating New Test Suites for Integration in a Cyber-Physical System. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 19:1-19:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)



@InProceedings{pereira_et_al:OASIcs.SLATE.2021.19,
  author =	{Pereira, Ricardo B. and Ramalho, Jos\'{e} C. and Brito, Miguel A.},
  title =	{{Development of Self-Diagnosis Tests System Using a DSL for Creating New Test Suites for Integration in a Cyber-Physical System}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{19:1--19:16},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.19},
  URN =		{urn:nbn:de:0030-drops-144367},
  doi =		{10.4230/OASIcs.SLATE.2021.19},
  annote =	{Keywords: Web Application, DSL, Self-diagnosis, Test automation, Cyber-physical systems}
}
