OASIcs, Volume 113

12th Symposium on Languages, Applications and Technologies (SLATE 2023)

SLATE 2023, June 26-28, 2023, Vila do Conde, Portugal


Alberto Simões
  • 2Ai, School of Technology, Polytechnic Institute of Cávado and Ave (IPCA), Barcelos, Portugal
Mario Marcelo Berón
  • Departamento de Informática, Facultad de Ciencias Física Matemáticas y Naturales (FCFMyN), Universidad Nacional de San Luis, Argentina
Filipe Portela
  • Centro Algoritmi, Escola de Engenharia, Universidade do Minho, Guimarães, Portugal

Publication Details

  • published at: 2023-08-15
  • Publisher: Schloss Dagstuhl – Leibniz-Zentrum für Informatik
  • ISBN: 978-3-95977-291-4
  • DBLP: db/conf/slate/slate2023

Complete Volume
OASIcs, Volume 113, SLATE 2023, Complete Volume

Authors: Alberto Simões, Mario Marcelo Berón, and Filipe Portela

Cite as

12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 1-206, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  title =	{{OASIcs, Volume 113, SLATE 2023, Complete Volume}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{1--206},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023},
  URN =		{urn:nbn:de:0030-drops-185130},
  doi =		{10.4230/OASIcs.SLATE.2023},
  annote =	{Keywords: OASIcs, Volume 113, SLATE 2023, Complete Volume}
Front Matter
Front Matter, Table of Contents, Preface, Conference Organization

Authors: Alberto Simões, Mario Marcelo Berón, and Filipe Portela

Cite as

12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 0:i-0:xii, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  title =	{{Front Matter, Table of Contents, Preface, Conference Organization}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{0:i--0:xii},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.0},
  URN =		{urn:nbn:de:0030-drops-185141},
  doi =		{10.4230/OASIcs.SLATE.2023.0},
  annote =	{Keywords: Front Matter, Table of Contents, Preface, Conference Organization}
Question Answering over Linked Data with GPT-3

Authors: Bruno Faria, Dylan Perdigão, and Hugo Gonçalo Oliveira

This paper explores GPT-3 for answering natural language questions over Linked Data. Different engines of the model and different approaches are adopted for answering questions in the QALD-9 dataset, namely zero-shot and few-shot SPARQL generation, as well as fine-tuning on the training portion of the dataset. Answers retrieved by the generated queries and answers generated directly by the model are also compared. Overall results are generally poor, but several insights are provided on using GPT-3 for the proposed task.

Cite as

Bruno Faria, Dylan Perdigão, and Hugo Gonçalo Oliveira. Question Answering over Linked Data with GPT-3. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 1:1-1:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Faria, Bruno and Perdig\~{a}o, Dylan and Gon\c{c}alo Oliveira, Hugo},
  title =	{{Question Answering over Linked Data with GPT-3}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{1:1--1:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.1},
  URN =		{urn:nbn:de:0030-drops-185155},
  doi =		{10.4230/OASIcs.SLATE.2023.1},
  annote =	{Keywords: SPARQL Generation, Prompt Engineering, Few-Shot Learning, Question Answering, GPT-3}
A Framework for Fostering Easier Access to Enriched Textual Information

Authors: Gabriel Silva, Mário Rodrigues, António Teixeira, and Marlene Amorim

Considering the amount of information in unstructured data, suitable methods are needed to extract information from it. Most of these methods have their own output format, making it difficult and costly to merge and share this information, as there is currently no unified way of representing it. While most of these methods rely on JSON or XML, there has been a push to serialize their output into RDF-compliant formats due to their flexibility and the existing ecosystem surrounding them. In this paper we introduce a framework whose goal is to provide a serialization of enriched data into an RDF format, following FAIR principles, making it more interpretable, interoperable and shareable. We process a subset of the WikiNER dataset and showcase two examples of using this framework: one using CoNLL annotations and the other performing entity linking on an already existing graph. The result is a graph in which every connection starts from the document and ends at the tokens, keeping the original text intact while embedding the enriched data into it, in this case the CoNLL annotations and entities.
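
As a rough illustration of the document-to-token graph described in this abstract, the following Python sketch emits plain subject-predicate-object tuples and then rebuilds the text from them. The `ex:` predicate names, the identifier scheme, and the whitespace-joined reconstruction are assumptions made for this example; they are not the vocabulary or the implementation used by the authors.

```python
def document_to_triples(doc_id, tokens_with_tags):
    """Build (subject, predicate, object) tuples linking a document to its tokens.

    tokens_with_tags is a list of (token_text, ner_tag) pairs in reading order.
    All "ex:" predicates are hypothetical placeholders.
    """
    triples = [(doc_id, "rdf:type", "ex:Document")]
    for index, (text, tag) in enumerate(tokens_with_tags):
        token_id = f"{doc_id}/token/{index}"
        triples.append((doc_id, "ex:hasToken", token_id))  # document -> token edge
        triples.append((token_id, "ex:index", index))      # position keeps the order
        triples.append((token_id, "ex:text", text))        # original surface form
        triples.append((token_id, "ex:nerTag", tag))       # enrichment (CoNLL-style tag)
    return triples


def reconstruct_text(triples):
    """Recover the token sequence from the graph, joined by single spaces."""
    index_of = {s: o for s, p, o in triples if p == "ex:index"}
    text_of = {s: o for s, p, o in triples if p == "ex:text"}
    ordered = sorted(index_of, key=index_of.get)
    return " ".join(text_of[token_id] for token_id in ordered)


sample = [("Lisbon", "B-LOC"), ("is", "O"), ("sunny", "O")]
graph = document_to_triples("ex:doc1", sample)
recovered = reconstruct_text(graph)
```

Storing the token index as its own triple is what lets the original order be recovered from an otherwise unordered set of statements.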

Cite as

Gabriel Silva, Mário Rodrigues, António Teixeira, and Marlene Amorim. A Framework for Fostering Easier Access to Enriched Textual Information. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 2:1-2:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Silva, Gabriel and Rodrigues, M\'{a}rio and Teixeira, Ant\'{o}nio and Amorim, Marlene},
  title =	{{A Framework for Fostering Easier Access to Enriched Textual Information}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{2:1--2:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.2},
  URN =		{urn:nbn:de:0030-drops-185165},
  doi =		{10.4230/OASIcs.SLATE.2023.2},
  annote =	{Keywords: Knowledge graphs, Enriched data, Natural language processing, Triplestore}
A Pseudonymization Prototype for Hungarian

Authors: Attila Novák and Borbála Novák

In this paper, we present a pseudonymization prototype for Hungarian, an agglutinating language with complex morphology, implemented as a web service. The service provides the following functions: entity identification and extraction; automatic generation and selection of replacement candidates; automatic and consistent replacement and reinflection of entities in the final pseudonymized document. The named entity recognition model applied handles names of persons well, and it has decent performance on other entity types as well. However, ID-like entities need to be handled separately to achieve proper performance (they are not handled in the current prototype version). For automatic replacement candidate generation, a simple entity embedding model is used. We discuss the performance and limitations of the prototype in detail.

Cite as

Attila Novák and Borbála Novák. A Pseudonymization Prototype for Hungarian. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 3:1-3:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Nov\'{a}k, Attila and Nov\'{a}k, Borb\'{a}la},
  title =	{{A Pseudonymization Prototype for Hungarian}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{3:1--3:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.3},
  URN =		{urn:nbn:de:0030-drops-185177},
  doi =		{10.4230/OASIcs.SLATE.2023.3},
  annote =	{Keywords: named entity recognition, morphological reinflection, pseudonymization, entity embedding model}
Generating and Ranking Distractors for Multiple-Choice Questions in Portuguese

Authors: Hugo Gonçalo Oliveira, Igor Caetano, Renato Matos, and Hugo Amaro

In the process of multiple-choice question generation, different methods are often considered for distractor acquisition, as an attempt to cover as many questions as possible. Some, however, result in many candidate distractors of variable quality, while only three or four are necessary. We implement some distractor generation methods for Portuguese and propose their combination and ranking with language models. Experimentation results confirm that this increases both coverage and suitability of the selected distractors.

Cite as

Hugo Gonçalo Oliveira, Igor Caetano, Renato Matos, and Hugo Amaro. Generating and Ranking Distractors for Multiple-Choice Questions in Portuguese. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 4:1-4:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Gon\c{c}alo Oliveira, Hugo and Caetano, Igor and Matos, Renato and Amaro, Hugo},
  title =	{{Generating and Ranking Distractors for Multiple-Choice Questions in Portuguese}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{4:1--4:9},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.4},
  URN =		{urn:nbn:de:0030-drops-185185},
  doi =		{10.4230/OASIcs.SLATE.2023.4},
  annote =	{Keywords: Multiple-Choice Questions, Distractor Generation, Language Models}
Web of Science Citation Gaps: An Automatic Approach to Detect Indexed but Missing Citations

Authors: David Rodrigues, António L. Lopes, and Fernando Batista

The number of citations a research paper receives is a crucial metric for both researchers and institutions. However, since citation databases have their own source lists, finding all the citations of a given paper can be a challenge. As a result, there may be missing citations that are not counted towards a paper’s total citation count. To address this issue, we present an automated approach to find missing citations that leverages multiple indexing databases. In this research, Web of Science (WoS) serves as a case study and OpenAlex is used as a reference point for comparison. For a given paper, we identify all citing papers found in both research databases. Then, for each citing paper, we check whether it is indexed in WoS but not listed in WoS as a citing paper, in order to determine whether it is a missing citation. In our experiments, from a set of 1539 papers indexed by WoS, we found 696 missing citations. This outcome proves the success of our approach, and reveals that WoS does not always consider the full list of citing papers of a given publication, even when these citing papers are indexed by WoS. We also found that WoS has a higher chance of missing information for more recent publications. These findings provide relevant insights about this research database, and provide enough motivation for considering other research databases in our study, such as Scopus and Google Scholar, in order to improve the matching and querying algorithms, and to reduce false positives, towards providing a more comprehensive and accurate view of the citations of a paper.
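
The check described in this abstract can be read as a small set computation. The sketch below is only an illustration under stated assumptions: papers are identified by DOI strings, the three input collections are already available in memory, and the function name `find_missing_citations` is hypothetical. It does not query WoS or OpenAlex and says nothing about how the authors match records.

```python
def find_missing_citations(wos_citing, openalex_citing, wos_indexed):
    """Return citing papers that WoS indexes but does not list as citing.

    wos_citing:      identifiers WoS reports as citing the target paper
    openalex_citing: identifiers OpenAlex reports as citing the target paper
    wos_indexed:     identifiers known to be indexed in WoS (assumed lookup result)
    """
    wos_citing = set(wos_citing)
    wos_indexed = set(wos_indexed)
    # Candidate citing papers come from both databases.
    candidates = wos_citing | set(openalex_citing)
    # A missing citation is indexed in WoS yet absent from its citing list.
    return {
        paper for paper in candidates
        if paper in wos_indexed and paper not in wos_citing
    }


missing = find_missing_citations(
    wos_citing={"10.1000/a"},
    openalex_citing={"10.1000/a", "10.1000/b", "10.1000/c"},
    wos_indexed={"10.1000/a", "10.1000/b"},
)
```

In this toy input, `10.1000/c` is not reported because it is not indexed in WoS at all, so its absence from the WoS citing list is expected rather than a gap.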

Cite as

David Rodrigues, António L. Lopes, and Fernando Batista. Web of Science Citation Gaps: An Automatic Approach to Detect Indexed but Missing Citations. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 5:1-5:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Rodrigues, David and Lopes, Ant\'{o}nio L. and Batista, Fernando},
  title =	{{Web of Science Citation Gaps: An Automatic Approach to Detect Indexed but Missing Citations}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{5:1--5:11},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.5},
  URN =		{urn:nbn:de:0030-drops-185199},
  doi =		{10.4230/OASIcs.SLATE.2023.5},
  annote =	{Keywords: Research Databases, Citations, Citation Databases, Web of Science, OpenAlex}
Querying Relational Databases with Speech-Recognition Driven by Contextual Knowledge

Authors: Dietmar Seipel, Benjamin Förster, Magnus Liebl, Marcel Waleska, and Salvador Abreu

We are extending DdQl, a keyword-based query interface for relational databases that relies on contextual background knowledge such as suitable join conditions, and which was proposed in [Dietmar Seipel, 2021]. In the previous paper, join conditions were extracted from existing referential integrity (foreign key) constraints of the database schema, or they could be learned from other, previous database queries. In this paper, we describe a speech-to-text component, based on the system Whisper, for entering the query keywords. Keywords that Whisper has recognized wrongly can be corrected to similar-sounding words. Again, the context of the database schema can help here. For users with limited knowledge of the schema and the contents of the database, the approach of DdQl can help to provide useful suggestions for query implementations in Sql or Datalog, from which the user can choose one. Our tool DdQl can be run in a Docker image; it yields the possible queries in Sql and in a special domain-specific rule language that extends Datalog. The Datalog variant allows for additional user-defined aggregation functions which are not possible in Sql.
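
To make the idea of correcting a misrecognized keyword against the database schema concrete, here is a minimal Python sketch. It swaps in plain string similarity from the standard library's `difflib` in place of a phonetic comparison, so it is not the correction method used in DdQl; the function name `correct_keyword`, the sample schema terms, and the cutoff value are assumptions made for illustration.

```python
import difflib


def correct_keyword(keyword, schema_terms, cutoff=0.6):
    """Map a transcribed keyword to the closest schema term, if any is close enough.

    Uses character-level similarity (difflib), not phonetic similarity.
    Returns the keyword unchanged when no schema term reaches the cutoff.
    """
    # Compare case-insensitively but return the term as spelled in the schema.
    by_lowercase = {term.lower(): term for term in schema_terms}
    matches = difflib.get_close_matches(
        keyword.lower(), list(by_lowercase), n=1, cutoff=cutoff
    )
    return by_lowercase[matches[0]] if matches else keyword


schema = ["Employee", "Department", "Salary"]
fixed = correct_keyword("employe", schema)
unchanged = correct_keyword("qqq", schema)
```

Restricting the candidates to table and column names is what lets the schema context narrow down the correction; a phonetic encoding could replace the `difflib` call without changing the surrounding structure.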

Cite as

Dietmar Seipel, Benjamin Förster, Magnus Liebl, Marcel Waleska, and Salvador Abreu. Querying Relational Databases with Speech-Recognition Driven by Contextual Knowledge. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 6:1-6:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Seipel, Dietmar and F\"{o}rster, Benjamin and Liebl, Magnus and Waleska, Marcel and Abreu, Salvador},
  title =	{{Querying Relational Databases with Speech-Recognition Driven by Contextual Knowledge}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{6:1--6:15},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.6},
  URN =		{urn:nbn:de:0030-drops-185202},
  doi =		{10.4230/OASIcs.SLATE.2023.6},
  annote =	{Keywords: Knowledge Bases, Natural Language Interface, Logic Programming, Definite Clause Grammars, Referential Integrity Constraints, Speech-to-Text}
Automatic Speech Recognition of Non-Native Child Speech for Language Learning Applications (Short Paper)

Authors: Simone Wills, Yu Bai, Cristian Tejedor-García, Catia Cucchiarini, and Helmer Strik

Voicebots have provided a new avenue for supporting the development of language skills, particularly within the context of second language learning. Voicebots, though, have largely been geared towards native adult speakers. We sought to assess the performance of two state-of-the-art ASR systems, Wav2Vec2.0 and Whisper AI, with a view to developing a voicebot that can support children acquiring a foreign language. We evaluated their performance on read and extemporaneous speech of native and non-native Dutch children. We also investigated the utility of using ASR technology to provide insight into the children’s pronunciation and fluency. The results show that recent, pre-trained ASR transformer-based models achieve acceptable performance from which detailed feedback on phoneme pronunciation quality can be extracted, despite the challenging nature of child and non-native speech.

Cite as

Simone Wills, Yu Bai, Cristian Tejedor-García, Catia Cucchiarini, and Helmer Strik. Automatic Speech Recognition of Non-Native Child Speech for Language Learning Applications (Short Paper). In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 7:1-7:8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Wills, Simone and Bai, Yu and Tejedor-Garc{\'\i}a, Cristian and Cucchiarini, Catia and Strik, Helmer},
  title =	{{Automatic Speech Recognition of Non-Native Child Speech for Language Learning Applications}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{7:1--7:8},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.7},
  URN =		{urn:nbn:de:0030-drops-185218},
  doi =		{10.4230/OASIcs.SLATE.2023.7},
  annote =	{Keywords: Automatic Speech Recognition, ASR, Child Speech, Non-Native Speech, Human-computer Interaction, Whisper, Wav2Vec2.0}
OCRticle - a Structure-Aware OCR Application

Authors: Sofia G. Rodrigues dos Santos and J. João Dias de Almeida

While there are currently many applications and websites capable of performing Optical Character Recognition (OCR), none of the widely available options offer structured OCR, i.e., OCR that maintains the text’s original structure. For example, if a document has a title, after performing OCR on it, the title should have a different formatting, in order to distinguish it from the rest of the text. This paper covers the topic of structure-aware OCR, first by describing the current state of OCR tools, then by showcasing a prototype tool capable of retaining the structure of articles scanned from an image.

Cite as

Sofia G. Rodrigues dos Santos and J. João Dias de Almeida. OCRticle - a Structure-Aware OCR Application. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 8:1-8:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Rodrigues dos Santos, Sofia G. and Dias de Almeida, J. Jo\~{a}o},
  title =	{{OCRticle - a Structure-Aware OCR Application}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{8:1--8:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.8},
  URN =		{urn:nbn:de:0030-drops-185220},
  doi =		{10.4230/OASIcs.SLATE.2023.8},
  annote =	{Keywords: OCR, Optical Character Recognition, Data Structure, Data Parsing, Document Structure}
Narrative Extraction from Semantic Graphs (Short Paper)

Authors: Daniil Lystopadskyi, André Santos, and José Paulo Leal

This paper proposes an interactive approach for narrative extraction from semantic graphs. The proposed approach extracts events from RDF triples, maps them to their corresponding attributes, and assembles them into a chronological sequence to form narrative graphs. The approach is evaluated on the Wikidata graph and achieves promising results in terms of narrative quality and coherence. The paper also discusses several avenues for future work, including the integration of machine learning, graph embedding methods and the exploration of advanced techniques for attention-based narrative labeling and semantic role labeling. Overall, the proposed method offers a promising approach to narrative extraction from semantic graphs and has the potential to be useful in various applications, including chatbots, conversational agents, and content creation tools.
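
The pipeline named in this abstract (extract events from triples, attach their attributes, order them chronologically) can be sketched in a few lines of Python. This is an assumption-laden illustration only: triples are plain tuples, the predicate names `label` and `pointInTime` are placeholders, dates are ISO 8601 strings so that string order equals chronological order, and a repeated predicate on the same event simply overwrites the earlier value.

```python
from collections import defaultdict


def extract_narrative(triples, time_predicate="pointInTime"):
    """Group triples by subject and return dated events in chronological order.

    Each event is returned as (event_id, attributes). Subjects without a
    time_predicate value are left out of the narrative.
    """
    events = defaultdict(dict)
    for subject, predicate, obj in triples:
        events[subject][predicate] = obj  # map each event to its attributes

    dated = [
        (attrs[time_predicate], event_id, attrs)
        for event_id, attrs in events.items()
        if time_predicate in attrs
    ]
    # ISO 8601 date strings sort chronologically; the id breaks ties.
    dated.sort(key=lambda item: (item[0], item[1]))
    return [(event_id, attrs) for _, event_id, attrs in dated]


triples = [
    ("e1", "label", "Moon landing"),
    ("e1", "pointInTime", "1969-07-20"),
    ("e2", "label", "First human in space"),
    ("e2", "pointInTime", "1961-04-12"),
    ("e3", "label", "Undated event"),
]
narrative = extract_narrative(triples)
```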

Cite as

Daniil Lystopadskyi, André Santos, and José Paulo Leal. Narrative Extraction from Semantic Graphs (Short Paper). In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 9:1-9:8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Lystopadskyi, Daniil and Santos, Andr\'{e} and Leal, Jos\'{e} Paulo},
  title =	{{Narrative Extraction from Semantic Graphs}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{9:1--9:8},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.9},
  URN =		{urn:nbn:de:0030-drops-185231},
  doi =		{10.4230/OASIcs.SLATE.2023.9},
  annote =	{Keywords: Narratives, Narrative Extraction, Information Retrieval, Knowledge Graphs, Semantic Graphs, Resource Description Framework, Web Ontology}
Large Language Models: Compilers for the 4th Generation of Programming Languages? (Short Paper)

Authors: Francisco S. Marcondes, José João Almeida, and Paulo Novais

This paper explores the possibility of large language models as a fourth generation programming language compiler. This is based on the idea that large language models are able to translate a natural language specification into a program written in a particular programming language. In other words, just as high-level languages provided an additional language abstraction to assembly code, large language models can provide an additional language abstraction to high-level languages. This interpretation allows large language models to be thought of through the lens of compiler theory, leading to insightful conclusions.

Cite as

Francisco S. Marcondes, José João Almeida, and Paulo Novais. Large Language Models: Compilers for the 4th Generation of Programming Languages? (Short Paper). In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 10:1-10:8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{S. Marcondes, Francisco and Almeida, Jos\'{e} Jo\~{a}o and Novais, Paulo},
  title =	{{Large Language Models: Compilers for the 4\textsuperscript{th} Generation of Programming Languages?}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{10:1--10:8},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.10},
  URN =		{urn:nbn:de:0030-drops-185240},
  doi =		{10.4230/OASIcs.SLATE.2023.10},
  annote =	{Keywords: programming language, compiler, large language model}
Hierarchical Data-Flow Graphs

Authors: José Pereira, Vitor Vieira, and Alberto Simões

Data flows are crucial for detecting the dependencies between statements and expressions in a program. In the context of Static Application Security Testing (SAST), they are heavily used in different aspects, from detecting tainted data to understanding code dependency. In Checkmarx, these data flows are currently computed on the fly, but their efficiency falls short of what is desired, especially when dealing with large projects. With this in mind, a new caching mechanism is being developed, based on hierarchical graphs. In this document, we discuss the basic idea behind this approach, the challenges found and the decisions put in place for the implementation. We will also share the first insights on speed improvements for a proof-of-concept implementation.

Cite as

José Pereira, Vitor Vieira, and Alberto Simões. Hierarchical Data-Flow Graphs. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 11:1-11:9, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Pereira, Jos\'{e} and Vieira, Vitor and Sim\~{o}es, Alberto},
  title =	{{Hierarchical Data-Flow Graphs}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{11:1--11:9},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.11},
  URN =		{urn:nbn:de:0030-drops-185252},
  doi =		{10.4230/OASIcs.SLATE.2023.11},
  annote =	{Keywords: Data Flow, Static Application Security Testing, Hierarchical Graphs}
Type Annotation for SAST

Authors: Marco Pereira, Alberto Simões, and Pedro Rangel Henriques

Static Application Security Testing (SAST) is a type of software security testing that analyzes the source code of an application to identify security vulnerabilities and coding errors. It helps detect security vulnerabilities in software code before deployment, reducing the risk of exploitation by attackers. This document describes the work performed to upgrade Checkmarx’s SAST tool so that vulnerability detection can take expression types into account. For this to be possible, every expression in the Document Object Model needs to have a specific type assigned according to the kind of operation and to the different operand types. At the current stage, this project already supports expression type annotation for three programming languages: C, C++ and C#. This support has been added through a new Resolver Rule in the Resolver stage, allowing for the generalization of languages. We also compare the complexity of writing vulnerability detection queries with or without access to type information.

Cite as

Marco Pereira, Alberto Simões, and Pedro Rangel Henriques. Type Annotation for SAST. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 12:1-12:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Pereira, Marco and Sim\~{o}es, Alberto and Henriques, Pedro Rangel},
  title =	{{Type Annotation for SAST}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{12:1--12:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.12},
  URN =		{urn:nbn:de:0030-drops-185261},
  doi =		{10.4230/OASIcs.SLATE.2023.12},
  annote =	{Keywords: Static Application Security Testing, Type Annotation, C, C++, C#}
Characterization and Identification of Programming Languages

Authors: Júlio Alves, Alvaro Costa Neto, Maria João Varanda Pereira, and Pedro Rangel Henriques

This paper presents and discusses a research work whose main goal is to identify which characteristics influence the recognition and identification of a programming language by a programmer, specifically when analysing a program's source code and its linguistic style. In other words, the study described here aims at answering the following questions: which grammatical elements - including lexical, syntactic, and semantic details - contribute the most to the characterization of a language? How many structural elements of a language may be modified without losing its identity? The long-term objective of such research is to acquire new insights on the factors that can lead language engineers to design new programming languages that reduce the cognitive load of both learners and programmers. To elaborate on that subject, the paper starts with a brief explanation of programming language fundamentals. Then, a list of the main syntactic characteristics of a set of programming languages, chosen for the study, is presented. Those characteristics result from the analysis we carried out in the first phase of our project. To go deeper into the investigation, we decided to collect and analyze the opinions of other programmers, so the design of a survey to address that task is discussed. The answers obtained from the questionnaire are analysed to present an overall picture of programming language characteristics and their relative influence on language identification from the programmers’ perspective.

Cite as

Júlio Alves, Alvaro Costa Neto, Maria João Varanda Pereira, and Pedro Rangel Henriques. Characterization and Identification of Programming Languages. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 13:1-13:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Alves, J\'{u}lio and Costa Neto, Alvaro and Pereira, Maria Jo\~{a}o Varanda and Henriques, Pedro Rangel},
  title =	{{Characterization and Identification of Programming Languages}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{13:1--13:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.13},
  URN =		{urn:nbn:de:0030-drops-185273},
  doi =		{10.4230/OASIcs.SLATE.2023.13},
  annote =	{Keywords: Programming Languages, Programming Language Characterization, Programming Language Design, Programming Language Identification}
Towards a Universal and Interoperable Scientific Data Model

Authors: João Oliveira, Diogo Gomes, Francisca Santana, Jorge Oliveira e Sá, and Filipe Portela

The growing number of researchers in Portugal has intensified the emergence of several scientific platforms that allow the indexing of publications and the management of scientific profiles. The diversity and high number of platforms cause problems with the cross-referencing and integrity of information, i.e., researchers’ profiles are rarely updated, and their data are not properly grouped and cross-referenced. Hence, the need arises for a more global platform that enables the synchronization of information, free from constraints imposed by existing data. The study and work carried out aim to solve this problem by creating a robust and interoperable platform based on an innovative library merge algorithm. This platform thus includes information regarding publications, researchers and scientific indicators, obtained by crossing and grouping data from several platforms.

Cite as

João Oliveira, Diogo Gomes, Francisca Santana, Jorge Oliveira e Sá, and Filipe Portela. Towards a Universal and Interoperable Scientific Data Model. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 14:1-14:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Oliveira, Jo\~{a}o and Gomes, Diogo and Santana, Francisca and S\'{a}, Jorge Oliveira e and Portela, Filipe},
  title =	{{Towards a Universal and Interoperable Scientific Data Model}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{14:1--14:16},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.14},
  URN =		{urn:nbn:de:0030-drops-185280},
  doi =		{10.4230/OASIcs.SLATE.2023.14},
  annote =	{Keywords: RDProfile, Researchers, Scientific Platforms, Scientific Data}
Short Paper
Integrating Gamified Educational Escape Rooms in Learning Management Systems (Short Paper)

Authors: Ricardo Queirós, Carla Pinto, Mário Cruz, and Daniela Mascarenhas

Escape rooms offer an immersive and engaging learning experience that encourages critical thinking, problem solving and teamwork. Although they have shown promising results in promoting student engagement in the teaching-learning process, they continue to operate as independent systems that are not fully integrated into educational environments. This work aims to detail the integration of educational escape rooms, based on international standards, with the typical central component of an educational setting - the learning management system (LMS). As a proof of concept, we present the integration of a math escape room with the Moodle LMS using the Learning Tools Interoperability (LTI) specification. Currently, this specification comprises a set of Web services that enable seamless integration between learning platforms and external tools and is not limited to any specific LMS, which fosters learning interoperability. With this implementation, a single sign-on ecosystem is created, where teachers and students can interact in a simple and immersive way. The major contribution of this work is to serve as an integration guide for other applications and for different domains.

Cite as

Ricardo Queirós, Carla Pinto, Mário Cruz, and Daniela Mascarenhas. Integrating Gamified Educational Escape Rooms in Learning Management Systems (Short Paper). In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 15:1-15:8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Queir\'{o}s, Ricardo and Pinto, Carla and Cruz, M\'{a}rio and Mascarenhas, Daniela},
  title =	{{Integrating Gamified Educational Escape Rooms in Learning Management Systems}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{15:1--15:8},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.15},
  URN =		{urn:nbn:de:0030-drops-185293},
  doi =		{10.4230/OASIcs.SLATE.2023.15},
  annote =	{Keywords: Escape Rooms, Interoperability, Learning Management Systems, Standardization}
Romaria De Nª Srª D'Agonia: Building a Digital Repository and a Virtual Museum

Authors: Sara Cristina Freitas Queirós, Cristiana Araújo, and Pedro Rangel Henriques

Romarias are Christian pilgrimages held to celebrate a specific saint. Romaria de Nª Srª d'Agonia (RNS Agonia, for short) celebrates Nossa Senhora da Agonia, patron of all fishermen, at Viana do Castelo, Portugal. RNS Agonia is a very old event that surely belongs to Minho’s cultural heritage. There are many written documents, of various types, that describe the event, so their digital preservation is mandatory. However, digital preservation is not restricted to a database of digital images obtained by scanning the documents. In this paper we are concerned with digital repositories of XML-based annotated documents from which we can automatically extract data to build a virtual museum that helps disseminate information about RNS Agonia. Such a Web resource is crucial to support people wishing to know more about that pilgrimage, and also as a booster for tourism. The paper describes the different stages of this project, including the document annotation process, data extraction mechanisms, the creation of a triple store to archive the knowledge built from the sources analyzed, and the virtual museum implementation. The methodological approach devised for the project under discussion is based on the creation of an ontology that describes the RNS Agonia domain completely. The idea is to derive the XML dialect to be used in the annotation from the ontology. Moreover, the ontology will also guide the definition of the triple store used to set up the knowledge base that feeds the museum.

Cite as

Sara Cristina Freitas Queirós, Cristiana Araújo, and Pedro Rangel Henriques. Romaria De Nª Srª D'Agonia: Building a Digital Repository and a Virtual Museum. In 12th Symposium on Languages, Applications and Technologies (SLATE 2023). Open Access Series in Informatics (OASIcs), Volume 113, pp. 16:1-16:16, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

  author =	{Queir\'{o}s, Sara Cristina Freitas and Ara\'{u}jo, Cristiana and Henriques, Pedro Rangel},
  title =	{{Romaria De Nª Srª D'Agonia: Building a Digital Repository and a Virtual Museum}},
  booktitle =	{12th Symposium on Languages, Applications and Technologies (SLATE 2023)},
  pages =	{16:1--16:16},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-291-4},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{113},
  editor =	{Sim\~{o}es, Alberto and Ber\'{o}n, Mario Marcelo and Portela, Filipe},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2023.16},
  URN =		{urn:nbn:de:0030-drops-185306},
  doi =		{10.4230/OASIcs.SLATE.2023.16},
  annote =	{Keywords: Ontology, XML, Romaria, Pilgrimage, Digital Knowledge Repository, Triple Storage Database, Virtual Museum}
