DROPS

Document

DOI: 10.4230/OASIcs.ICPEC.2023.9

NLP/AI Based Techniques for Programming Exercises Generation

Authors: Tiago Carvalho Freitas, Alvaro Costa Neto, Maria João Varanda Pereira, and Pedro Rangel Henriques

Published in: OASIcs, Volume 112, 4th International Computer Programming Education Conference (ICPEC 2023)

Abstract

This paper focuses on the enhancement of computer programming exercises generation to the benefit of both students and teachers. By exploring Natural Language Processing (NLP) and Machine Learning (ML) methods for automatic generation of text and source code, it is possible to semi-automatically construct programming exercises, helping teachers reduce redundant work and more easily apply active learning methodologies. This would not only allow them to still play a leading role in the teaching-learning process, but also provide students with a better and more interactive learning experience. If embedded in a widely accessible website, an exercise generator with these Artificial Intelligence (AI) methods might be used directly by students to obtain randomised lists of exercises for their own study, in their own time. The emergence of new and increasingly powerful technologies, such as the ones utilised by ChatGPT, raises the discussion about their use for exercise generation. Although these technologies are highly capable, their monetary and computational costs are still obstacles to wider adoption, as is the possibility of incorrect results. This paper describes the characteristics and behaviour of several ML models applied and trained for text and code generation and their use to generate computer programming exercises. Finally, an analysis based on the correctness and coherence of the resulting exercise statements and the complementary source code generated is presented, and the role that this type of technology can play in an automatic programming exercise generation system is discussed.

Cite as

Tiago Carvalho Freitas, Alvaro Costa Neto, Maria João Varanda Pereira, and Pedro Rangel Henriques. NLP/AI Based Techniques for Programming Exercises Generation. In 4th International Computer Programming Education Conference (ICPEC 2023). Open Access Series in Informatics (OASIcs), Volume 112, pp. 9:1-9:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

@InProceedings{freitas_et_al:OASIcs.ICPEC.2023.9,
  author =	{Freitas, Tiago Carvalho and Costa Neto, Alvaro and Pereira, Maria Jo\~{a}o Varanda and Henriques, Pedro Rangel},
  title =	{{NLP/AI Based Techniques for Programming Exercises Generation}},
  booktitle =	{4th International Computer Programming Education Conference (ICPEC 2023)},
  pages =	{9:1--9:12},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-290-7},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{112},
  editor =	{Peixoto de Queir\'{o}s, Ricardo Alexandre and Teixeira Pinto, M\'{a}rio Paulo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.ICPEC.2023.9},
  URN =		{urn:nbn:de:0030-drops-185058},
  doi =		{10.4230/OASIcs.ICPEC.2023.9},
  annote =	{Keywords: Natural Language Processing, Computer Programming Education, Exercises Generation, Text Generation, Code Generation}
}

Document

DOI: 10.4230/OASIcs.SLATE.2014.19

Conclave: Writing Programs to Understand Programs

Authors: Nuno Ramos Carvalho, José João Almeida, Maria João Varanda Pereira, and Pedro Rangel Henriques

Published in: OASIcs, Volume 38, 3rd Symposium on Languages, Applications and Technologies (2014)

Abstract

Software maintainers are often challenged with source code changes to improve software systems, or eliminate defects, in unfamiliar programs. To undertake these tasks a sufficient understanding of the system, or at least a small part of it, is required. One of the most time-consuming tasks of this process is locating which parts of the code are responsible for some key functionality or feature. This paper introduces Conclave, an environment for software analysis that enhances program comprehension activities. Programmers use natural languages to describe and discuss the problem domain, programming languages to write source code, and markup languages to have programs talking with other programs, and so this system has to cope with this heterogeneity of dialects, and provide tools in all these areas to effectively contribute to the understanding process. The source code, the problem domain, and the side effects of running the program are represented in the system using ontologies. A combination of tools (specialized in different kinds of languages) creates mappings between the different domains. Conclave provides facilities for feature location, code search, and views of the software that ease the process of understanding the code and devising changes. The underlying feature location technique explores natural language terms used in programs (e.g. function and variable names); using textual analysis and a collection of Natural Language Processing techniques, it computes synonymous sets of terms. These sets are used to score relatedness between program elements and search queries or problem domain concepts, producing ranked lists of program elements that address the search criteria or concepts, respectively.

Cite as

Nuno Ramos Carvalho, José João Almeida, Maria João Varanda Pereira, and Pedro Rangel Henriques. Conclave: Writing Programs to Understand Programs. In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 19-34, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)

@InProceedings{carvalho_et_al:OASIcs.SLATE.2014.19,
  author =	{Carvalho, Nuno Ramos and Almeida, Jos\'{e} Jo\~{a}o and Pereira, Maria Jo\~{a}o Varanda and Henriques, Pedro Rangel},
  title =	{{Conclave: Writing Programs to Understand Programs}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{19--34},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Pereira, Maria Jo\~{a}o Varanda and Leal, Jos\'{e} Paulo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2014.19},
  URN =		{urn:nbn:de:0030-drops-45561},
  doi =		{10.4230/OASIcs.SLATE.2014.19},
  annote =	{Keywords: software maintenance, software evolution, program comprehension, feature location, concept location, natural language processing}
}

Document

DOI: 10.4230/OASIcs.SLATE.2014.185

Detecting a Tweet’s Topic within a Large Number of Portuguese Twitter Trends

Authors: Hugo Rosa, João Paulo Carvalho, and Fernando Batista

Published in: OASIcs, Volume 38, 3rd Symposium on Languages, Applications and Technologies (2014)

Abstract

In this paper we propose to approach the subject of Twitter Topic Detection when in the presence of a large number of trending topics. We use a new technique, called Twitter Topic Fuzzy Fingerprints, and compare it with two popular text classification techniques, Support Vector Machines (SVM) and k-Nearest Neighbours (kNN). Preliminary results show that it outperforms the other two techniques, while still being much faster, which is an essential feature when processing large volumes of streaming data. We focused on a data set of Portuguese language tweets and the respective top trends as indicated by Twitter.

Cite as

Hugo Rosa, João Paulo Carvalho, and Fernando Batista. Detecting a Tweet’s Topic within a Large Number of Portuguese Twitter Trends. In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 185-199, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)

@InProceedings{rosa_et_al:OASIcs.SLATE.2014.185,
  author =	{Rosa, Hugo and Carvalho, Jo\~{a}o Paulo and Batista, Fernando},
  title =	{{Detecting a Tweet’s Topic within a Large Number of Portuguese Twitter Trends}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{185--199},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Pereira, Maria Jo\~{a}o Varanda and Leal, Jos\'{e} Paulo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2014.185},
  URN =		{urn:nbn:de:0030-drops-45696},
  doi =		{10.4230/OASIcs.SLATE.2014.185},
  annote =	{Keywords: topic detection, social networks data mining, Twitter, Portuguese language}
}

Document

DOI: 10.4230/OASIcs.SLATE.2014.275

Expanding a Database of Portuguese Tweets

Authors: Gaspar Brogueira, Fernando Batista, João Paulo Carvalho, and Helena Moniz

Published in: OASIcs, Volume 38, 3rd Symposium on Languages, Applications and Technologies (2014)

Abstract

This paper describes an existing database of geolocated tweets that were produced in Portuguese regions and proposes an approach to further expand it. The existing database covers eight consecutive days of collected tweets, totaling about 300 thousand tweets, produced by about 11 thousand different users. A detailed analysis on the content of the messages suggests a predominance of young authors that use Twitter as a way of reaching their colleagues with their feelings, ideas and comments. In order to further characterize this community of young people, we propose a method for retrieving additional tweets produced by the same set of authors already in the database. Our goal is to further extend the knowledge about each user of this community, making it possible to automatically characterize each user by the content he/she produces, cluster users and open other possibilities in the scope of social analysis.

Cite as

Gaspar Brogueira, Fernando Batista, João Paulo Carvalho, and Helena Moniz. Expanding a Database of Portuguese Tweets. In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 275-282, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)

@InProceedings{brogueira_et_al:OASIcs.SLATE.2014.275,
  author =	{Brogueira, Gaspar and Batista, Fernando and Carvalho, Jo\~{a}o Paulo and Moniz, Helena},
  title =	{{Expanding a Database of Portuguese Tweets}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{275--282},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Pereira, Maria Jo\~{a}o Varanda and Leal, Jos\'{e} Paulo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2014.275},
  URN =		{urn:nbn:de:0030-drops-45763},
  doi =		{10.4230/OASIcs.SLATE.2014.275},
  annote =	{Keywords: Twitter, corpus of Portuguese tweets, Twitter API, natural language processing, text analysis}
}

Document

DOI: 10.4230/OASIcs.SLATE.2014.283

MLT-prealigner: a Tool for Multilingual Text Alignment

Authors: Pedro Carvalho and José João Almeida

Published in: OASIcs, Volume 38, 3rd Symposium on Languages, Applications and Technologies (2014)

Abstract

Parallel text alignment is a key procedure in the automated translation area. A large number of aligners have been presented over the years, but these require that the target resources be prepared for alignment beforehand (either manually or automatically). It is quite common to encounter mixed-language documents, that is, documents where the same information is written in several languages (e.g. manuals of electronic devices, tourist information, PhD theses with dual-language abstracts, etc.). In this article we present MLT-prealigner: a tool aimed at helping those who need to process mixed texts in order to feed alignment tools and other related language systems.

Cite as

Pedro Carvalho and José João Almeida. MLT-prealigner: a Tool for Multilingual Text Alignment. In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 283-290, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)

@InProceedings{carvalho_et_al:OASIcs.SLATE.2014.283,
  author =	{Carvalho, Pedro and Almeida, Jos\'{e} Jo\~{a}o},
  title =	{{MLT-prealigner: a Tool for Multilingual Text Alignment}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{283--290},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Pereira, Maria Jo\~{a}o Varanda and Leal, Jos\'{e} Paulo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2014.283},
  URN =		{urn:nbn:de:0030-drops-45776},
  doi =		{10.4230/OASIcs.SLATE.2014.283},
  annote =	{Keywords: parallel corpora, multilingual text alignment, language detection, Perl, automated translation}
}

5 Search Results for "Carvalho, João Paulo"
