Document

DOI: 10.4230/OASIcs.ICPEC.2023.9

NLP/AI Based Techniques for Programming Exercises Generation

Authors: Tiago Carvalho Freitas, Alvaro Costa Neto, Maria João Varanda Pereira, and Pedro Rangel Henriques

Published in: OASIcs, Volume 112, 4th International Computer Programming Education Conference (ICPEC 2023)

Abstract

This paper focuses on improving the generation of computer programming exercises, to the benefit of both students and teachers. By exploring Natural Language Processing (NLP) and Machine Learning (ML) methods for automatic generation of text and source code, it is possible to semi-automatically construct programming exercises, helping teachers reduce redundant work and more easily apply active learning methodologies. This would not only allow them to still play a leading role in the teaching-learning process, but also provide students with a better and more interactive learning experience. If embedded in a widely accessible website, an exercise generator built on these Artificial Intelligence (AI) methods might be used directly by students to obtain randomised lists of exercises for their own study, in their own time. The emergence of new and increasingly powerful technologies, such as the ones utilised by ChatGPT, raises the discussion about their use for exercise generation. Although these technologies are highly capable, monetary and computational costs, as well as the possibility of incorrect results, are still obstacles to wider adoption. This paper describes the characteristics and behaviour of several ML models applied and trained for text and code generation and their use to generate computer programming exercises. Finally, an analysis based on the correctness and coherence of the resulting exercise statements and the complementary source code generated is presented, and the role that this type of technology can play in an automatic programming exercise generation system is discussed.

Cite as

Tiago Carvalho Freitas, Alvaro Costa Neto, Maria João Varanda Pereira, and Pedro Rangel Henriques. NLP/AI Based Techniques for Programming Exercises Generation. In 4th International Computer Programming Education Conference (ICPEC 2023). Open Access Series in Informatics (OASIcs), Volume 112, pp. 9:1-9:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2023)

@InProceedings{freitas_et_al:OASIcs.ICPEC.2023.9,
  author =	{Freitas, Tiago Carvalho and Costa Neto, Alvaro and Pereira, Maria Jo\~{a}o Varanda and Henriques, Pedro Rangel},
  title =	{{NLP/AI Based Techniques for Programming Exercises Generation}},
  booktitle =	{4th International Computer Programming Education Conference (ICPEC 2023)},
  pages =	{9:1--9:12},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-290-7},
  ISSN =	{2190-6807},
  year =	{2023},
  volume =	{112},
  editor =	{Peixoto de Queir\'{o}s, Ricardo Alexandre and Teixeira Pinto, M\'{a}rio Paulo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.ICPEC.2023.9},
  URN =		{urn:nbn:de:0030-drops-185058},
  doi =		{10.4230/OASIcs.ICPEC.2023.9},
  annote =	{Keywords: Natural Language Processing, Computer Programming Education, Exercises Generation, Text Generation, Code Generation}
}

Document

DOI: 10.4230/OASIcs.SLATE.2014.19

Conclave: Writing Programs to Understand Programs

Authors: Nuno Ramos Carvalho, José João Almeida, Maria João Varanda Pereira, and Pedro Rangel Henriques

Published in: OASIcs, Volume 38, 3rd Symposium on Languages, Applications and Technologies (2014)

Abstract

Software maintainers are often challenged with source code changes to improve software systems, or eliminate defects, in unfamiliar programs. To undertake these tasks, a sufficient understanding of the system, or at least a small part of it, is required. One of the most time-consuming tasks of this process is locating which parts of the code are responsible for some key functionality or feature. This paper introduces Conclave, an environment for software analysis that enhances program comprehension activities. Programmers use natural languages to describe and discuss the problem domain, programming languages to write source code, and markup languages to have programs talking with other programs, so this system has to cope with this heterogeneity of dialects and provide tools in all these areas to effectively contribute to the understanding process. The source code, the problem domain, and the side effects of running the program are represented in the system using ontologies. A combination of tools (specialized in different kinds of languages) creates mappings between the different domains. Conclave provides facilities for feature location, code search, and views of the software that ease the process of understanding the code and devising changes. The underlying feature location technique explores natural language terms used in programs (e.g. function and variable names); using textual analysis and a collection of Natural Language Processing techniques, it computes sets of synonymous terms. These sets are used to score relatedness between program elements and search queries or problem domain concepts, producing ranked lists of program elements that address the search criteria or the concepts, respectively.

Cite as

Nuno Ramos Carvalho, José João Almeida, Maria João Varanda Pereira, and Pedro Rangel Henriques. Conclave: Writing Programs to Understand Programs. In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 19-34, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)

@InProceedings{carvalho_et_al:OASIcs.SLATE.2014.19,
  author =	{Carvalho, Nuno Ramos and Almeida, Jos\'{e} Jo\~{a}o and Pereira, Maria Jo\~{a}o Varanda and Henriques, Pedro Rangel},
  title =	{{Conclave: Writing Programs to Understand Programs}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{19--34},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Pereira, Maria Jo\~{a}o Varanda and Leal, Jos\'{e} Paulo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2014.19},
  URN =		{urn:nbn:de:0030-drops-45561},
  doi =		{10.4230/OASIcs.SLATE.2014.19},
  annote =	{Keywords: software maintenance, software evolution, program comprehension, feature location, concept location, natural language processing}
}

Document

DOI: 10.4230/OASIcs.SLATE.2014.101

Unfuzzying Fuzzy Parsing

Authors: Pedro Carvalho, Nuno Oliveira, and Pedro Rangel Henriques

Published in: OASIcs, Volume 38, 3rd Symposium on Languages, Applications and Technologies (2014)

Abstract

Traditional parsing has always been a focus of discussion among the computer science community. Numerous techniques and algorithms have been proposed over the years, but they require that input texts are correct according to a specific grammar. However, in some cases it is necessary to cope with incorrect or unpredicted inputs that raise ambiguities, making traditional parsing unsuitable. These situations led to the emergence of robust parsing theories, where fuzzy parsing gains relevance. Robust parsing comes at a price: it loses precision and degrades performance, as multiple parses of the input may be necessary while looking for an optimal one. In this short paper we briefly describe the main robust parsing techniques and then propose a different solution to deal with fuzziness of input texts. It is based on automata where states represent contexts and edges represent potential matches (of constructs of interest) inside those contexts. Such an approach is expected to reduce recognition time and ambiguity, as contexts reduce the search space by defining a smaller domain for constructs of interest. Such benefits may be a great addition to the robust parsing area, with application to program comprehension, among other research fields.

Cite as

Pedro Carvalho, Nuno Oliveira, and Pedro Rangel Henriques. Unfuzzying Fuzzy Parsing. In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 101-108, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)

@InProceedings{carvalho_et_al:OASIcs.SLATE.2014.101,
  author =	{Carvalho, Pedro and Oliveira, Nuno and Henriques, Pedro Rangel},
  title =	{{Unfuzzying Fuzzy Parsing}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{101--108},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Pereira, Maria Jo\~{a}o Varanda and Leal, Jos\'{e} Paulo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2014.101},
  URN =		{urn:nbn:de:0030-drops-45637},
  doi =		{10.4230/OASIcs.SLATE.2014.101},
  annote =	{Keywords: robust parsing, fuzzy parsing, automata}
}

Document

DOI: 10.4230/OASIcs.SLATE.2014.283

MLT-prealigner: a Tool for Multilingual Text Alignment

Authors: Pedro Carvalho and José João Almeida

Published in: OASIcs, Volume 38, 3rd Symposium on Languages, Applications and Technologies (2014)

Abstract

Parallel text alignment is a key procedure in the automated translation area. A large number of aligners have been presented over the years, but these require that the target resources have been pre-prepared for alignment (either manually or automatically). It is quite common to encounter mixed-language documents, that is, documents where the same information is written in many languages (e.g. manuals of electronic devices, tourist information, PhD theses with dual-language abstracts, etc.). In this article we present MLT-prealigner: a tool aimed at helping those who need to process mixed texts in order to feed alignment tools and other related language systems.

Cite as

Pedro Carvalho and José João Almeida. MLT-prealigner: a Tool for Multilingual Text Alignment. In 3rd Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 38, pp. 283-290, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2014)

@InProceedings{carvalho_et_al:OASIcs.SLATE.2014.283,
  author =	{Carvalho, Pedro and Almeida, Jos\'{e} Jo\~{a}o},
  title =	{{MLT-prealigner: a Tool for Multilingual Text Alignment}},
  booktitle =	{3rd Symposium on Languages, Applications and Technologies},
  pages =	{283--290},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-68-2},
  ISSN =	{2190-6807},
  year =	{2014},
  volume =	{38},
  editor =	{Pereira, Maria Jo\~{a}o Varanda and Leal, Jos\'{e} Paulo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2014.283},
  URN =		{urn:nbn:de:0030-drops-45776},
  doi =		{10.4230/OASIcs.SLATE.2014.283},
  annote =	{Keywords: parallel corpora, multilingual text alignment, language detection, Perl, automated translation}
}

Document

DOI: 10.4230/OASIcs.SLATE.2012.239

Probabilistic SynSet Based Concept Location

Authors: Nuno Ramos Carvalho, José João Almeida, Maria João Varanda Pereira, and Pedro Rangel Henriques

Published in: OASIcs, Volume 21, 1st Symposium on Languages, Applications and Technologies (2012)

Abstract

Concept location is a common task in program comprehension techniques, essential in many approaches used for software care and software evolution. An important goal of this process is to discover a mapping between source code and human-oriented concepts. Although programs are written in a strict and formal language, natural language terms and sentences, like identifiers (variable or function names), constant strings or comments, can still be found embedded in programs. Using terminology concepts and natural language processing techniques, these terms can be exploited to discover clues about which real-world concepts the source code is addressing. This work extends symbol tables built by compilers with ontology-driven constructs, and extends synonym sets defined in linguistics with Probabilistic SynSets created automatically from software-domain parallel corpora. Using a relational algebra, it then creates semantic bridges between program elements and human-oriented concepts to enhance concept location tasks.

Cite as

Nuno Ramos Carvalho, José João Almeida, Maria João Varanda Pereira, and Pedro Rangel Henriques. Probabilistic SynSet Based Concept Location. In 1st Symposium on Languages, Applications and Technologies. Open Access Series in Informatics (OASIcs), Volume 21, pp. 239-253, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012)

@InProceedings{carvalho_et_al:OASIcs.SLATE.2012.239,
  author =	{Carvalho, Nuno Ramos and Almeida, Jos\'{e} Jo\~{a}o and Varanda Pereira, Maria Jo\~{a}o and Henriques, Pedro Rangel},
  title =	{{Probabilistic SynSet Based Concept Location}},
  booktitle =	{1st Symposium on Languages, Applications and Technologies},
  pages =	{239--253},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-939897-40-8},
  ISSN =	{2190-6807},
  year =	{2012},
  volume =	{21},
  editor =	{Sim\~{o}es, Alberto and Queir\'{o}s, Ricardo and da Cruz, Daniela},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2012.239},
  URN =		{urn:nbn:de:0030-drops-35267},
  doi =		{10.4230/OASIcs.SLATE.2012.239},
  annote =	{Keywords: program comprehension, program visualization, concept location, code inspection, synonym sets, probabilistic synonym sets, translation dictionary}
}

5 Search Results for "Carvalho, Pedro"
