11 Search Results for "Ribeiro, Ricardo"


Document
Classification of Public Administration Complaints

Authors: Francisco Caldeira, Luís Nunes, and Ricardo Ribeiro

Published in: OASIcs, Volume 104, 11th Symposium on Languages, Applications and Technologies (SLATE 2022)


Abstract
Complaint management is a problem faced by many organizations that is both vital to customer image and highly dependent on human resources. This work attempts to tackle a part of the problem, by classifying summaries of complaints using machine learning models in order to better redirect these to the appropriate responders. The main challenges of this task is that training datasets are often small and highly imbalanced. This can can have a big impact on the performance of classification models. The dataset analyzed in this work suffers from both of these problems, being relatively small and having labels in different proportions. In this work, two different techniques are analyzed: combining classes together to increase the number of elements of the new class; and, providing new artificial examples for some classes via translation into other languages. The classification models explored were the following: k-NN, SVM, Naïve Bayes, boosting, and Deep Learning approaches, including transformers. The paper concludes that although, as expected, the classes with little representation are hard to classify, the techniques explored helped to boost the performance, especially in the classes with a low number of elements. SVM and BERT-based models outperformed their peers.

Cite as

Francisco Caldeira, Luís Nunes, and Ricardo Ribeiro. Classification of Public Administration Complaints. In 11th Symposium on Languages, Applications and Technologies (SLATE 2022). Open Access Series in Informatics (OASIcs), Volume 104, pp. 9:1-9:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{caldeira_et_al:OASIcs.SLATE.2022.9,
  author =	{Caldeira, Francisco and Nunes, Lu{\'\i}s and Ribeiro, Ricardo},
  title =	{{Classification of Public Administration Complaints}},
  booktitle =	{11th Symposium on Languages, Applications and Technologies (SLATE 2022)},
  pages =	{9:1--9:12},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-245-7},
  ISSN =	{2190-6807},
  year =	{2022},
  volume =	{104},
  editor =	{Cordeiro, Jo\~{a}o and Pereira, Maria Jo\~{a}o and Rodrigues, Nuno F. and Pais, Sebasti\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2022.9},
  URN =		{urn:nbn:de:0030-drops-167555},
  doi =		{10.4230/OASIcs.SLATE.2022.9},
  annote =	{Keywords: Text Classification, Natural Language Processing, Deep Learning, BERT}
}
Document
Comparing Different Approaches for Detecting Hate Speech in Online Portuguese Comments

Authors: Bernardo Cunha Matos, Raquel Bento Santos, Paula Carvalho, Ricardo Ribeiro, and Fernando Batista

Published in: OASIcs, Volume 104, 11th Symposium on Languages, Applications and Technologies (SLATE 2022)


Abstract
Online Hate Speech (OHS) has been growing dramatically on social media, which has motivated researchers to develop a diversity of methods for its automated detection. However, the detection of OHS in Portuguese is still little studied. To fill this gap, we explored different models that proved to be successful in the literature to address this task. In particular, we have explored transfer learning approaches, based on existing BERT-like pre-trained models. The performed experiments were based on CO-HATE, a corpus of YouTube comments posted by the Portuguese online community that was manually labeled by different annotators. Among other categories, those comments were labeled regarding the presence of hate speech and the type of hate speech, specifically overt and covert hate speech. We have assessed the impact of using annotations from different annotators on the performance of such models. In addition, we have analyzed the impact of distinguishing overt and and covert hate speech. The results achieved show the importance of considering the annotator’s profile in the development of hate speech detection models. Regarding the hate speech type, the results obtained do not allow to make any conclusion on what type is easier to detect. Finally, we show that pre-processing does not seem to have a significant impact on the performance of this specific task.

Cite as

Bernardo Cunha Matos, Raquel Bento Santos, Paula Carvalho, Ricardo Ribeiro, and Fernando Batista. Comparing Different Approaches for Detecting Hate Speech in Online Portuguese Comments. In 11th Symposium on Languages, Applications and Technologies (SLATE 2022). Open Access Series in Informatics (OASIcs), Volume 104, pp. 10:1-10:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{matos_et_al:OASIcs.SLATE.2022.10,
  author =	{Matos, Bernardo Cunha and Santos, Raquel Bento and Carvalho, Paula and Ribeiro, Ricardo and Batista, Fernando},
  title =	{{Comparing Different Approaches for Detecting Hate Speech in Online Portuguese Comments}},
  booktitle =	{11th Symposium on Languages, Applications and Technologies (SLATE 2022)},
  pages =	{10:1--10:12},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-245-7},
  ISSN =	{2190-6807},
  year =	{2022},
  volume =	{104},
  editor =	{Cordeiro, Jo\~{a}o and Pereira, Maria Jo\~{a}o and Rodrigues, Nuno F. and Pais, Sebasti\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2022.10},
  URN =		{urn:nbn:de:0030-drops-167560},
  doi =		{10.4230/OASIcs.SLATE.2022.10},
  annote =	{Keywords: Hate Speech, Text Classification, Transfer Learning, Supervised Learning, Deep Learning}
}
Document
Semi-Supervised Annotation of Portuguese Hate Speech Across Social Media Domains

Authors: Raquel Bento Santos, Bernardo Cunha Matos, Paula Carvalho, Fernando Batista, and Ricardo Ribeiro

Published in: OASIcs, Volume 104, 11th Symposium on Languages, Applications and Technologies (SLATE 2022)


Abstract
With the increasing spread of hate speech (HS) on social media, it becomes urgent to develop models that can help detecting it automatically. Typically, such models require large-scale annotated corpora, which are still scarce in languages such as Portuguese. However, creating manually annotated corpora is a very expensive and time-consuming task. To address this problem, we propose an ensemble of two semi-supervised models that can be used to automatically create a corpus representative of online hate speech in Portuguese. The first model combines Generative Adversarial Networks and a BERT-based model. The second model is based on label propagation, and consists of propagating labels from existing annotated corpora to the unlabeled data, by exploring the notion of similarity. We have explored the annotations of three existing corpora (CO-HATE, ToLR-BR, and HPHS) in order to automatically annotate FIGHT, a corpus composed of geolocated tweets produced in the Portuguese territory. Through the process of selecting the best model and the corresponding setup, we have tested different pre-trained embeddings, performed experiments using different training subsets, labeled by different annotators with different perspectives, and performed several experiments with active learning. Furthermore, this work explores back translation as a mean to automatically generate additional hate speech samples. The best results were achieved by combining all the labeled datasets, obtaining 0.664 F1-score for the Hate Speech class in FIGHT.

Cite as

Raquel Bento Santos, Bernardo Cunha Matos, Paula Carvalho, Fernando Batista, and Ricardo Ribeiro. Semi-Supervised Annotation of Portuguese Hate Speech Across Social Media Domains. In 11th Symposium on Languages, Applications and Technologies (SLATE 2022). Open Access Series in Informatics (OASIcs), Volume 104, pp. 11:1-11:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2022)


Copy BibTex To Clipboard

@InProceedings{santos_et_al:OASIcs.SLATE.2022.11,
  author =	{Santos, Raquel Bento and Matos, Bernardo Cunha and Carvalho, Paula and Batista, Fernando and Ribeiro, Ricardo},
  title =	{{Semi-Supervised Annotation of Portuguese Hate Speech Across Social Media Domains}},
  booktitle =	{11th Symposium on Languages, Applications and Technologies (SLATE 2022)},
  pages =	{11:1--11:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-245-7},
  ISSN =	{2190-6807},
  year =	{2022},
  volume =	{104},
  editor =	{Cordeiro, Jo\~{a}o and Pereira, Maria Jo\~{a}o and Rodrigues, Nuno F. and Pais, Sebasti\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2022.11},
  URN =		{urn:nbn:de:0030-drops-167570},
  doi =		{10.4230/OASIcs.SLATE.2022.11},
  annote =	{Keywords: Hate Speech, Semi-Supervised Learning, Semi-Automatic Annotation}
}
Document
Semantic Search of Mobile Applications Using Word Embeddings

Authors: João Coelho, António Neto, Miguel Tavares, Carlos Coutinho, Ricardo Ribeiro, and Fernando Batista

Published in: OASIcs, Volume 94, 10th Symposium on Languages, Applications and Technologies (SLATE 2021)


Abstract
This paper proposes a set of approaches for the semantic search of mobile applications, based on their name and on the unstructured textual information contained in their description. The proposed approaches make use of word-level, character-level, and contextual word-embeddings that have been trained or fine-tuned using a dataset of about 500 thousand mobile apps, collected in the scope of this work. The proposed approaches have been evaluated using a public dataset that includes information about 43 thousand applications, and 56 manually annotated non-exact queries. Our results show that both character-level embeddings trained on our data, and fine-tuned RoBERTa models surpass the performance of the other existing retrieval strategies reported in the literature.

Cite as

João Coelho, António Neto, Miguel Tavares, Carlos Coutinho, Ricardo Ribeiro, and Fernando Batista. Semantic Search of Mobile Applications Using Word Embeddings. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 12:1-12:12, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{coelho_et_al:OASIcs.SLATE.2021.12,
  author =	{Coelho, Jo\~{a}o and Neto, Ant\'{o}nio and Tavares, Miguel and Coutinho, Carlos and Ribeiro, Ricardo and Batista, Fernando},
  title =	{{Semantic Search of Mobile Applications Using Word Embeddings}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{12:1--12:12},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.12},
  URN =		{urn:nbn:de:0030-drops-144292},
  doi =		{10.4230/OASIcs.SLATE.2021.12},
  annote =	{Keywords: Semantic Search, Word Embeddings, Elasticsearch, Mobile Applications}
}
Document
Sentiment Analysis of Portuguese Economic News

Authors: Cátia Tavares, Ricardo Ribeiro, and Fernando Batista

Published in: OASIcs, Volume 94, 10th Symposium on Languages, Applications and Technologies (SLATE 2021)


Abstract
This paper proposes a rule-based method for automatic polarity detection over economic news texts, which proved suitable for detecting the sentiment in Portuguese economic news. The data used in our experiments consists of 400 manually annotated sentences extracted from economic news, used for evaluation, and about 90 thousand Portuguese economic news, extracted from two well-known Portuguese newspapers, covering the period from 2010 to 2020, that have been used for training our systems. In order to perform sentiment analysis of economic news, we have also tested the adaptation of existing pre-trained modules, and also performed experiments with a set of Machine Learning approaches, and self-training. Experimental results show that our rule-based approach, that uses manually written rules related to the economic context, achieves the best results for automatically detecting the polarity of economic news, largely surpassing the other approaches.

Cite as

Cátia Tavares, Ricardo Ribeiro, and Fernando Batista. Sentiment Analysis of Portuguese Economic News. In 10th Symposium on Languages, Applications and Technologies (SLATE 2021). Open Access Series in Informatics (OASIcs), Volume 94, pp. 17:1-17:13, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{tavares_et_al:OASIcs.SLATE.2021.17,
  author =	{Tavares, C\'{a}tia and Ribeiro, Ricardo and Batista, Fernando},
  title =	{{Sentiment Analysis of Portuguese Economic News}},
  booktitle =	{10th Symposium on Languages, Applications and Technologies (SLATE 2021)},
  pages =	{17:1--17:13},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-202-0},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{94},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Portela, Filipe and Pereira, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2021.17},
  URN =		{urn:nbn:de:0030-drops-144347},
  doi =		{10.4230/OASIcs.SLATE.2021.17},
  annote =	{Keywords: Sentiment Analysis, Economic News, Portuguese Language}
}
Document
Moopec: A Tool for Creating Programming Problems

Authors: Rui C. Mendes

Published in: OASIcs, Volume 91, Second International Computer Programming Education Conference (ICPEC 2021)


Abstract
This paper presents a tool called Mooshak ProblEm Creator (Moopec) for facilitating the creation of programming exercises for a web-based multi-site programming contest system called Mooshak [Leal and Silva, 2003]. Users only need to create a text file for specifying all the information concerning problems including their description, tests and user feedback. This tool provides ways of automating most tasks involved in creating problems in Mooshak and, consequently, increases teachers' productivity. Moopec allows instructors to quickly create problem sets by simply editing a text file. Moopec is implemented in Python and is available at https://github.com/rcm/mooshak_problem_creator.

Cite as

Rui C. Mendes. Moopec: A Tool for Creating Programming Problems. In Second International Computer Programming Education Conference (ICPEC 2021). Open Access Series in Informatics (OASIcs), Volume 91, pp. 9:1-9:7, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)


Copy BibTex To Clipboard

@InProceedings{mendes:OASIcs.ICPEC.2021.9,
  author =	{Mendes, Rui C.},
  title =	{{Moopec: A Tool for Creating Programming Problems}},
  booktitle =	{Second International Computer Programming Education Conference (ICPEC 2021)},
  pages =	{9:1--9:7},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-194-8},
  ISSN =	{2190-6807},
  year =	{2021},
  volume =	{91},
  editor =	{Henriques, Pedro Rangel and Portela, Filipe and Queir\'{o}s, Ricardo and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.ICPEC.2021.9},
  URN =		{urn:nbn:de:0030-drops-142253},
  doi =		{10.4230/OASIcs.ICPEC.2021.9},
  annote =	{Keywords: Automatic Program Assessment, Batch Generation, Testing}
}
Document
Towards the Identification of Fake News in Portuguese

Authors: João Rodrigues, Ricardo Ribeiro, and Fernando Batista

Published in: OASIcs, Volume 83, 9th Symposium on Languages, Applications and Technologies (SLATE 2020)


Abstract
All over the world, many initiatives have been taken to fight fake news. Governments (e.g., France, Germany, United Kingdom and Spain), on their own way, started to take action regarding legal accountability for those who manufacture or propagate fake news. Different media outlets have also taken a multitude of initiatives to deal with this phenomenon, such as the increase of discipline, accuracy and transparency of publications made internally. Some structural changes have lately been made in said companies and entities in order to better evaluate news in general. As such, many teams were built entirely to fight fake news - the so-called "fact-checkers". These have been adopting different techniques in order to do so: from the typical use of journalists to find out the true behind a controversial statement, to data-scientists that apply forefront techniques such as text mining and machine learning to support the journalist’s decisions. Many of these entities, which aim to maintain or improve their reputation, started to focus on high standards for quality and reliable information, which led to the creation of official and dedicated departments for fact-checking. In this revision paper, not only will we highlight relevant contributions and efforts across the fake news identification and classification status quo, but we will also contextualize the Portuguese language state of affairs in the current state-of-the-art.

Cite as

João Rodrigues, Ricardo Ribeiro, and Fernando Batista. Towards the Identification of Fake News in Portuguese. In 9th Symposium on Languages, Applications and Technologies (SLATE 2020). Open Access Series in Informatics (OASIcs), Volume 83, pp. 7:1-7:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{rodrigues_et_al:OASIcs.SLATE.2020.7,
  author =	{Rodrigues, Jo\~{a}o and Ribeiro, Ricardo and Batista, Fernando},
  title =	{{Towards the Identification of Fake News in Portuguese}},
  booktitle =	{9th Symposium on Languages, Applications and Technologies (SLATE 2020)},
  pages =	{7:1--7:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-165-8},
  ISSN =	{2190-6807},
  year =	{2020},
  volume =	{83},
  editor =	{Sim\~{o}es, Alberto and Henriques, Pedro Rangel and Queir\'{o}s, Ricardo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2020.7},
  URN =		{urn:nbn:de:0030-drops-130207},
  doi =		{10.4230/OASIcs.SLATE.2020.7},
  annote =	{Keywords: Fake News, Portuguese Language, Fact-checking}
}
Document
Short Paper
Different Lexicon-Based Approaches to Emotion Identification in Portuguese Tweets (Short Paper)

Authors: Soraia Filipe, Fernando Batista, and Ricardo Ribeiro

Published in: OASIcs, Volume 83, 9th Symposium on Languages, Applications and Technologies (SLATE 2020)


Abstract
This paper presents the existing literature on the identification of emotions and describes various lexica-based approaches and translation strategies to identify emotions in Portuguese tweets. A dataset of tweets was manually annotated to evaluate our classifier and also to assess the difficulty of the task. A lexicon-based approach was used in order to classify the presence or absence of eight different emotions in a tweet. Different strategies have been applied to refine and improve an existing and widely used lexicon, by means of automatic machine translation and aligned word embeddings. We tested six different classification approaches, exploring different ways of directly applying resources available for English by means of different translation strategies. The achieved results suggest that a better performance can be obtained both by improving a lexicon and by directly translating tweets into English and then applying an existing English lexicon.

Cite as

Soraia Filipe, Fernando Batista, and Ricardo Ribeiro. Different Lexicon-Based Approaches to Emotion Identification in Portuguese Tweets (Short Paper). In 9th Symposium on Languages, Applications and Technologies (SLATE 2020). Open Access Series in Informatics (OASIcs), Volume 83, pp. 12:1-12:8, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{filipe_et_al:OASIcs.SLATE.2020.12,
  author =	{Filipe, Soraia and Batista, Fernando and Ribeiro, Ricardo},
  title =	{{Different Lexicon-Based Approaches to Emotion Identification in Portuguese Tweets}},
  booktitle =	{9th Symposium on Languages, Applications and Technologies (SLATE 2020)},
  pages =	{12:1--12:8},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-165-8},
  ISSN =	{2190-6807},
  year =	{2020},
  volume =	{83},
  editor =	{Sim\~{o}es, Alberto and Henriques, Pedro Rangel and Queir\'{o}s, Ricardo},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2020.12},
  URN =		{urn:nbn:de:0030-drops-130252},
  doi =		{10.4230/OASIcs.SLATE.2020.12},
  annote =	{Keywords: Emotion detection, tweets, Portuguese Language, Emotion lexicon}
}
Document
Using Property-Based Testing to Generate Feedback for C Programming Exercises

Authors: Pedro Vasconcelos and Rita P. Ribeiro

Published in: OASIcs, Volume 81, First International Computer Programming Education Conference (ICPEC 2020)


Abstract
This paper reports on the use of property-based testing for providing feedback to C programming exercises. Test cases are generated automatically from properties specified in a test script; this not only makes it possible to conduct many tests (thus potentially find more mistakes), but also allows simplifying failed tests cases automatically. We present some experimental validation gathered for an introductory C programming course during the fall semester of 2018 that show significant positive correlations between getting feedback during the semester and the student’s results in the final exam. We also discuss some limitations regarding feedback for undefined behaviors in the C language.

Cite as

Pedro Vasconcelos and Rita P. Ribeiro. Using Property-Based Testing to Generate Feedback for C Programming Exercises. In First International Computer Programming Education Conference (ICPEC 2020). Open Access Series in Informatics (OASIcs), Volume 81, pp. 28:1-28:10, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2020)


Copy BibTex To Clipboard

@InProceedings{vasconcelos_et_al:OASIcs.ICPEC.2020.28,
  author =	{Vasconcelos, Pedro and Ribeiro, Rita P.},
  title =	{{Using Property-Based Testing to Generate Feedback for C Programming Exercises}},
  booktitle =	{First International Computer Programming Education Conference (ICPEC 2020)},
  pages =	{28:1--28:10},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-153-5},
  ISSN =	{2190-6807},
  year =	{2020},
  volume =	{81},
  editor =	{Queir\'{o}s, Ricardo and Portela, Filipe and Pinto, M\'{a}rio and Sim\~{o}es, Alberto},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.ICPEC.2020.28},
  URN =		{urn:nbn:de:0030-drops-123159},
  doi =		{10.4230/OASIcs.ICPEC.2020.28},
  annote =	{Keywords: property-based testing, C language, Haskell language, teaching programming}
}
Document
Short Paper
Less is more in incident categorization (Short Paper)

Authors: Sara Silva, Ricardo Ribeiro, and Rubén Pereira

Published in: OASIcs, Volume 62, 7th Symposium on Languages, Applications and Technologies (SLATE 2018)


Abstract
The IT incident management process requires a correct categorization to attribute incident tickets to the right resolution group and obtain as quickly as possible an operational system, impacting the minimum as possible the business and costumers. In this work, we introduce automatic text classification, demonstrating the application of several natural language processing techniques and analyzing the impact of each one on a real incident tickets dataset. The techniques that we explore in the pre-processing of the text that describes an incident are the following: tokenization, stemming, eliminating stop-words, named-entity recognition, and TF xIDF-based document representation. Finally, to build the model and observe the results after applying the previous techniques, we use two machine learning algorithms: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). Two important findings result from this study: a shorter description of an incident is better than a full description of an incident; and, pre-processing has little impact on incident categorization, mainly due the specific vocabulary used in this type of text.

Cite as

Sara Silva, Ricardo Ribeiro, and Rubén Pereira. Less is more in incident categorization (Short Paper). In 7th Symposium on Languages, Applications and Technologies (SLATE 2018). Open Access Series in Informatics (OASIcs), Volume 62, pp. 17:1-17:7, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)


Copy BibTex To Clipboard

@InProceedings{silva_et_al:OASIcs.SLATE.2018.17,
  author =	{Silva, Sara and Ribeiro, Ricardo and Pereira, Rub\'{e}n},
  title =	{{Less is more in incident categorization}},
  booktitle =	{7th Symposium on Languages, Applications and Technologies (SLATE 2018)},
  pages =	{17:1--17:7},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-072-9},
  ISSN =	{2190-6807},
  year =	{2018},
  volume =	{62},
  editor =	{Henriques, Pedro Rangel and Leal, Jos\'{e} Paulo and Leit\~{a}o, Ant\'{o}nio Menezes and Guinovart, Xavier G\'{o}mez},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2018.17},
  URN =		{urn:nbn:de:0030-drops-92755},
  doi =		{10.4230/OASIcs.SLATE.2018.17},
  annote =	{Keywords: machine learning, automated incident categorization, SVM, incident management, natural language}
}
Document
Adapting Speech Recognition in Augmented Reality for Mobile Devices in Outdoor Environments

Authors: Rui Pascoal, Ricardo Ribeiro, Fernando Batista, and Ana de Almeida

Published in: OASIcs, Volume 56, 6th Symposium on Languages, Applications and Technologies (SLATE 2017)


Abstract
This paper describes the process of integrating automatic speech recognition (ASR) into a mobile application and explores the benefits and challenges of integrating speech with augmented reality (AR) in outdoor environments. The augmented reality allows end-users to interact with the information displayed and perform tasks, while increasing the user’s perception about the real world by adding virtual information to it. Speech is the most natural way of communication: it allows hands-free interaction and may allow end-users to quickly and easily access a range of features available. Speech recognition technology is often available in most of the current mobile devices, but it often uses Internet to receive the corresponding transcript from remote servers, e.g., Google speech recognition. However, in some outdoor environments, Internet is not always available or may be offered at poor quality. We integrated an off-line automatic speech recognition module into an AR application for outdoor usage that does not require Internet. Currently, speech interaction is used within the application to access five different features, namely: to take a photo, shoot a film, communicate, messaging related tasks, and to request information, either geographic, biometric, or climatic. The application makes available solutions to manage and interact with the mobile device, offering good usability. We have compared the online and off-line speech recognition systems in order to assess their adequacy to the tasks. Both systems were tested under different conditions, commonly found in outdoor environments, such as: Internet access quality, presence of noise, and distractions.

Cite as

Rui Pascoal, Ricardo Ribeiro, Fernando Batista, and Ana de Almeida. Adapting Speech Recognition in Augmented Reality for Mobile Devices in Outdoor Environments. In 6th Symposium on Languages, Applications and Technologies (SLATE 2017). Open Access Series in Informatics (OASIcs), Volume 56, pp. 21:1-21:14, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2017)


Copy BibTex To Clipboard

@InProceedings{pascoal_et_al:OASIcs.SLATE.2017.21,
  author =	{Pascoal, Rui and Ribeiro, Ricardo and Batista, Fernando and de Almeida, Ana},
  title =	{{Adapting Speech Recognition in Augmented Reality for Mobile Devices in Outdoor Environments}},
  booktitle =	{6th Symposium on Languages, Applications and Technologies (SLATE 2017)},
  pages =	{21:1--21:14},
  series =	{Open Access Series in Informatics (OASIcs)},
  ISBN =	{978-3-95977-056-9},
  ISSN =	{2190-6807},
  year =	{2017},
  volume =	{56},
  editor =	{Queir\'{o}s, Ricardo and Pinto, M\'{a}rio and Sim\~{o}es, Alberto and Leal, Jos\'{e} Paulo and Varanda, Maria Jo\~{a}o},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{https://drops-dev.dagstuhl.de/entities/document/10.4230/OASIcs.SLATE.2017.21},
  URN =		{urn:nbn:de:0030-drops-79541},
  doi =		{10.4230/OASIcs.SLATE.2017.21},
  annote =	{Keywords: Speech Recognition, Natural Language Processing, Sphinx for Mobile Devices, Augmented Reality, Outdoor Environments}
}
  • Refine by Author
  • 9 Ribeiro, Ricardo
  • 7 Batista, Fernando
  • 2 Carvalho, Paula
  • 2 Matos, Bernardo Cunha
  • 2 Santos, Raquel Bento
  • Show More...

  • Refine by Classification
  • 4 Computing methodologies → Natural language processing
  • 2 Computing methodologies → Machine learning
  • 2 Computing methodologies → Transfer learning
  • 2 Information systems → Clustering and classification
  • 2 Information systems → Language models
  • Show More...

  • Refine by Keyword
  • 3 Portuguese Language
  • 2 Deep Learning
  • 2 Hate Speech
  • 2 Natural Language Processing
  • 2 Text Classification
  • Show More...

  • Refine by Type
  • 11 document

  • Refine by Publication Year
  • 3 2020
  • 3 2021
  • 3 2022
  • 1 2017
  • 1 2018

Questions / Remarks / Feedback
X

Feedback for Dagstuhl Publishing


Thanks for your feedback!

Feedback submitted

Could not send message

Please try again later or send an E-mail