Less is more in incident categorization (Short Paper)

Silva, Sara; Ribeiro, Ricardo; Pereira, Rubén

doi:10.4230/OASIcs.SLATE.2018.17

File

OASIcs.SLATE.2018.17.pdf

Filesize: 350 kB
7 pages

Document Identifiers

DOI: 10.4230/OASIcs.SLATE.2018.17
URN: urn:nbn:de:0030-drops-92755

Author Details

Sara Silva

Instituto Universitário de Lisboa (ISCTE-IUL), Lisbon, Portugal

Ricardo Ribeiro

INESC-ID Lisboa, Instituto Universitário de Lisboa (ISCTE-IUL), Lisbon, Portugal

Rubén Pereira

Instituto Universitário de Lisboa (ISCTE-IUL), Lisbon, Portugal

Cite As

Sara Silva, Ricardo Ribeiro, and Rubén Pereira. Less is more in incident categorization (Short Paper). In 7th Symposium on Languages, Applications and Technologies (SLATE 2018). Open Access Series in Informatics (OASIcs), Volume 62, pp. 17:1-17:7, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2018)
https://doi.org/10.4230/OASIcs.SLATE.2018.17

Abstract

The IT incident management process requires correct categorization to assign incident tickets to the right resolution group and restore an operational system as quickly as possible, minimizing the impact on the business and customers. In this work, we introduce automatic text classification, demonstrating the application of several natural language processing techniques and analyzing the impact of each one on a real incident ticket dataset. The techniques that we explore in the pre-processing of the text that describes an incident are the following: tokenization, stemming, stop-word elimination, named-entity recognition, and TF-IDF-based document representation. Finally, to build the model and observe the results after applying the previous techniques, we use two machine learning algorithms: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). Two important findings result from this study: a shorter description of an incident is better than a full description; and pre-processing has little impact on incident categorization, mainly due to the specific vocabulary used in this type of text.
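To make the kind of pipeline named in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It covers only tokenization, stop-word elimination, a TF-IDF representation, and a K-Nearest Neighbor classifier with cosine similarity; stemming, named-entity recognition, and the SVM are left out. The stop-word list, the ticket texts, the category labels, and all function names are hypothetical and were invented for illustration; they do not come from the paper's dataset.

```python
import math
import re
from collections import Counter

# Hypothetical stop-word list, invented for this illustration.
STOP_WORDS = {"the", "a", "an", "is", "to", "on", "my", "i", "not", "in", "of", "and"}


def tokenize(text):
    """Lowercase the text, split on non-alphanumerics, and drop stop-words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def fit_idf(token_lists):
    """Compute a smoothed inverse document frequency for every training term."""
    n_docs = len(token_lists)
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    return {term: math.log((1 + n_docs) / (1 + df)) + 1.0 for term, df in doc_freq.items()}


def tfidf(tokens, idf):
    """Return an L2-normalised sparse TF-IDF vector; unseen terms are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(u, v):
    """Cosine similarity of two L2-normalised sparse vectors (a dot product)."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def knn_predict(query, train_vecs, labels, idf, k=3):
    """Majority vote among the k training tickets most similar to the query."""
    q = tfidf(tokenize(query), idf)
    ranked = sorted(zip(train_vecs, labels), key=lambda p: cosine(q, p[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical short ticket descriptions and categories (not real data).
TRAIN = [
    ("password reset required for email account", "access"),
    ("cannot login, password expired", "access"),
    ("account locked after wrong password", "access"),
    ("printer on second floor is not printing", "hardware"),
    ("laptop screen is broken", "hardware"),
    ("printer paper jam", "hardware"),
]

token_lists = [tokenize(text) for text, _ in TRAIN]
idf = fit_idf(token_lists)
train_vecs = [tfidf(tokens, idf) for tokens in token_lists]
labels = [label for _, label in TRAIN]

prediction = knn_predict("printer broken", train_vecs, labels, idf, k=3)
```

With these toy tickets, `prediction` is the label shared by the three nearest neighbours of the query. Comparing short and full descriptions, as the abstract does, would amount to running such a pipeline twice on the two text fields of the same tickets.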

Subject Classification

ACM Subject Classification

Computing methodologies → Natural language processing

Keywords

machine learning
automated incident categorization
SVM
incident management
natural language
