Towards Scope Detection in Textual Requirements

Holter, Ole Magnus; Ell, Basil

doi:10.4230/OASIcs.LDK.2021.31

Author Details

Ole Magnus Holter

Department of Informatics, University of Oslo, Norway

Basil Ell

Department of Informatics, University of Oslo, Norway

Cite As

Ole Magnus Holter and Basil Ell. Towards Scope Detection in Textual Requirements. In 3rd Conference on Language, Data and Knowledge (LDK 2021). Open Access Series in Informatics (OASIcs), Volume 93, pp. 31:1-31:15, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2021)
https://doi.org/10.4230/OASIcs.LDK.2021.31

Abstract

Requirements are an integral part of industry operation and projects. Not only do requirements dictate industrial operations, but they are also used in legally binding contracts between supplier and purchaser. Some companies even have requirements as their core business. Most requirements are found in textual documents, which brings several challenges, such as ambiguity, scalability, maintenance, and finding relevant and related requirements. Having the requirements in a machine-readable format would be a solution to these challenges; however, existing requirements need to be transformed into machine-readable requirements using NLP technology. Applying state-of-the-art NLP methods based on end-to-end neural modelling to such documents is not trivial because the language is technical and domain-specific and training data is not available. In this paper, we focus on one step in that direction, namely scope detection of textual requirements using weak supervision and a simple classifier based on BERT general-domain word embeddings. We show that, using openly available data, it is possible to get promising results on domain-specific requirements documents.

Subject Classification

ACM Subject Classification

Computing methodologies → Natural language processing

Keywords

Scope Detection
Textual requirements
NLP
