Improving the Multi-Class Classification of Non-Functional Requirements in Spanish: A Study of Dataset Balancing and Performance
Academic Article

abstract

  • Context: In recent years, the multi-class classification of non-functional requirements has improved through the use of Machine Learning algorithms. However, challenges such as data scarcity and class imbalance persist, particularly for languages other than English, such as Spanish. Objective: This study analyzes the performance metrics of Machine Learning algorithms for classifying non-functional requirements, both those translated into Spanish and those originally written in Spanish. It evaluates the effectiveness of dataset balancing techniques and conducts cross-dataset validation to assess the generalizability of the models. Method: A dataset balancing process was conducted using a combination of oversampling and undersampling techniques. Six algorithms were trained in two experiments using a hyperparameter tuning process, employing two different datasets: PROMISE_exp_translated and the newly created PROMISE_exp_balanced. The best-performing models were further tested on unseen data to evaluate their generalizability. Results: Logistic Regression and Naive Bayes demonstrated superior performance on the translated dataset, achieving F1-scores of 82% and 81%, respectively. Although overall performance decreased on the balanced dataset, specific underrepresented classes such as Portability and Fault Tolerance benefited from the balancing process. Conclusion: Shallow Machine Learning algorithms are effective for classifying Spanish non-functional requirements, particularly when addressing data imbalance. The study highlights the importance of dataset balancing in improving classification performance for specific classes and provides insights into the challenges of generalizing models across datasets.
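
The balancing step mentioned in the Method can be illustrated with a minimal sketch. The abstract does not say which oversampling and undersampling techniques were used, so this example assumes plain random resampling to a fixed per-class size. The function name `balance_by_resampling`, the `target_per_class` parameter, and the toy requirements are all hypothetical and are not taken from the study.

```python
import random
from collections import Counter, defaultdict


def balance_by_resampling(texts, labels, target_per_class, seed=0):
    """Return a resampled dataset with target_per_class examples per class.

    Assumption: random undersampling for large classes and random
    oversampling (duplication) for small ones. The study may have used
    different techniques.
    """
    rng = random.Random(seed)

    # Group the requirement texts by their class label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    out_texts, out_labels = [], []
    for label, items in sorted(by_class.items()):
        if len(items) >= target_per_class:
            # Undersampling: keep a random subset, without replacement.
            chosen = rng.sample(items, target_per_class)
        else:
            # Oversampling: keep every original and add random duplicates.
            extra = rng.choices(items, k=target_per_class - len(items))
            chosen = items + extra
        out_texts.extend(chosen)
        out_labels.extend([label] * target_per_class)
    return out_texts, out_labels


# Toy, made-up data: 5 Security, 3 Usability, 1 Portability requirement.
texts = [
    "El sistema debe cifrar las contraseñas.",
    "El sistema debe bloquear la cuenta tras cinco intentos fallidos.",
    "Solo los administradores pueden borrar registros.",
    "Las sesiones deben expirar tras 15 minutos de inactividad.",
    "Todas las conexiones deben usar HTTPS.",
    "La interfaz debe ser fácil de aprender.",
    "Los mensajes de error deben ser claros.",
    "El usuario debe completar una compra en menos de cinco pasos.",
    "El sistema debe ejecutarse en Windows y Linux.",
]
labels = ["Security"] * 5 + ["Usability"] * 3 + ["Portability"]

balanced_texts, balanced_labels = balance_by_resampling(
    texts, labels, target_per_class=3
)
print(Counter(labels))           # class sizes before balancing
print(Counter(balanced_labels))  # class sizes after balancing
```

Duplicating the few examples of a rare class gives it more weight during training without adding new information, which is one plausible reason balancing can help specific underrepresented classes while overall scores fall.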

authors

  • Limaylla-Lunarejo, M.
  • Condori-Fernandez, N.
  • Rodríguez Luaces, M.
  • Karras, Oliver

publication date

  • 2026

start page

  • 6

volume

  • 31

issue

  • 1