
    Feature Selection and Clustering in Software Quality Prediction

    11th International Conference on Evaluation and Assessment in Software Engineering (EASE)

    Keele University, UK, 2 - 3 April 2007


    Qi Wang, Jie Zhu & Bo Yu


    Software quality prediction models use the software metrics and fault data collected from previous software releases or similar projects to predict the quality of software components in development.

    Previous research has shown that this kind of model can yield predictions with impressive accuracy. However, building an accurate software quality prediction model is still challenging for the following two reasons. First, outliers in software data often have a disproportionate effect on the overall predictive ability of the model. Second, not all collected software metrics should be used to construct the model, because of the curse of dimensionality.

    To resolve these two problems, we present a new software quality prediction model based on a genetic algorithm (GA), in which outlier detection and feature selection are performed simultaneously. The experimental results show that this model performs better than some recently proposed software quality prediction models based on S-PLUS and TreeDisc.

    Furthermore, the clustered software components and selected features are easier for software engineers and data analysts to study and interpret.
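
    The abstract does not give the chromosome encoding or the fitness function, so the following is only a minimal sketch of how a GA could search for a feature subset and an outlier set at the same time. Everything in it is an assumption made for illustration: the bit-string encoding (one bit per metric, then one bit per component), the nearest-centroid classifier, the penalty weights, and the names fitness and evolve. It is not the authors' model.

```python
import random

def centroid(rows):
    """Column-wise mean of a non-empty list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fitness(chrom, X, y, n_feat, max_drop_frac=0.2):
    """Score a chromosome: first n_feat bits select metrics, the rest flag outliers.

    Assumed (hypothetical) objective: nearest-centroid accuracy on the kept
    components, minus small penalties for each metric used and each component dropped.
    """
    feats = [j for j, bit in enumerate(chrom[:n_feat]) if bit]
    kept = [i for i, bit in enumerate(chrom[n_feat:]) if not bit]
    dropped = len(X) - len(kept)
    if not feats or dropped > max_drop_frac * len(X):
        return 0.0  # no metrics selected, or too many components discarded
    groups = {}
    for i in kept:
        groups.setdefault(y[i], []).append([X[i][j] for j in feats])
    if len(groups) < 2:
        return 0.0  # need both fault-prone and not-fault-prone examples
    cents = {label: centroid(rows) for label, rows in groups.items()}
    correct = 0
    for i in kept:
        point = [X[i][j] for j in feats]
        pred = min(cents, key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(point, cents[c])))
        correct += pred == y[i]
    accuracy = correct / len(kept)
    return accuracy - 0.01 * len(feats) - 0.5 * (dropped / len(X))

def evolve(X, y, pop_size=30, generations=40, mut_rate=0.05, seed=0):
    """Plain generational GA: elitism, tournament selection, uniform crossover."""
    rng = random.Random(seed)
    n_feat, n_samp = len(X[0]), len(X)

    def random_chrom():
        feat_bits = [rng.randint(0, 1) for _ in range(n_feat)]
        out_bits = [int(rng.random() < 0.1) for _ in range(n_samp)]
        return feat_bits + out_bits

    def score(c):
        return fitness(c, X, y, n_feat)

    pop = [random_chrom() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        nxt = pop[:2]  # keep the two best chromosomes unchanged
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 3), key=score)
            b = max(rng.sample(pop, 3), key=score)
            child = [p if rng.random() < 0.5 else q for p, q in zip(a, b)]
            child = [bit ^ int(rng.random() < mut_rate) for bit in child]
            nxt.append(child)
        pop = nxt
    best = max(pop, key=score)
    return best[:n_feat], best[n_feat:], score(best)

# Toy data (made up): metric 0 separates the classes, metric 1 is noise.
X_toy = [[0, 5], [1, 3], [0.5, 4], [10, 4], [11, 5], [10.5, 3]]
y_toy = [0, 0, 0, 1, 1, 1]
feature_mask, outlier_mask, best_score = evolve(X_toy, y_toy)
```

    Scoring accuracy only on the kept components would let the search raise its score by discarding hard cases, which is why this sketch penalises every dropped component and caps the dropped fraction. A held-out validation set would be a more defensible objective.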

