Co-changing code volume prediction through association rule mining and linear regression model

Shin Jie Lee, Li Hsiang Lo, Yu Cheng Chen, Shi Min Shen

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

Code smells are symptoms in source code that indicate possible deeper problems and may serve as drivers for refactoring. Although effort has been devoted to identifying divergent changes and shotgun surgeries, little emphasis has been placed on predicting the volume of co-changing code associated with these code smells. More specifically, when a software developer intends to modify a method, the predicted volume of code that will potentially be co-changed with that method is significant information for estimating the modification effort. In this paper, we propose an approach to predicting the volume of co-changing code affected by a method to be modified. The approach has two key features: co-changing methods are identified, for detecting divergent changes and shotgun surgeries, based on association rules mined from change histories; and the volume of co-changing code affected by a method to be modified is predicted through a fitted regression line with a t-test, derived from the co-changing method identification results. The experimental results show that the success rate of co-changing method identification is 82% with a suggested threshold, and that the number of correct identifications is not affected by the increasing number of commits as a project evolves. Additionally, the mean absolute error of the co-changing code volume predictions is 133 lines of code, which is 95.3% less than that of a naive approach.
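
As a rough illustration of the two steps named in the abstract, the sketch below mines pairwise association rules from a toy change history and fits an ordinary least-squares line. It is not the authors' implementation. The function names (mine_cochange_rules, fit_line), the confidence threshold, the sample commits, and the choice of regression variables are all assumptions made for illustration, and the t-test mentioned in the abstract is omitted.

```python
from collections import Counter
from itertools import permutations


def mine_cochange_rules(commits, min_confidence=0.6):
    """Map each method to the methods that tend to change with it.

    A rule a -> b is kept when
    confidence = count(commits with a and b) / count(commits with a)
    reaches min_confidence (a hypothetical threshold).
    """
    single = Counter()  # commits touching each method
    pair = Counter()    # commits touching each ordered pair of methods
    for changed in commits:
        methods = set(changed)
        single.update(methods)
        pair.update(permutations(sorted(methods), 2))

    rules = {}
    for (a, b), n_ab in pair.items():
        if n_ab / single[a] >= min_confidence:
            rules.setdefault(a, set()).add(b)
    return rules


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Toy change history: each commit is the set of methods it modified.
commits = [
    {"A.save", "B.validate"},
    {"A.save", "B.validate", "C.log"},
    {"A.save"},
    {"C.log"},
]
rules = mine_cochange_rules(commits, min_confidence=0.6)

# Assumed regression inputs: x = number of identified co-changing methods,
# y = lines of code that were co-changed (made-up values).
slope, intercept = fit_line([1, 2, 3], [10, 20, 30])
```

With this toy history, only A.save and B.validate clear the 0.6 threshold in both directions, so each is reported as co-changing with the other. In a real setting the fitted line would be trained on observed co-change volumes from the project history.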

Original language: English
Pages (from-to): 185-194
Number of pages: 10
Journal: Expert Systems With Applications
Volume: 45
Publication status: Published - 2016 Mar 1

All Science Journal Classification (ASJC) codes

  • General Engineering
  • Computer Science Applications
  • Artificial Intelligence
