Co-changing code volume prediction through association rule mining and linear regression model

Shin Jie Lee, Li Hsiang Lo, Yu Cheng Chen, Shi Min Shen

Research output: Contribution to journal › Article

2 Citations (Scopus)

Abstract

Code smells are symptoms in source code that indicate possible deeper problems and may serve as drivers for refactoring. Although effort has been devoted to identifying divergent changes and shotgun surgeries, little emphasis has been placed on predicting the volume of co-changing code associated with these code smells. More specifically, when a software developer intends to perform a particular modification task on a method, the predicted volume of code that will potentially be co-changed with that method is significant information for estimating the modification effort. In this paper, we propose an approach to predicting the volume of co-changing code affected by a method to be modified. The approach has two key features: (1) co-changing methods are identified, for detecting divergent changes and shotgun surgeries, based on association rules mined from change histories; and (2) the volume of co-changing code affected by a method to be modified is predicted through a fitted regression line with a t-test, based on the co-changing method identification results. The experimental results show that the success rate of co-changing method identification is 82% with a suggested threshold, and that the number of correct identifications is not affected by the growing number of commits as a project evolves. Additionally, the mean absolute error of the co-changing code volume predictions is 133 lines of code, which is 95.3% less than that of a naive approach.
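
The two steps named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's rule-mining algorithm, thresholds, or regression variables, so everything below is an assumption made for illustration: the function names, the support and confidence thresholds, the toy commit history, and the choice of regressing co-changed lines of code on the total size of the identified co-changing methods. It is not the authors' implementation.

```python
import math
from collections import Counter
from itertools import permutations


def mine_cochange_rules(commits, min_support=2, min_confidence=0.6):
    """Mine rules "a changes => b changes" from sets of co-committed methods.

    Thresholds are illustrative assumptions, not the paper's suggested values.
    """
    single = Counter()  # number of commits touching each method
    pair = Counter()    # number of commits touching each ordered pair (a, b)
    for changed in commits:
        methods = sorted(set(changed))
        single.update(methods)
        pair.update(permutations(methods, 2))
    rules = {}
    for (a, b), together in pair.items():
        confidence = together / single[a]
        if together >= min_support and confidence >= min_confidence:
            rules[(a, b)] = confidence
    return rules


def cochanging_methods(rules, method):
    """Methods predicted to change whenever `method` changes."""
    return sorted(b for (a, b) in rules if a == method)


def fit_line(xs, ys):
    """Least-squares line y = intercept + slope * x and the slope's t-statistic.

    Needs at least three points with distinct x values.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err = math.sqrt(sse / (n - 2) / sxx)
    t_stat = slope / std_err if std_err > 0 else math.inf
    return slope, intercept, t_stat


# Hypothetical change history: each set holds the methods touched by one commit.
commits = [
    {"A.save", "B.validate", "C.log"},
    {"A.save", "B.validate"},
    {"A.save", "C.log"},
    {"B.validate", "D.render"},
    {"A.save", "B.validate"},
]
rules = mine_cochange_rules(commits)
print(cochanging_methods(rules, "A.save"))

# Hypothetical training data: size (LOC) of identified co-changing methods
# versus the LOC that were actually co-changed.
identified_loc = [10, 20, 30, 40]
cochanged_loc = [24, 46, 64, 86]
slope, intercept, t_stat = fit_line(identified_loc, cochanged_loc)
print(intercept + slope * 60, t_stat)
```

In this sketch the t-statistic is compared against a critical value to decide whether the slope is significant before the fitted line is used for prediction; how the paper applies its t-test may differ.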

Original language: English
Pages (from-to): 185-194
Number of pages: 10
Journal: Expert Systems With Applications
Volume: 45
DOI: 10.1016/j.eswa.2015.09.023
Publication status: Published - 2016 Mar 1

Fingerprint

Association rules
Linear regression
Surgery

All Science Journal Classification (ASJC) codes

  • Engineering(all)
  • Computer Science Applications
  • Artificial Intelligence

Cite this

@article{d6aee7fbc54f4c6fbadb22611ac9547f,
title = "Co-changing code volume prediction through association rule mining and linear regression model",
abstract = "Code smells are symptoms in the source code that indicate possible deeper problems and may serve as drivers for code refactoring. Although effort has been made on identifying divergent changes and shotgun surgeries, little emphasis has been put on predicting the volume of co-changing code that appears in the code smells. More specifically, when a software developer intends to perform a particular modification task on a method, a predicted volume of code that will potentially be co-changed with the method could be considered as significant information for estimating the modification effort. In this paper, we propose an approach to predicting volume of co-changing code affected by a method to be modified. The approach has the following key features: co-changing methods can be identified for detecting divergent changes and shotgun surgeries based on association rules mined from change histories; and volume of co-changing code affected by a method to be modified can be predicted through a derived fitted regression line with t-test based on the co-changing methods identification results. The experimental results show that the success rate of co-changing methods identification is 82{\%} with a suggested threshold, and the numbers of correct identifications would not be influenced by the increasing number of commits as a project continuously evolves. Additionally, the mean absolute error of co-changing code volume predictions is 133 lines of code which is 95.3{\%} less than the one of a naive approach.",
author = "Lee, {Shin Jie} and Lo, {Li Hsiang} and Chen, {Yu Cheng} and Shen, {Shi Min}",
year = "2016",
month = "3",
day = "1",
doi = "10.1016/j.eswa.2015.09.023",
language = "English",
volume = "45",
pages = "185--194",
journal = "Expert Systems with Applications",
issn = "0957-4174",
publisher = "Elsevier Limited",

}


TY - JOUR
T1 - Co-changing code volume prediction through association rule mining and linear regression model
AU - Lee, Shin Jie
AU - Lo, Li Hsiang
AU - Chen, Yu Cheng
AU - Shen, Shi Min
PY - 2016/3/1
Y1 - 2016/3/1

N2 - Code smells are symptoms in the source code that indicate possible deeper problems and may serve as drivers for code refactoring. Although effort has been made on identifying divergent changes and shotgun surgeries, little emphasis has been put on predicting the volume of co-changing code that appears in the code smells. More specifically, when a software developer intends to perform a particular modification task on a method, a predicted volume of code that will potentially be co-changed with the method could be considered as significant information for estimating the modification effort. In this paper, we propose an approach to predicting volume of co-changing code affected by a method to be modified. The approach has the following key features: co-changing methods can be identified for detecting divergent changes and shotgun surgeries based on association rules mined from change histories; and volume of co-changing code affected by a method to be modified can be predicted through a derived fitted regression line with t-test based on the co-changing methods identification results. The experimental results show that the success rate of co-changing methods identification is 82% with a suggested threshold, and the numbers of correct identifications would not be influenced by the increasing number of commits as a project continuously evolves. Additionally, the mean absolute error of co-changing code volume predictions is 133 lines of code which is 95.3% less than the one of a naive approach.

UR - http://www.scopus.com/inward/record.url?scp=84945242383&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84945242383&partnerID=8YFLogxK
U2 - 10.1016/j.eswa.2015.09.023
DO - 10.1016/j.eswa.2015.09.023
M3 - Article
AN - SCOPUS:84945242383
VL - 45
SP - 185
EP - 194
JO - Expert Systems with Applications
JF - Expert Systems with Applications
SN - 0957-4174
ER -