Table 2. Continued

Columns: Study | Year | Topic | #Source projects (releases) | #Target projects (releases) | Languages in source and target projects | Key modeling components (challenges) covered: Privatize data / Homogenize features / Filter instances / Balance classes / Transform distributions / Select features / Target training data | Application scenario | Main performance indicators | Test data avail? / Against SSM? / Select for comp.?

Turhan et al. [105] | 2009 | Fine-grained prediction | 10 | 25 | C/C++/Java | Yes | Ranking | E2(70) | Not, No
Cruz et al. [10] | 2009 | Distribution transformation | 1 | 2 | Java | Yes, Yes | Classification | Z* | Not, No
Khoshgoftaar et al. [48] | 2009 | Multi-dataset classifier ensemble | 7 | 7 | C/C++/Java | Yes, Yes | Classification | NECM | Partial, Yes
Watanabe et al. [110] | 2008 | Applicability of cross languages | 2 | 2 | C++/Java | Yes | Classification | Recall, Precision | Not, No
Nagappan et al. [74] | 2006 | Utility of complexity metrics | 5 | 5 | C++/C# | Yes, Yes | Ranking | S | Not, No
Nagappan et al. [73] | 2006 | Utility of process/product metrics | 1 | 1 | C/C++ | Yes, Yes | Ranking | S | Not, No
Thongmak et al. [103] | 2003 | Utility of design metrics | 1 | 1 | N/A | Yes | Classification | Accuracy | Not, No
Briand et al. [9] | 2002 | Applicability of cross projects | 2 | 2 | Java | Yes | Both | Corr., Comp., CECM | Not, Yes, No

TL: transfer learning. CIL: class imbalance learning. SSM: simple size model. RCEC: the raw cost-effectiveness curve in a module-based Alberg diagram, in which the x-axis is the cumulative number of modules inspected and the y-axis is the cumulative number of defects found. CECM: the cost-effectiveness curve in a module-based Alberg diagram, in which the x-axis is the cumulative number of modules selected from the module ranking and the y-axis is the cumulative number of defects found in the selected modules.
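To make the cost-effectiveness curves described in the table notes concrete, the following is a minimal sketch of how the points of a module-based Alberg diagram can be computed, assuming the known defect count of each module is available; the function name alberg_curve and the example data are illustrative and are not taken from the surveyed studies. Modules are processed in the order given by a model's ranking, and the cumulative number of modules inspected is paired with the cumulative number of defects found.

```python
# Minimal sketch: cost-effectiveness curve for a module-based Alberg diagram.
# The x-axis is the cumulative number of modules inspected (in ranked order);
# the y-axis is the cumulative number of defects found in those modules.

from typing import List, Tuple


def alberg_curve(defects_in_ranked_order: List[int]) -> List[Tuple[int, int]]:
    """Return (cumulative modules inspected, cumulative defects found) points.

    `defects_in_ranked_order` holds the known defect count of each module,
    listed in the order in which the prediction model ranks the modules.
    """
    points = [(0, 0)]
    cumulative_defects = 0
    for inspected, defect_count in enumerate(defects_in_ranked_order, start=1):
        cumulative_defects += defect_count
        points.append((inspected, cumulative_defects))
    return points


# Example: five modules ranked by a hypothetical model.
print(alberg_curve([3, 0, 2, 1, 0]))
# [(0, 0), (1, 3), (2, 3), (3, 5), (4, 6), (5, 6)]
```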
CPDP models were more cost-effective than simple module size models. However, the statistical significance and effect sizes were not examined. Currently, for most of the existing CPDP models, it is unclear whether their prediction performance is superior to that of simple module size models.

These observations reveal that much effort has been devoted to developing supervised CPDP models. However, little effort has been devoted to examining whether they are superior to simple module size models. It is important for researchers and practitioners to know the answer to this question. If the magnitude of the difference were trivial, then simple module size models would be preferred in practice due to their low building and application cost. To answer this question, we next compare the prediction performance of the existing supervised CPDP models with that of simple module size models.

3 EXPERIMENTAL DESIGN

In this section, we first introduce the simple module size models under study. Then, we present the research questions relating simple module size models to the supervised CPDP models. After that, we describe the data analysis method used to investigate the research questions. Finally, we report the datasets used.

3.1 Simple Module Size Models

In this study, we leverage simple module size metrics such as SLOC in the target release to build simple module size models. As stated by Monden et al. [67], to adopt defect prediction models in industry, one needs to consider not only their prediction performance but also the significant cost required for metrics collection and for the modeling itself. A recent investigation from Google developers shows that a prerequisite for deploying a defect prediction model in a large company such as Google is that it must be able to scale to large source repositories [56]. Therefore, our study only considers simple models that have a low building cost, a low application cost, and good scalability. More specifically, we take into account the following two simple module size models: ManualDown and ManualUp. For simplicity of presentation, let m be a module in the target release, SizeMetric be a module size metric, and R(m) be the predicted risk value of the module m. Formally, the ManualDown model is R(m) = SizeMetric(m), while the ManualUp model is R(m) = 1/SizeMetric(m). For a given target release, ManualDown considers a larger module as more defect-prone, as many studies report that a larger module tends to have more defects [65]. In contrast, ManualUp considers a smaller module as more defect-prone, as recent studies argue that a smaller module is proportionally more defect-prone and hence should be inspected/tested first [49-51, 65].

Under ManualDown or ManualUp, it is possible that two modules have the same predicted risk value, i.e., they have a tied rank. In our study, if there is a tied rank according to the predicted risk values, the module with a lower defect count will be ranked higher; a small sketch of this scoring and tie-breaking scheme is given below. In this way, we obtain simple module size models that have the (theoretically) "worst" predictive performance. If the experimental results show that these "worst" simple module size models are competitive with the existing CPDP models, then we can safely conclude that, for practitioners, it would be better to apply simple module size models to predict defects in a target release. Note that Canfora et al. used TrivialInc and TrivialDec as the baseline models to investigate the performance of their proposed CPDP models under the ranking scenario [12]. Conceptually, TrivialInc is the same as ManualUp, while TrivialDec is the same as ManualDown. However, there are two important differences in the implementation. First, TrivialInc and TrivialDec are applied to the z-score normalized module size data, while ManualUp and ManualDown are applied to the raw, unprocessed module size data. Second, Canfora et al. did not report how tied ranks were handled in TrivialInc and TrivialDec.
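The following is a minimal sketch of ManualDown and ManualUp with the pessimistic tie-breaking described above; the Module fields, function names, and example data are illustrative assumptions rather than the implementation used in this study. Among modules with the same predicted risk value, the one with the fewer known defects is ranked (inspected) first, which yields the theoretically "worst" ranking for the simple size model.

```python
# Minimal sketch (not the authors' code): ManualDown/ManualUp risk scoring
# with worst-case tie-breaking on the known defect count.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Module:
    name: str
    sloc: int        # module size metric (e.g., SLOC)
    defects: int     # known defect count, used only for worst-case tie-breaking


def manual_down_risk(m: Module) -> float:
    # ManualDown: larger modules are considered more defect-prone.
    return float(m.sloc)


def manual_up_risk(m: Module) -> float:
    # ManualUp: smaller modules are considered (proportionally) more defect-prone.
    return 1.0 / m.sloc


def worst_case_ranking(modules: List[Module],
                       risk: Callable[[Module], float]) -> List[Module]:
    # Sort by descending risk; among tied risk values, put the module with the
    # lower defect count first, so the baseline's performance is pessimistic.
    return sorted(modules, key=lambda m: (-risk(m), m.defects))


modules = [Module("a.c", 1200, 2), Module("b.c", 300, 1),
           Module("c.c", 300, 0), Module("d.c", 50, 0)]
print([m.name for m in worst_case_ranking(modules, manual_down_risk)])
# ['a.c', 'c.c', 'b.c', 'd.c']  -- b.c and c.c tie on size; c.c (0 defects) first
print([m.name for m in worst_case_ranking(modules, manual_up_risk)])
# ['d.c', 'c.c', 'b.c', 'a.c']
```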