TY - GEN
T1 - CPC
T2 - 42nd ACM/IEEE International Conference on Software Engineering, ICSE 2020
AU - Zhai, Juan
AU - Xu, Xiangzhe
AU - Shi, Yu
AU - Tao, Guanhong
AU - Pan, Minxue
AU - Ma, Shiqing
AU - Xu, Lei
AU - Zhang, Weifeng
AU - Tan, Lin
AU - Zhang, Xiangyu
N1 - Funding Information:
We thank the anonymous reviewers for their constructive comments. This research was supported, in part by NSF-China 61802166, 61972193 and 61832009, DARPA FA8650-15-C-7562, NSF 1748764, 1901242 and 1910300, ONR N000141410468 and N000141712947, and Sandia National Lab under award 1701331. Any opinions, findings, and conclusions in this paper are those of the authors only and do not necessarily reflect the views of our sponsors.
Publisher Copyright:
© 2020 Association for Computing Machinery.
PY - 2020/6/27
Y1 - 2020/6/27
AB - Code comments provide abundant information that have been leveraged to help perform various software engineering tasks, such as bug detection, specification inference, and code synthesis. However, developers are less motivated to write and update comments, making it infeasible and error-prone to leverage comments to facilitate software engineering tasks. In this paper, we propose to leverage program analysis to systematically derive, refine, and propagate comments. For example, by propagation via program analysis, comments can be passed on to code entities that are not commented such that code bugs can be detected leveraging the propagated comments. Developers usually comment on different aspects of code elements like methods, and use comments to describe various contents, such as functionalities and properties. To more effectively utilize comments, a fine-grained and elaborated taxonomy of comments and a reliable classifier to automatically categorize a comment are needed. In this paper, we build a comprehensive taxonomy and propose using program analysis to propagate comments. We develop a prototype CPC, and evaluate it on 5 projects. The evaluation results demonstrate 41573 new comments can be derived by propagation from other code locations with 88% accuracy. Among them, we can derive precise functional comments for 87 native methods that have neither existing comments nor source code. Leveraging the propagated comments, we detect 37 new bugs in open source large projects, 30 of which have been confirmed and fixed by developers, and 304 defects in existing comments (by looking at inconsistencies between existing and propagated comments), including 12 incomplete comments and 292 wrong comments. This demonstrates the effectiveness of our approach. Our user study confirms propagated comments align well with existing comments in terms of quality.
UR - http://www.scopus.com/inward/record.url?scp=85094316824&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85094316824&partnerID=8YFLogxK
U2 - 10.1145/3377811.3380427
DO - 10.1145/3377811.3380427
M3 - Conference contribution
AN - SCOPUS:85094316824
T3 - Proceedings - International Conference on Software Engineering
SP - 1359
EP - 1371
BT - Proceedings - 2020 ACM/IEEE 42nd International Conference on Software Engineering, ICSE 2020
PB - IEEE Computer Society
Y2 - 27 June 2020 through 19 July 2020
ER -