1 Introduction
-
Contributing rich datasets. We created a SATD dataset containing 5,000 commit messages and 5,000 pull request sections from 103 Apache open-source projects. We manually tagged each item in this dataset as SATD (including its type) or non-SATD. We also created a large dataset containing 23.7M code comments, 1.3M commit messages, 0.3M pull requests, and 0.6M issues from the same 103 Apache open-source projects. We make both datasets publicly available to facilitate research in this area.
-
Proposing an approach (MT-Text-CNN) to identify four types of SATD from four sources. This approach is based on a convolutional neural network and leverages the multitask learning technique. The results indicate that our MT-Text-CNN approach achieves an average F1-score of 0.611 when identifying four types of SATD from the four aforementioned sources, outperforming all baseline methods by a large margin.
-
Summarizing lists of SATD keywords. SATD keywords for different types of SATD and for SATD from different sources are presented. The numbers of shared keywords between different sources are also calculated. The results show that issues and pull requests are the two most similar sources concerning the number of shared keywords, followed by commit messages, and finally by code comments.
-
Characterizing SATD from different sources in 103 open-source projects. The proposed MT-Text-CNN approach is utilized to identify SATD from 103 open-source projects. The number and percentage of different types of SATD are presented. The results indicate that SATD is evenly spread among different sources.
-
Investigating relations between SATD in different sources. We analyzed a sample of the identified SATD to explore the relations between SATD in different sources. The results show that there are four types of relations between SATD in different sources.
2 Related Work
2.1 Self-Admitted Technical Debt in Different Sources
2.2 Automatic Identification of Self-Admitted Technical Debt
3 Study Design
-
RQ1: How to accurately identify self-admitted technical debt from different sources?
Rationale: As explained in Section 1, a fair amount of research has focused on identifying SATD from source code comments (da Silva Maldonado et al. 2017; Huang et al. 2018; Ren et al. 2019; Wang et al. 2020). However, SATD in issues has hardly been explored (Dai and Kruchten 2017; Li et al. 2022b), while SATD identification in pull requests and commit messages has not been investigated before (Sierra et al. 2019). Moreover, there is a lack of integrated approaches to identify SATD from more than one source. This research question aims at proposing an approach for SATD identification in different sources with high accuracy.
-
RQ2: What are the most informative keywords to identify self-admitted technical debt in different sources?
Rationale: When admitting technical debt in different sources, software engineers potentially have distinct ways of expressing it. For example, developers often write ‘TODO’ or ‘Fixme’ when admitting technical debt in source code comments, but may not commonly use these terms in other sources. Understanding the SATD keywords for different sources can give insight into the differences and similarities between sources, and can help practitioners identify SATD in different sources using the summarized keywords. Furthermore, a recent study indicated that a keyword-based SATD identification method achieves similar or even superior performance on source code comments compared with existing approaches (Guo et al. 2021). Thus, the extracted keywords could be used to implement lightweight keyword-based approaches to identify SATD from other sources.
-
RQ3: How much and what types of self-admitted technical debt are documented in different sources?
Rationale: As mentioned above, software engineers can admit technical debt in different sources, and each technical debt item is of a particular type (e.g. design debt, test debt, code debt). Quantifying the different types of SATD can help us understand how SATD is distributed across sources and what proportion each type takes in each source. The types of SATD could help developers prioritize SATD; for example, if test debt has higher priority in a company, developers should spend more effort repaying this type in the sources where it is most commonly found. Furthermore, answering this question could help in understanding the advantages and disadvantages of different sources for SATD management. For example, if most documentation debt is admitted in issue tracking systems, developers can focus on monitoring documentation debt in issues to keep it under control. We ask this research question to explore the characteristics of SATD in different sources.
-
RQ4: What are the relations between self-admitted technical debt in different sources?
Rationale: Previous studies have reported that developers track technical debt using different sources (Zampetti et al. 2021), while different sources are used in different stages of software development (Aaron Stannard 2021; Monson-Haefel 2021; Akira Ajisaka 2021). There are likely interesting relations between SATD in different sources. An example of such a relation was revealed by Zampetti et al. (2018): SATD that was originally documented in code comments is sometimes reported as paid back in commit messages. Understanding the relations between SATD in different sources can help in understanding the rationale behind admitting technical debt in each of these sources. It can also facilitate SATD repayment by grouping related SATD items and solving them together (Li et al. 2022c). Finally, providing developers with such relations could give them more context to understand the background of the SATD or its possible solutions. For example, after discussing the SATD within issues, developers may choose to document it in code comments to be repaid in the future. When that time comes, developers can combine the information in the code comments and the discussions in the related issue to make an informed repayment decision.
3.1 Approach Overview
3.2 Data Collection
| | Min | Max | Mean | Median | Sum |
|---|---|---|---|---|---|
| # Issues | 526 | 24,938 | 3,905 | 1,773 | 573,965 |
| # Issue Comments | 856 | 303,608 | 21,144 | 5,010 | 3,065,815 |
| # Pull Requests | 507 | 32,072 | 3,035 | 1,608 | 312,591 |
| # Pull Comments | 239 | 636,518 | 24,542 | 8,374 | 2,527,803 |
| # Commits | 573 | 70,861 | 12,120 | 6,477 | 1,248,324 |
| # Code Comments | 326 | 3,894,056 | 229,705 | 80,211 | 23,659,650 |
3.3 Linking Data in Different Sources
3.4 Data Cleansing
3.5 Data Classification
| Type | Indicator | Definition |
|---|---|---|
| Arch. | Violation of modularity | Because shortcuts were taken, multiple modules became inter-dependent, while they should be independent. |
| | Using obsolete technology | Architecturally-significant technology has become obsolete. |
| Build | Over- or under-declared dependencies | Under-declared dependencies: dependencies in upstream libraries are not declared and rely on dependencies in lower-level libraries. Over-declared dependencies: unneeded dependencies are declared. |
| | Poor deployment practice | The quality of deployment is low, e.g., compile flags or build targets are not well organized. |
| Code | Complex code | Code has accidental complexity and requires extra refactoring to reduce this complexity. |
| | Dead code | Code is no longer used and needs to be removed. |
| | Duplicated code | Code occurs more than once instead of as a single reusable function. |
| | Low-quality code | Code quality is low, for example, because it is unreadable, inconsistent, or violates coding conventions. |
| | Multi-thread correctness | Code that should be thread-safe is not, potentially resulting in synchronization or efficiency problems. |
| | Slow algorithm | A non-optimal algorithm is used that runs slowly. |
| Defect | Uncorrected known defects | Defects are found by developers but ignored or deferred. |
| Design | Non-optimal decisions | Non-optimal design decisions are adopted. |
| Doc. | Low-quality documentation | The documentation has been updated to reflect changes in the system, but the quality of the updated documentation is low. |
| | Outdated documentation | A function or class is added, removed, or modified in the system, but the documentation has not been updated to reflect the change. |
| Req. | Requirements partially implemented | Requirements are implemented, but some are not fully implemented. |
| | Non-functional requirements not fully satisfied | Non-functional requirements (e.g. availability, capacity, concurrency, extensibility), as described by scenarios, are not fully satisfied. |
| Test | Expensive tests | Tests are expensive, slowing down testing activities; extra refactoring is needed to simplify them. |
| | Flaky tests | Tests fail or pass intermittently for the same configuration. |
| | Lack of tests | A function is added, but no tests are added to cover it. |
| | Low coverage | Only part of the source code is executed during testing. |
| Type of SATD | Code Comment | Issue Section | Pull Section | Commit Message |
|---|---|---|---|---|
| Code/Design Debt | 2,703 | 2,169 | 510 | 522 |
| Documentation Debt | 54 | 487 | 101 | 98 |
| Test Debt | 85 | 338 | 68 | 58 |
| Requirement Debt | 757 | 97 | 20 | 27 |
| Other | 58,676 | 20,089 | 4,301 | 4,295 |
3.6 Data Analysis
3.6.1 Machine Learning Models
-
Traditional machine learning approaches (LR, SVM, RF): To illustrate the effectiveness of our approach, we compare it with three prevalent traditional machine learning algorithms, namely Logistic Regression (LR) (Genkin et al. 2007), Support Vector Machine (SVM) (Sun et al. 2009), and Random Forest (RF) (Breiman 2001). We use TF-IDF to vectorize the input data and train these three traditional classifiers using the implementation in Sklearn with default settings (a minimal sketch follows this list).
-
Text Convolutional Neural Network (Text-CNN): Text-CNN is a state-of-the-art text classification algorithm proposed by Kim (2014), which has been used in several SATD identification studies (Ren et al. 2019; Li et al. 2022b). We detail this approach here, as it is necessary background for understanding the differences between Text-CNN and MT-Text-CNN. The architecture of Text-CNN is shown in Fig. 4. As can be seen, Text-CNN consists of five layers, namely the embedding layer, convolutional layer, max-pooling layer, concatenation layer, and output layer.
-
Embedding layer: The first layer converts the tokenized input sentence (of length n) into a matrix of size n × k using a k-dimensional word embedding (see Section 3.6.3). For example, in Fig. 4 the input sentence is ‘document should be updated to reflect this’, which is transformed into a 7 × 5 matrix, as the sentence contains 7 words and the word embedding dimensionality equals 5.
-
Convolutional layer: The fundamental layer of the CNN, which performs convolution operations to extract high-level features from the sentence matrix. A convolution operation applies a filter: a matrix with the same width as the sentence matrix (i.e., k) and a variable height, referred to as the region size. A filter with region size h is applied to a window of h words to generate a new feature. Thus, sliding a filter with region size h over the whole sentence matrix produces a feature map of size n − h + 1. For instance, in Fig. 4, when the model has filters with region sizes 1, 2, and 3, the produced feature maps have sizes 7, 6, and 5 respectively.
-
Max-pooling layer: It is a layer that calculates the maximum value of each feature map to reduce the spatial size of the representation.
-
Concatenation layer: It is a layer that concatenates the scalar features to form the penultimate layer.
-
Output layer: The last layer computes the probability that the input text is SATD text. Because Text-CNN is designed for a single task, it performs a linear transformation of the features from the previous layer by Y = W ⋅ X + B, where W and B denote the weight matrix and the bias. The length of Y equals the number of classes. Then the softmax function is applied to Y to calculate the probability of the input text belonging to each class. For example, in Fig. 4 there are two classes: SATD text and non-SATD text. In this work, because we focus on identifying different types of SATD or non-SATD, we have five classes: four types of SATD text plus non-SATD text.
-
Multitask Text Convolutional Neural Network (MT-Text-CNN): Although SATD in different sources has substantial similarities, there are still significant differences among the sources (Li et al. 2022b). This could lower the accuracy of Text-CNN when detecting SATD from multiple sources, as the criteria for SATD identification differ slightly per source. Thus, we propose the MT-Text-CNN approach to accurately identify SATD from different sources. The architecture of MT-Text-CNN is illustrated in Fig. 4. Apart from the output layer, all layers are identical to those of Text-CNN. Inspired by the work of Liu et al. (2015), we create a task-specific output layer for each task, which also performs a linear transformation of the features from the previous layer by Y(t) = W(t) ⋅ X + B(t), where t denotes the task (i.e., identifying SATD from a specific source). Then the softmax function is applied to Y(t) to calculate the probability of the input text belonging to each class for task t.
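The traditional baselines above amount to only a few lines with Sklearn. The following is a minimal sketch under the stated setup (TF-IDF vectorization, default classifier settings); the function and variable names are placeholders of ours:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.ensemble import RandomForestClassifier

def train_traditional(train_texts, train_labels):
    """Vectorize texts with TF-IDF, then fit LR, SVM, and RF with defaults."""
    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(train_texts)
    classifiers = {
        "LR": LogisticRegression(),
        "SVM": SVC(),
        "RF": RandomForestClassifier(),
    }
    for clf in classifiers.values():
        clf.fit(features, train_labels)
    return vectorizer, classifiers
```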
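To make the architecture concrete, here is a minimal sketch of MT-Text-CNN; the use of PyTorch and the class name are our assumptions, while the default region sizes and number of filters follow the tuning results reported in Section 4.1.2:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MTTextCNN(nn.Module):
    """Shared embedding, convolutional, max-pooling, and concatenation
    layers, plus one task-specific output layer per source (task)."""

    def __init__(self, vocab_size, embed_dim=300, region_sizes=(1, 2, 3, 4, 5),
                 num_filters=200, num_classes=5, num_tasks=4):
        super().__init__()
        # Embedding layer: token ids -> n x k sentence matrix
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # One filter bank per region size h; each filter spans h words
        # and the full embedding width k
        self.convs = nn.ModuleList(
            [nn.Conv2d(1, num_filters, (h, embed_dim)) for h in region_sizes])
        # Task-specific output layers: Y(t) = W(t) . X + B(t)
        self.heads = nn.ModuleList(
            [nn.Linear(num_filters * len(region_sizes), num_classes)
             for _ in range(num_tasks)])

    def forward(self, token_ids, task):
        x = self.embedding(token_ids).unsqueeze(1)   # (batch, 1, n, k)
        # Each filter bank yields feature maps of length n - h + 1;
        # max-pooling keeps one scalar per filter
        pooled = [F.relu(conv(x)).squeeze(3).max(dim=2).values
                  for conv in self.convs]
        features = torch.cat(pooled, dim=1)          # concatenation layer
        # Return logits; softmax over them gives the class probabilities
        return self.heads[task](features)
```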
3.6.2 Baseline Approaches
-
Random Classifier (Random): This baseline classifies text as SATD randomly, according to the prior probability of text being SATD. For instance, if the dataset contains 1,000 pieces of SATD text out of 10,000 pieces of text, this approach assumes the probability of new text being SATD text is 1000/10000 = 10%. It then randomly classifies any text as SATD text with the calculated probability (10%).
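For illustration, a minimal sketch of this baseline (function and variable names are ours):

```python
import random

def random_classifier(texts, satd_probability):
    """Label each text as SATD with the probability observed in the
    training data, e.g. 1,000 / 10,000 = 0.1."""
    return ["SATD" if random.random() < satd_probability else "non-SATD"
            for _ in texts]
```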
3.6.3 Word Embedding
3.6.4 Model Evaluation Approach
3.6.5 Multitask Network Training Procedure
-
Randomly pick a task.
-
Get a random training sample for this task.
-
Train the machine learning model using the sample.
-
Go to the first step.
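A minimal sketch of this procedure, reusing the hypothetical MTTextCNN class sketched in Section 3.6.1 (the optimizer choice is our assumption):

```python
import random
import torch

def train_multitask(model, task_datasets, steps=10000):
    """task_datasets[t] holds (token_ids, label) pairs for task t, where
    token_ids is a 1-D LongTensor and label a 0-dim LongTensor."""
    optimizer = torch.optim.Adam(model.parameters())
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(steps):
        task = random.randrange(len(task_datasets))            # 1. pick a task
        token_ids, label = random.choice(task_datasets[task])  # 2. random sample
        logits = model(token_ids.unsqueeze(0), task)           # 3. train on it
        loss = loss_fn(logits, label.unsqueeze(0))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                                       # 4. next step
```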
3.6.6 Strategies for Handling Imbalanced Data
3.6.7 Evaluation Metrics
3.6.8 Keyword Extraction
3.6.9 SATD Similarity Calculation
4 Results
4.1 (RQ1) How to Accurately Identify Self-Admitted Technical Debt from Different Sources?
4.1.1 Comparing the Predictive Performance of Different Classifiers
| Classifier | Type of SATD | Comment | Commit | Pull | Issue | Avg. | Avg. Imp. over Random |
|---|---|---|---|---|---|---|---|
| Deep Learning | | | | | | | |
| Text-CNN | C/D. | 0.665 | 0.485 | 0.515 | 0.461 | 0.531 | 7.8× |
| | DOC. | 0.526 | 0.632 | 0.484 | 0.456 | 0.524 | 52.4× |
| | TST. | 0.443 | 0.469 | 0.507 | 0.463 | 0.471 | 157.0× |
| | REQ. | 0.566 | 0.217 | 0.299 | 0.343 | 0.356 | 71.2× |
| | AVG. | 0.550 | 0.451 | 0.451 | 0.431 | 0.471 | 22.4× |
| MT-Text-CNN | C/D. | 0.725 | 0.536 | 0.539 | 0.486 | 0.571 | 8.4× |
| | DOC. | 0.626 | 0.659 | 0.441 | 0.457 | 0.546 | 54.6× |
| | TST. | 0.540 | 0.449 | 0.461 | 0.432 | 0.470 | 159.7× |
| | REQ. | 0.585 | 0.255 | 0.325 | 0.437 | 0.400 | 80.0× |
| | AVG. | 0.619 | 0.475 | 0.441 | 0.453 | 0.497 | 23.7× |
| Traditional Machine Learning | | | | | | | |
| LR | C/D. | 0.613 | 0.327 | 0.457 | 0.353 | 0.438 | 6.4× |
| | DOC. | 0.352 | 0.556 | 0.281 | 0.235 | 0.356 | 35.6× |
| | TST. | 0.245 | 0.129 | 0.206 | 0.228 | 0.202 | 67.3× |
| | REQ. | 0.389 | 0.000 | 0.000 | 0.019 | 0.102 | 20.4× |
| | AVG. | 0.400 | 0.253 | 0.236 | 0.208 | 0.274 | 13.0× |
| SVM | C/D. | 0.400 | 0.051 | 0.085 | 0.008 | 0.136 | 2.0× |
| | DOC. | 0.085 | 0.566 | 0.202 | 0.094 | 0.237 | 23.7× |
| | TST. | 0.000 | 0.074 | 0.038 | 0.067 | 0.045 | 15.0× |
| | REQ. | 0.200 | 0.000 | 0.000 | 0.000 | 0.050 | 10.0× |
| | AVG. | 0.171 | 0.173 | 0.081 | 0.042 | 0.117 | 5.6× |
| RF | C/D. | 0.600 | 0.199 | 0.095 | 0.065 | 0.240 | 3.5× |
| | DOC. | 0.500 | 0.630 | 0.240 | 0.092 | 0.366 | 36.6× |
| | TST. | 0.289 | 0.124 | 0.101 | 0.119 | 0.158 | 52.7× |
| | REQ. | 0.584 | 0.000 | 0.000 | 0.056 | 0.160 | 32.0× |
| | AVG. | 0.494 | 0.238 | 0.109 | 0.083 | 0.231 | 11.0× |
| Baseline | | | | | | | |
| Random | C/D. | 0.053 | 0.071 | 0.071 | 0.076 | 0.068 | |
| | DOC. | 0.001 | 0.003 | 0.021 | 0.013 | 0.010 | |
| | TST. | 0.002 | 0.004 | 0.000 | 0.005 | 0.003 | |
| | REQ. | 0.009 | 0.000 | 0.004 | 0.005 | 0.005 | |
| | AVG. | 0.016 | 0.020 | 0.024 | 0.025 | 0.021 | |
| Approach | Text-CNN | LR | SVM | RF | Random |
|---|---|---|---|---|---|
| p-value | 3.62e-05 | 5.54e-59 | 1.37e-53 | 2.19e-32 | 2.49e-104 |
| Cliff’s delta | 0.44 (large) | 1.0 (large) | 1.0 (large) | 1.0 (large) | 1.0 (large) |
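For reference, the Cliff's delta values in the table can be computed from two samples of F1-scores as follows (a sketch of the standard definition; the paper's exact implementation and the statistical test behind the p-values are not shown here):

```python
def cliffs_delta(xs, ys):
    """(#{x > y} - #{x < y}) / (|xs| * |ys|); a value of 1.0 means every
    score in xs exceeds every score in ys (as for MT-Text-CNN vs. LR)."""
    greater = sum(x > y for x in xs for y in ys)
    less = sum(x < y for x in xs for y in ys)
    return (greater - less) / (len(xs) * len(ys))
```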
4.1.2 Improving the MT-Text-CNN Approach
| Word Embedding | Dimensions | Comment | Commit | Pull | Issue | Avg. |
|---|---|---|---|---|---|---|
| Random (non-static) | 300 | 0.619 | 0.475 | 0.441 | 0.453 | 0.497 |
| Trained (non-static) | 300 | 0.612 | 0.521 | 0.496 | 0.469 | 0.524 |
| Trained (static) | 300 | 0.652 | 0.564 | 0.470 | 0.509 | 0.549 |
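The three settings differ only in how the embedding layer is initialized and whether it is updated during training; a minimal PyTorch sketch (function name and defaults are ours):

```python
import torch.nn as nn

def build_embedding(pretrained=None, vocab_size=50000, embed_dim=300,
                    static=False):
    """Random (non-static): random init, updated during training.
    Trained (non-static): pretrained vectors, fine-tuned during training.
    Trained (static): pretrained vectors, frozen (static=True)."""
    if pretrained is None:
        return nn.Embedding(vocab_size, embed_dim)
    return nn.Embedding.from_pretrained(pretrained, freeze=static)
```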
| Loss Strategy | Comment | Commit | Pull | Issue | Avg. |
|---|---|---|---|---|---|
| Default | 0.652 | 0.564 | 0.470 | 0.509 | 0.549 |
| Weighted loss | 0.642 | 0.612 | 0.573 | 0.544 | 0.593 |
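The weighted-loss strategy scales each class's contribution to the cross-entropy loss. The sketch below assumes inverse-frequency weights, since the exact weighting scheme is not reproduced here:

```python
import torch

def weighted_cross_entropy(class_counts):
    """Weight classes inversely to their frequency so that the rare SATD
    classes (e.g. requirement debt) outweigh the dominant non-SATD class."""
    counts = torch.tensor(class_counts, dtype=torch.float)
    weights = counts.sum() / (len(counts) * counts)
    return torch.nn.CrossEntropyLoss(weight=weights)
```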
| Region Size | Comment | Commit | Pull | Issue | Avg. |
|---|---|---|---|---|---|
| Single | | | | | |
| (1) | 0.553 | 0.551 | 0.513 | 0.454 | 0.518 |
| (3) | 0.602 | 0.643 | 0.541 | 0.523 | 0.577 |
| (5) | 0.593 | 0.582 | 0.518 | 0.491 | 0.546 |
| (7) | 0.567 | 0.513 | 0.477 | 0.451 | 0.502 |
| Multiple | | | | | |
| (1,2,3) | 0.632 | 0.647 | 0.596 | 0.596 | 0.606 |
| (2,3,4) | 0.644 | 0.645 | 0.573 | 0.553 | 0.604 |
| (3,4,5) | 0.642 | 0.612 | 0.573 | 0.544 | 0.593 |
| (1,2,3,4) | 0.662 | 0.640 | 0.574 | 0.557 | 0.608 |
| (1,3,5,7) | 0.652 | 0.631 | 0.559 | 0.541 | 0.596 |
| (2,4,6,8) | 0.644 | 0.612 | 0.571 | 0.542 | 0.592 |
| (1,2,3,4,5) | 0.656 | 0.642 | 0.581 | 0.555 | 0.609 |
| (1,2,3,4,5,6) | 0.664 | 0.626 | 0.574 | 0.559 | 0.606 |
| (1,2,3,4,5,6,7) | 0.662 | 0.615 | 0.576 | 0.558 | 0.603 |
| Number of Features | Comment | Commit | Pull | Issue | Avg. |
|---|---|---|---|---|---|
| 50 | 0.645 | 0.643 | 0.563 | 0.551 | 0.601 |
| 100 | 0.656 | 0.642 | 0.581 | 0.555 | 0.609 |
| 200 | 0.666 | 0.644 | 0.578 | 0.557 | 0.611 |
| 400 | 0.650 | 0.638 | 0.558 | 0.558 | 0.601 |
| 800 | 0.634 | 0.639 | 0.546 | 0.550 | 0.592 |
4.2 (RQ2) What Are the Most Informative Keywords to Identify Self-Admitted Technical Debt in Different Sources?
Comment | Commit | Pull | Issue |
---|---|---|---|
hack | typo | nit | typo |
todo | unused | typo | leak |
workaround | unnecessary | unnecessary | flaky |
defer argument checking | cleanup | redundant | unnecessary |
fixme | simplify | simplify | performance |
not needed | leak | flaky | checkstyle |
implement | flaky | unused | spelling |
this needs an extra | redundant | confusing | unused |
better | style | cleanup | cleanup |
efficient | polished | better | coverage |
| Code/Design Debt | Documentation Debt |
|---|---|
| unnecessary | typo |
| nit | spelling |
| leak | function needs documentation |
| unused | todo document |
| cleanup | missing license |
| simplify | document why |
| redundant | improve tutorial |
| performance | add some javdoc |
| checkstyle | add a comment |
| confusing | more documentation |

| Test Debt | Requirement Debt |
|---|---|
| flaky | not implemented |
| coverage | not thread-safe |
| flakiness | todo |
| todo test | work in progress |
| more tests | yet implemented |
| add tests | hasn’t implemented |
| temporary test code | isn’t thread safe |
| haven’t tested | not safe |
| add a test | doesn’t support |
| missing tests | isn’t implemented |
| not tested | not supported |
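As noted in the rationale for RQ2, such keyword lists can support a lightweight identification approach. A minimal sketch follows, with the keyword lists abbreviated from the tables above; the substring-matching strategy is our assumption:

```python
# Abbreviated from the per-source keyword table above; extend as needed.
SATD_KEYWORDS = {
    "comment": ["hack", "todo", "workaround", "fixme"],
    "commit": ["typo", "unused", "unnecessary", "cleanup"],
    "pull": ["nit", "typo", "unnecessary", "redundant"],
    "issue": ["typo", "leak", "flaky", "unnecessary"],
}

def keyword_based_satd(text, source):
    """Flag text as SATD if it contains any summarized keyword for the
    given source (case-insensitive substring match)."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in SATD_KEYWORDS[source])
```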
4.3 (RQ3) How Much and What Types of Self-Admitted Technical Debt Are Documented in Different Sources?
| Source | Total # | Code/Design # (%) | Doc. # (%) | Test # (%) | Req. # (%) | All SATD # (%) |
|---|---|---|---|---|---|---|
| Comment | 9,747,914 | 411,060 (4.2%) | 21,817 (0.2%) | 16,152 (0.2%) | 61,256 (0.6%) | 510,285 (5.2%) |
| Commit | 917,010 | 76,074 (8.3%) | 20,107 (2.2%) | 6,689 (0.7%) | 1,127 (0.1%) | 103,997 (11.3%) |
| Pull | 2,925,540 | 335,005 (11.5%) | 61,452 (2.1%) | 36,575 (1.3%) | 5,667 (0.2%) | 438,699 (15.0%) |
| Issue | 3,511,125 | 366,219 (10.4%) | 50,752 (1.4%) | 36,499 (1.0%) | 4,470 (0.1%) | 457,940 (13.0%) |
| Type of SATD | Example |
|---|---|
| Code/Design | “Oh I didn’t realize we got duplicated logic. We need to refactor this.” - [from Superset-pull-request-6831] |
| | “Need to add better handling for hz instance cleanup.” - [from Camel-jira-issue-10563] |
| | “Some new, friendlier APIs may be called for.” - [from Druid-github-issue-5940] |
| Documentation | “Could you also please document the meaning of the various metrics” - [from Spark-pull-request-6905] |
| | “I think we should document this” - [from Accumulo-jira-issue-1905] |
| | “Currently, the api docs are missing from our website.” - [from Mxnet-github-issue-6648] |
| Test | “It’d be good to add some usages of DurationGranularity to the query tests” - [from Druid-github-issue-3994] |
| | “I did another cycle of review the unit tests, sorry I still not see value in denial-of-service tests?” - [from Zookeeper-pull-request-689] |
| | “I would like to have at least a simple testcase around the UseV2WireProtocol feature” - [from Bookkeeper-github-issue-272] |
| Requirement | “TODO: add a dynamic context in front of every selector with a traversal” - [from Heron-code-comment] |
| | “Remaining todo list for SQL parse module...” - [from Pinot-github-issue-2505] |
| | “Union is not supported yet. But i might be adding that capability quite soon.” - [from Samza-pull-request-295] |
“[RFC][Quantization] Support quantized models from TensorflowLite... 1. Support TFLite FP32 Relay frontend. PR: #2365 2. Support TFLite INT8 Relay frontend 3. Extend the attribute of the convolution and related ops to support quantization 4. Auto-TVM on ARM CPU can work with INT8...” - [from Tvm-github-issue-2351]
“[TFLite] Support TFLite FP32 Relay frontend. This is the first PR of #2351 to support importing exist quantized int8 TFLite model. The base version of Tensorflow / TFLite is 1.12.” - [from Tvm-pull-request-2365]
“[TFLite] Support TFLite FP32 Relay frontend. (#2365) - Support TFLite FP32 Relay frontend. - Fix lint issue - Remove unnecessary variables and packages...” - [from Tvm-commit-10df78a]
“# add more if we need target shapes in future” - [from Tvm-code-comment-10df78a]
Contribution Flow | Abbr. | # | % |
---|---|---|---|
Issue → Pull(s) → Commit(s) → Comment(s) | IPCC | 81,940 | 8.5 |
Issue → Commit(s) → Comment(s) | ICC | 182,406 | 18.9 |
Pull → Commit(s) → Comment(s) | PCC | 109,621 | 11.3 |
Commit → Comment(s) | CC | 593,015 | 61.3 |
4.4 (RQ4) What Are the Relations Between Self-Admitted Technical Debt in Different Sources?
| Pair of Sources | | Number | Total |
|---|---|---|---|
| Code Comment ↔ | Commit | 482 | 3,747 |
| | Pull Request | 989 | |
| | Issue | 2,276 | |
| Commit ↔ | Code Comment | 482 | 3,564 |
| | Pull Request | 1,746 | |
| | Issue | 1,336 | |
| Pull Request ↔ | Code Comment | 989 | 3,829 |
| | Commit | 1,746 | |
| | Issue | 1,094 | |
| Issue ↔ | Code Comment | 2,276 | 4,706 |
| | Commit | 1,336 | |
| | Pull Request | 1,094 | |
-
Documenting existing SATD in additional sources. We found that developers document already existing SATD in other sources for two different reasons. As shown in Fig. 1, when developers identify technical debt and discuss it in issues or pull requests, and choose not to fix it immediately, they may document it in code comments or commit messages as a reminder to repay it in the future. For example, a developer agreed to improve functionality, but not immediately. They then commented in the pull request:

“...to improve the read throughput, creating new watcher bit and adding it to the BitHashSet has its own lock to minimize the lock scope. I’ll add some comments here.” - [from Zookeeper-pull-590]

Subsequently, they created a code comment to point out the issue that needs to be resolved:

“// Need readLock to exclusively lock with removeWatcher, otherwise we may add a dead watch whose connection was just closed. Creating new watcher bit and adding it to the BitHashSet has it’s own lock to minimize the write lock scope.” - [from Zookeeper-code-comment]

A second case arises when developers report technical debt in issues and decide to solve it with pull requests; they often create a new pull request using the same title or description as the issue to describe the existing SATD. For example, a developer created an issue to solve a legacy code problem:

“Cleanup the legacy cluster mode.” - [from Tajo-issue-1482]

After discussion, developers chose to create a pull request to pay back the debt:

“TAJO-1482: Cleanup the legacy cluster mode.” - [from Tajo-pull-484]
-
Discussing the solution of SATD in other sources. When technical debt is reported in issues, developers may choose to create a pull request to discuss detailed solutions for it (see Fig. 1). For example, a developer reported a problem with mixing public and private headers by creating an issue:

“Some public headers include private headers. Some public headers include items that do not need to be included.” - [from Geode-issue-4151]

After that, they described the details of this technical debt and discussed the solutions in a pull request:

“I found that early on we had mixed up the include paths in the CMake project so we were able to include private headers from the public headers. This will cause anyone trying to build a client to have a frustrating time since public won’t be able to find private headers...” - [from Geode-pull-173]
-
Documenting the repayment of SATD in other sources. When SATD is paid back, this repayment is sometimes documented in other sources. As we can see in Fig. 1, when SATD is resolved after discussing it in issues or pull requests, developers could document its repayment in commit messages or code comments. For example, a software engineer found that error messages were too general and reported this in an issue:

“To make troubleshooting easier I think that a more fine grained error handling could provide the user with a better view of what the underlying error really is.” - [from Camel-issue-9549]

When the error messages were improved, the engineer reported the SATD repayment in the commit message:

“CAMEL-9549 - Improvement of error messages when compiling the schema.” - [from Camel-commit-dc3bb68]

It is also common to document SATD repayment in source code comments. For example, a software engineer reported a code duplication problem by creating a Jira issue ticket:

“...a lot of functionality is shared between Followers and Observers. To avoid copying code, it makes sense to push the common code into a parent Peer class and specialise it for Followers and Observers.” - [from Zookeeper-issue-549]

When this technical debt was solved, the engineer added an explanation in the code comments for this SATD repayment:

“// This class is the superclass of two of the three main actors in a ZK ensemble: Followers and Observers. Both Followers and Observers share a good deal of code which is moved into Peer to avoid duplication.” - [from Zookeeper-code-comment]
-
Paying back documentation debt in code comments. This is a special case of the previous relation. Because code comments are a kind of documentation, some documentation debt can be paid back by adding comments or Javadoc in the source code. When documentation debt is reported in issues, developers might pay back the debt directly by writing code comments (see Fig. 1). For example, a developer found that documentation was incomplete:

“If the assumption is that both the buffers should be of same length, please document it.” - [from Pinot-pull-2983]

Subsequently, they updated the source code comments to resolve this debt:

“// NOTE: we don’t check whether the array is null or the length of the array for performance concern. All the dimension buffers should have the same length.” - [from Pinot-code-comment]
5 Discussion
5.1 Automatic Identification of Different SATD Types in Multiple Sources
5.2 Self-Admitting Technical Debt in Different Sources
“...added this logic to make it easier for the FE (you can see it in the ‘create‘ logic already), by not requiring us to stringify our json beforehand, which I’m fine with. Do you see it as being an issue in the long run?” - [from Superset-pull-11770]
“// Need better logic for this” - [from Superset-code-comment]
“yes, agreed, it’s a typo...” - [from Drill-pull-602]
“Fix typo” - [from Drill-commit-c77f13c]