ABSTRACT
Software readability is the property that determines how easily a given piece of code can be read and understood. Because readability affects maintainability and overall quality, programmers care deeply about the readability of their code. If automatic readability checkers could be built, they could be integrated into development tool-chains and continually inform developers about the readability of their code. Unfortunately, readability is a subjective code property and is not amenable to direct automated measurement. In a recently published study, Buse et al. asked 100 participants to rate code snippets by readability, yielding arguably reliable mean readability scores for each snippet; they then built a fairly complex predictive model of these mean scores from a large, diverse set of directly measurable source-code properties. We build on this work: we present a simple, intuitive theory of readability, based on size and code entropy, and show that this theory leads to a much sparser, yet statistically significant, model of the mean readability scores produced in Buse's study. Our model uses well-known size metrics and Halstead metrics, which are easily extracted using a variety of tools. We argue that this yields a more theoretically well-founded and practically usable approach to readability measurement.
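The two quantities the abstract's model rests on, Halstead volume (a size measure) and token-level entropy, are both computable from raw token counts. The following is a minimal illustrative sketch, not the paper's actual extraction tooling: it uses a deliberately naive regular-expression tokenizer, and the helper names (`token_stats`, `halstead_volume`, `token_entropy`) are our own for illustration.

```python
import math
import re
from collections import Counter

def token_stats(code: str):
    """Naively split code into operators and operands (illustrative only;
    real tools use a proper lexer for the target language)."""
    operands = re.findall(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\"[^\"]*\"", code)
    operators = re.findall(r"[-+*/%=<>!&|^~]+|[(){}\[\];,.]", code)
    return operators, operands

def halstead_volume(code: str) -> float:
    """Halstead volume V = N * log2(n), where N is total token count
    (program length) and n is the number of distinct tokens (vocabulary)."""
    ops, opnds = token_stats(code)
    N = len(ops) + len(opnds)
    n = len(set(ops)) + len(set(opnds))
    return N * math.log2(n) if n > 1 else 0.0

def token_entropy(code: str) -> float:
    """Shannon entropy (bits) of the empirical token distribution."""
    ops, opnds = token_stats(code)
    counts = Counter(ops + opnds)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

snippet = "total = total + price * quantity"
print(round(halstead_volume(snippet), 2))  # 7 tokens, 6 distinct -> 18.09
print(round(token_entropy(snippet), 2))    # -> 2.52
```

Under this theory, a longer snippet or one drawing on a larger, more uniformly used token vocabulary scores higher on both measures, and hence would be predicted to be less readable.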
REFERENCES
- The Zen of Python. http://www.python.org/dev/peps/pep-0020/. [Online; accessed 31-January-2011].
- K. Aggarwal, Y. Singh, and J. Chhabra. An integrated measure of software maintainability. In Annual Reliability and Maintainability Symposium, pages 235--241. IEEE, 2002.
- R. M. Baecker and A. Marcus. Human Factors and Typography for More Readable Programs. ACM, New York, NY, USA, 1989.
- J. Börstler, M. Caspersen, and M. Nordström. Beauty and the Beast: Toward a Measurement Framework for Example Program Quality. Department of Computing Science, Umeå University, 2008.
- L. Briand and J. Wüst. Empirical studies of quality models in object-oriented systems. Advances in Computers, 56:97--166, 2002.
- R. Buse and W. Weimer. Learning a metric for code readability. IEEE Transactions on Software Engineering, 36(4):546--558, 2010.
- S. Butler, M. Wermelinger, Y. Yu, and H. Sharp. Relating identifier naming flaws and code quality: An empirical study. In 16th Working Conference on Reverse Engineering, pages 31--35. IEEE, 2009.
- S. Butler, M. Wermelinger, Y. Yu, and H. Sharp. Exploring the influence of identifier names on code quality: An empirical study. In 14th European Conference on Software Maintenance and Reengineering, pages 159--168, March 2010.
- J. Cohen. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Lawrence Erlbaum, 2003.
- D. Coleman, D. Ash, B. Lowther, and P. Oman. Using metrics to evaluate software system maintainability. Computer, 27(8):44--49, 2002.
- N. Coulter. Software science and cognitive psychology. IEEE Transactions on Software Engineering, pages 166--171, 1983.
- S. Dahiya, J. Chhabra, and S. Kumar. Use of genetic algorithm for software maintainability metrics' conditioning. In International Conference on Advanced Computing and Communications (ADCOM 2007), pages 87--92. IEEE, 2008.
- F. Détienne and F. Bott. Software Design: Cognitive Aspects. Springer Verlag, 2002.
- K. El Emam, S. Benlarbi, N. Goel, and S. Rai. The confounding effect of class size on the validity of object-oriented metrics. IEEE Transactions on Software Engineering, 27(7):630--650, 2002.
- J. Elshoff and M. Marcotty. Improving computer program readability to aid modification. Communications of the ACM, 25(8):512--521, 1982.
- L. Etzkorn, S. Gholston, and W. Hughes Jr. A semantic entropy metric. Journal of Software Maintenance and Evolution: Research and Practice, 14(4):293--310, 2002.
- R. Flesch. A new readability yardstick. Journal of Applied Psychology, 32(3):221--233, 1948.
- R. Forax. Why extension methods are evil. http://weblogs.java.net/blog/forax/archive/2009/11/28/why-extension-methods-are-evil. [Online; accessed 31-January-2011].
- B. Guzel. Top 15 best practices for writing super readable code. http://net.tutsplus.com/tutorials/htmlcss-techniques/top-15-best-practices-for-writing-superreadable-code/. [Online; accessed 31-January-2011].
- M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. Witten. The WEKA data mining software: An update. ACM SIGKDD Explorations Newsletter, 11(1):10--18, 2009.
- M. Halstead. Elements of Software Science. Elsevier, New York, 1977.
- A. Hindle, M. Godfrey, and R. Holt. Reading beside the lines: Indentation as a proxy for complexity metric. In 16th IEEE International Conference on Program Comprehension (ICPC 2008), pages 133--142. IEEE, 2008.
- M. Kanat-Alexander. Readability and naming things. http://www.codesimplicity.com/post/readability-and-naming-things/. [Online; accessed 31-January-2011].
- J. Kearney, R. Sedlmeyer, W. Thompson, M. Gray, and M. Adler. Software complexity measurement. Communications of the ACM, 29(11):1044--1050, 1986.
- D. Kozlov, J. Koskinen, M. Sakkinen, and J. Markkula. Assessing maintainability change over multiple software releases. Journal of Software Maintenance and Evolution: Research and Practice, 20(1):31--58, 2008.
- J. Kumar Chhabra, K. Aggarwal, and Y. Singh. Code and data spatial complexity: Two important software understandability measures. Information and Software Technology, 45(8):539--546, 2003.
- S. Lessmann, B. Baesens, C. Mues, and S. Pietsch. Benchmarking classification models for software defect prediction: A proposed framework and novel findings. IEEE Transactions on Software Engineering, 34(4):485, 2008.
- J. Lin and K. Wu. A model for measuring software understandability. In Sixth IEEE International Conference on Computer and Information Technology (CIT'06), page 192. IEEE, 2006.
- J. Lin and K. Wu. Evaluation of software understandability based on fuzzy matrix. In IEEE International Conference on Fuzzy Systems (FUZZ-IEEE 2008), pages 887--892. IEEE, 2008.
- A. Mohan, N. Gold, and P. Layzell. An initial approach to assessing program comprehensibility using spatial complexity, number of concepts and typographical style. In 11th Working Conference on Reverse Engineering (WCRE 2004), pages 246--255. IEEE, 2005.
- N. Naeem, M. Batchelder, and L. Hendren. Metrics for measuring the effectiveness of decompilers and obfuscators. In 15th IEEE International Conference on Program Comprehension (ICPC'07), pages 253--258. IEEE, 2007.
- P. Peduzzi, J. Concato, E. Kemper, T. Holford, and A. Feinstein. A simulation study of the number of events per variable in logistic regression analysis. Journal of Clinical Epidemiology, 49(12):1373--1379, 1996.
- D. Raymond. Reading source code. In Proceedings of the 1991 Conference of the Centre for Advanced Studies on Collaborative Research, pages 3--16. IBM Press, 1991.
- P. Relf. Tool assisted identifier naming for improved software readability: An empirical study. In 2005 International Symposium on Empirical Software Engineering, page 10. IEEE, 2005.
Index Terms: A simpler model of software readability