Published in: Empirical Software Engineering 2/2020

20 January 2020

Code Localization in Programming Screencasts

Authors: Mohammad Alahmadi, Abdulkarim Khormi, Biswas Parajuli, Jonathan Hassel, Sonia Haiduc, Piyush Kumar


Abstract

Programming screencasts are growing in popularity and are often used by developers as a learning source. The source code shown in these screencasts is often not available for download or copy-pasting. Without having the code readily available, developers have to frequently pause a video to transcribe the code. This is time-consuming and reduces the effectiveness of learning from videos. Recent approaches have applied Optical Character Recognition (OCR) techniques to automatically extract source code from programming screencasts. One of their major limitations, however, is the extraction of noise such as the text information in the menu, package hierarchy, etc. due to the imprecise approximation of the code location on the screen. This leads to incorrect, unusable code. We aim to address this limitation and propose an approach to significantly improve the accuracy of code localization in programming screencasts, leading to a more precise code extraction. Our approach uses a Convolutional Neural Network to automatically predict the exact location of code in an image. We evaluated our approach on a set of frames extracted from 450 screencasts covering Java, C#, and Python programming topics. The results show that our approach is able to detect the area containing the code with 94% accuracy and that our approach significantly outperforms previous work. We also show that applying OCR on the code area identified by our approach leads to a 97% match with the ground truth on average, compared to only 31% when OCR is applied to the entire frame.
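
The abstract does not state how localization accuracy is scored or how the detected area is handed to OCR. As a minimal sketch, assuming the predicted code area is an axis-aligned bounding box `(x1, y1, x2, y2)` in pixel coordinates, the following shows one common way to compare a predicted box with a ground-truth box (intersection over union, IoU) and to crop a frame to the predicted box before OCR. The function names, the box format, and the 0.5 threshold are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels, with x2 and y2 exclusive.
Box = Tuple[int, int, int, int]


def iou(pred: Box, truth: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(pred[0], truth[0]), max(pred[1], truth[1])
    ix2, iy2 = min(pred[2], truth[2]), min(pred[3], truth[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_truth = (truth[2] - truth[0]) * (truth[3] - truth[1])
    union = area_pred + area_truth - inter
    return inter / union if union > 0 else 0.0


def is_correct(pred: Box, truth: Box, threshold: float = 0.5) -> bool:
    """Count a prediction as correct when IoU reaches the (assumed) threshold."""
    return iou(pred, truth) >= threshold


def crop(frame: Sequence[Sequence[int]], box: Box) -> List[List[int]]:
    """Keep only the pixels inside the box; frame is a list of pixel rows."""
    x1, y1, x2, y2 = box
    return [list(row[x1:x2]) for row in frame[y1:y2]]


if __name__ == "__main__":
    truth = (0, 0, 10, 10)
    pred = (5, 0, 15, 10)
    print(round(iou(pred, truth), 3))  # overlap 50 / union 150
    print(is_correct(pred, truth))
```

In a pipeline of the kind the abstract describes, the cropped region rather than the full frame would be passed to an OCR engine, so that menu and package-hierarchy text outside the box never reaches the recognizer.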


Metadata
Title
Code Localization in Programming Screencasts
Authors
Mohammad Alahmadi
Abdulkarim Khormi
Biswas Parajuli
Jonathan Hassel
Sonia Haiduc
Piyush Kumar
Publication date
20 January 2020
Publisher
Springer US
Published in
Empirical Software Engineering / Issue 2/2020
Print ISSN: 1382-3256
Electronic ISSN: 1573-7616
DOI
https://doi.org/10.1007/s10664-019-09759-w
