Published in: Multimedia Systems 1/2022

18-05-2021 | Regular Paper

Code generation from a graphical user interface via attention-based encoder–decoder model

Authors: Wen-Yin Chen, Pavol Podstreleny, Wen-Huang Cheng, Yung-Yao Chen, Kai-Lung Hua


Abstract

Code generation from graphical user interface images is a promising area of research. Recent progress in machine learning has made it possible to transform a user interface screenshot into code through several approaches, one of which is the encoder–decoder framework. Our model implements the encoder–decoder framework with an attention mechanism that helps the decoder focus on a subset of salient image features when needed, and thereby generate token sequences with higher accuracy. Experimental results show that our model outperforms previously proposed models on the pix2code benchmark dataset.
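The attention step described above can be sketched as follows: at each decoding step, the decoder scores every encoder feature vector against its current hidden state, normalizes the scores into weights, and takes the weighted sum as a context vector. This is a minimal additive (Bahdanau-style) attention sketch; all shapes, names, and the use of random placeholder weights are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def additive_attention(features, h_dec, W_f, W_h, v):
    """Score each image feature against the decoder state and
    return a context vector as the attention-weighted sum.

    features : (L, D)  L image feature vectors from the encoder
    h_dec    : (H,)    current decoder hidden state
    W_f, W_h, v : learned projections (random placeholders here)
    """
    # e_i = v^T tanh(W_f f_i + W_h h)  -- one score per feature vector
    scores = np.tanh(features @ W_f + h_dec @ W_h) @ v   # (L,)
    alpha = softmax(scores)                              # attention weights, sum to 1
    context = alpha @ features                           # (D,) weighted sum of features
    return context, alpha

rng = np.random.default_rng(0)
L, D, H, A = 64, 256, 512, 128        # feature-grid size and dims (assumed)
features = rng.normal(size=(L, D))
h_dec = rng.normal(size=H)
W_f = rng.normal(size=(D, A)) * 0.01
W_h = rng.normal(size=(H, A)) * 0.01
v = rng.normal(size=A)

context, alpha = additive_attention(features, h_dec, W_f, W_h, v)
```

The context vector is then fed to the decoder together with the previous token embedding, so the next token prediction can condition on whichever region of the screenshot is currently relevant.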


Metadata
Title
Code generation from a graphical user interface via attention-based encoder–decoder model
Authors
Wen-Yin Chen
Pavol Podstreleny
Wen-Huang Cheng
Yung-Yao Chen
Kai-Lung Hua
Publication date
18-05-2021
Publisher
Springer Berlin Heidelberg
Published in
Multimedia Systems / Issue 1/2022
Print ISSN: 0942-4962
Electronic ISSN: 1432-1882
DOI
https://doi.org/10.1007/s00530-021-00804-7