2018 | OriginalPaper | Chapter

Form Filling Based on Constraint Solving

Authors: Ben Spencer, Michael Benedikt, Pierre Senellart

Published in: Web Engineering

Publisher: Springer International Publishing


Abstract

We describe a system for analyzing form-based websites to discover sequences of actions and values that result in a valid form submission. Rather than looking at the text or DOM structure of the form, our method is driven by solving constraints involving the underlying client-side JavaScript code. In order to deal with the complexity of client-side code, we adapt a method from program analysis and testing, concolic testing, which mixes concrete code execution, symbolic code tracing, and constraint solving to find values that lead to new code paths. While concolic testing is commonly used for detecting bugs in stand-alone code with developer support, we show how it can be applied to the very different problem of filling Web forms. We evaluate our system on a benchmark of both real and synthetic Web forms.
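To make the concolic idea concrete, here is a minimal sketch of the loop the abstract describes: run the validation code on concrete values, record the branch conditions it evaluates, negate the condition that caused rejection, and solve for new values. It is not the authors' system. The form fields (`age`, `qty`), the `validate` function, the `Sym` wrapper, and the brute-force `solve` (a stand-in for a real constraint solver) are all assumptions made for illustration, and the validator is written in Python rather than client-side JavaScript. Only comparisons of a single field against a constant are supported.

```python
# Toy concolic form filler: concrete execution + symbolic tracing + solving.
OPS = {
    "<": lambda a, b: a < b,
    ">": lambda a, b: a > b,
    "==": lambda a, b: a == b,
}

class Sym:
    """An integer input that carries a concrete value and a symbolic name.
    Every comparison is evaluated concretely and also logged to the trace."""
    def __init__(self, name, value, trace):
        self.name, self.value, self.trace = name, value, trace

    def _record(self, op, const):
        outcome = OPS[op](self.value, const)
        self.trace.append((self.name, op, const, outcome))
        return outcome

    def __lt__(self, const): return self._record("<", const)
    def __gt__(self, const): return self._record(">", const)
    def __eq__(self, const): return self._record("==", const)

def validate(age, qty):
    """Hypothetical client-side validation logic for a two-field form."""
    if age < 18:
        return False
    if qty > 10:
        return False
    if qty == 0:
        return False
    return True

def solve(constraints, current, domain=range(0, 100)):
    """Brute-force stand-in for a constraint solver: for each constrained
    field, pick the first domain value that meets all of its constraints."""
    result = dict(current)
    for name in current:
        mine = [(op, c, want) for n, op, c, want in constraints if n == name]
        if not mine:
            continue  # unconstrained field keeps its current value
        for v in domain:
            if all(OPS[op](v, c) == want for op, c, want in mine):
                result[name] = v
                break
        else:
            return None  # no value in the domain satisfies the constraints
    return result

def fill_form(validator, start, max_runs=20):
    """Repeat: run concretely, trace branches, negate the last one, solve."""
    inputs = dict(start)
    for _ in range(max_runs):
        trace = []
        args = {n: Sym(n, v, trace) for n, v in inputs.items()}
        if validator(**args):
            return inputs  # these values pass validation
        if not trace:
            return None  # rejected without testing any input
        # Keep the path prefix, flip the branch that led to rejection.
        *prefix, (name, op, const, outcome) = trace
        inputs = solve(prefix + [(name, op, const, not outcome)], inputs)
        if inputs is None:
            return None
    return None

if __name__ == "__main__":
    print(fill_form(validate, {"age": 0, "qty": 0}))
```

Starting from `age = 0, qty = 0`, the first run is rejected at `age < 18`; negating that branch yields `age = 18`. The second run is rejected at `qty == 0`; negating it while keeping the earlier branch outcomes yields `qty = 1`, which passes. Negating only the last branch is a simplification: a fuller exploration would keep a worklist of unexplored branches and backtrack when a negated path is unsatisfiable.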

Metadata

Title: Form Filling Based on Constraint Solving
Authors: Ben Spencer, Michael Benedikt, Pierre Senellart
Copyright Year: 2018
DOI: https://doi.org/10.1007/978-3-319-91662-0_7
