Skip to main content

2015 | OriginalPaper | Buchkapitel

Benchmarking and Resource Measurement

verfasst von : Dirk Beyer, Stefan Löwe, Philipp Wendler

Erschienen in: Model Checking Software

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Proper benchmarking and resource measurement is an important topic, because benchmarking is a widely-used method for the comparative evaluation of tools and algorithms in many research areas. It is essential for researchers, tool developers, and users, as well as for competitions. We formulate a set of requirements that are indispensable for reproducible benchmarking and reliable resource measurement of automatic solvers, verifiers, and similar tools, and discuss limitations of existing methods and benchmarking tools. Fulfilling these requirements in a benchmarking framework is complex and can (on Linux) currently only be done by using the cgroups feature of the kernel. We provide , a ready-to-use, tool-independent, and free implementation of a benchmarking framework that fulfills all presented requirements, making reproducible benchmarking and reliable resource measurement easy. Our framework is able to work with a wide range of different tools and has proven its reliability and usefulness in the International Competition on Software Verification.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Anhänge
Nur mit Berechtigung zugänglich
Fußnoten
4
Our experience from competition organization shows that developers of complex tools are not always aware of how their system spawns child processes and how to properly terminate them.
 
6
We experienced this when organizing SV-COMP’13, for a portfolio-based verifier. Initial CPU time measurements were significantly too low, which was luckily discovered by chance. The verifier had to be patched to wait for its sub-processes and the benchmarks had to be re-run.
 
8
Actually, independent hierarchies are currently supported. We restrict ourselves to the single-hierarchy case because independent hierarchies are going to be deprecated.
 
11
Or clear the caches with drop_caches.
 
15
Tools that do not support this specification format can also be benchmarked. In this case, the specification is used by only to determine the expected result.
 
17
For example, is used to automatically check for regressions in the integration test-suite of .
 
18
We successfully use on four different clusters, each under different administrative control and with software as old as SuSE Enterprise 11 and Linux 3.0, and on the machines of the student computer pool of our department.
 
25
c.f. verify.sh in the package
 
26
git revision 9d58031 from 2013-09-13, c.f. https://​github.​com/​tkren/​vcwc/​
 
Literatur
1.
Zurück zum Zitat Balint, A., Belov, A., Heule, M., Järvisalo, M.: Proceedings of SAT competition 2013: Solver and benchmark descriptions. Technical report B-2013-1, University of Helsinki (2013) Balint, A., Belov, A., Heule, M., Järvisalo, M.: Proceedings of SAT competition 2013: Solver and benchmark descriptions. Technical report B-2013-1, University of Helsinki (2013)
2.
Zurück zum Zitat Barrett, C., Deters, M., de Moura, L., Oliveras, A., Stump, A.: 6 years of SMT-COMP. J. Autom. Reasoning 50(3), 243–277 (2012)CrossRef Barrett, C., Deters, M., de Moura, L., Oliveras, A., Stump, A.: 6 years of SMT-COMP. J. Autom. Reasoning 50(3), 243–277 (2012)CrossRef
3.
Zurück zum Zitat Beyer, D.: Software verification and verifiable witnesses. In: Baier, C., Tinelli, C. (eds.) TACAS 2015. LNCS, vol. 9035, pp. 401–416. Springer, Heidelberg (2015) Beyer, D.: Software verification and verifiable witnesses. In: Baier, C., Tinelli, C. (eds.) TACAS 2015. LNCS, vol. 9035, pp. 401–416. Springer, Heidelberg (2015)
4.
Zurück zum Zitat Beyer, D., Dresler, G., Wendler, P.: Software verification in the Google App-Engine Cloud. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 327–333. Springer, Heidelberg (2014) Beyer, D., Dresler, G., Wendler, P.: Software verification in the Google App-Engine Cloud. In: Biere, A., Bloem, R. (eds.) CAV 2014. LNCS, vol. 8559, pp. 327–333. Springer, Heidelberg (2014)
5.
Zurück zum Zitat Charwat, G., Ianni, G., Krennwallner, T., Kronegger, M., Pfandler, A., Redl, C., Schwengerer, M., Spendier, L.K., Wallner, J.P., Xiao, G.: VCWC: a versioning competition workflow compiler. In: Cabalar, P., Son, T.C. (eds.) LPNMR 2013. LNCS, vol. 8148, pp. 233–238. Springer, Heidelberg (2013) CrossRef Charwat, G., Ianni, G., Krennwallner, T., Kronegger, M., Pfandler, A., Redl, C., Schwengerer, M., Spendier, L.K., Wallner, J.P., Xiao, G.: VCWC: a versioning competition workflow compiler. In: Cabalar, P., Son, T.C. (eds.) LPNMR 2013. LNCS, vol. 8148, pp. 233–238. Springer, Heidelberg (2013) CrossRef
7.
Zurück zum Zitat JCGM Working Group 2. International vocabulary of metrology - basic and general concepts and associated terms (VIM), 3rd edn. Technical report JCGM 200:2012, BIPM (2012) JCGM Working Group 2. International vocabulary of metrology - basic and general concepts and associated terms (VIM), 3rd edn. Technical report JCGM 200:2012, BIPM (2012)
8.
Zurück zum Zitat Kordon, F., Hulin-Hubard, F.: BenchKit, a tool for massive concurrent benchmarking. In: ACSD 2014. pp. 159–165. IEEE (2014) Kordon, F., Hulin-Hubard, F.: BenchKit, a tool for massive concurrent benchmarking. In: ACSD 2014. pp. 159–165. IEEE (2014)
10.
Zurück zum Zitat Roussel, O.: Controlling a solver execution with the runsolver tool. J. Satisfiability, Boolean Model. Comput. 7, 139–144 (2011)MathSciNet Roussel, O.: Controlling a solver execution with the runsolver tool. J. Satisfiability, Boolean Model. Comput. 7, 139–144 (2011)MathSciNet
11.
Zurück zum Zitat Singh, B., Srinivasan, V.: Containers: challenges with the memory resource controller and its performance. In: Ottawa Linux Symposium (OLS), p. 209. (2007) Singh, B., Srinivasan, V.: Containers: challenges with the memory resource controller and its performance. In: Ottawa Linux Symposium (OLS), p. 209. (2007)
12.
Zurück zum Zitat Stump, A., Sutcliffe, G., Tinelli, C.: StarExec: a cross-community infrastructure for logic solving. In: Demri, S., Kapur, D., Weidenbach, C. (eds.) IJCAR 2014. LNCS, vol. 8562, pp. 367–373. Springer, Heidelberg (2014) Stump, A., Sutcliffe, G., Tinelli, C.: StarExec: a cross-community infrastructure for logic solving. In: Demri, S., Kapur, D., Weidenbach, C. (eds.) IJCAR 2014. LNCS, vol. 8562, pp. 367–373. Springer, Heidelberg (2014)
Metadaten
Titel
Benchmarking and Resource Measurement
verfasst von
Dirk Beyer
Stefan Löwe
Philipp Wendler
Copyright-Jahr
2015
DOI
https://doi.org/10.1007/978-3-319-23404-5_12