Skip to main content

2015 | OriginalPaper | Buchkapitel

Improved Tool Support for Machine-Code Decompilation in HOL4

verfasst von : Anthony Fox

Erschienen in: Interactive Theorem Proving

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

The HOL4 interactive theorem prover provides a sound logical environment for reasoning about machine-code programs. The rigour of HOL’s LCF-style kernel naturally guarantees very high levels of assurance, but it does present challenges when it comes implementing efficient proof tools. This paper presents improvements that have been made to our methodology for soundly decompiling machine-code programs to functions expressed in HOL logic. These advancements have been facilitated by the development of a domain specific language, called L3, for the specification of Instruction Set Architectures (ISAs). As a result of these improvements, decompilation is faster (on average by one to two orders of magnitude), the instruction set specifications are easier to write, and the proof tools are easier to maintain.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Fußnoten
1
For example, 2736 pages for ARMv7-A, 5242 pages for ARM-v8 (which contains a full description of the legacy AArch32 mode) and 3020 pages for x86.
 
2
L3 specifications are available at www.​cl.​cam.​ac.​uk/​-acjf3/​l3/​isa-models.​tar.​bz2 and the HOL4 developments can be viewed at the Github repository www.​github.​com/​HOL-Theorem-Prover/​HOL under the directory www.​examples/​l3-machine-code.
 
3
This computes r1 := r1 * pow(r2, r0). Note that 1w denotes a machine word (bit-vector) with value one.
 
4
This should not be confused with ISA exceptions (e.g. software interrupts), which are typically modelled explicitly, following the ISA reference manual semantics.
 
5
The ARM manual describes the unpredictable as meaning “the behaviour cannot be relied upon”. For example, the shift instruction ASR r1,r2,pc is unpredictable.
 
6
A state option value is either \(\mathrm {NONE}\) (when there is an exception) or otherwise it is \(\mathrm {SOME}\ s\) for some state s. We are only interested in exception free cases here.
 
7
This is the Modus Ponens inference rule with automatic matching. For example, given a theorem \(A_0 \vdash t_1\wedge \dots \wedge t_n \Rightarrow t_0\) and a list of theorems \(A_1 \vdash t_1, \dots , A_n \vdash t_n\), we can use this rule to derive \(A_0, \dots , A_n \vdash t_0\).
 
8
These are a form of discrimination net. A similar structure Net (credited to Larry Paulson) has been used for many years in HOL’s term-rewriting conversions.
 
9
The Skein example under the ‘original’ decompiler is an anomaly here.
 
10
Caching on hexadecimal opcode values has not been enabled, so these run times correspond with specialising generic spec theorems.
 
11
When an instruction explicitly reads the PC this normally gives the value of that instruction’s address plus eight (in ARM mode) or plus four (in Thumb mode). This is because early ARM processors employed a 3-stage pipeline.
 
12
An instruction opcode is undefined when it is not supported by an architecture version or configuration. For example, CLZ opcodes on ARMv4 will raise an undefined exception trap.
 
Literatur
1.
Zurück zum Zitat Campbell, B., Stark, I.: Randomised testing of a microprocessor model using SMT-solver state generation. In: Lang, F., Flammini, F. (eds.) FMICS 2014. LNCS, vol. 8718, pp. 185–199. Springer, Heidelberg (2014) Campbell, B., Stark, I.: Randomised testing of a microprocessor model using SMT-solver state generation. In: Lang, F., Flammini, F. (eds.) FMICS 2014. LNCS, vol. 8718, pp. 185–199. Springer, Heidelberg (2014)
2.
Zurück zum Zitat Dam, M., Guanciale, R., Nemati, H.: Machine code verification of a tiny ARM hypervisor. In: Sadeghi, A., Armknecht, F., Seifert, J. (eds.) TrustED 2013, pp. 3–12. ACM, New York (2013) Dam, M., Guanciale, R., Nemati, H.: Machine code verification of a tiny ARM hypervisor. In: Sadeghi, A., Armknecht, F., Seifert, J. (eds.) TrustED 2013, pp. 3–12. ACM, New York (2013)
3.
Zurück zum Zitat Fox, A.C.J.: LCF-Style bit-blasting in HOL4. In: van Eekelen, M., Geuvers, H., Schmaltz, J., Wiedijk, F. (eds.) ITP 2011. LNCS, vol. 6898, pp. 357–362. Springer, Heidelberg (2011) CrossRef Fox, A.C.J.: LCF-Style bit-blasting in HOL4. In: van Eekelen, M., Geuvers, H., Schmaltz, J., Wiedijk, F. (eds.) ITP 2011. LNCS, vol. 6898, pp. 357–362. Springer, Heidelberg (2011) CrossRef
4.
Zurück zum Zitat Fox, A.: Directions in ISA specification. In: Beringer, L., Felty, A. (eds.) ITP 2012. LNCS, vol. 7406, pp. 338–344. Springer, Heidelberg (2012) CrossRef Fox, A.: Directions in ISA specification. In: Beringer, L., Felty, A. (eds.) ITP 2012. LNCS, vol. 7406, pp. 338–344. Springer, Heidelberg (2012) CrossRef
5.
Zurück zum Zitat Fox, A., Myreen, M.O.: A trustworthy monadic formalization of the ARMv7 instruction set architecture. In: Kaufmann, M., Paulson, L.C. (eds.) ITP 2010. LNCS, vol. 6172, pp. 243–258. Springer, Heidelberg (2010) CrossRef Fox, A., Myreen, M.O.: A trustworthy monadic formalization of the ARMv7 instruction set architecture. In: Kaufmann, M., Paulson, L.C. (eds.) ITP 2010. LNCS, vol. 6172, pp. 243–258. Springer, Heidelberg (2010) CrossRef
6.
Zurück zum Zitat Goel, S., Hunt, Jr., W.A., Kaufmann, M., Ghosh, S.: Simulation and formal verification of x86 machine-code programs that make system calls. In: FMCAD 2014, pp. 91–98. IEEE (2014) Goel, S., Hunt, Jr., W.A., Kaufmann, M., Ghosh, S.: Simulation and formal verification of x86 machine-code programs that make system calls. In: FMCAD 2014, pp. 91–98. IEEE (2014)
7.
Zurück zum Zitat Jensen, J.B., Benton, N., Kennedy, A.: High-level separation logic for low-level code. In: Giacobazzi, R., Cousot, R. (eds.) POPL 2013, pp. 301–314. ACM, New York (2013) Jensen, J.B., Benton, N., Kennedy, A.: High-level separation logic for low-level code. In: Giacobazzi, R., Cousot, R. (eds.) POPL 2013, pp. 301–314. ACM, New York (2013)
8.
Zurück zum Zitat Kumar, R., Myreen, M.O., Norrish, M., Owens, S.: CakeML: a verified implementation of ML. In: Jagannathan, S., Sewell, P. (eds.) POPL 2014, pp. 179–192. ACM, New York (2014) Kumar, R., Myreen, M.O., Norrish, M., Owens, S.: CakeML: a verified implementation of ML. In: Jagannathan, S., Sewell, P. (eds.) POPL 2014, pp. 179–192. ACM, New York (2014)
9.
Zurück zum Zitat Leroy, X.: Formal verification of a realistic compiler. Commun. ACM 52(7), 107–115 (2009)CrossRef Leroy, X.: Formal verification of a realistic compiler. Commun. ACM 52(7), 107–115 (2009)CrossRef
10.
Zurück zum Zitat Lim, J., Reps, T.W.: TSL: a system for generating abstract interpreters and its application to machine-code analysis. ACM Trans. Program. Lang. Syst. 35(1), 4 (2013)CrossRef Lim, J., Reps, T.W.: TSL: a system for generating abstract interpreters and its application to machine-code analysis. ACM Trans. Program. Lang. Syst. 35(1), 4 (2013)CrossRef
11.
Zurück zum Zitat Morrisett, G., Tan, G., Tassarotti, J., Tristan, J., Gan, E.: Rocksalt: better, faster, stronger SFI for the x86. In: Vitek, J., Lin, H., Tip, F. (eds.) PLDI 2012, pp. 395–404. ACM, New York (2012) Morrisett, G., Tan, G., Tassarotti, J., Tristan, J., Gan, E.: Rocksalt: better, faster, stronger SFI for the x86. In: Vitek, J., Lin, H., Tip, F. (eds.) PLDI 2012, pp. 395–404. ACM, New York (2012)
12.
Zurück zum Zitat Myreen, M.O., Gordon, M.J.C.: Hoare logic for realistically modelled machine code. In: Grumberg, O., Huth, M. (eds.) TACAS 2007. LNCS, vol. 4424, pp. 568–582. Springer, Heidelberg (2007) CrossRef Myreen, M.O., Gordon, M.J.C.: Hoare logic for realistically modelled machine code. In: Grumberg, O., Huth, M. (eds.) TACAS 2007. LNCS, vol. 4424, pp. 568–582. Springer, Heidelberg (2007) CrossRef
13.
Zurück zum Zitat Myreen, M.O., Gordon, M.J.C., Slind, K.: Decompilation into logic - improved. In: Cabodi, G., Singh, S. (eds.) FMCAD, pp. 78–81. IEEE (2012) Myreen, M.O., Gordon, M.J.C., Slind, K.: Decompilation into logic - improved. In: Cabodi, G., Singh, S. (eds.) FMCAD, pp. 78–81. IEEE (2012)
14.
Zurück zum Zitat Ni, Z., Yu, D., Shao, Z.: Using XCAP to certify realistic systems code: machine context management. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007. LNCS, vol. 4732, pp. 189–206. Springer, Heidelberg (2007) CrossRef Ni, Z., Yu, D., Shao, Z.: Using XCAP to certify realistic systems code: machine context management. In: Schneider, K., Brandt, J. (eds.) TPHOLs 2007. LNCS, vol. 4732, pp. 189–206. Springer, Heidelberg (2007) CrossRef
15.
Zurück zum Zitat Sewell, T.A.L., Myreen, M.O., Klein, G.: Translation validation for a verified OS kernel. In: Boehm, H.J., Flanagan, C. (eds.) PLDI, pp. 471–482. ACM, New York (2013) Sewell, T.A.L., Myreen, M.O., Klein, G.: Translation validation for a verified OS kernel. In: Boehm, H.J., Flanagan, C. (eds.) PLDI, pp. 471–482. ACM, New York (2013)
16.
Zurück zum Zitat Simon, A., Kranz, J.: The GDSL toolkit: Generating frontends for the analysis of machine code. In: Jagannathan, S., Sewell, P. (eds.) PPREW 2014, p. 7. ACM, New York (2014) Simon, A., Kranz, J.: The GDSL toolkit: Generating frontends for the analysis of machine code. In: Jagannathan, S., Sewell, P. (eds.) PPREW 2014, p. 7. ACM, New York (2014)
17.
Zurück zum Zitat Woodruff, J., Watson, R.N.M., Chisnall, D., Moore, S.W., Anderson, J., Davis, B., Laurie, B., Neumann, P.G., Norton, R., Roe, M.: The CHERI capability model: revisiting RISC in an age of risk. In: ISCA 2014, pp. 457–468. IEEE Computer Society (2014) Woodruff, J., Watson, R.N.M., Chisnall, D., Moore, S.W., Anderson, J., Davis, B., Laurie, B., Neumann, P.G., Norton, R., Roe, M.: The CHERI capability model: revisiting RISC in an age of risk. In: ISCA 2014, pp. 457–468. IEEE Computer Society (2014)
Metadaten
Titel
Improved Tool Support for Machine-Code Decompilation in HOL4
verfasst von
Anthony Fox
Copyright-Jahr
2015
DOI
https://doi.org/10.1007/978-3-319-22102-1_12