In recent years attackers have changed their attack vector from the operating system level to the application level. Particularly, attackers concentrate their efforts on finding vulnerabilities in common office applications such as Microsoft Office and Adobe Acrobat. In this paper, we present a novel approach to detect
identify the actual vulnerability exploited by a malicious document and extract the exploit code itself. To achieve this, we automatically extract from a security patch information about which code fragments were changed. During the analysis of a document, we open the document using the appropriate application, log the execution path, and automatically identify embedded malicious code using dynamic binary instrumentation. Then both pieces of information are used to determine whether a malicious document exploits a known security flaw and, if so, which vulnerability is targeted.