Background
Introduction
- Denial of service (DoS): an attacker makes some computing resources too busy to handle legitimate requests.
- Remote to user (R2L): an attacker who does not have an account on a remote machine sends packets to that machine over a network and exploits some vulnerability to gain local access as a user of that machine.
- User to root (U2R): an attacker starts out with access to a normal user account on the system and exploits system vulnerabilities to gain root access.
- Probing: an attacker scans a network of computers to gather information or find known vulnerabilities. An attacker with a map of the machines and services available on a network can use this information to look for exploits.
- Information base: the level of knowledge about the company before the Pentest is executed.
- Aggressiveness: the depth of the test, i.e., whether it should only identify the main vulnerabilities or exploit all possible attacks.
- Scope: whether the test is set for a specific environment or a general one.
- Technique: the techniques and methodologies used in the Pentest.
Related work
Systematic mapping study
Planning
Scope and objective
Question structure
- Population: establishes the target population for the execution of the research method. In this paper, the population is published research papers on information security.
- Intervention: represents the specific issue related to the research objective. Here, the intervention is the penetration test.
- Comparison: defines what will be compared with the intervention. In this systematic mapping, no comparison is applied.
- Outcome: the obtained results, such as the type and quantity of evidence regarding penetration tests, in order to identify the tools, models, methodologies, scenarios, and main challenges in this area.
Research questions
Research process
| Structure | Terms | Synonyms |
|---|---|---|
| Population | Security information | – |
| Intervention | Penetration test | Pentest, Penetration testing, Pentesting |
| Outcome | Tool | Tools, Software, Suite |
| | Model | Process, Methodology, Standard, Framework |
| | Environment | Context |
| | Challenges | Open research topics, Open problems |
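The structures and synonyms above are combined into search strings by OR-ing the synonyms within each group and AND-ing the groups together. A minimal sketch of that construction (the grouping and quoting are assumptions for illustration; each search engine accepts a slightly different syntax):

```python
# Sketch: build a generic boolean search string from the term table.
# The exact strings submitted to each engine are assumptions here.
terms = {
    "population": ["Security information"],
    "intervention": ["Penetration test", "Pentest",
                     "Penetration testing", "Pentesting"],
    "outcome": ["Tool", "Tools", "Software", "Suite",
                "Model", "Process", "Methodology", "Standard", "Framework",
                "Environment", "Context",
                "Challenges", "Open research topics", "Open problems"],
}

def or_group(words):
    """Join synonyms with OR, quoting multi-word terms."""
    quoted = [f'"{w}"' if " " in w else w for w in words]
    return "(" + " OR ".join(quoted) + ")"

# AND the three PIO groups together.
search_string = " AND ".join(or_group(v) for v in terms.values())
print(search_string)
```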
Inclusion and exclusion criteria
- IC1. The primary study discusses one or more tools for Pentest.
- IC2. The primary study suggests a model, process, framework, or methodology for Pentest.
- EC1. The primary study is not directly related to Pentest.
- EC2. The study presents a Pentest methodology but does not provide enough information about its use and application.
- EC3. The study does not have any kind of evaluation to demonstrate outcomes, e.g., a case study, an experiment, or a proof of correctness.
Quality assessment criteria
- QA1. Does the study present a contribution to Pentest?
- QA2. Is there any kind of evaluation based on analysis or discussion about the use of the models or tools for Pentest?
- QA3. Does the study describe the tools or models used?

Each question was answered on a three-point scale:

- QA1. Y, the contribution is explicitly defined in the study; P, the contribution is implicit; N, the contribution cannot be identified and/or is not established.
- QA2. Y, the study has explicitly applied an evaluation (for example, a case study, an experiment, or another method); P, the evaluation is a short example; N, no evaluation has been presented.
- QA3. Y, the tools or models are clearly specified; P, the tools or models are barely specified; N, the tools or models were not specified.
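The Y/P/N answers map to the numeric scores shown in the quality assessment table. The weights below (Y = 1.0, P = 0.5, N = 0.0) are inferred from the score column rather than stated explicitly, e.g., Y/Y/Y yields 3.0 and P/P/P yields 1.5:

```python
# Sketch: numeric quality score implied by the assessment table.
# Weights are an inference from the 'Sc' column, not stated in the text.
WEIGHTS = {"Y": 1.0, "P": 0.5, "N": 0.0}

def quality_score(qa1, qa2, qa3):
    """Sum the weights of the three quality-assessment answers."""
    return sum(WEIGHTS[answer] for answer in (qa1, qa2, qa3))

print(quality_score("Y", "Y", "Y"))  # e.g. study 01 scores 3.0
print(quality_score("P", "P", "P"))  # e.g. study 02 scores 1.5
```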
Selection process
- Step 1. Search databases. Initially, the search strings are generated based on keywords and their synonyms. After that, an initial selection occurs based on the inclusion and exclusion criteria mentioned in the “Inclusion and exclusion criteria” section.
- Step 2. Eliminate redundancies. As the results come from different search engines, redundant studies are eliminated and stored.
- Step 3. Intermediate selection. The title and the abstract of each selected study are read (the introduction and conclusion are also read when necessary).
- Step 4. Final selection. In this step, all studies are read completely.
- Step 5. Eliminate divergences. If there are any divergences or doubts about a study, a second Pentest specialist reads it and discusses whether or not to include it in the final selection.
- Step 6. Quality assessment. Based on the quality criteria previously mentioned, the quality of the studies in the final selection is evaluated.
Data analysis
- Identify the tools used on Pentest and their characteristics (RQ1);
- Map the main Pentest application domains (RQ2);
- Enumerate the selected studies by models or specifications (RQ3);
- Gather the selected studies by research type and contribution (RQ4).
Conduction
Search databases
| Database | Retrieved | Not duplic. | Selected | Prec. rate | Rate index |
|---|---|---|---|---|---|
| ACM DL | 144 | 141 | 8 | 0.0555 | 0.1481 |
| IEEE Xplore | 531 | 523 | 32 | 0.0602 | 0.5925 |
| SCOPUS | 128 | 90 | 3 | 0.0234 | 0.0555 |
| Springer Link | 342 | 336 | 11 | 0.0321 | 0.2037 |
| Total | 1145 | 1090 | 54 | – | – |
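The last two columns appear to follow directly from the counts: the precision rate is selected/retrieved per database, and the rate index is each database's share of the 54 selected studies. A sketch reproducing the figures (the formulas are inferred from the numbers, not stated in the text):

```python
# Sketch: inferred formulas behind the table's last two columns:
# precision rate = selected / retrieved (per database)
# rate index     = selected / total selected across all databases
databases = {
    "ACM DL":        (144, 8),
    "IEEE Xplore":   (531, 32),
    "SCOPUS":        (128, 3),
    "Springer Link": (342, 11),
}
total_selected = sum(sel for _, sel in databases.values())  # 54

for name, (retrieved, selected) in databases.items():
    prec = selected / retrieved
    rate = selected / total_selected
    print(f"{name}: prec. rate {prec:.4f}, rate index {rate:.4f}")
```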
Study quality assessment
| ID | Reference | Year | QA1 | QA2 | QA3 | Sc | Des | ID | Reference | Year | QA1 | QA2 | QA3 | Sc | Des |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 01 | [13] Austin | 2013 | Y | Y | Y | 3.0 | E | 28 | [62] Line | 2008 | Y | P | P | 2.0 | V |
| 02 | [40] Hsu | 2008 | P | P | P | 1.5 | G | 29 | [37] Mainka | 2012 | Y | P | Y | 2.5 | E |
| 03 | [53] Holm | 2011 | P | Y | N | 1.5 | G | 30 | [11] Geer | 2002 | Y | Y | Y | 3.0 | E |
| 04 | [41] Bechtsoudis | 2012 | Y | P | Y | 2.5 | E | 31 | [48] Traore | 2011 | Y | P | N | 1.5 | G |
| 05 | [42] Sarraute | 2011 | Y | Y | Y | 3.0 | E | 32 | [39] Benkhelifa | 2013 | P | P | P | 1.5 | G |
| 06 | [14] Khoury | 2011 | P | Y | N | 1.5 | G | 33 | [22] Salas | 2014 | Y | Y | Y | 3.0 | E |
| 07 | [34] Antunes | 2015 | Y | Y | N | 2.0 | V | 34 | [23] Büchler | 2012 | Y | Y | Y | 3.0 | E |
| 08 | [15] Xu | 2012 | P | P | Y | 2.0 | V | 35 | [56] Sandouka | 2009 | Y | P | P | 2.0 | V |
| 09 | [43] Shen | 2011 | Y | P | Y | 2.5 | E | 36 | [24] Liu | 2012 | Y | Y | Y | 3.0 | E |
| 10 | [35] Mendes | 2011 | P | Y | P | 2.0 | V | 37 | [52] Masood | 2011 | Y | P | Y | 2.5 | E |
| 11 | [16] Fong | 2008 | Y | Y | P | 2.5 | E | 38 | [25] Igure | 2008 | Y | Y | P | 2.5 | E |
| 12 | [65] Williams | 2012 | Y | Y | P | 2.5 | E | 39 | [64] Khoury | 2011 | Y | P | N | 1.5 | G |
| 13 | [44] Bou-harb | 2014 | P | Y | Y | 2.5 | E | 40 | [26] Leibolt | 2010 | P | P | P | 1.5 | G |
| 14 | [45] Kasinathan | 2013 | P | P | P | 1.5 | G | 41 | [27] Fonseca | 2010 | Y | P | P | 2.0 | V |
| 15 | [46] Xing | 2010 | Y | P | Y | 2.5 | E | 42 | [49] Jajodia | 2005 | P | P | Y | 3.0 | E |
| 16 | [36] Antunes | 2009 | Y | Y | P | 2.5 | E | 43 | [50] Blackwell | 2014 | Y | Y | Y | 3.0 | E |
| 17 | [54] Holik | 2014 | Y | Y | Y | 3.0 | E | 44 | [28] Prandini | 2010 | Y | Y | Y | 3.0 | E |
| 18 | [17] Avramescu | 2013 | Y | Y | Y | 3.0 | E | 45 | [59] Dimkov | 2010 | Y | Y | Y | 2.0 | V |
| 19 | [57] Ridgewell | 2013 | P | P | P | 1.5 | G | 46 | [60] Stepien | 2012 | Y | P | Y | 2.5 | E |
| 20 | [18] Walden | 2008 | P | Y | P | 2.0 | V | 47 | [29] Badawy | 2013 | P | P | P | 1.5 | G |
| 21 | [19] Mink | 2006 | P | P | P | 1.5 | G | 48 | [30] Curphey | 2006 | P | P | P | 1.5 | G |
| 22 | [55] Tondel | 2008 | P | Y | P | 2.0 | V | 49 | [31] Huang | 2005 | P | P | P | 1.5 | G |
| 23 | [20] Armando | 2010 | Y | P | P | 2.0 | V | 50 | [32] Doupé | 2010 | P | Y | P | 2.0 | V |
| 24 | [63] Dahl | 2006 | Y | Y | Y | 3.0 | E | 51 | [51] Vegendla | 2016 | Y | Y | Y | 3.0 | E |
| 25 | [47] Mclaughlin | 2010 | Y | Y | Y | 3.0 | E | 52 | [61] Casseli | 2016 | Y | Y | Y | 3.0 | E |
| 26 | [58] Somorovsky | 2012 | Y | Y | Y | 3.0 | E | 53 | [38] Antunes | 2016 | Y | Y | Y | 3.0 | E |
| 27 | [21] Garn | 2014 | Y | Y | Y | 1.5 | G | 54 | [33] Awang | 2015 | Y | Y | P | 2.5 | E |
Result analysis
Classification schemes
Mapping
- Discussion on methodologies for Pentest: 15 of the selected and analyzed primary studies present as their main contribution a discussion on methodologies. This encourages debate about the existing methodologies for applying Pentest, dealing mainly with the depth of testing knowledge, since for certain scenarios it becomes interesting to use more than one model for each security testing process.
- Distribution of the types of research: 37 studies, representing 68.5% of the total, were analyzed and characterized as empirical studies. This result seems consistent with the way studies in the Pentest area are normally performed, i.e., research papers are applied to a specific area and therefore do not propose general strategies that can be applied to any context.
Threats to validity
Discussion
RQ1—what are the main tools used in Pentest?
- Reconnaissance: the process of obtaining essential information about a target organization. In most cases, attackers find out as much as they can, usually by obtaining public information or masquerading as a normal user.
- Scanning: in this phase, remotely accessible hosts are mapped. Network scanning can also sometimes reveal the vendor brands of the systems being used, as well as identify operating system types and versions. Network scanning helps to determine firewall locations, routers in use, and the network’s general structure.
- Gaining access: vulnerabilities exposed during the reconnaissance and scanning phases are exploited to gain access to the target system.
- Maintaining access: once access to a target system has been achieved, it is necessary to keep this access for future exploitation and attacks.
- Covering tracks: in the last phase, the attacker covers their tracks to avoid detection after access has been achieved.
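As a concrete illustration of the scanning phase, the sketch below performs a minimal TCP connect() scan that maps which of a handful of ports accept connections; real scanners such as Nmap layer service, version, and OS fingerprinting on top of this basic probe. The host and port list are placeholders, and such a scan must only be run against systems one is authorized to test:

```python
# Sketch of the scanning phase: a minimal TCP connect() port scan.
# Host and port list below are illustrative placeholders only.
import socket

def scan_ports(host, ports, timeout=0.5):
    """Return the subset of `ports` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            if s.connect_ex((host, port)) == 0:  # 0 means connected
                open_ports.append(port)
    return open_ports

if __name__ == "__main__":
    # Scan a few well-known ports on the local machine only.
    print(scan_ports("127.0.0.1", [22, 80, 443]))
```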
| Tool name | Manufacturer | License | Category | Phase on Pentest |
|---|---|---|---|---|
| Acunetix WS | Acunetix | Commercial | Web vulnerability scanner | Pre-attack and attack |
| WebInspect | HP | Commercial | Web vulnerability scanner | Pre-attack and attack |
| AppScan | IBM | Commercial | Web vulnerability scanner | Pre-attack and attack |
| Metasploit | Rapid7 | Open Source | Vulnerability exploitation tool | Attack |
| Nessus | Tenable | Commercial | Vulnerability scanner | Pre-attack |
| NeXpose | Rapid7 | Commercial | Vulnerability scanner | Pre-attack |
| Nikto | CIRT | Open Source | Web vulnerability scanner | Pre-attack |
| Nmap | – | Open Source | Port scanner | Pre-attack |
| Paros | – | Open Source | Web vulnerability scanner; web proxy | Pre-attack |
| QualysGuard | Qualys | Commercial | Vulnerability scanner | Pre-attack |
| WebScarab | OWASP | Open Source | Web vulnerability scanner | Pre-attack |
| Wireshark | Wireshark | Open Source | Packet analyzer | Pre-attack |
RQ2—what are the target-scenarios in Pentest?
RQ3—what are the models of Pentest?
- Meet (M): the methodology provides detailed definitions and concepts to deal with the feature in an appropriate manner.
- Partly meet (PM): issues about the feature are mentioned, but without the necessary robustness.
- Not meet (NM): the methodology does not mention anything related to the feature.
RQ4—what are the main challenges on Pentest?
Lessons learned and future directions
- Target scenarios: one of the main goals of a security test is to assess the security of resources, devices, controls, or systems, considering a great diversity of target scenarios. The majority of the studies we found consider the web context the top priority when testing security; to a lesser extent, the network environment and its protocols are also considered important. However, there is almost no discussion of security testing in scenarios such as cloud computing, mobile devices, or solutions related to the Internet of Things (IoT). Studies on applying security testing, and especially Pentest, in those scenarios therefore offer the possibility of groundbreaking discoveries and improvements.
- Models and methodologies: as presented previously, the existing methodologies for security testing vary considerably in their characteristics, objectives, and procedures; however, those methodologies also have limitations regarding target scenarios, since they are tailored to serve distinct purposes. Therefore, we believe that none of the so-called “standard” methodologies could be used to execute a Pentest across the full variety of target scenarios. This could be considered one of the core lessons of this systematic mapping, since it presents an open challenge in the security testing area. Creating a new methodology or strategy that could manage the diversity of target scenarios and weigh the advantages and disadvantages of the existing methodologies could point towards a new and interesting path for future studies in security.
- Tools and task automation: during the security testing process, several tools are used for each activity, and the tools listed when answering research question 1 (RQ1) are among the most consolidated in the current research context. Those tools have specific purposes in each testing phase, and testers can determine when and how they will be used according to their preferences. Among the tools, we noticed that applications that scan for and identify vulnerabilities are the ones most often cited in the research papers. Sometimes those tools are not adequate for the strategies testers use; hence, studies are needed to verify to what extent those automated tools meet the testers’ goals. In this sense, attack graphs are a topic related to automation in Pentest. Sarraute et al. [42] argue that attack graphs have been proposed as a tool to help testers understand the potential weaknesses in the target network, since assessing network security is a complex and difficult activity. A detailed explanation of attack graphs is given in [71]. According to that review, attack graphs are used to determine whether designated goal states can be reached by attackers attempting to penetrate computer networks from initial starting states. The graphs consist of nodes and arcs representing attacker actions (which normally involve exploits, or exploit steps, that take advantage of vulnerabilities) and the changes in the network state caused by these actions. The goal of these actions is for the attacker to obtain typically restricted privileges on one or more target hosts. An attack graph must show all possible sequences of attacker actions that lead to the desired level of privilege on the target. It is common to use nodes to represent network states and arcs to represent attack actions, but other representations exist, e.g., nodes for both actions and network states, or actions as nodes and network states as arcs [71]. Tools or frameworks that guide the tester in an insightful way during the entire process are an interesting possibility; future work could investigate how to better balance the complexity of testing against the comprehension of the results.
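The node-and-arc model described in [71] can be illustrated with a toy graph in which nodes are network states and arcs are attacker actions; every state and exploit name below is hypothetical:

```python
# Sketch: a toy attack graph with network states as nodes and attacker
# actions (exploits) as arcs. All names here are hypothetical. A simple
# breadth-first search enumerates the action sequences that reach the
# goal privilege level from the initial state.
from collections import deque

# state -> list of (action, resulting state); acyclic for simplicity
attack_graph = {
    "initial": [("scan_network", "hosts_mapped")],
    "hosts_mapped": [("exploit_web_vuln", "user_on_web_srv")],
    "user_on_web_srv": [("local_priv_esc", "root_on_web_srv")],
    "root_on_web_srv": [],
}

def attack_paths(graph, start, goal):
    """Return every action sequence leading from `start` to `goal`."""
    paths, queue = [], deque([(start, [])])
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            paths.append(actions)
            continue
        for action, nxt in graph.get(state, []):
            queue.append((nxt, actions + [action]))
    return paths

print(attack_paths(attack_graph, "initial", "root_on_web_srv"))
# → [['scan_network', 'exploit_web_vuln', 'local_priv_esc']]
```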
- Dynamics and test reprocessing: since a Pentest requires the identified vulnerabilities to be exploited, the test activities can be modified according to the consequences of this exploitation. This change directly affects the test dynamics and flow, and some decisions during the execution of the activities depend on the tester’s discernment. Nevertheless, a point not considered in the studies covered by this systematic mapping is the flexibility of security testing combined with the possibility of reprocessing stages during the test. In this sense, a continuous evaluation of the executed tasks, with the intention of installing verification cycles, could increase test efficacy or efficiency and potentially facilitate the enumeration of new attack vectors.