1 Introduction
2 Background
- Training: obtaining features for the definition of the user behavior pattern;
- Recognition: matching observed features against the user behavior pattern.
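The two phases above can be sketched as a minimal enroll-and-match loop. Everything below (the feature values, the mean-based template, the threshold) is illustrative only and is not taken from any system covered by the review:

```python
# Hypothetical sketch of the two phases: train() builds a user behavior
# pattern from feature vectors, recognize() matches a new sample against it.
# The threshold value is illustrative, not taken from the survey.

def train(samples):
    """Build a behavior pattern as the per-feature mean of the samples."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]

def recognize(pattern, sample, threshold=50.0):
    """Accept the sample if its mean absolute deviation from the pattern
    stays below the threshold (times in milliseconds here)."""
    dist = sum(abs(p - s) for p, s in zip(pattern, sample)) / len(pattern)
    return dist < threshold

# Enrollment: timing features (ms) from three typing samples of one user.
profile = train([[120, 95, 130], [110, 100, 125], [115, 90, 135]])
print(recognize(profile, [118, 96, 128]))   # close to the profile -> True
print(recognize(profile, [300, 250, 400]))  # far from the profile -> False
```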
3 Systematic review
3.1 Planning
- Identification of the review need: a systematic review aims to summarize all available information on a specific topic. Before starting one, however, the need for the review must be checked: for instance, by verifying whether previously published systematic reviews already deal with the topic under investigation and whether the protocols of those reviews meet the requirements of the research.
- Commissioning (optional): in some cases, due to lack of time or of specific knowledge, one may need to ask other researchers to conduct the systematic review.
- Specification of the research questions: this is considered the most important part of the systematic review, as these questions guide all the following steps, such as the search for primary studies and the extraction and analysis of information.
- Development of the review protocol: this step defines the strategies to be used for the search, selection and evaluation of the references. In addition, the information to be extracted from each selected reference is also defined.
- Protocol evaluation (optional): as the review protocol is an essential part of the systematic review, it is recommended that it be reviewed by other researchers.
3.2 Conduction
- Reference search: search for the greatest possible number of references that can answer the research questions, in order to avoid bias. In a systematic review the search is performed with increased rigour, with search expressions and databases defined in advance, which distinguishes it from traditional reviews.
- Selection of primary studies: after the reference search, the studies that are in fact relevant to the research must be selected, through the use of inclusion/exclusion criteria.
- Quality evaluation: each selected reference undergoes a quality evaluation. This evaluation may serve diverse aims, such as contributing to the inclusion/exclusion criteria or supporting the summary of results by measuring the importance of each study.
- Information extraction: the extraction of information from the references must be done with the support of forms defined during the planning phase of the systematic review.
- Data synthesis: this step corresponds to summarizing the results attained during the review. The summary may involve qualitative and quantitative aspects; for quantitative aspects, a meta-analysis may also be applied.
3.3 Reporting the review
- Specification of the dissemination mechanisms and formulation of the report: dissemination of the results attained by the systematic review. This can be done through publication in academic journals and conferences, or even on web sites.
- Report evaluation (optional): this evaluation can be requested from experts in the research area. If the review is submitted to a journal or conference, the peer-review process of the publication can itself be considered an evaluation of the report.
4 How the systematic review was applied
4.1 Planning
4.1.1 Research questions
- What are the advantages and disadvantages of using keystroke dynamics for intrusion detection?
- What features are extracted from the typing data?
- What classification algorithms are applied? What algorithms are used in the performance comparisons?
- What measures were used to evaluate the performance? What was the performance achieved?
- What datasets are used to measure the performance of the classifier? How many users took part in the tests performed?
4.1.2 References search
- ACM Digital Library (http://dl.acm.org/)
- IEEE Xplore (http://ieeexplore.ieee.org/)
- Science Direct (http://www.sciencedirect.com/)
- Web of Science (http://isiknowledge.com/)
- Scopus (http://www.scopus.com/)
4.1.3 Selection criteria
4.1.4 Information extraction
- Basic information about the publication (title, authors, name and year of publication)
- Were performance tests conducted?
- Type of device (e.g. PC, mobile)
- Best performance achieved: algorithm, measure and performance
- Number of users in the tests
- Algorithms used in the tests
- Extracted features
- Is the test dataset available to be reused? Where?
- Type of verification: static text or dynamic text?
- Observations
4.2 Conduction
4.2.1 Application of the search expressions
| Database | Number of references |
|---|---|
| ACM Digital Library | 71 |
| IEEE Xplore | 308 |
| Science Direct | 104 |
| Web of Science | 596 |
| Scopus | 943 |
| Gaines et al. [15] | 1 |
| Total | 2,023 |
4.2.2 Selection of references
| Step | Number |
|---|---|
| Total of references | 2,023 |
| After elimination of duplicates and exclusion criteria 1 and 2 | 230 |
| After exclusion criterion 3 | 203 |
| After exclusion of secondary studies | 200 |
4.2.3 Quality assessment
5 Results
5.1 Advantages and disadvantages
- Passwords may be shared by several users, resulting in unauthorized access;
- Passwords may be copied without authorization;
- Passwords may be guessed, particularly weak ones, as when someone uses his/her birthday as a password [43].
5.2 Extracted features
- DU1: time difference between the instants at which a key is pressed and released. This feature represents how long the key remains pressed and is also called dwell time by some authors [38].
- DU2: time difference between the instant at which a key is pressed and the instant at which the next key is released.
- UD: time difference between the instants at which a key is released and the next one is pressed. This feature is also known as flight time [38].
- DD: time difference between the instants at which a key is pressed and the next key is pressed.
- UU: time difference between the instants at which a key is released and the next key is released.
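Given the press/release timestamps of two consecutive keystrokes, the five features can be computed directly. A small sketch with invented timestamps (the function name and values are illustrative, not from any surveyed paper):

```python
# Sketch: computing the five timing features from the (press, release)
# timestamps of two consecutive keystrokes. Times in milliseconds.

def timing_features(p1, r1, p2, r2):
    """p1/r1: press/release of the first key; p2/r2: of the next key."""
    return {
        "DU1": r1 - p1,  # dwell time of a single key
        "DU2": r2 - p1,  # press of a key to release of the next
        "UD":  p2 - r1,  # flight time: release to next press (may be negative)
        "DD":  p2 - p1,  # press to next press
        "UU":  r2 - r1,  # release to next release
    }

# Key 'a' pressed at t=0 ms, released at 110 ms;
# key 'b' pressed at 180 ms, released at 270 ms.
print(timing_features(0, 110, 180, 270))
# {'DU1': 110, 'DU2': 270, 'UD': 70, 'DD': 180, 'UU': 160}
```

Note that UD can be negative when a typist presses the next key before releasing the previous one, which is common for fast typists.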
| Reference | Extracted features |
|---|---|
| Montalvao et al. [37] | DD |
| | DD with equalization |
| Giot et al. [17] | UU, DD, UD, DU2 |
| Giot et al. [19] | UU, DD, UD, DU2 and total typing time |
| Killourhy and Maxion [30] | DU1, UD |
| Rodrigues et al. [43] | UD, DU1 |
| | UD, DU1, UU, DD |
| Hosseinzadeh and Krishnan [24] | DU1 |
| | DD |
| | UU |
| | UU, DD |
| | DU1, DD |
| | DU1, UU |
| | DU1, UU, DD |
| Killourhy and Maxion [31] | DU1, DD, UD |
| | DU1, DD |
| | DU1, UD |
| Bartlow and Cukic [3] | DU1, UD (average, standard deviation, sum, minimum and maximum), including the Shift key |
| Chang [9] | DU1, UD |
| Montalvao Filho and Freire [14] | DD |
| | DD with equalization |
| Gunetti and Picardi [22] | DU1, UD |
| Monrose and Rubin [36] | DU1, UD |
| Yu and Cho [48] | DU1, UD |
| Giot et al. [20] | UU, DD, UD, DU1 |
| Chang et al. [8] | DU1, UD, DD, pressure |
| Killourhy and Maxion [32] | DU1, DD |
5.3 Classification algorithms
| Reference | Classifier |
|---|---|
| Montalvao et al. [37] | Bleha [4] |
| | Monrose and Rubin [36] |
| | Gunetti and Picardi [22] |
| Giot et al. [17] | SVM |
| | Statistical |
| | Neural network |
| | Classifier based on distance |
| Giot et al. [19] | SVM |
| | Statistical |
| | Classifier based on Euclidean distance |
| | Classifier based on Hamming distance |
| Killourhy and Maxion [30] | Nearest neighbour |
| | Neural network |
| | Mean-based classifier |
| Rodrigues et al. [43] | Hidden Markov Model (HMM) |
| | Statistical |
| Hosseinzadeh and Krishnan [24] | Gaussian Mixture Model (GMM) + leave-one-out method |
| Killourhy and Maxion [31] | Nearest neighbour |
| | Outlier count (z-score) |
| | Manhattan distance |
| Bartlow and Cukic [3] | Random Forests |
| Chang [9] | Tree-based with Euclidean distance |
| Montalvao Filho and Freire [14] | Bleha [4] |
| | Monrose and Rubin [36] |
| | 1D-Histogram and 2D-Histogram |
| Gunetti and Picardi [22] | Proposed methods: R measure and A measure |
| Monrose and Rubin [36] | Euclidean distance |
| | Weighted and non-weighted probability |
| | Bayes |
| Yu and Cho [48] | SVM |
| | 2-layer and 4-layer Auto-Associative Multi-Layer Perceptron (AAMLP) |
| Giot et al. [20] | Based on Gaussian distribution [23] |
| Chang et al. [8] | Statistical [5] |
| Killourhy and Maxion [32] | Statistical |
| | Disorder-based |
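Several of the surveyed approaches are simple distance-based anomaly detectors. A minimal sketch of a mean-template classifier scored with the Manhattan distance, with invented data and no claim to match any surveyed implementation:

```python
# Illustrative sketch of a mean-template detector scored with the Manhattan
# distance. Feature values and names are made up for the example.

def fit_template(training_vectors):
    """Per-feature mean of the genuine user's training vectors."""
    n = len(training_vectors)
    return [sum(col) / n for col in zip(*training_vectors)]

def manhattan_score(template, vector):
    """Anomaly score: sum of absolute per-feature deviations."""
    return sum(abs(t - v) for t, v in zip(template, vector))

# Two timing features (ms) per sample; three enrollment samples.
template = fit_template([[100, 80], [104, 84], [96, 76]])
genuine  = manhattan_score(template, [101, 81])   # small score
impostor = manhattan_score(template, [150, 30])   # large score
print(genuine < impostor)  # True
```

A decision is made by comparing the score against a threshold; choosing that threshold is exactly what the FAR/FRR trade-off in the next section is about.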
5.4 Performance evaluation
- FAR and FRR: the false acceptance rate (FAR) measures the percentage of times an intruder is erroneously accepted as legitimate, and the false rejection rate (FRR) measures the percentage of times a legitimate user is wrongly rejected [40]. These two rates vary according to the graph in Fig. 7, depending on the sensitivity level of the algorithm: when one rate decreases, the other increases.
- EER: the equal error rate (EER) is the error value at which FAR and FRR assume the same value [11]. In contrast to FAR and FRR, this measure does not depend on the sensitivity level of the classification algorithm.
- Accuracy rate: measures only the percentage of correct classifications attained by the algorithm.
- Integrated error: the area under the curve plotted with the FAR and FRR rates, as shown in Fig. 8. The value of the shaded area is the integrated error; smaller areas represent better performance.
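The relationship between FAR, FRR and the EER can be illustrated by sweeping a decision threshold over anomaly scores; all scores below are invented for the example:

```python
# Sketch of the FAR/FRR trade-off. A sample is accepted when its anomaly
# score falls below the decision threshold; all scores are made up.

def far_frr(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR) at the given acceptance threshold."""
    far = sum(s < threshold for s in impostor_scores) / len(impostor_scores)
    frr = sum(s >= threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr

genuine = [1.0, 2.0, 2.5, 3.0, 4.0]    # scores of the legitimate user's attempts
impostor = [3.5, 5.0, 6.0, 7.0, 8.0]   # scores of intrusion attempts

# Lowering the threshold lowers FAR but raises FRR, and vice versa;
# at the threshold where the two rates meet, their common value is the EER.
far, frr = far_frr(genuine, impostor, 3.6)
print(far, frr)  # 0.2 0.2 -> FAR == FRR here, so the EER is about 20 %
```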
| Classifier | Users | EER (%) |
|---|---|---|
| Gunetti and Picardi [37] | 205 | 13 |
| SVM [19] | 100 | 6.95 |
| Nearest neighbour [30] | 51 | 9.96 |
| Hidden Markov Model [43] | 20 | 3.6 |
| Bleha (with equalization) [14] | 47 | 6.2 |
| Manhattan distance [31] | 51 | 7.1 |
| GMM [24] | 41 | 4.4 |
| Based on Gaussian distribution [20] | 83 | 8.87 |
| Statistical [8] | 100 | 6.9 |