Introduction
Recent work
Keystroke dynamics
Keystroke exploitation
Publications | Medium | Objective | Evaluation method | Evaluation result |
---|---|---|---|---|
Malboard [21] | Malicious USB keylogger installed on the victim’s computer keyboard | Steal a victim’s typing behavior via a malicious USB keylogger and use it to impersonate the victim in keystroke dynamics authentication | Evasion Rate (ER) | KeyTrac: ± 90% ER; TypingDNA: ± 85% ER; DuckHunt: ± 100% ER |
SILK-TV [8] | A video that records a physical screen (i.e., ATM and computer screen) displaying password/PIN input | Extract the typing delays of passwords and PINs and use them to infer the plaintext behind masked passwords and PINs | Reduced Search-space | Reduced the password’s search space by 25% to 385% depending on the complexity of the password |
Mimicry [30] | A video that records the victim’s typing activities and their finger movements on a smartphone | Extract a victim’s typing behaviour by observing the fingers’ movements and create an interface for an attacker to mimic the typing behaviour | Evasion Rate (ER) | ± 97% ER for ≤ 3 attack attempts against Touchalytics |
EyeTell [31] | A video that records the victim’s face and gaze while typing their PIN on a touch-screen device | Extract a victim’s keystrokes by capturing and analyzing their eye movements and use them to infer the typed PINs | Reduced Search-space | 4-digit PIN: 74% of the PINs are located in the Top-10 PIN wordlist; 6-digit PIN: 80% of the PINs are located in the Top-10 PIN wordlist |
Proposed architectures
Text cursor detection and tracking
Character isolation and timing extraction
Character conversion and timing extraction
Training and evaluating recognition model
Parameter | Value |
---|---|
Learning rate | 0.1 |
Batch size | 64 |
Epoch | 40 |
Optimizer | Stochastic gradient descent |
Loss | Categorical cross-entropy |
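To make the table concrete, the following is a minimal, framework-free sketch of the listed hyperparameters together with the loss and update rule they name. The dictionary keys and the functions `categorical_cross_entropy` and `sgd_step` are illustrative names assumed here, not taken from the authors' implementation, and the sketch does not reproduce the recognition model itself.

```python
import math

# Hyperparameters copied from the table above; the key names are assumptions.
TRAINING_CONFIG = {
    "learning_rate": 0.1,
    "batch_size": 64,
    "epochs": 40,
    "optimizer": "sgd",
    "loss": "categorical_crossentropy",
}


def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot target and a predicted distribution."""
    # eps guards against log(0) when a predicted probability is exactly zero.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


def sgd_step(weights, grads, lr=TRAINING_CONFIG["learning_rate"]):
    """One plain stochastic gradient descent update: w <- w - lr * g."""
    return [w - lr * g for w, g in zip(weights, grads)]
```

For a one-hot target, the loss reduces to the negative log of the probability assigned to the correct character class, so a confident correct prediction yields a loss close to zero.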
Character conversion with recognition model
Keystroke timing extraction and structuring
Attribute | Data type | Description |
---|---|---|
KeyPress | Integer | Index of frame where the character first appears |
KeyRelease | Integer | Index of frame where the character last appears |
KeyDelay | Float | Represents Down–Down Time (DDT) |
KeyText | Character | Identified character from the IC frame |
OCR Confidence | Float | Confidence score of character recognition process |
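The keystroke attributes above map naturally onto a small record type. The sketch below is one possible representation, assuming that KeyDelay is the Down–Down Time in seconds, obtained by dividing the difference between consecutive KeyPress frame indices by the video frame rate. The `fps` parameter, the unit, and the convention that the first keystroke has a delay of 0.0 are assumptions for illustration; the table does not state them.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Keystroke:
    key_press: int         # index of the frame where the character first appears
    key_release: int       # index of the frame where the character last appears
    key_delay: float       # Down-Down Time (DDT); seconds in this sketch
    key_text: str          # character identified from the frame
    ocr_confidence: float  # confidence score of the character recognition


def fill_key_delays(keystrokes: List[Keystroke], fps: float) -> List[Keystroke]:
    """Set key_delay to the press-to-press interval, in seconds (assumed unit)."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    previous = None
    for stroke in keystrokes:
        if previous is None:
            stroke.key_delay = 0.0  # assumed convention for the first keystroke
        else:
            stroke.key_delay = (stroke.key_press - previous.key_press) / fps
        previous = stroke
    return keystrokes
```

With a 30 fps recording, two characters first appearing 6 frames apart would receive a delay of 0.2 s under these assumptions.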
Attribute | Data type | Description |
---|---|---|
FrameNo | Integer | Index of frame where the KUnit originates |
KUnit Image | Binary | The binary data of KUnit image |
Shape | Float | The shape of the KUnit (width, height) |
X_Coord | Float | The x-axis position of the KUnit on the frame (xmin, xmax) |
Y_Coord | Float | The y-axis position of the KUnit on the frame (ymin, ymax) |
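The KUnit attributes can be sketched in the same way. The example below assumes that Shape is derived from the bounding-box extents (width = xmax − xmin, height = ymax − ymin); the table lists Shape and the coordinates separately and does not confirm this relationship, so the helper `kunit_from_box` is a hypothetical convenience rather than the authors' procedure.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class KUnit:
    frame_no: int                  # index of the frame where the KUnit originates
    image: bytes                   # binary data of the KUnit image
    shape: Tuple[float, float]     # (width, height)
    x_coord: Tuple[float, float]   # (xmin, xmax)
    y_coord: Tuple[float, float]   # (ymin, ymax)


def kunit_from_box(frame_no: int, image: bytes,
                   xmin: float, ymin: float,
                   xmax: float, ymax: float) -> KUnit:
    """Build a KUnit from a bounding box (assumes shape follows from the box)."""
    if xmax < xmin or ymax < ymin:
        raise ValueError("bounding box has negative extent")
    return KUnit(
        frame_no=frame_no,
        image=image,
        shape=(xmax - xmin, ymax - ymin),
        x_coord=(xmin, xmax),
        y_coord=(ymin, ymax),
    )
```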
Video-inferred Keystroke timing calculation
Result and discussion
Experimental settings
ID | Type | Sentence |
---|---|---|
A | Password phrase | abudhabiacrossthesea |
B | Greeting sentence | hi my name is [NAME] |
Subjects | Occupation | Age |
---|---|---|
S-001 | Lab Technician | 23 |
S-002 to S-004 | Lecturer | 27 to 40 |
S-005 to S-009 | Software engineer | 21 to 23 |
S-009 to S-014 | Student | 16 to 21 |
Password phrase group
The string abudhabiacrossthesea is 20 characters long, alphabetical only, and contains no spaces. No typos occurred while collecting the samples. This sample group represents a password input.
Similarity evaluation
Performance evaluation
Greeting sentence group
The text "hi my name is [NAME]" is alphabetical only and varies from 18 to 34 characters in length (spaces included) because the subjects’ names differ. No typos occurred while collecting the samples. In addition, no two subjects share the same name, so each subject typed a different sentence. This sample group represents a dynamic input.
Similarity test
Performance test
KeyTrac spoofing simulation
Mode | Evaluation group | Text |
---|---|---|
Password | Password phrase | abudhabiacrossthesea |
Freetext | Greeting sentence | hi my name is [NAME] |
Mode | Config | Value | Description |
---|---|---|---|
Password | Threshold | 50% | Threshold for a user to be considered as authenticated |
Password | Min. Sample Count | 2 | Minimum number of (valid) samples. |
Freetext | Threshold | 50% | Threshold for a user to be considered as authenticated |
Freetext | Min. Text Length | 10 | Minimum length of a text to be used for an enrollment. |
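The configuration above can be read as a pair of enrollment checks and a decision rule. The sketch below encodes the table values and applies them. It is not the KeyTrac API: the names `SPOOF_SIM_CONFIG`, `can_enroll`, and `is_authenticated` are hypothetical, and treating a score equal to the threshold as authenticated is an assumption.

```python
# Values copied from the configuration table; the structure is an assumption.
SPOOF_SIM_CONFIG = {
    "password": {"threshold": 50.0, "min_sample_count": 2},
    "freetext": {"threshold": 50.0, "min_text_length": 10},
}


def can_enroll(mode, samples):
    """Check whether the enrollment samples satisfy the minimums for a mode."""
    cfg = SPOOF_SIM_CONFIG[mode]
    if mode == "password":
        # Password mode needs at least the minimum number of valid samples.
        return len(samples) >= cfg["min_sample_count"]
    # Freetext mode needs every enrollment text to reach the minimum length.
    return len(samples) > 0 and all(
        len(text) >= cfg["min_text_length"] for text in samples
    )


def is_authenticated(mode, score_percent):
    """Accept when the similarity score reaches the threshold (assumed >=)."""
    return score_percent >= SPOOF_SIM_CONFIG[mode]["threshold"]
```

Under these assumptions, a spoofing attempt counts as successful whenever the score returned for the attacker's replayed timings reaches 50%.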
Biometrics evaluation measurement
Evasion rate measurement
Big data application
Method comparison and limitations
Methods | Attack medium | Target platform | Attack requirement |
---|---|---|---|
Malboard [21] | Hardware (USB) Keylogger | Computer | 1. Requires installing arbitrary keylogger hardware (USB) 2. Requires collecting more keystroke combinations to achieve higher similarity |
Mimicry [30] | Video of Smartphone Typing Activities (Touchscreen) | Smartphone | 1. Victim’s finger must be visible in the video 2. Attacker needs to train themselves to mimic the victim’s typing behaviour; the proposed method only provides real-time visual guidance |
Our Method | Screen-recorded Video | Both Computer & Smartphone | 1. Victim’s screen must be recorded (e.g., through screen-sharing activity) 2. Requires collecting more keystroke combinations to achieve higher similarity |
Methods | Attacker interaction | Remote exploitation | Detection likelihood | Other limitations |
---|---|---|---|---|
Malboard [21] | Required; the attacker must install a hardware keylogger into the victim’s keyboard | No | Likely; the attached hardware keylogger is easier to notice | The attack requires an Internet connection to transfer the collected keystroke behaviours |
Mimicry [30] | Required; the attacker must be able to record the victim’s finger and the smartphone’s screen while the victim is typing | No | Likely; the attacker must be in close proximity to the victim to record their typing activity on the smartphone | The attack is not automated; the proposed method only provides visual guidance. |
Our Method | Semi-required; the attacker could passively obtain the screen-recorded video via the victim’s screen-sharing activity | Yes | Unlikely; the victim is less likely to notice when their screen is being recorded remotely (e.g., via screen-sharing activity) | The evasion rate (ER) is lower, and only lowercase characters are supported (as of now) |