Background
Related work
System architecture for prediction of phishing URL
Feature extraction
Heuristic 1: length of the host URL
Heuristic 2: number of slashes in URL
Heuristic 3: dots in host name of the URL
Heuristic 4: number of terms in the host name of the URL
Heuristic 5: special characters
Heuristic 6: IP address
Heuristic 7: Unicode in URL
Heuristic 8: transport layer security
Heuristic 9: subdomain
Heuristic 10: certain keywords in the URL
Heuristic 11: top level domain
Heuristic 12: number of dots in the path of the URL
Heuristic 13: hyphen in the host name of the URL
Heuristic 14: URL length
Association rule mining
Association rule mining to detect phishing URL
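The fourteen heuristics named above can be read as one feature vector per URL. The following is a minimal sketch of such an extractor, not the authors' implementation: the function name `extract_features`, the feature keys, the keyword list, and the set of characters counted as "special" are all assumptions made for illustration, since the outline gives only the heuristic names.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical keyword list and special-character set (assumptions, not from the source).
SUSPICIOUS_KEYWORDS = ("login", "secure", "account", "verify", "update")
SPECIAL_CHARS = "@_~%&="


def extract_features(url: str) -> dict:
    """Map a URL to one value per heuristic (1 to 14, in order)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    path = parts.path

    # Heuristic 6: does the host parse as a literal IP address?
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    labels = [label for label in host.split(".") if label]

    return {
        "host_length": len(host),                                   # 1
        "slash_count": url.count("/"),                              # 2
        "host_dot_count": host.count("."),                          # 3
        "host_term_count": len(labels),                             # 4
        "special_char_count": sum(url.count(c) for c in SPECIAL_CHARS),  # 5
        "host_is_ip": host_is_ip,                                   # 6
        "has_non_ascii": any(ord(ch) > 127 for ch in url),          # 7
        "uses_tls": parts.scheme == "https",                        # 8
        # 9: labels to the left of a two-label registered domain (a simplification).
        "subdomain_count": 0 if host_is_ip else max(len(labels) - 2, 0),
        "has_keyword": any(k in url.lower() for k in SUSPICIOUS_KEYWORDS),  # 10
        "tld": "" if host_is_ip or not labels else labels[-1],      # 11
        "path_dot_count": path.count("."),                          # 12
        "host_has_hyphen": "-" in host,                             # 13
        "url_length": len(url),                                     # 14
    }


features = extract_features("http://secure-login.example.com/a/b.c/index.html")
```

Turning these raw values into the binary items that a rule miner expects would need thresholds (for example, a cut-off on URL length); the outline does not state them, so they are left out here.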
Source | Link |
---|---|
Yahoo most visited sites | |
Google’s top 1000 most visited sites | |
Alexa’s top targeted sites | |
Netcraft’s most visited sites | |
Millersmile’s top targeted sites | |
Rule extracted from Apriori
Rule extracted from Predictive Apriori
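Association rules of the kind named above are conventionally scored by support and confidence. As a minimal sketch under stated assumptions, the block below computes both measures for a rule over boolean URL features; the function names, the feature keys, and the four toy rows are hypothetical and are not data or rules from the study.

```python
def support(rows, items):
    """Fraction of rows in which every item in `items` is True."""
    if not rows:
        return 0.0
    hits = sum(1 for row in rows if all(row.get(item, False) for item in items))
    return hits / len(rows)


def confidence(rows, antecedent, consequent):
    """support(antecedent and consequent) / support(antecedent)."""
    base = support(rows, antecedent)
    if base == 0:
        return 0.0
    return support(rows, list(antecedent) + list(consequent)) / base


# Toy rows, invented purely for illustration.
rows = [
    {"host_is_ip": True, "uses_tls": False, "phishing": True},
    {"host_is_ip": True, "uses_tls": False, "phishing": True},
    {"host_is_ip": False, "uses_tls": True, "phishing": False},
    {"host_is_ip": True, "uses_tls": True, "phishing": False},
]

# Rule: host_is_ip => phishing
rule_support = support(rows, ["host_is_ip", "phishing"])
rule_confidence = confidence(rows, ["host_is_ip"], ["phishing"])
```

On these toy rows the rule holds in 2 of 4 rows, and in 2 of the 3 rows where the antecedent is true.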
No. of instances | Apriori (ms) | Predictive Apriori (ms) |
---|---|---|
500 | 0 | 10,958 |
1000 | 1 | 23,996 |
1500 | 1 | 39,002 |
2000 | 2 | 48,000 |
2500 | 2 | 109,982 |
3000 | 3 | 198,001 |