Detection of Rogue Certificates from Trusted Certificate Authorities Using Deep Neural Networks

Authors:
Zheng Dong, Microsoft Corporation, Redmond, WA
Kevin Kane, Microsoft Research, Redmond, WA
L. Jean Camp, Indiana University Bloomington, Bloomington, IN

ACM Transactions on Privacy and Security, Volume 19, Issue 2, Article No. 5, pp. 1–31. https://doi.org/10.1145/2975591

Published: 17 September 2016

Abstract

Rogue certificates are valid certificates issued by a legitimate certificate authority (CA) that are nonetheless untrustworthy, yet they are trusted by web browsers and users. With the current public key infrastructure, there is a window of vulnerability between the time a rogue certificate is issued and the time it is detected. Rogue certificates from recent compromises have been trusted for as long as weeks before detection and revocation. Previous proposals to close this window of vulnerability require changes in the infrastructure, Internet protocols, or end-user experience. We present a method, developed from a large and timely collection of certificates, for detecting rogue certificates from trusted CAs. This method automates classification by building machine-learning models with deep neural networks (DNNs). Despite the scarcity of rogue instances in the dataset, the DNN produced a classifier that was validated both in simulation and against the July 2014 compromise of the India CCA. We report the details of the classification method and illustrate that it is repeatable, for example with datasets obtained from crawling. Finally, we describe the classification performance under our current research deployment.
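The scarcity of rogue instances mentioned in the abstract matters for how any such classifier is judged. The following is a minimal sketch, not the authors' method or data: the certificate counts and both sets of predictions are hypothetical and chosen only to show why plain accuracy can mislead when rogue certificates are rare, and why recall and precision on the rogue class are more informative.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 = rogue, 0 = benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Fraction of all certificates labeled correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def recall(tp, fp, tn, fn):
    """Fraction of rogue certificates that were flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp, tn, fn):
    """Fraction of flagged certificates that were actually rogue."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical dataset: 10,000 certificates, of which only 10 are rogue.
y_true = [1] * 10 + [0] * 9990

# Trivial baseline that labels every certificate as benign.
always_benign = [0] * 10000
baseline = confusion_counts(y_true, always_benign)
baseline_accuracy = accuracy(*baseline)  # high, although nothing is detected
baseline_recall = recall(*baseline)      # 0.0: every rogue certificate is missed

# Hypothetical detector: flags 8 of the 10 rogue certificates,
# plus 20 benign certificates by mistake.
detector_pred = [1] * 8 + [0] * 2 + [1] * 20 + [0] * 9970
detector = confusion_counts(y_true, detector_pred)
detector_accuracy = accuracy(*detector)
detector_recall = recall(*detector)
detector_precision = precision(*detector)

print(f"baseline: accuracy={baseline_accuracy:.4f} recall={baseline_recall:.2f}")
print(f"detector: accuracy={detector_accuracy:.4f} recall={detector_recall:.2f} "
      f"precision={detector_precision:.2f}")
```

In this made-up example the trivial baseline has slightly higher accuracy than the detector even though it finds no rogue certificates at all, which is why per-class metrics are the more useful lens for a heavily imbalanced dataset.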

Index Terms

1. Security and privacy
2. Social and professional topics
  1. Computing / technology policy
    1. Computer crime

Published in
ACM Transactions on Privacy and Security, Volume 19, Issue 2
September 2016
83 pages
ISSN: 2471-2566
EISSN: 2471-2574
DOI: 10.1145/2988517
Editor: David Basin, ETH Zurich, Switzerland
Copyright © 2016 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher: Association for Computing Machinery, New York, NY, United States
Publication History
- Published: 17 September 2016
- Accepted: 1 July 2016
- Revised: 1 April 2016
- Received: 1 February 2015

Author Tags
Certificates
machine learning
Qualifiers
- research-article
- Research
- Refereed