research-article
Open Access

Detection of Rogue Certificates from Trusted Certificate Authorities Using Deep Neural Networks

Published: 17 September 2016

Abstract

Rogue certificates are valid certificates issued by a legitimate certificate authority (CA) that are nonetheless untrustworthy, yet are still trusted by web browsers and users. With the current public key infrastructure, there is a window of vulnerability between the time a rogue certificate is issued and the time it is detected. Rogue certificates from recent compromises have been trusted for as long as weeks before detection and revocation. Previous proposals to close this window require changes to the infrastructure, to Internet protocols, or to the end-user experience. We present a method for detecting rogue certificates from trusted CAs, developed from a large and timely collection of certificates. The method automates classification by building machine-learning models with Deep Neural Networks (DNN). Despite the scarcity of rogue instances in the dataset, the DNN produced a classification method that is proven both in simulation and on the July 2014 compromise of the India CCA. We report the details of the classification method and show that it is repeatable, for example with datasets obtained by crawling. We also describe the classification performance under our current research deployment.

      • Published in

        ACM Transactions on Privacy and Security  Volume 19, Issue 2
        September 2016
        83 pages
        ISSN:2471-2566
        EISSN:2471-2574
        DOI:10.1145/2988517
        Copyright © 2016 Owner/Author

        Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Publication History

        • Published: 17 September 2016
        • Accepted: 1 July 2016
        • Revised: 1 April 2016
        • Received: 1 February 2015
        Qualifiers

        • research-article
        • Research
        • Refereed
