Skip to main content
Top

2025 | OriginalPaper | Chapter

Formally Verified Implementation of the K-Nearest Neighbors Classification Algorithm

Authors : Bernny Velasquez, Jessica Herring, Nadeem Abdul Hamid

Published in: Formal Methods: Foundations and Applications

Publisher: Springer Nature Switzerland

Activate our intelligent search to find suitable subject content or patents.

search-config
loading …

Abstract

Classification, one of the most commonly applied algorithmic techniques in data mining, involves assigning a class label to observations based on previously seen data. Among the most ubiquitous and well-known approaches to classification is K-nearest neighbor (\(K\)NN) search, which predicts the class label for a query by determining the plurality class of its K-nearest data points. In this paper, we present a mechanically verified implementation of a K-nearest neighbors classification algorithm in the Coq  proof assistant, a powerful formal verification tool. Formally certifying the implementation, by proving that it meets its specification, provides a strong guarantee and high confidence that the classifier actually produces results in the expected manner. Given the conceptually simple nature of the \(K\)NN algorithm, this serves as a good baseline for developing specification and verification techniques for machine learning implementations.

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Appendix
Available only for authorised users
Footnotes
1
Throughout the paper, lowercase k denotes the number of dimensions of the data points and uppercase K is the number of nearest neighbors that are sought.
 
2
The Coq syntax we use for function headers is: \((\textit{ident}_{\tiny \text {{function}}}\ {\textbf {(}}\textit{ident}_{\tiny \text {{param}}} {\textbf {:}} \ \textit{type}_{\tiny \text {{param}}} {\textbf {)}} \cdots \ {\textbf {(}}\textit{ident}_{\tiny \text {{param}}}\ {\textbf {:}} \ \textit{type}_{\tiny \text {{param}}} {\textbf {)}}\ {\textbf {:}} \ \textit{type}_{\tiny \text {{return}}})\) representing a named function with explicitly-typed parameters.
 
3
As part of our work for the implementation of this paper, we abstracted the distance metric, D, used by knn_search - an improvement over [16]. For space reasons, we omit any further discussion about it throughout the rest of this paper.
 
6
Conventionally, in Coq theorem statements, a conjunction of premises is written as a string of implication clauses (... -> ... -> ...).
 
8
Note, that correctness of the mechanics of the algorithm implementation (as formalized in this work), does not imply that the algorithm will necessarily produce accurate classifications for data points. It could well be that the correct plurality vote of the K nearest neighbors is different from the actual label of a query point.
 
Literature
4.
go back to reference Barthe, G., Forest, J., Pichardie, D., Rusu, V.: Defining and reasoning about recursive functions: a practical tool for the coq proof assistant. In: Hagiya, M., Wadler, P. (eds.) Functional and Logic Programming, pp. 114–129. Springer, Berlin Heidelberg, Berlin, Heidelberg (2006). https://doi.org/10.1007/11737414_9CrossRef Barthe, G., Forest, J., Pichardie, D., Rusu, V.: Defining and reasoning about recursive functions: a practical tool for the coq proof assistant. In: Hagiya, M., Wadler, P. (eds.) Functional and Logic Programming, pp. 114–129. Springer, Berlin Heidelberg, Berlin, Heidelberg (2006). https://​doi.​org/​10.​1007/​11737414_​9CrossRef
5.
go back to reference Barthe, G., Grégoire, B., Zanella Béguelin, S.: Formal certification of code-based cryptographic proofs. In: POPL ’09: Proceedings of the 36th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 90–101. Association for Computing Machinery, New York, NY, USA (2009). https://doi.org/10.1145/1480881.1480894 Barthe, G., Grégoire, B., Zanella Béguelin, S.: Formal certification of code-based cryptographic proofs. In: POPL ’09: Proceedings of the 36th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pp. 90–101. Association for Computing Machinery, New York, NY, USA (2009). https://​doi.​org/​10.​1145/​1480881.​1480894
11.
go back to reference Fan, A.Z., Koutris, P.: Certifiable robustness for nearest neighbor classifiers. In: Olteanu, D., Vortmeier, N. (eds.) 25th International Conference on Database Theory, ICDT 2022, March 29 to April 1, 2022, Edinburgh, UK (Virtual Conference). LIPIcs, vol. 220, pp. 1–20. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022). https://doi.org/10.4230/LIPICS.ICDT.2022.6 Fan, A.Z., Koutris, P.: Certifiable robustness for nearest neighbor classifiers. In: Olteanu, D., Vortmeier, N. (eds.) 25th International Conference on Database Theory, ICDT 2022, March 29 to April 1, 2022, Edinburgh, UK (Virtual Conference). LIPIcs, vol. 220, pp. 1–20. Schloss Dagstuhl - Leibniz-Zentrum für Informatik (2022). https://​doi.​org/​10.​4230/​LIPICS.​ICDT.​2022.​6
12.
go back to reference Fassina, N., Ranzato, F., Zanella, M.: Robustness certification of k-nearest neighbors. In: Proceedings of the 23rd IEEE International Conference on Data Mining (ICDM’23) (2023) Fassina, N., Ranzato, F., Zanella, M.: Robustness certification of k-nearest neighbors. In: Proceedings of the 23rd IEEE International Conference on Data Mining (ICDM’23) (2023)
14.
go back to reference Gu, R., Shao, Z., Chen, H., Wu, X., Kim, J., Sjöberg, V., Costanzo, D.: Certikos: an extensible architecture for building certified concurrent OS kernels. In: OSDI’16: Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation, pp. 653–669. USENIX Association, USA (2016) Gu, R., Shao, Z., Chen, H., Wu, X., Kim, J., Sjöberg, V., Costanzo, D.: Certikos: an extensible architecture for building certified concurrent OS kernels. In: OSDI’16: Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation, pp. 653–669. USENIX Association, USA (2016)
15.
go back to reference Hall, J.A.: Seven myths of formal methods. IEEE Softw. 7(5), 11–19 (1990)CrossRef Hall, J.A.: Seven myths of formal methods. IEEE Softw. 7(5), 11–19 (1990)CrossRef
16.
go back to reference Hamid, N.A.: (Nearest) neighbors you can rely on: formally verified k-d tree construction and search in Coq. In: SAC ’24: Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, pp. 1684–1693. Association for Computing Machinery, New York, NY, USA (2024). https://doi.org/10.1145/3605098.3635960 Hamid, N.A.: (Nearest) neighbors you can rely on: formally verified k-d tree construction and search in Coq. In: SAC ’24: Proceedings of the 39th ACM/SIGAPP Symposium on Applied Computing, pp. 1684–1693. Association for Computing Machinery, New York, NY, USA (2024). https://​doi.​org/​10.​1145/​3605098.​3635960
20.
go back to reference Namiot, D., Ilyushin, E., Chizhov, I.: On a formal verification of machine learning systems. Int. J. Open Inf. Technol. 10(05), 30–34 (2022) Namiot, D., Ilyushin, E., Chizhov, I.: On a formal verification of machine learning systems. Int. J. Open Inf. Technol. 10(05), 30–34 (2022)
22.
go back to reference Selsam, D., Liang, P., Dill, D.L.: Developing bug-free machine learning systems with formal mathematics. In: ICML’17: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 3047–3056. JMLR.org (2017) Selsam, D., Liang, P., Dill, D.L.: Developing bug-free machine learning systems with formal mathematics. In: ICML’17: Proceedings of the 34th International Conference on Machine Learning, vol. 70, pp. 3047–3056. JMLR.org (2017)
24.
go back to reference Sitawarin, C., Kornaropoulos, E.M., Song, D., Wagner, D.: Adversarial examples for k-nearest neighbor classifiers based on higher-order voronoi diagrams. In: Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34. Curran Associates Inc, Red Hook, NY (2021) Sitawarin, C., Kornaropoulos, E.M., Song, D., Wagner, D.: Adversarial examples for k-nearest neighbor classifiers based on higher-order voronoi diagrams. In: Beygelzimer, A., Dauphin, Y., Liang, P., Vaughan, J.W. (eds.) Advances in Neural Information Processing Systems, vol. 34. Curran Associates Inc, Red Hook, NY (2021)
Metadata
Title
Formally Verified Implementation of the K-Nearest Neighbors Classification Algorithm
Authors
Bernny Velasquez
Jessica Herring
Nadeem Abdul Hamid
Copyright Year
2025
DOI
https://doi.org/10.1007/978-3-031-78116-2_9

Premium Partner