ABSTRACT
The k-nearest neighbor (kNN) classification approach is conceptually simple, yet widely applied because it often performs well in practice. However, a single global constant k does not always yield an optimal solution, e.g., for datasets with an irregular density distribution of data points. This paper proposes an adaptive kNN classifier in which k is chosen dynamically for each instance (point) to be classified, such that the expected accuracy of the classification is maximized. We define the expected accuracy as the accuracy over a set of structurally similar observations; an arbitrary similarity function can be used to find these observations, and we introduce and evaluate several such functions. For the evaluation, we use five classification tasks based on geo-spatial data, each consisting of (tens of) thousands of items. We demonstrate that the presented expected-accuracy measures are good estimators of kNN performance, and that the proposed adaptive kNN classifier outperforms both common kNN and previously introduced adaptive kNN algorithms. We also show that the range of considered k values can be significantly reduced to speed up the algorithm without a negative effect on classification accuracy.
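The per-instance selection of k described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes Euclidean distance as the similarity function, takes the m nearest training points as the "structurally similar observations", and estimates the expected accuracy of each candidate k as their leave-one-out kNN accuracy. All function names and parameters (`adaptive_knn_predict`, `k_candidates`, `m`) are illustrative.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k):
    """Plain kNN: majority vote among the k training points nearest to x."""
    dists = np.linalg.norm(X_train - x, axis=1)
    idx = np.argsort(dists)[:k]
    return Counter(y_train[idx]).most_common(1)[0][0]

def adaptive_knn_predict(X_train, y_train, x, k_candidates=(1, 3, 5, 7, 9), m=20):
    """Adaptive kNN sketch: for the query x, find the m most similar
    training observations, estimate each candidate k's expected accuracy
    as its leave-one-out accuracy on those observations, and classify x
    with the k that maximizes this estimate."""
    dists = np.linalg.norm(X_train - x, axis=1)
    similar = np.argsort(dists)[:m]  # structurally similar observations
    best_k, best_acc = k_candidates[0], -1.0
    for k in k_candidates:
        hits = 0
        for i in similar:
            # leave observation i out, classify it with the remaining data
            mask = np.arange(len(X_train)) != i
            pred = knn_predict(X_train[mask], y_train[mask], X_train[i], k)
            hits += int(pred == y_train[i])
        acc = hits / len(similar)
        if acc > best_acc:
            best_k, best_acc = k, acc
    return knn_predict(X_train, y_train, x, best_k)
```

Restricting `k_candidates` to a small set corresponds to the paper's observation that the range of considered k can be reduced to speed up the algorithm.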