16-11-2020 | Regular Paper | Issue 2/2021

Knowledge and Information Systems 2/2021

BestNeighbor: efficient evaluation of kNN queries on large time series databases

Journal: Knowledge and Information Systems, Issue 2/2021
Authors: Oleksandra Levchenko, Boyan Kolev, Djamel-Edine Yagoubi, Reza Akbarinia, Florent Masseglia, Themis Palpanas, Dennis Shasha, Patrick Valduriez
Important notes

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Abstract

This paper presents parallel solutions, built on two state-of-the-art approaches (iSAX and sketches), for evaluating k nearest neighbor (kNN) queries on large databases of time series. It compares them on several measures of quality and time performance, and offers a tool that uses the characteristics of application data to determine which algorithm to choose for that application and how to set its parameters. Specifically, our experiments show that: (i) iSAX and its derivatives perform best in both time and quality when the time series can be characterized by a few low-frequency Fourier coefficients, a regime where the iSAX pruning approach works well; (ii) iSAX performs significantly less well when high-frequency Fourier coefficients carry much of the energy of the time series; (iii) by contrast, a random projection approach based on sketches is largely independent of the frequency power spectrum. The experiments also show the close relationship between pruning ratio and time for exact iSAX, as well as between pruning ratio and quality for approximate iSAX. Our toolkit analyzes typical time series of an application to determine (i) optimal segment sizes for iSAX and (ii) when to use Parallel Sketches instead of iSAX. Our algorithms have been implemented using Spark, evaluated over a cluster of nodes, and applied to both real and synthetic data. The results apply to any database of numerical sequences, whether or not they relate to time.
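The distinction the abstract draws between series dominated by low-frequency Fourier coefficients and series with substantial high-frequency energy can be illustrated with a small sketch. The code below is not the authors' toolkit: the function names `low_frequency_energy_fraction` and `suggest_method`, the choice of 4 coefficients, and the 0.8 threshold are all hypothetical, chosen only to show how a spectral-energy measurement might inform the choice between iSAX and sketches.

```python
import cmath
import math
import random


def low_frequency_energy_fraction(series, num_coeffs=4):
    """Fraction of non-DC spectral energy held by the first `num_coeffs`
    Fourier coefficients (naive O(n^2) DFT, fine for short examples)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the DC offset

    energies = []
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(centered)
        )
        # For real input, coefficient k mirrors coefficient n - k, so its
        # energy counts twice, except at the Nyquist frequency (even n).
        weight = 1.0 if (n % 2 == 0 and k == n // 2) else 2.0
        energies.append(weight * abs(coeff) ** 2)

    total = sum(energies)
    if total == 0.0:
        return 1.0  # no non-DC energy at all
    return sum(energies[:num_coeffs]) / total


def suggest_method(series, num_coeffs=4, threshold=0.8):
    """Hypothetical rule of thumb: the threshold is an assumption for
    illustration, not a value taken from the paper."""
    fraction = low_frequency_energy_fraction(series, num_coeffs)
    return "iSAX" if fraction >= threshold else "sketches"


# A smooth series (two sine cycles) versus white noise of the same length.
smooth = [math.sin(2 * math.pi * 2 * t / 64) for t in range(64)]
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(64)]

print(suggest_method(smooth), low_frequency_energy_fraction(smooth))
print(suggest_method(noise), low_frequency_energy_fraction(noise))
```

In practice such a measurement would be averaged over a sample of typical series from the application rather than taken from a single series, and the cutoff would need to be calibrated empirically.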
