Skip to main content

2017 | OriginalPaper | Buchkapitel

QuickScorer: Efficient Traversal of Large Ensembles of Decision Trees

verfasst von : Claudio Lucchese, Franco Maria Nardini, Salvatore Orlando, Raffaele Perego, Nicola Tonellotto, Rossano Venturini

Erschienen in: Machine Learning and Knowledge Discovery in Databases

Verlag: Springer International Publishing

Aktivieren Sie unsere intelligente Suche, um passende Fachinhalte oder Patente zu finden.

search-config
loading …

Abstract

Machine-learnt models based on additive ensembles of binary regression trees are currently deemed the best solution to address complex classification, regression, and ranking tasks. Evaluating these models is a computationally demanding task as it needs to traverse thousands of trees with hundreds of nodes each. The cost of traversing such large forests of trees significantly impacts their application to big and stream input data, when the time budget available for each prediction is limited to guarantee a given processing throughput. Document ranking in Web search is a typical example of this challenging scenario, where the exploitation of tree-based models to score query-document pairs, and finally rank lists of documents for each incoming query, is the state-of-art method for ranking (a.k.a. Learning-to-Rank). This paper presents QuickScorer, a novel algorithm for the traversal of huge decision trees ensembles that, thanks to a cache- and CPU-aware design, provides a \({\sim } 9 \! \times \) speedup over best competitors.

Sie haben noch keine Lizenz? Dann Informieren Sie sich jetzt über unsere Produkte:

Springer Professional "Wirtschaft+Technik"

Online-Abonnement

Mit Springer Professional "Wirtschaft+Technik" erhalten Sie Zugriff auf:

  • über 102.000 Bücher
  • über 537 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Maschinenbau + Werkstoffe
  • Versicherung + Risiko

Jetzt Wissensvorsprung sichern!

Springer Professional "Technik"

Online-Abonnement

Mit Springer Professional "Technik" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 390 Zeitschriften

aus folgenden Fachgebieten:

  • Automobil + Motoren
  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Elektrotechnik + Elektronik
  • Energie + Nachhaltigkeit
  • Maschinenbau + Werkstoffe




 

Jetzt Wissensvorsprung sichern!

Springer Professional "Wirtschaft"

Online-Abonnement

Mit Springer Professional "Wirtschaft" erhalten Sie Zugriff auf:

  • über 67.000 Bücher
  • über 340 Zeitschriften

aus folgenden Fachgebieten:

  • Bauwesen + Immobilien
  • Business IT + Informatik
  • Finance + Banking
  • Management + Führung
  • Marketing + Vertrieb
  • Versicherung + Risiko




Jetzt Wissensvorsprung sichern!

Literatur
1.
Zurück zum Zitat Asadi, N., Lin, J., de Vries, A.P.: Runtime optimizations for tree-based machine learning models. IEEE TKDE 26(9), 2281–2292 (2014) Asadi, N., Lin, J., de Vries, A.P.: Runtime optimizations for tree-based machine learning models. IEEE TKDE 26(9), 2281–2292 (2014)
2.
Zurück zum Zitat Chen, T., Guestrin, C.: XGBoost: A scalable tree boosting system. In: Proceedings of SIGKDD, pp. 785–794. ACM (2016) Chen, T., Guestrin, C.: XGBoost: A scalable tree boosting system. In: Proceedings of SIGKDD, pp. 785–794. ACM (2016)
3.
Zurück zum Zitat Dato, D., Lucchese, C., Nardini, F.M., Orlando, S., Perego, R., Tonellotto, N., Venturini, R.: Fast ranking with additive ensembles of oblivious and non-oblivious regression trees. ACM TOIS 35(2), 1–31 (2016)CrossRef Dato, D., Lucchese, C., Nardini, F.M., Orlando, S., Perego, R., Tonellotto, N., Venturini, R.: Fast ranking with additive ensembles of oblivious and non-oblivious regression trees. ACM TOIS 35(2), 1–31 (2016)CrossRef
5.
Zurück zum Zitat Gulin, A., Kuralenok, I., Pavlov, D.: Winning The transfer learning track of yahoo!’s learning to rank challenge with YetiRank. In: Yahoo! Learning to Rank Challenge, pp. 63–76 (2011) Gulin, A., Kuralenok, I., Pavlov, D.: Winning The transfer learning track of yahoo!’s learning to rank challenge with YetiRank. In: Yahoo! Learning to Rank Challenge, pp. 63–76 (2011)
6.
Zurück zum Zitat Langley, P., Sage, S.: Oblivious decision trees and abstract cases. In: Working Notes of the AAAI-94 Workshop on Case-Based Reasoning, pp. 113–117. AAAI Press (1994) Langley, P., Sage, S.: Oblivious decision trees and abstract cases. In: Working Notes of the AAAI-94 Workshop on Case-Based Reasoning, pp. 113–117. AAAI Press (1994)
7.
Zurück zum Zitat Lucchese, C., Nardini, F.M., Orlando, S., Perego, R., Tonellotto, N., Venturini, R.: QuickScorer: a fast algorithm to rank documents with additive ensembles of regression trees. In: Proceedings of SIGIR, pp. 73–82. ACM (2015) Lucchese, C., Nardini, F.M., Orlando, S., Perego, R., Tonellotto, N., Venturini, R.: QuickScorer: a fast algorithm to rank documents with additive ensembles of regression trees. In: Proceedings of SIGIR, pp. 73–82. ACM (2015)
8.
Zurück zum Zitat Lucchese, C., Nardini, F.M., Orlando, S., Perego, R., Tonellotto, N., Venturini, R.: Exploiting CPU SIMD extensions to speed-up document scoring with tree ensembles. In: Proceedings of SIGIR, pp. 833–836. ACM (2016) Lucchese, C., Nardini, F.M., Orlando, S., Perego, R., Tonellotto, N., Venturini, R.: Exploiting CPU SIMD extensions to speed-up document scoring with tree ensembles. In: Proceedings of SIGIR, pp. 833–836. ACM (2016)
9.
Zurück zum Zitat Segalovich, I.: Machine learning in search quality at Yandex. Presentation at the Industry Track of SIGIR (2010) Segalovich, I.: Machine learning in search quality at Yandex. Presentation at the Industry Track of SIGIR (2010)
Metadaten
Titel
QuickScorer: Efficient Traversal of Large Ensembles of Decision Trees
verfasst von
Claudio Lucchese
Franco Maria Nardini
Salvatore Orlando
Raffaele Perego
Nicola Tonellotto
Rossano Venturini
Copyright-Jahr
2017
DOI
https://doi.org/10.1007/978-3-319-71273-4_36