
2019 | Original Paper | Book Chapter

Multiple Instance Learning with Bag-Level Randomized Trees

Written by: Tomáš Komárek, Petr Somol

Published in: Machine Learning and Knowledge Discovery in Databases

Publisher: Springer International Publishing


Abstract

Knowledge discovery in databases with a flexible structure poses a great challenge to the machine learning community. Multiple Instance Learning (MIL) aims at learning from samples (called bags) represented by multiple feature vectors (called instances), as opposed to the single feature vectors characteristic of the traditional data representation. This relaxation turns out to be useful in formulating many machine learning problems, including classification of molecules, cancer detection from tissue images, and identification of malicious network communications. However, despite recent progress in this area, the current set of MIL tools still seems to be very application-specific and/or burdened with many tuning parameters or processing steps. In this paper, we propose a simple yet effective tree-based algorithm for solving MIL classification problems. Empirical evaluation against 28 classifiers on 29 publicly available benchmark datasets shows a high level of performance of the proposed solution, even with its default parameter settings. Data related to this paper are available at: https://github.com/komartom/MIDatasets.jl. Code related to this paper is available at: https://github.com/komartom/BLRT.jl.
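
To make the bag and instance terminology concrete, the following minimal Python sketch shows one way such data could be held in memory. The `Bag` class, the helper names and the toy values are illustrative assumptions; they are not taken from the paper or from the BLRT.jl and MIDatasets.jl repositories.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Bag:
    """One MIL sample: a variable-length collection of instances and one label."""
    instances: List[List[float]]  # each inner list is one instance (feature vector)
    label: int                    # the label is attached to the bag, not to an instance


def make_toy_dataset() -> List[Bag]:
    # Hypothetical toy data: every instance has 2 features, bag sizes differ.
    return [
        Bag(instances=[[0.1, 0.9], [0.4, 0.2], [0.8, 0.7]], label=1),
        Bag(instances=[[0.3, 0.3]], label=0),
        Bag(instances=[[0.5, 0.1], [0.6, 0.6]], label=0),
    ]


def bag_sizes(dataset: List[Bag]) -> List[int]:
    # Number of instances in each bag.
    return [len(bag.instances) for bag in dataset]


dataset = make_toy_dataset()
sizes = bag_sizes(dataset)
n_bags = len(dataset)      # number of bags (labelled samples)
n_instances = sum(sizes)   # total number of instances across all bags

print(f"bags: {n_bags}, instances: {n_instances}, sizes: {sizes}")
```

In the traditional single-vector representation every bag would have size one, so the number of bags and the number of instances would coincide.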


Footnotes
1
Technically, the value of \(0.\bar{9}\) should be 1 minus the smallest representable value.
 
2
If \(x^{\text {min}}_f\) equals \(x^{\text {max}}_f\), no splitting rules are generated on feature \(f\).
 
3
The term "extremely" corresponds to setting \(T=1\).
 
4
When bags are of size one (i.e. \(N_I=N_\mathcal {B}=N\)) and \(T=1\), the complexity is equivalent to that of Extremely randomized trees, \(\varTheta (MKN \log N)\).
 
5
Area under the ROC curve (AUC); the ROC curve shows the true positive rate as a function of the false positive rate. AUC is agnostic to class imbalance and to the classifier's threshold setting.
 
6
Except for Newsgroup3 where the proposal is competitive with the best prior art.
 
7
The sum in Eq. 2 is not normalized by bag size \(|\mathcal {B}|\) and parameter r can take values from interval \([1,\text {argmax}_{\mathcal {B}\in \mathcal {S}} |\mathcal {B}|)\).
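
The property stated in footnote 5 can be checked numerically. The sketch below is an illustrative assumption, not the authors' evaluation code: it computes AUC through the well-known equivalent rank formulation (the probability that a randomly chosen positive scores higher than a randomly chosen negative, with ties counted as one half) and then verifies that replicating the negative class or monotonically rescaling the scores leaves the value unchanged. The function name `auc` and the toy scores are hypothetical.

```python
from typing import List


def auc(scores: List[float], labels: List[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores and ground-truth labels.
scores = [0.9, 0.8, 0.4, 0.35, 0.1]
labels = [1, 0, 1, 0, 0]
base = auc(scores, labels)

# Class imbalance: replicate every negative sample ten times.
neg_scores = [s for s, y in zip(scores, labels) if y == 0]
imb_scores = scores + neg_scores * 9
imb_labels = labels + [0] * (len(neg_scores) * 9)
imbalanced = auc(imb_scores, imb_labels)

# Threshold setting: a strictly increasing rescaling keeps the ranking intact.
rescaled = auc([10.0 * s - 3.0 for s in scores], labels)

print(base, imbalanced, rescaled)
```

The pairwise loop is quadratic in the number of samples, which is fine for a small check like this one; a sort-based computation would be preferable for large datasets.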
 
References
6. Breiman, L., Friedman, J., Stone, C.J., Olshen, R.A.: Classification and Regression Trees. CRC Press, Boca Raton (1984)
14. Gehler, P.V., Chapelle, O.: Deterministic annealing for multiple-instance learning. In: Artificial Intelligence and Statistics, pp. 123–130 (2007)
15. Geurts, P., Ernst, D., Wehenkel, L.: Extremely randomized trees. Mach. Learn. 63(1), 3–42 (2006)
19. Ruffo, G.: Learning single and multiple instance decision trees for computer security applications. University of Turin, Torino (2000)
22. Tax, D.M.J.: A MATLAB toolbox for multiple-instance learning, version 1.2.2, Faculty EWI, Delft University of Technology, The Netherlands, April 2017
24. Zhang, J., Marszałek, M., Lazebnik, S., Schmid, C.: Local features and kernels for classification of texture and object categories: a comprehensive study. Int. J. Comput. Vision 73(2), 213–238 (2007)
25. Zhang, Q., Goldman, S.A.: EM-DD: an improved multiple-instance learning technique. In: Advances in Neural Information Processing Systems, pp. 1073–1080. MIT Press (2001)
Metadata
Title
Multiple Instance Learning with Bag-Level Randomized Trees
Written by
Tomáš Komárek
Petr Somol
Copyright Year
2019
DOI
https://doi.org/10.1007/978-3-030-10925-7_16
