Published in: Data Mining and Knowledge Discovery 5/2020

21.06.2020

Treant: training evasion-aware decision trees

Authors: Stefano Calzavara, Claudio Lucchese, Gabriele Tolomei, Seyum Assefa Abebe, Salvatore Orlando

Abstract

Despite its success and popularity, machine learning is now recognized as vulnerable to evasion attacks, i.e., carefully crafted perturbations of test inputs designed to force prediction errors. In this paper we focus on evasion attacks against decision tree ensembles, which are among the most successful predictive models for dealing with non-perceptual problems. Even though they are powerful and interpretable, decision tree ensembles have received only limited attention from the security and machine learning communities so far, leading to a sub-optimal state of the art for adversarial learning techniques. We thus propose Treant, a novel decision tree learning algorithm that, on the basis of a formal threat model, minimizes an evasion-aware loss function at each step of the tree construction. Treant is based on two key technical ingredients: robust splitting and attack invariance, which jointly guarantee the soundness of the learning process. Experimental results on publicly available datasets show that Treant is able to generate decision tree ensembles that are at the same time accurate and nearly insensitive to evasion attacks, outperforming state-of-the-art adversarial learning techniques.
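As a toy illustration of the robust-splitting idea (a sketch only, not the authors' actual algorithm: Treant also optimizes the leaf predictions under attack), the following hypothetical code picks a split threshold that minimizes the worst-case squared error when an attacker may perturb the split feature by at most delta; all names and data are illustrative.

```python
def robust_split(xs, ys, delta):
    """Pick the threshold t minimising the worst-case SSE of the split
    x <= t when each feature value may be perturbed by up to +/- delta.
    Simplification: leaf predictions are plain means of the clean
    (unattacked) partitions."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue  # degenerate split, skip
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        loss = 0.0
        for x, y in zip(xs, ys):
            # The attacker can route x to either leaf iff a perturbation
            # of magnitude <= delta moves it across the threshold.
            if (x <= t < x + delta) or (t < x <= t + delta):
                loss += max((y - lmean) ** 2, (y - rmean) ** 2)
            elif x <= t:
                loss += (y - lmean) ** 2
            else:
                loss += (y - rmean) ** 2
        if best is None or loss < best[0]:
            best = (loss, t)
    return best  # (worst-case loss, threshold)

xs = [1, 2, 3, 10, 11, 12]
ys = [0, 0, 0, 1, 1, 1]
print(robust_split(xs, ys, 0.0))  # (0.0, 3): clean optimum
print(robust_split(xs, ys, 0.5))  # (0.75, 10): robust optimum shifts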


Footnotes
1
The name comes from the role playing game “Dungeons & Dragons”, where it identifies giant tree-like creatures.
 
2
For simplicity, we only consider numerical features over \(\mathbb {R}\). However, our framework can be readily generalized to other use cases, e.g., categorical or ordinal features, which we support in our implementation and experiments.
 
3
\(\mathcal {L} (t,\mathcal {D}) = (-2+1)^2+(-1+1)^2+(-1-0)^2+4\cdot (2-2)^2=2\).
 
4
\(\mathcal {L} ^A(t, \mathcal {D}) = (-2+1)^2+(-1+1)^2+(2-0)^2+4\cdot (2-2)^2=5\).
 
5
\(\mathcal {L} ^A(t,\mathcal {D}) = (-2+1.5)^2+(-1+1.5)^2+(0-1.6)^2+4\cdot (2-1.6)^2=3.7\).
 
6
\(\mathcal {L} ^A(t,\mathcal {D}) = (-2+1.5)^2+(-1+1.5)^2+(0-1.5)^2+(2-1)^2+3\cdot (2-2)^2=3.75\).
 
7
\(\mathcal {L} ^A(t,\mathcal {D}) = (-2+1.5)^2+(-1+1.5)^2+(0-1.5)^2+(2-1.5)^2+3\cdot (2-2)^2=3\).
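The loss values in footnotes 3 to 7 can be re-derived mechanically; a quick check, with the residuals copied verbatim from the formulas above and the trailing term expanded by its multiplicity:

```python
# Each footnote's loss is a sum of squared residuals; the final term
# appears with multiplicity 4 or 3 in the formulas above.
residuals = {
    3: [-2 + 1, -1 + 1, -1 - 0] + [2 - 2] * 4,
    4: [-2 + 1, -1 + 1, 2 - 0] + [2 - 2] * 4,
    5: [-2 + 1.5, -1 + 1.5, 0 - 1.6] + [2 - 1.6] * 4,
    6: [-2 + 1.5, -1 + 1.5, 0 - 1.5, 2 - 1] + [2 - 2] * 3,
    7: [-2 + 1.5, -1 + 1.5, 0 - 1.5, 2 - 1.5] + [2 - 2] * 3,
}
for n, rs in residuals.items():
    # footnotes 3..7 yield 2, 5, 3.7, 3.75, 3.0 respectively
    print(n, round(sum(r * r for r in rs), 2))
```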
 
8
The pointwise maximum and the sum of convex functions preserve convexity.
 
9
We use the symbol \(\lessgtr \) to stand for either \(\le \) or \(\ge \) when the distinction is unimportant.
 
15
The wine dataset was originally conceived for a multiclass classification problem; we turned that into a binary one, where the positive class identifies good-quality wines (i.e., those whose quality is at least 6, on a 0–10 scale) and the negative class contains the remaining instances.
 
Metadata
Title
Treant: training evasion-aware decision trees
Authors
Stefano Calzavara
Claudio Lucchese
Gabriele Tolomei
Seyum Assefa Abebe
Salvatore Orlando
Publication date
21.06.2020
Publisher
Springer US
Published in
Data Mining and Knowledge Discovery / Issue 5/2020
Print ISSN: 1384-5810
Electronic ISSN: 1573-756X
DOI
https://doi.org/10.1007/s10618-020-00694-9
