Abstract
Machine learning models are prevalent in critical human-related decision making, such as resume filtering and loan applications. Individuals who are refused naturally ask what could change the decision, should they reapply. This question is hard for the model owner to answer: first, the model is typically complex and not easily interpretable; second, models may be updated periodically; and last, the attributes of the individual seeking approval are apt to change over time. While each of these challenges has been extensively studied in isolation, their conjunction has not.
To this end, we propose a novel framework that allows users to devise plans of action for individuals in the presence of machine learning classification, where both the ML model and the individual's properties are expected to change over time. Our technical solution is currently confined to a particular yet important class of models, namely tree-based ensembles (Random Forests, Gradient Boosted Trees). In this setting, it uniquely combines state-of-the-art solutions for single-model interpretation, domain adaptation techniques for predicting future models, and constraint databases to represent and query the space of possible actions. We devise efficient algorithms that leverage these foundations in a novel solution, and experimentally show that they are effective in proposing useful and actionable steps leading to the desired classification.