ABSTRACT
To design and develop AI-based systems that users and the larger public can justifiably trust, one needs to understand how machine learning technologies impact trust. To guide the design and implementation of trusted AI-based systems, this paper provides a systematic approach that relates considerations about trust from the social sciences to trustworthiness technologies proposed for AI-based services and products. We start from the ABI+ (Ability, Benevolence, Integrity, Predictability) framework, augmented with a recently proposed mapping of ABI+ onto qualities of technologies that support trust. We consider four categories of trustworthiness technologies for machine learning, namely those for Fairness, Explainability, Auditability and Safety (FEAS), and discuss whether and how they support the required qualities. Moreover, trust can be impacted throughout the life cycle of AI-based systems, and we therefore introduce the concept of a Chain of Trust to discuss trustworthiness technologies at all stages of the life cycle. In so doing we establish the ways in which machine learning technologies support trusted AI-based systems. Finally, since FEAS has obvious relations to known frameworks, we relate FEAS to a variety of international 'principled AI' policy and technology frameworks that have emerged in recent years.
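To make the Fairness category of FEAS concrete, the sketch below (our own illustration, not code from the paper) computes one widely used fairness measure, the demographic parity ratio, which compares the rate of positive model decisions across two groups. The function name and the toy data are hypothetical; demographic parity is only one of many competing fairness definitions discussed in the fairness literature.

```python
# Illustrative sketch of a "Fairness" trustworthiness check (assumed example,
# not the paper's method): the demographic parity ratio compares the
# positive-decision rate of one group against another. A ratio near 1.0
# suggests the classifier treats the two groups similarly in outcome rates.

def demographic_parity_ratio(outcomes, groups, group_a, group_b):
    """Ratio of positive-outcome rates for group_a relative to group_b.

    outcomes: iterable of 0/1 model decisions
    groups:   iterable of group labels, aligned with outcomes
    """
    def positive_rate(g):
        # Collect decisions for members of group g and average them.
        members = [o for o, grp in zip(outcomes, groups) if grp == g]
        if not members:
            raise ValueError(f"no members of group {g!r}")
        return sum(members) / len(members)

    return positive_rate(group_a) / positive_rate(group_b)

# Toy data: group "a" receives positive decisions 3 times as often as "b".
decisions  = [1, 0, 1, 1, 0, 1, 0, 0]
membership = ["a", "a", "a", "a", "b", "b", "b", "b"]
print(demographic_parity_ratio(decisions, membership, "a", "b"))  # prints 3.0
```

Auditing tools of this kind operate on model outputs alone; the Explainability and Auditability categories instead examine the model's internals and the provenance of its training pipeline.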
Index Terms
- The relationship between trust in AI and trustworthy machine learning technologies