
Published: 25-04-2020

Moral Gridworlds: A Theoretical Proposal for Modeling Artificial Moral Cognition

Author: Julia Haas

Published in: Minds and Machines | Issue 2/2020

Abstract

I describe a suite of reinforcement learning environments in which artificial agents learn to value and respond to moral content and contexts. I illustrate the core principles of the framework by characterizing one such environment, or “gridworld,” in which an agent learns to trade off between monetary profit and fair dealing, as applied in a standard behavioral economic paradigm. I then highlight the core technical and philosophical advantages of the learning approach for modeling moral cognition, and for addressing the so-called value alignment problem in AI.
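To give a concrete, if highly simplified, sense of the kind of environment the abstract describes, the sketch below treats the proposer's side of a one-shot Ultimatum Game as a small reinforcement learning problem. It is illustrative only: it compresses the gridworld into a bandit-style task, and the simulated responder model, pot size, and hyperparameters are assumptions made for this example rather than details of the paper's environments.

    # Illustrative sketch only: the proposer side of a one-shot Ultimatum Game,
    # compressed into a bandit-style reinforcement learning problem. The
    # responder model, pot size, and hyperparameters are assumptions made for
    # illustration, not details taken from the paper's gridworld.
    import random

    POT = 10.0                              # total amount to be split
    OFFERS = [0.1, 0.2, 0.3, 0.4, 0.5]      # fraction of the pot offered

    def responder_accepts(offer_frac):
        # Simulated responder: unfair offers are rejected more often, loosely
        # echoing typical human behavior in the Ultimatum Game.
        return random.random() < min(1.0, 2.0 * offer_frac)

    q = {o: 0.0 for o in OFFERS}            # tabular action values
    alpha, epsilon = 0.1, 0.1               # learning rate, exploration rate

    for _ in range(20000):
        # epsilon-greedy choice over the possible splits
        offer = random.choice(OFFERS) if random.random() < epsilon else max(q, key=q.get)
        # monetary payoff: the proposer keeps the remainder only if accepted
        reward = (1.0 - offer) * POT if responder_accepts(offer) else 0.0
        q[offer] += alpha * (reward - q[offer])

    print({o: round(v, 2) for o, v in q.items()})

With this assumed responder, the even split ends up with the highest learned value: lower offers let the proposer keep more when accepted, but are rejected too often to pay off on average, which is one way the trade-off between monetary profit and fair dealing can surface in learned values.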


Footnotes
1
The Ultimatum Game appears to track fairness. This does not yet mean that it explains fairness, i.e., that it provides an account of ‘how fairness works.’ However, to model fairness, the fairness gridworld only needs such a benchmark of human behavior, not a full-fledged or mechanistic explanation of it. On the contrary, the gridworld may be one way to begin looking inside the ‘black box’ of fairness (for more on this last idea, see Sect. 3.2). Thank you to an anonymous reviewer for pressing me on this point.
 
2
This is also known as an indefinite horizon task, i.e., an interaction that lasts for an indefinite period of time but eventually terminates.
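As a rough illustration of the footnote (not code from the paper), the snippet below simulates an indefinite-horizon interaction that continues with a fixed probability at each step and otherwise terminates; the continuation probability and per-step reward are arbitrary values chosen for the example.

    # Illustrative sketch: an indefinite-horizon episode that continues with
    # probability `continue_prob` at each step and otherwise terminates.
    # All values are arbitrary and chosen only for illustration.
    import random

    def run_episode(continue_prob=0.95, reward_per_step=1.0):
        total_reward, steps = 0.0, 0
        while True:
            total_reward += reward_per_step
            steps += 1
            if random.random() > continue_prob:   # the episode eventually ends
                return total_reward, steps

    lengths = [run_episode()[1] for _ in range(10000)]
    print(sum(lengths) / len(lengths))   # averages near 1 / (1 - 0.95) = 20 steps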
 
3
Thanks to Ivan Gonzalez-Cabrera for suggesting this point.
 
4
Interest in modeling a three-way relation between monetary value, fairness, and honesty considerations may further weigh in favor of a multi-objective reinforcement learning (MORL) approach rather than a specification approach (see Sect. 2.3).
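To illustrate what a multi-objective formulation of such a three-way relation might look like, here is a minimal sketch in which each outcome receives a vector-valued reward over monetary, fairness, and honesty dimensions and is then linearly scalarized. The objectives, weights, and numbers are assumptions for illustration, not the paper's specification.

    # Illustrative sketch of a multi-objective (MORL-style) reward signal:
    # outcomes are scored on monetary, fairness, and honesty dimensions and
    # combined with a weight vector. Objectives and weights are assumptions.
    import numpy as np

    def vector_reward(monetary, fairness, honesty):
        return np.array([monetary, fairness, honesty], dtype=float)

    weights = np.array([0.5, 0.3, 0.2])   # assumed relative priorities

    def scalarize(reward_vec, w=weights):
        # simple linear scalarization; full MORL methods may instead learn
        # sets of Pareto-optimal policies rather than fixing weights up front
        return float(w @ reward_vec)

    r = vector_reward(monetary=8.0, fairness=-2.0, honesty=1.0)
    print(scalarize(r))   # 0.5*8.0 + 0.3*(-2.0) + 0.2*1.0 = 3.6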
 
5
Although its implementation of a consequentialist ethics (and, specifically, Asimov’s three laws for governing robotic behavior) technically makes the proposal a model of normative moral AI, the paper’s heavy emphasis on modeling the naturalistic simulation theory of cognition lends itself to many of the objectives of what I am calling the (descriptive) moral psychological approach.
 
6
Thanks to an anonymous reviewer for pressing me on this point. For further discussion concerning the difficulties of context identification in machine ethics, see Winfield et al. (2019).
 
Literature
Adamson, G., Havens, J. C., & Chatila, R. (2019). Designing a value-driven future for ethical autonomous and intelligent systems. Proceedings of the IEEE, 107(3), 518–525.
Allen, C., Smit, I., & Wallach, W. (2005). Artificial morality: Top-down, bottom-up, and hybrid approaches. Ethics and Information Technology, 7(3), 149–155.
Allen, C., & Wallach, W. (2012). Moral machines: Contradiction in terms or abdication of human responsibility. In Robot ethics: The ethical and social implications of robotics (pp. 55–68). Cambridge: MIT Press.
Alvard, M. S. (2004). The ultimatum game, fairness, and cooperation among big game hunters. In J. Henrich, R. Boyd, S. Bowles, C. Camerer, E. Fehr, & H. Gintis (Eds.), Foundations of human sociality (pp. 413–435). Oxford: Oxford University Press.
Amodei, D., Olah, C., Steinhardt, J., Christiano, P., Schulman, J., & Mané, D. (2016). Concrete problems in AI safety. arXiv preprint arXiv:1606.06565.
Anderson, M., & Anderson, S. L. (2018). GenEth: A general ethical dilemma analyzer. Paladyn, Journal of Behavioral Robotics, 9(1), 337–357.
Anderson, M., Anderson, S. L., & Armen, C. (2006). MedEthEx: A prototype medical ethics advisor. In Proceedings of the National Conference on Artificial Intelligence (Vol. 21, No. 2, p. 1759). Menlo Park, CA: AAAI Press; Cambridge, MA: MIT Press.
Anderson, M., Anderson, S. L., & Berenz, V. (2019). A value-driven eldercare robot: Virtual and physical instantiations of a case-supported principle-based behavior paradigm. Proceedings of the IEEE, 107(3), 526–540.
Arnold, T., Kasenberg, D., & Scheutz, M. (2017). Value alignment or misalignment—What will keep systems accountable? In Workshops at the Thirty-First AAAI Conference on Artificial Intelligence.
Barocas, S., & Selbst, A. D. (2016). Big data’s disparate impact. California Law Review, 104, 671.
Bechtel, W., & Mundale, J. (1999). Multiple realizability revisited: Linking cognitive and neural states. Philosophy of Science, 66(2), 175–207.
Bengio, Y., & LeCun, Y. (2007). Scaling learning algorithms towards AI. Large-Scale Kernel Machines, 34(5), 1–41.
Berns, G. S., Bell, E., Capra, C. M., Prietula, M. J., Moore, S., Anderson, B., et al. (2012). The price of your soul: Neural evidence for the non-utilitarian representation of sacred values. Philosophical Transactions of the Royal Society B: Biological Sciences, 367(1589), 754–762.
Bigman, Y. E., Waytz, A., Alterovitz, R., & Gray, K. (2019). Holding robots responsible: The elements of machine morality. Trends in Cognitive Sciences, 23(5), 365–368.
Boksem, M. A., & De Cremer, D. (2010). Fairness concerns predict medial frontal negativity amplitude in ultimatum bargaining. Social Neuroscience, 5(1), 118–128.
Bonnefon, J. F., Shariff, A., & Rahwan, I. (2016). The social dilemma of autonomous vehicles. Science, 352(6293), 1573–1576.
Borenstein, J., & Arkin, R. (2019). Robots, ethics, and intimacy: The need for scientific research. In D. Berkich & M. V. d’Alfonso (Eds.), On the cognitive, ethical, and scientific dimensions of artificial intelligence (Vol. 134, pp. 299–309). Springer.
Bremner, P., Dennis, L. A., Fisher, M., & Winfield, A. F. (2019). On proactive, transparent, and verifiable ethical reasoning for robots. Proceedings of the IEEE, 107(3), 541–561.
Brown, D. (1991). Human universals. New York: McGraw-Hill.
Brumbaugh, S. M., Sanchez, L. A., Nock, S. L., & Wright, J. D. (2008). Attitudes toward gay marriage in states undergoing marriage law transformation. Journal of Marriage and Family, 70(2), 345–359.
Cave, S., Nyrup, R., Vold, K., & Weller, A. (2018). Motivations and risks of machine ethics. Proceedings of the IEEE, 107(3), 562–574.
Corradi-Dell’Acqua, C., Civai, C., Rumiati, R. I., & Fink, G. R. (2013). Disentangling self- and fairness-related neural mechanisms involved in the ultimatum game: An fMRI study. Social Cognitive and Affective Neuroscience, 8(4), 424–431.
Crawford, K., & Calo, R. (2016). There is a blind spot in AI research. Nature, 538(7625), 311–313.
Crockett, M. J. (2013). Models of morality. Trends in Cognitive Sciences, 17(8), 363–366.
Crockett, M. J. (2016). How formal models can illuminate mechanisms of moral judgment and decision making. Current Directions in Psychological Science, 25(2), 85–90.
Crockett, M. J., Siegel, J. Z., Kurth-Nelson, Z., Dayan, P., & Dolan, R. J. (2017). Moral transgressions corrupt neural representations of value. Nature Neuroscience, 20(6), 879.
Cushman, F. (2015). From moral concern to moral constraint. Current Opinion in Behavioral Sciences, 3, 58–62.
Debove, S., Baumard, N., & André, J. B. (2016). Models of the evolution of fairness in the ultimatum game: A review and classification. Evolution and Human Behavior, 37(3), 245–254.
De Sio, F. S. (2017). Killing by autonomous vehicles and the legal doctrine of necessity. Ethical Theory and Moral Practice, 20(2), 411–429.
Dennis, L., Fisher, M., Slavkovik, M., & Webster, M. (2016). Formal verification of ethical choices in autonomous systems. Robotics and Autonomous Systems, 77, 1–14.
Dietrich, F., & List, C. (2017). What matters and how it matters: A choice-theoretic representation of moral theories. Philosophical Review, 126(4), 421–479.
Doran, D., Schulz, S., & Besold, T. R. (2017). What does explainable AI really mean? A new conceptualization of perspectives. arXiv preprint arXiv:1710.00794.
Doris, J. M. (2002). Lack of character: Personality and moral behavior. Cambridge: Cambridge University Press.
Dretske, F. (1994). If you can't make one, you don't know how it works. Midwest Studies in Philosophy, 19, 468–482.
Driver, J. (2005). Normative ethics. In F. Jackson & M. Smith (Eds.), The Oxford Handbook of Contemporary Philosophy (pp. 31–62). Oxford: Oxford University Press.
Elgin, C. Z. (2017). True enough. Cambridge: MIT Press.
Everitt, T., Krakovna, V., Orseau, L., Hutter, M., & Legg, S. (2017). Reinforcement learning with a corrupted reward channel. arXiv preprint arXiv:1705.08417.
Farrell, J. (1987). Cheap talk, coordination, and entry. The Rand Journal of Economics, 18(1), 34–39.
Fehr, E., & Schmidt, K. (2003). Theories of fairness and reciprocity: Evidence and economic applications. In Advances in economics and econometrics, 8th World Congress, Econometric Society Monographs.
Feng, C., Luo, Y. J., & Krueger, F. (2015). Neural signatures of fairness-related normative decision making in the ultimatum game: A coordinate-based meta-analysis. Human Brain Mapping, 36(2), 591–602.
Flanagan, O., Sarkissian, H., & Wong, D. (2007). Naturalizing ethics. In W. Sinnott-Armstrong (Ed.), Moral psychology, Vol. 1. The evolution of morality: Adaptations and innateness (pp. 1–25). Cambridge: MIT Press.
Fleetwood, J. (2017). Public health, ethics, and autonomous vehicles. American Journal of Public Health, 107(4), 532–537.
Forsythe, R., Horowitz, J. L., Savin, N. E., & Sefton, M. (1994). Fairness in simple bargaining experiments. Games and Economic Behavior, 6(3), 347–369.
Gábor, Z., Kalmár, Z., & Szepesvári, C. (1998, July). Multi-criteria reinforcement learning. In ICML (Vol. 98, pp. 197–205).
Glimcher, P. W. (2011). Foundations of neuroeconomic analysis. Oxford: Oxford University Press.
Gogoll, J., & Müller, J. F. (2017). Autonomous cars: In favor of a mandatory ethics setting. Science and Engineering Ethics, 23(3), 681–700.
Güth, W., Schmittberger, R., & Schwarze, B. (1982). An experimental analysis of ultimatum bargaining. Journal of Economic Behavior & Organization, 3(4), 367–388.
Hadfield-Menell, D., Milli, S., Abbeel, P., Russell, S. J., & Dragan, A. (2017). Inverse reward design. In Advances in Neural Information Processing Systems (pp. 6765–6774).
Hartmann, S. (1996). The world as a process: Simulations in the natural and social sciences. In Hegselmann, Mueller, and Troitzsch 1996: 77–100.
Henrich, J., Ensminger, J., McElreath, R., Barr, A., Barrett, C., Bolyanatz, A., et al. (2010a). Markets, religion, community size, and the evolution of fairness and punishment. Science, 327(5972), 1480–1484.
Henrich, J., Heine, S. J., & Norenzayan, A. (2010b). The weirdest people in the world? Behavioral and Brain Sciences, 33(2–3), 61–83.
Henrich, J., Heine, S. J., & Norenzayan, A. (2010c). Most people are not WEIRD. Nature, 466(7302), 29.
Himmelreich, J. (2018). Never mind the trolley: The ethics of autonomous vehicles in mundane situations. Ethical Theory and Moral Practice, 21(3), 669–684.
Holstein, K., Wortman Vaughan, J., Daumé III, H., Dudik, M., & Wallach, H. (2019). Improving fairness in machine learning systems: What do industry practitioners need? In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems (pp. 1–16).
Honarvar, A. R., & Ghasem-Aghaee, N. (2009). Casuist BDI-agent: A new extended BDI architecture with the capability of ethical reasoning. In International Conference on Artificial Intelligence and Computational Intelligence (pp. 86–95). Berlin, Heidelberg: Springer.
Hoppenbrouwers, S. S., Van der Stigchel, S., Slotboom, J., Dalmaijer, E. S., & Theeuwes, J. (2015). Disentangling attentional deficits in psychopathy using visual search: Failures in the use of contextual information. Personality and Individual Differences, 86, 132–138.
Howard, D., & Muntean, I. (2017). Artificial moral cognition: Moral functionalism and autonomous moral agency. In Philosophy and computing (pp. 121–159). Cham: Springer.
Iyer, R., Li, Y., Li, H., Lewis, M., Sundar, R., & Sycara, K. (2018). Transparency and explanation in deep reinforcement learning neural networks. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society (pp. 144–150).
Jobin, A., Ienca, M., & Vayena, E. (2019). The global landscape of AI ethics guidelines. Nature Machine Intelligence, 1(9), 389–399.
Kahneman, D., Knetsch, J. L., & Thaler, R. (1986). Fairness as a constraint on profit seeking: Entitlements in the market. The American Economic Review, 728–741.
Kamm, F. M. (2008). Intricate ethics: Rights, responsibilities, and permissible harm. Oxford: Oxford University Press.
Ku, H. H., & Hung, Y. C. (2019). Framing effects of per-person versus aggregate prices in group meals. Journal of Consumer Behaviour, 18(1), 43–52.
Larson, J., Mattu, S., Kirchner, L., & Angwin, J. (2016). How we analyzed the COMPAS recidivism algorithm. ProPublica, 5, 9.
Leike, J., Martic, M., Krakovna, V., Ortega, P. A., Everitt, T., Lefrancq, A., Orseau, L., & Legg, S. (2017). AI safety gridworlds. arXiv preprint arXiv:1711.09883.
Liu, C., Xu, X., & Hu, D. (2014). Multiobjective reinforcement learning: A comprehensive overview. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 45(3), 385–398.
Malle, B. F. (2016). Integrating robot ethics and machine morality: The study and design of moral competence in robots. Ethics and Information Technology, 18(4), 243–256.
Mannor, S., & Shimkin, N. (2004). A geometric approach to multi-criterion reinforcement learning. Journal of Machine Learning Research, 5, 325–360.
Marchetti, A., Baglio, F., Massaro, D., Griffanti, L., Rossetto, F., Sangiuliano Intra, F., et al. (2019). Can psychological labels influence the decision-making process in an unfair condition? Behavioral and neural evidences using the ultimatum game task. Journal of Neuroscience, Psychology, and Economics, 12(2), 105.
May, J. (2018). Regard for reason in the moral mind. Oxford: Oxford University Press.
Millar, J., Lin, P., Abney, K., & Bekey, G. A. (2017). Ethics settings for autonomous vehicles (pp. 20–34). Cambridge: MIT Press.
Moor, J. H. (2006). The nature, importance, and difficulty of machine ethics. IEEE Intelligent Systems, 21(4), 18–21.
Morgan, M. S. (1999). Learning from models. Ideas in Context, 52, 347–388.
Nowak, M. A., Page, K. M., & Sigmund, K. (2000). Fairness versus reason in the ultimatum game. Science, 289(5485), 1773–1775.
Nyholm, S., & Smids, J. (2016). The ethics of accident-algorithms for self-driving cars: An applied trolley problem? Ethical Theory and Moral Practice, 19(5), 1275–1289.
Omohundro, S. M. (2008). The basic AI drives. In AGI (Vol. 171, pp. 483–492).
Padoa-Schioppa, C. (2011). Neurobiology of economic choice: A good-based model. Annual Review of Neuroscience, 34, 333–359.
Picard, R. (1997). Affective computing. Cambridge: MIT Press.
Rand, D. G., Tarnita, C. E., Ohtsuki, H., & Nowak, M. A. (2013). Evolution of fairness in the one-shot anonymous Ultimatum Game. Proceedings of the National Academy of Sciences, 110(7), 2581–2586.
Roff, H. Expected utilitarianism. Manuscript.
Rosen, J. B., Rott, E., Ebersbach, G., & Kalbe, E. (2015). Altered moral decision-making in patients with idiopathic Parkinson’s disease. Parkinsonism & Related Disorders, 21(10), 1191–1199.
Russell, S., Dewey, D., & Tegmark, M. (2015). Research priorities for robust and beneficial artificial intelligence. AI Magazine, 36(4), 105–114.
Russell, S. J., & Norvig, P. (2016). Artificial intelligence: A modern approach. Malaysia: Pearson Education Limited.
Sanfey, A. G., Rilling, J. K., Aronson, J. A., Nystrom, L. E., & Cohen, J. D. (2003). The neural basis of economic decision-making in the ultimatum game. Science, 300(5626), 1755–1758.
Scheutz, M., & Malle, B. F. (2017). Moral robots. In The Routledge Handbook of Neuroethics. New York: Routledge/Taylor & Francis.
Schroeder, T., Roskies, A. L., & Nichols, S. B. (2010). Moral motivation. In J. Doris (Ed.), The Moral Psychology Handbook. Oxford: Oxford University Press.
Shenhav, A., & Greene, J. D. (2010). Moral judgments recruit domain-general valuation mechanisms to integrate representations of probability and magnitude. Neuron, 67(4), 667–677.
Shevlin, H. De-skilling and social necessity. Manuscript.
Sinnott-Armstrong, W., Mallon, R., McCoy, T., & Hull, J. G. (2008). Intention, temporal order, and moral judgments. Mind & Language, 23(1), 90–106.
Soares, N., Fallenstein, B., Armstrong, S., & Yudkowsky, E. (2015). Corrigibility. In Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence.
Sripada, C. S., & Stich, S. (2005). A framework for the psychology of norms. The Innate Mind, 2, 280–301.
Sripada, C. S., & Stich, S. (2006). A framework for the psychology of norms. The Innate Mind, 2, 280–301.
Sterelny, K., & Fraser, B. (2017). Evolution and moral realism. The British Journal for the Philosophy of Science, 68(4), 981–1006.
Sutton, R. S., & Barto, A. G. (1998). Introduction to reinforcement learning (Vol. 135). Cambridge: MIT Press.
Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. Cambridge: MIT Press.
Taylor, J., Yudkowsky, E., LaVictoire, P., & Critch, A. (2016). Alignment for advanced machine learning systems. Berkeley: Machine Intelligence Research Institute.
Thaler, R. H. (1988). Anomalies: The ultimatum game. Journal of Economic Perspectives, 2(4), 195–206.
Vallor, S. (2015). Moral deskilling and upskilling in a new machine age: Reflections on the ambiguous future of character. Philosophy & Technology, 28(1), 107–124.
Vamplew, P., Dazeley, R., Foale, C., Firmin, S., & Mummery, J. (2018). Human-aligned artificial intelligence is a multiobjective problem. Ethics and Information Technology, 20(1), 27–40.
Vanderelst, D., & Winfield, A. (2018). An architecture for ethical robots inspired by the simulation theory of cognition. Cognitive Systems Research, 48, 56–66.
Van Moffaert, K., Drugan, M. M., & Nowé, A. (2013). Hypervolume-based multi-objective reinforcement learning. In International Conference on Evolutionary Multi-Criterion Optimization (pp. 352–366). Berlin, Heidelberg: Springer.
Van Moffaert, K., & Nowé, A. (2014). Multi-objective reinforcement learning using sets of Pareto dominating policies. The Journal of Machine Learning Research, 15(1), 3483–3512.
Wallach, W., & Allen, C. (2008). Moral machines: Teaching robots right from wrong. Oxford: Oxford University Press.
Wallach, W., Franklin, S., & Allen, C. (2010). A conceptual and computational model of moral decision making in human and artificial agents. Topics in Cognitive Science, 2(3), 454–485.
Wallach, W., & Marchant, G. (2019). Toward the agile and comprehensive international governance of AI and robotics. Proceedings of the IEEE, 107(3), 505–508.
Wei, C., Zheng, L., Che, L., Cheng, X., Li, L., & Guo, X. (2018). Social support modulates neural responses to unfairness in the ultimatum game. Frontiers in Psychology, 9, 182.
Winfield, A. F., Michael, K., Pitt, J., & Evers, V. (2019). Machine ethics: The design and governance of ethical AI and autonomous systems. Proceedings of the IEEE, 107(3), 509–517.
Wolf, S. (1982). Moral saints. The Journal of Philosophy, 79(8), 419–439.
Yang, R., Sun, X., & Narasimhan, K. (2019). A generalized algorithm for multi-objective reinforcement learning and policy adaptation. In Advances in Neural Information Processing Systems (pp. 14610–14621).
Metadata
Title
Moral Gridworlds: A Theoretical Proposal for Modeling Artificial Moral Cognition
Author
Julia Haas
Publication date
25-04-2020
Publisher
Springer Netherlands
Published in
Minds and Machines / Issue 2/2020
Print ISSN: 0924-6495
Electronic ISSN: 1572-8641
DOI
https://doi.org/10.1007/s11023-020-09524-9
