AI Value Alignment and Sociology of Morality



Deviatko I.F.

Dr. Sci. (Soc.), Full Professor, HSE University; Chief Researcher, Institute of Sociology FCTAS RAS, Moscow, Russia. deviatko@gmail.com



For citation:

Deviatko I.F. AI Value Alignment and Sociology of Morality. Sotsiologicheskie issledovaniya [Sociological Studies]. 2023. No 9. P. 16-28




Abstract

The article briefly examines popular ideas about the goals and possibilities of human control over artificial intelligence that were developed at earlier stages of the scientific and technological revolution, and substantiates the thesis that these ideas are incomplete because they do not take into account the new asymmetries of control and the technological realities that arose as a result of the “digital revolution”. It analyzes the reasons why the sociology and social psychology of morality are acquiring a decisive role in the development of ethically oriented AI systems, as well as a new large-scale research field, which reconfirms the importance of theoretically grounded empirical study of the normative dimension of social life. Finally, it proposes an additional sociological substantiation and a narrower interpretation of the AI value alignment principles that some authors have put forward as a solution to the problems of the ethical orientation of AI systems.


Keywords
Artificial intelligence; AI value alignment; sociology of morality; plurality of normative systems; justice

References

Быков А.В. Понятие «альтруизм» в социологии: от классических концепций к практическому забвению // Вестник РУДН. Серия: Социология. 2015. № 1. С. 5–18. [Bykov A.V. The concept of “altruism” in sociology: From classical theories to practical oblivion. RUDN Journal of Sociology. 2015. No. 1: 5–18. (In Russ.)]

Девятко И.Ф. О теоретических моделях, объясняющих восприятие справедливости на микро-, мезо- и макроуровнях социальной реальности. Социология: методология, методы, математическое моделирование. 2009. № 29. C. 10–29. [Deviatko I.F. (2009) On Theoretical Models Explaining the Perception of Justice on Micro, Meso and Macro Levels of Social Reality. Sociology: methodology, methods, mathematical modeling. No. 29: 10–29. (In Russ.)]

Девятко И.Ф. Понятие нормы в социологической теории: от классических оснований к новым интерпретациям природы норм и множественности нормативных систем // Нормы и мораль в социологической теории: от классических концепций к новым идеям / Отв. ред.: И.Ф. Девятко, Р.Н. Абрамов, И.В. Катерный. М.: Весь Мир, 2017. С. 10–42. [Deviatko I.F. (2017) Social Norms: From Attempts of Definition towards New Interpretations of Sources of Normative Value and Plurality of Normative Systems. In: Norms and Morals in Sociological Theory: from Classical Interpretations to New Ideas. Ed. by I.F. Deviatko, R.N. Abramov, I.V. Katerny. Moscow: Ves’ Mir. (In Russ.)]

Девятко И.Ф. Социологические теории деятельности и практической рациональности. М.: «Аванти плюс», 2003. [Deviatko I.F. (2003) Sociological Theories of Agency and Practical Rationality. Moscow: Avanti Plus. (In Russ.)]

Калинин Р.Н. Изучение дистрибутивной справедливости в социальных науках: обзор концептуализаций и методологических подходов // Социология: методология, методы, математическое моделирование. 2019. № 49: 7–56. [Kalinin R.N. Distributive Justice Research in Social Sciences: A Review of Conceptualizations and Methodological Approaches. Sociology: methodology, methods, mathematical modeling. 2019. No 49: 7–56. (In Russ.)]

Калинин Р.Г., Девятко И.Ф. Кто заплатит за водопровод: социальный контекст восприятия дистрибутивной справедливости // Мониторинг общественного мнения: Экономические и социальные перемены. 2019. № 2: 95–114. [Kalinin R.G., Deviatko I.F. (2019) Who should pay for a water pipe: social context of distributive justice perception. Monitoring obshchestvennogo mneniya: ekonomicheskie i social’nye peremeny [Monitoring of Public Opinion: Economic and Social Changes]. No. 2: 95–114. (In Russ.)]

Awad E., Dsouza S., Kim R. et al. (2018) The Moral Machine Experiment. Nature. 563: 59–64. DOI: 10.1038/s41586-018-0637-6.

Balwit A., Korinek A. (2022) Aligned with Whom? Direct and Social Goals for AI Systems. CEPR Discussion Paper No. DP17298. Available at SSRN: https://ssrn.com/abstract=4121483 (accessed 21.05.2023).

Bostrom N. Superintelligence: Paths, Dangers, Strategies. Oxford University Press, 2014.

Boudon R., Betton E. (1999) Explaining the Feelings of Justice. Ethical Theory and Moral Practice. Vol. 2: 365–398.

Carroll L. (1895) What the Tortoise Said to Achilles. Mind. 1895 (April). IV (14): 278–280. DOI: 10.1093/mind/IV.14.278.

Curry O.S., Mullins D.A., Whitehouse H. (2019) Is It Good to Cooperate?: Testing the Theory of Morality-as-Cooperation in 60 Societies. Current Anthropology. 60(1): 47–69. DOI: 10.1086/701478.

Deviatko I.F., Gavrilov K.A. (2020) Causality and Blame Judgments of Negative Side Effects of Actions May Differ for Different Institutional Domains. SAGE Open. October 2020. DOI:10.1177/2158244020970942.

Foa E.B., Foa U.G. Resource Theory of Social Exchange. In: Handbook of Social Resource Theory: Theoretical Extensions, Empirical Insights, and Social Applications. Critical Issues in Social Justice. Ed. by K. Törnblom, A. Kazemi. New York, NY: Springer New York, 2012: 15–32.

Franklin M., Awad E., Lagnado D. (2021) Blaming Automated Vehicles in Difficult Situations. iScience. 24(4). 102252.

Gabriel I. (2020) Artificial Intelligence, Values, and Alignment. Minds & Machines. 30: 411–437. DOI: 10.1007/s11023-020-09539-2.

Gray K., MacCormack J.K., Henry T., Banks E., Schein C., Armstrong-Carter E., Abrams S., Muscatell K.A. (2022) The Affective Harm Account (AHA) of Moral Judgment: Reconciling Cognition and Affect, Dyadic Morality and Disgust, Harm and Purity. Journal of Personality and Social Psychology. 123(6): 1199–1222. DOI: 10.1037/pspa0000310.

Greene J.D., Sommerville R.B., Nystrom L.E., Darley J.M., Cohen J.D. (2001) An fMRI Investigation of Emotional Engagement in Moral Judgment. Science. 293: 2105–2107. DOI: 10.1126/science.1062872.

Guglielmo S. (2015) Moral Judgment as Information Processing: An Integrative Review. Frontiers in Psychology. 6(1637). DOI: 10.3389/fpsyg.2015.01637.

Haidt J. (2012) The Righteous Mind: Why Good People Are Divided by Politics and Religion. Pantheon.

Kahane G. (2015) Sidetracked by Trolleys: Why Sacrificial Moral Dilemmas Tell Us Little (or Nothing) about Utilitarian Judgment. Social Neuroscience. 10(5): 551–560. DOI: 10.1080/17470919.2015.1023400.

Keiper A., Schulman A.N. (2011) The Problem with ‘Friendly’ Artificial Intelligence. The New Atlantis. No. 32: 80–89. URL: https://www.thenewatlantis.com/publications/the-problem-with-friendly-artificial-intelligence (accessed 02.06.2023).

Konow J. (2003) Which Is the Fairest One of All? A Positive Analysis of Justice Theories. Journal of Economic Literature. Vol. XLI. (December): 1188–1239.

Levy M.G. (2023) Chatbots Don’t Know What Stuff Isn’t. Quanta. May 12. 2023. URL: https://www.quantamagazine.org/ai-like-chatgpt-are-no-good-at-not-20230512/ (accessed 21.05.2023).

Mitchell S., Potash E., Barocas S., D’Amour A., Lum K. (2021) Algorithmic Fairness: Choices, Assumptions, and Definitions. Annual Review of Statistics and Its Application. 8(1): 141–163.

Phillips E., Zhao X., Ullman D., Malle B.F. (2018) What is Human-like? Decomposing Robots’ Human-like Appearance Using the Anthropomorphic roBOT (ABOT) Database. In: Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction (HRI ‘18). Association for Computing Machinery. New York, NY, USA: 105–113. DOI: 10.1145/3171221.3171268.

Rawls J. (1971) A Theory of Justice. Cambridge, MA: Harvard University Press.

Wang X., Zhang Y., Zhu R. (2022) A Brief Review on Algorithmic Fairness. Management System Engineering. 1 (7). URL: https://link.springer.com/article/10.1007/s44176-022-00006-z (accessed 21.05.2023). DOI: 10.1007/s44176-022-00006-z.

Weidinger L., McKee K.R., Everett R. et al. (2023) Using the Veil of Ignorance to Align AI Systems with Principles of Justice. Proc. Natl. Acad. Sci. U.S.A., 120(18) (accessed 21.05.2023). DOI: 10.1073/pnas.2213709120.

Weidinger L., Uesato J., Rauh M., Griffin C. et al. (2022) Taxonomy of Risks Posed by Language Models. In 2022 ACM Conference on Fairness, Accountability, and Transparency (FAccT ‘22). Association for Computing Machinery, New York, NY, USA: 214–229. DOI: 10.1145/3531146.3533088.

Wiener N. (1960) Some Moral and Technical Consequences of Automation. Science. May 6. 131(3410): 1355–1358. DOI: 10.1126/science.131.3410.1355.

Yaari M., Bar-Hillel M. (1984) On Dividing Justly. Social Choice and Welfare. Vol. 1 (1): 1–24.

Yudkowsky E. (2008) Artificial Intelligence as a Positive and Negative Factor in Global Risk. In: Bostrom N., Ćirković M. (eds) Global Catastrophic Risks. Oxford University Press: 308–345.

 
