
Foundational Moral Values for AI Alignment (2311.17017v1)

Published 28 Nov 2023 in cs.CY and cs.AI

Abstract: Solving the AI alignment problem requires having clear, defensible values towards which AI systems can align. Currently, targets for alignment remain underspecified and do not seem to be built from a philosophically robust structure. We begin the discussion of this problem by presenting five core, foundational values, drawn from moral philosophy and built on the requisites for human existence: survival, sustainable intergenerational existence, society, education, and truth. We show that these values not only provide a clearer direction for technical alignment work, but also serve as a framework to highlight threats and opportunities from AI systems to both obtain and sustain these values.

Authors (2)
  1. Betty Li Hou
  2. Brian Patrick Green