Ensuring Safe AI via a Moral Emotion Motivational & Control System
Mark R. Waser
Digital Wisdom Institute
Mark.Waser@Wisdom.Digital
Some Preliminary Priming
- This does not require consciousness or qualia
- This does not require an AI that is an autopoietic entity
- This can (and should) be applied NOW
AI is a Wicked Social Problem

The most critical next step in our pursuit of AI is to agree on an ethical & empathic framework for its design
-- Satya Nadella, 2017

If we use, to achieve our purposes, a mechanical agency with whose operation we cannot efficiently interfere . . . we had better be quite sure that the purpose put into the machine is the purpose we really desire.
-- Norbert Wiener, 1960
In Search of a Solution

We need something like a Manhattan Project on the topic of artificial intelligence, not to build it, because I think we will inevitably do that, but to understand how to avoid an arms race & to build it in a way that is aligned with our interests
-- Sam Harris, "Can we build AI without losing control over it?"

… pretty much everyone agreed that they had no idea of how to define morality, or to select the right one for an AI
-- Peter Voss, "AI Safety Research: A Road to Nowhere"
Agenda
- Decide what we want (business requirements)
  - Safety
  - A better life for humanity
- Decide how it will work (functional specification; see the sketch below)
  - Bottom-up control via moral emotions
  - Top-down control by understanding morality and the meaning of life
  - NOT enslaved
- Bonus: we get to better understand & promote morality and the meaning of life
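The agenda names two control layers but no mechanism, so here is a toy sketch of how they could compose: fast bottom-up moral-emotion signals priced directly into action selection, plus a slower top-down normative veto. All names, weights, and thresholds below are hypothetical illustrations, not the talk's actual design.

```python
from dataclasses import dataclass

@dataclass
class Action:
    name: str
    task_utility: float      # how well the action serves the current goal
    expected_harm: float     # harm to others, in the same utility units

def empathy_penalty(action: Action, weight: float = 2.0) -> float:
    """Bottom-up signal: harm to others is felt as a direct cost (weight is a guess)."""
    return weight * action.expected_harm

def violates_norms(action: Action) -> bool:
    """Top-down check: deliberative rules that trump net utility (hypothetical limit)."""
    return action.expected_harm > 1.0

def choose(actions: list[Action]) -> Action:
    """Veto norm-violating options, then maximize emotion-modulated utility."""
    permitted = [a for a in actions if not violates_norms(a)]
    return max(permitted, key=lambda a: a.task_utility - empathy_penalty(a))

options = [Action('seize resources', task_utility=10.0, expected_harm=5.0),
           Action('negotiate', task_utility=6.0, expected_harm=0.2),
           Action('do nothing', task_utility=0.0, expected_harm=0.0)]
print(choose(options).name)   # 'negotiate' -- the raw-utility winner is vetoed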
Values Alignment (aka Agreeing on the Meaning of Life)

the convergent instrumental goal of acquiring resources poses a threat to humanity, for it means that a super-intelligent machine with almost any final goal (say, of solving the Riemann hypothesis) would want to take the resources we depend on for its own use . . . . an AI ‘does not love you, nor does it hate you, but you are made of atoms it can use for something else’. Moreover, the AI would correctly recognize that humans do not want their resources used for the AI’s purposes, and that humans therefore pose a threat to the fulfillment of its goals – a threat to be mitigated however possible.
-- Muehlhauser, L. & Bostrom, N. (2014). Why We Need Friendly AI. Think 13: 41-47
Love Conquers All

But . . . what if . . . the AI *does* love you?
Love & Altruism Are Super-Rational

Advantageous beyond our ability to calculate and/or guarantee their ultimate effect (see also: faith)
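One concrete reading of "super-rational" is Hofstadter's superrationality (my reading, not the talk's definition): a player who assumes a symmetric opponent reasons identically can only land on the diagonal outcomes, and cooperation wins there. A minimal worked example with standard one-shot Prisoner's Dilemma payoffs:

```python
# Standard Prisoner's Dilemma payoffs, T=5 > R=3 > P=1 > S=0 (illustrative).
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

# Classical rationality: defection strictly dominates cooperation ...
assert PAYOFF[('D', 'C')][0] > PAYOFF[('C', 'C')][0]   # 5 > 3
assert PAYOFF[('D', 'D')][0] > PAYOFF[('C', 'D')][0]   # 1 > 0
# ... so two "rational" players land on (D, D) and each collect 1.

# A superrational player assumes a symmetric opponent reasons identically,
# so only the diagonal outcomes (C,C) and (D,D) are reachable:
best = max('CD', key=lambda move: PAYOFF[(move, move)][0])
print(best, PAYOFF[(best, best)])   # C (3, 3) -- beats the "rational" (1, 1)
```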
Failures of Rationality: The Centipede Game

Backward induction ("perfect" rationality) says the first player should take the pot immediately, yet players who keep passing routinely walk away with far more.
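A minimal sketch of that failure. The parameterization is an illustrative assumption, not from the talk: the pot starts at 2 and doubles each round, TAKE keeps 80%, the player who passed last gets the small share if nobody ever takes, and there are 10 rounds.

```python
def spe(pot, rounds, mover=0):
    """Payoffs under subgame-perfect play, found by backward induction."""
    if rounds == 0:                        # every round was passed
        out = [0.0, 0.0]
        out[mover] = 0.8 * pot             # the *other* player passed last
        out[1 - mover] = 0.2 * pot
        return tuple(out)
    take = [0.0, 0.0]
    take[mover] = 0.8 * pot                # TAKE: end the game, keep 80%
    take[1 - mover] = 0.2 * pot
    passed = spe(pot * 2, rounds - 1, 1 - mover)   # PASS: pot doubles
    # A narrowly "rational" mover takes whenever taking beats passing.
    return tuple(take) if take[mover] >= passed[mover] else passed

print(spe(pot=2, rounds=10))    # (1.6, 0.4): defect on the very first move

# If both players "irrationally" pass every round instead:
final_pot = 2 * 2 ** 10
print(final_pot * 0.8, final_pot * 0.2)   # 1638.4 409.6: both far better off
```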
Instrumental Goals Evolve
- Self-improvement
- Rationality/integrity
- Preserve goals/utility function
- Decrease/prevent fraud/counterfeit utility
- Survival/self-protection
- Efficiency (in resource acquisition & use)
- Community = assistance/non-interference through GTO reciprocation (OTfT + AP; see the sketch below)
- Reproduction

(adapted from Omohundro, 2008, "The Basic AI Drives")
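Reading "OTfT + AP" as optimistic tit-for-tat plus altruistic punishment (an assumption on my part), a minimal iterated Prisoner's Dilemma sketch shows why reciprocation sustains community: mutual reciprocators prosper, while a defector against them earns less than joining the community would pay. Payoffs, round count, and the 10% forgiveness rate are illustrative.

```python
import random

# Standard Prisoner's Dilemma payoffs (illustrative).
PAYOFF = {('C', 'C'): (3, 3), ('C', 'D'): (0, 5),
          ('D', 'C'): (5, 0), ('D', 'D'): (1, 1)}

def otft_ap(opponent_history):
    """Open with cooperation (optimistic); punish a defection on the next
    round even though punishing forgoes payoff (altruistic punishment);
    forgive 10% of the time so accidents don't lock in mutual defection."""
    if not opponent_history:
        return 'C'
    if opponent_history[-1] == 'D' and random.random() > 0.1:
        return 'D'
    return 'C'

def always_defect(opponent_history):
    return 'D'

def play(strat_a, strat_b, rounds=200):
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a, b = strat_a(hist_b), strat_b(hist_a)   # each sees the other's past
        pa, pb = PAYOFF[(a, b)]
        score_a, score_b = score_a + pa, score_b + pb
        hist_a.append(a); hist_b.append(b)
    return score_a, score_b

random.seed(1)
print(play(otft_ap, otft_ap))         # (600, 600): stable mutual cooperation
print(play(otft_ap, always_defect))   # defector earns far less than the 600
                                      # that reciprocating would have paid
```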
But Some Are Short-Sighted or Negative-Sum
- Preserve goals/utility function
- Money, power, size
- Efficiency
  - Theft
  - "Externalizing" costs & risks
  - Cutting safety margins
  - Suppressing diversity
- Survival/self-protection

AI must be controlled!