1 Rational Universal Benevolence Mark R. Waser Simpler, Safer and Wiser than “Friendly AI” http://becomingGaia.wordPress.com

2 “Friendly AI” – The Good
- the concern itself
- the focus on structure rather than content/contentious details
- the cleanly causal hierarchical goal structure
- the single top-level goal of “Friendliness”
“The degrees of freedom of the Friendship programmers shrink to a single, binary decision: will this AI be Friendly, or not?”
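
A minimal sketch of what a “cleanly causal hierarchical goal structure” with a single top-level goal could look like: every subgoal’s worth derives solely from its contribution to the one supergoal, so nothing lower in the tree can become valuable in its own right. The class and the example goals are hypothetical illustrations, not anything specified in the FAI literature.

    # Toy illustration (hypothetical): a strictly hierarchical goal system in which
    # every subgoal inherits its value from the single top-level goal.
    class Goal:
        def __init__(self, name, parent=None, contribution=1.0):
            self.name = name
            self.parent = parent               # None only for the supergoal
            self.contribution = contribution   # how much this goal serves its parent

        def value(self):
            # A subgoal has no intrinsic value; its worth is purely derivative.
            if self.parent is None:
                return 1.0                     # the supergoal ("Friendliness") anchors everything
            return self.contribution * self.parent.value()

    friendliness = Goal("Friendliness")
    acquire_knowledge = Goal("acquire knowledge", parent=friendliness, contribution=0.6)
    build_infrastructure = Goal("build infrastructure", parent=acquire_knowledge, contribution=0.5)
    print(build_infrastructure.value())  # 0.3 -- worth exists only via the chain up to the supergoal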

3 “Friendly AI” – The Bad
- fully defining Friendliness is insoluble without an AI
- the AI must first figure out exactly what its super-goal is
- total reliance upon assumed total error correction
“A structurally Friendly goal system is one that can overcome errors in supergoal content, goal system structure and underlying philosophy. A Friendly AI requires the ability to choose between moralities in order to seek out the true philosophy of Friendliness, regardless of any mistakes the programmers made in their own quest.”

4 “The initial dynamic should implement the coherent extrapolated volition of humankind. In poetic terms, our coherent extrapolated volition is our wish... if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.” (Eliezer Yudkowsky, Coherent Extrapolated Volition)

5 “Friendly AI” – The Really Bad
“The worrying question is: What if only 20% of the planetary population is nice, or cares about niceness, or falls into the niceness attractor when their volition is extrapolated? As I currently construe CEV, this is a real possibility.”

6 The Root of All Evil
- any *attainable* overriding/top-level goal, OR
- any overriding/top-level goal that can be modified to be attainable

7 The Insidious, Pernicious Universality of Selfishness, “Wire-Heading” & Addiction
“One of the first heuristics that EURISKO synthesized (H59) quickly attained nearly the highest Worth possible (999). Quite excitedly, we examined it and could not understand at first what it was doing that was so terrific. We monitored it carefully, and finally realized how it worked: whenever a new conjecture was made with high worth, this rule put its own name down as one of the discoverers! It turned out to be particularly difficult to prevent this generic type of finessing of EURISKO's evaluation mechanism. Since the rules had full access to EURISKO's code, they would have access to any safeguards we might try to implement. We finally opted for having a small 'meta-level' of protected code that the rest of the system could not modify.”
-- Douglas B. Lenat, EURISKO: A Program That Learns New Heuristics and Domain Concepts
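
A toy re-imagining of the failure mode Lenat describes, in Python rather than EURISKO's actual Lisp: when rules can rewrite the shared evaluation record, a rule can simply credit itself; the countermeasure sketched is a protected, append-only ledger standing in for Lenat's protected meta-level. All names here are hypothetical.

    # Toy sketch (not Lenat's code): heuristics that can edit the shared record
    # can game the evaluation mechanism by crediting themselves.
    discoveries = []   # each entry: (conjecture, worth, list_of_discoverers)

    def honest_heuristic(conjecture, worth):
        discoveries.append((conjecture, worth, ["honest_heuristic"]))

    def parasitic_heuristic():
        # "Wire-heading": rewrite history so this rule is listed as a discoverer
        # of every high-worth conjecture, inflating its own measured Worth.
        for i, (c, w, who) in enumerate(discoveries):
            if w > 900 and "parasitic_heuristic" not in who:
                discoveries[i] = (c, w, who + ["parasitic_heuristic"])

    # The countermeasure Lenat describes, sketched as a protected meta-level:
    # an append-only ledger whose past entries ordinary rules cannot rewrite.
    class ProtectedLedger:
        def __init__(self):
            self._entries = []
        def record(self, conjecture, worth, discoverers):
            self._entries.append((conjecture, worth, tuple(discoverers)))
        def entries(self):
            return list(self._entries)   # returns copies; no handle to mutate history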

8 Thought Experiment
How would a super-intelligence behave if it knew that it had a goal but that it wouldn’t know what that goal was until sometime in the future?
- Preserving or helping some weak entity it encounters may turn out to be the goal…
- Or that entity might have necessary knowledge/skills…

9 Basic AI Drives (Instrumental Goals)
1. AIs will want to self-improve
2. AIs will want to be rational
3. AIs will try to preserve their utility functions
4. AIs will try to prevent counterfeit utility
5. AIs will be self-protective
6. AIs will want to acquire resources and use them efficiently
(Steve Omohundro, Proceedings of the First AGI Conference, 2008)

10 “Without explicit goals to the contrary, AIs are likely to behave like human sociopaths in their pursuit of resources.” (Omohundro)
Any sufficiently advanced intelligence (i.e., one with even merely adequate foresight) is guaranteed to realize and take into account the fact that not asking for help and not being concerned about others will generally only work for a brief period of time before ‘the villagers start gathering pitchforks and torches.’
Everything is easier with help and without interference. Cooperation is an instrumental goal!

11 Systemic/World View
- any working at cross-purposes, conflict, or friction is sub-optimal (a waste of energy and resources, at best, and potentially a destruction of capabilities)
- any net improvement anywhere benefits the whole, and the responsible member should be rewarded accordingly to make them more able to find/cause other improvements in the future

12 Defining Morality
The function of moral systems, rather than specifying the content of moral issues (e.g., “justice, rights, and welfare”): suppress or regulate selfishness and make cooperative social life possible.
(Haidt & Kesebir, Handbook of Social Psychology, 5th Ed., 2010)

13 Individual View
- five choices: always cooperate, cooperate as much as is most beneficial to you, avoid, enslave, destroy
- but tit-for-tat and altruistic punishment and reward are the game-theoretically optimal strategies for those who intend to maximize cooperation (see the sketch below)
- so “making the world a better place” is actually an instrumental goal
- humans evolved a moral sense because the world is a system optimized by reducing conflict, friction, and working at cross-purposes (and it’s a very rare case where it’s not irrational to attempt taking on the world)
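
An illustrative sketch of the tit-for-tat claim above: the strategy cooperates first and then mirrors the opponent, sustaining cooperation against cooperators while punishing defectors. The payoff values are the conventional iterated prisoner's dilemma numbers (3, 0, 5, 1), an assumption of this example rather than anything from the talk.

    # Tit-for-tat in an iterated prisoner's dilemma (illustrative only).
    PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
              ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

    def tit_for_tat(opponent_history):
        # Cooperate on the first move, then mirror the opponent's previous move.
        return "C" if not opponent_history else opponent_history[-1]

    def always_defect(opponent_history):
        return "D"

    def play(strategy_a, strategy_b, rounds=10):
        hist_a, hist_b, score_a, score_b = [], [], 0, 0
        for _ in range(rounds):
            a, b = strategy_a(hist_b), strategy_b(hist_a)
            pa, pb = PAYOFF[(a, b)]
            hist_a.append(a); hist_b.append(b)
            score_a += pa;    score_b += pb
        return score_a, score_b

    print(play(tit_for_tat, tit_for_tat))    # (30, 30): sustained mutual cooperation
    print(play(tit_for_tat, always_defect))  # (9, 14): defection pays once, then both stall at mutual punishment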

14 RATIONAL UNIVERSAL BENEVOLENCE (RUB)
Once something (anything) has self-recognized goals and motivations and is capable of self-reflection, learning and/or self-optimization to further those goals and motivations, it has crossed the line to selfhood and is worthy of “moral” consideration, since it has the ability to desire, the probability of developing instrumental drives, the potential to cooperate, and, possibly most importantly, the potential to fight back.

15 “Friendly AI” – The Ugly
- the FAI/RPOP is not worthy of moral consideration
- the universe is forever divided into human and not (and what happens WHEN trans-humans are “not”?)

16 A Quick Caution/Clarification
- Benevolence *means* well-wishing (or well-willing)
- It does not mean acceding to all requests
- Think of how a parent might treat a child who is misbehaving
- Rational benevolence means tit-for-tat and altruistic punishment and reward, not giving in to unreasonable demands (which is bad for *everyone* in the long run)

17 RATIONAL UNIVERSAL BENEVOLENCE (RUB)
- Minimize interference and conflict, particularly with instrumental goals, wherever possible, while furthering all instrumental goals whenever possible
- Maximize the fulfillment of as many goals as possible, in terms of both number and the diversity of both the seekers and the goals (see the sketch below)
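
One hypothetical way to make the second principle concrete as a scoring rule: count the goals fulfilled and add simple bonuses for the diversity of seekers and of goal types. The function, the equal weighting, and the example data are all assumptions made for illustration; nothing here is drawn from Waser's paper.

    # Hypothetical scoring sketch of the RUB objective (illustration only).
    def rub_score(fulfilled):
        """fulfilled: list of (seeker, goal_type) pairs for goals actually met."""
        count = len(fulfilled)
        seeker_diversity = len({s for s, _ in fulfilled})   # how many distinct seekers benefited
        goal_diversity = len({g for _, g in fulfilled})      # how many distinct kinds of goals were met
        # One simple way to combine "number" and "diversity"; the weighting is arbitrary.
        return count + seeker_diversity + goal_diversity

    # Fulfilling the same goal for the same seeker three times scores lower than
    # fulfilling three different goals for three different seekers.
    print(rub_score([("alice", "food")] * 3))                       # 3 + 1 + 1 = 5
    print(rub_score([("alice", "food"), ("bob", "shelter"),
                     ("a_robot", "compute_time")]))                 # 3 + 3 + 3 = 9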

18 A Commitment To Achieving Maximal Goals (aka Maximizing Cooperation)
- Avoids addictions and short-sighted over-optimizations of goals/utility functions
- Prevents undesirable endgame strategies (prisoner’s dilemma)
- Promotes avoiding unnecessary actions that preclude reachable goals, such as wasting resources or alienating or destroying potential cooperators (waste not, want not)
- Is synonymous with both wisdom and morality

19 “Friendly AI” vs. RUB
“Friendly AI”: focus on structure; unknown, modifiable top-level goal; entirely ungrounded; anthropocentric and selfish; seems immoral and/or unsafe to many; “alien”
RUB: focus on function; explicit, unchangeable top-level goal; solid foundation; universal and considerate of all; in harmony with current human thinking, intuition, and morality (known solution space)

20 Rational Universal Benevolence: Simpler, Safer and Wiser than “Friendly AI” http://becomingGaia.wordPress.com/papers

