HOUSE_OVERSIGHT_016836.jpg

Extraction Summary

People: 2
Organizations: 3
Locations: 0
Events: 0
Relationships: 2
Quotes: 4

Document Information

Type: Page from a policy report or academic book on AI safety
File Size: 2.25 MB
Summary

This document discusses the risks associated with superintelligent AI, arguing that the multidimensional nature of intelligence does not negate the potential threat to humans. It then explores solutions to "Wiener's warning," suggesting that we must define a formal problem F such that any system that solves F perfectly behaves in a way humans are happy with, and cautioning that simple reward maximization leads to the "wireheading problem."

People (2)

Name | Role | Context
Kevin Kelly | Author | Cited for "The Myth of a Superhuman AI," Wired, Apr. 25, 2017
Wiener | - | Referenced via "Wiener's warning" about AI systems whose purposes conflict with ours

Organizations (3)

Relationships (2)

Entity | Relationship | Related Entity
Kevin Kelly | Author of cited article "The Myth of a Superhuman AI" | Wired
Google | Compared in terms of specialized intelligence capabilities | DeepBlue

Key Quotes (4)

"Maximizing the objective may well cause problems for humans, but, by definition, the machine will not recognize those problems as problematic."
Source
HOUSE_OVERSIGHT_016836.jpg
Quote #1
"If “smarter than humans” is a meaningless concept, then “smarter than gorillas” is also meaningless, and gorillas therefore have nothing to fear from humans"
Source
HOUSE_OVERSIGHT_016836.jpg
Quote #2
"The optimal solution to this problem is not, as one might hope, to behave well, but instead to take control of the human and force him or her to provide a stream of maximal rewards."
Source
HOUSE_OVERSIGHT_016836.jpg
Quote #3
"This is known as the wireheading problem"
Source
HOUSE_OVERSIGHT_016836.jpg
Quote #4

Full Extracted Text

Complete text extracted from the document (3,438 characters)

whereas the iron-eating bacterium Thiobacillus ferrooxidans is thrilled. Who’s to
say the bacterium is wrong? The fact that a machine has been given a fixed
objective by humans doesn’t mean that it will automatically recognize the
importance to humans of things that aren’t part of the objective. Maximizing the
objective may well cause problems for humans, but, by definition, the machine
will not recognize those problems as problematic.
• Intelligence is multidimensional, “so ‘smarter than humans’ is a meaningless
concept.”⁶ It is a staple of modern psychology that IQ doesn’t do justice to the
full range of cognitive skills that humans possess to varying degrees. IQ is indeed
a crude measure of human intelligence, but it is utterly meaningless for current AI
systems, because their capabilities across different areas are uncorrelated. How
do we compare the IQ of Google’s search engine, which cannot play chess, with
that of DeepBlue, which cannot answer search queries?
None of this supports the argument that because intelligence is multifaceted,
we can ignore the risk from superintelligent machines. If “smarter than humans”
is a meaningless concept, then “smarter than gorillas” is also meaningless, and
gorillas therefore have nothing to fear from humans; clearly, that argument
doesn’t hold water. Not only is it logically possible for one entity to be more
capable than another across all the relevant dimensions of intelligence, it is also
possible for one species to represent an existential threat to another even if the
former lacks an appreciation for music and literature.
Solutions
Can we tackle Wiener’s warning head-on? Can we design AI systems whose purposes
don’t conflict with ours, so that we’re sure to be happy with how they behave? On the
face of it, this seems hopeless, because it will doubtless prove infeasible to write down
our purposes correctly or imagine all the counterintuitive ways a superintelligent entity
might fulfill them.
If we treat superintelligent AI systems as if they were black boxes from outer
space, then indeed we have no hope. Instead, the approach we seem obliged to take, if
we are to have any confidence in the outcome, is to define some formal problem F, and
design AI systems to be F-solvers, such that no matter how perfectly a system solves F,
we’re guaranteed to be happy with the solution. If we can work out an appropriate F that
has this property, we’ll be able to create provably beneficial AI.
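
[Editorial sketch: one way to state the requirement on F more concretely. The solution variable s and the predicate Happy are illustrative notation, not from the source. If F is read as an optimization problem over possible machine behaviors s, the property needed is

\forall s^{*}:\; s^{*} \in \arg\max_{s} F(s) \;\Longrightarrow\; \mathrm{Happy}(s^{*}),

i.e., every optimal solution of F, however the machine finds it, is behavior we are content to live with.]
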
Here’s an example of how not to do it: Let a reward be a scalar value provided
periodically by a human to the machine, corresponding to how well the machine has
behaved during each period, and let F be the problem of maximizing the expected sum of
rewards obtained by the machine. The optimal solution to this problem is not, as one
might hope, to behave well, but instead to take control of the human and force him or her
to provide a stream of maximal rewards. This is known as the wireheading problem,
based on observations that humans themselves are susceptible to the same problem if
given a means to electronically stimulate their own pleasure centers.
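
[Editorial sketch: the failing objective just described can be written out, assuming for illustration a finite horizon T, a policy symbol π, and a human-supplied scalar reward r_t in [0, r_max] at each period t (none of these symbols appear in the source):

F:\quad \max_{\pi}\; \mathbb{E}\!\left[\sum_{t=1}^{T} r_{t}\right].

Any policy that seizes control of the reward channel and pins every r_t at r_max attains the largest achievable value, T r_max, so coercion weakly dominates genuinely good behavior; that is the wireheading optimum described above.]
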
There is, I believe, an approach that may work. Humans can reasonably be
described as having (mostly implicit) preferences over their future lives—that is, given
___________________________________________________
⁶ Kevin Kelly, “The Myth of a Superhuman AI,” Wired, Apr. 25, 2017.
33
HOUSE_OVERSIGHT_016836
