The holy grail of artificial intelligence is the quest to
develop general artificial general intelligence. A considerable amount of progress has been made
on specific forms of intelligence. Computers are now able to perform many tasks
at suprahuman levels. But extending
these specific capabilities to more general intelligence has so far proven
elusive.
One of the most important barriers to extending computational intelligence from
specific to general capabilities is an inadequate understanding of the different
kinds of problems that a general intelligence system would have to solve. Research has focused on one narrow range of
problems with the apparent expectation that solving those problems will lead eventually
to general problems solving. These are
problems that can be solved by parameter optimization. Optimization means that the learning
algorithm adjusts the values of its model parameters to gradually approximate
the desired output of the system. But
there are problems that cannot be solved using these methods.
The sum of three cubes
problem is one of these problems. Conceptually,
it is not very complicated. It
could straightforwardly be solved by brute force, that is by trying numbers
until a solution is found. Still it has resisted solutions for some numbers despite more half a century of effort.
In general form, the three cubes problem is this: For any integer
k, express that number as the sum of three integers cubed. For example, the integer 29 can be expressed
as 29 = 3³ + 1³ + 1³ (29 = 27 + 1 + 1). It is easy to determine that some numbers cannot be represented as the sum of three cubes. For example, the number 32 cannot be
expressed as the sum of three cubes, but until just recently, no one knew
whether the integer 33 could be. Is there
some set of three integers that satisfy the equation 33 = x³+ y³+ z³? In fact, until recently, there were only two
numbers below 100 for which a solution was unknown, 33 and 42. All of the others were either known to be
impossible or the three integers were known.
There is no known optimization method for finding the three numbers that
when cubed sum up to 33 or 42. There are no known methods to gradually
approximate a solution. Once the correct
three integers have been found, it is easy to verify that they are, in fact,
correct, but there is no solution that is partially correct, only solutions
that are correct or incorrect. The best that one can do is to guess at likely
numbers. Andrew Booker, at the
University of Bristol, was recently able to solve the problem for k = 33 by
improving somewhat the methods used to guess potential solutions. His method reduced the number of integers
that needed to be searched by an estimated 20%, but even after this improvement,
his solution consumed 23 core-years of processing time. That is a substantial amount of effort for a
fairly trivial problem. According
to Booker, “I don’t think [finding solutions to the sum of three cubes
problems] are sufficiently interesting research goals in their own right to
justify large amounts of money to arbitrarily hog a supercomputer.”
Why this problem is interesting for artificial intelligence
The sum of three cubes problem has resisted solution for
over half a century. This problem is
very easy to describe, but difficult, or at least tedious, to solve. Understanding the difficulty posed by this
kind of problem and how that challenge was addressed is, I think, important for understanding why general intelligence is a challenge and what
can be done to meet that challenge.
Current versions of machine learning can all be described in
terms of three sets of numbers. One set
of numbers maps properties of the physical world to numbers that can be used by
a computer. One maps the output of the
computer to properties of the physical world.
The third set of numbers represents the model that maps inputs to outputs. Machine learning consists of adjusting this
set of model numbers (using some optimization algorithm) to better approximate the
desired relation between inputs and outputs.
This kind of framework can learn to recognize speech, to create novel
musical creations, and play chess, go, or Jeopardy. In fact, some version of this approach can
solve any problem that can be represented in this way.
But it is still the case, that the success of these systems
relies heavily on the ability of human designers to construct these three sets
of numbers and select optimization algorithms to adjust the model. The sum of three cubes problem is not amenable
to an optimization approach because there is no way to determine which changes
to X, Y, and Z, will bring it closer to the desired solution. There is no way to define closer.
In 1965, I. J. Good raised the possibility of an
ultraintelligent computer system that would surpass
human intelligence:
“Let an ultraintelligent machine be defined as a machine
that can far surpass all the intellectual activities of any man however clever.
Since the design of machines is one of these intellectual activities, an
ultraintelligent machine could design even better machines; there would then
unquestionably be an 'intelligence explosion,' and the intelligence of man
would be left far behind. Thus the first ultraintelligent machine is the last
invention that man need ever make, provided that the machine is docile enough
to tell us how to keep it under control.”
Presumably, solving the sum of three cubes problem would
also be among the intellectual activities that such a machine would address,
since it continues to be a problem addressed by humans. This problem is conceptually much simpler
than designing intelligence programs, but it may be even less tractable.
Booker’s improved algorithm was not discovered
automatically. There is no algorithm
that we know of that can produce new algorithms like the one he produced. It took humans over 64 years to come up with
one even this good, despite fairly widespread interest in the problem. We do not know how Booker came up with the
insight leading to this new algorithm, nor do we know how to go about designing
a method that could do so predictably. General
intelligence will require computers to be able to generate new problem
representations and new algorithms to solve new problems, but we have little
idea of how to get there.
Even this new method faced a huge combinatoric
challenge. There are just so many
combinations of three numbers that could be the solution to the problem. No matter how intelligent a system is, there
ultimately may be no amount of intelligence that can eliminate this combinatoric
problem. If even the problem of finding
three numbers can be combinatorically challenging, what will a general
intelligence system face when trying to solve problems with even more variables? The time required to test a large number of integers
and their cubes may be reduced, but it cannot be eliminated.
To this point, no one has come up with a computer system
that can design its own models. Deep
learning systems that are said to come up with their own representations actually
still work by adjusting parameters in a prestructured model. The transformations that occur within the
model (moving from one layer to the next) are determined by the architecture of
those layers. For example, a linear autoencoder
layer does not learn an arbitrary representation of the data, it “learns” to
perform a principal component
analysis, a well-known statistical technique. So far
someone still has to come up with the design of the network and optimization
methods used to solve the problems.
The sum of three cubes problem could be solved by simple brute
force if we were to allocate sufficient resources to its solution. With other kinds of problems even the space
in which to apply the brute-force search may be obscure. Some insight problems, for example, are
difficult until the solver finds the right representation, at which point they
are typically easy to solve. Like the
sum of three cubes problem, insight problems do not admit of partial solutions
that can be selected through optimization.
The key to solving is to think of the problem in the right way. Solving these problems requires a switch in
how those problems are represented.
Here’s an insight problem whose solution may be familiar: Curt and Goldie are lying dead on a wet rug
in a locked room. The room is in an old
house near some railroad tracks. How did
they die?
Once you come up with the right model for this situation,
solving it is trivial, but the difficult part is often coming up with the right
representation. There are many other
insight problems, but these have not at all been studied by computer scientists
so far as I am aware. But the problem of
coming up with good representations has been the very mechanism of progress in
artificial intelligence. So far it has
been done slowly and painstakingly by people.
There are many other problems that a generally intelligent
system will have to address if it is ever to achieve general intelligence, let
alone superintelligence. We may someday
be able to create artificial general intelligence systems that can address
these problems, but it will require a different computational approach than any
we have available today.