Showing posts with label data science. Show all posts

Sunday, April 7, 2019

The sum of three cubes problem and why it is interesting for artificial intelligence


The holy grail of artificial intelligence is the quest to develop artificial general intelligence.  A considerable amount of progress has been made on specific forms of intelligence. Computers are now able to perform many tasks at superhuman levels.  But extending these specific capabilities to more general intelligence has so far proven elusive.

One of the most important barriers to extending computational intelligence from specific to general capabilities is an inadequate understanding of the different kinds of problems that a general intelligence system would have to solve.  Research has focused on a narrow range of problems with the apparent expectation that solving those problems will eventually lead to general problem solving.  These are problems that can be solved by parameter optimization.  Optimization means that the learning algorithm adjusts the values of its model parameters to gradually approximate the desired output of the system.  But there are problems that cannot be solved using these methods.

The sum of three cubes problem is one of these problems.  Conceptually, it is not very complicated. It could straightforwardly be solved by brute force, that is, by trying numbers until a solution is found.  Still, it has resisted solution for some numbers despite more than half a century of effort.
 
In general form, the three cubes problem is this: For any integer k, express that number as the sum of the cubes of three integers.  For example, the integer 29 can be expressed as 29 = 3³ + 1³ + 1³ (29 = 27 + 1 + 1). It is easy to determine that some numbers cannot be represented as the sum of three cubes.  For example, the number 32 cannot be expressed as the sum of three cubes, but until just recently, no one knew whether the integer 33 could be.  Is there some set of three integers that satisfies the equation 33 = x³ + y³ + z³?  In fact, until recently, there were only two numbers below 100 for which a solution was unknown, 33 and 42.  All of the others were either known to be impossible or the three integers were known.
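
A brute-force search is easy to sketch.  The function and bound below are my own illustration; solutions for the hard values of k lie so far out that no feasible bound would reach them.

```python
# Brute-force search for k = x^3 + y^3 + z^3 with |x|, |y|, |z| <= bound.
# A sketch only: the eventual solution for k = 33 involves 16-digit
# integers, hopelessly beyond any bound this loop could cover.

def three_cubes(k, bound):
    """Return some (x, y, z) with x**3 + y**3 + z**3 == k, or None."""
    cubes = {n ** 3: n for n in range(-bound, bound + 1)}  # cube -> root
    for x in range(-bound, bound + 1):
        for y in range(x, bound + 1):        # x <= y avoids some duplicates
            remainder = k - x ** 3 - y ** 3
            if remainder in cubes:
                return (x, y, cubes[remainder])
    return None

print(three_cubes(29, 10))   # finds one decomposition of 29
print(three_cubes(32, 50))   # None: 32 has no such decomposition
```

Even with the cube-lookup table, the work grows quadratically with the bound, which is why the hard cases consumed core-years rather than seconds.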

There is no known optimization method for finding the three numbers that when cubed sum up to 33 or 42.  There are no known methods to gradually approximate a solution.  Once the correct three integers have been found, it is easy to verify that they are, in fact, correct, but there is no solution that is partially correct, only solutions that are correct or incorrect.  The best that one can do is to guess at likely numbers.  Andrew Booker, at the University of Bristol, was recently able to solve the problem for k = 33 by improving somewhat the methods used to guess potential solutions.  His method reduced the number of integers that needed to be searched by an estimated 20%, but even after this improvement, his solution consumed 23 core-years of processing time.  That is a substantial amount of effort for a fairly trivial problem.  According to Booker, “I don’t think [finding solutions to the sum of three cubes problems] are sufficiently interesting research goals in their own right to justify large amounts of money to arbitrarily hog a supercomputer.”
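
This asymmetry, instant verification versus core-years of search, is worth making concrete.  Checking a candidate takes one line (the helper name here is my own):

```python
# Verifying a proposed solution is trivial, no matter how large the
# integers; the entire difficulty lies in finding candidates worth checking.

def is_solution(k, x, y, z):
    return x ** 3 + y ** 3 + z ** 3 == k

print(is_solution(29, 3, 1, 1))   # True: the example from the text
```

Booker’s enormous triple for 33 verifies just as quickly.  Problems with this find-hard, check-easy structure give optimization no foothold: a candidate is right or wrong, never "warmer."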

Why this problem is interesting for artificial intelligence

The sum of three cubes problem has resisted solution for over half a century.  This problem is very easy to describe, but difficult, or at least tedious, to solve.  Understanding the difficulty posed by this kind of problem and how that challenge was addressed is, I think, important for understanding why general intelligence is a challenge and what can be done to meet that challenge.

Current versions of machine learning can all be described in terms of three sets of numbers.  One set of numbers maps properties of the physical world to numbers that can be used by a computer.  One maps the output of the computer to properties of the physical world.  The third set of numbers represents the model that maps inputs to outputs.  Machine learning consists of adjusting this set of model numbers (using some optimization algorithm) to better approximate the desired relation between inputs and outputs.  This kind of framework can learn to recognize speech, to create novel musical creations, and play chess, go, or Jeopardy.  In fact, some version of this approach can solve any problem that can be represented in this way.
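
A toy sketch of this framing (the task, data, and names are my own illustration, not a claim about any particular system): the "model" is just two adjustable numbers, and learning is a gradient-descent loop that nudges them toward the desired input-output relation.

```python
# Inputs and desired outputs, already encoded as numbers (the first two
# sets of numbers in the framing above); here the true relation is y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0      # the third set of numbers: the model's parameters
lr = 0.05            # learning rate

for _ in range(2000):
    # Gradient of mean squared error with respect to w and b.
    gw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
    gb = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
    w -= lr * gw
    b -= lr * gb

print(round(w, 2), round(b, 2))   # approaches 2.0 and 1.0
```

The loop works precisely because the error defines "closer": every small change to w and b either raises or lowers it.  That is the property the three cubes problem lacks.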

But it is still the case that the success of these systems relies heavily on the ability of human designers to construct these three sets of numbers and to select optimization algorithms to adjust the model.  The sum of three cubes problem is not amenable to an optimization approach because there is no way to determine which changes to x, y, and z will bring a candidate closer to the desired solution.  There is no way to define closer.

In 1965, I. J. Good raised the possibility of an ultraintelligent computer system that would surpass human intelligence:

“Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion,' and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control.”

Presumably, solving the sum of three cubes problem would also be among the intellectual activities that such a machine would address, since it continues to be a problem addressed by humans.  This problem is conceptually much simpler than designing intelligence programs, but it may be even less tractable. 

Booker’s improved algorithm was not discovered automatically.  There is no algorithm that we know of that can produce new algorithms like the one he produced.  It took humans over 64 years to come up with one even this good, despite fairly widespread interest in the problem.  We do not know how Booker came up with the insight leading to this new algorithm, nor do we know how to go about designing a method that could do so predictably.  General intelligence will require computers to be able to generate new problem representations and new algorithms to solve new problems, but we have little idea of how to get there.

Even this new method faced a huge combinatoric challenge.  There are just so many combinations of three numbers that could be the solution to the problem.  No matter how intelligent a system is, there ultimately may be no amount of intelligence that can eliminate this combinatoric problem.  If even the problem of finding three numbers can be combinatorically challenging, what will a general intelligence system face when trying to solve problems with even more variables?  The time required to test a large number of integers and their cubes may be reduced, but it cannot be eliminated.

To this point, no one has come up with a computer system that can design its own models.  Deep learning systems that are said to come up with their own representations actually still work by adjusting parameters in a prestructured model.  The transformations that occur within the model (moving from one layer to the next) are determined by the architecture of those layers.  For example, a linear autoencoder layer does not learn an arbitrary representation of the data, it “learns” to perform a principal component analysis, a well-known statistical technique.   So far someone still has to come up with the design of the network and optimization methods used to solve the problems.
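
The claim about linear autoencoders can be illustrated directly.  The sketch below (assuming NumPy; the synthetic data and names are my own) computes the rank-2 PCA reconstruction via SVD, which is the solution an optimal linear autoencoder converges to, and compares it against an arbitrary rank-2 projection:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data that genuinely lives near a 2-D plane in 5-D space.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))
Xc = X - X.mean(axis=0)                  # PCA assumes centered data

# Top-2 principal directions via SVD.
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
pca_recon = Xc @ Vt[:2].T @ Vt[:2]       # best possible rank-2 reconstruction

# Any other rank-2 projection reconstructs worse (Eckart-Young theorem).
Q, _ = np.linalg.qr(rng.normal(size=(5, 2)))
rand_recon = Xc @ Q @ Q.T

def rel_err(recon):
    return np.linalg.norm(Xc - recon) / np.linalg.norm(Xc)

print(rel_err(pca_recon), rel_err(rand_recon))
```

A linear bottleneck trained by gradient descent ends up spanning the same subspace as the top principal directions: the architecture, not the learning, determines that the representation is a principal-component projection.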

The sum of three cubes problem could be solved by simple brute force if we were to allocate sufficient resources to its solution.  With other kinds of problems even the space in which to apply the brute-force search may be obscure.  Some insight problems, for example, are difficult until the solver finds the right representation, at which point they are typically easy to solve.  Like the sum of three cubes problem, insight problems do not admit of partial solutions that can be selected through optimization.  The key to solving is to think of the problem in the right way.  Solving these problems requires a switch in how those problems are represented. 

Here’s an insight problem whose solution may be familiar:  Curt and Goldie are lying dead on a wet rug in a locked room.  The room is in an old house near some railroad tracks.  How did they die?

Once you come up with the right model for this situation, solving it is trivial, but the difficult part is often coming up with the right representation.  There are many other insight problems, but, so far as I am aware, they have scarcely been studied by computer scientists.  Yet the problem of coming up with good representations has been the very mechanism of progress in artificial intelligence.  So far it has been done slowly and painstakingly by people. 

There are many other problems that a generally intelligent system will have to address if it is ever to achieve general intelligence, let alone superintelligence.  We may someday be able to create artificial general intelligence systems that can address these problems, but it will require a different computational approach than any we have available today.  


Monday, June 18, 2018

Welcome to dead salmon data


The name for this blog was inspired by a paper by Craig M. Bennett, Abigail A. Baird, Michael B. Miller, and George L. Wolford. They studied how a dead salmon placed in a functional magnetic resonance imaging (fMRI) machine “responded” to pictures of people in emotional situations.  The salmon was not entirely forthcoming with its judgments, but Bennett and his colleagues did record some of the dead salmon’s “brain activity.” 


fMRI images are organized into tiny cubes, called “voxels.”  In flat, 2-D images, each tiny region in the image is represented by a pixel.  With the 3-D images derived using fMRI, each small 3-D region is called a voxel.  Just as the signal in a pixel represents how much light there was at that small region, the fMRI voxel represents how much biological activity there was in the corresponding 3-D region. In living brains, that activity usually represents the level of neural activity in the region.  Of the available voxels, Bennett and his colleagues found significant activity in 16 voxels, about 0.01% of the total volume. 

Based on these findings and using statistical analyses that were commonly used at the time, the authors might have concluded that they had identified the part of the fish’s brain responsible for making judgments about the emotional state of human beings depicted in photographs.  Before Bennett’s study was available, about 25–30% of fMRI studies published in journals used the same simple techniques, as did about 80% of the papers presented at a neuroscience conference.

Obviously, there is something wrong here.  The idea of a dead fish responding to human emotion does not pass the “smell test.”  It would be remarkable enough for a live fish to identify human emotion, but it is utterly beyond reason to think that a dead one could, or that the authors were measuring brain activity in a dead fish. On top of that, we have no evidence that the fish actually engaged in the task it was assigned, as it never revealed its decisions about the emotional valence of the pictures it “observed.”

The measures of brain activity obtained using functional magnetic resonance imaging, like most other measures of just about any phenomenon, include some random variation in the measured quantity. Standard statistical methods were designed to take this random variability into account.  Two measures may be numerically different without being reliably different from one another when both include some level of randomness. Statisticians have evolved the notion of statistical significance to indicate that an observed numerical difference is unlikely to have occurred by chance. 

But these assessments of statistical significance also involve some degree of random variability. Statisticians recognize that there are two kinds of errors that can occur.  Type I errors occur when we find a difference where there is no actual difference (a “false positive”); Type II errors occur when we fail to find a difference when there actually is one (a “false negative”). 

fMRI studies compare the measured activity in each voxel during the task with its measured level of activity during a control interval.  The more brain activity there is, the stronger the signal measured by the fMRI machine. If we were to compare the activity levels of 130,000 voxels under these two conditions, some of them would show high levels of activity just by chance, because our measure is imperfect.  We would expect some percentage of these comparisons to result in false positive errors.  The specific percentage we would expect is determined by our standard of significance, often called the “p value.” It was pretty common in fMRI studies to set this value to 0.001, meaning that we would expect about 130 of the 130,000 voxels to show what appears to be a real difference when there is really nothing but random measurement error.  Multiple comparisons create opportunities for multiple false positive errors.
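
This expectation is easy to simulate (a sketch, with an arbitrary seed): when no real effect exists, each test’s p-value is uniform on [0, 1], so about 130 of 130,000 noise-only voxels pass at p < 0.001.

```python
import random

random.seed(1)
n_voxels = 130_000
alpha = 0.001

# Under the null hypothesis, a test's p-value is uniform on [0, 1],
# so each voxel "passes" with probability alpha by chance alone.
false_positives = sum(1 for _ in range(n_voxels) if random.random() < alpha)
print(false_positives)   # close to the expected 130
```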

There are statistical techniques for dealing with large numbers of comparisons.  In summary, these methods modify the significance standard so that differences relative to control have to be greater before one accepts that they are due to anything but chance.  When those adjustments are applied to the dead salmon fMRI data, it turns out that there are no significant areas of brain activity in the fish, as we really should expect.  When the same techniques were applied by Bennett to 11 published fMRI studies, three of them had no significant results to report at all.  If these corrections for multiple comparisons are not properly applied, researchers can come to false conclusions about the validity of their data, which can lead to wasted research effort and dead ends.
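
One of the simplest such adjustments (not necessarily the one Bennett applied) is the Bonferroni correction: divide the significance threshold by the number of comparisons.  Rerunning the pure-noise simulation at the corrected threshold shows the false positives vanish:

```python
import random

random.seed(2)
n_voxels = 130_000
corrected_alpha = 0.05 / n_voxels   # Bonferroni: about 3.8e-7 per voxel

# With the stricter per-voxel threshold, chance alone almost never passes.
survivors = sum(1 for _ in range(n_voxels) if random.random() < corrected_alpha)
print(survivors)   # almost always 0 when there is no real effect
```

The trade-off is exactly the one described above: each individual difference must be far larger before it counts as significant.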

The importance of these findings extends far beyond one dead salmon, and far beyond fMRI.  Spurious correlations are common throughout big-data data science.  For example, annual US spending on space, science, and technology correlates, over the years 1999 to 2009, with the annual incidence of suicides by hanging, strangulation, and suffocation (http://www.tylervigen.com/spurious-correlations).  If there is some kind of causal relationship here, it completely escapes me. If you have enough data and look hard enough, you can find any number of correlations, but that is no guarantee that those correlations are at all meaningful.  People have a way of finding patterns, even when there are none.  Sometimes the statistics help.
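
The hazard is easy to reproduce with pure noise (a sketch assuming NumPy; the series length mirrors an 11-year annual window): generate a few hundred unrelated series and the best pair will look impressively correlated.

```python
import numpy as np

rng = np.random.default_rng(0)
series = rng.normal(size=(200, 11))    # 200 unrelated 11-point "trends"

corr = np.corrcoef(series)             # all 200 x 200 pairwise correlations
np.fill_diagonal(corr, 0)              # ignore each series paired with itself
best = np.abs(corr).max()
print(best)                            # routinely above 0.8 -- from pure noise
```

With nearly 20,000 pairs and only 11 points per series, very high correlations are guaranteed to appear somewhere, which is why a striking correlation found by search, rather than predicted in advance, is weak evidence of anything.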

Traditional statistics were developed for a time when the expensive part of research was collecting data.  Now data are plentiful, but extracting meaning from those data is still a challenge.  Simple methods are often the best, but they can also be too simple. In later posts we will consider other potential hazards of data science and what we can do to avoid them.  Unless we are careful with our data and our analyses, it is easy to get seduced into believing that we understand something when really we do not.  On the other hand, it is also easy to get seduced by approaches and methods that are popular, but may not be the best approach for the questions at hand.

In future posts, I hope to consider many of the ways in which data science, machine learning, and artificial intelligence can go wrong, but also some of the ways it can go right.  Data science is of growing importance to our businesses and even to our culture, but it is also easily misunderstood. In the same way that we see animals and faces in clouds, we may attribute to data science, particularly to artificial intelligence, properties that are not really there.  I hope to address some of these tendencies and make data science stronger and more useful as a result.


Bennett, C. M., Baird, A. A., Miller, M. B., & Wolford, G. L. (2009). Neural correlates of interspecies perspective taking in the post-mortem Atlantic salmon: An argument for proper multiple comparisons correction. Poster presented at the 15th Annual Meeting of the Organization for Human Brain Mapping, San Francisco, CA.

Bennett, C. M., Baird, A. A., Miller, M. B., & Wolford, G. L. (2010). Neural correlates of interspecies perspective taking in the post-mortem Atlantic salmon: An argument for multiple comparisons correction. Journal of Serendipitous and Unexpected Results.