❦
Suppose that human beings had absolutely no idea how they performed arithmetic. Imagine that human beings had evolved, rather than having learned, the ability to count sheep and add sheep. People using this built-in ability have no idea how it works, the way Aristotle had no idea how his visual cortex supported his ability to see things. Peano Arithmetic as we know it has not been invented. There are philosophers working to formalize numerical intuitions, but they employ notations such as
Plus-Of(Seven, Six) = Thirteen
to formalize the intuitively obvious fact that when you add “seven” plus “six,” of course you get “thirteen.”
In this world, pocket calculators work by storing a giant lookup table of arithmetical facts, entered manually by a team of expert Artificial Arithmeticians, for starting values that range between zero and one hundred. While these calculators may be helpful in a pragmatic sense, many philosophers argue that they’re only simulating addition, rather than really adding. No machine can really count—that’s why humans have to count thirteen sheep before typing “thirteen” into the calculator. Calculators can recite back stored facts, but they can never know what the statements mean—if you type in “two hundred plus two hundred” the calculator says “Error: Outrange,” when it’s intuitively obvious, if you know what the words mean, that the answer is “four hundred.”
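To make the parable concrete, here is a minimal Python sketch (my own illustration, not anything from the story itself) of such a calculator: a finite table of hand-entered facts, played back on demand, with the "Error: Outrange" failure for anything the experts never typed in.

```python
# A lookup-table "calculator" in the spirit of the parable: it recites
# stored facts and cannot generate answers the experts never entered.

# The dict comprehension stands in for the Artificial Arithmeticians'
# manual data entry, covering starting values from zero to one hundred.
FACTS = {(a, b): a + b for a in range(101) for b in range(101)}

def aa_calculator(a: int, b: int) -> str:
    """Play back a stored fact, or fail outside the memorized range."""
    if (a, b) in FACTS:
        return str(FACTS[(a, b)])
    return "Error: Outrange"

print(aa_calculator(7, 6))      # "13": a stored fact, played back
print(aa_calculator(200, 200))  # "Error: Outrange": no such fact was entered
```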
Some philosophers, of course, are not so naive as to be taken in by these intuitions. Numbers are really a purely formal system—the label “thirty-seven” is meaningful, not because of any inherent property of the words themselves, but because the label refers to thirty-seven sheep in the external world. A number is given this referential property by its semantic network of relations to other numbers. That’s why, in computer programs, the LISP token for “thirty-seven” doesn’t need any internal structure—it’s only meaningful because of reference and relation, not some computational property of “thirty-seven” itself.
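Here is a tiny sketch, again my own illustration, of the view being satirized: the token for a number is an opaque node, and all the system "knows" about it is whatever relations someone has written into the network.

```python
# The "purely relational" picture: a number-token has no internal structure;
# its only content is the edges someone has asserted about it.

# Hypothetical relation store; each entry is (relation, argument..., result).
RELATIONS = {
    ("successor-of", "thirty-six", "thirty-seven"),
    ("successor-of", "thirty-seven", "thirty-eight"),
    ("plus-of", "thirty", "seven", "thirty-seven"),
}

def known_about(token: str) -> set:
    """Everything the network can say about a token: its edges, nothing more."""
    return {r for r in RELATIONS if token in r}

print(known_about("thirty-seven"))  # the three asserted relations
print(known_about("two hundred"))   # set(): no edges, so nothing to say
```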
No one has ever developed an Artificial General Arithmetician, though of course there are plenty of domain-specific, narrow Artificial Arithmeticians that work on numbers between “twenty” and “thirty,” and so on. And if you look at how slow progress has been on numbers in the range of “two hundred,” then it becomes clear that we’re not going to get Artificial General Arithmetic any time soon. The best experts in the field estimate it will be at least a hundred years before calculators can add as well as a human twelve-year-old.
But not everyone agrees with this estimate, or with merely conventional beliefs about Artificial Arithmetic. It’s common to hear statements such as the following:
There is more than one moral to this parable, and I have told it with different morals in different contexts. It illustrates the idea of levels of organization, for example—a CPU can add two large numbers because the numbers aren’t black-box opaque objects, they’re ordered structures of 32 bits.
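As an illustration of that levels-of-organization point (my own sketch, not part of the original parable), here is addition rebuilt from the bit structure itself: a ripple-carry loop over 32-bit values, with no table of memorized sums anywhere.

```python
# Addition derived from the bits: each number is an ordered structure of
# 32 bits, and the sum falls out of per-bit XOR and AND operations.

def add_32bit(x: int, y: int) -> int:
    """Ripple-carry addition built from bitwise operations."""
    mask = (1 << 32) - 1
    x, y = x & mask, y & mask
    while y:
        carry = (x & y) << 1      # bit positions that generate a carry
        x = (x ^ y) & mask        # sum of the bits, ignoring carries
        y = carry & mask          # carries ripple into the next round
    return x

print(add_32bit(21, 16))    # 37, with no lookup table in sight
print(add_32bit(200, 200))  # 400, well outside any memorized range
```

This is, in spirit, what a hardware adder circuit does; the point is only that once the number has internal structure, the sum can be derived rather than recited.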
But for purposes of overcoming bias, let us draw two morals:
- First, the danger of believing assertions you can't regenerate from your own knowledge.
- Second, the danger of trying to dance around basic confusions.
Lest anyone accuse me of generalizing from fictional evidence, both lessons may be drawn from the real history of Artificial Intelligence as well.
The first danger is the object-level problem that the AA devices ran into: they functioned as tape recorders playing back “knowledge” generated from outside the system, using a process they couldn’t capture internally. A human could tell the AA device that “twenty-one plus sixteen equals thirty-seven,” and the AA devices could record this sentence and play it back, or even pattern-match “twenty-one plus sixteen” to output “thirty-seven!”—but the AA devices couldn’t generate such knowledge for themselves.
Which is strongly reminiscent of believing a physicist who tells you “Light is waves,” recording the fascinating words and playing them back when someone asks “What is light made of?,” without being able to generate the knowledge for yourself.
The second moral is the meta-level danger that consumed the Artificial Arithmetic researchers and opinionated bystanders—the danger of dancing around confusing gaps in your knowledge. The tendency to do just about anything except grit your teeth and buckle down and fill in the damn gap.
Whether you say, “It is emergent!,” or whether you say, “It is unknowable!,” in neither case are you acknowledging that there is a basic insight required which is possessable, but unpossessed by you.
How can you know when you’ll have a new basic insight? And there’s no way to get one except by banging your head against the problem, learning everything you can about it, studying it from as many angles as possible, perhaps for years. It’s not a pursuit that academia is set up to permit, when you need to publish at least one paper per month. It’s certainly not something that venture capitalists will fund. You want to either go ahead and build the system now, or give up and do something else instead.
Look at the comments above: none are aimed at setting out on a quest for the missing insight which would make numbers no longer mysterious, make “twenty-seven” more than a black box. None of the commenters realized that their difficulties arose from ignorance or confusion in their own minds, rather than an inherent property of arithmetic. They were not trying to achieve a state where the confusing thing ceased to be confusing.
If you read Judea Pearl’s Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference,1 then you will see that the basic insight behind graphical models is indispensable to problems that require it. (It’s not something that fits on a T-shirt, I’m afraid, so you’ll have to go and read the book yourself. I haven’t seen any online popularizations of Bayesian networks that adequately convey the reasons behind the principles, or the importance of the math being exactly the way it is, but Pearl’s book is wonderful.) There were once dozens of “non-monotonic logics” awkwardly trying to capture intuitions such as “If my burglar alarm goes off, there was probably a burglar, but if I then learn that there was a small earthquake near my home, there was probably not a burglar.” With the graphical-model insight in hand, you can give a mathematical explanation of exactly why first-order logic has the wrong properties for the job, and express the correct solution in a compact way that captures all the common-sense details in one elegant swoop. Until you have that insight, you’ll go on patching the logic here, patching it there, adding more and more hacks to force it into correspondence with everything that seems “obviously true.”
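For readers who want to see the burglar-alarm case in running form, here is a small sketch by brute-force enumeration over the classic Burglary → Alarm ← Earthquake structure. The graph is the standard textbook one; the specific probabilities are invented for illustration.

```python
# "Explaining away," computed by enumeration over the burglary/earthquake
# network. The structure follows the classic example; the numbers are made up.

from itertools import product

P_B = 0.001          # prior probability of a burglary
P_E = 0.002          # prior probability of a small earthquake
P_ALARM = {          # P(alarm | burglary, earthquake)
    (True, True): 0.95, (True, False): 0.94,
    (False, True): 0.29, (False, False): 0.001,
}

def joint(b: bool, e: bool, a: bool) -> float:
    """Joint probability P(b, e, a) under the network's factorization."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    p_alarm = P_ALARM[(b, e)]
    return p * (p_alarm if a else 1 - p_alarm)

def p_burglary(alarm: bool, earthquake=None) -> float:
    """P(burglary | alarm), or P(burglary | alarm, earthquake) if given."""
    num = den = 0.0
    for b, e in product([True, False], repeat=2):
        if earthquake is not None and e != earthquake:
            continue  # condition on the observed earthquake value
        p = joint(b, e, alarm)
        den += p
        if b:
            num += p
    return num / den

print(p_burglary(alarm=True))                   # ~0.37: far above the 0.001 prior
print(p_burglary(alarm=True, earthquake=True))  # ~0.003: the quake explains the alarm away
```

Learning about the earthquake lowers the burglary posterior without any special-case rule: the non-monotonic behavior falls out of conditioning on the joint distribution the graph encodes.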
You won’t know the Artificial Arithmetic problem is unsolvable without its key. If you don’t know the rules, you don’t know the rule that says you need to know the rules to do anything. And so there will be all sorts of clever ideas that seem like they might work, like building an Artificial Arithmetician that can read natural language and download millions of arithmetical assertions from the Internet.
And yet somehow the clever ideas never work. Somehow it always turns out that you “couldn’t see any reason it wouldn’t work” because you were ignorant of the obstacles, not because no obstacles existed. Like shooting blindfolded at a distant target—you can fire blind shot after blind shot, crying, “You can’t prove to me that I won’t hit the center!” But until you take off the blindfold, you’re not even in the aiming game. When “no one can prove to you” that your precious idea isn’t right, it means you don’t have enough information to strike a small target in a vast answer space. Until you know your idea will work, it won’t.
From the history of previous key insights in Artificial Intelligence, and the grand messes that were proposed prior to those insights, I derive an important real-life lesson: When the basic problem is your ignorance, clever strategies for bypassing your ignorance lead to shooting yourself in the foot.
1. Pearl, Probabilistic Reasoning in Intelligent Systems. ↩︎