Monday, 8 December 2008

Does the Turing Test really test intelligence?

The Turing Test has a solid standing in the Artificial Intelligence research community as the ultimate test for intelligent machines. That standing may be the result of historical reasons: Alan Turing's visionary paper and predictions put the Turing Test at the heart of any discussion on machine intelligence. Or perhaps it is because no machine can pass the test without cheating. A fully intelligent machine cannot pass the test, but a well-programmed machine can, within a time limit, fool a human examiner and pass the test in its mechanism but not in its spirit. There may be other reasons, but there are few, it seems, who would question the test itself. The declared aim of the Turing Test is to test intelligence and to provide a benchmark by which we can tell whether a machine is intelligent. But does it do that?

Let us examine the Turing Test closely. The test requires a human examiner to have a conversation with two unseen entities. One of these entities is a human whilst the other is the machine to be tested. There is an agreed time limit of 5 minutes, but that is often argued against. The key is that the machine passes if the examiner cannot tell, through the conversation, which is the machine and which is the human. Now, one of the conversation topics with which we can trap a pretending machine is the weather.

Assume we asked what the weather is like outside and got one of the following answers:

* It is 21 degrees with a northerly wind at a speed of 5 knots
* It is lovely weather today [don't you like it sunny?] (the actual weather outside is heavy rain)
* I do not like the weather in England; how do you cope with it?

Now which one do we think is a human's answer and which one is a machine's? The first answer gives the impression of a machine with good weather sensors, but could it not be a human with a weather station who is mentally lazy and reads out the numbers as they are? The last one could come from a human who is a foreigner in England, but could it not be a machine that just has preset answers to divert the topic in directions in which it can converse? The answer in the middle is the interesting one. The first part of the answer, which can be a genuine answer by an intelligent being, be it a human or a machine, gives the impression of a machine. However, when the optional part is added, which could again be a preset answer for a machine, it gives a sense of cynicism that one is likely to connect with a human rather than with a pattern-matching machine. These cases show the flaws in the Turing Test argument and return us to the question: what does it test?
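To see how little machinery a "pretending" machine needs to produce answers like the three above, here is a minimal Python sketch of a keyword matcher with preset replies. Everything in it is an assumption made for illustration: the names `PRESET_REPLIES`, `FALLBACK`, `reply` and `sensor_reply` are hypothetical, and no real chatbot is claimed to work this way.

```python
import random

# Hypothetical preset replies, keyed by a keyword the machine looks for.
PRESET_REPLIES = {
    "weather": [
        "It is lovely weather today, don't you like it sunny?",
        "I do not like the weather in England; how do you cope with it?",
    ],
}

# Used when no keyword matches: a generic line that steers the conversation.
FALLBACK = "That is interesting. Tell me more."


def reply(question, rng=random):
    """Return a canned reply for the first known keyword in the question."""
    lowered = question.lower()
    for keyword, replies in PRESET_REPLIES.items():
        if keyword in lowered:
            # No understanding is involved: the reply is picked at random.
            return rng.choice(replies)
    return FALLBACK


def sensor_reply(temperature_c, wind_direction, wind_knots):
    """Format raw (assumed) sensor readings, as in the first answer."""
    return (f"It is {temperature_c} degrees with a {wind_direction} wind "
            f"at a speed of {wind_knots} knots")


print(reply("How is the weather outside?"))
print(sensor_reply(21, "northerly", 5))
```

The matcher never checks the actual weather, which is exactly why it could say "lovely" in heavy rain. An examiner may still read that reply as human cynicism.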

For an ultimate intelligent machine to pass the test, it has to be able to pretend to be human. This requires the machine to be conscious of itself as a machine. It must be conscious of the fact that the test requires it to come across as human. It must be conscious of the time and visual limitations of the test. And finally, it must be conscious of what makes a human come across as human, i.e. all the non-intelligent human quirkiness. After all, we would be much quicker to accept a robot as intelligent if it could hold a conversation with a good laugh about football!

In my opinion, the Turing Test does not test intelligence, or at least not solely so. It tests consciousness, self-awareness, and the ability to lie. The last is the most important because the ability to lie is a distinctively human characteristic associated with our ability to create from imagination.

Our complex cognition makes it difficult for us to distinguish between awareness, consciousness, thinking, intelligence, and recognition of cognitive processes. The last is a good example of the complexity and the level of interweaving of our abilities. When we remember an experience, we recognize that we have remembered only after the memory has been recalled; but that recognition in itself makes us aware of the process of remembering; this often leads us to analyze the memory, the process of remembering, and the reasons why it was triggered. In other words, we become conscious of our existence in time, of the existence of the memories associated with an experience, and of the stimulus that triggered these memories. This leaves us wondering where intelligence is in all of this and how we can quantify it for measurement.

The thoughts provoked by this article are not completely new. Similar doubts have been expressed about the Turing Test, and some attempts are being made to find a quantifiable test of intelligence. The advances in cognitive systems make the need for such a test, or even better, for metrics, greater and more urgent. Many of these alternatives, however, fail because they focus on one element of intelligence or cognition, often learning and rational deduction. In most cases, intelligence is the result of the integration of abilities which, simple as they may be, together demonstrate the various facets of cognition and intelligence. For example, survival is an important ability but not necessarily a rational one; social relations are an important element of thinking but may not lead to rational decisions, e.g. parents staying with their children in a burning building.

Integrative (artificial) intelligence would itself require quantifiable metrics that measure the different factors in proportion to their impact on behaviour. For example, learning can be a form of categorization, but categorization can itself be a form of thinking and decision making, though it may lead to stereotype-based perception. Equally, categorization can be viewed as a form of memory organization that enables associative memory. Thus learning, thinking, memory, and perception are all necessary in defining intelligence. In addition, embodiment is just as important. Studies in animal intelligence have given us, and could give us more, insight into the separation between intelligence and the other aspects of mind and indeed of being human.
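One simple way to picture factors measured "in proportion to their impact on behaviour" is a weighted average of per-ability scores. The Python sketch below is only an illustration under stated assumptions: the function name `composite_score`, the 0-to-1 score scale, and every number in it are invented for the example, and nothing here proposes actual weights.

```python
def composite_score(scores, weights):
    """Weighted average of per-ability scores (each assumed to lie in 0..1)."""
    if set(scores) != set(weights):
        raise ValueError("scores and weights must cover the same abilities")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Each ability contributes in proportion to its weight.
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return weighted_sum / total_weight


# Purely illustrative numbers; the real weights are the open question.
weights = {"learning": 0.3, "thinking": 0.3, "memory": 0.2, "perception": 0.2}
scores = {"learning": 0.8, "thinking": 0.5, "memory": 0.9, "perception": 0.6}

print(round(composite_score(scores, weights), 2))
```

A linear combination like this treats the abilities as independent, which the categorization example above suggests they are not. It is therefore best read as a starting point for discussion, not as a proposed metric.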

These are some thoughts on what is needed to build metrics for testing intelligent systems: metrics that are more coherent, less confused, and measurable, so that they truly test intelligence. This is not, by any means, an attempt to set such metrics. To do so, a comprehensive discussion between psychologists, sociologists, AI researchers, neurologists and philosophers is needed to extract the components of intelligence from the mesh of the human mind and to identify their weights in defining an intelligent being (be it a machine!).

A good reading list on the Turing Test and associated topics can be found at: http://www.aisb.org.uk/publicunderstanding/turing_test.shtml
