Living with the Turing test.
Researchers from the National Advisory Committee for Aeronautics (NACA) using an IBM type 704 electronic data processing machine in 1957. Photo: Wikimedia Commons
As of last week, the Turing test has—allegedly—been passed. In 1950, Alan Turing famously predicted that in the early twenty-first century, computer programs capable of sending and receiving text messages would be able to fool human judges into mistaking them for humans 30 percent of the time, and that we would come to “speak of machines thinking without expecting to be contradicted.” Two weekends ago, at a Turing test competition held at the Royal Society in London, a piece of so-called “chatbot” software called “Eugene Goostman” crossed that mark, fooling ten of the thirty human judges who spoke with it.
The official press release described this as a “milestone in computing history”—a “historic event.” Was it? We should not, of course, take a press release’s word for it. (Said release describes the winning chatbot program as a “supercomputer,” a head-scratching conflation of hardware with software.)
The release says this is the first time a computer program has scored above 30 percent in an “unrestricted” Turing test. This appears to be true. We don’t have access to the transcripts of these conversations (the organizers declined my request), but we know that the persona adopted by the winning chatbot (“Eugene Goostman”) was that of a thirteen-year-old, non-native-speaking foreigner. The Turing tests of the 1990s were restricted by topic, with the judge’s questions limited to a single domain. Here, the place of those constraints has been taken by restricted fluency, both linguistic and cultural. From correspondence with the contest organizers, I learned that the human judges were themselves chosen to include children and non-native speakers. So we might fairly argue about what, for a Turing test, truly counts. These questions are deeper than they seem.