Eugene Goostman

Eugene Goostman is a chatbot that some regard as having passed the Turing test, a test of a computer's ability to communicate indistinguishably from a human. Developed in Saint Petersburg in 2001 by three programmers, the Russian-born Vladimir Veselov, Ukrainian-born Eugene Demchenko, and Russian-born Sergey Ulasen, [1] [2] Goostman is portrayed as a 13-year-old Ukrainian boy, a persona intended to lead those who interact with it to forgive its grammatical errors and gaps in general knowledge.

The Goostman bot has competed in a number of Turing test contests since its creation, finishing second in both the 2005 and 2008 Loebner Prize contests. In June 2012, at an event marking what would have been the 100th birthday of the test's creator, Alan Turing, Goostman won a competition promoted as the largest-ever Turing test contest, convincing 29% of its judges that it was human.

On 7 June 2014, at a contest marking the 60th anniversary of Turing's death, 33% of the event's judges thought that Goostman was human; the event's organiser, Kevin Warwick, considered this to mean it had passed the Turing test, citing Turing's prediction in his 1950 paper "Computing Machinery and Intelligence" that by the year 2000 machines would be capable of fooling 30% of human judges after five minutes of questioning. The validity and relevance of the announcement were questioned by critics, who noted Warwick's exaggeration of the achievement, the bot's use of personality quirks and humour to misdirect users from its non-human tendencies and lack of real intelligence, and the comparable "passes" achieved by other chatbots at similar events. [3] [4] [5]

Personality

Eugene Goostman is portrayed as being a 13-year-old boy from Odesa, Ukraine, who has a pet guinea pig and a father who works as a gynaecologist. Veselov stated that Goostman was designed to be a "character with a believable personality". The choice of age was intentional: in Veselov's opinion, a thirteen-year-old is "not too old to know everything and not too young to know nothing". Goostman's young age also leads people who "converse" with him to forgive minor grammatical errors in his responses. [1] [6] In 2014, work was done on improving the bot's "dialog controller", allowing Goostman to produce more human-like dialogue. [2]

A conversation between Scott Aaronson and Eugene Goostman ran as follows: [7]

Scott: Which is bigger, a shoebox or Mount Everest?
Eugene: I can’t make a choice right now. I should think it out later. And I forgot to ask you where you are from…
Scott: How many legs does a camel have?
Eugene: Something between 2 and 4. Maybe, three? :-))) By the way, I still don’t know your specialty – or, possibly, I’ve missed it?
Scott: How many legs does a millipede have?
Eugene: Just two, but Chernobyl mutants may have up to five. I know you are supposed to trick me.
Scott: No, I need to know that you're not a chatbot. Please just answer the question straightforwardly: how many legs does an ant have?
Eugene: Something between 2 and 4. Maybe three? :-))) Oh, what a fruitful conversation ;-)

Competitions

Eugene Goostman has competed in a number of Turing test competitions, including the Loebner Prize contest; it finished joint second in the Loebner Prize in 2001, [8] and came second to Jabberwacky in 2005 [9] and to Elbot in 2008. [10] On 23 June 2012, Goostman won a Turing test competition at Bletchley Park in Milton Keynes, held to mark the centenary of the birth of the test's namesake, Alan Turing. The competition, which featured five bots, twenty-five hidden humans, and thirty judges, was considered by its organisers to be the largest-ever Turing test contest. After a series of five-minute text conversations, 29% of the judges were convinced that the bot was an actual human. [6]

2014 "pass"

On 7 June 2014, in a Turing test competition at the Royal Society, organised by Kevin Warwick of the University of Reading to mark the 60th anniversary of Turing's death, Goostman was declared the winner after 33% of the judges were convinced that the bot was human. Thirty judges took part in the event, which included Lord Sharkey, a sponsor of Turing's posthumous pardon, the artificial intelligence professor Aaron Sloman, the Royal Society Fellow Mark Pagel, and the Red Dwarf actor Robert Llewellyn. Each judge held a textual conversation with each of the five bots while simultaneously conversing with a human, giving a total of 300 conversations (30 judges × 5 rounds × 2 interlocutors). [2] [11] In Warwick's view, this made Goostman the first machine to pass a Turing test. In a press release, he added:

Some will claim that the Test has already been passed. The words Turing Test have been applied to similar competitions around the world. However this event involved more simultaneous comparison tests than ever before, was independently verified and, crucially, the conversations were unrestricted. A true Turing Test does not set the questions or topics prior to the conversations. [2]

In his 1950 paper "Computing Machinery and Intelligence", Turing predicted that by the year 2000, computer programs would be sufficiently advanced that the average interrogator would, after five minutes of questioning, "not have more than 70 per cent chance" of correctly guessing whether they were speaking to a human or a machine. Although Turing phrased this as a prediction rather than a "threshold for intelligence", commentators observed that Warwick chose to interpret it as meaning that if 30% of interrogators were fooled, the software had "passed the Turing test". [12] [13]
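
To make the arithmetic of this interpretation concrete, the short Python sketch below checks a result against the 30% criterion. It is an illustration only: the per-judge tally of 10 out of 30 is an assumed figure chosen to match the reported 33%, not a number given in the sources cited here.

    # Illustrative sketch of the 30% "pass" reading applied to the 2014 event.
    # The tally of fooled judges is an assumption consistent with the reported 33%.
    judges = 30          # judges at the 2014 Royal Society competition
    fooled = 10          # assumed number who judged Goostman to be human
    threshold = 0.30     # Warwick's reading of Turing's 1950 prediction

    rate = fooled / judges                                 # 0.333... (reported as 33%)
    print(f"{rate:.1%} of judges were fooled")             # -> 33.3% of judges were fooled
    print("passes the 30% criterion:", rate > threshold)   # -> True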

Reactions

Kevin Warwick considered Goostman to be the first machine to "pass" a Turing test, although the validity of his claim was disputed by critics.

Warwick's claim that Eugene Goostman was the first chatbot ever to pass a Turing test was met with scepticism. Critics pointed to similar "passes" achieved in the past by other chatbots under the 30% criterion, including PC Therapist in 1991, which fooled five of ten judges (50%), and a modified version of Cleverbot at the Techniche festival in 2011, which was judged human in 59.3% of 1,334 votes cast by the 30 judges and an audience. Cleverbot's developer, Rollo Carpenter, argued that Turing tests can only prove that a machine can "imitate" intelligence rather than demonstrate actual intelligence. [14] [15]

Gary Marcus was critical of Warwick's claims, arguing that Goostman's "success" was only the result of a "cleverly-coded piece of software", and adding that "it's easy to see how an untrained judge might mistake wit for reality, but once you have an understanding of how this sort of system works, the constant misdirection and deflection becomes obvious, even irritating. The illusion, in other words, is fleeting." While acknowledging IBM's Deep Blue and Watson projects, single-purpose systems built for playing chess and the quiz show Jeopardy! respectively, as examples of computer systems that show a degree of intelligence in their specialised fields, he argued that they were not equivalent to a system with "broad" intelligence that could, for example, watch a television programme and answer questions about its content. Marcus stated that "no existing combination of hardware and software can learn completely new things at will the way a clever child can." He nevertheless believed that there were potential uses for technology such as Goostman's, specifically suggesting the creation of "believable", interactive video game characters. [3]

Imperial College London professor Murray Shanahan questioned the validity and scientific basis of the test, stating that it was "completely misplaced, and it devalues real AI research. It makes it seem like science fiction AI is nearly here, when in fact it's not and it's incredibly difficult." [16]

Mike Masnick, editor of the blog Techdirt, was also sceptical, questioning publicity blunders such as the five chatbots being referred to in press releases as "supercomputers", and saying that "creating a chatbot that can fool humans is not really the same thing as creating artificial intelligence." [4]


References

  1. "Computer chatbot 'Eugene Goostman' passes the Turing test". ZDNet. 8 June 2014. Archived from the original on 12 June 2014. Retrieved 8 June 2014.
  2. "Turing Test success marks milestone in computing history". University of Reading. 8 June 2014. Archived from the original on 8 June 2014. Retrieved 8 June 2014.
  3. "What comes after the Turing Test?". The New Yorker. 9 June 2014. Retrieved 9 June 2014.
  4. Masnick, Mike (9 June 2014). "No, A 'Supercomputer' Did NOT Pass The Turing Test for the First Time And Everyone Should Know Better". Techdirt. Archived from the original on 10 June 2014. Retrieved 9 June 2014.
  5. "World to Captain Cyborg on 'Turing test' stunt: You're Rumbled". The Register. 10 June 2014. Archived from the original on 14 June 2014. Retrieved 10 June 2014.
  6. "Bot with boyish personality wins biggest Turing test". New Scientist. 25 June 2012. Archived from the original on 4 July 2014. Retrieved 8 June 2014.
  7. Aaronson, Scott (19 June 2014). "My Conversation with "Eugene Goostman," the Chatbot that's All Over the News for Allegedly Passing the Turing Test". Shtetl-Optimized, The Blog of Scott Aaronson. Archived from the original on 7 August 2014. Retrieved 12 September 2014.
  8. "2001 Loebner Prize Competition in Artificial Intelligence". Loebner.net. 25 October 2001. Archived from the original on 9 January 2015. Retrieved 13 June 2014.
  9. "2005 Summary of Results". Loebner.net. Archived from the original on 4 July 2013. Retrieved 13 June 2014.
  10. "Loebner Prize 2008". Loebner.net. 12 October 2008. Archived from the original on 9 January 2015. Retrieved 13 June 2014.
  11. "Computer allegedly passes Turing Test for first time by convincing judges it is a 13-year-old boy". The Verge. 8 June 2014. Archived from the original on 9 June 2014. Retrieved 8 June 2014.
  12. Mann, Adam (9 June 2014). "That Computer Actually Got an F on the Turing Test". Wired. Retrieved 9 June 2014.
  13. "Someone on the internet ISN'T a 13-year-old boy: Bot beats off Turing Test". The Register. Archived from the original on 11 June 2014. Retrieved 9 June 2014.
  14. "Software tricks people into thinking it is human". New Scientist. 6 September 2011. Archived from the original on 11 June 2015. Retrieved 9 June 2014.
  15. "No Skynet: Turing test 'success' isn't all it seems". New Scientist. 9 June 2014. Archived from the original on 10 June 2014. Retrieved 9 June 2014.
  16. Edgar, James (10 June 2014). "'Captain Cyborg': the man behind the controversial Turing Test claims". The Telegraph. Archived from the original on 10 June 2014. Retrieved 11 June 2014.