Reverse Turing test

A reverse Turing test is a Turing test [1] in which the objectives or roles of computers and humans have been reversed. Conventionally, the Turing test is conceived as having a human judge and a subject that is either a computer attempting to appear human or a human control; the judge attempts to distinguish which of these two situations is actually occurring. It is presumed that a human subject will always be judged human, and a computer is said to "pass the Turing test" if it, too, is judged human. Critical to the concept is the parallel situation of a human judge and a human subject who also attempts to appear human. Any of these roles may be changed to form a "reverse Turing test".

Reversal of objective

Arguably the standard form of the reverse Turing test is one in which the subjects attempt to appear to be a computer rather than a human.

A formal reverse Turing test follows the same format as a Turing test: human subjects attempt to imitate the conversational style of a conversation program. Doing this well involves deliberately ignoring, to some degree, the meaning of the conversation that is immediately apparent to a human, and simulating the kinds of errors that conversational programs typically make. Arguably unlike the conventional Turing test, this is most interesting when the judges are highly familiar with conversation programs, meaning that in a regular Turing test they can very rapidly tell the difference between a computer program and a human behaving normally.

The humans who perform best in the reverse Turing test are those who know computers best, and so know the kinds of errors that computers can be expected to make in conversation. Much of the skill involved is shared with the skill of mentally simulating a program's operation in the course of computer programming, and especially debugging. As a result, programmers (especially hackers) will sometimes indulge in an informal reverse Turing test for recreation.
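
To make the imitated style concrete, the following is a minimal sketch of a keyword-matching responder in the manner of early conversation programs; the rules and phrasings are invented for illustration and are not drawn from any particular program. A reverse-Turing subject imitates exactly these habits: shallow keyword triggers, canned fallbacks, and indifference to meaning.

```python
import random

# Hypothetical keyword-to-reply rules in the style of early conversation
# programs; the rules and phrasings here are illustrative only.
RULES = [
    ("mother", "Tell me more about your family."),
    ("computer", "Do computers worry you?"),
    ("because", "Is that the real reason?"),
]

FALLBACKS = [
    "Please go on.",
    "I see.",
    "What does that suggest to you?",
]

def respond(message: str) -> str:
    """Reply to the first keyword hit and ignore everything else the user
    said -- the characteristic error a reverse-Turing subject imitates."""
    lowered = message.lower()
    for keyword, reply in RULES:
        if keyword in lowered:
            return reply
    # No keyword matched: fall back to a canned, content-free prompt.
    return random.choice(FALLBACKS)

if __name__ == "__main__":
    print(respond("I argued with my mother about computers."))  # keyword hit
    print(respond("The weather was lovely today."))             # canned fallback
```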

An informal reverse Turing test involves an attempt to simulate a computer without the formal structure of the Turing test. The judges of the test are typically not aware in advance that a reverse Turing test is occurring, and the test subject attempts to elicit from the 'judges' (who correctly believe they are speaking to a human) a response along the lines of "is this really a human?". Describing such a situation as a "reverse Turing test" typically occurs retroactively.

There are also cases of accidental reverse Turing tests, occurring when a programmer is in a sufficiently non-human mood that their conversation unintentionally resembles that of a computer.[citation needed] In these cases the description is invariably retroactive and humorously intended. The subject may be described as having passed or failed a reverse Turing test, or as having failed a Turing test. The latter description is arguably more accurate in these cases; see also the next section.

Failure by control subjects

Since Turing test judges are sometimes presented with genuinely human subjects as a control, it inevitably happens that a small proportion of such control subjects are judged to be computers. This is considered humorous and is often embarrassing for the subject.[citation needed]

This situation may be described literally as the human "failing the Turing test", since a computer (the intended subject of the test) achieving the same result would be described in the same terms. The same situation may also be described as the human "failing the reverse Turing test", because treating the human as the subject of the test reverses the roles of the real and control subjects.[citation needed]

Judgement by computer

The term "reverse Turing test" has also been applied to a Turing test (test of humanity) that is administered by a computer. In other words, a computer administers a test to determine if the subject is or is not human. Such procedures, called CAPTCHAs, are used in some anti-spam systems to prevent automated bulk use of communications systems.

The use of CAPTCHAs is controversial. [2] Circumvention methods exist that reduce their effectiveness, and many implementations (particularly those designed to counter circumvention) are inaccessible to humans with disabilities or are simply difficult for humans to pass.

Note that "CAPTCHA" is an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart" so that the original designers of the test regard the test as a Turing test to some degree.

Judgement of sufficient input

An alternative conception of a reverse Turing test is to use the test to determine whether sufficient information is being transmitted between the tester and the subject. For example, if the information sent by the tester is insufficient for a human doctor to diagnose accurately, then a medical diagnostic program cannot be blamed for also failing to diagnose accurately.[citation needed]

This formulation is of particular use in developing artificial intelligence programs, because it gives an indication of the input needed by a system that attempts to emulate human activities.[citation needed]
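
A hedged sketch of how such a comparison might be scored; the labels, data, and margin below are invented purely for illustration:

```python
def accuracy(predictions, truth):
    """Fraction of cases labelled correctly."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)

def input_is_sufficient(human_preds, program_preds, truth, margin=0.05):
    """Reverse-Turing reading of a diagnosis task: the program is only
    'blamed' for errors that a human expert avoids on the same input.
    If the human baseline is no better (within the margin), the input
    itself is judged insufficient."""
    human_acc = accuracy(human_preds, truth)
    program_acc = accuracy(program_preds, truth)
    return {
        "human_accuracy": human_acc,
        "program_accuracy": program_acc,
        "input_sufficient": human_acc > program_acc + margin,
    }

if __name__ == "__main__":
    # Hypothetical diagnoses made from the same sparse case notes.
    truth   = ["flu", "cold", "flu", "cold", "flu"]
    human   = ["flu", "flu",  "flu", "cold", "cold"]   # 3/5 correct
    program = ["flu", "flu",  "cold", "cold", "flu"]   # 3/5 correct
    # The human does no better, so the program cannot be blamed:
    print(input_is_sufficient(human, program, truth))
```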

Related Research Articles

In the field of artificial intelligence, the most difficult problems are informally known as AI-complete or AI-hard, implying that the difficulty of these computational problems, assuming intelligence is computational, is equivalent to that of solving the central artificial intelligence problem—making computers as intelligent as people, or strong AI. To call a problem AI-complete reflects an attitude that it would not be solved by a simple specific algorithm.

ELIZA is an early natural language processing computer program created from 1964 to 1966 at MIT by Joseph Weizenbaum. Created to explore communication between humans and machines, ELIZA simulated conversation using a pattern matching and substitution methodology that gave users an illusion of understanding on the part of the program, though it had no representation that could be said to really understand what was being said by either party. While the ELIZA program itself was originally written in MAD-SLIP, the pattern matching directives that contained most of its language capability were provided in separate "scripts" written in a Lisp-like notation. The most famous script, DOCTOR, simulated a psychotherapist of the Rogerian school and used rules, dictated in the script, to respond to user inputs with non-directive questions. As such, ELIZA was one of the first chatterbots and one of the first programs capable of attempting the Turing test.

A chatbot or chatterbot is a software application used to conduct an online chat conversation via text or text-to-speech, in lieu of providing direct contact with a live human agent. Chatbots are computer programs capable of maintaining a conversation with a user in natural language, understanding their intent, and replying based on preset rules and data. Designed to convincingly simulate the way a human would behave as a conversational partner, chatbot systems typically require continuous tuning and testing; many in production remain unable to converse adequately, and as of 2012 none of them could pass the standard Turing test. The term "ChatterBot" was originally coined by Michael Mauldin in 1994 to describe these conversational programs.

Conversation is interactive communication between two or more people. The development of conversational skills and etiquette is an important part of socialization. The development of conversational skills in a new language is a frequent focus of language teaching and learning. Conversation analysis is a branch of sociology which studies the structure and organization of human interaction, with a more specific focus on conversational interaction.

A CAPTCHA is a type of challenge–response test used in computing to determine whether the user is human.

The Loebner Prize was an annual competition in artificial intelligence that awarded prizes to the computer programs considered by the judges to be the most human-like. The prize has been reported as defunct since 2020. The format of the competition was that of a standard Turing test: in each round, a human judge simultaneously held textual conversations with a computer program and a human being via computer and, based upon the responses, had to decide which was which.

Algorithmic learning theory is a mathematical framework for analyzing machine learning problems and algorithms. Synonyms include formal learning theory and algorithmic inductive inference. Algorithmic learning theory is different from statistical learning theory in that it does not make use of statistical assumptions and analysis. Both algorithmic and statistical learning theory are concerned with machine learning and can thus be viewed as branches of computational learning theory.

"Computing Machinery and Intelligence" is a seminal paper written by Alan Turing on the topic of artificial intelligence. The paper, published in 1950 in Mind, was the first to introduce his concept of what is now known as the Turing test to the general public.

PARRY was an early example of a chatbot, implemented in 1972 by psychiatrist Kenneth Colby.

An Internet bot, web robot, robot, or simply bot is a software application that runs automated tasks (scripts) over the Internet, usually with the intent to imitate human activity, such as messaging, on a large scale. An Internet bot plays the client role in a client–server model, whereas the server role is usually played by web servers. Internet bots can perform simple, repetitive tasks much faster than a person could. The most extensive use of bots is for web crawling, in which an automated script fetches, analyzes, and files information from web servers. More than half of all web traffic is generated by bots.

Robby Garner is an American natural language programmer and software developer. He won the 1998 and 1999 Loebner Prize contests with the program called Albert One. He is listed in the 2001 Guinness Book of World Records as having written the "most human" computer program.

The philosophy of artificial intelligence is a branch of the philosophy of technology that explores artificial intelligence and its implications for knowledge and understanding of intelligence, ethics, consciousness, epistemology, and free will. Furthermore, the field is concerned with the creation of artificial animals or artificial people, so the discipline is of considerable interest to philosophers. These factors contributed to the emergence of the philosophy of artificial intelligence. Some scholars argue that the AI community's dismissal of philosophy is detrimental.

The Verbot (Verbal-Robot) was a popular chatterbot program and artificial intelligence software development kit (SDK) for Windows and the web.

Artificial stupidity is a term used within the field of computer science to refer to a technique of "dumbing down" computer programs in order to deliberately introduce errors in their responses.
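
As a loose illustration (the error type and rate here are invented), such dumbing down can be as simple as corrupting an otherwise correct response:

```python
import random

def dumb_down(text: str, error_rate: float = 0.15, seed=None) -> str:
    """Deliberately degrade a correct response by swapping adjacent
    letters, mimicking the mechanical typos a program might be given."""
    rng = random.Random(seed)
    chars = list(text)
    for i in range(len(chars) - 1):
        if chars[i].isalpha() and chars[i + 1].isalpha() and rng.random() < error_rate:
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

if __name__ == "__main__":
    print(dumb_down("I am certain the answer is forty-two.", seed=3))
```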

In computer graphics the graphics Turing test is a variant of the Turing test, the twist being that a human judge viewing and interacting with an artificially generated world should be unable to reliably distinguish it from reality.

Kenneth Mark Colby was an American psychiatrist dedicated to the theory and application of computer science and artificial intelligence to psychiatry. Colby was a pioneer in the development of computer technology as a tool to try to understand cognitive functions and to assist both patients and doctors in the treatment process. He is perhaps best known for the development of a computer program called PARRY, which mimicked a person with paranoid schizophrenia and could "converse" with others. PARRY sparked serious debate about the possibility and nature of machine intelligence.

The computer game bot Turing test is a variant of the Turing test, where a human judge viewing and interacting with a virtual world must distinguish between other humans and video game bots, both interacting with the same virtual world. This variant was first proposed in 2008 by Associate Professor Philip Hingston of Edith Cowan University, and implemented through a tournament called the 2K BotPrize.

The confederate effect is the phenomenon of people falsely classifying human intelligence as machine intelligence during Turing tests. For example, in the Loebner Prize, during which a tester conducts a text exchange with one human and one artificial-intelligence chatbot and is tasked with identifying which is which, the confederate effect describes the tester inaccurately identifying the human as the machine.

The Turing test, originally called the imitation game by Alan Turing in 1950, is a test of a machine's ability to exhibit intelligent behaviour equivalent to, or indistinguishable from, that of a human. Turing proposed that a human evaluator would judge natural language conversations between a human and a machine designed to generate human-like responses. The evaluator would be aware that one of the two partners in conversation was a machine, and all participants would be separated from one another. The conversation would be limited to a text-only channel, such as a computer keyboard and screen, so the result would not depend on the machine's ability to render words as speech. If the evaluator could not reliably tell the machine from the human, the machine would be said to have passed the test. The test results would not depend on the machine's ability to give correct answers to questions, only on how closely its answers resembled those a human would give.

Informal methods of validation and verification are some of the more frequently used in modeling and simulation. They are called informal because they are more qualitative than quantitative. While many methods of validation or verification rely on numerical results, informal methods tend to rely on the opinions of experts to draw a conclusion. While numerical results are not the primary focus, this does not mean that the numerical results are completely ignored. There are several reasons why an informal method might be chosen. In some cases, informal methods offer the convenience of quick testing to see if a model can be validated. In other instances, informal methods are simply the best available option. Informal methods are not less effective than formal methods and should be performed with the same discipline and structure that one would expect in "formal" methods. When executed properly, solid conclusions can be made.

References

  1. Turing, A. M. (October 1950). "Computing Machinery and Intelligence". Mind. 59 (236): 433–460. doi:10.1093/mind/LIX.236.433.
  2. "Techi".