Filename extension | .aiml |
---|---|
Developed by | Dr. Richard S. Wallace |
Initial release | July 16, 2001 [1] |
Latest release | |
Type of format | Artificial intelligence |
Extended from | XML |
Open format? | Yes |
Website | www |
Artificial Intelligence Markup Language (AIML) is an XML dialect for creating natural language software agents.
The XML dialect called AIML was developed by Richard Wallace and a worldwide free software community between 1995 and 2002. AIML formed the basis for what was initially a highly extended ELIZA called "A.L.I.C.E." ("Artificial Linguistic Internet Computer Entity"), which won the annual Loebner Prize Competition in Artificial Intelligence [3] three times and was also the Chatterbox Challenge [4] champion in 2004.
Because the A.L.I.C.E. AIML set was released under the GPL, and because most AIML interpreters are offered under a free or open source license, many "Alicebot clones" have been created based on the original implementation of the program and its AIML knowledge base. Free AIML sets [5] in several languages have been developed and made available by the user community. AIML interpreters are available in Java, Ruby, Python, C++, C#, Pascal, and other languages. A semi-formal specification [6] and a W3C XML Schema for AIML [7] are available.
Since early 2013, the A.L.I.C.E. Foundation has been working on a draft specification for AIML 2.0. [8]
AIML contains several elements. The most important of these are described in further detail below.
Categories in AIML form the fundamental unit of knowledge. A category consists of at least two further elements: the pattern and template elements. Here is a simple category:
<category>
  <pattern>WHAT IS YOUR NAME</pattern>
  <template>My name is Michael N.S Evanious.</template>
</category>
When this category is loaded, an AIML bot will respond to the input "What is your name" with the response "My name is Michael N.S Evanious."
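To illustrate what "loading" a category might involve, here is a minimal Python sketch that parses the category above with the standard library's `xml.etree.ElementTree` and answers by exact lookup. The `<aiml>` wrapper element and the helper names `load_categories` and `respond` are assumptions made for this illustration; real interpreters use a more elaborate matching structure and support wildcards.

```python
import xml.etree.ElementTree as ET

# The wrapping <aiml> root is assumed here so the snippet is a complete XML document.
AIML_SOURCE = """
<aiml>
  <category>
    <pattern>WHAT IS YOUR NAME</pattern>
    <template>My name is Michael N.S Evanious.</template>
  </category>
</aiml>
"""


def load_categories(source):
    """Return a dict mapping each literal pattern to its template text."""
    root = ET.fromstring(source.strip())
    categories = {}
    for category in root.iter("category"):
        pattern = category.findtext("pattern").strip()
        template = category.findtext("template").strip()
        categories[pattern] = template
    return categories


def respond(categories, user_input):
    """Upper-case the input and collapse whitespace, then look it up.

    Returns None when no category matches.
    """
    key = " ".join(user_input.upper().split())
    return categories.get(key)


categories = load_categories(AIML_SOURCE)
print(respond(categories, "What is your name"))
```

Run as written, the final line prints the template text, "My name is Michael N.S Evanious."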
A pattern is a string of characters intended to match one or more user inputs. A literal pattern like
WHAT IS YOUR NAME
will match only one input, ignoring case: "what is your name". But patterns may also contain wildcards, which match one or more words. A pattern like
WHAT IS YOUR *
will match an infinite number of inputs, including "what is your name", "what is your shoe size", "what is your purpose in life", etc.
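The matching rule described above can be sketched in a few lines of Python. This is only an illustration of word-level matching where `*` stands for one or more words; the function name `matches` is hypothetical, and actual interpreters typically also apply priority rules between competing patterns, which are omitted here.

```python
def matches(pattern, user_input):
    """Return True if the input matches the pattern, compared word by word.

    A "*" in the pattern stands for one or more input words.
    Comparison ignores case.
    """
    p = pattern.upper().split()
    w = user_input.upper().split()

    def match_from(i, j):
        # i indexes pattern words, j indexes input words.
        if i == len(p):
            return j == len(w)
        if p[i] == "*":
            # Let the wildcard consume every possible non-empty run of words.
            return any(match_from(i + 1, k) for k in range(j + 1, len(w) + 1))
        return j < len(w) and p[i] == w[j] and match_from(i + 1, j + 1)

    return match_from(0, 0)
```

With this sketch, `matches("WHAT IS YOUR *", "what is your shoe size")` is true, while `matches("WHAT IS YOUR *", "what is your")` is false because the wildcard needs at least one word.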
The AIML pattern syntax is a very simple pattern language, substantially less expressive than regular expressions, so the set of languages it can describe falls short of the full regular (Type 3) class in the Chomsky hierarchy. To compensate for the simple pattern matching capabilities, AIML interpreters can provide preprocessing functions to expand abbreviations, correct misspellings, etc.
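A preprocessing step of this kind might look like the following Python sketch. The substitution table and the function name `normalize` are invented for illustration; each interpreter defines its own normalization rules and word lists.

```python
import re

# Hypothetical substitution table: abbreviations and common misspellings
# mapped to the form the patterns expect.
SUBSTITUTIONS = {
    "WHAT'S": "WHAT IS",
    "UR": "YOUR",
    "YUOR": "YOUR",
}


def normalize(user_input):
    """Upper-case the input, drop punctuation, and apply word substitutions."""
    text = user_input.upper()
    # Replace anything other than A-Z, digits, apostrophes and spaces with a space.
    text = re.sub(r"[^A-Z0-9' ]+", " ", text)
    words = [SUBSTITUTIONS.get(word, word) for word in text.split()]
    return " ".join(words)
```

For example, `normalize("What's ur name?")` yields `"WHAT IS YOUR NAME"`, which a literal pattern can then match.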
AIML as a whole is at least as expressive as a finite-state machine, and therefore at least Type 3 in the Chomsky hierarchy, because each topic can act as a state. To implement this behavior, each topic should include a catch-all "*" pattern so that the state is not left accidentally. A state transition is implemented with the <think><set name="topic">state2</set></think> construct. This way, the bot can "remember" the topic under discussion, or even user privileges gained during the chat.
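The topic-as-state idea can be sketched in Python as a small table-driven state machine. Everything in this example (the topic names, the rules, the `Bot` class) is hypothetical, and it simplifies real AIML by treating each topic as an isolated state with its own catch-all `*` rule.

```python
# (topic, pattern) -> (reply, next_topic); next_topic None means "stay in this topic".
# All topics, patterns, and replies are made up for this illustration.
RULES = {
    ("default", "LOG IN"): ("What is the password?", "login"),
    ("default", "*"): ("Please log in first.", None),
    ("login", "SESAME"): ("Welcome back.", "admin"),
    ("login", "*"): ("Wrong password, try again.", None),
    ("admin", "LOG OUT"): ("Goodbye.", "default"),
    ("admin", "*"): ("You are logged in.", None),
}


class Bot:
    def __init__(self):
        self.topic = "default"  # current state

    def respond(self, user_input):
        key = " ".join(user_input.upper().split())
        # Fall back to the topic's catch-all rule so the state is never left by accident.
        rule = RULES.get((self.topic, key)) or RULES[(self.topic, "*")]
        reply, next_topic = rule
        if next_topic is not None:
            self.topic = next_topic  # plays the role of setting the "topic" variable
        return reply
```

In this sketch the privilege gained by entering the password is remembered simply because the bot stays in the `admin` topic until the user logs out.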
A template specifies the response to a matched pattern. A template may be as simple as some literal text, like
My name is John.
A template may use variables, such as the example
My name is <bot name="name"/>.
which will substitute the bot's name into the sentence, or
You told me you are <get name="user-age"/> years old.
which will substitute the user's age (if known) into the sentence.
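As a rough illustration of how such variables might be filled in, the following Python sketch replaces `<bot name="..."/>` and `<get name="..."/>` tags using two dictionaries. The function name `render`, the regular-expression approach, and the `"unknown"` fallback are assumptions of this sketch; an actual interpreter would walk the parsed XML tree and define its own default for unset values.

```python
import re

# Matches self-closing <bot name="..."/> and <get name="..."/> tags.
TAG = re.compile(r'<(bot|get) name="([^"]+)"\s*/>')


def render(template, bot_properties, user_predicates):
    """Substitute bot properties and user predicates into a template string."""

    def substitute(match):
        kind, name = match.group(1), match.group(2)
        source = bot_properties if kind == "bot" else user_predicates
        # "unknown" is a placeholder chosen for this sketch.
        return source.get(name, "unknown")

    return TAG.sub(substitute, template)
```

For instance, `render('My name is <bot name="name"/>.', {"name": "Alice"}, {})` gives `"My name is Alice."`.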
Template elements include basic text formatting, conditional response (if-then/else), and random responses.
Templates may also redirect to other patterns, using an element called srai (Symbolic Reduction in Artificial Intelligence). This can be used to implement synonymy, as in this example (where CDATA is used to avoid the need for XML escaping):
<category>
  <pattern>WHAT IS YOUR NAME</pattern>
  <template><![CDATA[My name is <bot name="name"/>.]]></template>
</category>
<category>
  <pattern>WHAT ARE YOU CALLED</pattern>
  <template>
    <srai>what is your name</srai>
  </template>
</category>
The first category simply answers an input "what is your name" with a statement of the bot's name. The second category, however, says that the input "what are you called" should be redirected to the category that matches the input "what is your name"—in other words, it is saying that the two phrases are equivalent.
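The redirection can be pictured as feeding the srai text back into the matcher as if the user had typed it. The Python sketch below models that with a plain dictionary; the tuple encoding of srai, the bot name "Alice", and the depth limit are choices made for this illustration only.

```python
# A template is either literal text or ("srai", redirected input).
CATEGORIES = {
    "WHAT IS YOUR NAME": "My name is Alice.",
    "WHAT ARE YOU CALLED": ("srai", "what is your name"),
}


def respond(user_input, depth=0):
    """Answer the input, following srai redirections recursively."""
    if depth > 10:
        # Guard against categories that redirect to each other forever.
        raise RecursionError("srai redirection loop")
    key = " ".join(user_input.upper().split())
    template = CATEGORIES.get(key)
    if template is None:
        return None
    if isinstance(template, tuple):
        _, target = template
        # Treat the redirected text as if it were new user input.
        return respond(target, depth + 1)
    return template
```

Both `respond("What is your name")` and `respond("what are you called")` return the same answer, which is the sense in which the two inputs are treated as synonyms.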
Templates can contain other types of content, which may be processed by whatever user interface the bot is talking through. So, for example, a template may use HTML tags for formatting, which can be ignored by clients that don't support HTML.