Nested quotation

Last updated

A nested quotation is a quotation that is encapsulated inside another quotation, forming a hierarchy with multiple levels. When focusing on a certain quotation, one must interpret it within its scope. Nested quotation can be used in literature (as in nested narration), speech, and computer science (as in "meta"-statements that refer to other statements as strings). Nested quotation can be very confusing until evaluated carefully and until each quotation level is put into perspective.

Contents

In literature

In languages that allow for nested quotes and use quotation mark punctuation to indicate direct speech, hierarchical quotation sublevels are usually punctuated by alternating between primary quotation marks and secondary quotation marks. For a comprehensive analysis of the major quotation mark systems employed in major writing systems, see Quotation mark.

In JavaScript programming

Nested quotes often become an issue using the eval keyword. [1] The eval function is a function that converts and interprets a string as actual JavaScript code, and runs that code. If that string is specified as a literal, then the code must be written as a quote itself (and escaped accordingly).

For example:

eval("var a=3; alert();");

This code declares a variable a, which is assigned the value 3, and a blank alert window is popped up to the user.

Nested strings (level 2)

Suppose we had to make a quote inside the quoted interpreted code. In JavaScript, you can only have one unescaped quote sublevel, which has to be the alternate of the top-level quote. If the 2nd-level quote symbol is the same as the first-level symbol, these quotes must be escaped. [2] For example:

alert("I don't need to escape here"); alert('Nor is it "required" here'); alert('But now I do or it won\'t work');

Nested strings (level 3 and beyond)

Furthermore, (unlike in the literature example), the third-level nested quote must be escaped in order not to conflict with either the first- or second-level quote delimiters. This is true regardless of alternating-symbol encapsulation. Every level after the third level must be recursively escaped for all the levels of quotes in which it is contained. This includes the escape character itself, the backslash (“\”), which is escaped by itself (“\\”).

For every sublevel in which a backslash is contained, it must be escaped for the level above it, and then all the backslashes used to escape that backslash as well as the original backslash, must be escaped, and so on and so forth for every level that is ascended. This is to avoid ambiguity and confusion in escaping.

Here are some examples that demonstrate some of the above principles:

document.write("<html><head></head><body><p>Hello, this is the body of the document.");document.writeln("</p>");document.write("<p>A newline in HTML code acts simply as whitespace, whereas a &lt;br&gt; starts a new line.");document.write("</p></body></html>\n");eval('eval(\"eval(\\\"alert(\\\\\\\"Now I\\\\\\\\\\\\\\\'m confused!\\\\\\\")\\\")\")');

Note that the number of backslashes increase from 0 to 1 to 3 to 7 to 15, indicating a rule for successively nested symbols, meaning that the length of the escape sequences grows exponentially with quotation depth.

See also

Related Research Articles

In computing and telecommunication, an escape character is a character that invokes an alternative interpretation on the following characters in a character sequence. An escape character is a particular case of metacharacters. Generally, the judgement of whether something is an escape character or not depends on the context.

<span class="mw-page-title-main">S-expression</span> Data serialization format

In computer programming, an S-expression is an expression in a like-named notation for nested list (tree-structured) data. S-expressions were invented for and popularized by the programming language Lisp, which uses them for source code as well as data.

In English writing, quotation marks or inverted commas, also known informally as quotes, talking marks, speech marks, quote marks, quotemarks or speechmarks, are punctuation marks placed on either side of a word or phrase in order to identify it as a quotation, direct speech or a literal title or name. Quotation marks may be used to indicate that the meaning of the word or phrase they surround should be taken to be different from that typically associated with it, and are often used in this way to express irony. They are also sometimes used to emphasise a word or phrase, although this is usually considered incorrect.

In computer science, an escape sequence is a combination of characters that has a meaning other than the literal characters contained therein; it is marked by one or more preceding characters.

A metacharacter is a character that has a special meaning to a computer program, such as a shell interpreter or a regular expression (regex) engine.

The backslash\ is a typographical mark used mainly in computing and mathematics. It is the mirror image of the common slash /. It is a relatively recent mark, first documented in the 1930s. It is sometimes called a hack, whack, escape, reverse slash, slosh, downwhack, backslant, backwhack, bash, reverse slant, reverse solidus, and reversed virgule.

A string literal or anonymous string is a literal for a string value in the source code of a computer program. Modern programming languages commonly use a quoted sequence of characters, formally "bracketed delimiters", as in x = "foo", where "foo" is a string literal with value foo. Methods such as escape sequences can be used to avoid the problem of delimiter collision and allow the delimiters to be embedded in a string. There are many alternate notations for specifying string literals especially in complicated cases. The exact notation depends on the programming language in question. Nevertheless, there are general guidelines that most modern programming languages follow.

An HTML element is a type of HTML document component, one of several types of HTML nodes. The first used version of HTML was written by Tim Berners-Lee in 1993 and there have since been many versions of HTML. The most commonly used version is HTML 4.01, which became official standard in December 1999. An HTML document is composed of a tree of simple HTML nodes, such as text nodes, and HTML elements, which add semantics and formatting to parts of document. Each element can have HTML attributes specified. Elements can also have content, including other elements and text.

YAML(see § History and name) is a human-readable data-serialization language. It is commonly used for configuration files and in applications where data is being stored or transmitted. YAML targets many of the same communications applications as Extensible Markup Language (XML) but has a minimal syntax which intentionally differs from Standard Generalized Markup Language (SGML). It uses both Python-style indentation to indicate nesting, and a more compact format that uses [...] for lists and {...} for maps but forbids tab characters to use as indentation thus only some JSON files are valid YAML 1.2.

A path is a string of characters used to uniquely identify a location in a directory structure. It is composed by following the directory tree hierarchy in which components, separated by a delimiting character, represent each directory. The delimiting character is most commonly the slash ("/"), the backslash character ("\"), or colon (":"), though some operating systems may use a different delimiter. Paths are used extensively in computer science to represent the directory/file relationships common in modern operating systems and are essential in the construction of Uniform Resource Locators (URLs). Resources can be represented by either absolute or relative paths.

The Extended Speech Assessment Methods Phonetic Alphabet (X-SAMPA) is a variant of SAMPA developed in 1995 by John C. Wells, professor of phonetics at University College London. It is designed to unify the individual language SAMPA alphabets, and extend SAMPA to cover the entire range of characters in the 1993 version of International Phonetic Alphabet (IPA). The result is a SAMPA-inspired remapping of the IPA into 7-bit ASCII.

The backtick` is a typographical mark used mainly in computing. It is also known as backquote, grave, or grave accent.

In some programming languages, eval, short for the English evaluate, is a function which evaluates a string as though it were an expression in the language, and returns a result; in others, it executes multiple lines of code as though they had been included instead of the line including the eval. The input to eval is not necessarily a string; it may be structured representation of code, such as an abstract syntax tree, or of special type such as code. The analog for a statement is exec, which executes a string as if it were a statement; in some languages, such as Python, both are present, while in other languages only one of either eval or exec is.

<span class="mw-page-title-main">JSON</span> Open standard file format and data interchange

JSON is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays. It is a common data format with diverse uses in electronic data interchange, including that of web applications with servers.

The term CDATA, meaning character data, is used for distinct, but related, purposes in the markup languages SGML and XML. The term indicates that a certain portion of the document is general character data, rather than non-character data or character data with a more specific, limited structure.

In computer programming, leaning toothpick syndrome (LTS) is the situation in which a quoted expression becomes unreadable because it contains a large number of escape characters, usually backslashes ("\"), to avoid delimiter collision.

This comparison of programming languages compares the features of language syntax (format) for over 50 computer programming languages.

<span class="mw-page-title-main">Comment (computer programming)</span> Explanatory note in the source code of a computer program

In computer programming, a comment is a programmer-readable explanation or annotation in the source code of a computer program. They are added with the purpose of making the source code easier for humans to understand, and are generally ignored by compilers and interpreters. The syntax of comments in various programming languages varies considerably.

Magic quotes was a feature of the PHP scripting language, wherein strings are automatically escaped—special characters are prefixed with a backslash—before being passed on. It was introduced to help newcomers write functioning SQL commands without requiring manual escaping. It was later described as intended to prevent inexperienced developers from writing code that was vulnerable to SQL injection attacks.

A batch file is a script file in DOS, OS/2 and Microsoft Windows. It consists of a series of commands to be executed by the command-line interpreter, stored in a plain text file. A batch file may contain any command the interpreter accepts interactively and use constructs that enable conditional branching and looping within the batch file, such as IF, FOR, and GOTO labels. The term "batch" is from batch processing, meaning "non-interactive execution", though a batch file might not process a batch of multiple data.

References

  1. "JavaScript eval() Method".
  2. "JavaScript Strings".