Nested quotation

Last updated

A nested quotation is a quotation that is encapsulated inside another quotation, forming a hierarchy with multiple levels. When focusing on a certain quotation, one must interpret it within its scope. Nested quotation can be used in literature (as in nested narration), speech, and computer science (as in "meta"-statements that refer to other statements as strings). Nested quotation can be very confusing until evaluated carefully and until each quotation level is put into perspective.

Contents

In literature

In languages that allow for nested quotes and use quotation mark punctuation to indicate direct speech, hierarchical quotation sublevels are usually punctuated by alternating between primary quotation marks and secondary quotation marks. For a comprehensive analysis of the major quotation mark systems employed in major writing systems, see Quotation mark.

In JavaScript programming

Nested quotes often become an issue using the eval keyword. [1] The eval function is a function that converts and interprets a string as actual JavaScript code, and runs that code. If that string is specified as a literal, then the code must be written as a quote itself (and escaped accordingly).

For example:

eval("var a=3; alert();");

This code declares a variable a, which is assigned the value 3, and a blank alert window is popped up to the user.

Nested strings (level 2)

Suppose we had to make a quote inside the quoted interpreted code. In JavaScript, you can only have one unescaped quote sublevel, which has to be the alternate of the top-level quote. If the 2nd-level quote symbol is the same as the first-level symbol, these quotes must be escaped. [2] For example:

alert("I don't need to escape here"); alert('Nor is it "required" here'); alert('But now I do or it won\'t work');

Nested strings (level 3 and beyond)

Furthermore, (unlike in the literature example), the third-level nested quote must be escaped in order not to conflict with either the first- or second-level quote delimiters. This is true regardless of alternating-symbol encapsulation. Every level after the third level must be recursively escaped for all the levels of quotes in which it is contained. This includes the escape character itself, the backslash (“\”), which is escaped by itself (“\\”).

For every sublevel in which a backslash is contained, it must be escaped for the level above it, and then all the backslashes used to escape that backslash as well as the original backslash, must be escaped, and so on and so forth for every level that is ascended. This is to avoid ambiguity and confusion in escaping.

Here are some examples that demonstrate some of the above principles:

document.write("<html><head></head><body><p>Hello, this is the body of the document.");document.writeln("</p>");document.write("<p>A newline in HTML code acts simply as whitespace, whereas a &lt;br&gt; starts a new line.");document.write("</p></body></html>\n");eval('eval(\"eval(\\\"alert(\\\\\\\"Now I\\\\\\\\\\\\\\\'m confused!\\\\\\\")\\\")\")');

Note that the number of backslashes increase from 0 to 1 to 3 to 7 to 15, indicating a rule for successively nested symbols, meaning that the length of the escape sequences grows exponentially with quotation depth.

See also

Related Research Articles

In computing and telecommunication, an escape character is a character that invokes an alternative interpretation on the following characters in a character sequence. An escape character is a particular case of metacharacters. Generally, the judgement of whether something is an escape character or not depends on the context.

<span class="mw-page-title-main">S-expression</span> Data serialization format

In computer programming, an S-expression is an expression in a like-named notation for nested list (tree-structured) data. S-expressions were invented for and popularized by the programming language Lisp, which uses them for source code as well as data.

In computer science, an escape sequence is a combination of characters that has a meaning other than the literal characters contained therein; it is marked by one or more preceding characters.

A metacharacter is a character that has a special meaning to a computer program, such as a shell interpreter or a regular expression (regex) engine.

The backslash\ is a mark used mainly in computing and mathematics. It is the mirror image of the common slash /. It is a relatively recent mark, first documented in the 1930s. It is sometimes called a hack, whack, escape, reverse slash, slosh, downwhack, backslant, backwhack, bash, reverse slant, reverse solidus, and reversed virgule.

A string literal or anonymous string is a literal for a string value in the source code of a computer program. Modern programming languages commonly use a quoted sequence of characters, formally "bracketed delimiters", as in x = "foo", where "foo" is a string literal with value foo. Methods such as escape sequences can be used to avoid the problem of delimiter collision and allow the delimiters to be embedded in a string. There are many alternate notations for specifying string literals especially in complicated cases. The exact notation depends on the programming language in question. Nevertheless, there are general guidelines that most modern programming languages follow.

An HTML element is a type of HTML document component, one of several types of HTML nodes. The first used version of HTML was written by Tim Berners-Lee in 1993 and there have since been many versions of HTML. The current de facto standard is governed by the industry group WHATWG and is known as the HTML Living Standard.

YAML is a human-readable data serialization language. It is commonly used for configuration files and in applications where data are being stored or transmitted. YAML targets many of the same communications applications as Extensible Markup Language (XML) but has a minimal syntax that intentionally differs from Standard Generalized Markup Language (SGML). It uses Python-style indentation to indicate nesting and does not require quotes around most string values.

A path is a string of characters used to uniquely identify a location in a directory structure. It is composed by following the directory tree hierarchy in which components, separated by a delimiting character, represent each directory. The delimiting character is most commonly the slash ("/"), the backslash character ("\"), or colon (":"), though some operating systems may use a different delimiter. Paths are used extensively in computer science to represent the directory/file relationships common in modern operating systems and are essential in the construction of Uniform Resource Locators (URLs). Resources can be represented by either absolute or relative paths.

The Extended Speech Assessment Methods Phonetic Alphabet (X-SAMPA) is a variant of SAMPA developed in 1995 by John C. Wells, professor of phonetics at University College London. It is designed to unify the individual language SAMPA alphabets, and extend SAMPA to cover the entire range of characters in the 1993 version of International Phonetic Alphabet (IPA). The result is a SAMPA-inspired remapping of the IPA into 7-bit ASCII.

The backtick` is a typographical mark used mainly in computing. It is also known as backquote, grave, or grave accent.

In some programming languages, eval, short for the English evaluate, is a function which evaluates a string as though it were an expression in the language, and returns a result; in others, it executes multiple lines of code as though they had been included instead of the line including the eval. The input to eval is not necessarily a string; it may be structured representation of code, such as an abstract syntax tree, or of special type such as code. The analog for a statement is exec, which executes a string as if it were a statement; in some languages, such as Python, both are present, while in other languages only one of either eval or exec is.

<span class="mw-page-title-main">JSON</span> Open standard file format and data interchange

JSON is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays. It is a commonly used data format with diverse uses in electronic data interchange, including that of web applications with servers.

An INI file is a configuration file for computer software that consists of a text-based content with a structure and syntax comprising key–value pairs for properties, and sections that organize the properties. The name of these configuration files comes from the filename extension INI, for initialization, used in the MS-DOS operating system which popularized this method of software configuration. The format has become an informal standard in many contexts of configuration, but many applications on other operating systems use different file name extensions, such as conf and cfg.

The term CDATA, meaning character data, is used for distinct, but related, purposes in the markup languages SGML and XML. The term indicates that a certain portion of the document is general character data, rather than non-character data or character data with a more specific, limited structure.

.properties is a file extension for files mainly used in Java-related technologies to store the configurable parameters of an application. They can also be used for storing strings for Internationalization and localization; these are known as Property Resource Bundles.

In computer programming, leaning toothpick syndrome (LTS) is the situation in which a quoted expression becomes unreadable because it contains a large number of escape characters, usually backslashes ("\"), to avoid delimiter collision.

This comparison of programming languages compares the features of language syntax (format) for over 50 computer programming languages.

Magic quotes was a feature of the PHP scripting language, wherein strings are automatically escaped—special characters are prefixed with a backslash—before being passed on. It was introduced to help newcomers write functioning SQL commands without requiring manual escaping. It was later described as intended to prevent inexperienced developers from writing code that was vulnerable to SQL injection attacks.

A batch file is a script file in DOS, OS/2 and Microsoft Windows. It consists of a series of commands to be executed by the command-line interpreter, stored in a plain text file. A batch file may contain any command the interpreter accepts interactively and use constructs that enable conditional branching and looping within the batch file, such as IF, FOR, and GOTO labels. The term "batch" is from batch processing, meaning "non-interactive execution", though a batch file might not process a batch of multiple data.

References

  1. "JavaScript eval() Method".
  2. "JavaScript Strings".