Here document

In computing, a here document (here-document, here-text, heredoc, hereis, here-string or here-script) is a file literal or input stream literal: it is a section of a source code file that is treated as if it were a separate file. The term is also used for a form of multiline string literals that use similar syntax, preserving line breaks and other whitespace (including indentation) in the text.

Here documents originate in the Unix shell, [1] and are found in the Bourne shell (sh), C shell (csh), [2] tcsh (tcsh), [3] KornShell (ksh), Bourne Again Shell (bash), and Z shell (zsh), among others. Here document-style string literals are found in various high-level languages, notably the Perl programming language (syntax inspired by Unix shell) and languages influenced by Perl, such as PHP and Ruby. JavaScript also supports this functionality via template literals, a feature added in its 6th revision (ES6). Other high-level languages such as Python, Julia and Tcl have other facilities for multiline strings.

Here documents can be treated either as files or strings. Some shells treat them as a format string literal, allowing variable substitution and command substitution inside the literal.

Overview

The most common syntax for here documents, originating in Unix shells, is << followed by a delimiting identifier (often the word EOF or END [4] ), followed, starting on the next line, by the text to be quoted, and then closed by the same delimiting identifier on its own line. The syntax takes this form because here documents are formally stream literals, and the content of the here document is often redirected to stdin (standard input) of the preceding command or of the current shell script or executable.

The here document syntax is analogous to the shell syntax for input redirection, which is < followed by the name of the file to be used as input.
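For comparison outside the shell, the same idea of handing a literal block of text to a command's standard input can be sketched in Python. This is only an analogy, not shell syntax: the child command below is a small Python one-liner standing in for tr (an assumption made so that the sketch runs without external tools), and the text is passed through the input parameter of subprocess.run.

```python
import subprocess
import sys

# Literal text that plays the role of the here document body.
here_text = "one two three\nfour five six\n"

# Stand-in for `tr a-z A-Z`: reads stdin and writes it back upper-cased.
upper_cmd = [sys.executable, "-c",
             "import sys; sys.stdout.write(sys.stdin.read().upper())"]

# `input=` feeds the text to the child's standard input, much as `<<` does.
result = subprocess.run(upper_cmd, input=here_text,
                        capture_output=True, text=True, check=True)
print(result.stdout, end="")
```

As in the shell, the text never has to be written to a separate file first.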

Other languages often use substantially similar syntax, but details of syntax and actual functionality can vary significantly. When used simply for string literals, the << does not indicate redirection, but is simply a starting delimiter convention. In some languages, such as Ruby, << is also an operator that appends output to a stream or file, so << appears twice when a here document string literal is written to a file in this way.

File literals

Narrowly speaking, here documents are file literals or stream literals. These originate in the Unix shell, though similar facilities are available in some other languages.

Unix shells

Here documents are available in many Unix shells. [1] In the following example, text is passed to the tr command (transliterating lower to upper-case) using a here document. This could be in a shell file, or entered interactively at a prompt.

$ LANG=C tr a-z A-Z << END
> one two three
> four five six
> END
ONE TWO THREE
FOUR FIVE SIX

In this case END was used as the delimiting identifier. It specified the start and end of the here document. The redirect and the delimiting identifier do not need to be separated by a space: <<END or << END both work equally well.

By default, behavior is largely identical to the contents of double quotes: variable names are replaced by their values, commands within backticks are evaluated, etc. [lower-alpha 1]

$ cat << EOF
> \$ Working dir "$PWD" `pwd`
> EOF
$ Working dir "/home/user" /home/user

This can be disabled by quoting any part of the label, which is then ended by the unquoted value; [lower-alpha 2] the behavior is essentially identical to that if the contents were enclosed in single quotes. Thus for example by setting it in single quotes:

$ cat << 'EOF'
> \$ Working dir "$PWD" `pwd`
> EOF
\$ Working dir "$PWD" `pwd`

Double quotes may also be used, but this is subject to confusion, because expansion does occur in a double-quoted string, but does not occur in a here document with double-quoted delimiter. [5] Single- and double-quoted delimiters are distinguished in some other languages, notably Perl (see below), where behavior parallels the corresponding string quoting.

In POSIX shell but not csh/tcsh, appending a minus sign to the << (i.e. <<-) has the effect that leading tabs are ignored. [3] This allows indenting here documents in shell scripts (primarily for alignment with existing indentation) without changing their value: [lower-alpha 3]

A script containing:

LANG=C tr a-z A-Z <<- END_TEXT
Here doc with <<-
 A single space character (i.e. 0x20 )  is at the beginning of this line
	This line begins with a single TAB character i.e 0x09  as does the next line
	END_TEXT
echo The intended end was before this line
echo and these were not processed by tr
echo +++++++++++++++
LANG=C tr a-z A-Z << END_TEXT
Here doc with <<
 A single space character (i.e. 0x20 )  is at the beginning of this line
	This line begins with a single TAB character i.e 0x09 as does the next line
	END_TEXT
echo The intended end was before this line,
echo but because the line with the delimiting Identifier began with a TAB it was NOT recognized and
echo the tr command continued processing.

produces:

HERE DOC WITH <<-
 A SINGLE SPACE CHARACTER (I.E. 0X20 )  IS AT THE BEGINNING OF THIS LINE
THIS LINE BEGINS WITH A SINGLE TAB CHARACTER I.E 0X09  AS DOES THE NEXT LINE
The intended end was before this line
and these were not processed by tr
+++++++++++++++
HERE DOC WITH <<
 A SINGLE SPACE CHARACTER (I.E. 0X20 )  IS AT THE BEGINNING OF THIS LINE
	THIS LINE BEGINS WITH A SINGLE TAB CHARACTER I.E 0X09 AS DOES THE NEXT LINE
	END_TEXT
ECHO THE INTENDED END WAS BEFORE THIS LINE,
ECHO BUT BECAUSE THE LINE WITH THE DELIMITING IDENTIFIER BEGAN WITH A TAB IT WAS NOT RECOGNIZED AND
ECHO THE TR COMMAND CONTINUED PROCESSING.
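The delimiter-matching rule behind this behaviour can be modelled in a few lines. The following Python sketch is a deliberately simplified model, not how any particular shell is implemented: the function name read_heredoc and its strip_tabs parameter are hypothetical, and quoting, expansion and nested here documents are ignored.

```python
from typing import Iterable, List, Tuple


def read_heredoc(lines: Iterable[str], delimiter: str,
                 strip_tabs: bool = False) -> Tuple[List[str], List[str]]:
    """Split `lines` into (here document body, remaining script lines).

    Simplified model: no quoting, expansion, or nested here documents.
    """
    body: List[str] = []
    remaining = iter(lines)
    for line in remaining:
        # With <<- semantics, leading tabs (and only tabs) are removed first.
        candidate = line.lstrip("\t") if strip_tabs else line
        if candidate == delimiter:
            return body, list(remaining)
        body.append(candidate)
    # Delimiter never matched: every line was consumed as input.
    return body, []


script = [
    "Here doc",
    " starts with one space",
    "\tstarts with one tab",
    "\tEND_TEXT",
    "echo after the here document",
]

# <<- END_TEXT: the tab-indented delimiter is recognised.
dash_body, dash_rest = read_heredoc(script, "END_TEXT", strip_tabs=True)
# << END_TEXT: the same line is not recognised, so nothing is left over.
plain_body, plain_rest = read_heredoc(script, "END_TEXT")

print(dash_body, dash_rest)
print(plain_body, plain_rest)
```

In the second call the echo line ends up inside the body, which mirrors the upper-cased ECHO lines in the output above.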

Another use is to output to a file:

$ cat << EOF > ~/testFile001
>    3 spaces precede this text.
> 	A single tab character is at the beginning of this line.
> Nothing precedes this text
> EOF

Here strings

A here string (available in bash, ksh, or zsh) is syntactically similar, consisting of <<<, and effects input redirection from a word (a sequence treated as a unit by the shell, in this context generally a string literal). The word itself is written in ordinary shell syntax; the only special syntax is the <<< redirection operator. A here string is thus an ordinary string used for input redirection, not a special kind of string.

A single word need not be quoted:

$ LANG=C tr a-z A-Z <<< one
ONE

In case of a string with spaces, it must be quoted:

$ LANG=C tr a-z A-Z <<< 'one two three'
ONE TWO THREE

This could also be written as:

$ foo='one two three'
$ LANG=C tr a-z A-Z <<< "$foo"
ONE TWO THREE

Multiline strings are acceptable, yielding:

$ LANG=C tr a-z A-Z <<< 'one
> two three'
ONE
TWO THREE

Note that leading and trailing newlines, if present, are included:

$ LANG=C tr a-z A-Z <<< '
> one
> two three
> '

ONE
TWO THREE

$

The key difference from here documents is that, in here documents, the delimiters are on separate lines; the leading and trailing newlines are stripped. Unlike here documents, here strings do not use delimiters.

Here strings are particularly useful for commands that often take short input, such as the calculator bc :

$ bc <<< 2^10
1024

Note that here string behavior can also be accomplished (reversing the order) via piping and the echo command, as in:

$ echo 'one two three' | LANG=C tr a-z A-Z
ONE TWO THREE

however here strings are particularly useful when the last command needs to run in the current process, as is the case with the read builtin:

$ echo 'one two three' | read -r a b c
$ echo "$a $b $c"

yields nothing, while

$ read -r a b c <<< 'one two three'
$ echo "$a $b $c"
one two three

This happens because in the previous example piping causes read to run in a subprocess, so it cannot affect the environment of the parent process.
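This isolation is not specific to read. As a rough illustration in Python, using process environment variables as a stand-in for shell variables (the name HEREDOC_DEMO is hypothetical), a value set inside a child process is not visible to the parent afterwards:

```python
import os
import subprocess
import sys

# Make sure the (hypothetical) variable is not already set in the parent.
os.environ.pop("HEREDOC_DEMO", None)

# The child sets the variable in its own environment and echoes it back.
child_code = (
    "import os; os.environ['HEREDOC_DEMO'] = 'one two three'; "
    "print(os.environ['HEREDOC_DEMO'])"
)
child = subprocess.run([sys.executable, "-c", child_code],
                       capture_output=True, text=True, check=True)

child_value = child.stdout.strip()             # what the child saw
parent_value = os.environ.get("HEREDOC_DEMO")  # still unset in the parent
print(repr(child_value), repr(parent_value))
```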

Microsoft NMAKE

In Microsoft NMAKE, here documents are referred to as inline files. Inline files are referenced as << or <<pathname: the first notation creates a temporary file, the second notation creates (or overwrites) the file with the specified pathname. An inline file is terminated with << on a line by itself, optionally followed by the (case-insensitive) keyword KEEP or NOKEEP to indicate whether the created file should be kept.

target0: dependent0
    command0 <<
temporary inline file
...
<<
target1: dependent1
    command1 <<
temporary, but preserved inline file
...
<<KEEP
target2: dependent2
    command2 <<filename2
named, but discarded inline file
...
<<NOKEEP
target3: dependent3
    command3 <<filename3
named inline file
...
<<KEEP

R

R does not have file literals, but provides equivalent functionality by combining string literals with a string-to-file function. R allows arbitrary whitespace, including newlines, in strings. A string then can be turned into a file descriptor using the textConnection() function. For example, the following turns a data table embedded in the source code into a data-frame variable:

str <-
"State          Population Income Illiteracy Life.Exp Murder HS.Grad Frost
Alabama              3615   3624        2.1    69.05   15.1    41.3    20
Alaska                365   6315        1.5    69.31   11.3    66.7   152
Arizona              2212   4530        1.8    70.55    7.8    58.1    15
Arkansas             2110   3378        1.9    70.66   10.1    39.9    65"
x <- read.table(textConnection(str), header=TRUE, row.names=1)
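The same pattern, a string literal wrapped in a file-like object, exists in other languages. As a point of comparison, here is a minimal Python sketch using io.StringIO from the standard library; it uses a cut-down subset of the table above, and the variable names are illustrative only.

```python
import io

# A string literal holding a small whitespace-separated table
# (a subset of the data in the R example).
table_text = """\
State Population Income
Alabama 3615 3624
Alaska 365 6315
"""

# io.StringIO turns the string into a file-like object that can be read
# line by line, playing the role textConnection() plays in R.
handle = io.StringIO(table_text)
header = handle.readline().split()

rows = {}
for line in handle:
    fields = line.split()
    if fields:  # skip blank lines
        # First column is the row name; the rest are integer values.
        rows[fields[0]] = dict(zip(header[1:], map(int, fields[1:])))

print(rows["Alaska"]["Income"])
```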

Data segment

Perl [6] and Ruby [7] have a form of file literal, which can be considered a form of data segment. In these languages, including the line __DATA__ (Perl) or __END__ (Ruby, old Perl) marks the end of the code segment and the start of the data segment. Only the contents prior to this line are executed, and the contents of the source file after this line are available as a file object: PACKAGE::DATA in Perl (e.g., main::DATA) and DATA in Ruby. As an inline file, these are semantically similar to here documents, though there can be only one per script. However, in these languages the term "here document" instead refers to multiline string literals, as discussed below.

Data URI Scheme

As further explained in Data URI scheme, a URI that starts with data: carries its content inline, and all major web browsers treat such URIs much like here documents.
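Browsers are not the only consumers of such URIs. As a small sketch, Python's standard urllib.request module can open a data: URI as if it were a remote file; the payload text and variable names below are illustrative assumptions.

```python
import base64
from urllib.request import urlopen

# Inline content to embed in the URI itself.
payload = "one two three\n"
encoded = base64.b64encode(payload.encode("utf-8")).decode("ascii")
uri = f"data:text/plain;charset=utf-8;base64,{encoded}"

# No network access takes place: the "file" is decoded from the URI.
with urlopen(uri) as response:
    content_type = response.headers.get_content_type()
    text = response.read().decode("utf-8")

print(content_type, repr(text))
```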

Multiline string literals

The term "here document" or "here string" is also used for multiline string literals in various programming languages, notably Perl (syntax influenced by Unix shell), and languages influenced by Perl, notably PHP and Ruby. The shell-style << syntax is often retained, despite not being used for input redirection.

Perl-influenced

Perl

In Perl there are several different ways to invoke here docs. [8] The delimiters around the tag have the same effect within the here doc as they would in a regular string literal: For example, using double quotes around the tag allows variables to be interpolated, but using single quotes doesn't, and using the tag without either behaves like double quotes. Using backticks as the delimiters around the tag runs the contents of the heredoc as a shell script. It is necessary to make sure that the end tag is at the beginning of the line or the tag will not be recognized by the interpreter.

Note that the here doc does not start at the tag—but rather starts on the next line. So the statement containing the tag continues on after the tag.

Here is an example with double quotes:

my $sender = "Buffy the Vampire Slayer";
my $recipient = "Spike";
print <<"END";
Dear $recipient,
I wish you to leave Sunnydale and never return.
Not Quite Love,
$sender
END

Output:

Dear Spike,
I wish you to leave Sunnydale and never return.
Not Quite Love,
Buffy the Vampire Slayer

Here is an example with single quotes:

print <<'END';
Dear $recipient,
I wish you to leave Sunnydale and never return.
Not Quite Love,
$sender
END

Output:

Dear $recipient,
I wish you to leave Sunnydale and never return.
Not Quite Love,
$sender

And an example with backticks (may not be portable):

my $shell_script_stdout = <<`END`;
echo foo
echo bar
END

It is possible to start multiple heredocs on the same line:

say(<<BEGIN . "this is the middle\n" . <<END);
This is the beginning:
BEGIN
And now it is over!
END
# this is equivalent to:
say("This is the beginning:\nthis is the middle\nAnd now it is over!\n");

The tag itself may contain whitespace, which may allow heredocs to be used without breaking indentation.

  say <<'  END';
Hello World
  END

Since Perl version 5.26, [9] heredocs can also be indented by using the <<~ form:

#prints "Hello there\n" with no leading whitespace.
if (1) {
  print <<~EOF;
    Hello there
    EOF
}

In addition to these strings, Perl also features file literals, namely the contents of the file following __DATA__ (formerly __END__) on a line by itself. This is accessible as the file object PACKAGE::DATA such as main::DATA, and can be viewed as a form of data segment.

PHP

In PHP, here documents are referred to as heredocs. In PHP heredocs are not string literals. Heredoc text behaves just like a double-quoted string, but without the double quotes. This means, for example, that `$` is parsed as the start of a variable, and `${` or `{$` as the start of a complex variable expression.

<?php
$name = "Joe Smith";
$occupation = "Programmer";
echo <<<EOF
This is a heredoc section.
For more information talk to $name, your local $occupation.
Thanks!

EOF;
$toprint = <<<EOF
Hey $name! You can actually assign the heredoc section to a variable!

EOF;
echo $toprint;
?>

Outputs

This is a heredoc section.
For more information talk to Joe Smith, your local Programmer.
Thanks!
Hey Joe Smith! You can actually assign the heredoc section to a variable!

In PHP versions prior to 7.3, the line containing the closing identifier must not contain any other characters, except an optional ending semicolon. Otherwise, it will not be considered to be a closing identifier, and PHP will continue looking for one. If a proper closing identifier is not found, a parse error will result at the last line of the script. However, from version 7.3, it is no longer required that the closing identifier be followed by a semicolon or newline. Additionally the closing identifier may be indented, in which case the indentation will be stripped from all lines in the doc string. [10]

In PHP 5.3 and later, like Perl, it is possible to not interpolate variables by surrounding the tag with single quotes; this is called a nowdoc: [11]

$x = <<<'END'
Dear $recipient,
I wish you to leave Sunnydale and never return.
Not Quite Love,
$sender
END;

In PHP 5.3+ it is also possible to surround the tag with double quotes, which like Perl has the same effect as not surrounding the tag with anything at all.

Ruby

The following Ruby code displays a grocery list by using a here document.

puts <<GROCERY_LIST
Grocery list
----
1. Salad mix.
2. Strawberries.*
3. Cereal.
4. Milk.*
* Organic
GROCERY_LIST

The result:

$ ruby grocery-list.rb
Grocery list
----
1. Salad mix.
2. Strawberries.*
3. Cereal.
4. Milk.*
* Organic

The << in a here document does not indicate input redirection, but Ruby also uses << as the operator that appends to a stream or file, so writing a here document to a file can involve using << twice, in different senses:

File::open("grocery-list", "w") do |f|
  f << <<GROCERY_LIST
Grocery list
----
1. Salad mix.
2. Strawberries.*
3. Cereal.
4. Milk.*
* Organic
GROCERY_LIST
end

As with Unix shells, Ruby also allows for the delimiting identifier not to start on the first column of a line, if the start of the here document is marked with the slightly different starter <<-. Besides, Ruby treats here documents as a double-quoted string, and as such, it is possible to use the #{} construct to interpolate code. The following example illustrates both of these features:

now = Time.now
puts <<-EOF
  It's #{now.hour} o'clock John, where are your kids?
  EOF

Ruby expands on this by providing the "<<~" syntax for omitting indentation on the here document:

puts <<~EOF
  This line is indented two spaces.
    This line is indented four spaces.
      This line is indented six spaces.
  EOF

The common indentation of two spaces is omitted from all lines:

$ ruby indented-heredoc.rb
This line is indented two spaces.
  This line is indented four spaces.
    This line is indented six spaces.

Like Perl, Ruby allows for starting multiple here documents in one line:

puts <<BEGIN + "<--- middle --->\n" + <<END
This is the beginning:
BEGIN
And now it is over!
END
# this equals this expression:
puts "This is the beginning:\n<--- middle --->\nAnd now it is over!"

As with Perl, Ruby features file literals, namely the contents of the file following __END__ on a line by itself. This is accessible as the file object DATA and can be viewed as a form of data segment.

Python

Python supports multi-line strings as "verbatim" strings, written as triple-quoted literals. They may be enclosed in three single (') or three double (") quotation marks; the latter form is shown in the examples below.

print("""
Customer: Not much of a cheese shop is it?
Shopkeeper: Finest in the district , sir.
""")

From Python 3.6 onwards, triple-quoted f-strings support variable and expression interpolation.

shop_type = "CHEESE"
accolade = "finest"
print(f"""
Customer: Not much of a {shop_type.lower()} shop is it?
Shopkeeper: {accolade.capitalize()} in the district , sir.
""")
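Python's literal syntax has no direct counterpart to the <<- or <<~ forms that strip indentation. A common workaround, sketched here, is to pass the literal through textwrap.dedent from the standard library; the function name indented_message is hypothetical.

```python
import textwrap


def indented_message() -> str:
    # The literal is indented to match the surrounding code; the backslash
    # after the opening quotes suppresses the initial newline.
    message = """\
        This line is indented two levels in the source.
          This line keeps two extra spaces.
        """
    # dedent() removes the whitespace common to all lines.
    return textwrap.dedent(message)


result = indented_message()
print(result, end="")
```

Only the common leading whitespace is removed, so relative indentation inside the text survives.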

C++

Since C++11, C++ supports raw string literals with a custom delimiter ("my_delimiter" in this example):

#include <iostream>
int main() {
    const char* str = R"my_delimiter(Start of string. New line
slash \ quote " ' parens ) ( End of string)my_delimiter";
    std::cout << str << std::endl;
}

will print out

Start of string. New line
slash \ quote " ' parens ) ( End of string

D

Since version 2.0, D has support for here document-style strings using the 'q' prefix character. These strings begin with q"IDENT followed immediately by a newline (for an arbitrary identifier IDENT), and end with IDENT" at the start of a line.

import std.stdio;
void main() {
    string list = q"IDENT
1. Item One
2. Item Two
3. Item Three
IDENT";
    writef(list);
}

D also supports a few quoting delimiters with similar syntax: such strings start with q"[ and end with ]", and similarly for the other delimiter characters (any of () <> {} or []).

OS/JCL

In IBM's Job Control Language (JCL), used on its earlier MVS and current z/OS operating systems, data that is inline to a job stream can be identified by an * on a DD statement, such as //SYSIN DD * or //SYSIN DD *,DLM=text. In the first case, the lines of text follow and are combined into a pseudo file with the DD name SYSIN. All records following the command are combined until another OS/JCL command occurs (any line beginning with //), the default EOF sequence (/*) is found, or the physical end of data occurs. In the second case, the conditions are the same, except that the DLM= operand specifies the text string signalling end of data; this can be used if a data stream contains JCL (again, any line beginning with //) or the /* sequence (such as comments in C or C++ source code). The following compiles and executes an assembly language program, supplied as in-line data to the assembler.

//AHAR JOB ('ALEX HARRIS')
//     EXEC ASMLG
//SYSIN DD *
APROG   START
        XR    15,15
        BR    14
        END
/*
//* JOB ENDS

The //SYSIN DD * statement is the functional equivalent of <</*, indicating that a stream of data follows, terminated by /*.
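The termination rules described in this section can be summarised in a short model. The Python sketch below follows only the prose above; it is not a description of how JES actually processes input, and the function name read_instream, the @@ delimiter and the //STEP1 statement are all hypothetical.

```python
from typing import Iterable, List, Optional


def read_instream(records: Iterable[str],
                  dlm: Optional[str] = None) -> List[str]:
    """Collect in-stream data records until a terminating record is seen."""
    data: List[str] = []
    for record in records:
        if dlm is not None:
            # Custom delimiter: only a record starting with it ends the data.
            if record.startswith(dlm):
                break
        elif record.startswith("/*") or record.startswith("//"):
            # Default rules: /* or any JCL statement ends the data.
            break
        data.append(record)
    return data


# Default case, modelled on the assembler example above.
default_data = read_instream([
    "APROG   START",
    "        XR    15,15",
    "        BR    14",
    "        END",
    "/*",
    "//* JOB ENDS",
])

# DLM case: data that itself contains /* and // lines.
dlm_data = read_instream([
    "/* a C comment */",
    "//STEP1 EXEC PGM=DEMO",
    "@@",
    "//* JOB ENDS",
], dlm="@@")

print(default_data)
print(dlm_data)
```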

Racket

Racket's here strings start with #<< followed by characters that define a terminator for the string. [12] The content of the string includes all characters between the #<< line and a line whose only content is the specified terminator. More precisely, the content of the string starts after a newline following #<<, and it ends before a newline that is followed by the terminator.

#lang racket
(displayln
 #<<HERESTRING
This is a simple here string in Racket.
  * One
  * Two
  * Three
HERESTRING
 )

Outputs:

This is a simple here string in Racket.
  * One
  * Two
  * Three

No escape sequences are recognized between the starting and terminating lines; all characters are included in the string (and terminator) literally.

#lang racket
(displayln
 #<<A here string in Racket ☺
This string spans for multiple lines
and can contain any Unicode symbol.
So things like λ, ☠, α, β, are all fine.
In the next line comes the terminator. It can contain any Unicode symbol as well, even spaces and smileys!
A here string in Racket ☺
 )

Outputs:

This string spans for multiple lines
and can contain any Unicode symbol.
So things like λ, ☠, α, β, are all fine.
In the next line comes the terminator. It can contain any Unicode symbol as well, even spaces and smileys!

Here strings can be used in any context where normal strings would be:

#lang racket
(printf #<<END
Dear ~a,
Thanks for the insightful conversation ~a.
                ~a
END
        "Isaac"
        "yesterday"
        "Carl")

Outputs:

Dear Isaac,
Thanks for the insightful conversation yesterday.
                Carl

An interesting alternative is to use the language extension at-exp to write @-expressions. [13] They look like this:

#lang at-exp racket
(displayln @string-append{
This is a long string,
very convenient when a
long chunk of text is
needed.
No worries about escaping
"quotes" or \escapes. It's
also okay to have λ, γ, θ, ...
Embed code: @(number->string (+ 3 4))
})

Outputs:

This is a long string,
very convenient when a
long chunk of text is
needed.
No worries about escaping
"quotes" or \escapes. It's
also okay to have λ, γ, θ, ...
Embed code: 7

An @-expression is neither specific nor restricted to strings; it is a syntax form that can be composed with the rest of the language.

Windows PowerShell

In PowerShell, here documents are referred to as here-strings. A here-string is a string which starts with an open delimiter (@" or @') and ends with a close delimiter ("@ or '@) on a line by itself, which terminates the string. All characters between the open and close delimiter are considered the string literal. Using a here-string with double quotes allows variables to be interpreted, using single quotes doesn't. Variable interpolation occurs with simple variables (e.g. $x but NOT $x.y or $x[0]). You can execute a set of statements by putting them in $() (e.g. $($x.y) or $(Get-Process | Out-String)).

In the following PowerShell code, text is passed to a function using a here-string. The function ConvertTo-UpperCase is defined as follows:

PS  > function ConvertTo-UpperCase($string) { $string.ToUpper() }
PS  > ConvertTo-UpperCase @'
>> one two three
>> eins zwei drei
>> '@
ONE TWO THREE
EINS ZWEI DREI

Here is an example that demonstrates variable interpolation and statement execution using a here-string with double quotes:

PS  > $doc, $marty = 'Dr. Emmett Brown', 'Marty McFly'
PS  > $time = [DateTime]'Friday, October 25, 1985 8:00:00 AM'
PS  > $diff = New-TimeSpan -Minutes 25
PS  > @"
>> $doc : Are those my clocks I hear?
>> $marty : Yeah! Uh, it's $($time.Hour) o'clock!
>> $doc : Perfect! My experiment worked! They're all exactly $($diff.Minutes) minutes slow.
>> $marty : Wait a minute. Wait a minute. Doc... Are you telling me that it's $(($time + $diff).ToShortTimeString())?
>> $doc : Precisely.
>> $marty : Damn! I'm late for school!
>> "@
Dr. Emmett Brown : Are those my clocks I hear?
Marty McFly : Yeah! Uh, it's 8 o'clock!
Dr. Emmett Brown : Perfect! My experiment worked! They're all exactly 25 minutes slow.
Marty McFly : Wait a minute. Wait a minute. Doc... Are you telling me that it's 08:25?
Dr. Emmett Brown : Precisely.
Marty McFly : Damn! I'm late for school!

Using a here-string with single quotes instead, the output would look like this:

PS  > @'
>> $doc : Are those my clocks I hear?
>> $marty : Yeah! Uh, it's $($time.Hour) o'clock!
>> $doc : Perfect! My experiment worked! They're all exactly $($diff.Minutes) minutes slow.
>> $marty : Wait a minute. Wait a minute. Doc... Are you telling me that it's $(($time + $diff).ToShortTimeString())?
>> $doc : Precisely.
>> $marty : Damn! I'm late for school!
>> '@
$doc : Are those my clocks I hear?
$marty : Yeah! Uh, it's $($time.Hour) o'clock!
$doc : Perfect! My experiment worked! They're all exactly $($diff.Minutes) minutes slow.
$marty : Wait a minute. Wait a minute. Doc... Are you telling me that it's $(($time + $diff).ToShortTimeString())?
$doc : Precisely.
$marty : Damn! I'm late for school!

DIGITAL Command Language (DCL)

In DCL scripts, any input line which does not begin with a $ symbol is implicitly treated as input to the preceding command; in effect, all lines which do not begin with $ are here-documents. The input is either passed to the program, or can be explicitly referenced by the logical name SYS$INPUT (analogous to the Unix concept of stdin).

For instance, explicitly referencing the input as SYS$INPUT:

$ TYPE SYS$INPUT
This text will be directly echoed
to the screen by the TYPE command.
$! other commands ...

produces:

This text will be directly echoed
to the screen by the TYPE command.

Additionally, the DECK command, initially intended for punched card support (hence its name: it signified the beginning of a data deck) can be used to supply input to the preceding command. [14] The input deck is ended either by the command $ EOD, or the character pattern specified by the /DOLLARS parameter to DECK.

Example of a program totalling up monetary values:

$ RUN ADD_SUMS.EXE
$ DECK
$13.53
$3.33
$2.33
$ EOD

Would produce the following output (presuming ADD_SUMS was written to read the values and add them):

$19.19

Example of using DECK /DOLLARS to create one command file from another:

$ COPY SYS$INPUT SYS$SCRATCH:TEMP.COM
$ DECK /DOLLARS=$$$$
$ TYPE SYS$INPUT
This is an example of using DECK to create
a command file from within a command file
$$$$
$! other commands follow ...

YAML

YAML primarily relies on whitespace indentation for structure, making it resistant to delimiter collision and capable of representing multi-line strings with folded (>) or literal (|) string literals:

---
caption: "Example of heredoc-style functionality using YAML"
date: "2007-06-01"
example: >
  HTML goes into YAML without modification
message: |
  <blockquote style="font: italic 12pt Times">
  <p>"Three is always greater than two,
  even for large values of two"</p>
  <p>--Author Unknown</p>
  </blockquote>

See also

Related Research Articles

<span class="mw-page-title-main">AWK</span> Programming language

AWK is a domain-specific language designed for text processing and typically used as a data extraction and reporting tool. Like sed and grep, it is a filter, and is a standard feature of most Unix-like operating systems.

<span class="mw-page-title-main">Regular expression</span> Sequence of characters that forms a search pattern

A regular expression, sometimes referred to as rational expression, is a sequence of characters that specifies a match pattern in text. Usually such patterns are used by string-searching algorithms for "find" or "find and replace" operations on strings, or for input validation. Regular expression techniques are developed in theoretical computer science and formal language theory.

sed Standard UNIX utility for editing streams of data

sed is a Unix utility that parses and transforms text, using a simple, compact programming language. It was developed from 1973 to 1974 by Lee E. McMahon of Bell Labs, and is available today for most operating systems. sed was based on the scripting features of the interactive editor ed and the earlier qed. It was one of the earliest tools to support regular expressions, and remains in use for text processing, most notably with the substitution command. Popular alternative tools for plaintext string manipulation and "stream editing" include AWK and Perl.

<span class="mw-page-title-main">C shell</span> Unix shell

The C shell is a Unix shell created by Bill Joy while he was a graduate student at University of California, Berkeley in the late 1970s. It has been widely distributed, beginning with the 2BSD release of the Berkeley Software Distribution (BSD) which Joy first distributed in 1978. Other early contributors to the ideas or the code were Michael Ubell, Eric Allman, Mike O'Brien and Jim Kulp.

A string literal or anonymous string is a literal for a string value in the source code of a computer program. Modern programming languages commonly use a quoted sequence of characters, formally "bracketed delimiters", as in x = "foo", where "foo" is a string literal with value foo. Methods such as escape sequences can be used to avoid the problem of delimiter collision and allow the delimiters to be embedded in a string. There are many alternate notations for specifying string literals especially in complicated cases. The exact notation depends on the programming language in question. Nevertheless, there are general guidelines that most modern programming languages follow.

The printf family of functions in the C programming language are a set of functions that take a format string as input among a variable sized list of other values and produce as output a string that corresponds to the format specifier and given input values. The string is written in a simple template language: characters are usually copied literally into the function's output, but format specifiers, which start with a % character, indicate the location and method to translate a piece of data to characters. The design has been copied to expose similar functionality in other programming languages.

In computer programming, glob patterns specify sets of filenames with wildcard characters. For example, the Unix Bash shell command mv *.txttextfiles/ moves all files with names ending in .txt from the current directory to the directory textfiles. Here, * is a wildcard and *.txt is a glob pattern. The wildcard * stands for "any string of any length including empty, but excluding the path separator characters ".

The backtick` is a typographical mark used mainly in computing. It is also known as backquote, grave, or grave accent.

<span class="mw-page-title-main">Delimiter</span> Characters that specify the boundary between regions in a data stream

A delimiter is a sequence of one or more characters for specifying the boundary between separate, independent regions in plain text, mathematical expressions or other data streams. An example of a delimiter is the comma character, which acts as a field delimiter in a sequence of comma-separated values. Another example of a delimiter is the time gap used to separate letters and words in the transmission of Morse code.

<span class="mw-page-title-main">Redirection (computing)</span> Form of interprocess communication

In computing, redirection is a form of interprocess communication, and is a function common to most command-line interpreters, including the various Unix shells that can redirect standard streams to user-specified locations.

In computer programming, a one-liner program originally was textual input to the command line of an operating system shell that performed some function in just one line of input. In the present day, a one-liner can be

In computer programming, a sigil is a symbol affixed to a variable name, showing the variable's datatype or scope, usually a prefix, as in $foo, where $ is the sigil.

<span class="mw-page-title-main">Python syntax and semantics</span> Set of rules defining correctly structured programs

The syntax of the Python programming language is the set of rules that defines how a Python program will be written and interpreted. The Python language has many similarities to Perl, C, and Java. However, there are some definite differences between the languages. It supports multiple programming paradigms, including structured, object-oriented programming, and functional programming, and boasts a dynamic type system and automatic memory management.

In computer programming, leaning toothpick syndrome (LTS) is the situation in which a quoted expression becomes unreadable because it contains a large number of escape characters, usually backslashes ("\"), to avoid delimiter collision.

This comparison of programming languages compares the features of language syntax (format) for over 50 computer programming languages.

The less-than sign is a mathematical symbol that denotes an inequality between two values. The widely adopted form of two equal-length strokes connecting in an acute angle at the left, <, has been found in documents dated as far back as the 1560s. In mathematical writing, the less-than sign is typically placed between two values being compared and signifies that the first number is less than the second number. Examples of typical usage include 12 < 1 and −2 < 0.

A batch file is a script file in DOS, OS/2 and Microsoft Windows. It consists of a series of commands to be executed by the command-line interpreter, stored in a plain text file. A batch file may contain any command the interpreter accepts interactively and use constructs that enable conditional branching and looping within the batch file, such as IF, FOR, and GOTO labels. The term "batch" is from batch processing, meaning "non-interactive execution", though a batch file might not process a batch of multiple data.

In computing, command substitution is a facility that allows a command to be run and its output to be pasted back on the command line as arguments to another command. Command substitution first appeared in the Bourne shell, introduced with Version 7 Unix in 1979, and has remained a characteristic of all later Unix shells. The feature has since been adopted in other programming languages as well, including Perl, PHP, Ruby and Microsoft's PowerShell under Windows. It also appears in Microsoft's CMD.EXE in the FOR command and the ( ) command.

The structure of the Perl programming language encompasses both the syntactical rules of the language and the general ways in which programs are organized. Perl's design philosophy is expressed in the commonly cited motto "there's more than one way to do it". As a multi-paradigm, dynamically typed language, Perl allows a great degree of flexibility in program design. Perl also encourages modularization; this has been attributed to the component-based design structure of its Unix roots, and is responsible for the size of the CPAN archive, a community-maintained repository of more than 100,000 modules.

In computer programming, string interpolation is the process of evaluating a string literal containing one or more placeholders, yielding a result in which the placeholders are replaced with their corresponding values. It is a form of simple template processing or, in formal terms, a form of quasi-quotation. The placeholder may be a variable name, or in some languages an arbitrary expression, in either case evaluated in the current context.

References

  1. "Here-Document description in the POSIX/SUS standard". Archived from the original on 2014-04-27. Retrieved 2018-04-20.
  2. "Here document - Rosetta Code". rosettacode.org. Retrieved 2017-02-20.
  3. Darwin tcsh man page
  4. Wayne Pollock. "Shell Here Document Overview". hccfl.edu. Archived from the original on 2014-05-29. Retrieved 2014-05-28.
  5. See for example Using variables inside a bash heredoc
  6. perldata: Special Literals
  7. Ruby: Object: __END__
  8. Perl operators and precedence
  9. "Perl5260delta - what is new for perl v5.26.0 - Perldoc Browser".
  10. "Heredoc in PHP manual". php.net.
  11. "PHP: Strings - Manual". php.net.
  12. Here string in Racket Documentation
  13. @ Syntax in Racket Documentation
  14. "HP OpenVMS DCL Dictionary". Archived from the original on 2016-03-04. Retrieved 2015-04-21.

General

  1. In more detail, in bash: “all lines of the here-document are subjected to parameter expansion, command substitution, and arithmetic expansion. In the latter case, the character sequence \newline is ignored, and ‘\’ must be used to quote the characters ‘\’, ‘$’, and ‘`’.”, per 3.6.6 Here Documents. Note that " has no special meaning in a here document and does not need to be escaped, unlike in a double-quoted string; otherwise they are essentially identical.
  2. “Quoting” includes escaping, so if \EOF is used, the delimiter is quoted, variable interpolation does not occur, and the here document ends with EOF; if \\EOF is used, the delimiter is quoted and the here document ends with \EOF. This perhaps surprising behavior is, however, easily implemented in a shell: the tokenizer simply records that a token was quoted (during the evaluation phase of lexical analysis), without needing to preserve the original, quoted value.
    One application is to use \' as the starting delimiter, and thus ' as the ending delimiter, which is similar to a multiline string literal but stripping starting and ending linebreaks.
  3. Note that while tabs can typically be entered directly in editors, at the command line they are typically entered with Ctrl+V followed by Tab ↹ instead, because of tab completion. In the example they are actual tabs, so the example can be copied and pasted.
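
The quoting behavior described in notes 1 and 2 can be illustrated with a minimal sketch for a POSIX-style shell. The variable names (name, expanded, literal, escaped) are hypothetical and chosen only for this illustration; the sketch is not taken from any of the cited sources.

```sh
#!/bin/sh
# Hypothetical variable used only for this illustration.
name="world"

# Unquoted delimiter: the body undergoes parameter expansion.
expanded=$(cat <<EOF
Hello, $name
EOF
)

# Quoted delimiter (\EOF): the body is taken literally, and the
# here document still ends with plain EOF.
literal=$(cat <<\EOF
Hello, $name
EOF
)

# Unquoted delimiter again: a backslash quotes the dollar sign, while
# double quotes have no special meaning and pass through unchanged.
escaped=$(cat <<EOF
Cost: \$5 for "$name"
EOF
)

printf '%s\n' "$expanded" "$literal" "$escaped"
```

Command substitution strips the trailing newline of each here document, so each variable holds a single line of text without a line break at the end.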