Comparison of programming languages (syntax)

This comparison of programming languages compares the features of language syntax (format) for over 50 computer programming languages.

Expressions

Programming language expressions can be broadly classified into four syntax structures:

prefix notation
infix notation
suffix, postfix, or Reverse Polish notation
math-like notation
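
As an illustration of how the first three notations relate, the following minimal Python sketch evaluates the infix expression (3 + 4) * 2 written in postfix and in prefix form. The names OPS, eval_postfix and eval_prefix are hypothetical, and the sketch assumes space-separated tokens and binary operators only.

```python
import operator

# Binary operators supported by this sketch (an assumption, not a full grammar).
OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_postfix(tokens):
    """Evaluate postfix (Reverse Polish) tokens, scanning left to right."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()   # the most recent operand is the right-hand one
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    return stack.pop()

def eval_prefix(tokens):
    """Evaluate prefix (Polish) tokens, scanning right to left."""
    stack = []
    for tok in reversed(tokens):
        if tok in OPS:
            left = stack.pop()    # scanning backwards, the left operand is on top
            right = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    return stack.pop()

postfix_result = eval_postfix("3 4 + 2 *".split())
prefix_result = eval_prefix("* + 3 4 2".split())
```

Neither form needs parentheses, because the position of each operator already fixes the order of evaluation.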

Statements

When a programming language has statements, it typically has conventions for statement separators, statement terminators, and line continuation.

A statement separator demarcates the boundary between two separate statements. A statement terminator defines the end of an individual statement. Languages that interpret the end of line to be the end of a statement are called "line-oriented" languages.
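
The difference between the two conventions can be sketched in Python as follows. The function names split_separated and split_terminated are hypothetical, and the sketch assumes the delimiter never occurs inside a string literal or comment.

```python
def split_separated(source, sep=";"):
    """Separator convention: `sep` stands BETWEEN statements."""
    return [part.strip() for part in source.split(sep)]

def split_terminated(source, term=";"):
    """Terminator convention: every statement must END with `term`."""
    stripped = source.strip()
    if stripped and not stripped.endswith(term):
        raise SyntaxError(f"missing terminator {term!r} after last statement")
    # The text after the final terminator is empty, so drop that last piece.
    return [part.strip() for part in stripped.split(term)[:-1]]

separated = split_separated("a := 1; b := 2")
terminated = split_terminated("a = 1; b = 2;")
```

Under the separator convention the last statement carries no delimiter, whereas under the terminator convention leaving it out is an error.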

"Line continuation" is a convention in line-oriented languages, where a newline character could otherwise be misinterpreted as a statement terminator. In such languages, it allows a single statement to span more than one line.

Language | Statement separator-terminator | Secondary separator-terminator [1]
ABAP | period separated |
Ada | semicolon terminated |
ALGOL | semicolon separated |
ALGOL 68 | semicolon and comma separated [2] |
APL | newline terminated | ⋄ separated
AppleScript | newline terminated |
AutoHotkey | newline terminated |
BASIC | newline terminated | colon separated
Boo | newline terminated |
C | semicolon terminates statements | comma separates expressions
C++ | semicolon terminates statements | comma separates expressions
C# | semicolon terminated |
COBOL | whitespace separated, sometimes period separated, optionally separated with commas and semicolons |
Cobra | newline terminated |
CoffeeScript | newline terminated |
CSS | semicolon terminated |
D | semicolon terminated |
Eiffel | newline terminated | semicolon
Erlang | comma separated, period terminated |
F# | newline terminated | semicolon
Fortran | newline terminated | semicolon
Forth | semicolons terminate word definitions; space terminates word use |
GFA BASIC | newline terminated |
Go | semicolon separated (inserted by compiler) |
Haskell (in do-notation) | newline separated |
Haskell (in do-notation, when braces are used) | semicolon separated |
Java | semicolon terminated |
JavaScript | semicolon separated (but often inserted as statement terminator) |
Kotlin | semicolon separated (but sometimes implicitly inserted on newlines) |
Lua | whitespace separated (semicolon optional) |
Mathematica (also called Wolfram) | semicolon separated |
MATLAB | newline terminated | semicolon or comma [3]
MUMPS (also called M) | newline terminates line-scope, the closest to a "statement" that M has | a space separates/terminates a command, allowing another command to follow
Nim | newline terminated |
Object Pascal (Delphi) | semicolon separated |
Objective-C | semicolon terminated |
OCaml | semicolon separated |
Pascal | semicolon separated |
Perl | semicolon separated |
PHP | semicolon terminated |
Pick Basic | newline terminated | semicolon separated
PowerShell | newline terminated | semicolon separated
Prolog | comma separated (conjunction), semicolon separated (disjunction), period terminated (clause) |
Python | newline terminated | semicolon
R | newline terminated [4] | semicolon [4]
Raku | semicolon separated |
Red | whitespace separated |
Ruby | newline terminated | semicolon
Rust | semicolon terminated | comma separates expressions
Scala | newline terminated (semicolon optional) | semicolon
Seed7 | semicolon separated (semicolon termination is allowed) |
Simula | semicolon separated |
S-Lang | semicolon separated |
Smalltalk | period separated |
Standard ML | semicolon separated |
Swift | semicolon separated (inserted by compiler) |
V (Vlang) | newline terminated | comma or semicolon separated
Visual Basic | newline terminated | colon separated
Visual Basic .NET | newline terminated | colon separated
Wolfram Language | semicolon separated |
Xojo | newline terminated |
Zig | semicolon terminated |
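
As a small illustration of the Python row, the sketch below uses the standard-library ast module to count the statements in a snippet that places two statements on one line. The variable names are illustrative only.

```python
import ast

# Two statements share line 1 (semicolon); the newline ends each line's last statement.
source = "a = 1; b = 2\nc = a + b\n"

module = ast.parse(source)
# Line number on which each top-level statement starts.
statement_lines = [node.lineno for node in module.body]

namespace = {}
exec(source, namespace)   # run the snippet to confirm it is valid Python
c_value = namespace["c"]
```

The parser reports three statements, the first two on the same line, which matches the "newline terminated, semicolon as secondary separator" entry.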

Line continuation

Line continuation is generally done as part of lexical analysis: a newline normally results in a token being added to the token stream, unless line continuation is detected.
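
A minimal sketch of this idea, assuming a backslash as the last character of a line is the continuation marker: physical lines are merged into logical lines before any newline token would be emitted. The function name logical_lines is hypothetical, and real lexers also have to account for string literals and comments.

```python
def logical_lines(source, continuation="\\"):
    """Join physical lines that end with the continuation marker."""
    lines = []
    pending = ""
    for physical in source.splitlines():
        if physical.endswith(continuation):
            # Drop the marker and keep collecting; no statement boundary here.
            pending += physical[:-len(continuation)]
        else:
            lines.append(pending + physical)
            pending = ""
    if pending:
        # Source ended right after a continuation marker.
        lines.append(pending)
    return lines

joined = logical_lines("x = 1 + \\\n2\ny = 3")
```

Each element of the result corresponds to one place where a newline token would be added to the token stream.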

Whitespace – Languages that do not need continuations
Ampersand as last character of line
Backslash as last character of line
Backtick as last character of line
Hyphen as last character of line
Underscore as last character of line
Ellipsis (as three periods–not one special character)
Comma delimiter as last character of line
Left bracket delimiter as last character of line
Operator as last object of line
Operator as first character of continued line
Backslash as first character of continued line
Some form of inline comment serves as line continuation
Character position
[End and Begin] using normal quotes
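
Python itself illustrates two of the categories above: a backslash as the last character of a line, and an unclosed left bracket. The sketch below uses the standard-library tokenize module, which reports a statement-ending newline as a NEWLINE token and a non-terminating line break as NL. The helper name count_line_tokens is hypothetical.

```python
import io
import tokenize

def count_line_tokens(source):
    """Return (NEWLINE count, NL count) for a piece of Python source."""
    newline = nl = 0
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.NEWLINE:
            newline += 1          # end of a logical line (statement terminator)
        elif tok.type == tokenize.NL:
            nl += 1               # line break that does not end a statement
    return newline, nl

two_statements = count_line_tokens("a = 1\nb = 2\n")
backslash = count_line_tokens("total = 1 + \\\n2\n")
open_bracket = count_line_tokens("total = (1 +\n2)\n")
```

In both continued forms only one statement-terminating token is produced, even though the source spans two physical lines.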

Libraries

Importing a library makes external, possibly compiled, routines, programs, or packages available to a program. Imports can be classified by level (module, package, class, procedure, ...) and by syntax (directive name, attributes, ...).

File import
Package import
Class import
Procedure/function import
Constant import

The above statements can also be classified by whether they are a syntactic convenience (allowing things to be referred to by a shorter name, but they can still be referred to by some fully qualified name without import), or whether they are actually required to access the code (without which it is impossible to access the code, even with fully qualified names).

Syntactic convenience
Required to access code
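
A brief Python sketch of imports at several levels follows. It assumes only standard-library modules (math and colorsys), and the variable names are illustrative. In Python a module name cannot be used until some form of import has bound it, which the last part of the sketch checks by evaluating an expression in an empty namespace.

```python
import math                    # module import: binds the name `math`
from math import sqrt          # function import: binds the short name `sqrt`
from math import pi as PI      # constant import under an alias

# Both names refer to the same function object.
same_function = math.sqrt is sqrt

# Without an import, the qualified name `colorsys.rgb_to_hsv` is not reachable.
try:
    eval("colorsys.rgb_to_hsv(1, 0, 0)", {})
    needs_import = False
except NameError:
    needs_import = True
```

Here the `from ... import` forms act as a naming convenience on top of the module import, while the import itself is what makes the code accessible.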

Blocks

A block is a notation for a group of two or more statements, expressions or other units of code that are related in such a way as to comprise a whole.

Braces (a.k.a. curly brackets) { ... }
Parentheses ( ... )
Square brackets [ ... ]
begin ... end
do ... end
do ... done
X ... end (e.g. if ... end)
(begin ...)
(progn ...)
(do ...)
Indentation
Others
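
For the indentation style, Python's own tokenizer shows how a block boundary can be derived from leading whitespace: it emits an INDENT token where a block opens and a DEDENT token where it closes, playing the role that braces or begin ... end play elsewhere. The helper name block_tokens and the sample source are hypothetical.

```python
import io
import tokenize

def block_tokens(source):
    """List the INDENT/DEDENT tokens the Python tokenizer emits for `source`."""
    names = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.INDENT, tokenize.DEDENT):
            names.append(tokenize.tok_name[tok.type])
    return names

# The two indented calls form one block; `done()` is outside it.
sample = "if ready:\n    step_one()\n    step_two()\ndone()\n"
boundaries = block_tokens(sample)
```

The source is only tokenized, not executed, so the undefined names in the sample do not matter.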

Comments

Comments can be classified by style (inline or block) and by whether they can be nested.

Inline comments

Inline comments are generally those that use a newline character to indicate the end of a comment, and an arbitrary delimiter or sequence of tokens to indicate the beginning of a comment.

Examples:

Symbol | Languages
C | Fortran I to Fortran 77 (C in column 1)
REM | BASIC, Batch files, Visual Basic
:: | Batch files, COMMAND.COM, cmd.exe
NB. | J; from the (historically) common abbreviation Nota bene, the Latin for "note well".
⍝ | APL; the mnemonic is that the glyph (jot overstruck with shoe-down) resembles a desk lamp, and hence "illuminates" the foregoing.
# | Boo, Bourne shell and other UNIX shells, Cobra, Perl, Python, Ruby, Seed7, PowerShell, PHP, R, Make, Maple, Elixir, Julia, Nim [12]
% | TeX, Prolog, MATLAB, [13] Erlang, S-Lang, Visual Prolog, PostScript
// | ActionScript, Boo, C (C99), C++, C#, D, F#, Go, Java, JavaScript, Kotlin, Object Pascal (Delphi), Objective-C, PHP, Rust, Scala, Sass, Swift, Xojo, V (Vlang), Zig
' | Monkey, Visual Basic, VBScript, Small Basic, Gambas, Xojo
! | Factor, Fortran, Basic Plus, Inform, Pick Basic
; | Most assembly languages, AutoHotkey, AutoIt, Lisp, Common Lisp, Clojure, PGN, Rebol, Red, Scheme
-- | Euphoria, Haskell, SQL, Ada, AppleScript, Eiffel, Lua, VHDL, SGML, PureScript, Elm
* | Assembler S/360 (* in column 1), COBOL I to COBOL 85, PAW, Fortran IV to Fortran 77 (* in column 1), Pick Basic, GAMS (* in column 1)
|| | Curl
" | Vimscript, ABAP
\ | Forth
*> | COBOL 90
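
The general rule (a start marker, with the newline as the end) can be sketched in Python as below. The function name strip_inline_comment is hypothetical. The sketch deliberately ignores string literals, so a marker inside a quoted string would be misread, and it does not handle column-dependent forms such as C in column 1.

```python
def strip_inline_comment(line, marker):
    """Return `line` without a trailing inline comment introduced by `marker`."""
    # Everything from the first occurrence of the marker to the end of the
    # line is the comment; the newline itself is the closing delimiter.
    code, _marker, _comment = line.partition(marker)
    return code.rstrip()

python_style = strip_inline_comment("x = 1  # set x", "#")
ada_style = strip_inline_comment("total := 0; -- reset", "--")
no_comment = strip_inline_comment("y = 2", "//")
```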

Block comments

Block comments are generally those that use a delimiter to indicate the beginning of a comment, and another delimiter to indicate the end of a comment. In this context, whitespace and newline characters are not counted as delimiters. In the examples, the symbol ~ represents the comment, and the symbols surrounding it are understood by the interpreters or compilers as the delimiters.

Examples:

Symbol | Languages
comment ~ ; | ALGOL 60, SIMULA
¢ ~ ¢, # ~ #, co ~ co, comment ~ comment | ALGOL 68 [14] [15]
/* ~ */ | ActionScript, AutoHotkey, C, C++, C#, D, [16] Go, Java, JavaScript, Kotlin, Objective-C, PHP, PL/I, Prolog, Rexx, Rust (can be nested), Scala (can be nested), SAS, SASS, SQL, Swift (can be nested), V (Vlang), Visual Prolog, CSS
#cs ~ #ce | AutoIt [17]
/+ ~ +/ | D (can be nested) [16]
/# ~ #/ | Cobra (can be nested)
<# ~ #> | PowerShell
<!-- ~ --> | HTML, XML
=begin ~ =cut | Perl (Plain Old Documentation)
#`( ~ ) | Raku (bracketing characters can be (), <>, {}, [], any Unicode characters with BiDi mirrorings, or Unicode characters with Ps/Pe/Pi/Pf properties)
=begin ~ =end | Ruby
#<TAG> ~ #</TAG>, #stop ~ EOF, #iffalse ~ #endif, #ifntrue ~ #endif, #if false ~ #endif, #if !true ~ #endif | S-Lang [18]
{- ~ -} | Haskell (can be nested)
(* ~ *) | Delphi, ML, Mathematica, Object Pascal, Pascal, Seed7, AppleScript, OCaml (can be nested), Standard ML (can be nested), Maple, Newspeak, F#
{ ~ } | Delphi, Object Pascal, Pascal, PGN, Red
{# ~ #} | Nunjucks, Twig
{{! ~ }} | Mustache, Handlebars
{{!-- ~ --}} | Handlebars (cannot be nested, but may contain {{ and }})
|# ~ #| | Curl
%{ ~ %} | MATLAB [13] (the symbols must be in a separate line)
#| ~ |# | Lisp, Scheme, Racket (can be nested in all three)
#= ~ =# | Julia [19]
#[ ~ ]# | Nim [20]
--[[ ~ ]], --[=[ ~ ]=], --[=...=[ ~ ]=...=] | Lua (brackets can have any number of matching = characters; can be nested within non-matching delimiters)
" ~ " | Smalltalk
(comment ~ ) | Clojure
#If COMMENT Then ~ #End If [lower-alpha 1] | Visual Basic .NET
#if COMMENT ~ #endif [lower-alpha 2] | C#
' comment _ or REM comment _ [lower-alpha 3] | Classic Visual Basic, VBA, VBScript
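
The practical difference between nestable and non-nestable block comments can be sketched in Python as follows. The function name strip_block_comments and its parameters are hypothetical, and string literals are ignored for brevity.

```python
def strip_block_comments(src, opener, closer, nestable=False):
    """Remove block comments delimited by `opener` and `closer` from `src`."""
    out = []
    depth = 0                       # number of comments currently open
    i = 0
    while i < len(src):
        if depth > 0 and src.startswith(closer, i):
            depth -= 1
            i += len(closer)
        elif src.startswith(opener, i) and (depth == 0 or nestable):
            depth += 1              # inner openers only count when nesting is allowed
            i += len(opener)
        else:
            if depth == 0:
                out.append(src[i])  # keep text that is outside every comment
            i += 1
    return "".join(out)

text = "a /* x /* y */ b */ c"
c_like = strip_block_comments(text, "/*", "*/")                  # non-nestable, as in C
rust_like = strip_block_comments(text, "/*", "*/", nestable=True)  # nestable, as in Rust
```

Without nesting, the first closing delimiter ends the comment and the remainder is treated as code. With nesting, each opener must be matched by its own closer.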

Unique variants

Fortran
COBOL
Cobra
Curl
Lua
Perl
PHP
Python
Elixir
Raku
Ruby
S-Lang
Scheme and Racket
ABAP

ABAP supports two kinds of comments. If the first character of a line, including indentation, is an asterisk (*), the whole line is treated as a comment, while a double quote (") begins an in-line comment that runs until the end of the line. ABAP comments are not possible between the statements EXEC SQL and ENDEXEC because Native SQL has other uses for these characters. In most SQL dialects the double dash (--) can be used instead.
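
The two rules can be sketched as a small Python classifier. The function name split_abap_comment is hypothetical. The sketch assumes that the double quote never appears inside a literal and does not model the EXEC SQL ... ENDEXEC exception.

```python
def split_abap_comment(line):
    """Split one ABAP source line into (code, comment); comment is None if absent."""
    if line.startswith("*"):
        # Asterisk in the very first column: the whole line is a comment.
        return "", line[1:].strip()
    code, quote, comment = line.partition('"')
    if quote:
        # A double quote starts a comment that runs to the end of the line.
        return code.rstrip(), comment.strip()
    return line, None

full_line = split_abap_comment("* whole line")
trailing = split_abap_comment('WRITE x. " show x')
indented_star = split_abap_comment("  * not a comment")
```

An indented asterisk is not in the first column, so under the first rule it is left as code.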

Esoteric languages

Comment comparison

There is a wide variety of syntax styles for declaring comments in source code. BlockComment in italics is used here to indicate block comment style. InlineComment in italics is used here to indicate inline comment style.

Language | In-line comment | Block comment
Ada, Eiffel, Euphoria, Occam, SPARK, ANSI SQL, and VHDL | -- InlineComment |
ALGOL 60 | | comment BlockComment;
ALGOL 68 | | ¢ BlockComment ¢, comment BlockComment comment, co BlockComment co, # BlockComment #, £ BlockComment £
APL | ⍝ InlineComment |
AppleScript | -- InlineComment | (* BlockComment *)
Assembly language (varies) | ; InlineComment (one example; most assembly languages use line comments only) |
AutoHotkey | ; InlineComment | /* BlockComment */
AWK, Bourne shell, C shell, Maple, PowerShell | # InlineComment | <# BlockComment #>
Bash | # InlineComment | <<EOF
BlockComment
EOF
: '
BlockComment
'
BASIC (various dialects) | 'InlineComment (not all dialects), *InlineComment (not all dialects), !InlineComment (not all dialects), REM InlineComment |
C (K&R, ANSI/C89/C90), CHILL, PL/I, REXX | | /* BlockComment */
C (C99), C++, Go, Swift, JavaScript, V (Vlang) | // InlineComment | /* BlockComment */
C# | // InlineComment, /// InlineComment (XML documentation comment) | /* BlockComment */, /** BlockComment */ (XML documentation comment)
#if COMMENT
 BlockComment
#endif
(Compiler directive) [lower-alpha 2]
COBOL I to COBOL 85 | * InlineComment (* in column 7) |
COBOL 2002 | *> InlineComment |
Curl | ||InlineComment | |# BlockComment #|, |foo# BlockComment #|
Cobra | # InlineComment | /# BlockComment #/ (nestable)
D | // InlineComment, /// Documentation InlineComment (ddoc comments) | /* BlockComment */, /** Documentation BlockComment */ (ddoc comments), /+ BlockComment +/ (nestable), /++ Documentation BlockComment +/ (nestable, ddoc comments)
DCL | $! InlineComment |
ECMAScript (JavaScript, ActionScript, etc.) | // InlineComment | /* BlockComment */
Elixir | # InlineComment | ~S"""
BlockComment
"""
@doc """
BlockComment
"""
(Documentation, only works in modules)
@moduledoc """
BlockComment
"""
(Module documentation)
@typedoc """
BlockComment
"""
(Type documentation)
Forth | \ InlineComment | ( BlockComment ) (single line and multiline), ( before -- after ) stack comment convention
FORTRAN I to FORTRAN 77 | C InlineComment (C in column 1) |
Fortran 90 and later | ! InlineComment | #if 0
 BlockComment
#endif
[lower-alpha 4]
Haskell | -- InlineComment | {- BlockComment -}
J | NB. |
Java | // InlineComment | /* BlockComment */, /** BlockComment */ (Javadoc documentation comment)
Julia | # InlineComment | #=BlockComment=#
Lisp, Scheme | ; InlineComment | #|BlockComment|#
Lua | -- InlineComment | --[==[ BlockComment]==] (variable number of = signs, nestable with delimiters with different numbers of = signs)
Maple | # InlineComment | (* BlockComment *)
Mathematica | | (* BlockComment *)
MATLAB | % InlineComment | %{
BlockComment (nestable)
%}
Note: Both percent–bracket symbols must be the only non-whitespace characters on their respective lines.
Nim | # InlineComment | #[BlockComment]#
Object Pascal | // InlineComment | (* BlockComment *), { BlockComment }
OCaml | | (* BlockComment (* nestable *) *)
Pascal, Modula-2, Modula-3, Oberon, ML | | (* BlockComment *)
Perl, Ruby | # InlineComment | =begin
BlockComment
=cut
(=end in Ruby) (POD documentation comment)
__END__
Comments after end of code
PGN, Red | ; InlineComment | { BlockComment }
PHP | # InlineComment, // InlineComment | /* BlockComment */, /** Documentation BlockComment */ (PHP Doc comments)
PILOT | R:InlineComment |
PLZ/SYS | | ! BlockComment !
PL/SQL, TSQL | -- InlineComment | /* BlockComment */
Prolog | % InlineComment | /* BlockComment */
Python | # InlineComment | '''BlockComment''', """ BlockComment """ (documentation string when first line of module, class, method, or function)
R | # InlineComment |
Raku | # InlineComment | #`{
BlockComment
}
=comment
    This comment paragraph goes until the next POD directive
    or the first blank line.
[25] [26]
Rust | // InlineComment, /// InlineComment ("Outer" rustdoc comment), //! InlineComment ("Inner" rustdoc comment) | /* BlockComment */ (nestable), /** BlockComment */ ("Outer" rustdoc comment), /*! BlockComment */ ("Inner" rustdoc comment)
SAS | | * BlockComment;
/* BlockComment */
Seed7 | # InlineComment | (* BlockComment *)
Simula | | comment BlockComment;
! BlockComment;
Smalltalk | | "BlockComment"
Smarty | | {* BlockComment *}
Standard ML | | (* BlockComment *)
TeX, LaTeX, PostScript, Erlang, S-Lang | % InlineComment |
Texinfo | @c InlineComment, @comment InlineComment |
TUTOR | * InlineComment, command $$ InlineComment |
Visual Basic | 'InlineComment, Rem InlineComment | 'BlockComment _
BlockComment
Rem BlockComment _
BlockComment
[lower-alpha 3]
Visual Basic .NET | 'InlineComment, '''InlineComment (XML documentation comment), Rem InlineComment | #If COMMENT Then
 BlockComment
#End If
Visual Prolog | % InlineComment | /* BlockComment */
Wolfram Language | | (* BlockComment *)
Xojo | ' InlineComment, // InlineComment, rem InlineComment |
Zig | // InlineComment, /// InlineComment, //! InlineComment |
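
The Python entry in the comparison above notes that a string literal placed first in a module, class, method, or function becomes a documentation string. A short sketch of that distinction follows, using a hypothetical function named documented.

```python
def documented():
    """Return the answer."""
    # An inline comment: discarded by the compiler, not available at run time.
    return 42

def undocumented():
    # Only a comment here, so there is no documentation string.
    return 42

doc_text = documented.__doc__
no_doc = undocumented.__doc__
```

The string literal is kept on the function object as its __doc__ attribute, whereas the # comments leave no trace at run time.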

References

  1. For multiple statements on one line
  2. Three different kinds of clauses, each separates phrases and the units differently:
      1. serial-clause using go-on-token (viz. semicolon): begin a; b; c end – units are executed in order.
      2. collateral-clause using and-also-token (viz. ","): begin a, b, c end – order of execution is to be optimised by the compiler.
      3. parallel-clause using and-also-token (viz. ","): par begin a, b, c end – units must be run in parallel threads.
  3. semicolon – result of preceding statement hidden, comma – result displayed
  4. From the R Language Definition, section 3.2 Control structures: "A semicolon always indicates the end of a statement while a new line may indicate the end of a statement. If the current statement is not syntactically complete new lines are simply ignored by the evaluator."
  5. Bash Reference Manual, 3.1.2.1 Escape Character
  6. Python Documentation, 2. Lexical analysis: 2.1.5. Explicit line joining
  7. Mathworks.com Archived 7 February 2010 at the Wayback Machine
  8. https://ss64.com/nt/syntax-brackets.html
  9. "Scripts - Definition & Usage | AutoHotkey".
  10. For an M-file (MATLAB source) to be accessible by name, its parent directory must be in the search path (or current directory).
  11. "Verbose Syntax - F# | Microsoft Learn". Microsoft Learn. 5 November 2021. Retrieved 17 November 2022.
  12. "Nim Manual".
  13. "Mathworks.com". Archived from the original on 21 November 2013. Retrieved 25 June 2013.
  14. "Algol68_revised_report-AB.pdf on PDF pp. 61–62, original document pp. 121–122" (PDF). Retrieved 27 May 2014.
  15. "HTML Version of the Algol68 Revised Report AB". Archived from the original on 17 March 2013. Retrieved 27 May 2014.
  16. "DLang.org, Lexical". Retrieved 27 May 2014.
  17. "AutoItScript.com Keyword Reference, #comments-start". Retrieved 27 May 2014.
  18. "slang-2.2.4/src/slprepr.c – line 43 to 113". Retrieved 28 May 2014.
  19. "Punctuation · The Julia Language".
  20. "Nim Manual".
  21. "Python tip: You can use multi-line strings as multi-line comments", 11 September 2011, Guido van Rossum
  22. "Writing Documentation — Elixir v1.12.3" . Retrieved 28 July 2023.
  23. "Perl 6 Documentation (Syntax)". docs.perl6.org. Comments. Retrieved 5 April 2017.
  24. "Using the FPP Preprocessor". Archived from the original on 18 November 2022. Retrieved 18 November 2022.
  25. "Perl 6 POD Comments". 25 May 2023.
  26. "Perl 6 POD (Abbreviated Blocks)". 25 May 2023.

Notes

  1. Visual Basic .NET does not support traditional multi-line comments, but they can be emulated through compiler directives.
  2. While C# supports traditional block comments /* ... */, compiler directives can be used to mimic them just as in VB.NET.
  3. The line continuation character _ can be used to extend a single-line comment to the next line without needing to type ' or REM again. This can be done up to 24 times in a row.
  4. Fortran does not support traditional block comments, but some compilers support preprocessor directives in the style of C/C++, allowing a programmer to emulate multi-line comments. [24]