Programming style

Programming style, also known as coding style, refers to the conventions and patterns used in writing source code, resulting in a consistent and readable codebase. These conventions often encompass aspects such as indentation, naming conventions, capitalization, and comments. Consistent programming style is generally considered beneficial for code readability and maintainability, particularly in collaborative environments.

Maintaining a consistent style across a codebase can improve readability and ease of software maintenance. It allows developers to quickly understand code written by others and reduces the likelihood of errors during modifications. Adhering to standardized coding guidelines ensures that teams follow a uniform approach, making the codebase easier to manage and scale. Many organizations and open-source projects adopt specific coding standards to facilitate collaboration and reduce cognitive load.

Style guidelines can be formalized in documents known as coding conventions, which dictate specific formatting and naming rules. These conventions may be prescribed by official standards for a programming language or developed internally within a team or project. For example, Python's PEP 8 is a widely recognized style guide that outlines best practices for writing Python code. In contrast, languages like C or Java may have industry standards that are either formally documented or adhered to by convention.

Automation

Adherence to coding style can be enforced through automated tools, which format code according to predefined guidelines. These tools reduce the manual effort required to maintain style consistency, allowing programmers to focus on logic and functionality. For instance, tools such as Black for Python and clang-format for C++ automatically reformat code to comply with specified coding standards.
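As a rough illustration of automated enforcement, the sketch below checks source text against three simple rules. It only reports violations and does not reformat anything, so it is far simpler than tools such as Black or clang-format and says nothing about how they work. The function name check_style, the three rules, and the 79-character limit are assumptions chosen for this example.

```python
from typing import List, Tuple

MAX_LINE_LENGTH = 79  # assumed limit, chosen only for illustration


def check_style(source: str, max_length: int = MAX_LINE_LENGTH) -> List[Tuple[int, str]]:
    """Return (line_number, message) pairs for a few simple style violations."""
    problems = []
    for number, line in enumerate(source.splitlines(), start=1):
        if len(line) > max_length:
            problems.append((number, "line too long"))
        if line != line.rstrip():
            problems.append((number, "trailing whitespace"))
        # The indentation is whatever precedes the first non-blank character.
        indentation = line[: len(line) - len(line.lstrip())]
        if "\t" in indentation:
            problems.append((number, "tab in indentation"))
    return problems


# Hypothetical input: the second line is tab-indented and ends with spaces.
sample = "def f(x):\n\treturn x  \n"
for number, message in check_style(sample):
    print(f"line {number}: {message}")
```

A checker of this kind could run before each commit, so that style problems are reported by a tool rather than raised during code review.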

Style guidelines

Common elements of coding style include:

Indentation

Indentation style can assist a reader in various ways, including identifying control flow and blocks of code. In some programming languages, indentation is used to delimit blocks of code and therefore is not a matter of style. In languages that ignore whitespace, indentation can still affect readability.

For example, formatted in a commonly-used style:

if (hours < 24 && minutes < 60 && seconds < 60) {
    return true;
} else {
    return false;
}

Arguably, poorly formatted:

if  ( hours   < 24
   && minutes < 60
   && seconds < 60
)
{return    true
;}         else
{return   false
;}

Notable indenting styles

ModuLiq

The ModuLiq Zero Indentation Style groups by empty line rather than indenting.

Example:

if (hours < 24 && minutes < 60 && seconds < 60)
return true;

else
return false;

Lua

Lua does not use the traditional curly braces or parentheses; rather, the expression in a conditional statement must be followed by then, and the block must be closed with end.

if hours < 24 and minutes < 60 and seconds < 60 then
    return true
else
    return false
end

Indenting is optional in Lua. The keywords and, or, and not function as logical operators.

Python

Python relies on the off-side rule, using indentation to indicate and implement control structure, which eliminates the need for bracketing (i.e., { and }). However, copying and pasting indented code can cause problems, because the indent level of the pasted code may not match the indent level of the target line. Reformatting by hand is tedious and error-prone, but some text editors and integrated development environments (IDEs) can do it automatically. Indented code can also be rendered unusable when posted on a forum or web page that removes whitespace, though this can be avoided where the code can be enclosed in whitespace-preserving tags such as "<pre> ... </pre>" (for HTML) or "[code]" ... "[/code]" (for BBCode).

if hours < 24 and minutes < 60 and seconds < 60:
    return True
else:
    return False

Python starts a block with a colon (:).

Python programmers tend to follow a commonly agreed style guide known as PEP 8. [1] There are tools designed to automate PEP 8 compliance.
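The re-indentation that editors perform on pasted code can be sketched with the standard textwrap module. This is a minimal illustration and not how any particular editor implements the feature. The helper name reindent and the four-space indent width are assumptions made for the example.

```python
import textwrap


def reindent(block: str, target_level: int, indent_width: int = 4) -> str:
    """Strip the block's common leading whitespace, then indent it to target_level."""
    dedented = textwrap.dedent(block)
    prefix = " " * (indent_width * target_level)
    return textwrap.indent(dedented, prefix)


# Hypothetical snippet copied from two levels deep in another file.
pasted = (
    "        if hours < 24:\n"
    "            return True\n"
)

# Re-indent it for a target location that sits one level deep.
print(reindent(pasted, target_level=1), end="")
```

Because textwrap.dedent removes only the whitespace common to every line, the relative indentation inside the block, which carries the control structure, is preserved.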

Haskell

Haskell, like Python, has the off-side rule. It has a two-dimensional syntax where indenting is meaningful to define blocks (although an alternate syntax uses curly braces and semicolons).

Haskell is a declarative language: a Haskell script consists of declarations rather than statements.

Example:

f x y =
    let c_1 = 1
        c_2 = 2
    in c_1 * x + c_2 * y

may be written in one line as:

f x y = let { c_1 = 1; c_2 = 2 } in c_1 * x + c_2 * y

Haskell encourages the use of literate programming, where extended text explains the genesis of the code. In literate Haskell scripts (named with the lhs extension), everything is a comment except blocks marked as code. The program can be written in LaTeX, in which case the code environment marks what is code. Alternatively, each active code paragraph can be marked by preceding and following it with an empty line and starting each line of code with a greater-than sign and a space. Here is an example using LaTeX markup:

The function \verb+isValidDate+ tests if a date is valid
\begin{code}
isValidDate :: Date -> Bool
isValidDate date = hh>=0  && mm>=0 && ss>=0
                 && hh<24 && mm<60 && ss<60
 where (hh,mm,ss) = fromDate date
\end{code}
observe that in this case the overloaded function is \verb+fromDate :: Date -> (Int,Int,Int)+.

And an example using plain text:

The function isValidDate tests if a date is valid

> isValidDate :: Date -> Bool
> isValidDate date = hh>=0  && mm>=0 && ss>=0
>                  && hh<24 && mm<60 && ss<60
>  where (hh,mm,ss) = fromDate date

observe that in this case the overloaded function is fromDate :: Date -> (Int,Int,Int).

Vertical alignment

Some programmers consider it valuable to align similar elements vertically (as tabular, in columns), citing that it can make typo-generated bugs more obvious.

For example, unaligned:

$search = array('a', 'b', 'c', 'd', 'e');
$replacement = array('foo', 'bar', 'baz', 'quux');

$value = 0;
$anothervalue = 1;
$yetanothervalue = 2;

aligned:

$search      = array('a',   'b',   'c',   'd',   'e');
$replacement = array('foo', 'bar', 'baz', 'quux');

$value           = 0;
$anothervalue    = 1;
$yetanothervalue = 2;

Unlike the unaligned code, the aligned code implies that the search and replace values are related since they have corresponding elements. As there is one more value for search than replacement, if this is a bug, it is more likely to be spotted via visual inspection.
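The alignment of simple assignments can be sketched as a small text transformation. The function below is an illustration only: the name align_assignments is hypothetical, and it assumes that every input line contains an = sign and that the first = on each line is the assignment operator.

```python
from typing import List


def align_assignments(lines: List[str]) -> List[str]:
    """Pad the left-hand side of each 'name = value' line so the '=' signs line up.

    Assumes every line contains '=' and that the first '=' is the assignment.
    """
    split = [line.split("=", 1) for line in lines]
    width = max(len(left.rstrip()) for left, _ in split)
    return [f"{left.rstrip():<{width}} = {right.strip()}" for left, right in split]


unaligned = [
    "$value = 0;",
    "$anothervalue = 1;",
    "$yetanothervalue = 2;",
]
for line in align_assignments(unaligned):
    print(line)
```

The padding width depends on the longest name in the group, so renaming any one variable can change the formatting of every other line in the group.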

A commonly cited disadvantage of vertical alignment is the effort needed to maintain it: a change to one line can require reformatting its neighbors. Tool support (e.g. for elastic tabstops) can alleviate this effort, although that creates a reliance on such tools.

As an example, simple refactoring operations that rename "$replacement" to "$r" and "$anothervalue" to "$a" result in:

$search      = array('a',   'b',   'c',   'd',   'e');
$r = array('foo', 'bar', 'baz', 'quux');

$value           = 0;
$a    = 1;
$yetanothervalue = 2;

With unaligned formatting, these changes do not have such a dramatic, inconsistent or undesirable effect:

$search = array('a', 'b', 'c', 'd', 'e');
$r = array('foo', 'bar', 'baz', 'quux');

$value = 0;
$a = 1;
$yetanothervalue = 2;

Whitespace

A free-format language ignores whitespace characters (spaces, tabs, and newlines), so the programmer is free to style the code in different ways without affecting its meaning. Generally, the programmer uses a style that is considered to enhance readability.

The two code snippets below are the same logically, but differ in whitespace.

int i;
for(i=0;i<10;++i){
    printf("%d",i*i+i);
}

versus

int i;
for (i = 0; i < 10; ++i) {
    printf("%d", i * i + i);
}
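One way to confirm that two snippets differ only in layout is to compare their parsed structure rather than their text. The sketch below uses Python's standard ast module as a stand-in. Python is not a free-format language, since its indentation is significant, but spacing between tokens within a line is ignored, which is enough for the illustration. The helper name same_structure is hypothetical.

```python
import ast


def same_structure(first: str, second: str) -> bool:
    """Compare two snippets by syntax tree, ignoring spacing between tokens."""
    # ast.dump omits line and column numbers by default, so layout is ignored.
    return ast.dump(ast.parse(first)) == ast.dump(ast.parse(second))


compact = "total=i*i+i"
spaced = "total = i * i + i"

print(same_structure(compact, spaced))                 # layout differs, meaning does not
print(same_structure(compact, "total = i * (i + i)"))  # the meaning itself differs
```

The first comparison prints True and the second prints False, because the added parentheses change the expression tree rather than just the spacing.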

The use of tabs for whitespace is debatable. Alignment issues arise due to differing tab stops in different environments and mixed use of tabs and spaces.

As an example, one programmer prefers tab stops of four and has their toolset configured this way, and uses these to format their code.

int     ix;     // Index to scan array
long    sum;    // Accumulator for sum

Another programmer prefers tab stops of eight, and their toolset is configured this way. When someone else examines the original person's code, they may well find it difficult to read.

int             ix;             // Index to scan array
long    sum;    // Accumulator for sum
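The effect of differing tab stops can be reproduced with Python's str.expandtabs, which replaces each tab with spaces up to the next tab stop. The two lines below are hypothetical source lines laid out by someone whose editor uses tab stops of four. The helper comment_column is an assumption introduced only to report where the comment starts.

```python
# Hypothetical source lines, aligned by an author who uses tab stops of four.
line_ix = "int\t\tix;\t\t// Index to scan array"
line_sum = "long\tsum;\t// Accumulator for sum"


def comment_column(line: str, tab_size: int) -> int:
    """Return the column at which the comment starts once tabs are expanded."""
    return line.expandtabs(tab_size).index("//")


for tab_size in (4, 8):
    print(f"tab stops every {tab_size} columns:")
    for line in (line_ix, line_sum):
        print(line.expandtabs(tab_size))
```

With tab stops of four, both comments start at column 16. With tab stops of eight, the first comment moves to column 32 while the second stays at column 16, so the alignment the author intended is lost.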

One widely used solution to this issue is to forbid the use of tabs for alignment or to impose rules on how tab stops must be set. Tabs work fine provided they are used consistently, restricted to logical indentation, and not used for alignment:

class MyClass {
    int foobar(
        int qux, // first parameter
        int quux); // second parameter
    int foobar2(
        int qux, // first parameter
        int quux, // second parameter
        int quuux); // third parameter
};

References

  1. "PEP 0008: Style Guide for Python Code". python.org.