Code folding

Code folding example on PHP code with Vim

Code or text folding, or less commonly holophrasting, [1] is a feature of some graphical user interfaces that allows the user to selectively hide ("fold") or display ("unfold") parts of a document. This allows the user to manage large amounts of text while viewing only those subsections that are currently of interest. It is typically used with documents which have a natural tree structure consisting of nested elements. Other names for these features include expand and collapse, code hiding, and outlining. In Microsoft Word, the feature is called "collapsible outlining".

Many user interfaces provide disclosure widgets for code folding in a sidebar, indicated for example by a triangle that points sideways (if collapsed) or down (if expanded), or by a [-] box for collapsible (expanded) text, and a [+] box for expandable (collapsed) text.

Code folding is found in text editors, source code editors, and IDEs. The folding structure typically follows the syntax tree of the program defined by the computer language. It may also be defined by levels of indentation, or be specified explicitly using an in-band marker (saved as part of the source code) or out-of-band.

Text folding is a similar feature used on ordinary text, where the nested elements consist of paragraphs, sections, or outline levels. Programs offering this include folding editors, outliners, and some word processors.

Data folding is found in some hex editors and is used to structure a binary file or hide inaccessible data sections. [2]

Folding is also frequently used in data comparison, to select one version or another, or only the differences.

History

The earliest known example of code folding in an editor is in NLS. [3] Probably the first widely available folding editor was the 1974 Structured Programming Facility (SPF) editor for IBM 370 mainframes, which could hide lines based on their indentation. It displayed on character-mapped 3270 terminals. [4] It was very useful for prolix languages like COBOL. It evolved into the Interactive System Productivity Facility (ISPF).

Use

Code folding has various use patterns, primarily organizing code or hiding less useful information so one can focus on more important information. Common patterns follow. [5]

Outlining

Most basically, applications use code folding to outline source code, collapsing each block to a single line. This can be only top-level blocks like functions and classes, nested blocks like nested functions and methods, or all blocks, notably control-flow blocks. This allows one to get an overview of code, easily navigating and rearranging it, and to drill down into more detail as needed, without being distracted by other code. Viewing-wise, this allows one to quickly see a list of all functions (without their bodies), while navigation-wise this replaces extensive paging past long functions – or searching for the target – with going directly to the next function.

Hiding boilerplate code

Some languages or libraries require extensive boilerplate code. This results in extremely long code, which can obscure the main point. Further, substantive code can be lost in the boilerplate.

For example, in Java a single private field with a getter and setter requires at least 3 lines, if each is on a separate line:

    private String name = null;
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }

This expands to 10 lines with conventional function line breaks and spacing between functions (including trailing newline):

    private String name = null;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

Documentation with Javadoc expands this to 20 lines:

    /**
     * Property <code>name</code>  readable/writable.
     */
    private String name = null;

    /**
     * Getter for property <code>name</code>
     */
    public String getName() {
        return name;
    }

    /**
     * Setter for property <code>name</code>.
     * @param name
     */
    public void setName(String name) {
        this.name = name;
    }

If there are many such fields, the result can easily be hundreds of lines of code with very little "interesting" content – code folding can reduce this to a single line per field, or even to a single line for all fields. Further, if all routine fields are folded, but non-routine fields (where getter or setter is not just returning or assigning a private field) are not folded, it becomes easier to see the substantive code.

Collapsing metadata

Metadata can be lengthy, and is generally less important than the data it is describing. Collapsing metadata allows one to primarily focus on the data, not the metadata. For example, a long list of attributes in C# may be manually collapsed as follows: [6]

    #region Attributes
    [Browsable(false)]
    [MergableProperty(false)]
    [DefaultValue(null)]
    [PersistenceMode(PersistenceMode.InnerProperty)]
    [TemplateContainer(typeof(MyType))]
    [TemplateInstance(TemplateInstance.Single)]
    #endregion
    public ITemplate ContentTemplate { get; set; }

The resulting code displays as:

    Attributes
    public ITemplate ContentTemplate { get; set; }

Collapsing comments

Comments are a form of human-readable metadata, and lengthy comments can disrupt the flow of code. This can be the case either for a long comment for a short section of code, such as a paragraph to explain one line, or comments for documentation generators, such as Javadoc or XML Documentation. Code folding allows one to have long comments, but to display them only when required. In cases where a long comment has a single summary line, such as Python docstrings, the summary can still be displayed when the section is collapsed, allowing a summary/detailed view.
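The summary/detailed view for docstrings can be sketched in Python with the standard ast module. This is only an illustration of the idea, not how any particular editor does it; the function name folded_docstring_labels and the sample source are hypothetical.

```python
import ast


def folded_docstring_labels(source):
    """Map each documented function or class name to its docstring summary line."""
    tree = ast.parse(source)
    labels = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            doc = ast.get_docstring(node)  # None when there is no docstring
            if doc:
                # The first line is what a collapsed view would keep visible.
                labels[node.name] = doc.splitlines()[0]
    return labels


# Hypothetical sample source used only for this illustration.
SAMPLE = '''
def parse(record):
    """Parse one record into a dict.

    Fields are split on commas; surrounding whitespace is dropped.
    """
    return dict(field.split("=") for field in record.split(","))


def helper(x):
    return x
'''

print(folded_docstring_labels(SAMPLE))
```

Undocumented definitions such as helper simply get no label, so a collapsed view would fall back to showing their first line of code.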

Showing structure or sandwich code in structured programming

Structured programming consists of nested blocks of code, and long blocks of code – such as long switch statements – can obscure the overall structure. Code folding allows one to see the overall structure and expand to a specific level. Further, in some uses, particularly strict structured programming (single function exit), there are code patterns that are hard to see when looking at expanded code. For example, in resource management in structured programming, one generally acquires a resource, followed by a block of code using the resource, and finishing with releasing the resource. The acquisition/release pairing is hard to see if there is a long block of code in between, but easy to see if the intervening block is folded. Similarly, in conditional code like if...then...else, secondary blocks may be far from the condition statement.

Grouping code

Fold groups can be used to group code, either by explicit grouping – similar to comment blocks separating a module into sections, or class members into associated groups – or implicitly, such as by automatically grouping class members by access level.

Hiding legacy code

Legacy code – or any code that a developer does not wish to view or change at a given point in time – can be folded away so that programmers can concentrate on the code under consideration.

Hiding in-source data tables

Conventions

In order to support code folding, the text editor must provide a mechanism for identifying "folding points" within a text file. Some text editors provide this mechanism automatically, while others provide defaults that can either be overridden or augmented by the user.

Mechanisms fall broadly into two groups, automatic and manual, depending on whether they require any specification by the programmer. Folding points are usually determined with one or more of the following mechanisms. Each has its own advantages and difficulties, and the developers of the editor decide which to implement. Editors that support multiple folding mechanisms typically let the user choose the one most appropriate for the file being edited.

Syntax-dependent

Syntax-dependent folding points are those that rely on the content of the file being edited in order to specify where specific folding regions should begin and end. Syntax-based folding points are typically defined around any or all of the standard sub-features of the markup language or programming language in use. These are desirable due to being automatic and agreeing with code structure, but may require significant work to implement, and time to compute when editing a file.
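As a rough sketch of syntax-dependent folding, the following Python uses the standard ast module to derive fold ranges from the syntax tree of Python source. The function name syntax_fold_ranges and the sample program are assumptions made for illustration; a real editor would typically cover more node types and reparse incrementally.

```python
import ast


def syntax_fold_ranges(source):
    """Return sorted (start_line, end_line) pairs for multi-line defs and classes."""
    tree = ast.parse(source)
    ranges = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # end_lineno is available on Python 3.8 and later.
            if node.end_lineno > node.lineno:
                ranges.append((node.lineno, node.end_lineno))
    return sorted(ranges)


# Hypothetical sample program used only for this illustration.
SAMPLE = '''\
class Greeter:
    def hello(self):
        return "hello"

    def bye(self):
        return "bye"

def main():
    g = Greeter()
    print(g.hello())
'''

print(syntax_fold_ranges(SAMPLE))  # [(1, 6), (2, 3), (5, 6), (8, 10)]
```

Because the whole source is parsed on every call, the sketch also shows why this approach can cost time while a file is being edited.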

Indentation-based

Indentation-based folding points are generally specified by the position and sequence of non-printing whitespace, such as tabs and spaces, within the text. This is most often used as a simple form of syntax-based folding, as indentation almost always reflects nesting level in indent styles for structured programming languages.

This convention is particularly suitable to syntaxes that have an off-side rule, so the structure largely agrees with the indent. Examples include Python and text files that require indentation as a rule by themselves. However, even in these cases, structure does not exactly agree with indent, such as in line continuation, and thus syntax-dependent folding is preferred.
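A minimal Python sketch of indentation-based folding follows. It assumes that tabs expand to 8 columns and that blank lines never open or close a fold; the names indent_of and indent_fold_ranges are hypothetical.

```python
def indent_of(line):
    """Count leading spaces after expanding tabs to 8 columns (an assumption)."""
    expanded = line.expandtabs(8)
    return len(expanded) - len(expanded.lstrip(" "))


def indent_fold_ranges(lines):
    """Return (start, end) 1-based line pairs.

    A fold opens at any line that is followed by more deeply indented
    lines, and it covers all of them.
    """
    # Keep only non-blank lines, remembering their original line numbers.
    rows = [(n, indent_of(text)) for n, text in enumerate(lines, 1) if text.strip()]
    ranges = []
    for i, (start, depth) in enumerate(rows):
        end = start
        for number, d in rows[i + 1:]:
            if d <= depth:
                break  # back at the same or a shallower level: fold ends
            end = number
        if end > start:
            ranges.append((start, end))
    return ranges


SAMPLE = [
    "def f(x):",
    "    if x:",
    "        return 1",
    "",
    "    return 0",
    "print(f(1))",
]

print(indent_fold_ranges(SAMPLE))  # [(1, 5), (2, 3)]

# A continued expression is not a block, yet indentation alone folds it.
print(indent_fold_ranges(["total = (a +", "         b)"]))  # [(1, 2)]
```

The second call illustrates the mismatch described above: a line continuation produces a fold even though it does not correspond to any nested block.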

Token-based

Token-based folding points are specified using special delimiters that serve no other purpose in the text than to identify the boundaries of folding points. This convention is comparable to indentation-based folding, except that printable characters are used instead of whitespace. The most common delimiter tokens are {{{ to begin the folded section and }}} to end it.

Another notable token is #region in C# (#Region in Visual Basic), used in the Microsoft Visual Studio code editor. These are treated syntactically as compiler directives, though they do not affect compilation.

As a manual method, token-based folding allows discretion in grouping code based on arbitrary criteria, such as "functions related to a given task", which cannot be inferred from syntactic analysis.

Token-based folding relies on in-band signalling: the folding tokens are essentially structured comments and, unlike the markers of other methods, are present in the source code and visible to other programmers. This allows them to be shared, but it also requires all programmers working on a particular file to use (or at least preserve) them, which can cause friction and add maintenance burden.
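Locating token-based folds amounts to matching opening and closing markers, which can be sketched in Python with a stack. The function name marker_fold_ranges is hypothetical, and the sketch assumes at most one marker per line; unmatched markers are ignored.

```python
def marker_fold_ranges(lines, open_token="{{{", close_token="}}}"):
    """Return sorted (start, end) 1-based line pairs for matched marker pairs."""
    stack = []   # line numbers of openers still waiting for a closer
    ranges = []
    for number, text in enumerate(lines, 1):
        # Assumption: at most one marker per line, opener checked first.
        if open_token in text:
            stack.append(number)
        elif close_token in text and stack:
            ranges.append((stack.pop(), number))
    return sorted(ranges)


# Hypothetical source with markers placed inside comments.
SAMPLE = [
    "// {{{ setup",
    "init()",
    "// {{{ logging",
    "log_init()",
    "// }}}",
    "// }}}",
    "run()",
]

print(marker_fold_ranges(SAMPLE))  # [(1, 6), (3, 5)]
```

Since the markers live in the file itself, any tool that reads the same lines recovers the same ranges, which is what makes this kind of folding shareable.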

User-specified

User-specified folding allows the user to fold sections of text using a generic selection method, but without changing the source code (out-of-band), instead being specified only in the editor. For example, a programmer may select some lines of text and specify that they should be folded. Folded text might be anonymous or named, and this may be preserved across editing sessions or discarded. Unlike token-based folding, this does not change the source text – it thus is not shared with other editors of the file, and is not visible in the code.
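One way to picture out-of-band folding is a small store that the editor keeps separately from the text. The Python sketch below is purely illustrative: the FoldStore class, its methods, and the JSON layout are assumptions, not the format of any real editor.

```python
import json


class FoldStore:
    """Hypothetical out-of-band store: folds live beside the text, not in it."""

    def __init__(self):
        # Each entry: {"start": int, "end": int, "name": str or None}
        self.folds = []

    def fold(self, start, end, name=None):
        """Record that lines start..end (1-based, inclusive) are folded."""
        if start > end:
            raise ValueError("start must not exceed end")
        self.folds.append({"start": start, "end": end, "name": name})

    def unfold(self, start):
        """Forget any fold that begins at the given line."""
        self.folds = [f for f in self.folds if f["start"] != start]

    def dumps(self):
        """Serialize the folds so they can be preserved across sessions."""
        return json.dumps(self.folds)

    @classmethod
    def loads(cls, text):
        store = cls()
        store.folds = json.loads(text)
        return store


store = FoldStore()
store.fold(10, 42, name="legacy parser")  # a named fold
store.fold(50, 60)                        # an anonymous fold
saved = store.dumps()                     # would be written to editor state
restored = FoldStore.loads(saved)
print(restored.folds == store.folds)      # True
```

Because the ranges are stored as line numbers, a real implementation would also need to shift them as the text is edited; the sketch leaves that out.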

Examples

The following document contains folding tokens ({{{ ... }}}):

    Heading 1
    {{{
    Body
    }}}

    Heading 2
    {{{
    Body
    }}}

    Heading 3
    {{{
    Body
    }}}

When loaded into a folding editor, the outline structure will be shown:

    Heading 1
    {{{ ...

    Heading 2
    {{{ ...

    Heading 3
    {{{ ...

Usually clicking on the {{{ marks makes the appropriate body text appear.
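The collapsed outline view of such a document can be approximated with a short Python sketch. The function name collapse_all is hypothetical, and the code assumes each marker sits on its own line, as in the example document.

```python
def collapse_all(lines, open_token="{{{", close_token="}}}"):
    """Return the lines with every top-level marked region shown as '{{{ ...'."""
    shown = []
    depth = 0  # how many marked regions currently enclose the line
    for text in lines:
        if open_token in text:
            if depth == 0:
                shown.append(text.rstrip() + " ...")
            depth += 1
        elif close_token in text and depth > 0:
            depth -= 1
        elif depth == 0:
            shown.append(text)  # text outside any fold stays visible
    return shown


DOCUMENT = [
    "Heading 1",
    "{{{",
    "Body",
    "}}}",
    "",
    "Heading 2",
    "{{{",
    "Body",
    "}}}",
]

for line in collapse_all(DOCUMENT):
    print(line)
```

Nested regions are hidden together with their parent, so only the outermost marker of each section appears in the outline.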

Software with code folding capability

One of the earliest folding editors was STET, an editor written for the VM/CMS operating system in 1977 by Mike Cowlishaw. STET is a text editor (for documentation, programs, etc.) which folds files on the basis of blocks of lines; any block of lines can be folded and replaced by a name line (which in turn can be part of a block which itself can then be folded).

A folding editor appeared in the occam IDE circa 1983, which was called the Inmos Transputer Development System (TDS). [7] [8] The "f" editor (in list below) is probably the most intact legacy from this work.

The Macintosh computer historically had a number of source code editors that "folded" portions of code via "disclosure triangles". The UserLand Software product Frontier is a scripting environment that has this capability. [9]

Folding is provided by many modern text editors, and syntax-based or semantics-based folding is now a component of many software development environments. Editors include:

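| Name | Token | Indentation | Syntax | User |
|---|---|---|---|---|
| ABAP Editor | Yes | ? | Yes | ? |
| AkelPad | ? | ? | Yes | ? |
| Anjuta IDE | ? | Yes | Yes | ? |
| Atom [note 1] | ? | Yes | ? | Yes |
| BBEdit | ? | ? | Yes | ? |
| Brackets | Plug-in | Yes | Yes | No |
| Codeanywhere | Yes | Yes | Yes | ? |
| Codenvy | Yes | Yes | Yes | ? |
| Code::Blocks IDE | Yes | Yes | Yes | Yes |
| Cubic IDE | Yes | Yes | Yes | Yes |
| CudaText | ? | ? | ? | ? |
| Delphi IDE | Yes | ? | Yes | ? |
| Dreamweaver | ? | ? | ? | Yes |
| Eclipse | ? | ? | Yes | ? |
| EditPlus | No | Yes | No | No |
| Emacs | Yes [note 2] | ? [note 3] | Yes [note 4] | Yes [note 5] |
| EmEditor Professional | ? | Yes | Yes | ? |
| FlashDevelop IDE | ? | ? | Yes | ? |
| Geany | ? | Yes | Yes | ? |
| gedit | Yes | Yes | Yes | ? |
| ISPF | ? | Yes | ? | Yes |
| JED | Yes | Yes [note 6] | ? | No |
| jEdit | Yes | Yes | Yes | Yes |
| Kate | Yes | Yes | Yes | Yes |
| MATLAB | No | No | Yes | No |
| MS Visual Studio | Yes | Yes | Yes | Yes |
| NetBeans IDE | Yes | Yes | Yes | Yes |
| Notepad++ | ? | Yes | Yes | Yes |
| NuSphere PHPEd | ? | ? | Yes | Yes |
| Qt Creator | ? | ? | Yes | ? |
| SciTE | Yes | Yes | Yes | ? |
| STET [note 7] | ? | ? | ? | ? |
| TextMate | Yes | Yes | Yes | Yes |
| UltraEdit | No | No | Yes | Yes |
| Vim | Yes | Yes | Yes | Yes |
| Visual Expert | ? | ? | Yes | ? |
| Visual Studio Code | Yes | Yes | Yes | No |
| Xcode | Yes | Yes | Yes | Yes |
| Zend Studio | ? | ? | ? | ? |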


Other editors

See also

Notes

  1. http://flight-manual.atom.io/using-atom/sections/folding/
  2. Token-based folding is implemented by the folding minor mode. One can also use outline and allout minor modes for sectioning program sources.
  3. One can use the set-selective-display function in Emacs to hide lines based on the indentation level, as suggested in the Universal code folding note.
  4. Syntax-dependent folding is supported by the outline and allout modes for special dedicated outline-syntaxes; by the hideshow minor mode for some programming languages; also, by the semantic-tag-folding minor mode and the senator-fold-tag command for syntaxes supported by semantic (a component of CEDET), as well as by doc-mode for JavaDoc or Doxygen comments, by TeX-fold-mode , sgml-fold-element command, nxml-outln library in the corresponding language-specific modes, and possibly in other modes for particular syntaxes. Sometimes, the standard simple outline minor mode is used to simulate syntax-based folding, cf. the use of it in properly indented Emacs Lisp source code, the use of it (see near the end of the page) for properly indented HTML. Several folding mechanisms are unified by the fold-dwim interface. See also CategoryHideStuff.
  5. Folding of user-selected regions in Emacs is implemented by the hide-region-hide command.
  6. The set_selective_display function may be used to hide lines indented beyond a specified amount.
  7. STET may have been the first text editor that supported folding. [citation needed]

References

  1. Simon Gauvin, Omid Banyasad, "Transparency, holophrasting, and automatic layout applied to control structures for visual dataflow programming languages", in Proceedings of the 2006 ACM symposium on Software visualization, p. 67–75
  2. "Data folding in HxD hex editor (listed as feature of RAM-Editor)". Retrieved 2007-04-30.
  3. The Mother of All Demos, presented by Douglas Engelbart (1968), retrieved 2019-12-29.
  4. "History of ISPF" . Retrieved 2015-10-27.
  5. Atwood 2008.
  6. Post #31, Rob, July 2008
  7. North American Transputer Users Group. Conference (2nd : 1989 : Durham, N.C.) (1990). Transputer research and applications, 2 : NATUG-2, proceedings of the Second Conference of the North American Transputer Users Group, October 18-19, 1989, Durham, NC. Board, John A., Duke University. Amsterdam: IOS Press. p. 85. ISBN 9051990278. OCLC 35478471.
  8. Cormie, David (1986). "INMOS Technical Note 03 - Getting started with the TDS" (PDF). transputer.net. Retrieved 2019-07-19.
  9. "Outliners.com". Archived from the original on 2006-12-23. Retrieved 2006-12-27.
  10. LEXX A programmable structured editor IBM Journal of Research and Development, Vol 31, No. 1, 1987, IBM Reprint order number G322-0151