Text editor

Editors like Leafpad, shown here, are often included with operating systems as a default helper application for opening text files.

A text editor is a type of computer program that edits plain text. An example of such a program is "notepad" software (e.g. Windows Notepad). [1] [2] [3] Text editors are provided with operating systems and software development packages, and can be used to change files such as configuration files, documentation files and programming language source code. [4]

Plain text and rich text

There are important differences between plain text (created and edited by text editors) and rich text (such as that created by word processors or desktop publishing software).

Plain text consists exclusively of character representations. Each character is represented by a fixed-length sequence of one, two, or four bytes, or by a variable-length sequence of one to four bytes, in accordance with a specific character encoding convention such as ASCII, ISO/IEC 2022, Shift JIS, UTF-8, or UTF-16. These conventions define many printable characters, as well as non-printing characters that control the flow of the text, such as space, line break, and page break. Plain text contains no other information about the text itself, not even the character encoding convention employed. Plain text is stored in text files, although text files do not exclusively store plain text. Since the early days of computers, plain text has generally been displayed (once by necessity and now by convention) using a monospace font, so horizontal alignment and columnar formatting were sometimes done using whitespace characters.
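The difference between fixed-length and variable-length encodings, and the fact that the bytes themselves do not record which convention produced them, can be illustrated with a short Python sketch. The sample characters and the helper name `encoded_lengths` are chosen here for illustration only.

```python
# Compare how many bytes the same character occupies under three encodings.
ENCODINGS = ["utf-8", "utf-16-le", "utf-32-le"]

def encoded_lengths(ch):
    """Return a mapping from encoding name to the encoded byte length of ch."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}

for ch in ["A", "é", "€", "😀"]:
    print(repr(ch), encoded_lengths(ch))

# The bytes carry no label naming their encoding, so decoding them with
# the wrong convention silently yields different characters.
raw = "é".encode("utf-8")
print(raw.decode("utf-8"))    # the intended character
print(raw.decode("latin-1"))  # two unrelated characters
```

UTF-32 uses four bytes for every character, while UTF-8 uses between one and four depending on the code point. The last two lines show why an editor has to guess or be told the encoding of a plain text file.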

Rich text, on the other hand, may contain metadata, character formatting data (e.g. typeface, size, weight and style), paragraph formatting data (e.g. indentation, alignment, letter and word distribution, and space between lines or other paragraphs), and page specification data (e.g. size, margin and reading direction). Rich text can be very complex. Rich text can be saved in binary format (e.g. DOC), text files adhering to a markup language (e.g. RTF or HTML), or in a hybrid form of both (e.g. Office Open XML).

Text editors are intended to open and save text files containing either plain text or anything that can be interpreted as plain text, including the markup for rich text or the markup for something else (e.g. SVG).

History

A box of punched cards with several program decks.

Before text editors existed, computer text was punched into cards with keypunch machines. [5] Physical boxes of these thin cardboard cards were then inserted into a card reader. Magnetic tape, drum and disk card image files created from such card decks often had no line-separation characters at all, and assumed fixed-length [a] 80- or 90-character [6] records. [7] An alternative to cards was punched tape. It could be created by some teleprinters (such as the Teletype), which used special characters to indicate ends of records. [8] Some early operating systems included batch text editors, either integrated with language processors or as separate utility programs; one early example was the ability to edit SQUOZE source files for SCAT [9] in the SHARE Operating System.

The first interactive text editors were "line editors" oriented to teleprinter- or typewriter-style terminals without displays. Commands (often a single keystroke) effected edits to a file at an imaginary insertion point called the "cursor". Edits were verified by typing a command to print a small section of the file, and periodically by printing the entire file. In some line editors, the cursor could be moved by commands that specified the line number in the file, text strings (context) for which to search, and eventually regular expressions. Line editors were major improvements over keypunching. Some line editors could be used by keypunch; editing commands could be taken from a deck of cards and applied to a specified file. Some common line editors supported a "verify" mode in which change commands displayed the altered lines.
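The command-driven style of a line editor can be sketched in a few lines of Python. This is a toy model only: the class name `LineEditor` and its one-letter commands are invented for this illustration and do not reproduce the syntax of any historical editor.

```python
class LineEditor:
    """Toy line editor; the one-letter commands are invented for this sketch."""

    def __init__(self, lines):
        self.lines = list(lines)
        self.cur = 0  # index of the current line (the "cursor")

    def run(self, command):
        """Apply one command and return the lines it prints, if any."""
        op, arg = command[:1], command[1:].strip()
        if op == "g":                      # g N: go to line N (1-based)
            self.cur = max(0, min(int(arg) - 1, len(self.lines) - 1))
        elif op == "/":                    # /text: search forward for text
            for i in range(self.cur + 1, len(self.lines)):
                if arg in self.lines[i]:
                    self.cur = i
                    break
        elif op == "i":                    # i text: insert a line before the cursor
            self.lines.insert(self.cur, arg)
        elif op == "d":                    # d: delete the current line
            if self.lines:
                del self.lines[self.cur]
                self.cur = min(self.cur, max(len(self.lines) - 1, 0))
        elif op == "p":                    # p N: print N lines from the cursor
            return self.lines[self.cur:self.cur + int(arg or 1)]
        return []


editor = LineEditor(["alpha", "beta", "gamma"])
# A list of commands plays the role of a deck of editing cards.
for command in ["g2", "i inserted", "p2"]:
    for line in editor.run(command):
        print(line)
```

Because the user cannot see the file, the print command is the only way to verify an edit, which is the workflow the paragraph above describes.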

When computer terminals with video screens became available, screen-based text editors (sometimes called just "screen editors") became common. One of the earliest full-screen editors was O26, which was written for the operator console of the CDC 6000 series computers in 1967. Another early full-screen editor was vi. Written in the 1970s, it is still a standard editor [10] on Unix and Linux operating systems. Also written in the 1970s was the UCSD Pascal Screen Oriented Editor, which was optimized both for indented source code and general text. [11] Emacs, one of the first free and open-source software projects, is another early full-screen or real-time editor, one that was ported to many systems. [12] The 1977 Commodore PET was the first mass-market computer to feature a full-screen editor. A full-screen editor's ease of use and speed (compared to the line-based editors) motivated many early purchases of video terminals. [13]

The core data structure in a text editor is the one that manages the string (sequence of characters) or list of records that represents the current state of the file being edited. While the former could be stored in a single long consecutive array of characters, the desire for text editors that could more quickly insert text, delete text, and undo/redo previous edits led to the development of more complicated sequence data structures. [14] A typical text editor uses a gap buffer, a linked list of lines (as in PaperClip), a piece table, or a rope, as its sequence data structure.
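As an illustration of one of these structures, the following is a minimal gap buffer sketch in Python. It is not taken from any particular editor; the class and method names are assumptions made for this example. The idea is to keep unused space (the gap) at the cursor, so that typing fills the gap without shifting the rest of the text.

```python
class GapBuffer:
    """Minimal gap buffer: one list of characters with a movable gap at the cursor."""

    def __init__(self, text="", capacity=16):
        self.buf = list(text) + [None] * max(capacity, 1)
        self.gap_start = len(text)      # cursor position, first slot of the gap
        self.gap_end = len(self.buf)    # one past the last slot of the gap

    def __len__(self):
        return len(self.buf) - (self.gap_end - self.gap_start)

    def _grow(self):
        # Enlarge the gap in place when it has been used up.
        extra = max(len(self.buf), 1)
        self.buf[self.gap_end:self.gap_end] = [None] * extra
        self.gap_end += extra

    def move_cursor(self, pos):
        pos = max(0, min(pos, len(self)))
        while self.gap_start > pos:     # gap moves left: carry a character across it
            self.gap_start -= 1
            self.gap_end -= 1
            self.buf[self.gap_end] = self.buf[self.gap_start]
        while self.gap_start < pos:     # gap moves right
            self.buf[self.gap_start] = self.buf[self.gap_end]
            self.gap_start += 1
            self.gap_end += 1

    def insert(self, s):
        for ch in s:
            if self.gap_start == self.gap_end:
                self._grow()
            self.buf[self.gap_start] = ch
            self.gap_start += 1

    def delete_before(self, n=1):
        # Backspace: widening the gap to the left discards characters.
        self.gap_start = max(0, self.gap_start - n)

    def text(self):
        return "".join(self.buf[:self.gap_start] + self.buf[self.gap_end:])


gb = GapBuffer("hello world")
gb.move_cursor(5)
gb.insert(",")
print(gb.text())
```

Insertion and deletion at the cursor touch only the gap boundaries; the cost of copying characters is paid when the cursor moves, and is proportional to the distance moved. A piece table or rope makes different trade-offs and would need a different sketch.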

Types of text editors

Emacs, a text editor popular among programmers, running on Microsoft Windows
gedit is a text editor shipped with GNOME

Some text editors are small and simple, while others offer broad and complex functions. For example, Unix and Unix-like operating systems have the pico editor (or a variant), but many also include the vi and Emacs editors. Microsoft Windows systems come with the simple Notepad, though many people, especially programmers, prefer other editors with more features. Under Apple Macintosh's classic Mac OS there was the native TeachText, replaced by SimpleText in 1994, which was in turn replaced in Mac OS X by TextEdit. TextEdit combines features of a text editor with those typical of a word processor, such as rulers, margins and multiple font selection. These features are not available simultaneously, but must be switched by user command, or through the program automatically determining the file type.

Most word processors can read and write files in plain text format, allowing them to open files saved from text editors. Saving these files from a word processor, however, requires ensuring the file is written in plain text format, and that any text encoding or BOM settings will not obscure the file for its intended use. Non-WYSIWYG word processors, such as WordStar, are more easily pressed into service as text editors, and in fact were commonly used as such during the 1980s. The default file format of these word processors often resembles a markup language, with the basic format being plain text and visual formatting achieved using non-printing control characters or escape sequences. Later word processors like Microsoft Word store their files in a binary format and are almost never used to edit plain text files. [15]
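The byte order mark (BOM) concern can be made concrete with a short Python sketch. It uses the standard `codecs.BOM_UTF8` constant and the `utf-8-sig` codec; the helper `strip_utf8_bom` and the sample shell-script line are assumptions chosen for this illustration.

```python
import codecs

def strip_utf8_bom(data):
    """Return data without a leading UTF-8 byte order mark, if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data

# Simulate a program that saves "plain text" with a BOM in front.
saved = "#!/bin/sh\n".encode("utf-8-sig")

print(saved[:3])                      # the three BOM bytes precede the text
print(saved.startswith(b"#!"))        # a tool expecting "#!" first is misled
print(strip_utf8_bom(saved))          # the text as a BOM-free editor would save it
```

A file that must begin with specific bytes, such as a script whose first two bytes are expected to be `#!`, is one case where an unexpected BOM can interfere with the file's intended use.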

Some text editors can edit unusually large files such as log files or an entire database placed in a single file. Simpler text editors may just read files into the computer's main memory. With larger files, this may be a slow process, and the entire file may not fit. Some text editors do not let the user start editing until this read-in is complete. Editing performance also often suffers in nonspecialized editors, with the editor taking seconds or even minutes to respond to keystrokes or navigation commands. Specialized editors have optimizations such as only storing the visible portion of large files in memory, improving editing performance.
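One way to avoid holding a whole file in memory can be sketched as follows: scan the file once in fixed-size chunks to record where each line starts, then read only the lines currently visible. This is a simplified illustration under stated assumptions, not the method of any specific editor; the function names, the chunk size, and the use of UTF-8 with `\n` line breaks are all choices made for this example.

```python
def index_line_offsets(path, chunk_size=1 << 16):
    """Return the byte offset at which each line starts, scanning in chunks."""
    offsets = [0]
    pos = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            start = 0
            while True:
                nl = chunk.find(b"\n", start)
                if nl == -1:
                    break
                offsets.append(pos + nl + 1)
                start = nl + 1
            pos += len(chunk)
    # A final newline does not start another line.
    if len(offsets) > 1 and offsets[-1] == pos:
        offsets.pop()
    return offsets


def read_window(path, offsets, first_line, count):
    """Read up to count lines starting at the 0-based index first_line."""
    if first_line >= len(offsets):
        return []
    lines = []
    with open(path, "rb") as f:
        f.seek(offsets[first_line])
        for _ in range(count):
            raw = f.readline()
            if not raw:
                break
            lines.append(raw.rstrip(b"\r\n").decode("utf-8", errors="replace"))
    return lines
```

With the index built, scrolling to line 500000 of a log file becomes a single seek followed by a read of one screenful, for example `read_window(path, offsets, 500000, 40)`. The index itself still grows with the number of lines, and edits that change line lengths would require updating it, which this sketch does not attempt.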

Some editors are programmable, meaning that they can be customized for specific uses. With a programmable editor it is easy to automate repetitive tasks, add new functionality, or even implement a new application within the framework of the editor. One common motive for customizing is to make a text editor use the commands of another text editor with which the user is more familiar, or to duplicate missing functionality the user has come to depend on. Software developers often use editor customizations tailored to the programming language or development environment they are working in. The programmability of some text editors is limited to enhancing the core editing functionality of the program, but Emacs can be extended far beyond editing text files (for web browsing, reading email, online chat, managing files or playing games) and is often thought of as a Lisp execution environment with a text user interface. Emacs can even be programmed to emulate vi, its rival in the traditional editor wars of Unix culture. [16] [17]

An important group of programmable editors uses REXX [b] as a scripting language. These "orthodox editors" contain a "command line" into which commands and macros can be typed and text lines into which line commands [c] and macros can be typed. Most such editors are derivatives of ISPF/PDF EDIT or of XEDIT, IBM's flagship editor for VM/SP through z/VM. Among them are THE, KEDIT, X2, Uni-edit, and SEDIT.

A text editor written or customized for a specific use can determine what the user is editing and assist the user, often by completing programming terms and showing tooltips with relevant documentation. Many text editors for software developers include source code syntax highlighting and automatic indentation to make programs easier to read and write. Programming editors often let the user select the name of an include file, function or variable, then jump to its definition. Some also allow for easy navigation back to the original section of code by storing the initial cursor location or by displaying the requested definition in a popup window or temporary buffer. Some editors implement this ability themselves, but often an auxiliary utility like ctags is used to locate the definitions.
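The jump-to-definition feature rests on an index that maps names to the places where they are defined. The sketch below builds such an index for Python-style `def` and `class` lines using a regular expression. It only illustrates the idea: it does not read or write the ctags file format, and the function names and sample sources are assumptions made for this example.

```python
import re

# Matches a Python-style definition such as "def name" or "class Name".
DEF_RE = re.compile(r"^\s*(?:def|class)\s+([A-Za-z_]\w*)")

def build_tag_index(sources):
    """Map each defined name to a list of (filename, line number) pairs.

    sources maps a filename to that file's full text.
    """
    index = {}
    for filename, text in sources.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            match = DEF_RE.match(line)
            if match:
                index.setdefault(match.group(1), []).append((filename, lineno))
    return index

def jump_to_definition(index, name):
    """Return the first recorded location of name, or None if it is unknown."""
    hits = index.get(name)
    return hits[0] if hits else None


sources = {
    "a.py": "import os\n\ndef load(path):\n    pass\n",
    "b.py": "class Buffer:\n    def load(self):\n        pass\n",
}
index = build_tag_index(sources)
print(jump_to_definition(index, "Buffer"))
print(index["load"])
```

An editor that remembers the cursor position before the jump can offer the return navigation described above. A name defined in several places, like `load` here, is why real tools usually present a list of candidates rather than a single target.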

Typical features

Advanced features

Specialized editors

Some editors include special features and extra functions, for instance,

Programmable editors can usually be enhanced to perform any or all of these functions, but simpler editors focus on just one, or, like gPHPedit, are targeted at a single programming language.

See also

Notes

  a. By the late 1960s editors were available that supported variable-length records.
  b. Originally macros were written in assembler, CLIST (TSO), CMS EXEC (VM), EXEC2 (VM/SE) or PL/I, but most users dropped CLIST, EXEC and EXEC2 once REXX was available.
  c. A line command is a command typed into the sequence number entry area associated with a specific line of text and whose scope is limited to that line, or, in the case of a block command, associated with the block of lines between the beginning and ending line commands. An example of the latter would be typing the command ucc (block upper case) into the entry areas of two lines; this has the same effect as typing uc (upper case) into the entry area of each line in the range.

References

  1. H. Albert Napier; Ollie N. Rivers; Stuart Wagner (2005). Creating a Winning E-Business. Cengage Learning. p. 330. ISBN 1111796092.
  2. Peter Norton; Scott H. Clark (2002). Peter Norton's New Inside the PC. Sams Publishing. p. 54. ISBN 0672322897.
  3. L. Gopalakrishnan; G. Padmanabhan; Sudhat Shukla (2003). Your Home PC: Making the Most of Your Personal Computer. Tata McGraw-Hill Education. p. 190. ISBN 0070473544.
  4. "The Best Free Text Editors for Windows, Linux, and Mac". 28 April 2012. Every operating system comes with a default, basic text editor, but most of us install our own enhanced text editors to get more features.
  5. Louden, Kenneth C.; Lambert, Kenneth A. (2011-01-26). Programming Languages: Principles and Practices. Cengage Learning. p. 5. ISBN 978-1-133-38749-7.
  6. "UNIVAC 90-COLUMN PUNCHED 'CARD-TO-MAGNETIC TAPE CONVERTER" (PDF). UNIVAC II Data Automation System. Remington-Rand Univac Division of Sperry Rand Corporation. 1957. p. 246. Retrieved December 16, 2022.
  7. Alavudeen, A.; Venkateshwaran, N. (2008-08-18). Computer Integrated Manufacturing. PHI Learning Pvt. Ltd. p. 180. ISBN 978-81-203-3345-1.
  8. Upton, Eben; Duntemann, Jeffrey; Roberts, Ralph; Mamtora, Tim; Everard, Ben (2016-08-22). Learning Computer Architecture with Raspberry Pi. John Wiley & Sons. pp. 232–234. ISBN 978-1-119-18394-5.
  9. "Modify and Load" (PDF). SOS Reference Manual. IBM. November 1959 [Distribution No.1 published in 1959]. p. 05.01.01. Retrieved December 15, 2022.
  10. "The Open Group Base Specifications Issue 6, IEEE Std 1003.1, 2004 Edition". The IEEE and The Open Group. 2004. Retrieved January 18, 2010.
  11. Bowles, Kenneth L.; Hollan, James (1978-07-01). "An introduction to the UCSD PASCAL system". Behavior Research Methods. 10 (4): 531–534. doi:10.3758/BF03205341.
  12. "Introducing the Emacs editing environment". IBM. Archived from the original on 2014-06-06. Retrieved 2014-06-06.
  13. "Multics Emacs: The History, Design and Implementation". Some Multics users purchased these terminals ..., using them either as "glass teletypes" or via "local editing."
  14. Charles Crowley. "Data Structures for Text Sequences". Section "Introduction".
  15. "Text Editors for Programmeres - Programming Tools". If you open a .doc file in a text editor, you will notice that most of the file is formatting codes. Text editors, however, do not add formatting codes, which makes it easier to compile your code.
  16. "Vim to Emacs' Evil chaotic migration guide". juanjoalvarez.net. 19 September 2014.
  17. "Gitorious". Archived from the original on 28 May 2015. Retrieved 27 May 2015.
  18. "Searching". Notepad++ User Manual. Retrieved 21 December 2021.
  19. Philipp Acsany. "Choosing the Best Coding Font for Programming". 2023.