Ctags

Ctags (BSD)
Developer(s): Ken Arnold
Operating system: Unix and Unix-like
Platform: Cross-platform
Type: Programming tool (specifically, a code navigation tool)
License: BSD

Ctags is a programming tool that generates an index file (or tag file) of names found in source and header files of various programming languages to aid code comprehension. Depending on the language, functions, variables, class members, macros and so on may be indexed. These tags allow definitions to be quickly and easily located by a text editor, a code search engine, or other utility. Alternatively, there is also an output mode that generates a cross reference file, listing information about various names found in a set of language files in human-readable form.

The original Ctags was introduced in BSD Unix 2.0 [1] [2] and was written by Ken Arnold, with Fortran support by Jim Kleckner and Pascal support by Bill Joy. It is part of the initial release of the Single Unix Specification and of XPG4 (1992).

Editors that support ctags

Tag index files are supported by many source code editors.

Variants of ctags

There are a few other implementations of the ctags program:

Etags

GNU Emacs comes with two ctags utilities, etags and ctags, which are compiled from the same source code. Etags generates a tag table file for Emacs, while the ctags command is used to create a similar table in a format understood by vi. They have different sets of command line options: etags does not recognize and ignores options which only make sense for vi style tag files produced by the ctags command. [6]

Exuberant Ctags

Exuberant Ctags, written and maintained by Darren Hiebert until 2009, [7] was initially distributed with Vim, but became a separate project upon the release of Vim 6. It includes support for Emacs and etags compatibility. [8] [9]

Exuberant Ctags includes support for over 40 programming languages with the ability to add support for even more using regular expressions.
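
The idea of regex-driven tagging can be illustrated with a minimal sketch. This is not Exuberant Ctags' implementation or its option syntax: the pattern table and the function name regex_tags are hypothetical, chosen only to show how a regular expression with one capture group can turn matching source lines into tag entries (name, file and line number separated by tabs).

```python
import re

# Hypothetical pattern table: each regex captures the tag name in group 1.
PATTERNS = [
    re.compile(r"^#define\s+(\w+)"),  # C preprocessor macros
    re.compile(r"^def\s+(\w+)"),      # Python functions
]


def regex_tags(filename, source):
    """Return sorted 'name<TAB>file<TAB>line' entries for lines matching a pattern."""
    tags = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for pattern in PATTERNS:
            match = pattern.match(line)
            if match:
                tags.append(f"{match.group(1)}\t{filename}\t{line_no}")
                break  # at most one tag per line in this sketch
    return sorted(tags)
```

Adding support for another language in this sketch amounts to appending one more pattern to the table.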

Universal Ctags

Universal Ctags is a fork of Exuberant Ctags, with the objective of continuing its development. Several parsers have been rewritten to better support their languages. [10]

Language-specific

Hasktags creates ctags compatible tag files for Haskell source files. [11] It includes support for creating Emacs etags files. [12]

jsctags is a ctags-compatible code indexing solution for JavaScript. [13] It is specialized for JavaScript and uses the CommonJS packaging system. It outperforms Exuberant Ctags for JavaScript code, finding more tags than the latter. [14]

Tags file formats

There are multiple tag file formats. Some of them are described below. In the following, \x## represents the byte with hexadecimal representation ##. Every line ends with a line feed (LF, \n = \x0A).

Ctags and descendants

The original ctags and the Exuberant/Universal descendants have similar file formats: [15]

Ctags

This is the format used by vi and various clones. The tags file is normally named "tags".

The tags file is a list of lines, each line in the format:

{tagname}\t{tagfile}\t{tagaddress}

The fields are specified as follows:

  • {tagname} Any identifier, not containing white space
  • \t Exactly one tab (\x09) character, although many versions of vi can handle any amount of white space.
  • {tagfile} The name of the file where {tagname} is defined, relative to the current directory
  • {tagaddress} An ex mode command that will take the editor to the location of the tag. For POSIX implementations of vi this may only be a search or a line number, providing added security against arbitrary command execution.

The tags file is sorted on the {tagname} field which allows for fast searching of the tags file.
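
The line layout and the benefit of sorting can be sketched in a few lines of Python. This is an illustrative reader, not the code of any particular editor: the names Tag, parse_tag_line and find_tag are hypothetical, and the sketch assumes fields are separated by exactly one tab and that lines are sorted by plain string comparison of the tag name.

```python
from typing import List, NamedTuple, Optional


class Tag(NamedTuple):
    name: str
    file: str
    address: str  # ex command: a line number or a search pattern


def parse_tag_line(line: str) -> Tag:
    """Split one tags line into {tagname}, {tagfile} and {tagaddress}."""
    # Split on the first two tabs only, so a tab inside a search pattern survives.
    name, file, address = line.rstrip("\n").split("\t", 2)
    return Tag(name, file, address)


def tag_name(line: str) -> str:
    """Return the {tagname} field of a tags line."""
    return line.split("\t", 1)[0]


def find_tag(sorted_lines: List[str], name: str) -> Optional[Tag]:
    """Binary-search lines sorted on {tagname}; return the first match or None."""
    lo, hi = 0, len(sorted_lines)
    while lo < hi:
        mid = (lo + hi) // 2
        if tag_name(sorted_lines[mid]) < name:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(sorted_lines) and tag_name(sorted_lines[lo]) == name:
        return parse_tag_line(sorted_lines[lo])
    return None
```

Because the file is sorted, a lookup inspects only a logarithmic number of lines instead of scanning the whole file.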

Extended Ctags

This is the format used by Vim's Exuberant Ctags and Universal Ctags. These programs can generate an original ctags file format or an extended format that attempts to retain backward compatibility.

The extended tags file is a list of lines, each line in the format:

{tagname}\t{tagfile}\t{tagaddress}[;"\t{tagfield...}]

The fields up to and including {tagaddress} are the same as for ctags above.

Optional additional fields are indicated by square brackets ("[...]") and include:

  • ;" semicolon + double quote: Ends the {tagaddress} in a way that looks like the start of a comment to vi or ex.
  • {tagfield} extension fields: tab separated "key:value" pairs for more information.

This format is compatible with non-POSIX vi as the additional data is interpreted as a comment. POSIX implementations of vi must be changed to support it, however. [15]
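
A reader for the extended format might separate the address from the extension fields as follows. This is a minimal sketch under stated assumptions: the function name parse_extended_tag_line is hypothetical, the sketch assumes the first occurrence of ;" followed by a tab ends the address, and it treats a field without a colon as the tag's kind, which is how the extended format permits the "kind" key to be abbreviated.

```python
def parse_extended_tag_line(line):
    """Return (tagname, tagfile, tagaddress, fields) for one extended tags line."""
    name, file, rest = line.rstrip("\n").split("\t", 2)
    # The ;" marker ends the address; everything after it is extension data.
    address, marker, extension = rest.partition(';"\t')
    fields = {}
    if marker:
        for item in extension.split("\t"):
            key, colon, value = item.partition(":")
            if colon:
                fields[key] = value
            else:
                # Assumption: a bare field with no colon names the tag's kind.
                fields["kind"] = item
    elif address.endswith(';"'):
        # A ;" marker with no fields after it: drop it from the address.
        address = address[:-2]
    return name, file, address, fields
```

A line in the original format passes through unchanged, with an empty field dictionary, which reflects the backward compatibility the format aims for.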

Etags

This is the format used by Emacs etags. The tags file is normally named "TAGS".

The etags file consists of multiple sections, one section per input source file. Sections are plain text with several non-printable ASCII characters used for special purposes. These characters are written as \x## escapes below.

A section starts with a two line header (the first two bytes make up a magic number):

\x0c
{src_file},{size_of_tag_definition_data_in_bytes}

The header is followed by tag definitions, one definition per line, with the format:

{tag_definition_text}\x7f{tagname}\x01{line_number},{byte_offset}

{tagname}\x01 can be omitted if the name of the tag can be deduced from the text at the tag definition.

Example

Given a single line test.c source code:

#define CCC(x)

The TAGS (etags) file would look like this:

\x0c
test.c,21
#define CCC(\x7fCCC\x011,0

The tags (ctags) file may look like:

CCC( test.c 1 

or more flexibly using a search:

CCC( test.c /^#define CCC(/ 

References

  1. "2BSD/man/ctags.u". www.tuhs.org. Retrieved 2024-10-21.
  2. "2BSD/src/ctags.sh". www.tuhs.org. Retrieved 2024-10-21.
  3. "Language Server Protocol". CodeLite Documentation. Retrieved 2022-06-12.
  4. "Aminet - dev/C/Ctags.lha".
  5. "Workspace Symbols". Python for Visual Studio Code Docs.
  6. "etags.emacs(1)". Linux General Commands Manual.
  7. "Exuberant Ctags". ctags.sourceforge.net.
  8. "Vim documentation: version6: ctags-gone". Retrieved 2007-04-28.
  9. "etags.ctags(1)". Linux General Commands Manual.
  10. "Universal Ctags Documentation". Retrieved 2018-08-02.
  11. "GHC documentation: Other Haskell utility programs". Retrieved 2010-03-05.
  12. "hasktags: Produces ctags "tags" and etags "TAGS" files for Haskell programs". Retrieved 2010-03-05.
  13. "pcwalton github repository for jsctags". GitHub. Archived from the original on 2010-08-22. Retrieved 2016-09-10.
  14. Patrick Walton (25 May 2010). "Introducing jsctags". Retrieved 2010-05-25.
  15. "Proposal for extended Vi tags file format". Retrieved 2007-06-30.