Coccinelle (software)

Coccinelle
Stable release
1.1.0 [1] / February 25, 2021
Repository
Written in OCaml and Python
Type Static program analysis
License GPLv2
Website coccinelle.gitlabpages.inria.fr/website/

Coccinelle (French for ladybug) is an open-source utility for matching and transforming the source code of programs written in the C programming language.

Utility

Coccinelle was initially used to aid the evolution of the Linux kernel, providing support for changes to library application programming interfaces (APIs) such as renaming a function, adding a function argument whose value depends on the calling context, and reorganizing a data structure.

It can also be used to find defective programming patterns in code (i.e., pieces of code that are erroneous with high probability, such as a possible NULL pointer dereference) without transforming them. In this respect, Coccinelle's role is close to that of static analysis tools. Examples of such use are provided by the applications of the Herodotos tool, which keeps track of warnings generated by Coccinelle. [2] [3]
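As an illustration of the kind of defect meant here, the following C sketch shows a pointer that is dereferenced before it is tested against NULL, together with a corrected ordering. The type `struct device` and the function `device_id` are hypothetical names chosen for this example; they are not taken from Coccinelle or from the Linux kernel.

```c
#include <stddef.h>

/* Hypothetical type, used only for this illustration. */
struct device {
    int id;
};

/*
 * Defective form, shown only as a comment:
 *
 *     int id = dev->id;     dereference happens first
 *     if (dev == NULL)      the check comes too late to help
 *         return -1;
 *     return id;
 *
 * A pattern-matching tool can report such code without changing it.
 */

/* Corrected form: test the pointer before any dereference. */
int device_id(const struct device *dev)
{
    if (dev == NULL)
        return -1;
    return dev->id;
}
```

Calling `device_id(NULL)` in the corrected form returns -1 instead of dereferencing a null pointer.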

Support for Coccinelle is provided by IRILL. Funding for the development has been provided by the Agence Nationale de la Recherche (France), the Danish Research Council for Technology and Production Sciences, and INRIA.

The source code of Coccinelle is licensed under the terms of version 2 of the GNU General Public License (GPL).

Semantic Patch Language

The source code to be matched or replaced is specified using a "semantic patch" syntax based on the patch syntax. [4] A Semantic Patch Language (SmPL) pattern resembles a unified diff with C-like declarations: metavariables are declared between a pair of @@ markers, and lines prefixed with - or + are removed or added. [5] [6]

Example

@@
expression lock, flags;
expression urb;
@@
  spin_lock_irqsave(lock, flags);
  <...
- usb_submit_urb(urb)
+ usb_submit_urb(urb, GFP_ATOMIC)
  ...>
  spin_unlock_irqrestore(lock, flags);

@@
expression urb;
@@
- usb_submit_urb(urb)
+ usb_submit_urb(urb, GFP_KERNEL)
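To make the effect of the semantic patch above concrete, the following C sketch shows how two call sites might look after it has been applied. It is only an approximation that compiles in user space: the macros, the `struct urb` layout, the flag values, and the function names `resubmit_locked` and `submit_unlocked` are hypothetical stand-ins, not the real kernel definitions.

```c
#include <stddef.h>

/* Hypothetical user-space stand-ins for kernel definitions. */
enum gfp { GFP_KERNEL = 1, GFP_ATOMIC = 2 };
typedef int spinlock_t;
struct urb {
    int last_gfp;   /* records the flag passed on submission */
};

#define spin_lock_irqsave(lock, flags) \
    do { (flags) = 0; *(lock) = 1; } while (0)
#define spin_unlock_irqrestore(lock, flags) \
    do { (void)(flags); *(lock) = 0; } while (0)

/* Stub with the new two-argument signature; it only records the flag. */
int usb_submit_urb(struct urb *urb, int mem_flags)
{
    urb->last_gfp = mem_flags;
    return 0;
}

/* Matched by the first rule: the call sits between lock and unlock. */
int resubmit_locked(spinlock_t *lock, struct urb *urb)
{
    unsigned long flags;
    int ret;

    spin_lock_irqsave(lock, flags);
    ret = usb_submit_urb(urb, GFP_ATOMIC);   /* was: usb_submit_urb(urb) */
    spin_unlock_irqrestore(lock, flags);
    return ret;
}

/* Matched by the second rule: any remaining call site. */
int submit_unlocked(struct urb *urb)
{
    return usb_submit_urb(urb, GFP_KERNEL);  /* was: usb_submit_urb(urb) */
}
```

The order of the two rules matters: once the first rule has rewritten the calls made while the spinlock is held, those calls no longer match the one-argument pattern, so the second rule only affects the remaining call sites.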

References

  1. "Coccinelle: A Program Matching and Transformation Tool for Systems Code". coccinelle.gitlabpages.inria.fr. Retrieved 2021-03-09.
  2. Palix, Nicolas; Lawall, Julia; Muller, Gilles (2010). "Tracking code patterns over multiple software versions with Herodotos" (PDF). Proceedings of the 9th International Conference on Aspect-Oriented Software Development. ACM. pp. 169–180. doi:10.1145/1739230.1739250. ISBN 9781605589589. S2CID 1082611.
  3. Nicolas Palix. "Nicolas Palix: Herodotos".
  4. Padioleau, Yoann; Lawall, Julia; Muller, Gilles (2007). "Semantic Patches, Documenting and Automating Collateral Evolutions in Linux Device Drivers" (PDF). coccinelle.gitlabpages.inria.fr. Retrieved 2020-08-29.
  5. Valerie Henson (2009-01-20). "Semantic patching with Coccinelle". Linux Weekly News. Retrieved 2011-04-25.
  6. Wolfram Sang (2010-03-30). "Evolutionary development of a semantic patch using Coccinelle". Linux Weekly News. Retrieved 2011-04-25.