Canonical form

Last updated January 31, 2025

In mathematics and computer science, a canonical, normal, or standard form of a mathematical object is a standard way of presenting that object as a mathematical expression. Often, it is one which provides the simplest representation of an object and allows it to be identified in a unique way. The distinction between "canonical" and "normal" forms varies from subfield to subfield. In most fields, a canonical form specifies a unique representation for every object, while a normal form simply specifies its form, without the requirement of uniqueness.^[1]

The canonical form of a positive integer in decimal representation is a finite sequence of digits that does not begin with zero. More generally, for a class of objects on which an equivalence relation is defined, a canonical form consists in the choice of a specific object in each class. For example:

Jordan normal form is a canonical form for matrix similarity.
The reduced row echelon form is a canonical form, when one considers as equivalent a matrix and its left product by an invertible matrix.
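
The second example can be illustrated with a short sketch. It assumes exact rational arithmetic and uses the reduced row echelon form, which is the variant that is unique; the function name rref and the sample matrices are illustrative choices, not part of any particular library.

```python
from fractions import Fraction


def rref(matrix):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == n_rows:
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next(
            (r for r in range(pivot_row, n_rows) if rows[r][col] != 0), None
        )
        if pivot is None:
            continue
        rows[pivot_row], rows[pivot] = rows[pivot], rows[pivot_row]
        # Scale the pivot row so that its leading entry is 1.
        lead = rows[pivot_row][col]
        rows[pivot_row] = [x / lead for x in rows[pivot_row]]
        # Clear this column in every other row.
        for r in range(n_rows):
            if r != pivot_row and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows


A = [[1, 2, 3], [2, 4, 6]]
# B = P * A for the invertible matrix P = [[2, 1], [1, 1]].
B = [[4, 8, 12], [3, 6, 9]]
print(rref(A) == rref(B))  # the two equivalent matrices share one canonical form
```

Because A and B differ only by left multiplication with an invertible matrix, both reduce to the same rows, so comparing the reduced forms decides equivalence.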

In computer science, and more specifically in computer algebra, when representing mathematical objects in a computer, there are usually many different ways to represent the same object. In this context, a canonical form is a representation such that every object has a unique representation (with canonicalization being the process through which a representation is put into its canonical form).^[2] Thus, the equality of two objects can easily be tested by testing the equality of their canonical forms.
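
As a minimal sketch of this idea, consider rational numbers stored as integer pairs, where many pairs represent the same number. The convention chosen here (lowest terms, positive denominator) and the name canonical_fraction are assumptions made for illustration.

```python
from math import gcd


def canonical_fraction(numerator, denominator):
    """Canonicalize a fraction: lowest terms, with a positive denominator."""
    if denominator == 0:
        raise ZeroDivisionError("denominator must be nonzero")
    g = gcd(numerator, denominator)  # always non-negative
    if denominator < 0:
        g = -g  # dividing by a negative g makes the denominator positive
    return (numerator // g, denominator // g)


# Two different representations of the same rational number...
a = (2, 4)
b = (-3, -6)
# ...are recognized as equal by comparing their canonical forms.
print(canonical_fraction(*a) == canonical_fraction(*b))
```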

Despite this advantage, canonical forms frequently depend on arbitrary choices (like ordering the variables), which introduce difficulties for testing the equality of two objects resulting from independent computations. Therefore, in computer algebra, normal form is a weaker notion: a normal form is a representation such that zero is uniquely represented. This allows testing for equality by putting the difference of two objects in normal form.

Canonical form can also mean a differential form that is defined in a natural (canonical) way.

Definition

Given a set S of objects with an equivalence relation R on S, a canonical form is given by designating some objects of S to be "in canonical form", such that every object under consideration is equivalent to exactly one object in canonical form. In other words, the canonical forms in S represent the equivalence classes, once and only once. To test whether two objects are equivalent, it then suffices to test equality on their canonical forms. A canonical form thus provides a classification theorem and more, in that it not only classifies every class, but also gives a distinguished (canonical) representative for each object in the class.

Formally, a canonicalization with respect to an equivalence relation R on a set S is a mapping c:S→S such that for all s, s₁, s₂ ∈ S:

c(s) = c(c(s)) (idempotence),
s₁Rs₂ if and only if c(s₁) = c(s₂) (decisiveness), and
sRc(s) (representativeness).

The third property (representativeness) is redundant: by decisiveness, sRc(s) holds if and only if c(s) = c(c(s)), which is exactly idempotence.
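
On a finite set these properties can be checked by brute force. The sketch below is illustrative only: the helper is_canonicalization and the example (integers from -10 to 10 under congruence modulo 5) are assumptions chosen for the demonstration.

```python
from itertools import product


def is_canonicalization(S, R, c):
    """Check by exhaustion that c is a canonicalization for R on the finite set S."""
    maps_into_S = all(c(s) in S for s in S)
    idempotent = all(c(c(s)) == c(s) for s in S)
    decisive = all(
        R(s1, s2) == (c(s1) == c(s2)) for s1, s2 in product(S, repeat=2)
    )
    representative = all(R(s, c(s)) for s in S)
    return maps_into_S and idempotent and decisive and representative


S = range(-10, 11)


def congruent_mod_5(a, b):
    return (a - b) % 5 == 0


print(is_canonicalization(S, congruent_mod_5, lambda s: s % 5))  # True
print(is_canonicalization(S, congruent_mod_5, abs))              # False
```

The second call fails because abs sends 1 and -1 to the same value although they are not congruent modulo 5, so decisiveness does not hold.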

In practical terms, it is often advantageous to be able to recognize the canonical forms. There is also a practical, algorithmic question to consider: how to pass from a given object s in S to its canonical form s*? Canonical forms are generally used to make operating with equivalence classes more effective. For example, in modular arithmetic, the canonical form for a residue class is usually taken as the least non-negative integer in it. Operations on classes are carried out by combining these representatives, and then reducing the result to its least non-negative residue. The uniqueness requirement is sometimes relaxed, allowing the forms to be unique up to some finer equivalence relation, such as allowing for reordering of terms (if there is no natural ordering on terms).
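
The modular arithmetic example can be sketched as follows. The function names are hypothetical; the sketch relies on the fact that Python's % operator returns a non-negative result when the modulus is positive.

```python
def residue(a, n):
    """Least non-negative representative of the residue class of a modulo n (n > 0)."""
    return a % n


def add_mod(a, b, n):
    """Add two residue classes by combining representatives, then reducing."""
    return residue(residue(a, n) + residue(b, n), n)


def mul_mod(a, b, n):
    """Multiply two residue classes by combining representatives, then reducing."""
    return residue(residue(a, n) * residue(b, n), n)


print(residue(-3, 7))     # 4, the canonical representative of the class of -3
print(add_mod(5, 6, 7))   # 4
print(mul_mod(5, 6, 7))   # 2
```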

A canonical form may simply be a convention, or a deep theorem. For example, polynomials are conventionally written with the terms in descending powers: it is more usual to write x² + x + 30 than x + 30 + x², although the two forms define the same polynomial. By contrast, the existence of Jordan canonical form for a matrix is a deep theorem.
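
The polynomial convention can be made concrete with a small sketch. It assumes a polynomial in one variable is given as a list of (coefficient, exponent) pairs in arbitrary order; this representation and the name canonical_polynomial are illustrative assumptions.

```python
from collections import defaultdict


def canonical_polynomial(terms):
    """Combine like terms, drop zero terms, and sort by descending exponent."""
    combined = defaultdict(int)
    for coefficient, exponent in terms:
        combined[exponent] += coefficient
    return tuple(
        (coefficient, exponent)
        for exponent, coefficient in sorted(combined.items(), reverse=True)
        if coefficient != 0
    )


p = [(1, 2), (1, 1), (30, 0)]   # x^2 + x + 30
q = [(1, 1), (30, 0), (1, 2)]   # x + 30 + x^2
print(canonical_polynomial(p) == canonical_polynomial(q))  # True
```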

History

According to OED and LSJ, the term canonical stems from the Ancient Greek word kanonikós (κανονικός, "regular, according to rule") from kanṓn (κᾰνών, "rod, rule"). The sense of norm, standard, or archetype has been used in many disciplines. Mathematical usage is attested in a 1738 letter from Logan.^[3] The German term kanonische Form is attested in an 1846 paper by Eisenstein;^[4] later the same year Richelot uses the term Normalform in a paper,^[5] and in 1851 Sylvester writes:^[6]

"I now proceed to [...] the mode of reducing Algebraical Functions to their simplest and most symmetrical, or as my admirable friend M. Hermite well proposes to call them, their Canonical forms."

In the same period, usage is attested by Hesse ("Normalform"),^[7] Hermite ("forme canonique"),^[8] Borchardt ("forme canonique"),^[9] and Cayley ("canonical form").^[10]

In 1865, the Dictionary of Science, Literature and Art defines canonical form as:

"In Mathematics, denotes a form, usually the simplest or most symmetrical, to which, without loss of generality, all functions of the same class can be reduced."

Examples

Note: in this section, "up to" some equivalence relation E means that the canonical form is not unique in general, but that if one object has two different canonical forms, they are E-equivalent.

Large number notation

Standard form is used by many mathematicians and scientists to write extremely large numbers in a more concise and understandable way; the most prominent example is scientific notation.^[11]
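
As a brief illustration, Python's built-in "e" format specifier produces one common machine-readable variant of scientific notation. The helper name to_scientific and the choice of three digits after the decimal point are assumptions for this sketch.

```python
def to_scientific(value, digits=3):
    """Format a number as mantissa and power of ten, e.g. 2.998e+08."""
    return format(float(value), f".{digits}e")


print(to_scientific(299792458))          # 2.998e+08
print(to_scientific(6.02214076e23, 2))   # 6.02e+23
```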

Number theory

Canonical representation of a positive integer
The canonical form of a continued fraction representing a number is the simple continued fraction.
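
For a rational number, the simple continued fraction can be computed by repeatedly taking the integer part and inverting the remainder. The following is a minimal sketch using exact arithmetic; the function name is hypothetical. For a non-integer input the expansion ends in a term greater than 1, which is the usual convention for choosing between the two finite expansions of a rational number.

```python
from fractions import Fraction


def simple_continued_fraction(x):
    """Return the terms [a0, a1, ...] of the simple continued fraction of a rational x."""
    x = Fraction(x)
    terms = []
    while True:
        a = x.numerator // x.denominator  # floor of x
        terms.append(a)
        x -= a
        if x == 0:
            return terms
        x = 1 / x  # invert the fractional part and continue


print(simple_continued_fraction(Fraction(415, 93)))  # [4, 2, 6, 7]
```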

Linear algebra

Objects	A is equivalent to B if:	Normal form	Notes
Normal matrices over the complex numbers	$A=U^{*}BU$ for some unitary matrix U	Diagonal matrices (up to reordering)	This is the Spectral theorem
Matrices over the complex numbers	$A=UBV^{*}$ for some unitary matrices U and V	Diagonal matrices with real non-negative entries (in descending order)	Singular value decomposition
Matrices over an algebraically closed field	$A=P^{-1}BP$ for some invertible matrix P	Jordan normal form (up to reordering of blocks)
Matrices over an algebraically closed field	$A=P^{-1}BP$ for some invertible matrix P	Weyr canonical form (up to reordering of blocks)
Matrices over a field	$A=P^{-1}BP$ for some invertible matrix P	Frobenius normal form
Matrices over a principal ideal domain	$A=P^{-1}BQ$ for some invertible matrices P and Q	Smith normal form	The equivalence is the same as allowing invertible elementary row and column transformations
Matrices over the integers	$A=UB$ for some unimodular matrix U	Hermite normal form
Matrices over the integers modulo n		Howell normal form
Finite-dimensional vector spaces over a field K	A and B are isomorphic as vector spaces	$K^{n}$ , n a non-negative integer

Algebra

Objects	A is equivalent to B if:	Normal form
Finitely generated R-modules with R a principal ideal domain	A and B are isomorphic as R-modules	Primary decomposition (up to reordering) or invariant factor decomposition

Geometry

In analytic geometry:

The equation of a line: Ax + By = C, with A² + B² = 1 and C ≥ 0
The equation of a circle: $(x-h)^{2}+(y-k)^{2}=r^{2}$
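
The normalization of a line equation can be sketched as below. The function name normalize_line is an assumption, and the sketch uses floating-point arithmetic, so results should be compared with a tolerance. When C = 0 the sign of (A, B) is still not determined by these two conditions alone, so an extra convention would be needed in that case.

```python
import math


def normalize_line(a, b, c):
    """Rescale a*x + b*y = c so that a^2 + b^2 = 1 and c >= 0."""
    norm = math.hypot(a, b)
    if norm == 0:
        raise ValueError("a and b must not both be zero")
    if c < 0:
        norm = -norm  # flip all signs so that the right-hand side is non-negative
    return (a / norm, b / norm, c / norm)


# 3x + 4y = -10 and 6x + 8y = -20 describe the same line.
print(normalize_line(3, 4, -10))
print(normalize_line(6, 8, -20))
```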

By contrast, there are alternative forms for writing equations. For example, the equation of a line may be written as a linear equation in point-slope or slope-intercept form.

Convex polyhedra can be put into canonical form such that:

All faces are flat,
All edges are tangent to the unit sphere, and
The centroid of the polyhedron is at the origin.^[12]

Integrable systems

Every differentiable manifold has a cotangent bundle. That bundle can always be endowed with a certain differential form, called the canonical one-form. This form gives the cotangent bundle the structure of a symplectic manifold, and allows vector fields on the manifold to be integrated by means of the Euler-Lagrange equations, or by means of Hamiltonian mechanics. Such systems of integrable differential equations are called integrable systems.

Dynamical systems

The study of dynamical systems overlaps with that of integrable systems; there one also has the idea of a normal form.

Three dimensional geometry

In the study of manifolds in three dimensions, one has the first fundamental form, the second fundamental form and the third fundamental form.

Functional analysis

Objects	A is equivalent to B if:	Normal form
Hilbert spaces	If A and B are both Hilbert spaces of infinite dimension, then A and B are isometrically isomorphic.	$\ell ^{2}(I)$ sequence spaces (up to exchanging the index set I with another index set of the same cardinality)
Commutative C*-algebras with unit	A and B are isomorphic as C*-algebras	The algebra $C(X)$ of continuous functions on a compact Hausdorff space, up to homeomorphism of the base space.

Rewriting systems

The symbolic manipulation of a formula from one form to another is called a "rewriting" of that formula. One can study the abstract properties of rewriting generic formulas, by studying the collection of rules by which formulas can be validly manipulated. These are the "rewriting rules", an integral part of an abstract rewriting system. A common question is whether it is possible to bring some generic expression to a single, common form, the normal form. If different sequences of rewrites still result in the same form, then that form can be termed a normal form, and the rewriting system is called confluent. It is not always possible to obtain a normal form.
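
A toy string rewriting system illustrates the idea. The sketch below assumes a single rule, "ba" → "ab", over the alphabet {a, b}; this system terminates and is confluent, and the normal form of a word is the same letters in sorted order. The function normal_form and the step limit are illustrative choices, and the function always rewrites the leftmost match, so it shows one reduction sequence rather than proving confluence.

```python
def normal_form(word, rules, max_steps=10_000):
    """Apply rewrite rules until none matches; give up after max_steps rewrites."""
    for _ in range(max_steps):
        for lhs, rhs in rules:
            i = word.find(lhs)
            if i != -1:
                # Rewrite the leftmost occurrence of lhs and start over.
                word = word[:i] + rhs + word[i + len(lhs):]
                break
        else:
            return word  # no rule applies: the word is irreducible
    raise RuntimeError("no normal form reached within max_steps")


rules = [("ba", "ab")]
print(normal_form("baba", rules))  # aabb
print(normal_form("abab", rules))  # aabb
```

A rule set such as [("a", "aa")] never terminates, which is why the sketch carries a step limit.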

Lambda calculus

A lambda term is in beta normal form if no beta reduction is possible; lambda calculus is a particular case of an abstract rewriting system. In the untyped lambda calculus, for example, the term $(\lambda x.(xx)\;\lambda x.(xx))$ does not have a normal form. In the typed lambda calculus, every well-formed term can be rewritten to its normal form.
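
To see why this term has no normal form, it helps to carry out one beta reduction step: substituting the argument for x in the body xx reproduces the original term, so reduction never terminates.

```latex
\[
(\lambda x.(xx))\;(\lambda x.(xx))
\;\to_{\beta}\; (xx)[x := \lambda x.(xx)]
\;=\; (\lambda x.(xx))\;(\lambda x.(xx))
\]
```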

Graph theory

In graph theory, a branch of mathematics, graph canonization is the problem of finding a canonical form of a given graph G. A canonical form is a labeled graph Canon(G) that is isomorphic to G, such that every graph that is isomorphic to G has the same canonical form as G. Thus, from a solution to the graph canonization problem, one could also solve the problem of graph isomorphism: to test whether two graphs G and H are isomorphic, compute their canonical forms Canon(G) and Canon(H), and test whether these two canonical forms are identical.
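
For very small graphs the idea can be sketched by brute force: try every relabeling of the vertices and keep the lexicographically smallest edge list. This is only an illustration under stated assumptions (simple undirected graphs on vertices 0..n-1, hypothetical function name canonical_form); it takes factorial time and is not how practical canonization tools work.

```python
from itertools import permutations


def canonical_form(n, edges):
    """Smallest sorted edge list over all relabelings of vertices 0..n-1."""
    best = None
    for perm in permutations(range(n)):
        relabeled = tuple(
            sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges)
        )
        if best is None or relabeled < best:
            best = relabeled
    return best


path_a = [(0, 1), (1, 2)]          # path with middle vertex 1
path_b = [(0, 2), (0, 1)]          # path with middle vertex 0
triangle = [(0, 1), (1, 2), (0, 2)]

print(canonical_form(3, path_a) == canonical_form(3, path_b))    # True: isomorphic
print(canonical_form(3, path_a) == canonical_form(3, triangle))  # False
```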

Computing

In computing, the reduction of data to any kind of canonical form is commonly called data normalization.

For instance, database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency.^[13]

In the field of software security, a common vulnerability is unchecked malicious input (see Code injection). The mitigation for this problem is proper input validation. Before input validation is performed, the input is usually normalized by eliminating encoding (e.g., HTML encoding) and reducing the input data to a single common character set.
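
A minimal sketch of such a normalization step, using only the Python standard library, is shown below. The function name normalize_input and the choice of Unicode normalization form NFKC are assumptions for illustration; a real application would pick the decoding steps and the normalization form that match its own threat model, and would still validate the result afterwards.

```python
import html
import unicodedata


def normalize_input(raw):
    """Decode HTML character references, then apply Unicode NFKC normalization."""
    text = html.unescape(raw)
    # NFKC folds compatibility characters (e.g. fullwidth forms) into
    # their ordinary equivalents, so look-alike encodings compare equal.
    return unicodedata.normalize("NFKC", text)


print(normalize_input("&lt;script&gt;"))          # <script>
print(normalize_input("\uff1cscript\uff1e"))      # <script>
```

Both inputs normalize to the same string, so a single validation rule applied after normalization covers both encodings.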

Other forms of data, typically associated with signal processing (including audio and imaging) or machine learning, can be normalized in order to provide a limited range of values.
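
One common way to obtain a limited range is min-max scaling to the interval [0, 1]. The sketch below is one possible convention among several (the function name and the handling of constant input are assumptions).

```python
def min_max_normalize(values):
    """Linearly rescale a non-empty sequence of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values are equal; map them to 0.0 to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


print(min_max_normalize([2, 4, 6, 10]))  # [0.0, 0.25, 0.5, 1.0]
```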

In content management, the concept of a single source of truth (SSOT) is applicable, just as it is in database normalization generally and in software development. Competent content management systems provide logical ways of obtaining it, such as transclusion.

Notes

[1] On some occasions, the terms "canonical" and "normal" can also be used interchangeably, as in Jordan canonical form and Jordan normal form (see Jordan normal form on MathWorks).
[2] The term 'canonization' is sometimes incorrectly used for this.
[3] Letter from James Logan to William Jones, Correspondence of Scientific Men of the Seventeenth Century. University Press. 1841. ISBN 978-1-02-008678-6.
[4] Journal für die reine und angewandte Mathematik 1846. de Gruyter.
[5] Journal für die reine und angewandte Mathematik 1846. de Gruyter.
[6] The Cambridge and Dublin mathematical journal 1851. Macmillan.
[7] Hesse, Otto (1865). Vorlesungen aus der analytischen Geometrie der geraden Linie, des Punktes und des Kreises in der Ebene (in German). Teubner.
[8] The Cambridge and Dublin mathematical journal 1854. 1854.
[9] Journal für die reine und angewandte Mathematik, 1854. de Gruyter.
[10] Cayley, Arthur (1889). The Collected Mathematical Papers. University. ISBN 978-1-4181-8586-2.
[11] "Big Numbers and Scientific Notation". Teaching Quantitative Literacy. Retrieved 2019-11-20.
[12] Ziegler, Günter M. (1995), Lectures on Polytopes, Graduate Texts in Mathematics, vol. 152, Springer-Verlag, pp. 117–118, ISBN 0-387-94365-X.
[13] "Description of the database normalization basics". support.microsoft.com. Retrieved 2019-11-20.

References

Shilov, Georgi E. (1977), Silverman, Richard A. (ed.), Linear Algebra, Dover, ISBN 0-486-63518-X.
Hansen, Vagn Lundsgaard (2006), Functional Analysis: Entering Hilbert Space, World Scientific Publishing, ISBN 981-256-563-9.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
