In mathematics, a multiset (or bag, or mset) is a modification of the concept of a set that, unlike a set, [1] allows for multiple instances for each of its elements. The number of instances given for each element is called the multiplicity of that element in the multiset. As a consequence, an infinite number of multisets exist that contain only elements a and b, but vary in the multiplicities of their elements: for example {a, b}, in which a and b each have multiplicity 1; {a, a, b}, in which a has multiplicity 2 and b has multiplicity 1; and {a, a, a, b, b, b}, in which a and b both have multiplicity 3.
These objects are all different when viewed as multisets, although they are the same set, since they all consist of the same elements. As with sets, and in contrast to tuples, the order in which elements are listed does not matter in discriminating multisets, so {a, a, b} and {a, b, a} denote the same multiset. To distinguish between sets and multisets, a notation that incorporates square brackets is sometimes used: the multiset {a, a, b} can be denoted by [a, a, b]. [2]
The cardinality or "size" of a multiset is the sum of the multiplicities of all its elements. For example, in the multiset {a, a, b, b, b, c} the multiplicities of the members a, b, and c are respectively 2, 3, and 1, and therefore the cardinality of this multiset is 6.
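As a small illustration of multiplicity and cardinality, the sketch below models a finite multiset with Python's standard `collections.Counter`. This is only one possible concrete representation, and the variable names are chosen for illustration.

```python
from collections import Counter

# Model the multiset {a, a, b, b, b, c}; Counter maps each element to its multiplicity.
bag = Counter(["a", "a", "b", "b", "b", "c"])

multiplicities = dict(bag)        # element -> multiplicity
cardinality = sum(bag.values())   # sum of all multiplicities
support = set(bag)                # the distinct elements

# Listing order is irrelevant: {a, a, b} and {a, b, a} are the same multiset.
same = Counter(["a", "a", "b"]) == Counter(["a", "b", "a"])
```

Here the multiplicities of a, b, and c come out as 2, 3, and 1, and the cardinality as 6, matching the example above.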
Nicolaas Govert de Bruijn coined the word multiset in the 1970s, according to Donald Knuth. [3] : 694 However, the concept of multisets predates the coinage of the word multiset by many centuries. Knuth himself attributes the first study of multisets to the Indian mathematician Bhāskarāchārya, who described permutations of multisets around 1150. Other names have been proposed or used for this concept, including list, bunch, bag, heap, sample, weighted set, collection, and suite. [3] : 694
Wayne Blizard traced multisets back to the very origin of numbers, arguing that "in ancient times, the number n was often represented by a collection of n strokes, tally marks, or units." [4] These and similar collections of objects can be regarded as multisets, because strokes, tally marks, or units are considered indistinguishable. This shows that people implicitly used multisets even before mathematics emerged.
Practical needs for this structure have caused multisets to be rediscovered several times, appearing in literature under different names. [5] : 323 For instance, they were important in early AI languages, such as QA4, where they were referred to as bags, a term attributed to Peter Deutsch. [6] A multiset has also been called an aggregate, heap, bunch, sample, weighted set, occurrence set, and fireset (finitely repeated element set). [5] : 320 [7]
Although multisets were used implicitly from ancient times, their explicit exploration happened much later. The first known study of multisets is attributed to the Indian mathematician Bhāskarāchārya circa 1150, who described permutations of multisets. [3] : 694 The work of Marius Nizolius (1498–1576) contains another early reference to the concept of multisets. [8] Athanasius Kircher found the number of multiset permutations when one element can be repeated. [9] Jean Prestet published a general rule for multiset permutations in 1675. [10] John Wallis explained this rule in more detail in 1685. [11]
Multisets appeared explicitly in the work of Richard Dedekind. [12] [13]
Other mathematicians formalized multisets and began to study them as precise mathematical structures in the 20th century. For example, Hassler Whitney (1933) described generalized sets ("sets" whose characteristic functions may take any integer value: positive, negative or zero). [5] : 326 [14] : 405 Monro (1987) investigated the category Mul of multisets and their morphisms, defining a multiset as a set with an equivalence relation between elements "of the same sort", and a morphism between multisets as a function that respects sorts. He also introduced a multinumber: a function f(x) from a multiset to the natural numbers, giving the multiplicity of element x in the multiset. Monro argued that the concepts of multiset and multinumber are often mixed indiscriminately, though both are useful. [5] : 327–328 [15]
One of the simplest and most natural examples is the multiset of prime factors of a natural number n. Here the underlying set of elements is the set of prime factors of n. For example, the number 120 has the prime factorization $120 = 2^3 \cdot 3^1 \cdot 5^1$, which gives the multiset {2, 2, 2, 3, 5}.
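The prime-factor multiset can be computed directly. The following is a minimal sketch using trial division; the function name `prime_factor_multiset` is hypothetical and the method is chosen for brevity, not efficiency.

```python
from collections import Counter

def prime_factor_multiset(n: int) -> Counter:
    """Return the multiset of prime factors of n (n >= 1) by trial division."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = Counter()
    p = 2
    while p * p <= n:
        while n % p == 0:      # each division records one more occurrence of p
            factors[p] += 1
            n //= p
        p += 1
    if n > 1:                  # whatever remains is itself a prime factor
        factors[n] += 1
    return factors

factors_120 = prime_factor_multiset(120)
```

For 120 this yields multiplicity 3 for the prime 2 and multiplicity 1 for each of 3 and 5, that is, the multiset {2, 2, 2, 3, 5}.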
A related example is the multiset of solutions of an algebraic equation. A quadratic equation, for example, has two solutions. However, in some cases they are both the same number. Thus the multiset of solutions of the equation could be {3, 5}, or it could be {4, 4}. In the latter case it has a solution of multiplicity 2. More generally, the fundamental theorem of algebra asserts that the complex solutions of a polynomial equation of degree d always form a multiset of cardinality d.
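To make the two quadratic cases concrete, the sketch below collects the solutions of a quadratic equation into a `Counter`. The function name and the two sample equations are assumptions chosen so that the solutions are {3, 5} and {4, 4}.

```python
import cmath
from collections import Counter

def quadratic_solutions(a, b, c):
    """Multiset of complex solutions of a*x^2 + b*x + c = 0 (a != 0)."""
    if a == 0:
        raise ValueError("not a quadratic equation")
    root = cmath.sqrt(b * b - 4 * a * c)
    # Both solutions are always listed, so the cardinality is always 2.
    return Counter([(-b + root) / (2 * a), (-b - root) / (2 * a)])

distinct = quadratic_solutions(1, -8, 15)   # x^2 - 8x + 15 = (x - 3)(x - 5)
repeated = quadratic_solutions(1, -8, 16)   # x^2 - 8x + 16 = (x - 4)^2
```

Because the computation uses floating-point arithmetic, a double root is only recognized as a single element of multiplicity 2 when both computed values coincide exactly, as they do in these examples.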
A special case of the above are the eigenvalues of a matrix, whose multiplicity is usually defined as their multiplicity as roots of the characteristic polynomial. However, two other multiplicities are naturally defined for eigenvalues: their multiplicities as roots of the minimal polynomial, and the geometric multiplicity, which is defined as the dimension of the kernel of A − λI (where λ is an eigenvalue of the matrix A). These three multiplicities define three multisets of eigenvalues, which may be all different: let A be an n × n matrix in Jordan normal form that has a single eigenvalue. Its multiplicity is n, its multiplicity as a root of the minimal polynomial is the size of the largest Jordan block, and its geometric multiplicity is the number of Jordan blocks.
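For a matrix in Jordan normal form with a single eigenvalue, the three multiplicities can be read off from the sizes of the Jordan blocks. The sketch below assumes the matrix is described only by a list of block sizes; the function name is hypothetical.

```python
def eigenvalue_multiplicities(block_sizes):
    """Multiplicities of the single eigenvalue of a Jordan-form matrix,
    given the sizes of its Jordan blocks."""
    return {
        "algebraic": sum(block_sizes),           # root of the characteristic polynomial
        "minimal_polynomial": max(block_sizes),  # size of the largest Jordan block
        "geometric": len(block_sizes),           # number of Jordan blocks
    }

# A 5 x 5 matrix with one block of size 3 and one block of size 2.
example = eigenvalue_multiplicities([3, 2])
```

In this example the three multiplicities are 5, 3, and 2, so the three multisets of eigenvalues are all different.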
A multiset may be formally defined as an ordered pair (A, m) where A is the underlying set of the multiset, formed from its distinct elements, and $m\colon A \to \mathbb{Z}^{+}$ is a function from A to the set of positive integers, giving the multiplicity – that is, the number of occurrences – of the element a in the multiset as the number m(a).
(It is also possible to allow multiplicity 0 or $\infty$, especially when considering submultisets. [16] This article is restricted to finite, positive multiplicities.)
Representing the function m by its graph (the set of ordered pairs $\{(a, m(a)) : a \in A\}$) allows for writing the multiset {a, a, b} as {(a, 2), (b, 1)}, and the multiset {a, b} as {(a, 1), (b, 1)}. This notation is however not commonly used; more compact notations are employed.
If $A = \{a_1, \ldots, a_n\}$ is a finite set, the multiset (A, m) is often represented as
$$\left\{a_1^{m(a_1)}, \ldots, a_n^{m(a_n)}\right\}, \quad \text{sometimes simplified to} \quad a_1^{m(a_1)} \cdots a_n^{m(a_n)},$$
where upper indices equal to 1 are omitted. For example, the multiset {a, a, b} may be written $\{a^2, b\}$ or $a^2 b$. If the elements of the multiset are numbers, a confusion is possible with ordinary arithmetic operations; those normally can be excluded from the context. On the other hand, the latter notation is coherent with the fact that the prime factorization of a positive integer is a uniquely defined multiset, as asserted by the fundamental theorem of arithmetic. Also, a monomial is a multiset of indeterminates; for example, the monomial $x^3 y^2$ corresponds to the multiset {x, x, x, y, y}.
A multiset corresponds to an ordinary set if the multiplicity of every element is 1. An indexed family $(a_i)_{i \in I}$, where i varies over some index set I, may define a multiset, sometimes written $\{a_i\}$. In this view the underlying set of the multiset is given by the image of the family, and the multiplicity of any element x is the number of index values i such that $a_i = x$. In this article the multiplicities are considered to be finite, so that no element occurs infinitely many times in the family; even in an infinite multiset, the multiplicities are finite numbers.
It is possible to extend the definition of a multiset by allowing multiplicities of individual elements to be infinite cardinals instead of positive integers, but not all properties carry over to this generalization.
Elements of a multiset are generally taken in a fixed set U, sometimes called a universe, which is often the set of natural numbers. An element of U that does not belong to a given multiset is said to have a multiplicity 0 in this multiset. This extends the multiplicity function of the multiset to a function from U to the set of non-negative integers. This defines a one-to-one correspondence between these functions and the multisets that have their elements in U.
This extended multiplicity function is commonly called simply the multiplicity function, and suffices for defining multisets when the universe containing the elements has been fixed. This multiplicity function is a generalization of the indicator function of a subset, and shares some properties with it.
The support of a multiset in a universe U is the underlying set of the multiset. Using the multiplicity function m, it is characterized as
$$\operatorname{Supp}(m) := \{x \in U \mid m(x) > 0\}.$$
A multiset is finite if its support is finite, or, equivalently, if its cardinality is finite. The empty multiset is the unique multiset with an empty support (underlying set), and thus a cardinality 0.
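The usual operations of sets may be extended to multisets by using the multiplicity function, in a similar way to using the indicator function for subsets. In the following, A and B are multisets in a given universe U, with multiplicity functions $m_A$ and $m_B$.
Inclusion: A is included in B, denoted A ⊆ B, if $m_A(x) \le m_B(x)$ for all x in U.
Union: the union of A and B is the multiset C with multiplicity function $m_C(x) = \max(m_A(x), m_B(x))$ for all x in U.
Intersection: the intersection of A and B is the multiset C with multiplicity function $m_C(x) = \min(m_A(x), m_B(x))$ for all x in U.
Sum: the sum of A and B is the multiset C with multiplicity function $m_C(x) = m_A(x) + m_B(x)$ for all x in U.
Difference: the difference of A and B is the multiset C with multiplicity function $m_C(x) = \max(m_A(x) - m_B(x), 0)$ for all x in U.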
Two multisets are disjoint if their supports are disjoint sets. This is equivalent to saying that their intersection is the empty multiset or that their sum equals their union.
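These operations can be tried out with Python's `collections.Counter`, whose operators `|`, `&`, `+` and `-` act on multiplicities as the elementwise maximum, minimum, sum, and truncated difference. The sketch below is illustrative only; the helper names `is_submultiset` and `are_disjoint` are hypothetical.

```python
from collections import Counter

A = Counter({"a": 2, "b": 1})          # {a, a, b}
B = Counter({"a": 1, "b": 3, "c": 1})  # {a, b, b, b, c}

union = A | B          # multiplicity max(m_A(x), m_B(x))
intersection = A & B   # multiplicity min(m_A(x), m_B(x))
total = A + B          # multiplicity m_A(x) + m_B(x)
difference = A - B     # multiplicity max(m_A(x) - m_B(x), 0)

def is_submultiset(X: Counter, Y: Counter) -> bool:
    """Inclusion: every multiplicity in X is at most the corresponding one in Y."""
    return all(Y[x] >= m for x, m in X.items())

def are_disjoint(X: Counter, Y: Counter) -> bool:
    """Disjointness: the intersection is the empty multiset."""
    return not (X & Y)

# For disjoint multisets the sum and the union coincide.
C, D = Counter("xx"), Counter("y")
sum_equals_union = (C + D) == (C | D)
```

In this example A and B are not disjoint, so their sum {a, a, a, b, b, b, b, c} differs from their union {a, a, b, b, b, c}.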
There is an inclusion–exclusion principle for finite multisets (similar to the one for sets), stating that a finite union of finite multisets is the difference of two sums of multisets: in the first sum we consider all possible intersections of an odd number of the given multisets, while in the second sum we consider all possible intersections of an even number of the given multisets.[ citation needed ]
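The statement can be checked numerically on small examples. The sketch below, which again uses `collections.Counter` as an assumed model of finite multisets, builds the two sums of intersections and compares their difference with the union computed directly; the function names are hypothetical.

```python
from collections import Counter
from functools import reduce
from itertools import combinations

def intersections_sum(multisets, size):
    """Multiset sum of all intersections of `size` of the given multisets."""
    total = Counter()
    for group in combinations(multisets, size):
        total += reduce(lambda x, y: x & y, group)
    return total

def union_by_inclusion_exclusion(multisets):
    """Union as (sum over odd-sized intersections) minus (sum over even-sized ones)."""
    odd, even = Counter(), Counter()
    for size in range(1, len(multisets) + 1):
        if size % 2:
            odd += intersections_sum(multisets, size)
        else:
            even += intersections_sum(multisets, size)
    return odd - even

family = [Counter("aab"), Counter("abbbc"), Counter("acc")]
direct_union = reduce(lambda x, y: x | y, family)
via_principle = union_by_inclusion_exclusion(family)
```

For this family both computations give the multiset {a, a, b, b, b, c, c}. A check on examples is of course not a proof of the principle.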
The number of multisets of cardinality k, with elements taken from a finite set of cardinality n, is sometimes called the multiset coefficient or multiset number. This number is written by some authors as $\left(\!\!\binom{n}{k}\!\!\right)$, a notation that is meant to resemble that of binomial coefficients; it is used for instance in (Stanley, 1997), and could be pronounced "n multichoose k" to resemble "n choose k" for $\binom{n}{k}$. Like the binomial distribution that involves binomial coefficients, there is a negative binomial distribution in which the multiset coefficients occur. Multiset coefficients should not be confused with the unrelated multinomial coefficients that occur in the multinomial theorem.
The value of multiset coefficients can be given explicitly as
$$\left(\!\!\binom{n}{k}\!\!\right) = \binom{n+k-1}{k} = \frac{(n+k-1)!}{k!\,(n-1)!} = \frac{n(n+1)(n+2)\cdots(n+k-1)}{k!},$$
where the second expression is as a binomial coefficient; many authors in fact avoid separate notation and just write binomial coefficients. So, the number of such multisets is the same as the number of subsets of cardinality k of a set of cardinality n + k − 1. The analogy with binomial coefficients can be stressed by writing the numerator in the above expression as a rising factorial power to match the expression of binomial coefficients using a falling factorial power:
$$\left(\!\!\binom{n}{k}\!\!\right) = \frac{n^{\overline{k}}}{k!}, \qquad \binom{n}{k} = \frac{n^{\underline{k}}}{k!}.$$
For example, there are 4 multisets of cardinality 3 with elements taken from the set {1, 2} of cardinality 2 (n = 2, k = 3), namely {1, 1, 1}, {1, 1, 2}, {1, 2, 2}, {2, 2, 2}. There are also 4 subsets of cardinality 3 in the set {1, 2, 3, 4} of cardinality 4 (n + k − 1), namely {1, 2, 3}, {1, 2, 4}, {1, 3, 4}, {2, 3, 4}.
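The count in this example can be reproduced by brute-force enumeration. The sketch below uses `itertools.combinations_with_replacement`, which lists each multiset once as a sorted tuple, and compares the count with the binomial formula; the function name `multiset_coefficient` is an assumption, and the case n = 0 is handled separately because the binomial form does not apply there.

```python
from itertools import combinations, combinations_with_replacement
from math import comb

def multiset_coefficient(n: int, k: int) -> int:
    """Number of multisets of cardinality k with elements from an n-element set."""
    if n == 0:
        return 1 if k == 0 else 0   # only the empty multiset exists over the empty set
    return comb(n + k - 1, k)

# Multisets of cardinality 3 over {1, 2}, each listed once as a sorted tuple.
multisets = list(combinations_with_replacement([1, 2], 3))

# Subsets of cardinality 3 of {1, 2, 3, 4}.
subsets = list(combinations([1, 2, 3, 4], 3))
```

Both lists have 4 entries, in agreement with `multiset_coefficient(2, 3)`.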
One simple way to prove the equality of multiset coefficients and binomial coefficients given above involves representing multisets in the following way. First, consider the notation for multisets that would represent {a, a, a, a, a, a, b, b, c, c, c, d, d, d, d, d, d, d} (6 as, 2 bs, 3 cs, 7 ds) in this form:
• • • • • • | • • | • • • | • • • • • • •
This is a multiset of cardinality k = 18 made of elements of a set of cardinality n = 4. The number of characters including both dots and vertical lines used in this notation is 18 + 4 − 1. The number of vertical lines is 4 − 1. The number of multisets of cardinality 18 is then the number of ways to arrange the 4 − 1 vertical lines among the 18 + 4 − 1 characters, and is thus the number of subsets of cardinality 4 − 1 of a set of cardinality 18 + 4 − 1. Equivalently, it is the number of ways to arrange the 18 dots among the 18 + 4 − 1 characters, which is the number of subsets of cardinality 18 of a set of cardinality 18 + 4 − 1. This is thus the value of the multiset coefficient and its equivalencies:
$$\left(\!\!\binom{4}{18}\!\!\right) = \binom{21}{18} = \frac{21!}{18!\,3!} = \binom{21}{3} = \frac{19 \cdot 20 \cdot 21}{1 \cdot 2 \cdot 3} = 1330.$$
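The correspondence between multisets and arrangements of dots and vertical lines can be written out as a pair of inverse functions. This is a minimal sketch in which a multiset over a 4-element set is given by its list of multiplicities; the function names are hypothetical.

```python
from math import comb

def to_dots_and_bars(counts):
    """Encode multiplicities (one per element of the base set) as dots separated by bars."""
    return "|".join("•" * c for c in counts)

def from_dots_and_bars(text):
    """Decode a dots-and-bars string back into the list of multiplicities."""
    return [len(block) for block in text.split("|")]

# 6 as, 2 bs, 3 cs, 7 ds
encoded = to_dots_and_bars([6, 2, 3, 7])
decoded = from_dots_and_bars(encoded)

# Choosing the positions of the 3 bars among the 21 characters fixes the multiset.
number_of_multisets = comb(18 + 4 - 1, 4 - 1)
```

The encoded string has 21 characters, 3 of them bars, and decoding it recovers the multiplicities 6, 2, 3, 7.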
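From the relation between binomial coefficients and multiset coefficients, it follows that the number of multisets of cardinality k in a set of cardinality n can be written
$$\left(\!\!\binom{n}{k}\!\!\right) = (-1)^k \binom{-n}{k}.$$
Additionally,
$$\left(\!\!\binom{n}{k}\!\!\right) = \left(\!\!\binom{k+1}{n-1}\!\!\right).$$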
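A recurrence relation for multiset coefficients may be given as
$$\left(\!\!\binom{n}{k}\!\!\right) = \left(\!\!\binom{n}{k-1}\!\!\right) + \left(\!\!\binom{n-1}{k}\!\!\right) \quad \text{for } n, k > 0$$
with
$$\left(\!\!\binom{n}{0}\!\!\right) = 1 \text{ for } n \ge 0, \qquad \left(\!\!\binom{0}{k}\!\!\right) = 0 \text{ for } k > 0.$$
The above recurrence may be interpreted as follows. Let $[n] := \{1, \ldots, n\}$ be the source set. There is always exactly one (empty) multiset of size 0, and if n = 0 there are no larger multisets, which gives the initial conditions.
Now, consider the case in which n, k > 0. A multiset of cardinality k with elements from [n] might or might not contain any instance of the final element n. If it does appear, then by removing n once, one is left with a multiset of cardinality k − 1 of elements from [n], and every such multiset can arise, which gives a total of $\left(\!\!\binom{n}{k-1}\!\!\right)$ possibilities.
If n does not appear, then our original multiset is equal to a multiset of cardinality k with elements from [n − 1], of which there are $\left(\!\!\binom{n-1}{k}\!\!\right)$.
Thus,
$$\left(\!\!\binom{n}{k}\!\!\right) = \left(\!\!\binom{n}{k-1}\!\!\right) + \left(\!\!\binom{n-1}{k}\!\!\right).$$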
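The recurrence translates directly into a memoized recursive function. The sketch below uses the initial conditions "one multiset of size 0" and "no non-empty multisets over the empty set", and cross-checks the result against the binomial formula; the name `multichoose` is hypothetical.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def multichoose(n: int, k: int) -> int:
    """Multiset coefficient computed from the recurrence and its initial conditions."""
    if k == 0:
        return 1          # exactly one (empty) multiset of size 0
    if n == 0:
        return 0          # no non-empty multisets over the empty set
    # Either the final element n appears at least once, or it does not appear at all.
    return multichoose(n, k - 1) + multichoose(n - 1, k)

# Compare with the closed form C(n + k - 1, k) on a small range with n >= 1.
agrees = all(
    multichoose(n, k) == comb(n + k - 1, k)
    for n in range(1, 8)
    for k in range(0, 8)
)
```

Memoization keeps the number of distinct calls proportional to n·k instead of growing exponentially.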
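The generating function of the multiset coefficients is very simple, being
$$\sum_{d=0}^{\infty} \left(\!\!\binom{n}{d}\!\!\right) t^d = \frac{1}{(1-t)^n}.$$
As multisets are in one-to-one correspondence with monomials, $\left(\!\!\binom{n}{d}\!\!\right)$ is also the number of monomials of degree d in n indeterminates. Thus, the above series is also the Hilbert series of the polynomial ring $k[x_1, \ldots, x_n]$.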
As $\left(\!\!\binom{n}{d}\!\!\right)$ is a polynomial in n, it and the generating function are well defined for any complex value of n.
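The generating function identity can be checked for small n by expanding $1/(1-t)^n$ as a truncated power series. The sketch below relies on the fact that multiplying a series by $1/(1-t) = 1 + t + t^2 + \cdots$ replaces its coefficients by their running sums; the function name is hypothetical.

```python
from math import comb

def series_coefficients(n: int, max_degree: int) -> list[int]:
    """Coefficients of 1/(1 - t)^n up to t^max_degree."""
    coeffs = [1] + [0] * max_degree          # start from the constant series 1
    for _ in range(n):
        # Multiplying by 1/(1 - t) turns the coefficients into their prefix sums.
        running = 0
        for d in range(max_degree + 1):
            running += coeffs[d]
            coeffs[d] = running
    return coeffs

coeffs_n3 = series_coefficients(3, 4)
expected_n3 = [comb(3 + d - 1, d) for d in range(5)]
```

For n = 3 both lists read 1, 3, 6, 10, 15, the numbers of monomials of degree 0 to 4 in three indeterminates.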
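The multiplicative formula allows the definition of multiset coefficients to be extended by replacing n by an arbitrary number α (negative, real, or complex):
$$\left(\!\!\binom{\alpha}{k}\!\!\right) = \frac{\alpha^{\overline{k}}}{k!} = \frac{\alpha(\alpha+1)(\alpha+2)\cdots(\alpha+k-1)}{k(k-1)(k-2)\cdots 1} \quad \text{for } k \in \mathbb{N} \text{ and arbitrary } \alpha.$$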
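With this definition one has a generalization of the negative binomial formula (with one of the variables set to 1), which justifies calling the $\left(\!\!\binom{\alpha}{k}\!\!\right)$ negative binomial coefficients:
$$(1 - X)^{-\alpha} = \sum_{k=0}^{\infty} \left(\!\!\binom{\alpha}{k}\!\!\right) X^k.$$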
This Taylor series formula is valid for all complex numbers α and X with |X| < 1. It can also be interpreted as an identity of formal power series in X, where it actually can serve as definition of arbitrary powers of series with constant coefficient equal to 1; the point is that with this definition all identities hold that one expects for exponentiation, notably
$$(1 - X)^{-\alpha} (1 - X)^{-\beta} = (1 - X)^{-(\alpha + \beta)} \quad \text{and} \quad \left((1 - X)^{-\alpha}\right)^{-\beta} = (1 - X)^{-(-\alpha\beta)},$$
and formulas such as these can be used to prove identities for the multiset coefficients.
If α is a nonpositive integer n, then all terms with k > −n are zero, and the infinite series becomes a finite sum. However, for other values of α, including positive integers and rational numbers, the series is infinite.
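The generalized coefficients can be evaluated exactly for rational α using the rising-factorial formula. The sketch below uses Python's `fractions.Fraction`; the function name `generalized_multichoose` is hypothetical, and real or complex α would need a different number type.

```python
from fractions import Fraction

def generalized_multichoose(alpha, k: int) -> Fraction:
    """alpha (alpha + 1) ... (alpha + k - 1) / k! for an integer or rational alpha."""
    result = Fraction(1)
    for j in range(k):
        # Multiply in one factor of the rising factorial and one factor of 1/k!.
        result = result * (Fraction(alpha) + j) / (j + 1)
    return result

# alpha = -3: the series terminates, giving the coefficients of (1 - X)^3.
terminating = [generalized_multichoose(-3, k) for k in range(6)]

# alpha = 1/2: the coefficients of (1 - X)^(-1/2) never vanish.
half = [generalized_multichoose(Fraction(1, 2), k) for k in range(4)]
```

For α = −3 the coefficients are 1, −3, 3, −1 followed by zeros, matching the expansion of $(1 - X)^3$; for α = 1/2 they begin 1, 1/2, 3/8, 5/16.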
Multisets have various applications. [7] They are becoming fundamental in combinatorics. [17] [18] [19] [20] Multisets have become an important tool in the theory of relational databases, which often uses the synonym bag. [21] [22] [23] For instance, multisets are often used to implement relations in database systems. In particular, a table (without a primary key) works as a multiset, because it can have multiple identical records. Similarly, SQL operates on multisets and can return identical records. For instance, consider "SELECT name FROM Student". In the case that there are multiple records with name "Sara" in the student table, all of them are shown. That means the result of an SQL query is a multiset; if the result were instead a set, the repeated records would have been eliminated. Another application of multisets is in modeling multigraphs. In multigraphs there can be multiple edges between any two given vertices. As such, the entity that specifies the edges is a multiset, and not a set.
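The SQL behaviour described here can be observed with an in-memory SQLite database through Python's standard `sqlite3` module. In this sketch the table layout and the second name, "Omar", are assumptions made for illustration; only the `Student` table, the `name` column, and the name "Sara" come from the text.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
# No primary key, so identical records may be stored more than once.
conn.execute("CREATE TABLE Student (name TEXT)")
conn.executemany(
    "INSERT INTO Student VALUES (?)",
    [("Sara",), ("Sara",), ("Omar",)],
)

# A plain SELECT keeps duplicates: the result is a multiset of rows.
bag_result = Counter(row[0] for row in conn.execute("SELECT name FROM Student"))

# SELECT DISTINCT collapses duplicates: the result behaves like a set.
set_result = {row[0] for row in conn.execute("SELECT DISTINCT name FROM Student")}

conn.close()
```

The plain query returns "Sara" twice, whereas the `DISTINCT` query returns each name once.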
There are also other applications. For instance, Richard Rado used multisets as a device to investigate the properties of families of sets. He wrote, "The notion of a set takes no account of multiple occurrence of any one of its members, and yet it is just this kind of information that is frequently of importance. We need only think of the set of roots of a polynomial f (x) or the spectrum of a linear operator." [5] : 328–329
Different generalizations of multisets have been introduced, studied and applied to solving problems.
By a set (Menge) we are to understand any collection into a whole (Zusammenfassung zu einem Ganzen) M of definite and separate objects m (p. 85).