The multifit algorithm is an algorithm for multiway number partitioning, originally developed for the problem of identical-machines scheduling. It was developed by Coffman, Garey and Johnson. [1] Its novelty comes from the fact that it uses an algorithm for another famous problem - the bin packing problem - as a subroutine.
The input to the algorithm is a set S of numbers, and a parameter n. The required output is a partition of S into n subsets, such that the largest subset sum (also called the makespan) is as small as possible.
The algorithm uses as a subroutine, an algorithm called first-fit-decreasing bin packing (FFD). The FFD algorithm takes as input the same set S of numbers, and a bin-capacity c. It heuristically packs numbers into bins such that the sum of numbers in each bin is at most C, aiming to use as few bins as possible. Multifit runs FFD multiple times, each time with a different capacity C, until it finds some C such that FFD with capacity C packs S into at most n bins. To find it, it uses binary search as follows.
Multifit is a constant-factor approximation algorithm. It always finds a partition in which the makespan is at most a constant factor larger than the optimal makespan. To find this constant, we must first analyze FFD. While the standard analysis of FFD considers approximation w.r.t. number of bins when the capacity is constant, here we need to analyze approximation w.r.t. capacity when the number of bins is constant. Formally, for every input size S and integer n, let be the smallest capacity such that S can be packed into n bins of this capacity. Note that is the value of the optimal solution to the original scheduling instance.
Let be the smallest real number such that, for every input S, FFD with capacity uses at most n bins.
Coffman, Garey and Johnson prove the following upper bounds on : [1]
During the MultiFit algorithm, the lower bound L is always a capacity for which it is impossible to pack S into n bins. Therefore, . Initially, the difference is at most sum(S) / n, which is at most . After the MultiFit algorithm runs for k iterations, the difference shrinks k times by half, so . Therefore, . Therefore, the scheduling returned by MultiFit has makespan at most times the optimal makespan. When is sufficiently large, the approximation factor of MultiFit can be made arbitrarily close to , which is at most 1.22.
Later papers performed a more detailed analysis of MultiFit, and proved that its approximation ratio is at most 6/5=1.2, [2] and later, at most 13/11≈1.182. [3] The original proof of this missed some cases; [4] presented a complete and simpler proof. The 13/11 cannot be improved: see lower bound below. [2]
For n=4: the following [5] shows that , which is tight. The inputs are 9,7,6,5,5, 4,4,4,4,4,4,4,4,4. They can be packed into 4 bins of capacity 17 as follows:
But if we run FFD with bin capacity smaller than 20, then the filled bins are:
Note that the sum in each of the first 4 bins is 16, so we cannot put another 4 inside it. Therefore, 4 bins are not sufficient.
For n=13: the following [2] shows that , which is tight. The inputs can be packed into 13 bins of capacity 66 as follows:
But if we run FFD with bin capacity smaller than 66*13/11 = 78, then the filled bins are:
Note that the sum in each of the first 13 bins is 65, so we cannot put another 13 inside it. Therefore, 13 bins are not sufficient.
MultiFit can also be used in the more general setting called uniform-machines scheduling, where machines may have different processing speeds. [6] When there are two uniform machines, the approximation factor is . When MultiFit is combined with the LPT algorithm, the ratio improves to .[ clarification needed ]
A dual goal to minimizing the largest sum (makespan) is maximizing the smallest sum. Deuermeyer, Friesen and Langston claim that MultiFit does not have a good approximation factor for this problem: [7]
"In the solution of the makespan problem using MULTIFIT, it is easy to construct examples where one processor is never used.[ clarification needed ] Such a solution is tolerable for the makespan problem, but is totally unacceptable for our problem [since the smallest sum is 0]. Modifications of MULTIFIT can be devised which would be more suitable for our problem, but we could find none which produces a better worst-case bound than that of LPT."
The upper bounds on are proved by contradiction. For any integers p ≥ q, if , then there exists a (p/q)-counterexample, defined as an instance S and a number n of bins such that
If there exists such a counterexample, then there also exists a minimal (p/q)-counterexample, which is a (p/q)-counterexample with a smallest number of items in S and a smallest number of bins n. In a minimal (p/q)-counterexample, FFD packs all items in S except the last (smallest) one into n bins with capacity p. Given a minimal (p/q)-counterexample, denote by P1,...,Pn the (incomplete) FFD packing into these n bins with capacity p, by Pn+1 the bin containing the single smallest item, and by Q1,...,Qn the (complete) optimal packing into n bins with capacity q. The following lemmas can be proved:
From the above lemmas, it is already possible to prove a loose upper bound . Proof. Let S, n be a minimal (5/4)-counterexample. The above lemmas imply that -
To prove tighter bounds, one needs to take a closer look at the FFD packing of the minimal (p/q)-counterexample. The items and FFD bins P1,...,Pn are termed as follows:
The following lemmas follow immediately from these definitions and the operation of FFD.
In a minimal counterexample, there are no regular 1-bins (since each bin contains at least 2 items), so by the above lemmas, the FFD bins P1,...,Pn are ordered by type:
The upper bound [1] is proved by assuming a minimal (122/100)-counterexample. Each item is given a weight based on its size and its bin in the FFD packing. The weights are determined such that the total weight in each FFD bin is at least x, and the total weight in almost each optimal bin is at most x (for some predetermined x). This implies that the number of FFD bins is at most the number of optimal bins, which contradicts the assumption that it is a counterexample.
By the lemmas above, we know that:
If D>4, the size of each item is larger than 26, so each optimal bin (with capacity 100) must contain at most 3 items. Each item is smaller than 56-2D and each FFD bin has a sum larger than 100-D, so each FFD bin must contain at least 3 items. Therefore, there are at most n FFD bins - contradiction. So from now on, we assume D≤4. The items are assigned types and weights as follows.
Note that the weight of each item is at most its size (the weight can be seen as the size "rounded down"). Still, the total weight of items in every FFD bin is at least 100-D:
The total weight of items in most optimal bins is at most 100-D:
The upper bound [3] is proved by assuming a minimal ((120-3d)/100)-counterexample, with some d<20/33, and deriving a contradiction.
MultiFit is not monotone in the following sense: it is possible that an input decreases while the max-sum in the partition returned by MultiFit increases. As an example, [1] : Fig.4 suppose n=3 and the input numbers are:
44, 24, 24, 22, 21, 17, 8, 8, 6, 6.
FFD packs these inputs into 3 bins of capacity 60 (which is optimal):
But if the "17" becomes "16", then FFD with capacity 60 needs 4 bins:
so MultiFit must increase the capacity, for example, to 62:
This is in contrast to other number partitioning algorithms - List scheduling and Longest-processing-time-first scheduling - which are monotone. [8]
Multifit has been extended to the more general problem of maximin-share allocation of chores. [5] In this problem, S is a set of chores, and there are n agents who assign potentially different valuations to the chores. The goal is to give to each agent, a set of chores worth at most r times the maximum value in an optimal scheduling based on i's valuations. A naive approach is to let each agent in turn use the MultiFit algorithm to calculate the threshold, and then use the algorithm where each agent uses his own threshold. If this approach worked, we would get an approximation of 13/11. However, this approach fails due to the non-monotonicity of FFD.
Here is an example. [5] : Ex.5.2 Suppose there are four agents, and they have valuations of two types:
Type A: | 51 | 27.5 | 27.5 | 27.5 | 27.5 | 25 | 12 | 12 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 | 10 |
Type B: | 51 | 27.5 | 27.5 | 27.5 | 27.5 | 24 | 20 | 20 | 8.33 | 8.33 | 8.33 | 8.33 | 8.33 | 8.33 | 8.33 | 8.33 | 8.33 |
Both types can partition the chores into 4 parts of total value 75. Type A:
Type B:
If all four agents are of the same, then FFD with threshold 75 fills the 4 optimal bins. But suppose there is one agent of type B, and the others are of type A. Then, in the first round, the agent of type B takes the bundle 51, 24 (the other agents cannot take it since for them the values are 51,25 whose sum is more than 75).In the following rounds, the following bundles are filled for the type A agents:
so the last two chores remain unallocated.
Using a more sophisticated threshold calculation, it is possible to guarantee to each agent at most 11/9≈1.22 of his optimal value if the optimal value is known, and at most 5/4≈1.25 of his optimal value (using a polynomial time algorithm) if the optimal value is not known. [5]
Using more elaborate arguments, it is possible to guarantee to each agent the same ratio of MultiFit. [9]
The knapsack problem is the following problem in combinatorial optimization:
A splay tree is a binary search tree with the additional property that recently accessed elements are quick to access again. Like self-balancing binary search trees, a splay tree performs basic operations such as insertion, look-up and removal in O(log n) amortized time. For random access patterns drawn from a non-uniform random distribution, their amortized time can be faster than logarithmic, proportional to the entropy of the access pattern. For many patterns of non-random operations, also, splay trees can take better than logarithmic time, without requiring advance knowledge of the pattern. According to the unproven dynamic optimality conjecture, their performance on all access patterns is within a constant factor of the best possible performance that could be achieved by any other self-adjusting binary search tree, even one selected to fit that pattern. The splay tree was invented by Daniel Sleator and Robert Tarjan in 1985.
The bin packing problem is an optimization problem, in which items of different sizes must be packed into a finite number of bins or containers, each of a fixed given capacity, in a way that minimizes the number of bins used. The problem has many applications, such as filling up containers, loading trucks with weight capacity constraints, creating file backups in media, and technology mapping in FPGA semiconductor chip design.
In optimization theory, maximum flow problems involve finding a feasible flow through a flow network that obtains the maximum possible flow rate.
A fully polynomial-time approximation scheme (FPTAS) is an algorithm for finding approximate solutions to function problems, especially optimization problems. An FPTAS takes as input an instance of the problem and a parameter ε > 0. It returns as output a value which is at least times the correct value, and at most times the correct value.
The strip packing problem is a 2-dimensional geometric minimization problem. Given a set of axis-aligned rectangles and a strip of bounded width and infinite height, determine an overlapping-free packing of the rectangles into the strip, minimizing its height. This problem is a cutting and packing problem and is classified as an Open Dimension Problem according to Wäscher et al.
In computer science, multiway number partitioning is the problem of partitioning a multiset of numbers into a fixed number of subsets, such that the sums of the subsets are as similar as possible. It was first presented by Ronald Graham in 1969 in the context of the identical-machines scheduling problem. The problem is parametrized by a positive integer k, and called k-way number partitioning. The input to the problem is a multiset S of numbers, whose sum is k*T.
In the bin covering problem, items of different sizes must be packed into a finite number of bins or containers, each of which must contain at least a certain given total size, in a way that maximizes the number of bins used.
The multiple subset sum problem is an optimization problem in computer science and operations research. It is a generalization of the subset sum problem. The input to the problem is a multiset of n integers and a positive integer m representing the number of subsets. The goal is to construct, from the input integers, some m subsets. The problem has several variants:
Next-fit is an online algorithm for bin packing. Its input is a list of items of different sizes. Its output is a packing - a partition of the items into bins of fixed capacity, such that the sum of sizes of items in each bin is at most the capacity. Ideally, we would like to use as few bins as possible, but minimizing the number of bins is an NP-hard problem. The next-fit algorithm uses the following heuristic:
Best-fit is an online algorithm for bin packing. Its input is a list of items of different sizes. Its output is a packing - a partition of the items into bins of fixed capacity, such that the sum of sizes of items in each bin is at most the capacity. Ideally, we would like to use as few bins as possible, but minimizing the number of bins is an NP-hard problem. The best-fit algorithm uses the following heuristic:
First-fit (FF) is an online algorithm for bin packing. Its input is a list of items of different sizes. Its output is a packing - a partition of the items into bins of fixed capacity, such that the sum of sizes of items in each bin is at most the capacity. Ideally, we would like to use as few bins as possible, but minimizing the number of bins is an NP-hard problem. The first-fit algorithm uses the following heuristic:
Harmonic bin-packing is a family of online algorithms for bin packing. The input to such an algorithm is a list of items of different sizes. The output is a packing - a partition of the items into bins of fixed capacity, such that the sum of sizes of items in each bin is at most the capacity. Ideally, we would like to use as few bins as possible, but minimizing the number of bins is an NP-hard problem.
First-fit-decreasing (FFD) is an algorithm for bin packing. Its input is a list of items of different sizes. Its output is a packing - a partition of the items into bins of fixed capacity, such that the sum of sizes of items in each bin is at most the capacity. Ideally, we would like to use as few bins as possible, but minimizing the number of bins is an NP-hard problem, so we use an approximately-optimal heuristic.
Longest-processing-time-first (LPT) is a greedy algorithm for job scheduling. The input to the algorithm is a set of jobs, each of which has a specific processing-time. There is also a number m specifying the number of machines that can process the jobs. The LPT algorithm works as follows:
Balanced number partitioning is a variant of multiway number partitioning in which there are constraints on the number of items allocated to each set. The input to the problem is a set of n items of different sizes, and two integers m, k. The output is a partition of the items into m subsets, such that the number of items in each subset is at most k. Subject to this, it is required that the sums of sizes in the m subsets are as similar as possible.
The configuration linear program (configuration-LP) is a linear programming technique used for solving combinatorial optimization problems. It was introduced in the context of the cutting stock problem. Later, it has been applied to the bin packing and job scheduling problems. In the configuration-LP, there is a variable for each possible configuration - each possible multiset of items that can fit in a single bin. Usually, the number of configurations is exponential in the problem size, but in some cases it is possible to attain approximate solutions using only a polynomial number of configurations.
High-multiplicity bin packing is a special case of the bin packing problem, in which the number of different item-sizes is small, while the number of items with each size is large. While the general bin-packing problem is NP-hard, the high-multiplicity setting can be solved in polynomial time, assuming that the number of different sizes is a fixed constant.
The Karmarkar–Karp (KK) bin packing algorithms are several related approximation algorithm for the bin packing problem. The bin packing problem is a problem of packing items of different sizes into bins of identical capacity, such that the total number of bins is as small as possible. Finding the optimal solution is computationally hard. Karmarkar and Karp devised an algorithm that runs in polynomial time and finds a solution with at most bins, where OPT is the number of bins in the optimal solution. They also devised several other algorithms with slightly different approximation guarantees and run-time bounds.
{{cite journal}}
: CS1 maint: multiple names: authors list (link)