List scheduling

Last updated August 14, 2024

List scheduling is a greedy algorithm for Identical-machines scheduling. The input to this algorithm is a list of jobs that should be executed on a set of m machines. The list is ordered in a fixed order, which can be determined e.g. by the priority of executing the jobs, or by their order of arrival. The algorithm repeatedly executes the following steps until a valid schedule is obtained:

Example

Suppose there are five jobs with processing-times {4,5,6,7,8}, and m=2 processors. Then, the resulting schedule is {4,6,8}, {5,7}, and the makespan is max(18,12)=18; if m=3, then the resulting schedule is {4,7}, {5,8}, {6}, and the makespan is max(11,13,6)=13.

Performance guarantee

The algorithm runs in time $O(n)$ , where n is the number of jobs. The algorithm always returns a partition of the jobs whose makespan is at most $2-1/m$ times the optimal makespan.^[1] This is due to the fact that both the length of the longest job and the average length of all jobs are lower bounds for the optimal makespan. The algorithm can be used as an online algorithm, when the order in which the items arrive cannot be controlled.

Ordering strategies

Instead of using an arbitrary order, one can pre-order the jobs in order to attain better guarantees. Some known list scheduling strategies are:^[2]

Highest level first algorithm, or HLF;
Longest path algorithm or LP;
Longest-processing-time-first scheduling, or LPT; this variant decreases the approximation ratio to ${\frac {4}{3}}-{\frac {1}{3k}}$ .
Critical path method.
Heterogeneous Earliest Finish Time or HEFT. For the case heterogeneous workers.

Anomalies

The list scheduling algorithm has several anomalies.^[1] Suppose there are m=3 machines, and the job lengths are:

3, 2, 2, 2, 4, 4, 4, 4, 9

Further, suppose that all the "4" jobs must be executed after the fourth "2" job. Then, list scheduling returns the following schedule:

3, 9
2, 2, 4, 4
2, [2 idle], 4, 4

and the makespan is 12.

Anomaly 1. If the "4" jobs do not depend on previous jobs anymore, then the list schedule is:

3, 4, 9
2, 2, 4
2, 4, 4

and the makespan is 16. Removing dependencies has enlarged the makespan.

Anomaly 2. Suppose the job lengths decrease by 1, to 2, 1, 1, 1, 3, 3, 3, 3, 8 (with the original dependencies). Then, the list schedule is:

2, 3, 3
1, 1, 3, 8
1, [1 idle], 3

and the makespan is 13. Shortening all jobs has enlarged the makespan.

Anomaly 3. Suppose there is one more machine (with the original lengths, with or without dependencies). Then, the list schedule is:

3, 4
2, 4, 9
2, 4
2, 4

and the makespan is 15. Adding a machine has enlarged the makespan.

The anomalies are bounded as follows. Suppose initially we had m₁ machines and the makespan was t₁. Now, we have m₂ machines, the dependencies are the same or relaxed, the job lengths are the same or shorter, the list is the same or different, and the makespan is t_2. Then:^[1]^[3]

${\frac {t_{2}}{t_{1}}}\leq 1+{\frac {m_{1}-1}{m_{2}}}$ .

In particular, with the same number of machines, the ratio is $2-{\frac {1}{m}}$ . A special case is when the original schedule is optimal; this yields the bound $2-{\frac {1}{m}}$ on the approximation ratio.

Related Research Articles

The bin packing problem is an optimization problem, in which items of different sizes must be packed into a finite number of bins or containers, each of a fixed given capacity, in a way that minimizes the number of bins used. The problem has many applications, such as filling up containers, loading trucks with weight capacity constraints, creating file backups in media, splitting a network prefix into multiple subnets, and technology mapping in FPGA semiconductor chip design.

In computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge (u,v) from vertex u to vertex v, u comes before v in the ordering. For instance, the vertices of the graph may represent tasks to be performed, and the edges may represent constraints that one task must be performed before another; in this application, a topological ordering is just a valid sequence for the tasks. Precisely, a topological sort is a graph traversal in which each node v is visited only after all its dependencies are visited. A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG). Any DAG has at least one topological ordering, and algorithms are known for constructing a topological ordering of any DAG in linear time. Topological sorting has many applications, especially in ranking problems such as feedback arc set. Topological sorting is possible even when the DAG has disconnected components.

Uniform machine scheduling is an optimization problem in computer science and operations research. It is a variant of optimal job scheduling. We are given n jobs J₁, J₂, ..., J_n of varying processing times, which need to be scheduled on m different machines. The goal is to minimize the makespan - the total time required to execute the schedule. The time that machine i needs in order to process job j is denoted by p_i,j. In the general case, the times p_i,j are unrelated, and any matrix of positive processing times is possible. In the specific variant called uniform machine scheduling, some machines are uniformly faster than others. This means that, for each machine i, there is a speed factor s_i, and the run-time of job j on machine i is p_i,j = p_j / s_i.

Single-machine scheduling or single-resource scheduling is an optimization problem in computer science and operations research. We are given n jobs J₁, J₂, ..., J_n of varying processing times, which need to be scheduled on a single machine, in a way that optimizes a certain objective, such as the throughput.

Job-shop scheduling, the job-shop problem (JSP) or job-shop scheduling problem (JSSP) is an optimization problem in computer science and operations research. It is a variant of optimal job scheduling. In a general job scheduling problem, we are given n jobs J₁, J₂, ..., J_n of varying processing times, which need to be scheduled on m machines with varying processing power, while trying to minimize the makespan – the total length of the schedule. In the specific variant known as job-shop scheduling, each job consists of a set of operationsO₁, O₂, ..., O_n which need to be processed in a specific order. Each operation has a specific machine that it needs to be processed on and only one operation in a job can be processed at a given time. A common relaxation is the flexible job shop, where each operation can be processed on any machine of a given set.

The Price of Anarchy (PoA) is a concept in economics and game theory that measures how the efficiency of a system degrades due to selfish behavior of its agents. It is a general notion that can be extended to diverse systems and notions of efficiency. For example, consider the system of transportation of a city and many agents trying to go from some initial location to a destination. Here, efficiency means the average time for an agent to reach the destination. In the 'centralized' solution, a central authority can tell each agent which path to take in order to minimize the average travel time. In the 'decentralized' version, each agent chooses its own path. The Price of Anarchy measures the ratio between average travel time in the two cases.

Flow-shop scheduling is an optimization problem in computer science and operations research. It is a variant of optimal job scheduling. In a general job-scheduling problem, we are given n jobs J₁, J₂, ..., J_n of varying processing times, which need to be scheduled on m machines with varying processing power, while trying to minimize the makespan – the total length of the schedule. In the specific variant known as flow-shop scheduling, each job contains exactly m operations. The i-th operation of the job must be executed on the i-th machine. No machine can perform more than one operation simultaneously. For each operation of each job, execution time is specified.

David Bernard Shmoys is a Professor in the School of Operations Research and Information Engineering and the Department of Computer Science at Cornell University. He obtained his Ph.D. from the University of California, Berkeley in 1984. His major focus has been in the design and analysis of algorithms for discrete optimization problems.

In the mathematical modeling of job shop scheduling problems, disjunctive graphs are a way of modeling a system of tasks to be scheduled and timing constraints that must be respected by the schedule. They are mixed graphs, in which vertices may be connected by both directed and undirected edges. The two types of edges represent constraints of two different types:

In computer science, the analysis of parallel algorithms is the process of finding the computational complexity of algorithms executed in parallel – the amount of time, storage, or other resources needed to execute them. In many respects, analysis of parallel algorithms is similar to the analysis of sequential algorithms, but is generally more involved because one must reason about the behavior of multiple cooperating threads of execution. One of the primary goals of parallel analysis is to understand how a parallel algorithm's use of resources changes as the number of processors is changed.

Truthful job scheduling is a mechanism design variant of the job shop scheduling problem from operations research.

Parallel task scheduling is an optimization problem in computer science and operations research. It is a variant of optimal job scheduling. In a general job scheduling problem, we are given n jobs J₁, J₂, ..., J_n of varying processing times, which need to be scheduled on m machines while trying to minimize the makespan - the total length of the schedule. In the specific variant known as parallel-task scheduling, all machines are identical. Each job j has a length parameter p_j and a size parameter q_j, and it must run for exactly p_j time-steps on exactly q_j machines in parallel.

In computer science, multiway number partitioning is the problem of partitioning a multiset of numbers into a fixed number of subsets, such that the sums of the subsets are as similar as possible. It was first presented by Ronald Graham in 1969 in the context of the identical-machines scheduling problem. The problem is parametrized by a positive integer k, and called k-way number partitioning. The input to the problem is a multiset S of numbers, whose sum is k*T.

In computer science, greedy number partitioning is a class of greedy algorithms for multiway number partitioning. The input to the algorithm is a set S of numbers, and a parameter k. The required output is a partition of S into k subsets, such that the sums in the subsets are as nearly equal as possible. Greedy algorithms process the numbers sequentially, and insert the next number into a bin in which the sum of numbers is currently smallest.

The multifit algorithm is an algorithm for multiway number partitioning, originally developed for the problem of identical-machines scheduling. It was developed by Coffman, Garey and Johnson. Its novelty comes from the fact that it uses an algorithm for another famous problem - the bin packing problem - as a subroutine.

Identical-machines scheduling is an optimization problem in computer science and operations research. We are given n jobs J₁, J₂, ..., J_n of varying processing times, which need to be scheduled on m identical machines, such that a certain objective function is optimized, for example, the makespan is minimized.

Unrelated-machines scheduling is an optimization problem in computer science and operations research. It is a variant of optimal job scheduling. We need to schedule n jobs J₁, J₂, ..., J_n on m different machines, such that a certain objective function is optimized. The time that machine i needs in order to process job j is denoted by p_i,j. The term unrelated emphasizes that there is no relation between values of p_i,j for different i and j. This is in contrast to two special cases of this problem: uniform-machines scheduling - in which p_i,j = p_i / s_j, and identical-machines scheduling - in which p_i,j = p_i.

Longest-processing-time-first (LPT) is a greedy algorithm for job scheduling. The input to the algorithm is a set of jobs, each of which has a specific processing-time. There is also a number m specifying the number of machines that can process the jobs. The LPT algorithm works as follows:

Order the jobs by descending order of their processing-time, such that the job with the longest processing time is first.
Schedule each job in this sequence into a machine in which the current load is smallest.

Balanced number partitioning is a variant of multiway number partitioning in which there are constraints on the number of items allocated to each set. The input to the problem is a set of n items of different sizes, and two integers m, k. The output is a partition of the items into m subsets, such that the number of items in each subset is at most k. Subject to this, it is required that the sums of sizes in the m subsets are as similar as possible.

Fractional job scheduling is a variant of optimal job scheduling in which it is allowed to break jobs into parts and process each part separately on the same or a different machine. Breaking jobs into parts may allow for improving the overall performance, for example, decreasing the makespan. Moreover, the computational problem of finding an optimal schedule may become easier, as some of the optimization variables become continuous. On the other hand, breaking jobs apart might be costly.

References

1 2 3 Graham, Ron L. (1969-03-01). "Bounds on Multiprocessing Timing Anomalies" . SIAM Journal on Applied Mathematics. 17 (2): 416–429. doi:10.1137/0117039. ISSN 0036-1399.
↑ Micheli, Giovanni De (1994). Synthesis and optimization of digital circuits. New York: McGraw-Hill. ISBN 978-0070163331.
↑ Graham, Ron L. (1966). "Bounds for Certain Multiprocessing Anomalies". Bell System Technical Journal. 45 (9): 1563–1581. doi:10.1002/j.1538-7305.1966.tb01709.x. ISSN 1538-7305.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[:1-1] 1 2 3 Graham, Ron L. (1969-03-01). "Bounds on Multiprocessing Timing Anomalies" . SIAM Journal on Applied Mathematics. 17 (2): 416–429. doi:10.1137/0117039. ISSN 0036-1399.

[2] Micheli, Giovanni De (1994). Synthesis and optimization of digital circuits. New York: McGraw-Hill. ISBN 978-0070163331.

[:2-3] Graham, Ron L. (1966). "Bounds for Certain Multiprocessing Anomalies". Bell System Technical Journal. 45 (9): 1563–1581. doi:10.1002/j.1538-7305.1966.tb01709.x. ISSN 1538-7305.

[1]

[2]

[3]