List scheduling

Last updated

List scheduling is a greedy algorithm for Identical-machines scheduling. The input to this algorithm is a list of jobs that should be executed on a set of m machines. The list is ordered in a fixed order, which can be determined e.g. by the priority of executing the jobs, or by their order of arrival. The algorithm repeatedly executes the following steps until a valid schedule is obtained:

Contents

Example

Suppose there are five jobs with processing-times {4,5,6,7,8}, and m=2 processors. Then, the resulting schedule is {4,6,8}, {5,7}, and the makespan is max(18,12)=18; if m=3, then the resulting schedule is {4,7}, {5,8}, {6}, and the makespan is max(11,13,6)=13.

Performance guarantee

The algorithm runs in time , where n is the number of jobs. The algorithm always returns a partition of the jobs whose makespan is at most times the optimal makespan. [1] This is due to the fact that both the length of the longest job and the average length of all jobs are lower bounds for the optimal makespan. The algorithm can be used as an online algorithm, when the order in which the items arrive cannot be controlled.

Ordering strategies

Instead of using an arbitrary order, one can pre-order the jobs in order to attain better guarantees. Some known list scheduling strategies are: [2]

Anomalies

The list scheduling algorithm has several anomalies. [1] Suppose there are m=3 machines, and the job lengths are:

3, 2, 2, 2, 4, 4, 4, 4, 9

Further, suppose that all the "4" jobs must be executed after the fourth "2" job. Then, list scheduling returns the following schedule:

and the makespan is 12.

Anomaly 1. If the "4" jobs do not depend on previous jobs anymore, then the list schedule is:

and the makespan is 16. Removing dependencies has enlarged the makespan.

Anomaly 2. Suppose the job lengths decrease by 1, to 2, 1, 1, 1, 3, 3, 3, 3, 8 (with the original dependencies). Then, the list schedule is:

and the makespan is 13. Shortening all jobs has enlarged the makespan.

Anomaly 3. Suppose there is one more machine (with the original lengths, with or without dependencies). Then, the list schedule is:

and the makespan is 15. Adding a machine has enlarged the makespan.

The anomalies are bounded as follows. Suppose initially we had m1 machines and the makespan was t1. Now, we have m2 machines, the dependencies are the same or relaxed, the job lengths are the same or shorter, the list is the same or different, and the makespan is t2. Then: [1] [3]

.

In particular, with the same number of machines, the ratio is . A special case is when the original schedule is optimal; this yields the bound on the approximation ratio.

Related Research Articles

In computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering. For instance, the vertices of the graph may represent tasks to be performed, and the edges may represent constraints that one task must be performed before another; in this application, a topological ordering is just a valid sequence for the tasks. Precisely, a topological sort is a graph traversal in which each node v is visited only after all its dependencies are visited. A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG). Any DAG has at least one topological ordering, and algorithms are known for constructing a topological ordering of any DAG in linear time. Topological sorting has many applications especially in ranking problems such as feedback arc set. Topological sorting is possible even when the DAG has disconnected components.

Uniform machine scheduling is an optimization problem in computer science and operations research. It is a variant of optimal job scheduling. We are given n jobs J1, J2, ..., Jn of varying processing times, which need to be scheduled on m different machines. The goal is to minimize the makespan - the total time required to execute the schedule. The time that machine i needs in order to process job j is denoted by pi,j. In the general case, the times pi,j are unrelated, and any matrix of positive processing times is possible. In the specific variant called uniform machine scheduling, some machines are uniformly faster than others. This means that, for each machine i, there is a speed factor si, and the run-time of job j on machine i is pi,j = pj / si.

Single-machine scheduling or single-resource scheduling is an optimization problem in computer science and operations research. We are given n jobs J1, J2, ..., Jn of varying processing times, which need to be scheduled on a single machine, in a way that optimizes a certain objective, such as the throughput.

Optimal job scheduling is a class of optimization problems related to scheduling. The inputs to such problems are a list of jobs and a list of machines. The required output is a schedule – an assignment of jobs to machines. The schedule should optimize a certain objective function. In the literature, problems of optimal job scheduling are often called machine scheduling, processor scheduling, multiprocessor scheduling, or just scheduling.

Interval scheduling is a class of problems in computer science, particularly in the area of algorithm design. The problems consider a set of tasks. Each task is represented by an interval describing the time in which it needs to be processed by some machine. For instance, task A might run from 2:00 to 5:00, task B might run from 4:00 to 10:00 and task C might run from 9:00 to 11:00. A subset of intervals is compatible if no two intervals overlap on the machine/resource. For example, the subset {A,C} is compatible, as is the subset {B}; but neither {A,B} nor {B,C} are compatible subsets, because the corresponding intervals within each subset overlap.

Job-shop scheduling, the job-shop problem (JSP) or job-shop scheduling problem (JSSP) is an optimization problem in computer science and operations research. It is a variant of optimal job scheduling. In a general job scheduling problem, we are given n jobs J1J2, ..., Jn of varying processing times, which need to be scheduled on m machines with varying processing power, while trying to minimize the makespan – the total length of the schedule. In the specific variant known as job-shop scheduling, each job consists of a set of operationsO1O2, ..., On which need to be processed in a specific order. Each operation has a specific machine that it needs to be processed on and only one operation in a job can be processed at a given time. A common relaxation is the flexible job shop, where each operation can be processed on any machine of a given set.

The Price of Anarchy (PoA) is a concept in economics and game theory that measures how the efficiency of a system degrades due to selfish behavior of its agents. It is a general notion that can be extended to diverse systems and notions of efficiency. For example, consider the system of transportation of a city and many agents trying to go from some initial location to a destination. Let efficiency in this case mean the average time for an agent to reach the destination. In the 'centralized' solution, a central authority can tell each agent which path to take in order to minimize the average travel time. In the 'decentralized' version, each agent chooses its own path. The Price of Anarchy measures the ratio between average travel time in the two cases.

Flow-shop scheduling is an optimization problem in computer science and operations research. It is a variant of optimal job scheduling. In a general job-scheduling problem, we are given n jobs J1J2, ..., Jn of varying processing times, which need to be scheduled on m machines with varying processing power, while trying to minimize the makespan – the total length of the schedule. In the specific variant known as flow-shop scheduling, each job contains exactly m operations. The i-th operation of the job must be executed on the i-th machine. No machine can perform more than one operation simultaneously. For each operation of each job, execution time is specified.

<span class="mw-page-title-main">David Shmoys</span> American mathematician

David Bernard Shmoys is a Professor in the School of Operations Research and Information Engineering and the Department of Computer Science at Cornell University. He obtained his Ph.D. from the University of California, Berkeley in 1984. His major focus has been in the design and analysis of algorithms for discrete optimization problems.

Open-shop scheduling or open-shop scheduling problem (OSSP) is an optimization problem in computer science and operations research. It is a variant of optimal job scheduling. In a general job-scheduling problem, we are given n jobs J1J2, ..., Jn of varying processing times, which need to be scheduled on m machines with varying processing power, while trying to minimize the makespan - the total length of the schedule. In the specific variant known as open-shop scheduling, each job consists of a set of operationsO1O2, ..., On which need to be processed in an arbitrary order. The problem was first studied by Teofilo F. Gonzalez and Sartaj Sahni in 1976.

In computer science, the analysis of parallel algorithms is the process of finding the computational complexity of algorithms executed in parallel – the amount of time, storage, or other resources needed to execute them. In many respects, analysis of parallel algorithms is similar to the analysis of sequential algorithms, but is generally more involved because one must reason about the behavior of multiple cooperating threads of execution. One of the primary goals of parallel analysis is to understand how a parallel algorithm's use of resources changes as the number of processors is changed.

Truthful job scheduling is a mechanism design variant of the job shop scheduling problem from operations research.

Parallel task scheduling is an optimization problem in computer science and operations research. It is a variant of optimal job scheduling. In a general job scheduling problem, we are given n jobs J1J2, ..., Jn of varying processing times, which need to be scheduled on m machines while trying to minimize the makespan - the total length of the schedule. In the specific variant known as parallel-task scheduling, all machines are identical. Each job j has a length parameter pj and a size parameter qj, and it must run for exactly pj time-steps on exactly qj machines in parallel.

In computer science, multiway number partitioning is the problem of partitioning a multiset of numbers into a fixed number of subsets, such that the sums of the subsets are as similar as possible. It was first presented by Ronald Graham in 1969 in the context of the Identical-machines scheduling problem. The problem is parametrized by a positive integer k, and called k-way number partitioning. The input to the problem is a multiset S of numbers, whose sum is k*T.

In computer science, greedy number partitioning is a class of greedy algorithms for multiway number partitioning. The input to the algorithm is a set S of numbers, and a parameter k. The required output is a partition of S into k subsets, such that the sums in the subsets are as nearly equal as possible. Greedy algorithms process the numbers sequentially, and insert the next number into a bin in which the sum of numbers is currently smallest.

The multifit algorithm is an algorithm for multiway number partitioning, originally developed for the problem of identical-machines scheduling. It was developed by Coffman, Garey and Johnson. Its novelty comes from the fact that it uses an algorithm for another famous problem - the bin packing problem - as a subroutine.

Identical-machines scheduling is an optimization problem in computer science and operations research. We are given n jobs J1, J2, ..., Jn of varying processing times, which need to be scheduled on m identical machines, such that a certain objective function is optimized, for example, the makespan is minimized.

Unrelated-machines scheduling is an optimization problem in computer science and operations research. It is a variant of optimal job scheduling. We need to schedule n jobs J1, J2, ..., Jn on m different machines, such that a certain objective function is optimized. The time that machine i needs in order to process job j is denoted by pi,j. The term unrelated emphasizes that there is no relation between values of pi,j for different i and j. This is in contrast to two special cases of this problem: uniform-machines scheduling - in which pi,j = pi / sj, and identical-machines scheduling - in which pi,j = pi.

Longest-processing-time-first (LPT) is a greedy algorithm for job scheduling. The input to the algorithm is a set of jobs, each of which has a specific processing-time. There is also a number m specifying the number of machines that can process the jobs. The LPT algorithm works as follows:

  1. Order the jobs by descending order of their processing-time, such that the job with the longest processing time is first.
  2. Schedule each job in this sequence into a machine in which the current load is smallest.

Balanced number partitioning is a variant of multiway number partitioning in which there are constraints on the number of items allocated to each set. The input to the problem is a set of n items of different sizes, and two integers mk. The output is a partition of the items into m subsets, such that the number of items in each subset is at most k. Subject to this, it is required that the sums of sizes in the m subsets are as similar as possible.

References

  1. 1 2 3 Graham, Ron L. (1969-03-01). "Bounds on Multiprocessing Timing Anomalies". SIAM Journal on Applied Mathematics. 17 (2): 416–429. doi:10.1137/0117039. ISSN   0036-1399.
  2. Micheli, Giovanni De (1994). Synthesis and optimization of digital circuits. New York: McGraw-Hill. ISBN   978-0070163331.
  3. Graham, Ron L. (1966). "Bounds for Certain Multiprocessing Anomalies". Bell System Technical Journal. 45 (9): 1563–1581. doi:10.1002/j.1538-7305.1966.tb01709.x. ISSN   1538-7305.