Interval scheduling

Last updated April 07, 2024

Interval scheduling is a class of problems in computer science, particularly in the area of algorithm design. The problems consider a set of tasks. Each task is represented by an interval describing the time in which it needs to be processed by some machine (or, equivalently, scheduled on some resource). For instance, task A might run from 2:00 to 5:00, task B might run from 4:00 to 10:00 and task C might run from 9:00 to 11:00. A subset of intervals is compatible if no two intervals overlap on the machine/resource. For example, the subset {A,C} is compatible, as is the subset {B}; but neither {A,B} nor {B,C} is a compatible subset, because the intervals within each of these subsets overlap.

Single-Interval Scheduling Maximization
Unweighted
Weighted
Group Interval Scheduling Decision
NP-complete when some groups contain 3 or more intervals
Polynomial when all groups contain at most 2 intervals
Group Interval Scheduling Maximization
MaxSNP-complete when some groups contain 2 or more intervals
Polynomial 2-approximation
LP-based approximation algorithms
Related problems
Variations
Sources

The interval scheduling maximization problem (ISMP) is to find a largest compatible set, i.e., a set of non-overlapping intervals of maximum size. The goal here is to execute as many tasks as possible, that is, to maximize the throughput. It is equivalent to finding a maximum independent set in an interval graph.

A generalization of the problem considers $k>1$ machines/resources.^[1] Here the goal is to find $k$ compatible subsets whose union is the largest.

In an upgraded version of the problem, the intervals are partitioned into groups. A subset of intervals is compatible if no two intervals overlap, and moreover, no two intervals belong to the same group (i.e., the subset contains at most a single representative of each group). Each group of intervals corresponds to a single task, and represents several alternative intervals in which it can be executed.

The group interval scheduling decision problem (GISDP) is to decide whether there exists a compatible set in which all groups are represented. The goal here is to execute a single representative task from each group. GISDPk is a restricted version of GISDP in which the number of intervals in each group is at most k.

The group interval scheduling maximization problem (GISMP) is to find a largest compatible set - a set of non-overlapping representatives of maximum size. The goal here is to execute a representative task from as many groups as possible. GISMPk is a restricted version of GISMP in which the number of intervals in each group is at most k. This problem is often called JISPk, where J stands for Job.

GISMP is the most general problem; the other two problems can be seen as special cases of it:

ISMP is the special case in which each task belongs to its own group (i.e. it is equal to GISMP1).
GISDP is the problem of deciding whether the maximum exactly equals the number of groups.

All these problems can be generalized by adding a weight for each interval, representing the profit from executing the task in that interval. Then, the goal is to maximize the total weight.

All these problems are special cases of single-machine scheduling, since they assume that all tasks must run on a single processor. Single-machine scheduling is a special case of optimal job scheduling.

Single-Interval Scheduling Maximization

Single-interval scheduling refers to creating an interval schedule in which no intervals overlap.

Unweighted

Several algorithms that may look promising at first sight do not actually find the optimal solution:^[2]

Selecting the intervals that start earliest is not an optimal solution, because if the earliest interval happens to be very long, accepting it would make us reject many other shorter requests.
Selecting the shortest intervals or selecting intervals with the fewest conflicts is also not optimal.

The following greedy algorithm, called Earliest deadline first scheduling, does find the optimal solution for unweighted single-interval scheduling:

1. Select the interval, x, with the earliest finishing time.
2. Remove x, and all intervals intersecting x, from the set of candidate intervals.
3. Repeat until the set of candidate intervals is empty.

Whenever we select an interval in step 1, we may have to remove many intervals in step 2. However, all these intervals necessarily contain the finishing time of x, and thus they all intersect each other. Hence, at most one of these intervals can be in the optimal solution. It follows that for every interval in the optimal solution there is a distinct interval in the greedy solution, so the greedy solution is at least as large. This proves that the greedy algorithm indeed finds an optimal solution.

A more formal explanation is given by a Charging argument.

The greedy algorithm can be executed in time O(n log n), where n is the number of tasks, using a preprocessing step in which the tasks are sorted by their finishing times.
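The greedy rule can be sketched in Python as follows. This is a minimal illustration rather than code from the cited sources: the function name `max_compatible_set`, the `(start, finish)` tuple representation, and the convention that intervals are half-open (so one interval may start exactly when another finishes) are assumptions made for the example.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, finish); hypothetical representation


def max_compatible_set(intervals: List[Interval]) -> List[Interval]:
    """Greedy selection by earliest finishing time."""
    selected: List[Interval] = []
    last_finish = float("-inf")
    # Sorting by finishing time is the O(n log n) preprocessing step.
    for start, finish in sorted(intervals, key=lambda iv: iv[1]):
        # Assumption: half-open intervals, so start == last_finish is compatible.
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
        # Otherwise the interval intersects the last selected one and is skipped,
        # which has the same effect as removing it from the candidate set.
    return selected


# Tasks A, B and C from the introduction, with times written as hours.
tasks = {"A": (2, 5), "B": (4, 10), "C": (9, 11)}
chosen = max_compatible_set(list(tasks.values()))
```

On the three tasks from the introduction, the sketch keeps A and C and rejects B. After sorting, a single pass suffices, because any interval that starts before the last selected finishing time must intersect the last selected interval.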

Weighted

Problems involving weighted interval scheduling are equivalent to finding a maximum-weight independent set in an interval graph. Such problems can be solved in polynomial time.^[3]

Assuming the vectors are sorted from earliest to latest finish time and the values p[j] have been precomputed, the following pseudocode determines the maximum weight of a single-interval schedule in Θ(n) time:

// The vectors are already sorted from earliest to latest finish time.
int v[numOfVectors + 1];  // list of interval vectors
int w[numOfVectors + 1];  // w[j] is the weight for v[j].
int p[numOfVectors + 1];  // p[j] is the # of vectors that end before v[j] begins.
int M[numOfVectors + 1];
int finalSchedule[];

// v[0] does not exist, and the first interval vector is assigned to v[1].
w[0] = 0; p[0] = 0; M[0] = 0;

// The following code determines the value of M for each vector.
// The maximum weight of the schedule is equal to M[numOfVectors].
for (int i = 1; i < numOfVectors + 1; i++) {
    M[i] = max(w[i] + M[p[i]], M[i - 1]);
}

// Function to construct the optimal schedule
schedule(j) {
    if (j == 0) { return; }
    else if (w[j] + M[p[j]] >= M[j - 1]) {
        prepend(v[j], finalSchedule);  // prepends v[j] to schedule.
        schedule(p[j]);
    } else { schedule(j - 1); }
}

^[4]
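For readers who want something executable, the same dynamic program can be sketched in Python. The sketch is an illustration under stated assumptions, not the cited textbook's code: jobs are hypothetical `(start, finish, weight)` tuples, intervals are treated as half-open, and `p` is computed with a binary search (`bisect_right`) instead of being given. Because of the sorting and the binary searches, the sketch as a whole runs in O(n log n) time; only the loop that fills `M` is linear.

```python
from bisect import bisect_right
from typing import List, Tuple

Job = Tuple[int, int, int]  # (start, finish, weight); hypothetical representation


def weighted_schedule(jobs: List[Job]) -> Tuple[int, List[Job]]:
    """Return (maximum total weight, one optimal list of jobs)."""
    v = sorted(jobs, key=lambda job: job[1])  # sort by finish time
    n = len(v)
    finishes = [job[1] for job in v]

    # p[j] = number of jobs among the first j-1 that finish no later than
    # job j starts (jobs are 1-based here, as in the pseudocode).
    # Assumption: half-open intervals, so finish == start is compatible.
    p = [0] * (n + 1)
    for j in range(1, n + 1):
        p[j] = bisect_right(finishes, v[j - 1][0], 0, j - 1)

    # M[j] = best total weight using only the first j jobs (M[0] = 0).
    M = [0] * (n + 1)
    for j in range(1, n + 1):
        M[j] = max(v[j - 1][2] + M[p[j]], M[j - 1])

    # Trace back through M, mirroring schedule(j) in the pseudocode.
    chosen: List[Job] = []
    j = n
    while j > 0:
        if v[j - 1][2] + M[p[j]] >= M[j - 1]:
            chosen.append(v[j - 1])
            j = p[j]
        else:
            j -= 1
    chosen.reverse()
    return M[n], chosen


# Hypothetical example data (not the 9-vector example discussed in the text).
best, picked = weighted_schedule([(0, 3, 5), (2, 5, 6), (4, 7, 5), (6, 9, 4)])
```

The traceback is written as a loop rather than a recursive call, which avoids deep recursion on long inputs while visiting the same values of j.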

Example

If we have the following 9 vectors sorted by finish time, with the weights above each corresponding interval, we can determine which of these vectors are included in our maximum weight schedule which only contains a subset of the following vectors.

Here, we input our final vector (where j=9 in this example) into our schedule function from the code block above. We perform the actions in the table below until j is set to 0, at which point, we only include into our final schedule the encountered intervals which met the ${\textstyle w[j]+M[p[j]]\geq M[j-1]}$ requirement. This final schedule is the schedule with the maximum weight.

j	Calculation	${\textstyle w[j]+M[p[j]]\geq M[j-1]}$ (i.e. This vector is included in the final schedule)	Set j to
9	${\begin{aligned}&w[j]+M[p[j]]=w[9]+M[p[9]]=5+M[6]=5+16=21\\&M[j-1]=M[9-1]=M[8]=20\end{aligned}}$	True	j=p[j]=p[9]=6
6	${\begin{aligned}&w[j]+M[p[j]]=w[6]+M[p[6]]=5+M[4]=5+11=16\\&M[j-1]=M[6-1]=M[5]=11\end{aligned}}$	True	j=p[j]=p[6]=4
4	${\begin{aligned}&w[j]+M[p[j]]=w[4]+M[p[4]]=3+M[1]=3+5=8\\&M[j-1]=M[4-1]=M[3]=11\\\end{aligned}}$	False	j=j-1=4-1=3
3	${\begin{aligned}&w[j]+M[p[j]]=w[3]+M[p[3]]=6+M[1]=6+5=11\\&M[j-1]=M[3-1]=M[2]=5\end{aligned}}$	True	j=p[j]=p[3]=1
1	${\begin{aligned}&w[j]+M[p[j]]=w[1]+M[p[1]]=5+M[0]=5+0=5\\&M[j-1]=M[1-1]=M[0]=0\end{aligned}}$	True	j=p[j]=p[1]=0

Group Interval Scheduling Decision

NP-complete when some groups contain 3 or more intervals

GISDPk is NP-complete when $k\geq 3$,^[5] even when all intervals have the same length.^[6] This can be shown by a reduction from the following restricted version of the Boolean satisfiability problem, which, like the unrestricted version, was shown to be NP-complete.^[7]

Let $X=\{x_{1},x_{2},\dots ,x_{p}\}$ be a set of Boolean variables. Let $C=\{c_{1},c_{2},\dots ,c_{q}\}$ be a set of clauses over $X$ such that (1) each clause in $C$ has at most three literals and (2) each variable is restricted to appear once or twice positively and once negatively overall in $C$. Decide whether there is an assignment to variables of $X$ such that each clause in $C$ has at least one true literal.

Given an instance of this satisfiability problem, construct the following instance of GISDP. All intervals have a length of 3, so it is sufficient to represent each interval by its starting time:

For every variable $x_{i}$ (for i=1,...,p), create a group with two intervals: one starting at $50i-10$ (representing the assignment $x_{i}=\mathrm {false}$ ) and another starting at $50i+10$ (representing the assignment $x_{i}=\mathrm {true}$ ).
For every clause $c_{j}$ (for j=1,...,q), create a group with the following intervals:
- For every variable $x_{i}$ in $c_{j}$ that appears positively for the first time in C: an interval starting at $50i-12$.
- For every variable $x_{i}$ in $c_{j}$ that appears positively for the second time in C: an interval starting at $50i-8$. Note that both these intervals intersect the interval $50i-10$, associated with $x_{i}=\mathrm {false}$.
- For every variable $x_{i}$ in $c_{j}$ that appears negatively: an interval starting at $50i+8$. This interval intersects the interval $50i+10$, associated with $x_{i}=\mathrm {true}$.

Note that there is no overlap between intervals in groups associated with different clauses. This is ensured since a variable appears at most twice positively and once negatively.

The constructed GISDP has a feasible solution (i.e. a scheduling in which each group is represented), if and only if the given set of boolean clauses has a satisfying assignment. Hence GISDP3 is NP-complete, and so is GISDPk for every $k\geq 3$ .
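To make the construction concrete, the sketch below builds the groups of start times for a given formula. It is only an illustration: the function name `build_gisdp_instance` and the encoding of clauses as lists of signed integers (`+i` for $x_{i}$, `-i` for its negation) are assumptions, and the input is assumed to already satisfy the occurrence restriction stated above.

```python
from typing import Dict, List

INTERVAL_LENGTH = 3  # every interval has length 3, so a start time identifies it


def build_gisdp_instance(num_vars: int, clauses: List[List[int]]) -> List[List[int]]:
    """Return groups of interval start times: variable groups first, then clause groups.

    Assumption: each variable occurs once or twice positively and once negatively.
    """
    groups: List[List[int]] = []
    for i in range(1, num_vars + 1):
        # [start for x_i = false, start for x_i = true]
        groups.append([50 * i - 10, 50 * i + 10])

    positive_seen: Dict[int, int] = {}  # positive occurrences seen so far per variable
    for clause in clauses:
        group: List[int] = []
        for literal in clause:
            i = abs(literal)
            if literal > 0:
                count = positive_seen.get(i, 0)
                # First positive occurrence: 50i - 12; second: 50i - 8.
                group.append(50 * i - 12 if count == 0 else 50 * i - 8)
                positive_seen[i] = count + 1
            else:
                group.append(50 * i + 8)  # the single negative occurrence
        groups.append(group)
    return groups


# Hypothetical formula: (x1 or x2) and (x1 or not x2) and (not x1 or x2).
groups = build_gisdp_instance(2, [[1, 2], [1, -2], [-1, 2]])
```

For this formula the variable groups are {40, 60} and {90, 110}, and the three clause groups are {38, 88}, {42, 108} and {58, 92}. Each clause interval overlaps exactly the variable interval whose assignment would make its literal false.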

Polynomial when all groups contain at most 2 intervals

GISDP2 can be solved in polynomial time by the following reduction to the 2-satisfiability problem:^[6]

For every group i create two variables, representing its two intervals: $x_{i}$ and $y_{i}$ .
For every group i, create the clauses $x_{i}\lor y_{i}$ and $\neg {x_{i}}\lor \neg {y_{i}}$, which together represent the assertion that exactly one of these two intervals should be selected.
For every two intersecting intervals (e.g. $x_{i}$ and $y_{j}$), create the clause $\neg {x_{i}}\lor \neg {y_{j}}$, which represents the assertion that at most one of these two intervals should be selected.

This construction contains O(n²) clauses (at most one for each pair of intersecting intervals, plus two for each group). Each clause contains 2 literals. The satisfiability of such formulas can be decided in time linear in the number of clauses (see 2-SAT). Therefore, GISDP2 can be solved in polynomial time.
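The reduction can be sketched end to end in Python. This is an illustrative sketch under several assumptions that are not in the source: intervals are half-open `(start, finish)` tuples, every group has one or two intervals (a one-interval group gets the unit clause "select it"), and the 2-SAT instance is solved with a standard strongly-connected-components method (Kosaraju's algorithm). The names `solve_2sat` and `solve_gisdp2` are hypothetical.

```python
from itertools import combinations
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # half-open (start, finish); an assumption


def solve_2sat(num_vars: int, clauses: List[Tuple[int, int]]) -> Optional[List[bool]]:
    """Literal 2*v means 'v is true', 2*v+1 means 'v is false'."""
    n = 2 * num_vars
    graph: List[List[int]] = [[] for _ in range(n)]
    rev: List[List[int]] = [[] for _ in range(n)]
    for a, b in clauses:  # (a or b) gives implications: not a -> b, not b -> a
        for src, dst in ((a ^ 1, b), (b ^ 1, a)):
            graph[src].append(dst)
            rev[dst].append(src)
    order: List[int] = []
    seen = [False] * n
    for root in range(n):  # pass 1: iterative DFS, record finishing order
        if seen[root]:
            continue
        seen[root] = True
        stack = [(root, 0)]
        while stack:
            node, idx = stack.pop()
            if idx < len(graph[node]):
                stack.append((node, idx + 1))
                nxt = graph[node][idx]
                if not seen[nxt]:
                    seen[nxt] = True
                    stack.append((nxt, 0))
            else:
                order.append(node)
    comp = [-1] * n
    label = 0
    for root in reversed(order):  # pass 2: components in topological order
        if comp[root] != -1:
            continue
        comp[root] = label
        stack2 = [root]
        while stack2:
            node = stack2.pop()
            for nxt in rev[node]:
                if comp[nxt] == -1:
                    comp[nxt] = label
                    stack2.append(nxt)
        label += 1
    result: List[bool] = []
    for var in range(num_vars):
        if comp[2 * var] == comp[2 * var + 1]:
            return None  # a variable is equivalent to its own negation
        result.append(comp[2 * var] > comp[2 * var + 1])
    return result


def solve_gisdp2(groups: List[List[Interval]]) -> Optional[List[Interval]]:
    """Return one selected interval per group, or None if no schedule exists."""
    intervals = [iv for group in groups for iv in group]
    clauses: List[Tuple[int, int]] = []
    first = 0
    for group in groups:
        ids = list(range(first, first + len(group)))
        first += len(group)
        if len(ids) == 1:
            clauses.append((2 * ids[0], 2 * ids[0]))  # must select the only interval
        else:
            x, y = ids
            clauses.append((2 * x, 2 * y))          # x or y
            clauses.append((2 * x + 1, 2 * y + 1))  # not x or not y
    for a, b in combinations(range(len(intervals)), 2):
        (s1, f1), (s2, f2) = intervals[a], intervals[b]
        if s1 < f2 and s2 < f1:  # the two intervals intersect
            clauses.append((2 * a + 1, 2 * b + 1))  # not both
    assignment = solve_2sat(len(intervals), clauses)
    if assignment is None:
        return None
    return [iv for iv, keep in zip(intervals, assignment) if keep]


# Hypothetical instances.
feasible = solve_gisdp2([[(0, 2), (4, 6)], [(1, 3)]])
infeasible = solve_gisdp2([[(0, 2)], [(1, 3)]])
```

The pairwise intersection test is quadratic, which matches the O(n²) bound on the number of clauses; the 2-SAT step itself is linear in the number of clauses.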

Group Interval Scheduling Maximization

MaxSNP-complete when some groups contain 2 or more intervals

GISMPk is NP-complete even when $k\geq 2$ .^[8]

Moreover, GISMPk is MaxSNP-complete, i.e., it does not have a PTAS unless P=NP. This can be proved by showing an approximation-preserving reduction from MAX 3-SAT-3 to GISMP2.^[8]

Polynomial 2-approximation

The following greedy algorithm finds a solution that contains at least 1/2 of the optimal number of intervals:^[8]

Select the interval, x, with the earliest finishing time.
Remove x, all intervals intersecting x, and all intervals in the same group as x, from the set of candidate intervals.
Continue until the set of candidate intervals is empty.

A formal explanation is given by a charging argument.

The approximation factor of 2 is tight. For example, in the following instance of GISMP2:

Group #1: {[0..2], [4..6]}
Group #2: {[1..3]}

The greedy algorithm selects only 1 interval [0..2] from group #1, while an optimal scheduling is to select [1..3] from group #2 and then [4..6] from group #1.
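The greedy 2-approximation and the tight instance can be reproduced with a short Python sketch. The function name `greedy_group_schedule`, the `(start, finish)` tuples, and the half-open interval convention are assumptions made for illustration; a single pass over the intervals sorted by finishing time stands in for the explicit removal step.

```python
from typing import List, Set, Tuple

Interval = Tuple[int, int]  # half-open (start, finish); an assumption


def greedy_group_schedule(groups: List[List[Interval]]) -> List[Tuple[int, Interval]]:
    """Return (group index, interval) pairs chosen by earliest finishing time."""
    candidates = sorted(
        ((interval, g) for g, group in enumerate(groups) for interval in group),
        key=lambda item: item[0][1],  # sort by finishing time
    )
    chosen: List[Tuple[int, Interval]] = []
    used_groups: Set[int] = set()
    last_finish = float("-inf")
    for (start, finish), g in candidates:
        # Skip intervals whose group is already represented, and intervals
        # that intersect the most recently selected interval.
        if g in used_groups or start < last_finish:
            continue
        chosen.append((g, (start, finish)))
        used_groups.add(g)
        last_finish = finish
    return chosen


# The tight instance from the text: group #1 is index 0, group #2 is index 1.
tight = greedy_group_schedule([[(0, 2), (4, 6)], [(1, 3)]])
```

On the tight instance the sketch returns only the interval [0..2] from group #1, whereas the optimal schedule contains two intervals.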

A more general approximation algorithm attains a 2-factor approximation for the weighted case.^[3]

LP-based approximation algorithms

Using the technique of linear programming relaxation, it is possible to approximate the optimal scheduling with slightly better approximation factors. The approximation ratio of the first such algorithm is asymptotically 2 when k is large, but when k=2 the algorithm achieves an approximation ratio of 5/3.^[8] The approximation factor for arbitrary k was later improved to 1.582.^[9]

Variations

An important class of scheduling algorithms is the class of dynamic priority algorithms. When none of the intervals overlap, the optimum solution is trivial. The optimum for the non-weighted version can be found with earliest deadline first scheduling. Weighted interval scheduling is a generalization where a value is assigned to each executed task and the goal is to maximize the total value. The solution need not be unique.

The interval scheduling problem is 1-dimensional – only the time dimension is relevant. The Maximum disjoint set problem is a generalization to 2 or more dimensions. This generalization, too, is NP-complete.

Another variation is resource allocation, in which a set s of intervals is scheduled using k resources such that k is minimized. That is, all the intervals must be scheduled, but the objective is to minimize the usage of resources.
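One common way to solve this variation, which the text above does not spell out, is to process the intervals by start time and reuse the resource that becomes free earliest. The Python sketch below illustrates that approach under assumptions: the name `partition_intervals` is hypothetical, and intervals are half-open `(start, finish)` tuples.

```python
import heapq
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open (start, finish); an assumption


def partition_intervals(intervals: List[Interval]) -> Tuple[int, List[Tuple[Interval, int]]]:
    """Schedule every interval; return (resources used, [(interval, resource id)])."""
    busy: List[Tuple[int, int]] = []  # min-heap of (finish time, resource id)
    assignment: List[Tuple[Interval, int]] = []
    resources = 0
    for start, finish in sorted(intervals):  # process by start time
        if busy and busy[0][0] <= start:
            # The resource that frees up earliest is already free: reuse it.
            _, resource_id = heapq.heappop(busy)
        else:
            # Every resource is busy at this start time: open a new one.
            resource_id = resources
            resources += 1
        assignment.append(((start, finish), resource_id))
        heapq.heappush(busy, (finish, resource_id))
    return resources, assignment


# Tasks A, B and C from the introduction need two resources.
k, plan = partition_intervals([(2, 5), (4, 10), (9, 11)])
```

A new resource is opened only when all existing ones are occupied at the current start time, so the number of resources used equals the largest number of intervals that overlap at any single moment.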

Another variation is when there are m processors instead of a single processor. I.e., m different tasks can run in parallel. See identical-machines scheduling.

Single-machine scheduling is also a very similar problem.

Sources

[1] Kolen, A. (2007). "Interval scheduling: A survey". Naval Research Logistics. 54 (5): 530–543. doi:10.1002/nav.20231. S2CID 15288326.
[2] Kleinberg, Jon; Tardos, Éva (2006). Algorithm Design. Pearson/Addison-Wesley. ISBN 978-0-321-29535-4.
[3] Bar-Noy, Amotz; Bar-Yehuda, Reuven; Freund, Ari; Naor, Joseph (Seffi); Schieber, Baruch (2001-09-01). "A unified approach to approximating resource allocation and scheduling". Journal of the ACM. 48 (5): 1069–1090. doi:10.1145/502102.502107. ISSN 0004-5411. S2CID 12329294.
[4] Kleinberg, Jon; Tardos, Éva (2006). Algorithm Design (1st ed.). Pearson. p. 254. ISBN 9780321295354.
[5] Nakajima, K.; Hakimi, S. L. (1982). "Complexity results for scheduling tasks with discrete starting times". Journal of Algorithms. 3 (4): 344. doi:10.1016/0196-6774(82)90030-X.
[6] Keil, J. Mark (1992). "On the complexity of scheduling tasks with discrete starting times". Operations Research Letters. 12 (5): 293–295. doi:10.1016/0167-6377(92)90087-j.
[7] Papadimitriou, Christos H.; Steiglitz, Kenneth (July 1998). Combinatorial Optimization: Algorithms and Complexity. Dover. ISBN 978-0-486-40258-1.
[8] Spieksma, F. C. R. (1999). "On the approximability of an interval scheduling problem". Journal of Scheduling. 2 (5): 215–227. CiteSeerX 10.1.1.603.5538. doi:10.1002/(sici)1099-1425(199909/10)2:5<215::aid-jos27>3.0.co;2-y. Citing Kolen in personal communication.
[9] Chuzhoy, Julia; Ostrovsky, Rafail; Rabani, Yuval (2006). "Approximation Algorithms for the Job Interval Selection Problem and Related Scheduling Problems". Mathematics of Operations Research. 31 (4): 730–738. CiteSeerX 10.1.1.105.2578. doi:10.1287/moor.1060.0218.



