Pilot job

In computer science, a pilot job is a form of multilevel scheduling in which an application first acquires a resource through the local job scheduler and then schedules its work onto that resource directly, rather than submitting each work unit separately, which could incur a queue wait for every unit. The term comes from the Condor High-Throughput Computing System, in which Condor GlideIns [1] provide this functionality. Other examples of pilot-job systems are BigJob, implemented in SAGA, [2] Swift Coasters, part of the Swift [3] parallel scripting system, the Falkon [4] lightweight task execution framework, and HTCaaS. [5]

Pilot jobs are most often used on systems with batch queues, since part of their purpose is to avoid a separate queue wait for each work unit. Such queues are most often found on parallel computing systems, but pilot jobs are usually part of a distributed application, and they are often associated with many-task computing.
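
The pattern can be sketched in a few lines. In the toy Python below, every name is invented for illustration: the pilot function stands in for the single job submitted through the batch scheduler, and once it is running it pulls work units from an application-level queue, so individual units never wait in the system queue. Real pilot systems such as glideins add matchmaking, authentication, multi-node pilots, and fault tolerance on top of this basic loop.

```python
# Minimal sketch of the pilot-job pattern (illustrative names only).
import queue
import subprocess
import threading

def pilot(task_queue):
    """Body of a pilot job. In practice this would be submitted ONCE
    through the local batch scheduler; after that single queue wait it
    pulls work units directly, so tasks never wait in the system queue."""
    while True:
        cmd = task_queue.get()
        if cmd is None:          # sentinel: no more work
            break
        subprocess.run(cmd)      # execute one work unit in place
        task_queue.task_done()

tasks = queue.Queue()
for i in range(100):             # 100 work units, one batch submission
    tasks.put(["echo", f"work unit {i}"])
tasks.put(None)

# The thread stands in for the batch job the scheduler started for us.
worker = threading.Thread(target=pilot, args=(tasks,))
worker.start()
worker.join()
```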

Related Research Articles

Grid computing is the use of widely distributed computer resources to reach a common goal. A computing grid can be thought of as a distributed system with non-interactive workloads that involve many files. Grid computing is distinguished from conventional high-performance computing systems such as cluster computing in that grid computers have each node set to perform a different task/application. Grid computers also tend to be more heterogeneous and geographically dispersed than cluster computers. Although a single grid can be dedicated to a particular application, commonly a grid is used for a variety of purposes. Grids are often constructed with general-purpose grid middleware software libraries. Grid sizes can be quite large.

The Globus Toolkit is an open-source toolkit for grid computing developed and provided by the Globus Alliance. On 25 May 2017 it was announced that the open source support for the project would be discontinued in January 2018, due to a lack of financial support for that work. The Globus service continues to be available to the research community under a freemium approach, designed to sustain the software, with most features freely available but some restricted to subscribers.

MOSIX is a proprietary distributed operating system. Although early versions were based on older UNIX systems, since 1999 it has focused on Linux clusters and grids. In a MOSIX cluster/grid there is no need to modify or link applications with any library, to copy files or log in to remote nodes, or even to assign processes to different nodes; it is all done automatically, as in an SMP system.

HTCondor is an open-source high-throughput computing software framework for coarse-grained distributed parallelization of computationally intensive tasks. It can be used to manage workload on a dedicated cluster of computers, or to farm out work to idle desktop computers – so-called cycle scavenging. HTCondor runs on Linux, Unix, Mac OS X, FreeBSD, and Microsoft Windows operating systems. HTCondor can integrate both dedicated resources and non-dedicated desktop machines into one computing environment.

A job scheduler is a computer application for controlling unattended background program execution of jobs. This is commonly called batch scheduling, as execution of non-interactive jobs is often called batch processing, though traditional job scheduling and batch processing are distinguished and contrasted; see batch processing for details. Other synonyms include batch system, distributed resource management system (DRMS), distributed resource manager (DRM), and, commonly today, workload automation (WLA). The data structure holding the jobs to run is known as the job queue.
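
As a minimal illustration of a job queue, the hypothetical Python sketch below orders jobs by priority and, within a priority, by submission time; production DRM systems of course track far more per-job state than this.

```python
# A toy job queue as a priority heap: a lower priority number runs
# first, ties broken by submission order. Field names are invented.
import heapq
import itertools

class JobQueue:
    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # FIFO among equal priorities

    def submit(self, priority, job):
        heapq.heappush(self._heap, (priority, next(self._counter), job))

    def dispatch(self):
        _, _, job = heapq.heappop(self._heap)
        return job

q = JobQueue()
q.submit(10, "nightly-backup")
q.submit(1, "urgent-report")
q.submit(10, "log-rotation")
print(q.dispatch())  # urgent-report
print(q.dispatch())  # nightly-backup (submitted before log-rotation)
```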

The Terascale Open-source Resource and QUEue Manager (TORQUE) is a distributed resource manager providing control over batch jobs and distributed compute nodes. TORQUE can integrate with the non-commercial Maui Cluster Scheduler or the commercial Moab Workload Manager to improve overall utilization, scheduling and administration on a cluster.

Distributed Resource Management Application API (DRMAA) is a high-level Open Grid Forum (OGF) API specification for the submission and control of jobs to a distributed resource management (DRM) system, such as a cluster or grid computing infrastructure. The scope of the API covers all the high level functionality required for applications to submit, control, and monitor jobs on execution resources in the DRM system.
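
As a hedged sketch of what DRMAA usage can look like, the following assumes the third-party drmaa-python bindings and a DRMAA-capable cluster (for example Grid Engine with its DRMAA library); details vary by DRM system.

```python
# Sketch of DRMAA job submission via the drmaa-python bindings;
# assumes a DRMAA-capable DRM system is installed and configured.
import drmaa

with drmaa.Session() as session:
    jt = session.createJobTemplate()
    jt.remoteCommand = "/bin/sleep"   # executable to run on the cluster
    jt.args = ["10"]
    job_id = session.runJob(jt)       # submit through the DRM system
    # Block until the job finishes, then inspect its exit status.
    info = session.wait(job_id, drmaa.Session.TIMEOUT_WAIT_FOREVER)
    print(f"job {info.jobId} finished with status {info.exitStatus}")
    session.deleteJobTemplate(jt)
```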

A cluster manager is usually backend graphical user interface (GUI) or command-line interface (CLI) software that runs on the set of cluster nodes it manages. It works together with a cluster management agent that runs on each node to configure and manage services, a set of services, or the cluster itself. In some cases the cluster manager is mostly used to dispatch work for the cluster to perform; a subset of the cluster manager can then be a remote desktop application used not for configuration but only to send work to, and collect results from, the cluster. In other cases the cluster manager is concerned more with availability and load balancing than with computational or service-specific clusters.

The Open Grid Forum (OGF) is a community of users, developers, and vendors for standardization of grid computing. It was formed in 2006 in a merger of the Global Grid Forum and the Enterprise Grid Alliance. The OGF models its process on the Internet Engineering Task Force (IETF), and produces documents with many acronyms such as OGSA, OGSI, and JSDL.

Data parallelism is parallelization across multiple processors in parallel computing environments. It focuses on distributing the data across different nodes, which operate on the data in parallel. It can be applied to regular data structures like arrays and matrices by working on each element in parallel. It contrasts with task parallelism, another form of parallelism.
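
A minimal Python sketch of the idea: the same function is applied to disjoint chunks of one array by a pool of worker processes.

```python
# Data parallelism in miniature: one operation applied to disjoint
# chunks of an array by a pool of worker processes.
from multiprocessing import Pool

def square(x):
    return x * x

if __name__ == "__main__":
    data = list(range(1_000_000))
    with Pool(processes=4) as pool:
        # chunksize controls how the data is partitioned across workers
        result = pool.map(square, data, chunksize=10_000)
    print(result[:5])  # [0, 1, 4, 9, 16]
```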

Arvind is the Johnson Professor of Computer Science and Engineering in the Computer Science and Artificial Intelligence Laboratory (CSAIL) at the Massachusetts Institute of Technology (MIT). He is a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) and the Association for Computing Machinery (ACM). He was also elected a member of the National Academy of Engineering in 2008 for contributions to data flow and multithread computing and the development of tools for the high-level synthesis of hardware.

Parallel Extensions was the development name for a managed concurrency library developed in a collaboration between Microsoft Research and the CLR team at Microsoft. The library was released in version 4.0 of the .NET Framework. It is composed of two parts: Parallel LINQ (PLINQ) and the Task Parallel Library (TPL). It also includes a set of coordination data structures (CDS), data structures used to synchronize and coordinate the execution of concurrent tasks.

HPX, short for High Performance ParalleX, is a runtime system for high-performance computing. It is currently under active development by the STE||AR group at Louisiana State University. Focused on scientific computing, it provides an alternative execution model to conventional approaches such as MPI. HPX aims to overcome the challenges MPI faces on increasingly large supercomputers by using asynchronous communication between nodes and lightweight control objects instead of global barriers, allowing application developers to exploit fine-grained parallelism.
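
HPX itself is a C++ runtime, but the barrier-versus-futures idea can be illustrated generically. In the Python asyncio sketch below, which is an analogue rather than HPX code, each consumer runs as soon as its own producer finishes, instead of all tasks waiting at a global barrier for the slowest one.

```python
# Generic illustration (not HPX code): with futures/continuations,
# each consumer starts as soon as its own producer finishes, rather
# than everyone synchronizing at one global barrier.
import asyncio
import random

async def produce(i):
    await asyncio.sleep(random.uniform(0.1, 1.0))  # uneven work
    return i * i

async def consume(i, producer):
    value = await producer        # fine-grained dependency, no barrier
    print(f"consumer {i} got {value}")

async def main():
    producers = [asyncio.create_task(produce(i)) for i in range(4)]
    consumers = [consume(i, p) for i, p in enumerate(producers)]
    await asyncio.gather(*consumers)

asyncio.run(main())
```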

Many-task computing (MTC) in computational science is an approach to parallel computing that aims to bridge the gap between two computing paradigms: high-throughput computing (HTC) and high-performance computing (HPC).

Data-intensive computing is a class of parallel computing applications which use a data-parallel approach to process large volumes of data, typically terabytes or petabytes in size and commonly referred to as big data. Computing applications which devote most of their execution time to computational requirements are deemed compute-intensive, whereas applications which require large volumes of data and devote most of their processing time to I/O and manipulation of data are deemed data-intensive.

In parallel computing, work stealing is a scheduling strategy for multithreaded computer programs. It solves the problem of executing a dynamically multithreaded computation, one that can "spawn" new threads of execution, on a statically multithreaded computer, with a fixed number of processors. It does so efficiently in terms of execution time, memory usage, and inter-processor communication.
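
A toy Python sketch of the strategy follows; plain locks substitute for the lock-free deques real runtimes use. Each worker pops its own tasks from one end of its deque and, when idle, steals from the other end of a random victim's deque.

```python
# Toy work-stealing scheduler: a worker takes its own tasks LIFO from
# one end of its deque; an idle worker steals FIFO from the other end
# of a random victim's deque.
import collections
import random
import threading
import time

NUM_WORKERS, NUM_TASKS = 4, 20
deques = [collections.deque() for _ in range(NUM_WORKERS)]
lock = threading.Lock()          # one global lock keeps the toy simple
results = []

def worker(me, stop):
    while not stop.is_set():
        task = None
        with lock:
            if deques[me]:
                task = deques[me].pop()              # own end: newest first
            else:
                victim = random.randrange(NUM_WORKERS)
                if deques[victim]:
                    task = deques[victim].popleft()  # steal end: oldest first
        if task is not None:
            task()
        else:
            time.sleep(0.001)                        # back off while idle

# Seed all work on worker 0; stealing spreads it across the others.
for i in range(NUM_TASKS):
    deques[0].append(lambda i=i: results.append(i))

stop = threading.Event()
threads = [threading.Thread(target=worker, args=(w, stop)) for w in range(NUM_WORKERS)]
for t in threads:
    t.start()
while len(results) < NUM_TASKS:                      # wait for completion
    time.sleep(0.001)
stop.set()
for t in threads:
    t.join()
print(sorted(results))                               # every task ran exactly once
```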

Swift is an implicitly parallel programming language that allows writing scripts that distribute program execution across distributed computing resources, including clusters, clouds, grids, and supercomputers. Swift implementations are open-source software under the Apache License, version 2.0.

In the high-performance computing environment, a burst buffer is a fast intermediate storage layer positioned between the front-end computing processes and the back-end storage systems. It bridges the performance gap between the processing speed of the compute nodes and the input/output (I/O) bandwidth of the storage systems. Burst buffers are often built from arrays of high-performance storage devices, such as NVRAM and SSDs, and typically offer one to two orders of magnitude higher I/O bandwidth than the back-end storage systems.
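
A toy Python model of the idea, with tier names and speeds invented for illustration: compute processes deposit checkpoints into a fast intermediate queue and continue immediately, while a background drainer flushes them to the slow back-end storage.

```python
# Toy burst-buffer model: writes land in a fast intermediate tier at
# memory speed; a drainer thread flushes to slow storage in the
# background, so compute barely blocks on I/O.
import queue
import threading
import time

burst_buffer = queue.Queue()

def drain_to_backend():
    """Background flush from the fast tier to slow storage."""
    while True:
        block = burst_buffer.get()
        if block is None:            # sentinel: shut down
            break
        time.sleep(0.1)              # simulated slow back-end I/O
        burst_buffer.task_done()

drainer = threading.Thread(target=drain_to_backend)
drainer.start()

start = time.time()
for _ in range(10):
    burst_buffer.put(b"checkpoint")  # absorbed at fast-tier speed
print(f"compute blocked for {time.time() - start:.3f}s")  # ~0s

burst_buffer.put(None)
drainer.join()                       # flushing finishes asynchronously
```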

Ishfaq Ahmad is a computer scientist, IEEE Fellow, and Professor of Computer Science and Engineering at the University of Texas at Arlington (UTA). He is the director of the Center for Advanced Computing Systems (CACS) and previously directed IRIS at UTA. He is widely recognized for his contributions to scheduling techniques in parallel and distributed computing systems, and to video coding.

References

  1. Sfiligoi, I. (2008). "glideinWMS—a generic pilot-based workload management system". Journal of Physics: Conference Series. 119 (6): 062044. doi:10.1088/1742-6596/119/6/062044.
  2. Luckow, André; Lacinski, Lukasz; Jha, Shantenu (2010). "SAGA BigJob: An Extensible and Interoperable Pilot-Job Abstraction for Distributed Applications and Systems". 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing. pp. 135–144. doi:10.1109/CCGRID.2010.91. ISBN 978-1-4244-6987-1.
  3. Wilde, Michael; et al. (2011). "Swift: A language for distributed parallel scripting". Parallel Computing. 37 (9): 633–652. CiteSeerX 10.1.1.658.8990. doi:10.1016/j.parco.2011.05.005.
  4. Raicu, I.; Zhao, Y.; Dumitrescu, C.; Foster, I.; Wilde, M. (2007). "Falkon: A Fast and Lightweight Task Execution Framework". IEEE/ACM SC 2007. http://www.cs.iit.edu/~iraicu/research/publications/2007_SC07_Falkon.pdf
  5. Kim, Jik-Soo; Rho, Seungwoo; Kim, Seoyoung; Kim, Sangwan; Kim, Seokkyoo; Hwang, Soonwook (2013). "HTCaaS: Leveraging Distributed Supercomputing Infrastructures for Large-Scale Scientific Computing". 6th ACM Workshop on Many-Task Computing on Clouds, Grids, and Supercomputers (MTAGS'13), held with SC13. http://datasys.cs.iit.edu/events/MTAGS13/p02.pdf