Job scheduler


A job scheduler is a computer application that controls the unattended background execution of jobs.[1] This is commonly called batch scheduling, because the execution of non-interactive jobs is often called batch processing, although job and batch are traditionally distinguished and contrasted. Other synonyms include batch system, distributed resource management system (DRMS), distributed resource manager (DRM), and, commonly today, workload automation (WLA). The data structure that holds the jobs to run is known as the job queue.


Modern job schedulers typically provide a graphical user interface and a single point of control for definition and monitoring of background executions in a distributed network of computers. Increasingly, job schedulers are required to orchestrate the integration of real-time business activities with traditional background IT processing across different operating system platforms and business application environments.

Job scheduling should not be confused with process scheduling, which is the assignment of currently running processes to CPUs by the operating system.

Overview

Basic features expected of job scheduler software include:

If software from a completely different area includes all or some of those features, this software can be considered to have job scheduling capabilities.

Most operating systems, such as Unix and Windows, provide basic job scheduling capabilities, notably through at and batch, cron, and the Windows Task Scheduler. Web hosting services provide job scheduling capabilities through a control panel or a webcron solution. Many programs, such as DBMSs, backup tools, ERP systems, and BPM suites, also include relevant job scheduling capabilities. Job scheduling supplied by an operating system (OS) or by a point program usually cannot schedule beyond a single OS instance or outside the remit of the specific program. Organizations needing to automate unrelated IT workload may also leverage further advanced features from a job scheduler, such as:

These advanced capabilities can be written by in-house developers but are more often provided by suppliers who specialize in systems-management software.
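The time-based scheduling offered by basic tools such as cron can be illustrated with a minimal sketch. It assumes the common five-field expression layout (minute, hour, day of month, month, day of week) and is a simplification, not a full cron implementation: the function names are hypothetical, month and weekday names are not supported, and all five fields must match.

```python
from datetime import datetime


def field_matches(field: str, value: int) -> bool:
    """Return True if one cron-style field accepts the given value.

    Supports '*', '*/n', 'a-b', single numbers, and comma-separated lists.
    The '*/n' step is treated as 'value divisible by n', which is exact
    for the minute and hour fields because they start at 0.
    """
    for part in field.split(","):
        if part == "*":
            return True
        if part.startswith("*/"):
            if value % int(part[2:]) == 0:
                return True
        elif "-" in part:
            low, high = map(int, part.split("-"))
            if low <= value <= high:
                return True
        elif int(part) == value:
            return True
    return False


def is_due(expression: str, when: datetime) -> bool:
    """Return True if the five-field expression matches the given minute."""
    minute, hour, day, month, weekday = expression.split()
    # cron counts Sunday as 0; Python's weekday() counts Monday as 0.
    cron_weekday = (when.weekday() + 1) % 7
    return (
        field_matches(minute, when.minute)
        and field_matches(hour, when.hour)
        and field_matches(day, when.day)
        and field_matches(month, when.month)
        and field_matches(weekday, cron_weekday)
    )
```

For example, `is_due("30 2 * * 1-5", some_datetime)` would be true only at 02:30 on a weekday. A scheduler built on this idea would call such a check once per minute for every defined job.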

Main concepts

Several concepts are central to almost every job scheduler implementation and are widely recognized with minimal variations: jobs, dependencies, job streams, and users.
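How these concepts fit together can be sketched in a few lines of Python. The sketch below is illustrative only: the `Job` fields, the `run_order` helper, and the idea of modelling a job stream as a plain list are assumptions made for this example, not the data model of any particular scheduler. It uses the standard library's `graphlib` module (Python 3.9 or later) to derive an execution order that respects dependencies.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass
class Job:
    name: str
    command: str
    owner: str  # the user on whose behalf the job runs
    depends_on: list[str] = field(default_factory=list)


def run_order(stream: list[Job]) -> list[str]:
    """Return the job names of a stream in an order that respects dependencies."""
    # Map each job to the set of jobs that must finish before it starts.
    graph = {job.name: set(job.depends_on) for job in stream}
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(graph).static_order())


# A hypothetical job stream of three jobs owned by the same user.
nightly_stream = [
    Job("load", "load.sh", owner="etl", depends_on=["transform"]),
    Job("transform", "transform.sh", owner="etl", depends_on=["extract"]),
    Job("extract", "extract.sh", owner="etl"),
]
```

Calling `run_order(nightly_stream)` yields the jobs with every dependency placed before the job that needs it, regardless of the order in which the jobs were defined.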

Beyond the basic, single-OS-instance scheduling tools, two major architectures exist for job scheduling software.

History

Job scheduling has a long history. Job schedulers have been one of the major components of IT infrastructure since the early mainframe systems. At first, stacks of punched cards were processed one after the other, hence the term "batch processing".

From a historical point of view, two main eras of job schedulers can be distinguished:

  1. The mainframe era
    • Job Control Language (JCL) on IBM mainframes. Initially based on JCL functionality to handle dependencies, this era is typified by the development of sophisticated scheduling solutions (such as Job Entry Subsystem 2/3) forming part of the systems management and automation toolset on the mainframe.
  2. The open systems era
    • Modern schedulers on a variety of architectures and operating systems. With standard scheduling tools limited to commands such as at and batch, the need for mainframe-standard job schedulers has grown with the increased adoption of distributed computing environments.

In terms of the type of scheduling there are also distinct eras:

  1. Batch processing - the traditional date- and time-based execution of background tasks within a defined period during which resources are available for batch processing (the batch window). In effect, this is the original mainframe approach transposed onto the open systems environment.
  2. Event-driven process automation - used where background processes cannot simply be run at a defined time, either because the nature of the business demands that workload is based on the occurrence of external events (such as the arrival of an order from a customer or a stock update from a store branch), or because the batch window is insufficient or does not exist.
  3. Service-oriented job scheduling - developments in service-oriented architecture (SOA) have seen a move towards deploying job scheduling as a reusable IT infrastructure service that can play a role in integrating existing business application workload with new Web Services-based real-time applications.
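The difference between the first two approaches can be sketched as two kinds of trigger held by the same scheduler. This is a minimal illustration under stated assumptions: the `TriggerTable` class, its method names, and the job and event names are all hypothetical, and a real scheduler would also handle dates, calendars, and the delivery of events.

```python
from collections import defaultdict
from datetime import datetime, time


class TriggerTable:
    """Holds time-based and event-based triggers for named jobs."""

    def __init__(self) -> None:
        self.timed: list[tuple[time, str]] = []            # (time of day, job name)
        self.on_event: dict[str, list[str]] = defaultdict(list)  # event -> job names

    def at(self, when: time, job: str) -> None:
        """Register a job to run daily at a fixed time (batch-window style)."""
        self.timed.append((when, job))

    def when_event(self, event: str, job: str) -> None:
        """Register a job to run whenever a named external event occurs."""
        self.on_event[event].append(job)

    def due_at(self, now: datetime) -> list[str]:
        """Jobs whose scheduled hour and minute match the current time."""
        return [
            job
            for when, job in self.timed
            if (when.hour, when.minute) == (now.hour, now.minute)
        ]

    def triggered_by(self, event: str) -> list[str]:
        """Jobs that should start because the given event has just occurred."""
        return list(self.on_event.get(event, []))


table = TriggerTable()
table.at(time(2, 0), "nightly_report")                # time-based
table.when_event("order_received", "process_order")  # event-driven
```

In this sketch the time-based job only becomes due when a clock tick matches its schedule, whereas the event-driven job is selected as soon as the event is reported, independent of any batch window.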

Scheduling

Various schemes are used to decide which particular job to run. Parameters that might be considered include:
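As an illustration of one possible scheme, the sketch below selects jobs by a single parameter, a numeric priority, and falls back to submission order among jobs of equal priority. The `JobQueue` class and its method names are assumptions made for this example; production schedulers typically weigh several parameters at once.

```python
import heapq
import itertools
from typing import Optional


class JobQueue:
    """A job queue that hands out the highest-priority job first."""

    def __init__(self) -> None:
        self._heap: list[tuple[int, int, str]] = []
        self._counter = itertools.count()  # tie-breaker preserving submission order

    def submit(self, name: str, priority: int) -> None:
        """Add a job; a lower number means a higher priority (an assumption here)."""
        heapq.heappush(self._heap, (priority, next(self._counter), name))

    def next_job(self) -> Optional[str]:
        """Remove and return the next job to run, or None if the queue is empty."""
        if not self._heap:
            return None
        _priority, _sequence, name = heapq.heappop(self._heap)
        return name


queue = JobQueue()
queue.submit("report", priority=5)
queue.submit("backup", priority=1)
queue.submit("cleanup", priority=5)
```

Successive calls to `queue.next_job()` return the backup job first, then the two priority-5 jobs in the order they were submitted.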



References