jobpool


Download

You can download the latest version of jobpool (0.2) here. It is known to compile under Linux, FreeBSD, and MacOS X; if you find an OS for which it does not compile, let me know and I'll try to fix it.


Description

Jobpool is a shell script utility for running jobs in parallel. It manages a queue of jobs by ensuring that no more than J jobs run at one time.

I wrote it because I often write shell scripts that resize or convert the format of a large number of images. I wanted to run these jobs in parallel, but if they all ran at once my machine ran out of RAM.

A command is put under jobpool's control by prefixing it with "jobpool". If used in a loop, many commands may be added to the queue, and jobpool will ensure that at most J of them run at a time.
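To make the "at most J at a time" rule concrete, here is a rough plain-shell sketch of the same limit. It is not how jobpool is implemented: it simply runs jobs in batches of J and waits for each batch to finish, which is cruder than a real queue. The function name run_limited and the value J=2 are assumptions made for this illustration.

```shell
#!/bin/sh
# Sketch only: a batch-based stand-in for the "at most J jobs" rule.
# This is NOT jobpool's implementation; run_limited is a hypothetical helper.
J=2          # maximum number of concurrent jobs (illustrative value)
running=0    # jobs started in the current batch

run_limited() {
    "$@" &                       # start the command in the background
    running=$((running + 1))
    if [ "$running" -ge "$J" ]; then
        wait                     # block until the whole batch has finished
        running=0
    fi
}

for n in 1 2 3 4 5; do
    run_limited echo "job $n done"
done
wait    # collect any job left in the final, partial batch
```

Unlike this sketch, a job queue can start the next job as soon as any running job finishes, rather than waiting for a whole batch.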

Jobpool has the following command switches:


Limitations and oddities

Jobpool is not a shell. For example,

  > jobpool ls * | wc -l

pipes the output of jobpool into wc; it does not run the entire pipeline as a single jobpool job. In this situation I write small shell scripts and execute those using jobpool.
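A minimal sketch of that workaround follows. The script name count-files.sh and the temporary directory are assumptions made for illustration, and the fallback branch exists only so the sketch still runs on a machine without jobpool installed.

```shell
#!/bin/sh
# Sketch only: wrap a pipeline in a small script so it can be queued as one job.
# The file name count-files.sh is hypothetical.
dir=$(mktemp -d)
script="$dir/count-files.sh"

cat > "$script" <<'EOF'
#!/bin/sh
# The whole pipeline lives inside the script, so it is a single command.
ls | wc -l
EOF
chmod +x "$script"

# Queue the script as one job; run it directly if jobpool is not installed.
if command -v jobpool >/dev/null 2>&1; then
    jobpool "$script"
else
    "$script"
fi
```

Because the pipe is inside the script, the shell that invokes jobpool never sees it, and the whole pipeline counts as one job in the queue.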

Jobpool does not automatically run in the background. This is because every instance of jobpool peeks at the job queue and attempts to run a job if fewer than J jobs are running; if an instance is unable to run a job, it will exit. Thus you cannot expect jobpool to exit immediately, nor can you expect it to wait until its job completes unless it is given the -w switch. (You can, however, expect at least one instance of jobpool to be running until the job queue is empty.)

Jobpool runs commands in the order they entered the queue. If you create a lot of jobpool instances using & (as in example 1), queue order might not be the same as the loop order, because some instances may run faster than others.


Examples

  1. Resizing a large number of images (using ImageMagick):
        > for image in *.jpg; do
        >   jobpool mogrify -scale 50\% "$image" &
        > done

    In this example, control of the shell will be returned immediately, and jobpool will run at most 2 instances of mogrify at a time.

    (Note that jobpool does not automatically run in the background; the ampersand is what puts each instance there.)

  2. Suppose that task1 processes must run in order, that task2 processes must also run in order, that the order of task3 processes does not matter, and that it doesn't matter if a task1, a task2 and a task3 run at the same time:
        > ( for i in *; do
        >     jobpool -w task1 "$i"
        >   done ) &
        > ( for i in *; do
        >     jobpool -w task2 "$i"
        >   done ) &
        > for i in *; do
        >   jobpool task3 "$i" &
        > done
        > wait

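    Here, jobpool will run at most 2 tasks at a time. Within each of the task1 and task2 loops the jobs are guaranteed to run sequentially, because the -w switch makes each jobpool instance wait until its job completes before the loop continues.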


This page, its contents and style, and the software available from this page, are the responsibility of the author and do not necessarily represent the views, policies or opinions of The University of Melbourne.
