jobqueue {jobqueue}    R Documentation

Assigns Jobs to Workers

Description

Creates a queue that assigns jobs to a pool of background worker processes. Jobs go in. Results come out.

Usage

jobqueue(
  globals = NULL,
  packages = NULL,
  namespace = NULL,
  init = NULL,
  max_cpus = availableCores(),
  workers = ceiling(max_cpus * 1.2),
  timeout = NULL,
  hooks = NULL,
  reformat = NULL,
  signal = FALSE,
  cpus = 1L,
  stop_id = NULL,
  copy_id = NULL
)

Arguments

globals

A named list of variables that every <job>$expr will have access to. Alternatively, an object that can be coerced to a named list with as.list(), e.g. a named vector, data.frame, or environment.

packages

Character vector of package names to load on workers.

namespace

The name of a package to attach to the worker's environment.

init

A call or R expression wrapped in curly braces to evaluate on each worker just once, immediately after start-up. Will have access to variables defined by globals and assets from packages and namespace. Returned value is ignored.

max_cpus

Total number of CPU cores that can be reserved by all running jobs (sum(<job>$cpus)). Does not enforce limits on actual CPU utilization.

workers

How many background worker processes to start. Set to more than max_cpus to enable standby workers to quickly swap out with workers that need to restart.

timeout

A named numeric vector indicating the maximum number of seconds allowed for each state the job passes through, or 'total' to apply a single timeout from 'submitted' to 'done'. Can also limit the 'starting' state for workers. A function (job) can be used in place of a number. Example: timeout = c(total = 2.5, running = 1). See vignette('stops').
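A minimal sketch of a queue-wide total timeout, using only the arguments documented on this page. The 0.5 second limit and the sleeping job are illustrative values. That the stopped job's result is a condition object is an assumption, based on the interrupt behaviour described for stop_id.

```r
library(jobqueue)

# Illustrative limit: each job may spend at most 0.5 s from 'submitted' to 'done'.
jq <- jobqueue(workers = 1, timeout = c(total = 0.5))

# This job sleeps far longer than the limit, so it should be stopped early.
job <- jq$run({ Sys.sleep(5); "finished" })

# With the default signal = FALSE, the result is returned rather than raised.
res <- job$result
print(class(res))

jq$stop()
```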

hooks

A named list of functions to run when the job state changes, of the form hooks = list(created = function (job) {...}), or a function (job) that returns such a list. Names of job hooks are typically 'created', 'submitted', 'queued', 'dispatched', 'starting', 'running', 'done', or '*' (duplicates okay). See vignette('hooks').
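As a rough illustration, the sketch below registers one function for the 'done' state so that a message is printed whenever a job finishes. It assumes the hook receives the job object, as the function (job) form above suggests. The message text is made up for this example.

```r
library(jobqueue)

# Illustrative hook: announce each job that reaches the 'done' state.
jq <- jobqueue(
  workers = 1,
  hooks   = list(done = function (job) message("a job has finished"))
)

job <- jq$run({ 1 + 1 })
res <- job$result

jq$stop()
```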

reformat

Set reformat = function (job) to define what <job>$result should return. The default, reformat = NULL, passes <job>$output to <job>$result unchanged. See vignette('results').
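The following sketch shows one possible reformat function that wraps each job's raw output in a list. The list layout and the name value are hypothetical choices for this example, not something the package prescribes.

```r
library(jobqueue)

# Illustrative reformat: <job>$result returns the raw output wrapped in a list.
jq <- jobqueue(
  workers  = 1,
  reformat = function (job) list(value = job$output)
)

job <- jq$run({ 6 * 7 })
res <- job$result
str(res)

jq$stop()
```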

signal

Should calling <job>$result signal condition objects? When FALSE, <job>$result will return the condition object without taking additional action. Setting signal to TRUE, or to a character vector of condition classes, e.g. c('interrupt', 'error', 'warning'), will cause the equivalent of stop(<condition>) to be called when those conditions are produced. Alternatively, a function (job) that returns TRUE or FALSE. See vignette('results').
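To make the difference concrete, here is a small sketch with signal = TRUE, where an error raised inside a job is raised again when the result is read. The error message and the variable name caught are illustrative only.

```r
library(jobqueue)

# signal = TRUE: conditions produced by a job are raised when $result is read.
jq <- jobqueue(workers = 1, signal = TRUE)

job <- jq$run({ stop("something went wrong") })

# Catch the re-raised error; 'caught' records whether the handler ran.
caught <- tryCatch(
  { job$result; FALSE },
  error = function (e) TRUE
)
print(caught)

jq$stop()
```

With the default signal = FALSE, the same job$result call would instead return the error as a condition object, which can be checked with inherits().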

cpus

The default number of CPU cores per job, or a function (job) that returns the number of CPU cores to reserve for a given job. Used to limit the number of jobs running simultaneously to respect <jobqueue>$max_cpus. Does not prevent a job from using more CPUs than reserved.

stop_id

If an existing job in the jobqueue has the same stop_id, that job will be stopped and return an 'interrupt' condition object as its result. stop_id can also be a function (job) that returns the stop_id to assign to a given job. A stop_id of NULL disables this feature. See vignette('stops').

copy_id

If an existing job in the jobqueue has the same copy_id, the newly submitted job will become a "proxy" for that earlier job, returning whatever result the earlier job returns. copy_id can also be a function (job) that returns the copy_id to assign to a given job. A copy_id of NULL disables this feature. See vignette('stops').

Value

A jobqueue object.

Examples



jq <- jobqueue(globals = list(N = 42), workers = 2)
print(jq)

job <- jq$run({ paste("N is", N) })
job$result

jq$stop()


[Package jobqueue version 1.7.0 Index]