dask_jobqueue.SGECluster

class dask_jobqueue.SGECluster(queue=None, project=None, resource_spec=None, walltime=None, config_name='sge', **kwargs)

Launch Dask on an SGE cluster

Note

If you want a specific amount of RAM, both memory and resource_spec must be specified. The exact syntax of resource_spec is defined by your GridEngine system administrator. The amount of memory requested should match the resource_spec, so that Dask’s memory management system can perform accurately.
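
For example, a minimal sketch of requesting RAM per worker job. The m_mem_free resource name used here is an assumption; it is a common but site-specific complex, so check the correct name with your GridEngine administrator:

>>> from dask_jobqueue import SGECluster
>>> cluster = SGECluster(cores=4,
...                      memory='16GB',                   # what Dask's memory manager sees
...                      resource_spec='m_mem_free=16G')  # what SGE actually reserves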

Parameters:
queue : str

Destination queue for each worker job. Passed to the #$ -q option.

project : str

Accounting string associated with each worker job. Passed to the #$ -A option.

resource_spec : str

Request resources and specify job placement. Passed to the #$ -l option.

walltime : str

Walltime for each worker job.

name : str

Name of Dask workers.

cores : int

Total number of cores per job.

memory : str

Total amount of memory per job.

processes : int

Number of processes per job.

interface : str

Network interface like ‘eth0’ or ‘ib0’.

death_timeout : float

Seconds to wait for a scheduler before closing workers.

local_directory : str

Dask worker local directory for file spilling.

extra : list

Additional arguments to pass to dask-worker.

env_extra : list

Other commands to add to script before launching worker.

python : str

Python executable used to launch Dask workers.

shebang : str

Path to desired interpreter for your batch submission script.

kwargs : dict

Additional keyword arguments to pass to LocalCluster.

Examples

>>> from dask_jobqueue import SGECluster
>>> cluster = SGECluster(queue='regular')
>>> cluster.scale(10)  # this may take a few seconds to launch
>>> from dask.distributed import Client
>>> client = Client(cluster)
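
A fuller construction, sketching typical per-job resources; the queue name, sizes, and walltime here are placeholders for your site's values:

>>> cluster = SGECluster(queue='regular', cores=24, processes=6,
...                      memory='96GB', walltime='04:00:00')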

This also works with adaptive clusters, which automatically launch and kill workers based on load.

>>> cluster.adapt()
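
Bounds can also be given in cores or memory rather than worker counts, since this cluster implements the worker_spec attribute:

>>> cluster.adapt(minimum=0, maximum=10)
>>> cluster.adapt(minimum_cores=24, maximum_cores=96)
>>> cluster.adapt(minimum_memory='60 GB', maximum_memory='1 TB')
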
__init__(queue=None, project=None, resource_spec=None, walltime=None, config_name='sge', **kwargs)

Methods

__init__([queue, project, resource_spec, …])
adapt([minimum_cores, maximum_cores, …]) Turn on adaptivity. For keyword arguments see dask.distributed.Adaptive. Because this Cluster implements the worker_spec attribute, bounds may be given as minimum_cores/maximum_cores or minimum_memory/maximum_memory instead of minimum/maximum worker counts.
close(**kwargs) Stops all running and pending jobs and stops the scheduler
job_file() Write job submission script to temporary file
job_script() Construct a job submission script
scale([n, cores, memory]) Scale cluster to n workers, or to the given number of cores or memory; cores and memory are converted into a number of workers using the worker_spec attribute.
scale_down(workers[, n]) Close the workers with the given addresses
scale_up(n, **kwargs) Brings total worker count up to n
start_workers([n]) Start workers and point them to our local scheduler
stop_all_jobs() Stops all running and pending jobs
stop_jobs(jobs) Stop a list of jobs
stop_workers(workers) Stop a list of workers
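
For example, a quick sketch of driving these methods by hand; the values passed here are illustrative:

>>> print(cluster.job_script())  # preview the generated submission script
>>> cluster.scale(cores=48)      # ask for enough jobs to cover 48 cores
>>> cluster.stop_all_jobs()      # tear down all running and pending jobs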

Attributes

cancel_command
dashboard_link
finished_jobs Jobs that have finished
job_id_regexp
pending_jobs Jobs pending in the queue
running_jobs Jobs with currently active workers
scheduler The scheduler of this cluster
scheduler_address
scheduler_name
submit_command
worker_threads
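
For example, a sketch of polling job state through these attributes:

>>> cluster.submit_command   # the queue submission command, e.g. qsub
>>> cluster.pending_jobs     # jobs submitted but not yet running workers
>>> cluster.running_jobs     # jobs with currently active workers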