HTCondorCluster(n_workers=0, job_cls: dask_jobqueue.core.Job = None, loop=None, security=None, silence_logs='error', name=None, asynchronous=False, interface=None, host=None, protocol='tcp://', dashboard_address=':8787', config_name=None, **kwargs)
Launch Dask on an HTCondor cluster with a shared file system
- disk: str
    Total amount of disk per job
- job_extra: dict
    Extra submit file attributes for the job
- cores: int
    Total number of cores per job
- memory: str
    Total amount of memory per job
- processes: int
    Cut the job up into this many processes. Good for GIL workloads or for nodes with many cores.
- interface: str
    Network interface like 'eth0' or 'ib0'.
- nanny: bool
    Whether or not to start a nanny process
- local_directory: str
    Dask worker local directory for file spilling.
- death_timeout: float
    Seconds to wait for a scheduler before closing workers
- extra: list
    Additional arguments to pass to dask-worker
- env_extra: list
    Other commands to add to script before launching worker.
- header_skip: list
    Lines to skip in the header. Header lines matching this text will be removed
- log_directory: str
    Directory to use for job scheduler logs.
- shebang: str
    Path to desired interpreter for your batch submission script.
- python: str
    Python executable used to launch Dask workers. Defaults to the Python that is submitting these jobs
- config_name: str
    Section to use from jobqueue.yaml configuration file.
- name: str
    Name of Dask worker. This is typically set by the Cluster
- n_workers: int
    Number of workers to start by default. Defaults to 0. See the scale method
- silence_logs: str
    Log level like "debug", "info", or "error" to emit here if the scheduler is started locally
- asynchronous: bool
    Whether or not to run this cluster object with the async/await syntax
- security: Security
    A dask.distributed security object if you're using TLS/SSL
- dashboard_address: str or int
    An address like ":8787" on which to host the Scheduler's dashboard
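The cores, memory, and processes parameters interact: dask-jobqueue divides each job's resources evenly across its worker processes. A minimal sketch of that split (the helper name `per_process_resources` is illustrative, not part of the dask-jobqueue API):

```python
from math import floor

def per_process_resources(cores, memory_bytes, processes):
    """Illustrative only: each job's cores and memory are split
    evenly across its worker processes, so each worker gets
    cores/processes threads and memory/processes bytes."""
    return floor(cores / processes), memory_bytes // processes

# A 24-core, 4 GB job cut into 3 processes gives each worker
# 8 threads and roughly a third of the memory.
threads, mem = per_process_resources(24, 4_000_000_000, 3)
```

This is why a job with many cores is often cut into several processes: GIL-bound workloads benefit from more workers with fewer threads each.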
>>> from dask_jobqueue.htcondor import HTCondorCluster
>>> cluster = HTCondorCluster(cores=24, memory="4GB", disk="4GB")
>>> cluster.scale(jobs=10)  # ask for 10 jobs

>>> from dask.distributed import Client
>>> client = Client(cluster)
This also works with adaptive clusters, which automatically launch and kill workers based on load.
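Conceptually, adaptive scaling clamps the worker count requested by the scheduler to the bounds you configure. A minimal sketch of that bookkeeping (the helper `clamp_target` is illustrative, not dask code):

```python
def clamp_target(requested, minimum, maximum):
    """Sketch of the adaptive bound check: the scheduler's requested
    worker count is kept within [minimum, maximum]."""
    return max(minimum, min(requested, maximum))

# An idle cluster is still kept at the minimum; heavy load is capped.
idle = clamp_target(0, 2, 10)    # -> 2
busy = clamp_target(50, 2, 10)   # -> 10
```

The real adapt method (see the method table below) takes job-oriented bounds such as minimum_jobs and maximum_jobs rather than raw worker counts.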
__init__(self, n_workers=0, job_cls: dask_jobqueue.core.Job = None, loop=None, security=None, silence_logs='error', name=None, asynchronous=False, interface=None, host=None, protocol='tcp://', dashboard_address=':8787', config_name=None, **kwargs)
Initialize self. See help(type(self)) for accurate signature.
__init__(self[, n_workers, loop, security, …])
adapt(self, *args[, minimum_jobs, …])
Scale Dask cluster automatically based on scheduler activity.
logs(self[, scheduler, workers])
Return logs for the scheduler and workers
new_worker_spec(self)
Return name and spec for the next worker
scale(self[, n, jobs, memory, cores])
Scale cluster to specified configurations.
scale_up(self[, n, memory, cores])
Scale cluster to n workers
sync(self, func, *args[, asynchronous, …])
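Because each HTCondor job can host several worker processes, a target worker count passed to scale rounds up to whole jobs. A minimal sketch of that relationship (the helper `jobs_for_workers` is illustrative, not part of the API):

```python
import math

def jobs_for_workers(n_workers, processes_per_job):
    """Sketch of the jobs/workers relationship: each job hosts
    `processes_per_job` worker processes, so a worker target
    rounds up to a whole number of jobs."""
    return math.ceil(n_workers / processes_per_job)

# With 3 processes per job, asking for 10 workers needs 4 jobs.
needed = jobs_for_workers(10, 3)
```

This is why scale accepts either n (workers) or jobs directly: specifying jobs avoids the rounding step entirely.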