Executors#

Executors are responsible for executing steps within a job run. Once a run has launched and the process for the run (the run worker) is allocated and started, the executor assumes responsibility for execution.

Executors can range from single-process serial executors to managing per-step computational resources with a sophisticated control plane.


Relevant APIs#

NameDescription
@executorThe decorator used to define executors. Defines an ExecutorDefinition.
ExecutorDefinitionAn executor definition.

Specifying executors#

Directly on jobs#

Every job has an executor. The default executor is the multi_or_in_process_executor, which by default executes each step in its own process. This executor can be configured to execute each step within the same process.

An executor can be specified directly on a job by supplying an ExecutorDefinition to the executor_def parameter of @job or GraphDefinition.to_job:

from dagster import graph, job, multiprocess_executor


# Providing an executor using the job decorator
@job(executor_def=multiprocess_executor)
def the_job():
    ...


@graph
def the_graph():
    ...


# Providing an executor using graph_def.to_job(...)
other_job = the_graph.to_job(executor_def=multiprocess_executor)

For a code location#

To specify a default executor for all jobs and assets provided to a code location, supply the executor argument to the Definitions object.

If a job explicitly specifies an executor, then that executor will be used. Otherwise, jobs that don't specify an executor will use the default provided to the code location:

from dagster import multiprocess_executor, define_asset_job, asset, Definitions


@asset
def the_asset():
    pass


asset_job = define_asset_job("the_job", selection="*")


@job
def op_job():
    ...


# op_job and asset_job will both use the multiprocess_executor,
# since neither define their own executor.

defs = Definitions(
    assets=[the_asset], jobs=[asset_job, op_job], executor=multiprocess_executor
)

Note: Executing a job via JobDefinition.execute_in_process overrides the job's executor and uses in_process_executor instead.


Example executors#

NameDescription
in_process_executorExecution plan executes serially within the run worker itself.
multiprocess_executorExecutes each step within its own spawned process. Has a configurable level of parallelism.
dask_executorExecutes each step within a Dask task.
celery_executorExecutes each step within a Celery task.
docker_executorExecutes each step within a Docker container.
k8s_job_executorExecutes each step within an ephemeral Kubernetes pod.
celery_k8s_job_executorExecutes each step within a ephemeral Kubernetes pod, using Celery as a control plane for prioritization and queuing.
celery_docker_executorExecutes each step within a Docker container, using Celery as a control plane for prioritization and queueing.

Custom executors#

The executor system is pluggable, meaning it's possible to write your own executor to target a different execution substrate. Note that this is not currently well-documented and the internal APIs continue to be in flux.