Executors are responsible for executing steps within a job run. Once a run has launched and the process for the run (the run worker) is allocated and started, the executor assumes responsibility for execution.
Executors range from simple single-process serial executors to executors that manage per-step computational resources with a sophisticated control plane.
Every job has an executor. The default is `multi_or_in_process_executor`, which executes each step in its own process by default but can be configured to execute all steps within a single process.
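For example, the following is a minimal sketch, assuming the documented run-config schema for `multi_or_in_process_executor`, of switching the default executor to single-process execution by baking run config into the job (the job name and op are illustrative):

```python
from dagster import job, op


@op
def an_op():
    ...


# A minimal sketch: run config selecting in-process execution for the
# default multi_or_in_process_executor. The same dict could instead be
# supplied at launch time.
@job(config={"execution": {"config": {"in_process": {}}}})
def single_process_job():
    an_op()
```

To attach a specific executor to an individual job instead, pass `executor_def` directly, as in the example below.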
```python
from dagster import graph, job, multiprocess_executor


# Providing an executor using the job decorator
@job(executor_def=multiprocess_executor)
def the_job():
    ...


@graph
def the_graph():
    ...


# Providing an executor using graph_def.to_job(...)
other_job = the_graph.to_job(executor_def=multiprocess_executor)
```
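Executors also accept their own run configuration. As a minimal sketch, assuming `multiprocess_executor`'s documented `max_concurrent` option, the following caps a job at four concurrent step processes by supplying default config through `to_job` (the job name is illustrative):

```python
# A minimal sketch: cap the multiprocess executor at four concurrent
# step processes. Assumes the_graph and multiprocess_executor from the
# example above.
capped_job = the_graph.to_job(
    name="capped_job",
    executor_def=multiprocess_executor,
    config={"execution": {"config": {"max_concurrent": 4}}},
)
```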
To specify a default executor for all jobs and assets provided to a code location, supply the `executor` argument to the `Definitions` object.
If a job explicitly specifies an executor, then that executor will be used. Otherwise, jobs that don't specify an executor will use the default provided to the code location:
```python
from dagster import Definitions, asset, define_asset_job, job, multiprocess_executor


@asset
def the_asset():
    pass


asset_job = define_asset_job("the_job", selection="*")


@job
def op_job():
    ...


# op_job and asset_job will both use the multiprocess_executor,
# since neither defines its own executor.
defs = Definitions(
    assets=[the_asset],
    jobs=[asset_job, op_job],
    executor=multiprocess_executor,
)
```
The executor system is pluggable: it's possible to write your own executor to target a different execution substrate. Note, however, that this is not currently well documented, and the internal APIs are still in flux.
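As a rough illustration only, a custom executor typically has two pieces: an executor definition created with the public `@executor` decorator, and an `Executor` implementation that it returns. The `Executor` base class and `RetryMode` below live in private modules (`dagster._core.*`) whose interfaces may change between releases, so treat this as a hypothetical sketch rather than a stable recipe:

```python
from dagster import executor
from dagster._core.execution.retries import RetryMode
from dagster._core.executor.base import Executor


# Hypothetical sketch: Executor is a private base class and its
# interface may change between Dagster releases.
class SerialExecutor(Executor):
    @property
    def retries(self) -> RetryMode:
        # Tell the framework this executor does not retry steps.
        return RetryMode.DISABLED

    def execute(self, plan_context, execution_plan):
        # A real implementation iterates over the execution plan,
        # runs each step on the target substrate, and yields the
        # resulting events back to the framework.
        raise NotImplementedError("target-substrate logic goes here")


@executor(name="serial_executor")
def serial_executor(init_context) -> Executor:
    # init_context carries executor config and instance information.
    return SerialExecutor()
```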