Executors are responsible for executing steps within a job run. Once a run has launched and the process for the run (the run worker) is allocated and started, the executor assumes responsibility for execution.
Executors range from simple single-process serial executors to executors that manage per-step computational resources with a sophisticated control plane.
Every job has an executor. The default is `multi_or_in_process_executor`, which executes each step in its own process by default but can be configured to execute all steps within a single process.
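For example, the following is a minimal sketch, assuming the documented run-config schema for `multi_or_in_process_executor`, of switching the default executor to single-process execution by baking run config into the job (the job name and op are illustrative):

```python
from dagster import job, op


@op
def an_op():
    ...


# A minimal sketch: run config selecting in-process execution for the
# default multi_or_in_process_executor. The same dict could instead be
# supplied at launch time.
@job(config={"execution": {"config": {"in_process": {}}}})
def single_process_job():
    an_op()
```

To attach a specific executor to an individual job instead, pass `executor_def` directly, as in the example below.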
```python
from dagster import graph, job, multiprocess_executor


# Providing an executor using the job decorator
@job(executor_def=multiprocess_executor)
def the_job():
    ...


@graph
def the_graph():
    ...


# Providing an executor using graph_def.to_job(...)
other_job = the_graph.to_job(executor_def=multiprocess_executor)
```
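Executors also accept their own run configuration. As a minimal sketch, assuming `multiprocess_executor`'s documented `max_concurrent` option, the following caps a job at four concurrent step processes by supplying default config through `to_job` (the job name is illustrative):

```python
# A minimal sketch: cap the multiprocess executor at four concurrent
# step processes. Assumes the_graph and multiprocess_executor from the
# example above.
capped_job = the_graph.to_job(
    name="capped_job",
    executor_def=multiprocess_executor,
    config={"execution": {"config": {"max_concurrent": 4}}},
)
```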
To specify a default executor for all jobs and assets provided to a code location, supply the `executor` argument to the `Definitions` object.
If a job explicitly specifies an executor, then that executor will be used. Otherwise, jobs that don't specify an executor will use the default provided to the code location:
```python
from dagster import Definitions, asset, define_asset_job, job, multiprocess_executor


@asset
def the_asset():
    pass


asset_job = define_asset_job("the_job", selection="*")


@job
def op_job():
    ...


# op_job and asset_job will both use the multiprocess_executor,
# since neither defines its own executor.
defs = Definitions(
    assets=[the_asset],
    jobs=[asset_job, op_job],
    executor=multiprocess_executor,
)
```
The executor system is pluggable: it's possible to write your own executor to target a different execution substrate. Note, however, that this is not currently well documented, and the internal APIs are still in flux.
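As a rough illustration only, a custom executor typically has two pieces: an executor definition created with the public `@executor` decorator, and an `Executor` implementation that it returns. The `Executor` base class and `RetryMode` below live in private modules (`dagster._core.*`) whose interfaces may change between releases, so treat this as a hypothetical sketch rather than a stable recipe:

```python
from dagster import executor
from dagster._core.execution.retries import RetryMode
from dagster._core.executor.base import Executor


# Hypothetical sketch: Executor is a private base class and its
# interface may change between Dagster releases.
class SerialExecutor(Executor):
    @property
    def retries(self) -> RetryMode:
        # Tell the framework this executor does not retry steps.
        return RetryMode.DISABLED

    def execute(self, plan_context, execution_plan):
        # A real implementation iterates over the execution plan,
        # runs each step on the target substrate, and yields the
        # resulting events back to the framework.
        raise NotImplementedError("target-substrate logic goes here")


@executor(name="serial_executor")
def serial_executor(init_context) -> Executor:
    # init_context carries executor config and instance information.
    return SerialExecutor()
```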