Creating a new Dagster project

The easiest way to start building a Dagster project is by using the dagster project CLI. This CLI tool helps generate files and folder structures that enable you to quickly get started with Dagster.


Bootstrapping a new project

You can scaffold a new project using the default project skeleton, or start with one of the official Dagster examples.

Using the default project skeleton

To get started, you can run:

pip install dagster
dagster project scaffold --name my-dagster-project

The dagster project scaffold command generates a folder structure containing a single Dagster code location, along with supporting files such as pyproject.toml and setup.py. The result is an empty but fully configured project that you can start building on right away.

Here's a breakdown of the files and directories that are generated:

| File/Directory | Description |
| --- | --- |
| my_dagster_project/ | A Python package that contains your new Dagster code. |
| my_dagster_project_tests/ | A Python package that contains tests for my_dagster_project. |
| README.md | A description and starter guide for your new Dagster project. |
| pyproject.toml | A file that specifies package core metadata in a static, tool-agnostic way. This file includes a tool.dagster section that references the Python package whose Dagster definitions are defined and discoverable at the top level. This allows you to use the dagster dev command to load your Dagster code without any parameters. Refer to the Code locations documentation to learn more. Note: pyproject.toml was introduced in PEP 518 and is meant to replace setup.py, but we may still include a setup.py for compatibility with tools that do not use this spec. |
| setup.py | A build script that declares the Python package dependencies for your new project as a package. |
| setup.cfg | An ini file that contains option defaults for setup.py commands. |

Inside the my_dagster_project/ directory, the following files and directories are generated:

| File/Directory | Description |
| --- | --- |
| my_dagster_project/__init__.py | The __init__.py file includes a Definitions object that contains all the definitions defined within your project. A definition can be an asset, a job, a schedule, a sensor, or a resource. This allows Dagster to load the definitions in an installed package. Refer to the Code locations documentation to learn other ways to deploy and load your Dagster code. |
| my_dagster_project/assets.py | A Python module that contains software-defined assets. Note: As your project grows, we recommend organizing assets in sub-packages or sub-modules. For example, you can put all analytics-related assets in a my_dagster_project/assets/analytics/ folder and use load_assets_from_package_module in the top-level definitions to load them, so that you don't have to manually add each new asset to the top-level definitions. Similarly, you can use load_assets_from_modules to load assets from single Python files. Refer to the Fully Featured Project guide for more info and best practices. |

Getting started

The newly generated my-dagster-project directory is a fully functioning Python package and can be installed with pip.

  1. To install it as a package, along with its Python dependencies, run:

    pip install -e ".[dev]"
    

    Using the --editable (-e) flag instructs pip to install your code location as a Python package in "editable mode" so that as you develop, local code changes are automatically applied.

  2. Run the following to start the Dagit web server:

    dagster dev
    

    Note: This command also starts the Dagster daemon. Refer to the Running Dagster locally guide for more info.

  3. Use your browser to open http://localhost:3000 to view the project.


Development

Adding new Python dependencies

You can specify new Python dependencies in setup.py.

Environment variables and secrets

Environment variables, which are key-value pairs configured outside your source code, allow you to dynamically modify application behavior depending on environment.

Using environment variables, you can define various configuration options for your Dagster application and securely set up secrets. For example, instead of hard-coding database credentials - which is bad practice and cumbersome for development - you can use environment variables to supply user details. This allows you to parameterize your pipeline without modifying code or insecurely storing sensitive data.

Refer to the Using environment variables and secrets in Dagster code guide for more info and examples.
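
As a minimal sketch of the idea, the following reads database settings from the environment using only the Python standard library. The variable names DB_HOST, DB_USER, and DB_PASSWORD and the helper get_database_config are hypothetical; Dagster's own mechanisms for environment variables are covered in the guide mentioned above.

```python
import os


def get_database_config(env=None):
    """Build connection settings from environment variables.

    `env` defaults to os.environ; passing a plain dict makes the
    function easy to exercise in tests.
    """
    env = os.environ if env is None else env
    return {
        # Optional setting with a development-friendly default.
        "host": env.get("DB_HOST", "localhost"),
        # Required settings: a missing variable raises KeyError instead
        # of silently falling back to a hard-coded credential.
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
    }
```

Failing fast on a missing secret is a deliberate choice here: it surfaces misconfiguration at startup rather than partway through a run.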

Unit testing

Tests can be added in the my_dagster_project_tests directory and run using pytest:

pytest my_dagster_project_tests
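
A test module in that directory might look like the sketch below. It assumes the transformation logic is kept in a plain function, here a hypothetical double_values, which in a real project would be imported from my_dagster_project rather than defined in the test file.

```python
def double_values(values):
    """Hypothetical transformation logic that an asset might call."""
    return [value * 2 for value in values]


def test_double_values():
    # pytest collects functions whose names start with "test_".
    assert double_values([1, 2, 3]) == [2, 4, 6]


def test_double_values_empty_input():
    # Edge case: an empty input should produce an empty output.
    assert double_values([]) == []
```

Keeping logic in plain functions like this is one way to test it without starting any Dagster processes.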

Deployment

Once your project is ready to move to production, check out our recommendation for Transitioning Data Pipelines from Development to Production.

Check out the following resources to learn more about deployment options: