The Kedro-Dagster plugin enables seamless integration between Kedro, a framework for creating reproducible and maintainable data science code, and Dagster, a data orchestrator for machine learning and data pipelines. The plugin leverages Dagster's orchestration capabilities to automate and monitor Kedro pipelines.
- Configuration-Driven Workflows: Centralize orchestration settings in a `dagster.yml` file for each Kedro environment. Define jobs from filtered Kedro pipelines, and assign executors, retries, resource limits, and cron-based schedules.
- Customization: The core integration lives in the auto-generated Dagster `definitions.py`. For advanced use cases, you can extend or override these definitions.
- Kedro Hooks Preservation: Kedro hooks are preserved and called at the appropriate time during pipeline execution, so custom logic (e.g., data validation, logging) continues to work seamlessly.
- MLflow Compatibility: Use Kedro-MLflow with Dagster’s MLflow integration to track experiments, log models, and register artifacts.
- Logger Integration: Unifies Kedro and Dagster logging so logs from Kedro nodes appear in the Dagster UI and are easy to trace and debug.
Install the Kedro-Dagster plugin using pip:

```bash
pip install kedro-dagster
```
- Installation

  Install the plugin with `pip`:

  ```bash
  pip install kedro-dagster
  ```

  or add `kedro-dagster` to your project's `requirements.txt` or `pyproject.toml`.
- Initialize the plugin in your Kedro project

  Use the following command to generate a `definitions.py` file, where all translated Kedro objects are available as Dagster objects, and a `dagster.yml` configuration file:

  ```bash
  kedro dagster init --env <ENV_NAME>
  ```
- Configure Jobs, Executors, and Schedules

  Define your job executors and schedules in the `dagster.yml` configuration file located in your Kedro project's `conf/<ENV_NAME>` directory. This file allows you to filter Kedro pipelines and assign specific executors and schedules to them.
  ```yaml
  # conf/local/dagster.yml
  schedules:
    daily: # Schedule name
      cron_schedule: "0 0 * * *" # Schedule parameters

  executors:
    sequential: # Executor name
      in_process: # Executor parameters

    multiprocess:
      multiprocess:
        max_concurrent: 2

  jobs:
    default: # Job name
      pipeline: # Pipeline filter parameters
        pipeline_name: __default__
      executor: sequential

    parallel_data_processing:
      pipeline:
        pipeline_name: data_processing
        node_names:
          - preprocess_companies_node
          - preprocess_shuttles_node
      schedule: daily
      executor: multiprocess

    data_science:
      pipeline:
        pipeline_name: data_science
      schedule: daily
      executor: sequential
  ```
- Launch the Dagster UI

  Start the Dagster UI to monitor and manage your pipelines using the following command:

  ```bash
  kedro dagster dev --env <ENV_NAME>
  ```

  The Dagster UI will be available at http://127.0.0.1:3000.
For a concrete use case, see the Kedro-Dagster example repository.
Full documentation is available at https://gtauzin.github.io/kedro-dagster/.
We welcome contributions, feedback, and questions:
- Report issues or request features: GitHub Issues
- Join the discussion: Kedro Slack
- Contributing Guide: CONTRIBUTING.md
If you are interested in becoming a maintainer or taking a more active role, please reach out to Guillaume Tauzin on the Kedro Slack.
There is a growing community around the Kedro project and we encourage you to become part of it. You can ask and answer technical questions on the Kedro Slack and bookmark the Linen archive of past discussions. For questions related specifically to Kedro-Dagster, you can also open a discussion.
This project is licensed under the terms of the Apache 2.0 License.
This plugin is inspired by existing Kedro plugins such as the official Kedro plugins, kedro-kubeflow, and kedro-mlflow.