Use snakemake conda envs with GitHub Actions/cache #26

@lczech

Hi there,

I am trying to create CI for a Snakemake workflow that needs to install several conda envs for different rules. Is there a way to use the GitHub actions/cache to eliminate the need to install the envs for every run?

Background

The main problem: Snakemake installs the per-rule conda envs with a name that is a hash of the env file. This makes it hard to know which path to use for the actions/cache. Is there a way to do this?
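To make the question concrete, a step along the following lines is roughly what I am after. This is only a sketch under assumptions: it supposes that the envs end up under `.snakemake/conda` in the working directory (Snakemake's default location, as far as I can tell, unless `--conda-prefix` is set), and that the env files live in a hypothetical `workflow/envs/` directory. The action versions and core count are placeholders.

```yaml
# Sketch only: the env directory and env file locations are assumptions.
steps:
  - uses: actions/checkout@v4

  - name: Cache Snakemake conda envs
    uses: actions/cache@v4
    with:
      # Assumed default directory in which Snakemake creates its hash-named envs.
      path: .snakemake/conda
      # The key changes whenever any env file changes (hypothetical file location).
      key: snakemake-conda-${{ runner.os }}-${{ hashFiles('workflow/envs/*.yaml') }}

  - name: Run workflow
    run: snakemake --use-conda --cores 2
```

Caching the whole parent directory would avoid having to know the individual hash-based env names, at the cost of invalidating all envs when any single env file changes.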

Secondary question

As a secondary question: if this were solved, how would it work across jobs? I have a matrix of jobs in my GitHub Actions workflow (testing different aspects of my Snakemake workflow), several of which use the same rules and conda envs. It would be ideal if the conda envs could be reused across them.

However, I could not find any information on how race conditions are resolved when multiple jobs use the same cache key. Imagine GitHub Actions jobs A and B run in parallel: job A starts with the actions/cache empty and hence creates a conda env to be cached. If job B starts before job A is done, it will not yet find the cache. This is because:

On a cache miss, the action automatically creates a new cache if the job completes successfully. (source)

Will job B then also start creating the conda env and filling the cache? Or will it wait for job A to finish?

Cheers and thanks in advance
Lucas
