MaD generator: use `--threads=0` and 2GB per thread for `--ram` by default #19744

redsun82 · 2025-06-12T14:56:13Z

* use `--threads=0` and 2GB per thread for `--ram` by default
* also fix a bug where the order of model generation was determined by the order in the `download.json` file of the experiment rather than the order in the config file
* allow configuring `--ram` and `--threads` in the MaD generator scripts

The review should ignore ae3bbb0 and 5df292c in order to not look at black formatting changes.

* fix a bug where the order of model generation was determined by the order in the `download.json` file of the experiment rather than the order in the config file
* allow configuring `--ram` and `--threads` in the MaD generator scripts
* use no `--ram` and `--threads=0` by default in the bulk generator (single generator defaults are left unchanged)
* allow passing `--dca` multiple times; when a project's DB appears in more than one experiment, the DB from the experiment listed last is used. This makes it possible to run a subset of the sources in a "fixup" experiment and use it to "patch" a previous run without rerunning everything.
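The ordering fix and the "patching" behaviour of repeated `--dca` flags can be illustrated with a minimal sketch. This is not the code from `bulk_generate_mad.py`: the function `select_databases`, the project names, and the database paths are hypothetical, and each experiment is assumed to reduce to a mapping from project name to database path.

```python
def select_databases(config_projects, experiments):
    """Pick one database per project, returned in config-file order.

    config_projects: project names in the order they appear in the config.
    experiments: one dict per experiment (project name -> database path),
        in the order the experiments were given on the command line.
    """
    merged = {}
    for downloaded in experiments:
        # A later experiment overwrites earlier entries for the same project.
        merged.update(downloaded)
    # Iterate over the config rather than over the downloads, so the
    # generation order never depends on the order inside download.json.
    return [(name, merged[name]) for name in config_projects if name in merged]


# Hypothetical data: a full run, then a "fixup" run that only rebuilt one project.
config_projects = ["serde", "tokio", "clap"]
full_run = {"tokio": "run1/tokio", "clap": "run1/clap", "serde": "run1/serde"}
fixup_run = {"clap": "run2/clap"}

selected = select_databases(config_projects, [full_run, fixup_run])
for name, path in selected:
    print(name, path)
```

Here `clap` is taken from the fixup run while the other two projects still come from the full run, and the output follows the config order regardless of how the downloads were listed.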

The standalone MaD generator now uses `0` for threads and throttles the RAM to use 2GB per thread by default. Also, replaced the hand-written argument parsing with `argparse`.
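As a rough sketch of the default described above, the RAM limit can be derived from the thread count when `--ram` is not given. The helper `resolve_resources` and the constant name are hypothetical, and the handling of `--threads=0` (treated here as one thread per available core when sizing RAM) is an assumption rather than something the PR text confirms.

```python
import argparse
import os

RAM_PER_THREAD_MB = 2048  # 2GB per thread, expressed in megabytes


def resolve_resources(threads, ram):
    """Return (threads, ram_mb), filling in the RAM default if needed."""
    # Assumption: 0 threads means one thread per available core, so the
    # core count is used when sizing the RAM default.
    effective_threads = threads if threads > 0 else (os.cpu_count() or 1)
    if ram is None:
        ram = RAM_PER_THREAD_MB * effective_threads
    # The thread value itself is passed through unchanged.
    return threads, ram


def build_parser():
    p = argparse.ArgumentParser(description="resource flag sketch")
    p.add_argument("--threads", type=int, default=0)
    p.add_argument("--ram", type=int, default=None, help="RAM limit in MB")
    return p


args = build_parser().parse_args(["--threads", "4"])
threads, ram = resolve_resources(args.threads, args.ram)
print(threads, ram)  # 4 8192
```

An explicit `--ram` always wins in this sketch; the per-thread computation only applies when the flag is omitted.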

Copilot

Pull Request Overview

This PR enhances the MaD generator scripts by adding configurable resource flags, fixing model-generation ordering, and improving DCA experiment support.

* Replace manual argument parsing in `generate_mad.py` with `argparse`, introducing `--threads` and `--ram` (default 0 threads and 2 GB per thread).
* Update `bulk_generate_mad.py` to propagate these new flags into the generator, support multiple `--dca` runs, and remove the stale git-status precheck.
* Fix bug where model-generation order followed `download.json` rather than the user's config.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| misc/scripts/models-as-data/generate_mad.py | Switched to `argparse`, added `threads`/`ram` handling, and updated default RAM |
| misc/scripts/models-as-data/bulk_generate_mad.py | Changed `generate_models` signature to accept CLI args, wired new flags, updated DCA download loop |

Comments suppressed due to low confidence (2)

misc/scripts/models-as-data/bulk_generate_mad.py:228

Update this function’s docstring to document the new args parameter (its type, purpose, and which CLI flags it carries).

def generate_models(config, args, project: Project, database_dir: str) -> None:

misc/scripts/models-as-data/generate_mad.py:143

Add tests (unit or integration) to verify the behavior of the new CLI flags (--threads, --ram, --with-*) and defaulting logic in generate_mad.py.

generator = p.parse_args(namespace=Generator())
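The quoted line uses the standard `argparse` feature of parsing into an existing object via `namespace=`. A minimal sketch of that pattern, with a check on the defaults of the kind the comment asks for, might look like the following. The `Generator` class here is a hypothetical stand-in, not the class from `generate_mad.py`, and the flag defaults are illustrative.

```python
import argparse


class Generator:
    """Hypothetical stand-in: argparse stores parsed values as attributes."""

    def describe(self):
        return f"threads={self.threads} ram={self.ram}"


def build_parser():
    p = argparse.ArgumentParser()
    p.add_argument("--threads", type=int, default=0)
    p.add_argument("--ram", type=int, default=None)
    return p


# parse_args returns the very object passed as `namespace`, now populated.
generator = build_parser().parse_args(["--threads", "2"], namespace=Generator())
print(generator.describe())  # threads=2 ram=None

# Defaults can be checked by parsing an empty argument list.
defaults = build_parser().parse_args([], namespace=Generator())
print(defaults.describe())  # threads=0 ram=None
```

Passing an explicit argument list to `parse_args` keeps such checks independent of `sys.argv`, which is convenient in unit tests.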

redsun82 added 3 commits June 12, 2025 16:23

MaD generator: change default thread and ram

39a3623

The standalone MaD generator now uses `0` for threads and throttles the RAM to use 2GB per thread by default. Also, replaced the hand-written argument parsing with `argparse`.

MaD generator: run black formatter

ae3bbb0

Copilot AI review requested due to automatic review settings June 12, 2025 14:56

Copilot AI reviewed Jun 12, 2025

redsun82 mentioned this pull request Jun 12, 2025

Rust: Use QL computed canonical paths in MaD Field tokens #19667

Merged

redsun82 added 3 commits June 13, 2025 08:42

Merge branch 'main' into redsun82/mad-generator

f7266c9

MaD generator: really fix ordering problem

1a36374

MaD generator: apply black formatting to all sources

5df292c

redsun82 added a commit that referenced this pull request Jun 13, 2025

Rust: regenerate models

118456d

Models are regenerated with the fix from #19744 which corrects the order of generation.

redsun82 mentioned this pull request Jun 13, 2025

Rust: regenerate models #19748

Open
