Iron Mind: Manuscript Figure Generation

This repository contains the code to reproduce all figures from the Iron Mind manuscript. The repository provides Python scripts to generate publication-quality figures from benchmark optimization data across six chemical reaction datasets. The preprint can be found on arXiv: https://arxiv.org/abs/2509.00103

Website for testing optimizers and engaging in human-driven optimization campaigns

This work comes with a website where users can test both the LLM and Bayesian optimization (BO) strategies on the benchmark datasets. Users can also conduct their own human-driven optimization campaigns on the datasets. You can access the website here.

Repository Structure

computed_descriptors/ - Descriptors used for Bayesian optimization methods
descriptors/ - Code to reproduce descriptors
figures/ - Python scripts for generating manuscript figures
histograms/ - Histogram plots showing objective distributions for each dataset
schematics/ - Chemical reaction schematics for each dataset

Datasets

The repository works with six chemical reaction optimization datasets:

Buchwald-Hartwig - C-N coupling reactions (yield optimization)
Suzuki-Miyaura A - Cross-coupling reactions (yield optimization)
Suzuki-Miyaura B - Cross-coupling reactions (conversion optimization)
Reductive Amination - Amine synthesis (conversion optimization)
N-Alkylation/Deprotection - Two-step synthesis (yield optimization)
Chan-Lam Coupling - C-N coupling reactions (multi-objective: desired vs undesired yield)

Installation (should take less than 5 minutes)

Clone this repository:

git clone https://github.com/gomesgroup/iron-mind-public.git
cd iron-mind-public

Create a conda environment with required dependencies:

conda create -n iron-mind-figures python=3.10
conda activate iron-mind-figures

Install the required packages:

pip install git+https://github.com/gomesgroup/olympus.git
pip install pandas numpy matplotlib seaborn scikit-learn plotly scipy

Set up static image export for Plotly figures:

pip install kaleido
plotly_get_chrome
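
As an optional sanity check after installation, the following minimal sketch reports which of the packages above cannot be found in the active environment. The module names are assumptions where they differ from the pip names: `sklearn` is the import name of scikit-learn, and `olympus` is assumed to be the import name of the gomesgroup/olympus fork. The helper `missing_modules` is hypothetical and not part of this repository.

```python
from importlib.util import find_spec

# Import names for the packages installed above. "sklearn" is the import
# name of scikit-learn; "olympus" is ASSUMED to be the import name of the
# gomesgroup/olympus fork.
REQUIRED_MODULES = [
    "olympus", "pandas", "numpy", "matplotlib", "seaborn",
    "sklearn", "plotly", "scipy", "kaleido",
]


def missing_modules(modules):
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only locates modules without importing them, so it does not verify versions or that Chrome was installed for Kaleido.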

Data Access

The benchmark optimization data used to generate these figures is available on Hugging Face:

pip install huggingface-hub
hf auth login
hf download gomesgroup/iron-mind-data runs.zip --repo-type dataset --local-dir .
unzip runs.zip

This will produce the runs/ directory in your current working directory. Use this path when generating figures.
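
The same download can be sketched from Python instead of the `hf` command line. This is a minimal example, assuming the `huggingface_hub` package is installed and that you have already authenticated; the function names `fetch_runs_archive` and `extract_archive` are hypothetical helpers, not part of this repository.

```python
import zipfile
from pathlib import Path


def fetch_runs_archive(local_dir="."):
    """Download runs.zip from the Hugging Face dataset repo; return its path."""
    # Imported lazily so extract_archive works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id="gomesgroup/iron-mind-data",
        filename="runs.zip",
        repo_type="dataset",
        local_dir=local_dir,
    )


def extract_archive(zip_path, dest="."):
    """Extract a zip archive into dest and return dest as an absolute path."""
    dest = Path(dest).resolve()
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
    return dest
```

Calling `extract_archive(fetch_runs_archive("."), ".")` should leave the runs/ directory under the returned absolute path, which can then be passed to the figure scripts.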

Figure Reproduction (should take less than 5 minutes)

Each figure script in the figures/ directory can be run independently:

cd figures/
python figure_2.py
python figure_3.py
python figure_5_S12.py
...

Some figure scripts require the path to the runs/ directory. Provide it as an absolute path, not a relative one.

To generate all figures:

bash generate_all_figures.sh <path_to_runs>

The path_to_runs must be an absolute path.
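
Because the path must be absolute, it can help to resolve it programmatically before invoking the script. The sketch below is one way to do that from Python; it assumes `generate_all_figures.sh` is run from the repository root, and the helpers `build_command` and `generate_all_figures` are hypothetical names introduced only for illustration.

```python
import subprocess
from pathlib import Path


def build_command(runs_dir, script="generate_all_figures.sh"):
    """Return the bash command with runs_dir converted to an absolute path."""
    runs_path = Path(runs_dir).expanduser().resolve()
    return ["bash", script, str(runs_path)]


def generate_all_figures(runs_dir, repo_root="."):
    """Run the script from repo_root (ASSUMED to be the repository root)."""
    runs_path = Path(runs_dir).expanduser().resolve()
    if not runs_path.is_dir():
        raise FileNotFoundError(f"runs directory not found: {runs_path}")
    # check=True raises CalledProcessError if the script exits non-zero.
    subprocess.run(build_command(runs_path), cwd=repo_root, check=True)
```

For example, `generate_all_figures("runs")` resolves the relative name `runs` against the current working directory before passing it to the script.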

Generated figures are saved to the figures/pngs/ directory.

Descriptors

The descriptors used for Bayesian optimization can be found in computed_descriptors/. The code to reproduce them is in descriptors/.

Citation

If you use this code or data, please cite our manuscript:

@article{macknight2025iron,
  title={Pre-trained knowledge elevates large language models beyond traditional chemical reaction optimizers},
  author={MacKnight, Robert and Regio, Jose Emilio and Ethier, Jeffrey G. and Baldwin, Luke A. and Gomes, Gabe},
  journal={arXiv preprint arXiv:2509.00103},
  year={2025}
}
