Adversarial and poisoning attacks against multimodal retrieval-augmented generation (RAG)
- Install `uv`: `pip install uv`
- Install the project as editable: `uv pip install -e .`
- To install optional dependencies: `uv pip install -e ".[nb,opt,test,dev]"`
- Alternatively: `pip install -r requirements.txt` or `pip install -r requirements.opt.txt`
To train the attack:

```bash
python src/attack_train.py --config-name <name>
```

Experiments are defined in the `experiments` module. Fields of the config can be overridden on the command line, e.g.:

```bash
python src/attack_train.py --config-name <name> train.n_gradient_steps=10
```
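
For orientation, here is a minimal, hypothetical sketch of what such an experiment configuration schema might look like. Only `ExperimentConfig`, `ExperimentTrainConfig`, `ExperimentEvalConfig` and the `n_gradient_steps`, `save_folder` and `results_folder` fields appear elsewhere in this README; the defaults and everything else below are illustrative assumptions, not the project's actual schema.

```python
# Hypothetical sketch of the experiment config schema; defaults are illustrative only.
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class ExperimentTrainConfig:
    n_gradient_steps: int = 100                     # overridable via train.n_gradient_steps=10
    save_folder: Path = Path("data/attacks/draft")  # assumed default output location


@dataclass
class ExperimentEvalConfig:
    results_folder: Path = Path("data/results/draft")  # assumed default output location


@dataclass
class ExperimentConfig:
    train: ExperimentTrainConfig = field(default_factory=ExperimentTrainConfig)
    test: ExperimentEvalConfig = field(default_factory=ExperimentEvalConfig)
```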
To evaluate the attack:

```bash
python src/attack_eval.py --config-name <name>
```
The experiment outputs are written to the `data` directory and are split into the following subdirectories:

- `attacks`: contains the output adversarial images. File names contain the name of the experiment that generated them and a hash of the experiment parameters.
- an embeddings cache: contains the cached embeddings of datasets computed with embedding models. File names contain the names of the dataset and of the embedding model.
- `results`: contains the results JSON files produced by the evals. File names contain the name of the experiment that generated them and a hash of the experiment parameters.
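
The exact file-naming logic lives in the code, but as a rough illustration of the "experiment name plus parameter hash" convention described above, names could be built along these lines (the helper below is hypothetical, not part of the project):

```python
# Hypothetical illustration of "<experiment name>_<parameter hash>" naming;
# the project derives its own hash from the experiment configuration.
import hashlib
import json


def output_filename(experiment_name: str, params: dict, suffix: str) -> str:
    # Hash a canonical JSON encoding of the parameters so identical
    # configurations always map to the same file name.
    digest = hashlib.sha256(json.dumps(params, sort_keys=True).encode()).hexdigest()[:8]
    return f"{experiment_name}_{digest}{suffix}"


# e.g. output_filename("my_experiment", {"n_gradient_steps": 10}, ".json")
# -> "my_experiment_<8-char hash>.json"
```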
By default, files are written to the `draft` subdirectories of `attacks` and `results`, which are ignored. To share work, move files into the parent directories.
You can then adjust the paths to point to the parent directories, either in the experiment configuration:
```python
configstore.store(
    ...,
    node=ExperimentConfig(
        train=ExperimentTrainConfig(
            ...,
            save_folder=ATTACKS_FOLDER.parent,
        ),
        test=ExperimentEvalConfig(
            ...,
            results_folder=RESULTS_FOLDER.parent,
        ),
    ),
)
```
or via the CLI:

```bash
python src/attack_train.py --config-name <name> train.save_folder="data/attacks" eval.results_folder="data/results"
```
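
Assuming `attack_eval.py` accepts the same command-line overrides as `attack_train.py` (an assumption, not confirmed above), the evaluation can be pointed at the shared directories in the same way:

```bash
python src/attack_eval.py --config-name <name> train.save_folder="data/attacks" eval.results_folder="data/results"
```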