Official implementation of the paper "Targeted Visual Prompting for Medical Visual Question Answering," presented at the AMAI 2024 Workshop of the MICCAI 2024 conference. For more details, please refer to our paper.
This repo is undergoing a cleaning and organization process.
After cloning the repo, create a new environment, activate it, and then install the required packages by running:
pip install -r requirements.txt
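A minimal sketch using conda (you could equally use venv; the environment name and Python version below are placeholders, not requirements of this repo):
conda create -n targeted-vp python=3.10
conda activate targeted-vp
pip install -r requirements.txt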
Download the original data from here and the processed annotation files from here. Alternatively, run the prepare_data.py script in the folder corresponding to each dataset (ris, insegcat, or dme) to prepare the data.
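For example, for the RIS dataset (assuming the dataset folders sit at the repository root; check the script itself for any required arguments):
python ris/prepare_data.py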
Depending on where you place the downloaded data, you will need to configure the paths in the subsequent steps.
To run the code, use the bash scripts in the folder scripts_vqa. For example, to run the crop region baseline on the RIS dataset, run:
bash scripts_vqa/ris/crop_region.sh
Note that the paths to the datasets have to be configured in the scripts in advance.
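The exact variable names differ per script; a hypothetical sketch of the kind of edit involved (these names are illustrative, not the actual ones in the repo):
# at the top of scripts_vqa/ris/crop_region.sh (illustrative variable names)
DATA_PATH=/path/to/ris/data        # where the downloaded/prepared data lives
OUTPUT_PATH=/path/to/experiments   # where checkpoints and logs are written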
The test scripts are located in the same folders as the fine-tuning scripts, and the same command format is used to evaluate performance. For example, to evaluate the draw region baseline on the DME dataset, use:
bash scripts_vqa/dme/draw_region_test.sh
As with fine-tuning, the paths have to be configured in advance.
The metrics will be printed automatically at the end of the inference process.
This work was carried out at the AIMI Lab of the ARTORG Center for Biomedical Engineering Research of the University of Bern. Please cite this work as:
@article{tascon2024targeted,
  title={Targeted Visual Prompting for Medical Visual Question Answering},
  author={Tascon-Morales, Sergio and M{\'a}rquez-Neila, Pablo and Sznitman, Raphael},
  journal={arXiv preprint arXiv:2408.03043},
  year={2024}
}