This repository contains a Python implementation of the paper "Robust Conformal Outlier Detection under Contaminated Reference Data". It includes the proposed Label-Trim method, implementations of the baseline methods, and the code for the real-data experiments.
Conformal prediction is a flexible framework for calibrating machine learning predictions, providing distribution-free statistical guarantees. In outlier detection, this calibration relies on a reference set of labeled inlier data to control the type-I error rate. However, obtaining a perfectly labeled inlier reference set is often unrealistic, and a more practical scenario involves access to a contaminated reference set containing a small fraction of outliers. This paper analyzes the impact of such contamination on the validity of conformal methods. We prove that under realistic, non-adversarial settings, calibration on contaminated data yields conservative type-I error control, shedding light on the inherent robustness of conformal methods. This conservativeness, however, typically results in a loss of power. To alleviate this limitation, we propose a novel, active data-cleaning framework that leverages a limited labeling budget and an outlier detection model to selectively annotate data points in the contaminated reference set that are suspected as outliers. By removing only the annotated outliers in this "suspicious" subset, we can effectively enhance power while mitigating the risk of inflating the type-I error rate, as supported by our theoretical analysis. Experiments on real datasets validate the conservative behavior of conformal methods under contamination and show that the proposed data-cleaning strategy improves power without sacrificing validity.
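For intuition, the sketch below shows standard conformal outlier detection with a contaminated reference set. It is a self-contained toy example, not the repository's Label-Trim implementation: the Gaussian score distributions, the 3% contamination rate, and the idealized cleaning step (all outliers assumed annotated and removed) are assumptions made purely for illustration.

```python
# Toy sketch of conformal outlier detection with a contaminated reference set.
# NOT the repository's implementation; all numbers below are illustrative.
import numpy as np

def conformal_pvalue(test_score, cal_scores):
    """Conformal p-value; larger scores are treated as more outlying."""
    n = len(cal_scores)
    return (1 + np.sum(cal_scores >= test_score)) / (n + 1)

rng = np.random.default_rng(0)
n_cal, p_cal = 2000, 0.03                        # reference size and contamination rate
is_outlier = rng.random(n_cal) < p_cal           # hidden contamination labels
cal_scores = np.where(is_outlier,
                      rng.normal(3.0, 1.0, n_cal),   # outlier scores (larger on average)
                      rng.normal(0.0, 1.0, n_cal))   # inlier scores

# Contamination inflates the reference scores, and hence every p-value:
# type-I error control for inliers becomes conservative, but power against
# true outliers drops. Removing annotated outliers tightens the reference set.
test_outlier_score = rng.normal(3.0, 1.0)        # a hypothetical outlying test point
print("p-value, contaminated reference:",
      conformal_pvalue(test_outlier_score, cal_scores))
print("p-value, cleaned reference:     ",
      conformal_pvalue(test_outlier_score, cal_scores[~is_outlier]))  # idealized cleaning
```

In the paper's framework, only a small "suspicious" subset flagged by an outlier detection model is annotated within a labeling budget, rather than the idealized full cleaning shown here.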
To install dependencies, create a Conda environment using `requirements.yml`:
conda env update -f requirements.yml
conda activate robust-cod
For experiments involving visual data, you must provide a path to a score dataset, i.e., outlier scores extracted from images using a pretrained model and an outlier detection model.
To generate such a dataset, use: python ImageOD/score_datasets.py
We use datasets and pretrained models from OpenOOD for outlier detection. The required datasets and checkpoint files can be downloaded from the OpenOOD GitHub repository; refer to it for download instructions.
⚡ Running `python ImageOD/score_datasets.py` automatically downloads missing datasets. Checkpoints must be downloaded beforehand.
To train a ReAct-based outlier detection model using a ResNet18 pretrained on CIFAR-10, with the Texture dataset as the outlier dataset and a contamination rate of 0.03, run:
python ImageOD/score_datasets.py --save_path ./datasets_scores/cifar10_texture/n_train_2000_0.03/ \
--id_dataset cifar10 --ood_dataset texture --postprocess react \
--net_ckpt_path ./openood/results/checkpoints/cifar10_resnet18_32x32_base_e100_lr0.1_default/s1/best.ckpt \
--net resnet18_32x32 --p_train 0.03 --n_train 2000
- `--id_dataset cifar10` → In-distribution dataset (CIFAR-10).
- `--ood_dataset texture` → Out-of-distribution dataset (Texture).
- `--postprocess react` → Applies ReAct post-processing.
- `--net_ckpt_path` → Path to the pretrained model checkpoint.
- `--p_train 0.03` → Contamination rate of 3%.
- `--n_train 2000` → Number of training samples.
Once the dataset is created, you can use it in experiments by specifying its path.
You can run experiments using `main.py`, either by specifying parameters directly or by using a YAML configuration file.
To run a single experiment with custom parameters, use `main.py` directly.
For example, to compare all methods on the shuttle dataset using an Isolation Forest model with a labeling budget of m=50, run:
python main.py --save_path ./results/ --model IF --level 0.01 --n_cal 2500 --p_cal 0.03 \
--n_train 5000 --n_test 1000 --p_test 0.05 --dataset shuttle --initial_labeled 50
For a full list of command-line arguments, run:
python main.py --help
Instead of specifying parameters manually, you can use a YAML config file to define all parameters.
For example, to run a contamination rate experiment, execute:
python run_exp.py -c ./experiments/tabular_data/real_data_shuttle_contamination_rate_exp.yml -s ./results/
- The YAML file can contain lists of parameters, allowing you to run all possible parameter combinations automatically (see the sketch below).
- Experiment configuration files are stored in the `experiments/` folder.
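As a rough illustration of what "lists of parameters" means, the snippet below expands a small YAML fragment into every parameter combination. The keys and values are hypothetical and the expansion logic is a sketch, not `run_exp.py` itself; see the files under `experiments/` for the actual configuration schema.

```python
# Hypothetical sketch of expanding a YAML config with list-valued parameters
# into all combinations; not the actual logic of run_exp.py.
import itertools
import yaml  # provided by the PyYAML package

config_text = """
dataset: [shuttle]
model: [IF]
level: [0.01, 0.02, 0.03]
p_cal: [0.01, 0.03, 0.05]
"""
config = yaml.safe_load(config_text)

keys = list(config)
for values in itertools.product(*(config[k] for k in keys)):
    params = dict(zip(keys, values))
    print(params)   # one experiment configuration per combination (9 in total)
```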
You can run experiments with multiple target Type-I error rate levels by providing a list of values. For example:
--level 0.01 0.02 0.03
When plotting the results, you can filter by a specific target Type-I error level and repeat the process for different levels (this will be clearer after reading the "Plotting Experiment Results" section). For example:
--filter_k level --filter_v 0.01
By default, the code is designed to run on a computing cluster using the SLURM scheduler for distributed execution.
- Each random seed is executed as a separate SLURM job to enable parallelism.
- To run experiments locally, add the `--local` flag when executing `main.py` (or set it in the YAML file under `flag_params`).
- To disable seed-based job distribution, use the `--no_distribute` flag (or set it in the YAML file under `flag_params`).
Example:
python main.py --local --no_distribute
Once the experiments are complete, use `plot_main.py` to visualize the results. For example:
python plot_main.py --results_dir ./results/ --plot_dir ./plots/
- To plot a specific experiment, use the `--x` argument.
- To filter results before plotting, use `--filter_k <key> --filter_v <value>` (see the sketch after this list).
- To generate a summary table, add the `--table` flag.
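Conceptually, filtering keeps only the rows of the results table whose `<key>` column equals `<value>` before plotting. The snippet below is only an illustration with made-up placeholder column names and numbers; it is not how `plot_main.py` actually reads its result files.

```python
# Illustration of the effect of --filter_k level --filter_v 0.01;
# column names and values are made-up placeholders.
import pandas as pd

results = pd.DataFrame({
    "level": [0.01, 0.01, 0.02, 0.02],
    "seed":  [0, 1, 0, 1],
    "power": [0.6, 0.7, 0.8, 0.9],   # placeholder values, not real results
})

filtered = results[results["level"] == 0.01]   # keep rows where <key> equals <value>
print(filtered)
```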
For a full list of command-line arguments, run:
python plot_main.py --help
This project is licensed under the MIT License.
If you use this code or ideas from this project in your research, please cite:
@inproceedings{bashari2025robust,
title={Robust Conformal Outlier Detection under Contaminated Reference Data},
author={Meshi Bashari and Matteo Sesia and Yaniv Romano},
booktitle={Forty-second International Conference on Machine Learning},
year={2025},
url={https://openreview.net/forum?id=s55Af9Emyq}
}