
ir-person-detector

This repository contains code for training and evaluating a person detection model on infrared video data from multiple streams. It uses Hydra to manage configurations and switch easily between models; a minimal sketch of this pattern follows the feature list below. It is optimized for use on the NVIDIA Jetson Orin GPU.

  • Training pipeline for person detection using PyTorch and Ultralytics YOLOv8
  • Utilities for:
    • Frame extraction from video
    • Visualization of bounding boxes
    • Dataset loading and augmentation
  • Modular scripts for training, evaluation, and inference
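The training entry points presumably follow the standard Hydra pattern, composing a dataset config and a model config at startup. The following is a sketch only; the actual config groups and field names in this repo may differ:

```python
import hydra
from omegaconf import DictConfig

@hydra.main(config_path="configs", config_name="config", version_base=None)
def main(cfg: DictConfig) -> None:
    # Hydra composes the selected configs, e.g. when run with
    # `model=faster_rcnn dataset=ir_data` on the command line.
    print(cfg.model)
    print(cfg.dataset)

if __name__ == "__main__":
    main()
```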

Models in this repo:

  • COCO models: custom_detector, effnet, faster_rcnn, and ssdlite (see coco_models/src/models/)
  • YOLO model: YOLOv8 (see yolo_models/)
To inspect model architectures, run:

```bash
python model_summary.py
```

File Organization:

```
## running COCO models
coco_models/
├── configs
│   ├── config.yaml
│   ├── dataset
│   │   ├── flir.yaml
│   │   └── ir_data.yaml
│   ├── model
│   │   ├── custom_detector.yaml
│   │   ├── effnet.yaml
│   │   ├── faster_rcnn.yaml
│   │   └── ssdlite.yaml
│   └── optimization_results
│       ├── faster_rcnn.yaml
│       └── ssdlite.yaml
└── src
    ├── datasets
    │   ├── flir_dataset.py
    │   └── ir_dataset.py
    ├── eval.py
    ├── models
    │   ├── custom_detector.py
    │   ├── effnet_detector.py
    │   ├── faster_rcnn_detector.py
    │   └── ssdlite_detector.py
    ├── requirements.txt
    ├── train.py
    └── utils
        ├── clean_dataset.py
        └── transforms.py

## YOLO models
yolo_models/
├── configs
│   ├── config.yaml
│   ├── dataset
│   │   └── yolo.yaml
│   └── model
│       └── yolo.yaml
├── experiments
│   └── yolo_v8n_exp1_batchsize=16_in1_out5
└── src
    └── train_yolo.py

## datasets
├── FLIR_ADAS_v2 -> ../FLIR_ADAS_v2
├── ir_data -> ../ir_data

## preprocessing and misc scripts
├── filter.py
├── model_summary.py
├── requirements.txt
├── test_cuda.txt
```

Datasets:

Two datasets were used during training to compare results. Both contain IR images of people; however, the FLIR dataset also includes several other object classes, which must be filtered out to isolate the 'person' class. Running filter.py removes all other classes from the annotation files (a sketch of this filtering step follows the list below). The IR dataset contains a larger number of clearer images of people, many of them crowded scenes.

  1. FLIR_ADAS dataset
  2. IR_data
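A minimal sketch of the class filtering described above, assuming standard COCO-style annotation JSON; the actual paths and logic in filter.py may differ:

```python
import json

# Hypothetical paths; point these at the actual FLIR annotation files.
SRC = "FLIR_ADAS_v2/images_thermal_train/coco.json"
DST = "FLIR_ADAS_v2/images_thermal_train/coco_person.json"

with open(SRC) as f:
    coco = json.load(f)

# Keep only the 'person' category and its annotations.
person_ids = {c["id"] for c in coco["categories"] if c["name"] == "person"}
coco["categories"] = [c for c in coco["categories"] if c["id"] in person_ids]
coco["annotations"] = [a for a in coco["annotations"] if a["category_id"] in person_ids]

with open(DST, "w") as f:
    json.dump(coco, f)
```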

Training:

Experiment outputs are saved in /multirun, organized by the date and time of the run. The config files in configs/optimization_results were produced by Optuna hyperparameter tuning during training; these tuned parameters can be used to override the default settings for optimized train and validation losses. The device is configured as the CUDA GPU in the setup configs for both COCO and YOLO, but can be changed to CPU if a GPU is not available.

To run the training pipeline on a COCO model:

Run:

```bash
python coco_models/src/train.py --multirun model=model_name optimization_results=model_name
```

The model name options are custom_detector, effnet, ssdlite or faster_rcnn.

Any config parameter can be overridden directly from the terminal by appending the following to the end of the above command:

```
++param_name=override_value
```
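For example, to train SSDLite with a different learning rate (the parameter name lr here is an assumption; use whatever key appears in your config):

```bash
python coco_models/src/train.py --multirun model=ssdlite optimization_results=ssdlite ++lr=0.001
```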

At the end of training, results are saved in /multirun in a folder named for the date and time of the experiment, including a best_model.pth file and TensorBoard logs for monitoring train and validation loss throughout training. Hydra overrides and the experiment config are saved in multirun/hydra/, and training logs are written to train.log in the same folder.
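To monitor those logs, point TensorBoard at the multirun output directory:

```bash
tensorboard --logdir multirun/
```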

To run the training pipeline on a YOLO model:

Required installation:

```bash
pip install ultralytics
```

Run:

```bash
python yolo_models/src/train_yolo.py --multirun model=model_name
```

Experiment outputs will be saved in yolo_models/experiments/.
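For reference, Ultralytics YOLOv8 training boils down to calls like the following. This is a sketch only; the actual data path and hyperparameters used by train_yolo.py come from the Hydra configs:

```python
from ultralytics import YOLO

# Start from pretrained YOLOv8n weights and fine-tune on the IR data.
model = YOLO("yolov8n.pt")
model.train(
    data="yolo_models/configs/dataset/yolo.yaml",  # assumed to be a YOLO dataset YAML
    epochs=100,  # illustrative values; the real ones come from the config
    batch=16,
    imgsz=640,
)
```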

Evaluating:

For COCO models:

Set the path to the trained model (best_model.pth) in coco_models/src/eval.py by editing the script:

```python
checkpoint_path = "path/to/best_model.pth"
```
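For context, loading that checkpoint in PyTorch typically looks like the sketch below; the model constructor and num_classes are assumptions, since eval.py builds the detector from the Hydra config:

```python
import torch
from torchvision.models.detection import ssdlite320_mobilenet_v3_large

# Illustrative: rebuild the detector, then load the saved weights.
model = ssdlite320_mobilenet_v3_large(num_classes=2)  # background + person
state_dict = torch.load("path/to/best_model.pth", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()
```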

Run:

```bash
python coco_models/src/eval.py --multirun model=model_name
```

The model name options are custom_detector, effnet, ssdlite, or faster_rcnn. There is no need to override optimization results during evaluation, since that config is not referenced by the evaluation script.

Outputs are saved in /outputs as predictions.json, containing the bounding boxes predicted during inference, and metrics.json, which reports the following COCO metrics: AP, AP50, AP75, APs, APm, APl.
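To read the metrics back, assuming metrics.json is a flat JSON object keyed by the metric names above:

```python
import json

with open("outputs/metrics.json") as f:
    metrics = json.load(f)

# AP averages over IoU thresholds 0.50:0.95; AP50 is at IoU 0.50 only.
print(f"AP: {metrics['AP']:.3f}  AP50: {metrics['AP50']:.3f}")
```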

For YOLO models:

Evaluation metrics are computed by the Ultralytics package when the train script is executed, including a results.csv file, confusion matrices, mAP, precision, and recall metrics, and train/val batch loss plots. These are all saved in yolo_models/experiments/.
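To re-run validation on a saved checkpoint outside of training, the Ultralytics API also supports the following; the weights and dataset paths here are assumptions:

```python
from ultralytics import YOLO

model = YOLO("yolo_models/experiments/yolo_v8n_exp1_batchsize=16_in1_out5/weights/best.pt")
metrics = model.val(data="yolo_models/configs/dataset/yolo.yaml")
print(metrics.box.map50)  # mAP at IoU 0.50
```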

Visualization:

TODO: add to this section
