Description
🐛 Bug
I tried to train a detector on a custom dataset (following https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html), and training got stuck at 60% precision. Looking at the source code, I noticed that get_coco_api_from_dataset() is probably doing the wrong thing. When a Subset is used for the train and test datasets, the function returns the Subset's original dataset. The problem is that this is the entire dataset, not just the test dataset, which is only a part of it. As a result, evaluation is slowed down by building a needlessly large COCO dataset. In addition, coco_evaluator is updated in evaluate() with wrong stats because of conflicting "image_id" values. According to the tutorial, "image_id" is unique within one dataset, which is why the dataset index is used for it. The end result is that the evaluation stats reported during training are invalid.
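To illustrate the first part of the claim, here is a minimal sketch in plain Python. ToyDataset, the local Subset class, and unwrap() are hypothetical stand-ins, not torchvision code: Subset only mimics the .dataset and .indices attributes of torch.utils.data.Subset, and unwrap() imitates the behavior described above. It shows that once a Subset is replaced by its parent, the index selection is lost and the ground truth covers the whole dataset.

```python
class ToyDataset:
    """Hypothetical dataset whose image_id equals the sample index."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        return f"img{idx}", {"image_id": idx}


class Subset:
    """Stand-in exposing .dataset and .indices like torch.utils.data.Subset."""

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = list(indices)

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        return self.dataset[self.indices[i]]


def unwrap(dataset):
    # Assumption: imitates the reported behavior, where a Subset is
    # replaced by the dataset it was created from.
    while isinstance(dataset, Subset):
        dataset = dataset.dataset
    return dataset


full = ToyDataset(100)
test = Subset(full, range(80, 100))  # 20 held-out samples
gt_source = unwrap(test)

print(len(test), len(gt_source))  # 20 100
```

In this toy setup the evaluator would be built from 100 samples even though only 20 are ever passed through the model.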
To Reproduce
Steps to reproduce the behavior:
No reproduction steps are needed; the problem is visible in the source of evaluate():
@torch.no_grad()
def evaluate(model, data_loader, device):
    ...
    coco = get_coco_api_from_dataset(data_loader.dataset)  # <--- this returns the full dataset
    ...
    res = {target["image_id"].item(): output for target, output in zip(targets, outputs)}
    evaluator_time = time.time()
    coco_evaluator.update(res)  # <--- invalid target is updated
    ...
    return coco_evaluator
Expected behavior
get_coco_api_from_dataset() should build the COCO API from the Subset that is passed in (only the selected samples), not from the full dataset behind it.
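One possible direction, sketched in plain Python under stated assumptions: iterate over the dataset exactly as it is passed in, so that a Subset contributes only its selected samples. The Subset class below is a stand-in that mimics torch.utils.data.Subset, and ground_truth_ids() is a hypothetical helper, not an existing torchvision function; a real fix would build the full COCO structure rather than just a list of ids.

```python
class Subset:
    """Stand-in exposing .dataset and .indices like torch.utils.data.Subset."""

    def __init__(self, dataset, indices):
        self.dataset = dataset
        self.indices = list(indices)

    def __len__(self):
        return len(self.indices)

    def __getitem__(self, i):
        return self.dataset[self.indices[i]]


def ground_truth_ids(dataset):
    # Hypothetical helper: walks the dataset as given, without unwrapping,
    # so only the samples visible through the Subset are collected.
    return [dataset[i][1]["image_id"] for i in range(len(dataset))]


# Toy "full dataset": a list of (image, target) pairs with image_id == index.
full = [(f"img{i}", {"image_id": i}) for i in range(10)]
test = Subset(full, [7, 8, 9])

ids = ground_truth_ids(test)
print(ids)  # [7, 8, 9]
```

With this approach the ground truth contains one entry per test sample, and the ids match the "image_id" values the targets carry during evaluation.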
Environment
PyTorch version: 1.6.0
Torchvision: 0.7.0
Is debug build: N/A
CUDA used to build PyTorch: N/A
OS: Ubuntu 20.04 LTS (x86_64)
GCC version: (Ubuntu 9.3.0-10ubuntu2) 9.3.0
Clang version: Not used
CMake version: version 3.16.3
Python version: 3.8 (64-bit runtime)
Is CUDA available: yes
CUDA runtime version: 11.0.2-1
GPU models and configuration: GPU 0: GeForce GTX 1080 Ti
Nvidia driver version: 450.51.05
cuDNN version: /usr/lib/x86_64-linux-gnu/libcudnn.so.7.4.2
Versions of relevant libraries:
[pip3] numpy==1.17.4
[conda] Not used